Guidelines for Transportation Management Systems
2. Maintenance Considerations and Activities
This chapter provides an overview of the issues and activities associated with the maintenance of traffic management centers and the ITS devices they control. It introduces the topic and examines various aspects of maintenance and how an Agency might plan maintenance to ensure operational effectiveness as the TMS is used in the future.
The dictionary definition of maintenance is: "the upkeep of property or equipment." There is a range of TMS components that require upkeep. These fall into three categories: control central, roadside, and communications. From a maintenance viewpoint, these categories make sense in that the work at the roadside requires equipment, techniques, procedures, and skills that differ from those typically required at control central. Note that the communications system requires a different set of skills, distinct from the others.
The property or equipment in a TMS that needs some form of maintenance includes, but is not limited to, the following:
This chapter describes the various types of maintenance, their requirements, and the support activities that are needed. It describes how a maintenance plan can be developed and how maintenance services can be procured. It also discusses staffing estimates and describes the maintenance procedures for both central and roadside devices.
The maintenance needs for ITS devices are very diverse. Because the range of potential maintenance actions is so broad, a wide variety of expertise and skills is required. A listing of preventive maintenance actions by device type is included in Appendix B. These detailed actions for each ITS device are typical examples. The various devices from differing manufacturers may have other maintenance requirements. The author of a maintenance plan should start with the manufacturer's procedures to ensure that needs of the particular equipment are satisfied and that any warranties are kept valid.
The maintenance concept is designed to articulate the essential reliability and performance measures necessary to meet stated operational concepts. Just as the concept of operations drives the system functional requirements, the maintenance concept drives the Maintenance Requirements. The use of concepts as part of the guidelines development is contained in Chapter 3, TMS Maintenance Concept & Requirements.
The development of a maintenance plan includes design and implementation procedures; it is also a life-cycle process. It needs to recognize that most systems are built incrementally and are expanded over time. Expansion can be both functional and geographical, both of which impact the maintenance planning. There is a need to provide feedback and assessment of the maintenance operations with each incremental deployment phase so that future phases build on and expand the system, rather than simply replace elements of the earlier phases. The development of maintenance plans within the life-cycle of the system is expanded upon in Chapter 4, Maintenance Considerations for the Life-Cycle of a TMC.
A maintenance program provides a plan on what maintenance is, how it is performed, how it can be budgeted, and why it is needed. It is a document that describes the needs to persons outside the direct department with maintenance responsibilities and provides guidance to those within that department. Chapter 5 describes the development of a TMS maintenance program. It defines maintenance program planning, including objectives, institutional issues, and implementation procedures. This chapter also describes the relationship with operations programs.
Maintenance is essential to keeping a TMS running in the manner for which it was designed. Lack of maintenance leads, almost inevitably, to escalating decline. Just as not painting a wooden house can lead to expensive structural damage, lack of TMS maintenance has severe consequences. In one location, lack of maintenance of the communications structure caused so many cameras to stop working that the program for video monitoring was abandoned and the investment lost. In another location, maintenance of roadside emergency telephones was so poor that the majority did not work. This resulted in such negative public outcry that the system had to be abandoned.
Most maintenance activities can be grouped into one of three categories: preventive, responsive, and emergency.
Preventive maintenance consists of scheduled operations performed to keep the systems operating. Preventive maintenance includes mundane operations, such as cleaning camera housing faces and the front of DMS. It can include mechanical functions, such as greasing barriers for tollbooths and ramp control. In some cases, preventive maintenance requires sophisticated technology, such as optical testing equipment to ensure that the fiber-optic cable used in the communications system is operating within acceptable parameters. Preventive maintenance is initiated by a schedule.
Responsive maintenance refers to operations that are initiated by a fault or trouble report. The report can come either from a person or from software that is monitoring parts of the system. Most general faults fall into the responsive maintenance category. The table in Appendix A shows a list of typical trouble calls from the control center in Northern Virginia. Most of these calls are responded to by the maintenance crews within a few hours. However, some faults can require days or weeks to repair. Delays can occur, for example, in securing new parts or when new power connections are needed from the local utility company.
Emergency maintenance is similar to responsive maintenance in that it is initiated by a fault or trouble report. However, in this case, the fault is more serious and requires immediate action. Events such as knockdowns, spills, exposed power supplies, and road blockages are clear examples of reports that may require emergency maintenance. There can also be operational emergencies, such as stuck barriers on dedicated HOV lanes or failed lane control signs, such as those indicating shoulder usage. These sorts of conditions often constitute emergencies that need to be dealt with quickly in order not to create additional hazards.
Agencies, when developing a maintenance program, should plan for each set of maintenance conditions. The various types of actions that the maintenance staff provides should be categorized to ensure the most efficient use of resources. To have staff or contractors available on a permanent basis is an expensive option. Typically, preventive maintenance is undertaken "loosely" as scheduled and responsive maintenance is performed as needed. Emergency maintenance nearly always takes precedence.
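The ordering among the three categories can be illustrated with a small work-order queue. The following is a minimal sketch, not a description of any Agency's actual dispatch system: the class names, the numeric priority values, and the sample trouble reports are all assumptions made for illustration. It assumes only that emergency work is dispatched before responsive work, that responsive work is dispatched before preventive work, and that orders within a category are handled in the order received.

```python
import heapq
from dataclasses import dataclass, field
from enum import IntEnum
from itertools import count


class Category(IntEnum):
    # Lower value is dispatched first (assumed ordering, for illustration).
    EMERGENCY = 0
    RESPONSIVE = 1
    PREVENTIVE = 2


@dataclass(order=True)
class WorkOrder:
    category: Category
    sequence: int                       # arrival order, breaks ties
    description: str = field(compare=False)


class WorkQueue:
    """Hypothetical queue that releases work orders by category, then arrival."""

    def __init__(self):
        self._heap = []
        self._counter = count()

    def add(self, category, description):
        order = WorkOrder(category, next(self._counter), description)
        heapq.heappush(self._heap, order)

    def next_order(self):
        # Returns None when no work is pending.
        return heapq.heappop(self._heap) if self._heap else None


queue = WorkQueue()
queue.add(Category.PREVENTIVE, "Clean camera housing faces")
queue.add(Category.RESPONSIVE, "DMS reported blank by operator")
queue.add(Category.EMERGENCY, "Stuck barrier on HOV lane")
queue.add(Category.RESPONSIVE, "Detector station not reporting")

dispatch_order = []
while (order := queue.next_order()) is not None:
    dispatch_order.append(order.description)

for description in dispatch_order:
    print(description)
```

In this sketch the preventive task entered first is released last, which mirrors crews being pulled off scheduled work when trouble reports arrive.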
Although most public Agencies do not have formal policies concerning maintenance of their TMS, there are some policy issues that need to be considered in the design of a maintenance system. The first is creating a balance between preventive and responsive maintenance. More preventive maintenance will cause the equipment to be in a better state of repair, meaning it will likely break down less often, thus requiring less responsive maintenance. Consequently, it is generally judicious for an Agency to select preventive maintenance as its priority. This approach is supported by evidence indicating that to rely solely on responsive maintenance (i.e., waiting for things to break) is not good policy. After all, malfunctions are generally first noticed when a device is used, which is the worst time to detect a failure.
In cases where there are very large numbers of devices, it may be judicious to adopt a policy stating what percentage of specific devices should be operable. Although it is desirable to have all devices fully functional, in some cases there are inadequate resources to achieve this state. Examples could include the number of operating loops or the numbers of bulbs that need changing. Such policies can assist the maintenance staff in their scheduling efforts.
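A percentage-operable policy of this kind lends itself to a simple check that maintenance staff could use when scheduling. The sketch below is illustrative only: the device groups, counts, and target percentages are hypothetical values, not figures from any Agency. It reports how many devices in each group would need to be restored to meet the stated target.

```python
from dataclasses import dataclass


@dataclass
class DeviceGroup:
    name: str
    installed: int
    operable: int
    target_percent: int  # hypothetical policy target, whole percent

    @property
    def operable_percent(self):
        # Share of installed devices currently working.
        return 100.0 * self.operable / self.installed if self.installed else 0.0

    @property
    def shortfall(self):
        """Devices that must be restored to reach the target (0 if met)."""
        # Integer ceiling of installed * target / 100.
        required = (self.installed * self.target_percent + 99) // 100
        return max(0, required - self.operable)


# Hypothetical inventory snapshot.
loops = DeviceGroup("Detector loops", installed=400, operable=372, target_percent=95)
bulbs = DeviceGroup("Signal bulbs", installed=1200, operable=1190, target_percent=98)

# List groups with the largest shortfall first to guide scheduling.
for group in sorted([loops, bulbs], key=lambda g: g.shortfall, reverse=True):
    print(f"{group.name}: {group.operable_percent:.1f}% operable, "
          f"restore {group.shortfall} to meet {group.target_percent}% target")
```

With these assumed numbers, the loops fall 8 devices short of their target while the bulbs already meet theirs, so repair effort would go to the loops first.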
In general, an Agency's maintenance policies should be comprehensive and compatible with the broader policies of the Agency and its traffic program generally. Obviously, a traffic operation that is mission critical should receive greater maintenance priority than should a trivial system. For instance, barriers and gates associated with high occupancy vehicle (HOV) entries or various other flow control mechanisms should be given high priority. For these and other systems containing similar devices, there needs to be a contingency for the event of control failure.
For example, if a system has a series of barriers and signs that control the direction of flow on an HOV facility and the control system fails, it is unacceptable, and dangerous, for the facility to remain inoperative while programmers are trying to restart the control system. A suitable maintenance policy would mandate backup operations that, firstly, ensure safety and, secondly, allow the facility to operate without a control system. This is typically accomplished by a technician physically driving through the system and manually closing the gates and setting the appropriate signs. Later, he or she reverses the procedure to open the facility in the opposite direction.
Other facilities, such as tolled elements, may wish to allocate other priorities, e.g., revenue collection. The original design for the Oakland Bay Bridge control system was based on a flow-and-revenue maximizing routine using micro-loops on the top of the bridge to control the signals at the access point. From a maintenance perspective, the Singapore electronic road pricing (ERP) system places a large premium on maintenance of revenue. The control equipment is housed in a series of small air-conditioned buildings throughout the controlled part of the network. The air conditioners are not always reliable; temperature monitors close down the ERP when the air conditioners fail. The policy is to maximize revenue, and thus backup air conditioners are used to keep the ERP operational. Backups and various levels of redundancy can all be used to support the policy mandated for the system. These often express themselves as graceful degradation as part of the reduced operations as systems fail. However, such operational processes need to have the equipment maintained and, most importantly, be regularly checked by exercising the components that come into play during a failure mode.
Other elements that may be adopted as policy issues include:
Quality-control programs are also tools that can be used within a maintenance program. The large systems that depend on traceability, such as ISO 9000, are probably not suitable for a maintenance department. However, Agencies could, as a matter of policy, adopt a range of quality control procedures that would enhance their maintenance programs. These would need to be adapted to each Agency's services but could include such items as:
Any maintenance policy should be developed within the framework of all the other policies that are in operation by the Agency. Several of the broader policy issues mentioned above are common in many State Departments of Transportation. Existing policies should be reviewed in light of the operations of a maintenance program. For example, some states have policies relating to working above moving traffic (say, a requirement that all tools and removable devices be tethered when no safety net is present). This type of policy affects the design of installed devices, the tools the maintenance crew uses, and the procedures that take place during maintenance tasks.
Maintenance policies should be regularly reviewed, made part of the maintenance program, and incorporated into employee training.
A continuing problem for many government Agencies is recruiting and retaining personnel who possess the skills necessary to operate and maintain the sophisticated hardware associated with computer-based traffic systems. Proper maintenance of systems can require salary schedules higher than typical maintenance or electrician rates, which Agencies are often unable to pay. Accordingly, some Agencies have determined that the best alternative is to use outside contractors. These contractors can also be hired to supplement regular staff and to stock specialized spare parts during emergencies.
In some cases the Agency may have restrictions on hiring new staff. This usually means that the maintenance must be outsourced to an independent third-party company. States tend to be familiar with this process as it is often used for less technological work such as mowing, restriping, or bulb changing. Thus maintenance contracts are frequently in place, which allows TMS maintenance to be procured. However, TMS maintenance is different in that it calls for a range of technical skills that are not readily available. Furthermore, TMS maintenance frequently needs to be performed in a more timely manner, particularly when responding to emergency maintenance calls. This requirement to respond within a prescribed number of hours is not common in more routine maintenance operations.
Another aspect of the procurement of maintenance is whether it is the control center, the roadside equipment, or the communications system for which plans are being prepared. Maintenance of the control center often requires unique software skills that force the Agency to use either the developer or another equally adept company to do the work to keep the system operational. Similarly, the communications network has a unique requirement for technical equipment for fault finding and repair of fiber-optic networks. Roadside maintenance operations require another set of hardware, such as bucket trucks and traffic diversion equipment. Therefore the Agency, when contemplating maintenance, may consider a combination of contracting and Agency personnel to address all aspects of the TMS. The table, below, summarizes some of these issues.
Some Agencies split the maintenance operations along the lines suggested in Table 2-1. For instance, it is fairly common to contract out the communications system on an as-needed basis. This makes sense as there are no required preventive maintenance procedures associated with fiber-optic networks. They either work or they are broken, and keeping a technically qualified employee waiting for a communications failure is not a good use of resources.
It is rare, but there are examples of state Agencies keeping programming staff as full-time employees to develop and maintain control systems. Although this gives an Agency greater flexibility to make changes and monitor the work of contractors, the state is vulnerable if the employee resigns — this can be a significant problem when such skills command high wages during technology booms. Additionally, it is sometimes difficult to find a career path within an Agency for an employee who is operating in an area often staffed by younger persons who have recently learned the latest technology. Although assistance in software development has its advantages, it can be considered to be more advantageous for Agencies to use their limited maintenance staff resources in other arenas.
When deciding whether to contract out maintenance and, if so, which elements to outsource, the following should be considered:
There are several ways in which the Agency can slice the maintenance pie. The slices can be central, roadside, or communications. They can also be preventive, responsive, or emergency. Any combination of these options can be contracted out or performed by Agency staff. Each Agency will need to consider how the maintenance of these elements can best be performed for its circumstances.
An alternative approach is to attach maintenance requirements to the initial contract to purchase the goods or services. This is particularly applicable for the case of the central software where, although the Agency may have all the source code and the build environment, it may lack the software maintenance skills. Also, even if the Agency has the required skills, it will take considerable time for the staff to become familiar with the detailed design of the software. However, if the TMS installation is considered successful, the addition of more ITS devices of a similar type to those already installed should not require programming expertise. In this case, the modifications need only be database entries that can be performed by non-programmers.
The use of extended maintenance agreements with the contractor or vendor can be problematic. In one case, a low-bidding vendor front-end loaded all the maintenance costs. By the time the maintenance period occurred, the vendor had gone out of business and the Agency had almost no money left to pay for maintenance by other vendors. To avoid this problem, the costs for maintenance need to be expressed in the scope of work as some percentage of the item costs that is large enough to ensure that the contractor will stay around to maintain the system. In one instance, the maintenance contractor was paid a fixed monthly amount for maintaining a series of traffic signals. Each month the signals were inspected for correct operation and the maintenance contractor was paid its maintenance fee proportional to the percent of equipment operating correctly. Thus the maintenance contractor routinely conducted its own inspection and ensured that everything was operating correctly prior to the Agency's inspection.
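The performance-based payment described in the traffic signal example can be worked through as simple arithmetic. The sketch below is one plausible reading of that arrangement, not the actual contract terms: it assumes the fixed monthly amount is the fee for 100 percent correct operation and that the payment is scaled by the fraction of inspected equipment found operating correctly. The function name and the dollar and signal figures are hypothetical.

```python
def monthly_payment(full_fee, inspected, operating):
    """Scale the monthly fee by the share of inspected equipment operating correctly.

    full_fee:  fee paid when everything inspected is operating (assumed meaning
               of the "fixed monthly amount")
    inspected: number of units checked during the Agency's monthly inspection
    operating: number of those units found operating correctly
    """
    if inspected <= 0:
        raise ValueError("at least one unit must be inspected")
    if not 0 <= operating <= inspected:
        raise ValueError("operating count must be between 0 and inspected")
    return round(full_fee * operating / inspected, 2)


# Hypothetical month: 120 signals inspected, 114 operating correctly.
payment = monthly_payment(full_fee=10_000.00, inspected=120, operating=114)
print(payment)  # 9500.0
```

Under this reading, every failed unit found at inspection reduces the contractor's fee, which explains the incentive for the contractor to inspect and repair ahead of the Agency's visit.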
Management of contractors that are working on the preventive maintenance tasks has caused concern for some Agencies. Various Agencies can have huge numbers of devices spread over several hundred miles of road. Field inspection and verification of work completed is not always feasible. There is a need to keep track of the contractor, at least on some type of sampling basis, rather than wait for the monthly invoice to arrive and hope the work has been done. When structuring such maintenance contracts, it is advisable to require recordkeeping procedures ensuring that invoices can be verified against the work performed.
One method for solving this problem when responsive maintenance actions are involved is to require the contractor to record pertinent data on a control center database. The following types of data may be recorded:
Keeping the data current and available at the TMC allows the Agency to be prepared when claims are made and also permits tracking and control of inventory. An important maintenance activity is the prediction of failures and retention of spare parts. Ensuring that the contractor keeps this data up-to-date will facilitate these processes.
In the case of preventive maintenance, management of the contractor needs to be more flexible, since often the same crews are used. When no response calls are pending, the crews perform preventive maintenance, but are immediately reassigned when trouble reports are received. One solution is to allow the contractor remote access to the maintenance PC in the control center in order to log the next day's activities into the system. This makes the operators in the control center aware of the current activities. Since cleaning and repairs can be disruptive of traffic, coordination with the system operations staff should be a requirement. Another requirement should be to ensure voice communications between the operators and the crews at the roadside. This is important for both operational and safety reasons. Maintenance crews at the roadside can, for example, confirm that barriers rise and fall or that signs show the right messages.
Other problems with extended maintenance agreements include Agencies making fairly significant annual payments for hardware (in one case, LED message signs) that never failed and never had any maintenance performed. Thus an Agency contemplating maintenance contracts needs to anticipate the likely frequency of trouble calls and structure the contract differently for items likely to need little service.
When an Agency is structuring a maintenance contract for TMS elements, it should consider additional elements apart from the performance of preventive, responsive, or emergency maintenance. For example, a contract that is structured to maintain roadside devices needs to take into account inventory control and possibly the bar-coding of devices. Contracts to maintain the central software should address configuration management, back-up procedures, and disaster recovery. Addressing these types of issues will both structure the maintenance contracts and provide Agencies with a more integrated approach that is reflective of the complexity of the TMS components.
Outside contractors have been used successfully for operations and maintenance of traffic systems. For example, the New York State Department of Transportation has used outside contractors for operations and maintenance of INFORM since the inception of the system. Similarly, the Connecticut Department of Transportation uses outside contractors for operations and maintenance of its I-95 Incident Management System.
The contracting mechanism for many of the more sophisticated items, such as computer motherboards, hard drives, signal and DMS controllers, and communications equipment, is to require replacement under warranty for the first few years and then merely send the equipment back to the vendor for repair and replacement as needed. This approach makes sense since much of this equipment is reliable, requires no preventive maintenance, and most of the units will work for years without a problem. However, this approach does rely on the equipment manufacturer remaining in business and, therefore, it is advisable to add a longevity criterion to the procurement terms. In other words, the supplier should be required to certify that it has been in this business for some minimum number of years and/or that more than a specified number of units have been installed. This approach also creates a need for a larger spare parts inventory to give the Agency more time to recover when a particular product is no longer available.
The types of contracting options that exist and which ones to consider for the various options are described in Chapter 7.
An Agency assessment will be needed to identify the capabilities and services required to support all maintenance activities, and the tasks to support the maintenance program. For example, if it is required that 95 percent of all detectors should be operable at any one time, then this can be used in the assessment of whether this particular component should be contracted out or performed by in-house staff. There is a wide variation in these elements of a maintenance program. To develop a framework to assist the Agency in deciding to contract out each of the nine elements in the table, above, criteria conforming to the Agency's maintenance vision must be established. Stating that all the devices must work all the time does not help. Setting unrealistic goals will not assist the decision process. The reality is that many detectors are not working at any one time. Communications systems regularly fail and software hangs up. Within each contracting option a level needs to be determined that can be met both by a contractor and the Agency staff. When this is known, the decision concerning the best option to meet the maintenance program objectives can be made.
Maintenance software applications can be of particular value to TMS operations. The range of functions they can perform includes:
Such software applications provide a series of screens that allow the operator to enter and view the current status of trouble calls and to view which ones are currently open. By using the repair data, comparisons can be made between equipment types, such as whether one camera manufacturer requires more repairs than another. These applications can hold inventory data and accept bar-code inputs. They provide location information for users. As systems grow, the number of devices that must be maintained can get quite large. The table, below, illustrates an approximate device count for the Northern Virginia Smart Traffic Center of VDOT in 2001. As shown, the number of devices is significant and knowing their location can become a problem. Devices move for a variety of reasons. The site can be abandoned; it can disappear as part of reconstruction or a maintenance activity. Equipment gets moved as repairs are made. In some cases, components of old equipment are used to repair other devices. Some Agencies have what are known as portable/permanent variable message signs (VMS). These are portable VMSs that are installed on a concrete pad with power and are used in that location during a seasonal period, e.g., to direct beach traffic. Knowing where everything is becomes a problem that can be addressed by some combination of inventory control software.
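The manufacturer comparison mentioned above amounts to normalizing repair counts by the number of installed units. The following is a minimal sketch under stated assumptions: the record layout (device identifier and manufacturer per repair), the vendor names, and the counts are hypothetical and do not represent any particular maintenance software product or real data.

```python
from collections import Counter


def repairs_per_installed_unit(installed_counts, repair_log):
    """Return repairs per installed unit for each manufacturer.

    installed_counts: manufacturer -> number of units in the inventory
    repair_log: iterable of (device_id, manufacturer) tuples, one per repair
    """
    repairs = Counter(manufacturer for _device_id, manufacturer in repair_log)
    return {
        manufacturer: repairs[manufacturer] / units
        for manufacturer, units in installed_counts.items()
        if units > 0  # skip manufacturers with nothing installed
    }


# Hypothetical camera inventory and closed repair records.
installed_counts = {"Vendor A": 4, "Vendor B": 5}
repair_log = [
    ("CAM-001", "Vendor A"),
    ("CAM-003", "Vendor A"),
    ("CAM-007", "Vendor B"),
]

rates = repairs_per_installed_unit(installed_counts, repair_log)
worst = max(rates, key=rates.get)
print(rates, worst)
```

Dividing by installed units matters because a manufacturer with more devices in the field will naturally generate more trouble calls; the rate, not the raw count, indicates relative reliability.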
Support for the maintenance staff needs to include a variety of service resources and tools including administration, space, inventory storage, parking, and a variety of testing equipment.
The administration/support requirements for maintenance include staff hours to log actions that are taken, update inventory, keep track of contractor invoices and purchase orders, ensure that training is performed, and make certain that responsive and emergency maintenance requests are addressed in a timely manner. In addition, if the maintenance is contracted, support is required to check invoices and monitor contractor performance.
Space is required in a secure environment to keep inventory. A spare inventory of 5-10 percent of the total installed hardware can represent significant value. Access to inventory may require some form of control. Space is also required for maintenance staff offices and for parking the maintenance vehicles that are likely to include a range of trucks and a cherry picker.
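To give a sense of scale, the 5-10 percent figure can be turned into a quick estimate. This sketch reads the figure as the share of installed hardware value held in inventory, which is an interpretation of the text; the function name and the installed hardware value are hypothetical.

```python
def spare_inventory_value(installed_value, low_percent=5, high_percent=10):
    """Return the (low, high) value of inventory held as a share of installed hardware."""
    if installed_value < 0:
        raise ValueError("installed value cannot be negative")
    low = installed_value * low_percent / 100
    high = installed_value * high_percent / 100
    return low, high


# Hypothetical system with $2,000,000 of installed field and central hardware.
low, high = spare_inventory_value(2_000_000)
print(f"Inventory to secure: ${low:,.0f} to ${high:,.0f}")
```

Even for this modest assumed system, the inventory to be stored and controlled is worth between $100,000 and $200,000, which supports the case for secure space and access control.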
The specific tools required, in addition to the more obvious common tools used by technicians in workshops, could include: