Applying Archived Operations Data in Transportation Planning: A Primer
3. CONQUERING THE CHALLENGES OF USING ARCHIVED OPERATIONS DATA
Data, tools, and domain expertise are the three components required to use archived operations data effectively in planning. Without these three basic components, users of archived data will be unable to derive the meaningful insights that support informed decisions and intelligently move an agency forward (Figure 4).
Figure 4. Diagram. Components of effective use of archived operations data.
Source: University of Maryland CATT Laboratory.
Data. Chapter 1 provided an overview of the various types of archived operations data. Data captured by an agency or third party may include speeds, volumes, accidents, freight movements, transit schedules and ridership, agency assets, and more. However, simply knowing that this data exists in an archive somewhere is not sufficient; stakeholder buy-in and an understanding of how to access the data are also needed.
Tools. Often overlooked, the tools available to agency staff are the most critical element in making use of archived operations data. These include tools that fuse "siloed" data from disparate sources, tools that fill in gaps ("missing data"), and tools that identify or screen outliers. Other important tools support analytics and visualization that help agencies "see" into the data—asking questions, identifying issues, deriving meaning from the data, and communicating those insights to others.
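To make the gap-filling and outlier-screening roles of these tools concrete, the following is a minimal sketch of how a speed series might be cleaned. The function names, the 30 mph median-deviation rule, and the linear-interpolation gap fill are illustrative assumptions for this sketch, not any agency's actual method.

```python
from statistics import median

def screen_outliers(speeds, threshold=30.0):
    """Replace speed readings that deviate from the series median
    by more than `threshold` mph with None (an assumed simple rule)."""
    m = median(s for s in speeds if s is not None)
    return [s if s is not None and abs(s - m) <= threshold else None
            for s in speeds]

def fill_gaps(speeds):
    """Fill missing (None) readings by linear interpolation between
    the nearest valid neighbors; edge gaps copy the nearest value."""
    filled = list(speeds)
    for i, s in enumerate(filled):
        if s is None:
            lo = next((j for j in range(i - 1, -1, -1)
                       if filled[j] is not None), None)
            hi = next((j for j in range(i + 1, len(filled))
                       if filled[j] is not None), None)
            if lo is not None and hi is not None:
                frac = (i - lo) / (hi - lo)
                filled[i] = filled[lo] + frac * (filled[hi] - filled[lo])
            elif lo is not None:
                filled[i] = filled[lo]
            elif hi is not None:
                filled[i] = filled[hi]
    return filled

# Example: a 150 mph spike is screened out, then interpolated.
cleaned = fill_gaps(screen_outliers([60, 62, 150, 58]))
```

Production archives typically apply far more sophisticated imputation and screening logic, but the shape of the problem is the same: detect implausible values, then reconstruct the gaps they leave behind.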
Domain expertise. With the right tools and the right data, anyone can review a data set and start to ask questions, but to ask the right questions, to interpret the resulting analysis, and to otherwise act on the information in the archive, domain expertise is needed. Domain experts include transportation professionals with specific technical skills in areas such as traffic operations, transportation planning, or statistical analysis.
Creating an effective archive involves a mixture of technical and political challenges within an institution. All can be overcome with varying degrees of effort. The remainder of this chapter focuses on some of the more prevalent institutional challenges associated with the data, tools, and domain expertise of an operations archive.
A common misconception about using archived operations data is that the technical challenges associated with collecting, fusing, and managing the data, along with developing tools, are the greatest hurdles. In reality, these technical challenges are almost always solvable by qualified technical personnel. The institutional challenges (e.g., political, ideological), however, can create significant issues that delay or prevent operations data from being leveraged fully. The extent to which each of these issues exists will vary greatly from agency to agency.
Technology policy. Many agencies have strict guidelines on which technologies can and cannot be used within their enterprises. For example, some agencies use only Microsoft products and platforms, such as .NET and MS SQL, whereas other agencies forbid open-source software for fear that support for a product will be unavailable or that misuse of open-source technologies could lead to litigation.
Resource constraints. An agency may want to build and maintain its archive internally, but quickly find that it lacks the capacity to do so. This is not necessarily a lack of skills, but rather a lack of skilled individuals. An agency that takes on a significant development effort with only one or two strong coders or minimal information technology (IT) support may find its program at a standstill if those specialized personnel leave the agency or are committed to other agency projects.
Storage and processing capacity. Technical barriers are common for agencies building archives in-house. Certain archives, especially those fed by connected vehicles or probe-based data sets, can grow at a surprising rate. The Florida Department of Transportation (FDOT), for example, collects data from approximately 5,000 speed/volume sensors. This sensor network produces the equivalent of 3.6 million data points every day and over 1.3 billion records each year. Yet sensor networks like FDOT's produce only a fraction of the data generated by probe data sources. Using FDOT again as an example, the data the agency purchases from a third-party probe data provider arrives at nearly 34,000 records every minute. That translates to 48.9 million records per day, or 17.8 billion records per year. Because of State procurement and hiring policies, very few departments of transportation (DOTs) have dedicated, responsive IT teams that can plan, install, and configure expensive hardware or cloud services to accommodate these massive archives over long periods.
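The growth figures above can be checked with simple arithmetic. The following back-of-the-envelope sketch reproduces them; the 100-bytes-per-record storage figure is an illustrative assumption, not a number from this primer.

```python
# Back-of-the-envelope check of the archive growth figures cited above.
SENSOR_POINTS_PER_DAY = 3_600_000   # ~5,000 sensors (FDOT example above)
PROBE_RECORDS_PER_MIN = 34_000      # third-party probe feed (cited above)

sensor_per_year = SENSOR_POINTS_PER_DAY * 365          # ~1.3 billion
probe_per_day = PROBE_RECORDS_PER_MIN * 60 * 24        # ~48.9 million
probe_per_year = probe_per_day * 365                   # ~17.9 billion

print(f"Sensor records/year: {sensor_per_year:,}")
print(f"Probe records/day:   {probe_per_day:,}")
print(f"Probe records/year:  {probe_per_year:,}")

# Rough storage estimate; 100 bytes/record is an assumption for illustration.
BYTES_PER_RECORD = 100
tb_per_year = probe_per_year * BYTES_PER_RECORD / 1e12
print(f"Probe storage/year:  ~{tb_per_year:.1f} TB (at 100 B/record)")
```

Even at a modest assumed record size, the probe feed alone adds terabytes of raw data per year, before indexes, backups, and derived products, which is why storage planning becomes an institutional rather than purely technical question.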
Tools and accessibility. Even when hardware can be purchased and installed, developing the analytics software and databases that make the data easily accessible to end users can be a significant hurdle for agencies. For an agency to build successful tools independently, it needs a healthy mixture of software engineers, architects, user interface and user experience design specialists, developers, and project managers. The tools also must be maintained over time, which requires complete documentation and staff who can be called upon over many years to keep the tools up to date. Because of the high barrier to entry and the cost of upkeep, many agencies now choose either to purchase existing tools or to leverage tools that other agencies or universities have already paid to develop, effectively creating a pooled-fund approach to software development and maintenance. This approach is becoming easier for agencies that are unaccustomed to purchasing services and for those that historically have not adopted tools and products invented outside the agency or the State.
Networking bandwidth. Though less common, agencies may run into networking and bandwidth capacity issues. Large data sets streaming in from various sources must be compiled and streamed out to others. Sending large data sets from field devices, or to and from third-party providers, can be difficult with existing infrastructure.
Security concerns. Working with agency security boards or committees can be another challenge that often seems insurmountable. Obtaining permission to set up online data access or to deploy specialized software, for example, can take from 1 month to several years, and requests may be denied outright if there is no advocate.
Most technical issues, where present, are solvable if there is an internal advocate. Network capacity can be expanded at relatively low cost. Other IT and hardware procurement issues can be bypassed by outsourcing to contractors or by purchasing software-as-a-service or IT-as-a-service.
Figure 5. Photo. Technician.
Source: Thinkstock/Dynamic rank.
Internal resistance. Internal resistance may develop when: (1) trying to convince the owner of the data or archive to share it with others, (2) trying to convince those who have not traditionally relied on operations data to trust its quality, and (3) trying to convince potential users of the value of the operations data compared to traditional (or currently used) data sets.
Several arguments have been used, with varying degrees of success, to overcome reluctance to share, trust, and value operations data. There is the reasoned argument, which appeals to the philosophical and ethical case for sharing data and pooling resources. There is the legal argument, which leverages interagency agreements or executive orders to change behavior. There is the funding argument, which attempts to pay an agency or business unit to change its data collection and sharing practices, getting past the "the cost is too great" excuse. There is the "shame" argument, which shows how everyone else in the organization is sharing data and leveraging resources, thus isolating the individuals who resist. The problem with these tactics is that none addresses the "fear factor" many individuals have about changing their behavior and sharing their data with others.
There is also the "make a friend" tactic, which leverages personal connections to instill change. Because it is communications-based, this tactic often succeeds. However, if only the leadership of an organization is convinced, while the individuals who work with the data daily are not, then as soon as that ally is promoted, retires, or otherwise leaves the organization, the resistance returns.
Ultimately, every manager and analyst in every agency has questions that need answers and is looking for insights into their own data. If these managers and data stewards can be shown the power of the visualization and analytics tools they will gain access to as part of the archive, they usually will understand the value of the data, conclude that it is worth the perceived risk, and share it with others. This is the "build it and they will come" approach: by showcasing the tools and capabilities that become available once operations data is archived and accessible, others will want to use the data and share it in turn. Tools can be built in-house with some risk, as described later. However, existing tools and applications developed by third parties or other agencies also can be showcased. Leveraging existing tools to showcase the power of operations data often is the quickest, cheapest, and least risky way to demonstrate capabilities, secure stakeholder buy-in, and get a process up and running in as little time as possible.
Challenges in Changing Planning Methods and Products
The use of archived operations data may require planners to adjust their technical methods and modify their products. Historically, travel demand forecasting (TDF) models have been used not only to forecast future performance but also to estimate current performance where performance measurement data were lacking. With archived data, using models to "measure" current performance is no longer necessary. Therefore, products that formerly relied on models to derive current conditions must now be converted to use measured data, and switching from models to data for these applications requires planners to develop a strategy. Some of the issues that need to be addressed include:
United States Department of Transportation - Federal Highway Administration