Office of Operations
21st Century Operations Using 21st Century Technologies

Data Quality White Paper

1.0 Introduction

Traffic data collected for transportation operations and management are becoming an increasingly valuable asset. Intelligent Transportation Systems (ITS) technologies have generated large volumes of traffic data in recent years, and these data are widely used to manage system operations and to provide information on traffic conditions. However, public and private users are finding the data increasingly difficult to use because they are collected at different levels of accuracy and resolution and are stored in incompatible formats. The problem worsens as the amount of data continues to grow. Data quality problems in collection, operation, and management efforts have led to underutilization of data and increased utilization costs. Recent research efforts have identified various problems with the quality of data for transportation operations, planning, traffic congestion information, transit and emergency vehicle management, and commercial truck operations [Battelle, Traffic Data Quality Workshop Proceedings and Action Plan. 2003, Prepared for Federal Highway Administration: Washington D.C.; Battelle, Traffic Data Quality Measurement. 2004, Prepared for Federal Highway Administration: Washington D.C.; Quiroga, C., K. Hamad, R. Brydia, R. Rajbhandari, R. Benz, and S. Sunkari, Transportation Operations Data Needs and Recommendations for Implementation. 2007, Prepared for Texas Department of Transportation and the Federal Highway Administration: Austin, Texas; Turner, S., Defining and Measuring Traffic Data Quality: White Paper. 2002, Prepared for Federal Highway Administration: Washington D.C.; Turner, S., Quality Control Procedures for Archived Operations Traffic Data: Synthesis of Practice and Recommendations. 2007, Prepared for Federal Highway Administration: Washington D.C.; Turner, S., Defining and Measuring Traffic Data Quality: White Paper on Recommended Approaches. Transportation Research Record, 2004(1870): p. 8.].

Data quality has been questioned since the earliest stages of traffic data collection. Because ITS applications and travel information systems each have unique data requirements, the matter of data quality has become more urgent. In the last few years, the issue has become more complex still due to the emergence of private services that provide traffic information to the public. Turner [Turner, S., Defining and Measuring Traffic Data Quality: White Paper on Recommended Approaches. Transportation Research Record, 2004(1870): p. 8.] defined data quality as "the fitness of data for all purposes that required it. Measuring data quality requires an understanding of all intended purposes for that data". Traffic data have different meanings to different consumers, so the intended uses of the data should be considered and understood when designing, implementing, and operating data collection systems and applications.

Traditional data collection systems may not deliver the data quality that state-of-the-art transportation applications require. There is an urgent need to consider specific data quality measures for each traffic data application. This paper investigates data quality measures for transportation data and presents an overview of the requirements for the implementation of a real-time information program.

1.1 Background

Section 1201 of the Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users (SAFETEA-LU) requires the Secretary of Transportation to establish a Real-Time System Management Information Program (RTSMIP) to provide, in all states, the capability to monitor, in real-time, the traffic and travel conditions of the major highways of the United States and to share that information to improve the security of the surface transportation system, to address congestion problems, to support improved response to weather events and surface transportation incidents, and to facilitate national and regional highway traveler information. Section 1201 also requires the establishment of data exchange formats to facilitate the exchange of information.

The purposes of the RTSMIP are to (1) establish, in all States, a system of basic real-time information for managing and operating the surface transportation system, (2) identify longer range real-time highway and transit monitoring needs and develop plans and strategies for meeting the needs, and (3) provide the capability and means to share the data with State and local governments and the traveling public. RTSMIP will provide the capability to monitor the real-time traffic and travel conditions of the major U.S. highways. Furthermore, RTSMIP will share that information to improve surface transportation system security, address congestion, improve response to weather events and incidents, and facilitate national and regional highway traveler information.

The proposed program requires establishing minimum parameters and requirements for States to make traffic and travel condition information available and to share it via real-time information programs. It also involves general uniformity among the real-time information programs to ensure consistent service to travelers and other agencies. The Federal Highway Administration developed information sharing specifications and data exchange formats to accelerate the sharing of traffic and travel condition information. Interim guidance was published to engage the transportation community on an appropriate course of action to simplify the exchange of real-time information program content.

Satisfying data quality and data accuracy requirements is a key step in the implementation of real-time information programs. Data quality should be considered before developing congestion management and traveler information system applications that rely upon data from various sources. Specifically, the data quality requirements should be defined for each application. For example, some applications, such as High Occupancy Toll (HOT) operations and other congestion and value pricing applications, require higher accuracy and more rapid availability of data than other applications.
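The idea of application-specific requirements can be illustrated with a minimal sketch. It assumes that each application states a maximum tolerable error and a maximum tolerable delay, and that a data source is checked against those limits. The class names, the function name, and every threshold value below are hypothetical and chosen only for illustration; they are not taken from the interim guidance or from any cited report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityRequirement:
    """Minimum data quality an application needs (illustrative only)."""
    max_error_pct: float   # largest tolerable measurement error, in percent
    max_latency_s: float   # longest tolerable delay before data are available


@dataclass(frozen=True)
class FeedQuality:
    """Measured quality of one data source (illustrative only)."""
    error_pct: float
    latency_s: float


# Hypothetical thresholds: the only point is that a pricing application
# is stricter than general traveler information.
REQUIREMENTS = {
    "hot_lane_pricing": QualityRequirement(max_error_pct=5.0, max_latency_s=60.0),
    "traveler_information": QualityRequirement(max_error_pct=15.0, max_latency_s=300.0),
}


def meets_requirement(feed: FeedQuality, application: str) -> bool:
    """Return True if the feed satisfies both limits for the application."""
    req = REQUIREMENTS[application]
    return feed.error_pct <= req.max_error_pct and feed.latency_s <= req.max_latency_s


if __name__ == "__main__":
    # The same feed can be adequate for one application and not for another.
    feed = FeedQuality(error_pct=10.0, latency_s=120.0)
    for application in REQUIREMENTS:
        print(application, meets_requirement(feed, application))
```

Under these assumed thresholds, the same feed would be acceptable for traveler information but not for HOT lane pricing, which is why requirements are best stated per application rather than once for the whole system.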

1.2 Project Objective and Scope

The objective of this paper is to investigate the data quality measures and how they are applied in existing systems. This paper explores the relevance of the data quality measures that were defined in a report entitled "Traffic Data Quality Measures" and presents an overview of the requirements for the implementation of a real-time information program.

Specifically, this paper focuses on the real-time travel information applications within six primary interfaces (traffic management information, maintenance and construction management, transit management and information, information service provider information, parking information, and emergency management information) and their associated applications, as identified in the publication "Interim Guidance on the Information Sharing Specifications and Data Exchange Formats for the Real-Time System Management Information Program".

1.3 Organization of Paper

This paper is organized into six chapters. The second chapter provides a review of previous studies on data quality measures. The third chapter presents an overview of the utilization of data quality measures in public and private sectors. Chapter four discusses the data quality measures for real-time travel information applications. Chapter five investigates the proposed real-time information program. Finally, chapter six provides a summary of the findings, the conclusions of the research effort, and recommendations for further research.