Considerations of Current and Emerging Transportation Management Center Data

Chapter 2. Emerging Data Sources

The private sector is constantly evolving and innovating—trying to create the next data product or service that will give businesses a competitive advantage. While it is difficult to predict the future, this section includes data that is becoming available now or will likely become available in the next several years. Not every private sector provider wants to reveal its anticipated products or schedules, but table 1 provides a list of newer and emerging data sources that not all agencies have used yet.

Table 1. Newer and emerging data sources.
Data	Potential Planning/Decisionmaking Use
Crowdsourced¹ Incident and Congestion Data	Crowdsourced incident and event data can help agencies quickly identify a wide range of safety and congestion issues on the roadway without having to invest in additional sensors, cameras, or other costly data from third parties. Preliminary studies have shown Waze crowdsourced data to be comparable in speed and quality to public safety computer-aided dispatch data and particularly useful in rural areas, where intelligent transportation system coverage is usually limited. Agencies are beginning to leverage this data as a supplemental event/incident data source.
Roadside BSM Collection	Some modern roadside equipment can now collect basic safety messages (BSM) sent from equipped vehicles. While the percentage of vehicles equipped to broadcast BSM is currently very low, this data has the ability to eventually be highly useful at signalized intersections, work zones, and other locations where split-second decisions have safety implications. Planners, policymakers, and engineers can mine this data (with other datasets) to understand driver behavior and to identify roadway safety improvements.
Realtime Trajectory Data	Trajectory data is similar to origin-destination (O-D) data, except that it also provides waypoints (breadcrumb trails). For any trip, data is being continually relayed back to the operations center or advanced traffic management system platform that tells operators which routes travelers are taking to get to their destinations, how fast they are moving, and if that route is normal or abnormal for that type of vehicle, time of day, etc. Operators can use the data to conduct after-action reviews for significant events to understand the impacts of events and road closures. When archived, this data can augment or completely replace traditional trip and O-D studies used by planners and modelers.
Crowdsourced Mapping Data	Directional, navigable mapping data created and edited by the public can be used to supplement (and sometimes improve upon) State centerline files. When used in a planning environment, this data can update maps more quickly, often faster than commercial providers, which typically update their maps quarterly, with a significant lag between when new roads are completed and when they are added to the system.
Probe-Based Speed Data	Speed and travel time data from vehicles using navigation systems aids in the study of congestion trends, identification of problem locations, before and after evaluations, and project prioritization.
High-Resolution Map Data and Other Asset Management Systems	Companies are now touting their high-resolution mapping data, which includes extremely detailed information (down to the centimeter accuracy) about where curbs are located, how high the curbs are, where road markings are located, and where and what is on a road sign. Planners and policy makers use this data to understand infrastructure degradation, life cycle, and maintenance needs over time.
Wi-Fi/Bluetooth Re-identification	Installation of Wi-Fi and Bluetooth equipment at intersections or other decision points helps to better understand travel times and travel patterns on key corridors and arterials. The resulting data is a point-to-point travel time between the two sensors that is often more accurate than traditional sensors and/or probe data.
Credit Card Transactions (from Point-of-Sale Vendors)	Point-of-sale vendors are starting to sell data related to where credit card transactions are taking place, the commodities purchased, by whom, and more. This data can be used in realtime by transportation management centers (TMCs) to understand travel patterns, deviations from normal travel patterns, and during and after emergencies (like hurricanes and snow storms) as a surrogate measure of power failures and when businesses are open or closed. Planners can also archive the data and use it as a surrogate for origin-destination studies, trip analytics, and activity-based models.
Connected Vehicle Data from Telematics Providers	This data includes direct-from-vehicle measures and warnings, such as heavy-braking events, traction-control engagement, wiper use, emissions data, temperature data, rollover and/or collision data, and even seatbelt use (in commercial vehicles). This data is available today from telematics providers in over 5 million vehicles in the United States. This data can be used by TMCs to supplement existing incident detection systems, to alert operators about adverse weather conditions such as slippery roads, and as a pre-event warning system. The data can then be archived and used by planners for many activities, including safety studies, congestion analytics, performance measures, emissions studies, and as a surrogate for ground-based weather data.
Realtime Turning Movement Data	When operators respond to incidents or lane closures, realtime turning movement data can help them understand what percentage of vehicles are taking alternate routes and the effectiveness that different communication strategies have in changing travel behavior. When archived, this data supplements trip surveys and can inform long-term planning decisions.
High-Resolution Signal Data	Some signal manufacturers are beginning to equip signal systems with communications and logging equipment that collects and distributes highly precise signal phase and timing (SPaT) data, actuator data, and more. High-frequency collection of this information can facilitate signal-retiming efforts, determine where congestion is occurring, and provide a better understanding of the impacts of signal retiming.
Air Quality Sensors	Cities are working with the private sector to deploy air quality sensors to allow management of traffic to minimize congestion, increase safety, and improve air quality.
Roadway Weather Predictions	While basic National Weather Service prediction data has been available for many years, these predictions cover wide areas and tend to focus more on air temperature and precipitation averaged over a region. Several companies now offer ground-based (i.e., at the street level) 48-hour weather predictions that are updated every hour. This information can be used to optimize winter weather response operations; improve snow event readiness; reduce staffing, fuel, and chemical costs; pinpoint treatment applications; and generally keep the roads safer and less congested.
Computer Aided Dispatch Data	Computer-aided dispatch data is generated by public safety agencies as part of their call intake and dispatch operations. This data can be used in realtime by TMCs to improve incident awareness or to inform them of incidents they would not have been apprised of otherwise. This allows TMCs to respond to incidents more efficiently and effectively and reduces clearance time and the potential for secondary incidents.

¹Crowdsourced incident and congestion data is any data willingly and intentionally generated and reported by the public who receives something in return for their contributions. For the purposes of this document, it refers to travelers reporting incidents and congestion in realtime as they come across it using smartphone apps or similar technology.

Examples of companies that supply the data in table 1 are included in table 2 along with the sources of their data.

Table 2. Where the private sector obtains its data.
Data Provider	Sources Of Data
Speed Data	Global Positioning System (GPS)-equipped vehicles and cell phones. Fleets of trucking companies with telematics. Cell phone navigation users. Embedded vehicle navigation systems. Apps installed on cell phones. Privately owned sensors installed on the agency right-of-way.
Crowdsourced Congestion and Incident Data	Waze application and Google/Android navigation app users. Public data inputs.
Fleet Management Data	Telematics equipment installed on vehicles rolling off the assembly line and aftermarket installations of equipment.
Origin-Destination Data	Location-based services on cell-phone applications. Credit card point-of-sale machines. Breadcrumb trails from navigation apps. Telematics providers.
Crowdsourced Map Data	Crowdsourced from a community of public users.
Ride-Sharing Location Data	Their own applications installed on driver and rider cell phones.
Mapping Data	Vehicles equipped with 360-degree cameras, GPS, light detection and ranging (LiDAR), and other sensors and data collection equipment.
Parking Data	Third-party parking management systems, vehicle fleets with cameras/sensors.
Incident Data	Curated from their own traffic management systems, media data feeds, agency data feeds, computer-aided dispatch (CAD) feeds, crowdsourced, etc.
Road Weather Data	Specialized and hyperlocal weather conditions and predictions are produced from a mix of satellites, ground-based radar, ground sensors, and more. Specialized algorithms and data processing then produce more accurate ground-based weather predictions. Operators use this to understand the weather as expected and then experienced by the driver.
Law Enforcement or Emergency Services CAD	Third parties (like Motorola, Hexagon, TriTech, etc.) typically provide software CAD solutions to the law enforcement community, and the data within the CAD is entered directly by dispatchers using a keyboard, mouse, and/or touch screen.

Differences and Similarities in Collection, Processing, and Aggregating Data

The public sector has historically relied on deploying physical infrastructure on public rights of way to collect data to support transportation operations. Traditional intelligent transportation systems (ITS) equipment such as sensors, vehicle counters, and cameras allowed agencies to get a better understanding of system conditions. Agencies were responsible for the procurement and installation of these sensors along with ongoing maintenance. This was both a blessing and a curse. Agencies were accustomed to procuring physical things—things that could be owned, tagged, inventoried, and held. Once installed, the agency was fully responsible for care and maintenance, including routine calibration, physical repair, power, communications, etc. If maintained and calibrated, these sensors and devices could provide relatively high-quality data, but only in areas where they were deployed.

The private sector, with rare exception,¹ has been limited in its ability to deploy its own equipment on the public right-of-way. Seeing opportunities in the Internet of Things and wireless technologies, the private sector instead focused on these emerging technologies and the opportunities they presented to potentially add value to agency operations.

As wireless and mobile technology continued to develop and smartphone penetration skyrocketed, the private sector was able to develop new ways of obtaining data of a quality similar to that of public agencies, but on a larger scale, covering most of the roads in the Nation, and doing so without deploying much (if any) infrastructure.

Table 3 summarizes the operations-related uses for each data type and how public sector agencies could obtain the data or an approximation of that data without purchasing it directly from a third party.

Table 3. Public sector alternatives to obtaining third-party data.
Data	Data Uses and Public Sector Data Alternatives
Credit Card Transactions	Uses: For origin-destination (O-D) studies, understanding whether businesses are open or closed, and determining which businesses have electricity. Public Sector Alternative: Agencies would need to conduct surveys or review census data for O-D studies. Power failure data can be collected from utilities if they are willing to share that information.
Third-Party Connected Vehicle Data	Uses: For event detection, warning of potential safety hazards, weather conditions, and more. Public Sector Alternative: While basic safety message (BSM) data can be obtained from equipped vehicles, agencies still need to install dedicated roadside equipment to collect and send the BSM data back to an operations center. This means that agencies are still responsible for maintaining physical devices and that data can only be collected in specific locations where roadside units are installed. Additionally, the BSM data is fairly limited in scope compared to what third parties can pull directly from the vehicle at any location on the roadway and transmit back to a central location via a cellular connection.
Realtime Trajectory Data	Uses: Realtime route and detour analysis, evacuation monitoring, signal performance measures, O-D analysis, after-action reviews, etc. Public Sector Alternative: Agencies would need to conduct surveys or review census data for O-D studies. Floating car runs have been used by many agencies to replicate trajectory data; however, floating car runs are not representative of traveler route choices, and they are almost never done in realtime. Aerial photography studies have been purchased by agencies and used for some of these applications; however, aerial studies only cover a limited geography, and the resulting data is not provided to the agency in realtime.
Realtime Turning Movement Data	Uses: Detour and alternate route adherence, traffic signal performance measures. Public Sector Alternative: The public sector can only collect realtime turning movement data through the deployment of large quantities of sensors (inductive loops, side-fired microwave/radar), license plate recognition, toll-tag readers, high-definition signal controller data, and/or Bluetooth/Wi-Fi re-identification sensors.
Wi-Fi/Bluetooth Re-identification	Uses: Travel times, traveler information, route selection, signal performance measures. Public Sector Alternatives: Unlike many of the other data sources in this list, Bluetooth re-identification sensors are equipment that agencies can still procure and install by themselves at signals and other locations along the roadside. This data is similar to probe-based speed data, toll-tag speed data, and license plate recognition data. While most companies sell the sensors directly to the agency for installation, some companies also rent equipment or provide Bluetooth data as a service.
High-Resolution Signal Data	Uses: Automated traffic signal performance measures (ATSPM), signal operations, signal retiming, volume counts, turning movement analysis. Public Sector Alternatives: With some exceptions, the private sector typically owns and aggregates the data that is collected directly from the traffic controller; however, some companies have leveraged the open source ATSPM modules in proprietary systems that are sold back to agencies with enhanced user interfaces and data management services. Other companies sell aftermarket products that can be installed in signal cabinets that collect and aggregate this data in a cloud for resale back to agencies.
Roadside BSM Collection	Uses: Intersection safety, work zone safety, event detection, warning of other potential safety hazards, weather conditions, and more. Public Sector Alternatives: While BSM data can be obtained from equipped vehicles, agencies still need to install dedicated roadside equipment to collect and send the BSM data back to an operations center. This means that agencies are still responsible for maintaining physical devices and that data can only be collected in specific locations where roadside units are installed.
Crowdsourced² Incident and Congestion Data	Uses: Incident/event detection, traveler information. Public Sector Alternatives: Agencies have historically gathered event/incident information from the monitoring of closed-circuit television (CCTV) feeds, listening to radio systems, interfacing with law enforcement computer-aided dispatch (CAD), and/or service patrols. While many of these existing detection sources are considered superior to crowdsourced data for major events/crashes, smaller events (like disabled vehicles, debris, etc.) are typically better sourced via the crowd and cover a wider geographic area than what a state transportation agency might normally cover.
Crowdsourced Mapping Data	Uses: Base-mapping for traveler information, asset management/asset identification, routing. Public Sector Alternatives: Almost every agency is responsible for mapping roads already. However, many agencies only produce centerline map files. For these agencies, some crowdsourced mapping products, though not authoritative, may actually have better map attributes, assets, and even be more up to date.
Probe-Based Speed Data	Uses: Traveler information, congestion analytics, performance reporting, problem identification, before-and-after studies, etc. Public Sector Alternatives: Some agencies do collect probe-based speed data through the deployment of toll-tag readers, license plate re-identification, Bluetooth re-identification, etc. Other agencies have attempted to mimic third party probe data by outfitting maintenance vehicles (or transit vehicles) with Global Positioning System equipment. However, no agency is currently able to collect speed/travel time data at the same scale and with the same number of probes as the private sector can.
High-Resolution Map Data (LiDAR or Similar) and Other Asset Management Systems	Uses: Asset management, precision navigation, identification of maintenance issues. Public Sector Alternatives: A few agencies have purchased their own light detection and ranging (LiDAR) equipment and are collecting and storing point-cloud data on their own. The private sector, however, is providing services that collect and manage the point cloud data on behalf of the agency, and distill the data down to more manageable asset attribute information, like the location of signs, reflectivity, pavement conditions, etc.
Roadway Weather Predictions	Uses: Understanding expected road temperature, moisture, snow cover, wind speed, visibility, etc. Public Sector Alternatives: While many agencies have deployed road weather information system (RWIS) stations, these stations only provide current weather conditions and only directly at the location of the RWIS station. Only a couple of agencies have the luxury of an in-house meteorologist on staff that can help predict weather for the agency. Even these dedicated meteorologists can struggle with producing surface weather predictions for all road segments.
CAD Data	Uses: Faster incident detection, incident detection on roads that are not normally covered by CCTV or the agency, understanding who has been dispatched to an event, assessing the severity of the event. Public Sector Alternatives: CAD is typically generated by public agencies; however, it is usually managed and owned by public safety instead of transportation. The closest equivalent to CAD that is already owned by transportation agencies is event and notification data from advanced traffic management systems.

²Crowdsourced incident and congestion data is any data willingly and intentionally generated and reported by the public who receives something in return for their contributions. For the purposes of this document, it refers to travelers reporting incidents and congestion in realtime as they come across it using smartphone apps or similar technology.
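
As a rough illustration of the Wi-Fi/Bluetooth re-identification approach listed in table 3, the sketch below pairs detections of the same anonymized device identifier at two sensors to estimate a point-to-point travel time. The function name, the tuple layout, the sample detections, and the 30-minute cutoff are hypothetical choices made for illustration; they do not describe any vendor's product or matching rules.

```python
from datetime import datetime
from statistics import median


def match_travel_times(upstream, downstream, max_seconds=1800):
    """Return travel times (seconds) for devices seen at both sensors.

    upstream, downstream: lists of (device_id, datetime) detections.
    max_seconds is a hypothetical cutoff that discards devices that
    stopped along the way or belong to a much earlier trip.
    """
    earliest = {}
    for device_id, seen_at in upstream:
        # Keep only the earliest upstream detection of each device.
        if device_id not in earliest or seen_at < earliest[device_id]:
            earliest[device_id] = seen_at

    travel_times = []
    for device_id, seen_at in sorted(downstream, key=lambda d: d[1]):
        start = earliest.get(device_id)
        if start is None:
            continue  # device was never seen at the upstream sensor
        elapsed = (seen_at - start).total_seconds()
        if 0 < elapsed <= max_seconds:
            travel_times.append(elapsed)
            del earliest[device_id]  # count each device only once
    return travel_times


# Made-up detections for illustration only.
upstream = [
    ("a1", datetime(2019, 3, 4, 8, 0, 0)),
    ("b2", datetime(2019, 3, 4, 8, 0, 30)),
    ("c3", datetime(2019, 3, 4, 8, 1, 0)),
]
downstream = [
    ("a1", datetime(2019, 3, 4, 8, 4, 0)),
    ("b2", datetime(2019, 3, 4, 8, 5, 30)),
    ("zz", datetime(2019, 3, 4, 8, 5, 0)),
]
times = match_travel_times(upstream, downstream)
print(times, median(times))
```

Reporting the median rather than the mean is one common way to reduce the influence of a few slow devices (for example, pedestrians carrying phones), although an agency may prefer a different summary statistic.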

Processing and Analytics Differences

Public agencies have kept pace with the latest relational database storage and analytics capabilities necessary to make effective use of existing agency data, such as volume and speed sensor data and advanced traffic management system (ATMS) event records. For example, a large State agency might deploy 15,000+ traffic flow sensors on its highways that record volume and speed data every 20 seconds. Most agencies can easily handle this level of data storage. Even though this seems like a significant amount of data, it comes nowhere close to the quantity of data produced by the tracking of Global Positioning System (GPS)-equipped cellular phones, light detection and ranging (LiDAR), or other connected vehicle (CV) applications.
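
To put the sensor example in concrete terms, the short calculation below works out how many readings 15,000 sensors reporting every 20 seconds would produce. It assumes every sensor reports continuously with no gaps, which is a simplification; the function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_readings(sensors, interval_seconds):
    """Readings per day if every sensor reports on schedule, all day."""
    return sensors * (SECONDS_PER_DAY // interval_seconds)


per_day = daily_readings(15_000, 20)   # 15,000 sensors x 4,320 readings each
per_year = per_day * 365
print(f"{per_day:,} readings per day, {per_year:,} per year")
```

Under these assumptions the deployment yields roughly 65 million readings per day, a volume that a conventional relational database can store and index without unusual effort.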

As the public sector continued to invest in relational database technologies to store its sensor and ATMS data, the private sector had to fast track its own, newer storage and analytics capabilities to keep up with development and demands of the wireless and mobile markets, which were creating massive streaming datasets that would not fit into existing relational database management systems. This meant the private sector needed to develop big data analysis and storage capabilities well beyond those of most public sector agencies.

Now that public agencies are tapping into these big-data streams, some are finding that their own relational databases are ill-equipped to process, host, and leverage the data to its fullest potential. Agencies are faced with several choices: 1) adopt or develop the same big data capabilities as the private sector and invest in hardware storage or cloud services, 2) outsource the hosting of these datasets to the private sector or other third parties, or 3) not attempt to ingest raw data, but instead leverage application programming interfaces (APIs) or data summary and insights services from the data providers without the need to download and manage large datasets.

The remainder of this chapter will describe these datasets in more detail—giving an overview of potential applications, data providers, pros and cons, etc. Later chapters will discuss business models, the value of agency data, and contracting guidance.

Crowdsourced Incident and Congestion Data

Description

Crowdsourced incident and congestion data is any such data willingly and intentionally generated and reported by members of the public who get something in return for their contributions. Crowdsourced transportation data does not include probe-based data. Instead, crowdsourced data refers to travelers reporting incidents and congestion in realtime as they come across it using smartphone apps or similar technology.

Applications of Crowdsourced Incident and Congestion Data

Many applications can use crowdsourced incident and congestion data in the same way as agency incident and congestion data. For example, crowdsourced data can help manage traffic in realtime by providing TMCs with awareness of new incidents and congested spots. TMCs can dispatch field units to incident scenes more quickly, or implement congestion mitigation strategies in response to congestion reports. Beyond realtime operations, crowdsourced data can also be archived and used in planning and performance management efforts. Table 4 shows a few agencies that are making use of crowdsourced data and applications today.

Table 4. Sample list of agencies using crowdsourced data.
Data	Application	Peer Contact
Waze Event Data	Early detection of hazards and events.	Massachusetts and Florida Departments of Transportation (DOTs).
Waze Navigation Guidance	Agency "pushes" road closures, detour routes, or preferred routes to the crowd to control and influence traffic.	Port Authority of New York and New Jersey.
SeeClickFix Data or Other Citizen Reporting Apps	Early detection of maintenance issues.	Utah and New Hampshire DOTs.
Twitter Messages	Pre-event planning. Pre-event warnings. Social sentiment analysis.	Metropolitan Area Transportation Operations Coordination Program (National Capital Region), Iowa and D.C. DOTs.
911 Phone Calls	Event detection, responder deployment, and quick clearance support.	California, Virginia, and Wisconsin DOTs.

General Attributes

Latency: Generally, crowdsourced incident and congestion data has a low latency. Because there is little to no verification, data can be available the moment travelers enter it. Waze data supports the ability to identify disabled vehicles, debris, smaller incidents, and incidents off State-monitored roadways significantly faster than public agencies or first responders.
Details: Crowdsourced data may be less detailed than data agencies generate because travelers are asked to enter only basic information associated with an incident or congestion in an effort to balance the benefit of useful user-generated information with the safety of the app users (i.e., to avoid distracted driving).
Quality and Coverage: The number of participants in the crowdsourcing effort relates directly to penetration, quality, and coverage of crowdsourced data. Due to this association, crowdsourced data generally seems to be more prevalent and accurate in dense urban areas, highly traveled corridors, and regions with a high level of technology acceptance.

This diagram shows examples of four Waze event types (Weather/Hazard, Accident, Road Closure, and Jams), and subtypes listed below each category. Weather/Hazard subtypes listed are: Construction, Pot holes, Car on shoulder, Car stopped on road, object in road, weather conditions (fog, flood, hail, rain, etc.), animals on shoulder, road kill, and hazard on shoulder. Accident subtypes listed are: major and minor. Road Closure subtype is listed as road closures. Jams subtypes listed are: moderate traffic, heavy traffic, stand-still traffic, and light traffic.

Figure 2. Diagram. Examples of Waze event types and subtypes.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

Data Availability

There are multiple platforms that TMCs can leverage to glean data from the crowd. Some platforms are focused on transportation while others are primarily social networks. Some agencies mine data from social networks by evaluating the contents of posted messages and finding key indicators of traffic-related information. For example, some agencies use machine learning algorithms to recognize typical message constructs that may identify an incident. These algorithms can assign a level of confidence to the results by comparing the location information, time, and other attributes across other social networks and agency systems. Other agencies simply monitor social media messages from trusted partners or users (like the media, public safety departments, etc.).
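
The cross-checking idea described above can be sketched in a few lines. The example below does not reproduce any agency's algorithm; it simply counts how many other sources reported something close in space and time to a candidate report. The dictionary keys, source labels, and the 1-kilometer and 15-minute thresholds are all hypothetical.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def corroborating_sources(candidate, others, max_km=1.0, max_minutes=15):
    """Return the set of other sources with a nearby, near-simultaneous report."""
    window = timedelta(minutes=max_minutes)
    sources = set()
    for report in others:
        if report["source"] == candidate["source"]:
            continue  # a source cannot corroborate itself
        near_in_time = abs(report["time"] - candidate["time"]) <= window
        near_in_space = haversine_km(
            candidate["lat"], candidate["lon"], report["lat"], report["lon"]
        ) <= max_km
        if near_in_time and near_in_space:
            sources.add(report["source"])
    return sources


# Made-up reports for illustration only.
candidate = {"source": "social", "lat": 38.900, "lon": -77.030,
             "time": datetime(2019, 3, 4, 8, 0)}
others = [
    {"source": "atms", "lat": 38.901, "lon": -77.031, "time": datetime(2019, 3, 4, 8, 5)},
    {"source": "cad", "lat": 38.950, "lon": -77.030, "time": datetime(2019, 3, 4, 8, 2)},
    {"source": "social", "lat": 38.900, "lon": -77.030, "time": datetime(2019, 3, 4, 8, 1)},
]
print(corroborating_sources(candidate, others))
```

A simple confidence level could then be derived from the number of corroborating sources, with a message text classifier (not shown) supplying the candidate reports in the first place.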

Many States and municipalities are starting to encourage their citizens (the crowd) to use apps that let them photograph maintenance issues (like potholes, damaged lights, signal issues, curb/sidewalk issues, etc.) and send the geo-located photos, along with attributes and other details, to the agency. The citizen reports can be managed in a database, which is turning into a "ticketing system" for many smaller agencies.

Pros and Cons

As shown in figure 3, crowdsourced data tends to provide broader coverage than individual agencies may be able to afford with limited resources and jurisdictional responsibilities. Crowdsourced data may be generated anywhere people travel, encounter issues, and have internet access. This means users can report on interstates, State and county roads, local roads, neighborhoods, etc.

This figure shows a map of Virginia. Shaded areas represent events reported through Waze. Boxes highlight two areas where Waze events provide more data than Virginia DOT events.

a) Waze events.

This figure shows a map of Virginia. Shaded areas represent events reported through the DOT. Boxes highlight two areas where Waze events provide more data than Virginia DOT events.

b) Virginia Department of Transportation events.

© 2018 I-95 Corridor Coalition

Figure 3. Screenshots. Red boxes highlight the coverage differences in the Waze data versus the Virginia Department of Transportation data.
Source: I-95 Corridor Coalition. n.d. Closing the Realtime Data Gaps Using Crowd-Sourced Waze Event Data. Unpublished technical report.

Crowdsourced data can be timelier than agency-generated data because travelers are more likely to come across an incident or encounter congestion than an agency is to detect it via its sensors, cameras, and field patrols. A recent I-95 Corridor Coalition (ICC) study compared the amount of time it takes for events to appear in the Waze data feed with the amount of time it takes the agency to identify the incident and put it into its ATMS or 511 platform.² Agencies usually detect major collisions on interstates sooner than Waze. However, Waze identifies smaller events, debris, and disabled vehicles before the ATMS, sometimes by 15 minutes or more. These smaller events do have the potential to become larger or cause secondary incidents if not dealt with in a timely manner. Table 5 shows the results of the ICC study, indicating the average length of time Waze reported an event before the DOT ATMS reported it when both reported the same event (includes events in California, Florida, and Virginia).

Table 5. Comparison of Waze versus department of transportation event reporting.
Type of Event	Avg. Time a Waze Event was Reported Before DOT Reporting	Percentage of All Waze Events that Were Included in the DOT's ATMS Logs
Freeways/Ramps Crashes	3 Minutes	40 percent
Primary/Secondary Crashes	3 Minutes	12 percent
Freeways/Ramps Disabled Vehicles	14 Minutes	37 percent
Primary/Secondary Disabled Vehicles	16 Minutes	4 percent

Source: I-95 Corridor Coalition. n.d. Closing the Realtime Data Gaps Using Crowd-Sourced Waze Event Data. Unpublished technical report.
ATMS = advanced traffic management system. DOT = department of transportation.
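
The two measures in table 5 can be computed from matched event records along the following lines. This is only a sketch of the arithmetic, not the ICC study's method: the function names are hypothetical, and the timestamps and counts are invented for illustration and are not study data.

```python
from datetime import datetime
from statistics import mean


def average_lead_minutes(matched_events):
    """Average minutes by which the Waze report preceded the ATMS entry.

    matched_events: list of (waze_time, atms_time) pairs for events
    that appear in both sources. A negative result would mean the
    ATMS entry came first on average.
    """
    return mean(
        (atms_time - waze_time).total_seconds() / 60
        for waze_time, atms_time in matched_events
    )


def percent_in_atms(matched_count, total_waze_events):
    """Share of all Waze events that also appear in the ATMS log."""
    return 100 * matched_count / total_waze_events


# Made-up matched events for illustration only.
matched = [
    (datetime(2019, 3, 4, 8, 0), datetime(2019, 3, 4, 8, 14)),
    (datetime(2019, 3, 4, 9, 10), datetime(2019, 3, 4, 9, 26)),
    (datetime(2019, 3, 4, 10, 0), datetime(2019, 3, 4, 10, 12)),
]
print(average_lead_minutes(matched))      # average lead in minutes
print(percent_in_atms(len(matched), 8))   # 3 of 8 Waze events were matched
```

Deciding which Waze report and which ATMS entry describe the same event is the harder part of such a comparison and is not shown here.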

As a general observation, official entities do not verify crowdsourced data because any user can report anything. Crowdsourced data providers usually have built-in mechanisms for a "self-moderating" community, where other travelers can confirm or reject reports, therefore providing some level of confidence that the reported incident or congestion is real. As previously noted, Waze data feeds provide confidence and reliability scores. Confidence scores range from 0 to 10 and measure how other drivers react to the report (a higher score indicates positive feedback from other Waze users). Reliability scores measure the experience level of the person reporting the event. The more a user contributes and the more they receive positive feedback from other Waze users, the higher their score. Many agencies use a combination of the confidence and reliability scores to determine whether it is worth acting on a particular event. Figure 4 shows how these scores are displayed and filtered in one traffic management system.
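
A combined score filter of the kind described above might look like the following minimal sketch. The record layout, the key names "confidence" and "reliability", and both threshold values are assumptions made for illustration; an agency would choose its own cutoffs and should check field names against the feed documentation it receives.

```python
def actionable_alerts(alerts, min_confidence=3, min_reliability=6):
    """Keep only alerts that meet both (hypothetical) score thresholds.

    Alerts missing either score are treated as scoring 0 and dropped.
    """
    return [
        alert
        for alert in alerts
        if alert.get("confidence", 0) >= min_confidence
        and alert.get("reliability", 0) >= min_reliability
    ]


# Made-up alerts for illustration only.
alerts = [
    {"type": "Accident", "confidence": 5, "reliability": 8},
    {"type": "Jams", "confidence": 0, "reliability": 5},
    {"type": "Weather/Hazard", "confidence": 4, "reliability": 6},
    {"type": "Road Closure", "reliability": 9},  # no confidence score
]
kept = actionable_alerts(alerts)
print([alert["type"] for alert in kept])
```

Requiring both scores to pass is the strictest combination; an agency could instead accept an alert when either score is high, or weight the two scores differently by event type.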

This screenshot shows the University of Maryland Center for Advanced Transportation Technology Laboratory Regional Integrated Transportation Information System (RITIS) Website. The screen depicts Transportation System Status Traffic map tab. The user may filter the map data displayed based on Options for Waze Incidents and Events, Incidents and Events sub-layers, and a Hide layer list. When selecting an incident or event, a dialog box shows the type of incidents.

© 2019 University of Maryland Center for Advanced Transportation Technology Laboratory

Figure 4. Screenshot. Many agencies filter out Waze data that have a lower reliability level.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

Use Cases for Crowdsourced Incident and Congestion Data

Use Case: Regional Integrated Transportation Information System (RITIS) Waze data integration. RITIS ingests realtime Waze data for the entire country. State DOTs and other public agencies use RITIS to fuse data from multiple providers into a single, common operational picture, or as a data aggregator of third-party data feeds like Waze. Formatting Waze and other third-party data into standardized data feeds enables incorporation of data back into a realtime ATMS platform. Figure 5 shows the RITIS map with Waze data overlaid. Users are able to filter the display of Waze events based on the reliability score. Several agencies then access the RITIS data application programming interface and re-ingest the Waze data (along with other third-party incident data feeds) in a single stream.

© 2019 University of Maryland Center for Advanced Transportation Technology Laboratory

Figure 5. Screenshot. Realtime Waze data integrated into the Regional Integrated Transportation Information System platform.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

The Waze smartphone app screen consists of several icons indicating different types of events (traffic, police, crash, hazard, map chat, map issue, place roadside help, camera, and closure). Clicking an icon sends a report on the specific incident type. Copyright Waze, 2019.

Source: Waze.
Figure 6. Screenshot. Waze smartphone app.

The RITIS platform also archives Waze data at the national level. Operators can use the RITIS archived Waze data analytics module to understand where to stage field responders or plan service routes by analyzing the location of incident hot-spots by time-of-day and day-of-week.

Google-owned Waze provides a smartphone app that allows drivers to report incidents, congestion, and other traffic-related events (figure 6). Over the last several years, Waze has partnered with a number of public agencies (cities, States, and regions) across the world to provide crowdsourced data in exchange for agency-generated planned event information. Waze calls this its Connected Citizens Program (CCP).

Waze provides three primary types of notification data: alerts, jams, and irregularities. Waze provides CCP partners with extensible markup language (XML)/JavaScript Object Notation (JSON) files. Data providers and users must overcome challenges with both data size and quality to transform raw Waze data to reliable information that can help DOTs make data-driven decisions.

Data Size: Waze generates a substantial number of new notifications on a daily basis. The Center for Advanced Transportation Technology (CATT) Laboratory at the University of Maryland has been ingesting data from Waze for several years in its RITIS realtime data fusion platform. On an average day, CATT Lab's RITIS platform receives approximately 27,000 Waze events across 12 States (approximately 54,000 events if including jams). This is significantly more than the number of ingested, stored, and visualized events across all States,³ as shown in table 6. For example, Waze could result in a 400-fold increase in the number of events in Massachusetts. As such, ingesting only the desired Waze events by applying filtering will be essential to increasing the usability of the data. As shown in figure 7, the average number of Waze events is highly correlated to the vehicle miles traveled (VMT) in each State.

Table 6. Waze versus department of transportation events per day (excluding jams).
State	Avg. Waze Events per Day	Avg. DOT Events per Day
CA	283,889	3,184
DC	777	16
FL	17,210	1,895
IA	810	114
MA	5,613	14
PA	9,171	70
VA	9,168	681

Source: University of Maryland Center for Advanced Transportation Technology Laboratory.
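
The scale difference in table 6 is easier to see as a ratio. The short Python sketch below copies the table values and divides Waze events by DOT events for each State; the variable names are illustrative.

```python
# Average events per day from table 6 (jams excluded): (Waze, DOT).
events_per_day = {
    "CA": (283889, 3184),
    "DC": (777, 16),
    "FL": (17210, 1895),
    "IA": (810, 114),
    "MA": (5613, 14),
    "PA": (9171, 70),
    "VA": (9168, 681),
}

# Ratio of Waze events to DOT events for each State.
ratios = {state: waze / dot for state, (waze, dot) in events_per_day.items()}

# List States from the largest to the smallest ratio.
for state, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {ratio:.0f}x")
```

Massachusetts has the largest ratio (5,613 / 14, or roughly 400), which is the 400-fold increase cited in the text.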

Chart depicts number and types of Waze events by State.

Figure 7. Chart. Waze events and vehicle miles traveled by State (excluding jams).
Source: I-95 Corridor Coalition. n.d. Closing the Realtime Data Gaps Using Crowd-Sourced Waze Event Data. Unpublished technical report.

Data Quality: Waze data quality can suffer from redundancy, incompleteness, and unreliable information such as false-positive events.

Redundancy: Waze users occasionally report the same event using different event types and at different locations and times. Although Waze has built-in mechanisms to reduce event duplication (e.g., confidence and reliability scores), Waze data still contains duplicative events. The variation in locations and times is likely the result of a delay in entering an event into the Waze application, while the different event types may result from the actual event being unclear or from a user selecting an incorrect event type. It is essential to cluster duplicative Waze events and consolidate interchangeable event types to increase data quality. Because DOT TMCs and Waze share data, DOT events replicated within the system can cause further duplication in the Waze data. These events generally have identical spatial and/or temporal attributes, such as timestamp and location. Although this replication does not in itself degrade the quality of the Waze data, these events should be either excluded or consolidated after data ingestion.
Completeness: While Waze data frequently covers more roads and geographies than ATMS, there is no guarantee that the public/crowd will report every incident. Therefore, some incidents may go unreported, making the data incomplete. While not as significant an issue as missing data, some incidents can remain open or active for too long in the Waze data feeds after they have officially been cleared. This is true for incidents, disabled vehicles, debris, construction, etc.
Reliability: Waze data can be difficult to validate. False-positive or insignificant events may be either incorrectly reported or unverifiable when a TMC attempts to validate that the event occurred. Common examples include a vehicle pulled over on the shoulder for less than 1 minute, a fender bender with no damage to the vehicles, or a small animal identified as roadkill. Events with short durations and/or lower confidence and reliability scores are more likely to be false-positive events. Authenticated matching procedures for comparing Waze events against information received by TMCs will help organizations automate a method for validating Waze data and better understand elements associated with reliable event reports.
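
The clustering of duplicative events described under Redundancy can be sketched as follows. This is a simplified greedy approach offered as an illustration, not a method attributed to Waze or to any agency. The 500-meter and 15-minute limits, the event dictionary layout, and the CANONICAL mapping of interchangeable subtypes are all assumptions.

```python
import math

# Hypothetical mapping that treats interchangeable subtypes as one event kind.
CANONICAL = {
    "HAZARD_ON_ROAD_CAR_STOPPED": "STOPPED_VEHICLE",
    "HAZARD_ON_SHOULDER_CAR_STOPPED": "STOPPED_VEHICLE",
}

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cluster_events(events, max_dist_m=500.0, max_gap_s=900):
    """Greedy clustering: an event joins the first cluster whose seed event has
    the same canonical kind and is close in time and space; otherwise it seeds
    a new cluster. Thresholds are assumed values, not recommendations."""
    clusters = []
    for ev in sorted(events, key=lambda e: e["t"]):
        kind = CANONICAL.get(ev["subtype"], ev["subtype"])
        for cluster in clusters:
            seed = cluster[0]
            seed_kind = CANONICAL.get(seed["subtype"], seed["subtype"])
            if (kind == seed_kind
                    and ev["t"] - seed["t"] <= max_gap_s
                    and haversine_m(ev["lat"], ev["lon"],
                                    seed["lat"], seed["lon"]) <= max_dist_m):
                cluster.append(ev)
                break
        else:
            clusters.append([ev])
    return clusters

# Illustrative events: "t" is seconds since an arbitrary reference time.
events = [
    {"id": 1, "subtype": "HAZARD_ON_ROAD_CAR_STOPPED", "t": 0, "lat": 39.0000, "lon": -77.0000},
    {"id": 2, "subtype": "HAZARD_ON_SHOULDER_CAR_STOPPED", "t": 240, "lat": 39.0010, "lon": -77.0000},
    {"id": 3, "subtype": "ACCIDENT_MAJOR", "t": 300, "lat": 39.2000, "lon": -77.0000},
]
clusters = cluster_events(events)
print([[e["id"] for e in c] for c in clusters])
```

In this sample, the two stopped-vehicle reports are about 111 meters and 4 minutes apart, so they collapse into one cluster, while the distant crash report stays separate.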

Example Waze Incident Data: Waze supports the types and subtypes of user-generated alerts shown in table 7.

Table 7. User-generated alert types and subtypes supported by Waze.
Alert Type	Alert Subtype
ACCIDENT	ACCIDENT_MINOR ACCIDENT_MAJOR NO_SUBTYPE
JAM	JAM_MODERATE_TRAFFIC JAM_HEAVY_TRAFFIC JAM_STAND_STILL_TRAFFIC JAM_LIGHT_TRAFFIC NO_SUBTYPE
WEATHERHAZARD/HAZARD	HAZARD_ON_ROAD HAZARD_ON_SHOULDER HAZARD_WEATHER HAZARD_ON_ROAD_OBJECT HAZARD_ON_ROAD_POT_HOLE HAZARD_ON_ROAD_ROAD_KILL HAZARD_ON_SHOULDER_CAR_STOPPED HAZARD_ON_SHOULDER_ANIMALS HAZARD_ON_SHOULDER_MISSING_SIGN HAZARD_WEATHER_FOG HAZARD_WEATHER_HAIL HAZARD_WEATHER_HEAVY_RAIN HAZARD_WEATHER_HEAVY_SNOW HAZARD_WEATHER_FLOOD HAZARD_WEATHER_MONSOON HAZARD_WEATHER_TORNADO HAZARD_WEATHER_HEAT_WAVE HAZARD_WEATHER_HURRICANE HAZARD_WEATHER_FREEZING_RAIN HAZARD_ON_ROAD_LANE_CLOSED HAZARD_ON_ROAD_OIL HAZARD_ON_ROAD_ICE HAZARD_ON_ROAD_CONSTRUCTION HAZARD_ON_ROAD_CAR_STOPPED NO_SUBTYPE
MISC	NO_SUBTYPE
CONSTRUCTION	NO_SUBTYPE
ROAD_CLOSED	ROAD_CLOSED_HAZARD ROAD_CLOSED_CONSTRUCTION ROAD_CLOSED_EVENT NO_SUBTYPE

Source: Waze, Google Developers Web page. Available at: https://developers.google.com/waze/data-feed/overview, last accessed March 17, 2019.

Waze data is in either XML or JSON format.

An example of a weather hazard notification is below in the XML format with a description of each element as shown in table 8.

<item>
<pubDate>Thu Nov 26 14:02:29 +0000 2015</pubDate>
<georss:point>45.02395420471421 7.670893079148089</georss:point>
<linqmap:uuid>9fd1ee98-7b56-37e9-a2d4-72e9478dd838</linqmap:uuid>
<linqmap:magvar>6</linqmap:magvar>
<linqmap:type>WEATHERHAZARD</linqmap:type>
<linqmap:subtype>HAZARD_ON_ROAD_CONSTRUCTION</linqmap:subtype>
<linqmap:reportDescription>
scambio di carreggiata causa lavori dalle 00:00 del 16 novembre 2015 alle 23:59 del 21 gennaio 2016
</linqmap:reportDescription>
<linqmap:city>Torino</linqmap:city>
<linqmap:country>IT</linqmap:country>
<linqmap:roadType>4</linqmap:roadType>
<linqmap:reportRating>0</linqmap:reportRating>
<linqmap:reliability>10</linqmap:reliability>
</item>
<item>
<pubDate>Thu Nov 26 14:02:26 +0000 2015</pubDate>
<georss:point>45.02395420471421 7.670893079148089</georss:point>
<linqmap:uuid>ed06a695-53ee-347c-a6eb-133bf8746880</linqmap:uuid>
<linqmap:magvar>6</linqmap:magvar>
<linqmap:type>WEATHERHAZARD</linqmap:type>
<linqmap:subtype>HAZARD_ON_ROAD_CONSTRUCTION</linqmap:subtype>
<linqmap:reportDescription>
chiusura notturna causa lavori di manutenzione dalle 23:00 alle 05:30, solo nei giorni
feriali dalle 23:00 del 9 novembre 2015 alle 05:30 del 5 dicembre 2015
</linqmap:reportDescription>
<linqmap:city>Torino</linqmap:city>
<linqmap:country>IT</linqmap:country>
<linqmap:roadType>4</linqmap:roadType>
<linqmap:reportRating>0</linqmap:reportRating>
<linqmap:reliability>7</linqmap:reliability>
</item>

Table 8. Weather hazard event notification elements.
Element	Value	Description
pubDate	Time	Publication date.
georss:point	Coordinates	Location per report (Lat long).
linqmap:uuid	String	Unique system ID.
linqmap:magvar	Integer (0-359)	Event direction (Driver heading at report time. 0 degrees at North, according to the driver's device).
linqmap:type	See alert type table	Event type.
linqmap:subtype	See alert subtypes table	Event subtype depends on parameter.
linqmap:reportDescription	String	Report description (supplied when available).
linqmap:street	String	Street name (as is written in database, no canonical form, may be null).
linqmap:city	String	City and state name [City, State] in case both are available, [State] if not associated with a city (supplied when available).
linqmap:country	String	See two-letter codes in http://en.wikipedia.org/wiki/ISO_3166-1.
linqmap:roadType	Integer	Road type (see road types table in the appendix).
linqmap:reportRating	Integer	User rank between 1 and 6 (6 = high ranked user).
linqmap:jamUuid	String	If the alert is connected to a jam, the jam ID.
linqmap:Reliability (new)	0-10	How reliable is the report, 10 being most reliable. Based on reporter level and user responses.
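
As an illustration of how the elements in table 8 might be read, the following Python sketch parses one alert item with the standard library. It is a minimal sketch under stated assumptions: the enclosing channel element and the two namespace URIs are assumptions added so that the fragment is well formed, and a real feed should be parsed with whatever namespaces it actually declares.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; use the ones declared by the actual feed.
NS = {
    "georss": "http://www.georss.org/georss",
    "linqmap": "http://www.linqmap.com",
}

# One alert item, wrapped in an assumed root element that declares the prefixes.
SAMPLE = """<channel xmlns:georss="http://www.georss.org/georss"
         xmlns:linqmap="http://www.linqmap.com">
<item>
<pubDate>Thu Nov 26 14:02:29 +0000 2015</pubDate>
<georss:point>45.02395420471421 7.670893079148089</georss:point>
<linqmap:uuid>9fd1ee98-7b56-37e9-a2d4-72e9478dd838</linqmap:uuid>
<linqmap:type>WEATHERHAZARD</linqmap:type>
<linqmap:subtype>HAZARD_ON_ROAD_CONSTRUCTION</linqmap:subtype>
<linqmap:city>Torino</linqmap:city>
<linqmap:reliability>10</linqmap:reliability>
</item>
</channel>"""

def text(item, tag):
    """Return the stripped text of a child element, or None if it is absent."""
    value = item.findtext(tag, namespaces=NS)
    return value.strip() if value is not None else None

def parse_item(item):
    """Convert one alert item into a plain dictionary."""
    lat, lon = (float(v) for v in text(item, "georss:point").split())
    return {
        "published": datetime.strptime(text(item, "pubDate"),
                                       "%a %b %d %H:%M:%S %z %Y"),
        "lat": lat,
        "lon": lon,
        "uuid": text(item, "linqmap:uuid"),
        "type": text(item, "linqmap:type"),
        "subtype": text(item, "linqmap:subtype"),
        "street": text(item, "linqmap:street"),  # optional, may be missing
        "reliability": int(text(item, "linqmap:reliability")),
    }

alerts = [parse_item(i) for i in ET.fromstring(SAMPLE).findall("item")]
print(alerts[0]["type"], alerts[0]["lat"], alerts[0]["lon"])
```

Optional elements such as linqmap:street come back as None rather than raising an error, which matches the "supplied when available" notes in the table.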

Here is a similar example in JSON format:

{"country":"IT", "roadType":1, "magvar":258, "subtype":"",
"reportRating":0, "reliability":6, "reportDescription":"blocco  del traffico per alcuni veicoli nella ZTL (Zona Traffico Limitato) Non possono circolare  Veicoli per iltrasporto persone Dal lunedì al venerdì, dalle ore 8 alle ore 19  veicoli benzina Euro", "location":{ "x":7.6800935614336545, "y":44.9991565694201 }, "type":"WEATHERHAZARD", "uuid":"39d9dc07bd743b35ba6b833f5cbd1ce1", "pubMillis":1448546704610}, {"country":"IT", "magvar":0, "subtype":"ROAD_CLOSED_EVENT", "city":"Nichelino", "street":"Via Fenestrelle", "reportRating":0, "reliability":9, "reportDescription":"lavori", "location":{"x":7.627331910061528,"y":45.00419885851123}, "type":"ROAD_CLOSED", "uuid":"1064e72c0d3b332d95c61dcab524aa5c", "pubMillis":1446918728242},

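A JSON alert like the ones above can be normalized with the Python standard library. The sketch below is illustrative: wrapping the record in a JSON array, the normalize helper, and the output field names are assumptions. It treats location.x as longitude and location.y as latitude, which is consistent with the Torino-area coordinates in the sample, and converts pubMillis from milliseconds since the Unix epoch to a UTC timestamp.

```python
import json
from datetime import datetime, timezone

# One alert from the sample, wrapped in an array (an assumption) so it parses.
RAW = """[{"country": "IT", "magvar": 0, "subtype": "ROAD_CLOSED_EVENT",
 "city": "Nichelino", "street": "Via Fenestrelle", "reportRating": 0,
 "reliability": 9, "reportDescription": "lavori",
 "location": {"x": 7.627331910061528, "y": 45.00419885851123},
 "type": "ROAD_CLOSED", "uuid": "1064e72c0d3b332d95c61dcab524aa5c",
 "pubMillis": 1446918728242}]"""

def normalize(alert):
    """Flatten one alert into the fields an operations platform might store."""
    location = alert["location"]
    return {
        "uuid": alert["uuid"],
        "type": alert["type"],
        "subtype": alert.get("subtype") or None,  # empty string becomes None
        "street": alert.get("street"),            # optional field
        "lat": location["y"],
        "lon": location["x"],
        "reliability": alert.get("reliability"),
        "published": datetime.fromtimestamp(alert["pubMillis"] / 1000,
                                            tz=timezone.utc),
    }

records = [normalize(a) for a in json.loads(RAW)]
print(records[0]["type"], records[0]["published"].isoformat())
```
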
Example Waze Speed/Congestion Data: Waze generates traffic jam information by processing the following data sources:

GPS location points sent from users' phones (users who drive while using the app) and calculations of the actual speed versus average speed (on specific timeslot) and free-flow speed (maximum speed measured on the road segment).
User-generated reports shared by Waze users who encounter traffic jams. These appear as regular alerts and affect the way Waze identifies and presents traffic jams.

Table 9 describes traffic jam parameters, which are also in XML.

Table 9. Traffic jam parameters.
Element	Value	Description
pubDate	Time	Publication date.
linqmap:type	String	TRAFFIC_JAM.
georss:line	List of longitude and latitude coordinates	Traffic jam line string (supplied when available).
linqmap:speed	Float	Current average speed on jammed segments in meter/second.
linqmap:length	Integer	Jam length in meters.
linqmap:delay	Integer	Delay of jam compared to free flow speed, in seconds (in case of block, -1).
linqmap:street	String	Street name (as is written in database, no canonical form; supplied when available).
linqmap:city	String	City and state name [City, State] in case both are available, [State] if not associated with a city (supplied when available).
linqmap:country	String	Available on EU (world) server (see two-letter codes in https://en.wikipedia.org/wiki/ISO_3166-1).
linqmap:roadType	Integer	Road type (see road types table in the appendix).
linqmap:startNode	String	Nearest Junction/street/city to jam start (supplied when available).
linqmap:endNode	String	Nearest Junction/street/city to jam end (supplied when available).
linqmap:level	0-5	Traffic congestion level (0 = free flow, 5 = blocked).
linqmap:uuid	String	Unique jam identifier.
linqmap:turnLine	Coordinates	A set of coordinates of a turn only when the jam is in a turn (supplied when available).
linqmap:turnType	String	What kind of turn it is: left, right, exit R or L, continue straight, or NONE (no info) (supplied when available).
linqmap:blockingAlertUuid	String	If the jam is connected to a block (see alerts).

An example of a traffic jam notification is below in the XML format.

<item>
<pubDate>Sun Nov 29 12:57:44 +0000 2015</pubDate>
<linqmap:uuid>52cf216f-799e-3b62-9b72-5cb6a15e9c67</linqmap:uuid>
<linqmap:type>Medium</linqmap:type>
<georss:line>
40.680629 -74.004695 40.681749 -74.005537 40.682689 -74.005947 40.683742 -74.00628 40.684477 -74.006569 40.685214 -74.006994 40.686049 -74.007391 40.688904 -74.009512 40.690987 -74.011508 40.700833 -74.015145
</georss:line>
<linqmap:speed>15.3629673206283</linqmap:speed>
<linqmap:length>2433.0</linqmap:length>
<linqmap:delay>61</linqmap:delay>
<linqmap:endNode>Hugh L. Carey Tunnel</linqmap:endNode>
<linqmap:street>Hugh L. Carey Tunnel</linqmap:street>
<linqmap:city>New York, NY</linqmap:city>
<linqmap:country>US</linqmap:country>
<linqmap:roadType>3</linqmap:roadType>
<linqmap:level>1</linqmap:level>
<linqmap:turnType>NONE</linqmap:turnType>
</item>

Use Case: Metropolitan Area Transportation Operations Coordination (MATOC) TweetDeck. Figure 8 is from the MATOC program in the National Capital Region. The MATOC operations center actively monitors Twitter feeds through a software application called TweetDeck (https://tweetdeck.twitter.com/). Through this application, the MATOC operations personnel are able to get information from other agencies, the media, and even the public who may prefer tweeting rather than using Waze. Monitoring these feeds requires additional diligence and practice (figure 9). Operators must configure their TweetDeck to report out on users, hashtags, and other information deemed relevant to the operator. It took several months of effort to refine their settings before the feeds were providing the level of information they desired. They must also tweak their settings every few months to ensure everything continues to run smoothly. Despite the initial effort involved, the operators like the system so much that they dedicated an entire media wall to the feeds so that it is always visible.

Screenshot shows how data from the MATOC twitter monitoring system displays data. This portion of the screen is showing Twitter data for Traffic, Transit, Public Safety, and Emergency Management. The information for each event is summarized here and the hashtags are given so the complete message can be viewed by clicking the link. Copyright 2019 Metropolitan Area Transportation Operations Coordination.

Figure 8. Screenshot. A small portion of the Metropolitan Area Transportation Operations Coordination Twitter monitoring system.
Source: TweetDeck screenshot provided courtesy of the Metropolitan Area Transportation Operations Coordination Program.

Individuals working the MATOC operations room. Monitors and panels display information such as the TweetDeck, weather maps, roadmaps, etc. Copyright 2019 Metropolitan Area Transportation Operations Coordination.

Figure 9. Photo. The Metropolitan Area Transportation Operations Coordination operations room with TweetDeck prominently displayed on the upper-right media wall panel. Source: Metropolitan Area Transportation Operations Coordination.

The following provide direct access to Twitter feeds:

The Twitter API (part of the Twitter Developer Platform).
The Twitter application or website.
Third-party Twitter applications (like TweetDeck) and data mining tools.

The Twitter API and examples of data from tweets are available at https://developer.twitter.com/en/docs/basics/getting-started.

Many agencies, however, will not directly integrate Twitter feeds into their ATMS platform, but will use third-party viewers of data for situational awareness. Figure 10 shows an example of a traffic alert from the Denver Police Department.

A screenshot shows an example crash alert from the Denver Police Department. The Twitter message is displayed to describe the alert, and a photo shows a crash alert banner announcing the type of event.

Figure 10. Screenshot. The Denver Police Department uses the #Traffic hashtag to alert the public (and other agencies) about traffic events.
Source: Twitter/Denver Police.

Use Case: Pushing Data back to Crowdsourced Users. Crowdsourced data can go two ways. Several agencies are now using Waze's CCP to provide agency data back to Waze users. This can come in the form of providing the location, times, etc. of road construction projects, special events, or collisions (figure 11). However, the Port Authority of New York and New Jersey (PANYNJ) is also using Waze to influence routing and driver behavior.

Screenshot shows text describing an example event. The event text includes: description, direction, Start date and time, end date and time, and the name of the event. Copyright: 2019 Waze.

Figure 11. Screenshot. Example of a crowdsourcing event.
Source: Waze.

PANYNJ uses Waze's website for agency partners to create realtime road closures when it needs traffic to divert away from specific roads leading up to a terminal. When traffic congestion increases on a particular route, PANYNJ tells Waze about the road closure, which then prevents Waze users' navigation systems from directing them onto that roadway.

A realtime closure can indicate temporary closure of a segment in one or both directions to all Waze users. When the realtime closure is active, the affected segment will be marked with red-and-white stripes, and Waze will not route any traffic through or onto the segment. Waze instructs users to report a realtime closure only when a road closure is complete in one or both directions.⁴

There are several possible ways to prevent Waze from routing traffic over a segment. Table 10, excerpted from Waze's technical support documentation, explains the options.

During a realtime closure, Waze will not route to a destination on the closed segments, nor partway through the closed segment, even if that is the closest segment to the destination. Instead, it will pick a stop point on the next closest segment. If the closed segment is much longer than the part of the road actually closed to all traffic, this can result in directing users to a nearby street even though they should be able to drive to the destination. For this reason, if a closure is very localized and going to persist for more than 1 week (e.g., a bridge replacement), it might be worth the effort to edit the closed segment.⁵

If the user is on a closed segment, Waze will find a route that begins in the closed segment and find the shortest distance to a segment with no closure.

Table 10. Realtime closure options for Waze Connected Citizens Program agency members.
Method	Vehicles Affected	Takes Effect	Ends	Traffic Data	Guidance
Realtime closure	All	Immediate	Expires	Kept	Preferred option for temporary (even long-term) one-way or two-way closure. Visible to drivers. Immediate effect. Automatically removed when it expires.
Road direction change	All	Tile update	Permanent	Kept	Use only for permanent change in direction from two-way to one-way.
Time-based segment restriction	Some	Tile update	Optionally expires	Kept	Use only where the restrictions (time of day/day of week) are permanent, or where certain vehicle types are allowed or prohibited. Example: No passenger cars on weekdays.
Time-based turn restriction	Some	Tile update	Optionally expires	Kept	Use where travel on the segment is allowed, but turns onto the segment are temporarily forbidden or else permanently forbidden at certain times of day, days of week, or for certain vehicle types. Example: No left turn 4:00pm-6:00pm.
Permanent turn restriction	All	Tile update	Permanent	Kept	Use where a turn onto the segment should be permanently forbidden for all vehicles.
Conversion to a penalized road type	All	Tile update	Permanent	Kept	Penalties make routing less likely, but are not absolute. Vehicles with a destination on the segment will be routed onto the segment. Normally private road is used.
Disconnection	All	Tile update	Permanent	Lost	Use only if the disconnection is permanent. All turn data is lost.
Deletion	All	Tile update	Permanent	Lost	Use only if the road is permanently closed. All data is lost.

Source: Wazeopedia, "Realtime closures." Available at https://wazeopedia.waze.com/wiki/USA/Real_time_closures, last accessed March 17, 2019.

Use Case: Utah DOT Crowdsources Maintenance. The Utah DOT (UDOT) worked with SeeClickFix to customize a solution for citizen reporting of roadway maintenance issues. SeeClickFix is an application that allows citizens to take pictures of maintenance issues and send them to the agency (along with geo-located photos, attributes, etc.). In addition to collecting citizen requests, the solution acts as a service-request management system that has helped to improve agency efficiencies and transparency. The Utah Click'n Fix website (figure 12) even prints out service reports to show how the agency is managing requests over time. When more than one citizen service request comes in for a single issue, consolidating and handling the requests as one saves time on maintenance issues. This consolidated work means that employees no longer have to write 50 emails or make 50 phone calls in response to 50 separate citizen service requests. Utah citizens can report issues through UDOT's smartphone app, which empowers citizens and the DOT to resolve maintenance and safety issues more quickly (figure 13).

The UDOT website provides this widget to let users report and display roadway maintenance issues or make service requests. This includes instructions for contacting the agency and the ability to display a map with locations and descriptions of events throughout the State. Copyright: 2019 Utah Department of Transportation

Figure 12. Screenshot. Utah Department of Transportation's Click'n Fix widget accompanies the Click'n Fix application.
Source: Utah DOT, Contact Website. Available at: https://www.udot.utah.gov/main/f?p=100:pg:0::::T,V:376.

Illustration shows three smartphone screens for the UDOT Click 'n Fix app. The main screen provides buttons for different operations within the app. The Issues screen shows either a list of issues or a map with the location of issues. The third screen shown is a New Report screen where users can specify the type of issue, the location, and a description of the issue. From here you can submit the report to UDOT.

Figure 13. Illustration. Utah Department of Transportation's smartphone app empowers citizens and the agency to resolve maintenance and safety issues more quickly.
Source: Utah Department of Transportation, SeeClickFix Website. Available at: https://seeclickfix.com/pages/case-studies/utah-dot.html.

Roadside Basic Safety Message Data

Description

CVs and infrastructure can use special communications protocols, such as dedicated short-range communications (DSRC), 5G (the fifth generation of wireless technology), and others, to exchange basic safety messages (BSM) and infrastructure messages (figure 14). These communications protocols allow vehicles and infrastructure to exchange messages within 1,000 meters approximately 10 times per second, primarily to support safety applications and secondarily to support mobility applications. Under BSM Part 1, vehicles communicate their size, position, speed, heading, acceleration, and brake system status to each other and to the infrastructure.⁶ BSM Part 2 includes additional data elements, such as weather data and vehicle status data. Similarly, connected infrastructure can transmit its status to vehicles or perform actions in response to received BSM data. Although the infrastructure message format is poorly defined, it does include some potential message types that represent digital descriptions of the roadway (e.g., signal phase and timing (SPaT) messages or MAP type messages). In addition to these, infrastructure messages could include information regarding speed limits, especially as they change in work zones and school zones; dynamic message sign information; and other advisories.

This illustration depicts signal communications between various connected vehicles and between vehicles and fixed sensors in an urban setting. Lines drawn between the various sensors on vehicles or on the ground indicate the wide-ranging extent of communications.

Figure 14. Illustration. U.S. Department of Transportation connected vehicle and connected infrastructure concept.
Source: U.S. Department of Transportation, Intelligent Transportation Systems Joint Program Office. Available at: https://www.its.dot.gov/itspac/october2012/PDF/data_availability.pdf.

Applications

BSM data has many potential uses, including:⁷

Vehicle platooning (speed harmonization, cooperative adaptive cruise control, etc.).
Queue warnings.
Intelligent traffic signal system.
Transit signal priority.
Mobile accessible pedestrian signal system.
Emergency communication and evacuation.
Work zone alerts.

Attributes

BSM is broadcast approximately 10 times per second, which is sufficient for most basic safety applications. However, many roadside unit (RSU) providers also have other data transmission capabilities that can have latency as low as 20 ms and as high as several seconds depending on the application.

Details

BSM Part 1 provides only core information needed for immediate safety applications:
- Vehicle size.
- Position.
- Speed.
- Heading.
- Acceleration.
- Brake system status.
BSM Part 2 promises additional data elements, such as:
- Recent braking.
- Path prediction.
- Throttle position.
- Differential GPS.
- Stability control.
- Exterior light status.
- Wiper status.
- Ambient temperature.
Infrastructure messages can contain information such as:
- Current signal phase and residual time.
- Signal request and status.
- Pedestrian status.
- MAP information – digital representation of the intersection.
- Air quality.
- Roadway friction information.
- Traveler information.
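
To make the BSM Part 1 contents concrete, the Python sketch below models the core fields as a simple record and estimates message counts at the nominal broadcast rate of 10 messages per second. It is a simplified, hypothetical representation for discussion only; the field names and units are assumptions and do not reproduce the encoding defined in the governing message standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BsmPart1:
    """Simplified stand-in for the BSM Part 1 core data (names are assumed)."""
    vehicle_length_m: float
    vehicle_width_m: float
    latitude: float
    longitude: float
    speed_mps: float
    heading_deg: float
    accel_mps2: float
    brakes_applied: bool

BROADCAST_HZ = 10  # approximate broadcast rate stated in the text

def messages_per_hour(vehicles, hz=BROADCAST_HZ):
    """Number of messages broadcast in one hour by the given number of vehicles."""
    return vehicles * hz * 3600

sample = BsmPart1(4.5, 1.8, 41.14, -104.82, 29.0, 270.0, -0.4, False)
print(sample.speed_mps, messages_per_hour(1), messages_per_hour(100))
```

At 10 messages per second, a single equipped vehicle broadcasts 36,000 messages per hour, and 100 vehicles broadcast 3.6 million.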

Data Availability

Current data coverage is limited to pilot sites and test sites, at least in terms of data that is available to TMCs. As connected vehicles and infrastructure deployments proliferate, the coverage will most likely increase as well.

Many vehicle manufacturers and original equipment manufacturers (OEMs) are building out CV functionality. However, it is not clear what data may be available for agencies to consume beyond existing pilot deployments and test sites. OEMs are building their own cloud services or partnering with other third parties that are based on the concept of bundling connected data and selling it to interested parties, including the public sector.

For example, at least one company acts as a neutral third party that collects CV data and transmits it to interested parties using APIs.

At this time, the United States Department of Transportation (USDOT) has provided access to a number of CV and infrastructure data sets from pilot implementations and test sites through its public data portal (table 11).

Table 11. Example connected data sets.
Example Connected Data Source	Description
Wyoming Connected Vehicle (CV) Pilot	Basic safety message (BSM) from Wyoming CV Pilot project.
Belle Isle Road Weather Demonstration	Road weather observations collected by several CVs over a period of several months in Belle Isle, Michigan.
Minnesota Department of Transportation Mobile Observation	Data from instrumented snowplows and light-duty pickups.
Intelligent Network Flow Optimization (INFLO) Prototype	Small-scale demonstration of INFLO prototype system in Seattle, Washington, including 21 vehicles exchanging BSM with roadside units (RSU) and transmitted to the transportation management center.
Multi-Modal Intelligent Traffic Signal Systems Study	Study that collected vehicle trajectories by capturing BSM from CV via RSUs. This study also exchanged signal phase and timing message with connected vehicles in the intersection.

Source: United States Department of Transportation, Intelligent Transportation Systems Joint Program Office. Available at: https://www.its.dot.gov/data/search.html.

On the infrastructure side, a number of CV communication and network companies provide connected infrastructure functionality; e.g., traffic signals, critical hazard alerts, and roadway weather conditions. Third-party companies have developed applications that utilize infrastructure data to support safety and mobility applications.

Pros and Cons

CV and infrastructure data allows drivers to be safer through better information regarding vehicle surroundings and environment. It also allows agencies to disseminate important and tailored information directly to drivers and vehicles in a much more integrated fashion than was traditionally possible using dynamic message signs (DMS) or highway advisory radio (HAR).

CV and infrastructure data can be very large in volume and arrive at high frequency, which makes it challenging for agencies to process in realtime and use to guide decisions or manage congestion. It is unclear what level of aggregation (if any) is necessary for data to be useful in each application.
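
One simple form of aggregation is to average speed samples by road segment and fixed time window. The Python sketch below illustrates the idea; the 60-second window, the sample tuple layout, and the segment identifiers are assumptions, and the appropriate level of aggregation for any given application remains an open question.

```python
from collections import defaultdict
from statistics import mean

def aggregate_speeds(samples, window_s=60):
    """Average speed per (segment, window start) from (time_s, segment, speed) samples."""
    bins = defaultdict(list)
    for time_s, segment, speed_mps in samples:
        window_start = int(time_s // window_s) * window_s
        bins[(segment, window_start)].append(speed_mps)
    return {key: mean(values) for key, values in bins.items()}

# Illustrative samples: seconds since start, hypothetical segment ID, speed in m/s.
samples = [
    (0.0, "seg-1", 20.0),
    (0.1, "seg-1", 22.0),
    (61.0, "seg-1", 10.0),
]
averages = aggregate_speeds(samples)
print(averages)
```
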

The current level of market penetration for CVs and connected infrastructure is low; therefore, this data has minimal use for TMCs focused on realtime operations at this time. However, deployment of this technology will continue across the country, with more data becoming available every day.

Use Cases for Roadside Basic Safety Message Data

Use Case: Wyoming Connected Vehicle Pilot. As part of the Wyoming CV Pilot, the University Corporation for Atmospheric Research (UCAR) has implemented a Pikalert system (figure 15) that combines vehicle-based measurements with traditional weather observations to provide alerts to CVs traversing the corridor. Pikalert consists of several components:

Vehicle data translator.
Enhanced maintenance and decision support system (EMDSS).
Motorist advisory and warning (MAW) application.

This illustration shows how vehicle-based measurements combine with traditional weather observations to provide alerts to connected vehicles moving through a snowy mountain location. A remote Data Processing Center receives data from a remote Doppler radar, a weather satellite, a local Enhanced maintenance and decision support system, and connected vehicles. The vehicles provide temperature, pressure, velocity, brake status, steering, traction control, wiper status, and headlight status. The Data Processing Center can then interpret the data and send warnings to approaching vehicles. Copyright 2019 University Corporation for Atmospheric Research.

Figure 15. Illustration. The Pikalert concept.
Source: University Corporation for Atmospheric Research, Research Applications Laboratory, "Promoting Vehicle Safety, Mobility, and Environmental Efficiency" Web page. https://ral.ucar.edu/solutions/promoting-vehicle-safety-mobility-and-environmental-efficiency.

Vehicle Data Translator

The Vehicle Data Translator consists of three stages:

Stage 1: CVs with a controller area network bus (CANBus) and aftermarket sensors provide data from several sources. The data elements include barometric pressure, windshield wiper settings, headlight status, ambient air temperature, speed and heading, adaptive cruise control, location and elevation, hours of operation, anti-lock braking system (ABS) and brake status, stability and traction control, yaw/pitch/roll, accelerometer, steering angle, and differential wheel speed. Data is checked for quality; sorted by time, road segment, and grid cell; and published as parsed mobile data (figure 16).
Stage 2: Ancillary data such as that from radar and road weather information system (RWIS) is collected. The quality of this data is checked and published as basic road segment data.
Stage 3: Variables are inferred using data from the previous two stages. For example, wiper activity in combination with weather radar, satellite, and temperature data can indicate precipitation type and intensity. Similarly, headlight status, wiper activity, and RWIS information can define visibility measurements, and ABS and traction control and weather radar can indicate pavement conditions.
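The Stage 3 inference step can be sketched in a few lines of Python. This is only an illustration of combining vehicle and ancillary observations into an inferred variable; it is not the Pikalert algorithm. The field names, thresholds, and category labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SegmentObservation:
    """Combined Stage 1 and Stage 2 inputs for one road segment (hypothetical fields)."""
    wiper_fraction_on: float        # share of vehicles reporting active wipers, 0.0 to 1.0
    radar_reflectivity_dbz: float   # weather radar reflectivity over the segment
    air_temp_c: float               # ambient air temperature from vehicles or RWIS
    abs_or_traction_events: int     # count of ABS or traction control activations


def infer_precipitation(obs: SegmentObservation) -> str:
    """Infer a precipitation type from wiper activity, radar, and temperature.

    The thresholds below are illustrative assumptions, not calibrated values.
    """
    wipers_active = obs.wiper_fraction_on >= 0.5
    radar_echo = obs.radar_reflectivity_dbz >= 15.0
    if not (wipers_active and radar_echo):
        return "none"
    if obs.air_temp_c <= 0.0:
        return "snow"
    if obs.air_temp_c <= 2.0:
        return "mix"
    return "rain"


def infer_pavement(obs: SegmentObservation) -> str:
    """Infer a pavement condition from ABS/traction events and inferred precipitation."""
    precipitation = infer_precipitation(obs)
    if precipitation != "none" and obs.abs_or_traction_events >= 3:
        return "likely_slick"
    if precipitation != "none":
        return "likely_wet"
    return "likely_dry"


snowy = SegmentObservation(0.8, 25.0, -3.0, 5)
clear = SegmentObservation(0.1, 5.0, 10.0, 0)
print(infer_precipitation(snowy), infer_pavement(snowy))
print(infer_precipitation(clear), infer_pavement(clear))
```

A production system would weigh many more inputs and quality flags; the point here is only that no single source decides the result.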

Flow diagram depicting the vehicle Data Translator architecture.

RWIS = road weather information system

Figure 16. Diagram. Vehicle Data Translator Architecture.
Source: University Corporation for Atmospheric Research, Research Applications Laboratory.

Enhanced Maintenance and Decision Support System. EMDSS incorporates CV data into a forecast and decision process. Data from traditional sensors and equipment is enhanced and supplemented by CV data that provides more robust coverage and more detailed information about conditions on the entire corridor. This combined information supports maintenance operations and provides the opportunity for a proactive approach in handling adverse weather conditions. This proactivity improves safety and provides an opportunity for more effective use of limited resources by targeting the most critical problem spots before issues arise.

Motorist Advisory and Warning Application. MAW capitalizes on the rich output of vehicle data translators to provide travelers with hyperlocal and near-realtime road weather information, as well as accurate 24-hour forecasts of road weather conditions.

Realtime and Archived Trajectory Data

Description

Trajectory data is time-stamped location data from vehicles, cell-phones, or other GPS-enabled devices moving throughout a network. This is sometimes referred to as "bread-crumb trail" data.

Figure 17 illustrates an example of trajectory data where each red dot represents a ping from the vehicle as it moves from point 0 to point 14.

This map shows numbered points (from 0 to 14) along a road that mark specific locations where a ping is sent from the vehicle as it moves along the road. Copyright 2017 Google.

Figure 17. Map. Example trajectory data for a single trip.
Source: Google.

The system can make data anonymous in different ways. For example:

Rotation of the unique vehicle identifiers on a set time interval ensures that vehicles cannot be tracked for multiple days in a row.
Trips may be "clipped" when entering or exiting residential neighborhoods to keep from pinpointing home addresses.
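The two anonymization techniques listed above can be sketched as follows. This is a minimal illustration, not any provider's actual method. The salted-hash scheme, the 24-hour rotation interval, and clipping a fixed number of waypoints (as a stand-in for detecting residential neighborhoods) are all assumptions.

```python
import hashlib
from datetime import datetime, timezone


def rotating_id(device_id: str, timestamp: datetime, secret: str,
                interval_hours: int = 24) -> str:
    """Return a pseudonymous identifier that changes every interval_hours.

    The same device maps to the same identifier within one interval and to an
    unrelated identifier in the next, so trips cannot be linked across intervals.
    """
    bucket = int(timestamp.timestamp() // (interval_hours * 3600))
    digest = hashlib.sha256(f"{secret}:{device_id}:{bucket}".encode("utf-8"))
    return digest.hexdigest()[:16]


def clip_trip(waypoints: list, clip_count: int = 2) -> list:
    """Drop the first and last clip_count waypoints of a trip.

    A real system would clip based on land use or distance from the trip ends;
    a fixed count is used here only to keep the sketch short.
    """
    if len(waypoints) <= 2 * clip_count:
        return []
    return waypoints[clip_count:len(waypoints) - clip_count]


morning = datetime(2019, 1, 1, 12, tzinfo=timezone.utc)
evening = datetime(2019, 1, 1, 18, tzinfo=timezone.utc)
next_day = datetime(2019, 1, 2, 12, tzinfo=timezone.utc)

print(rotating_id("device-42", morning, "demo-secret") ==
      rotating_id("device-42", evening, "demo-secret"))
print(rotating_id("device-42", morning, "demo-secret") ==
      rotating_id("device-42", next_day, "demo-secret"))
print(clip_trip(list(range(10)), clip_count=2))
```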

Because of the privacy protocols, probe vehicle data providers add 1 to 2 days of latency prior to delivery to agencies. However, technology and capabilities are improving, and it is anticipated that this data set will be available in near realtime within the next year or two. The data has many application areas for both planning and operations.

Applications of Realtime Trajectory Data

Trajectory data is a new dataset with a great deal of potential for operations management. Realtime uses of trajectory data include:

Realtime traffic pattern analysis
- Operators can look at individual vehicle trips to evaluate corridor demands.
- In case of an incident or congestion, realtime trajectory data can show the effectiveness of implemented detours or self-detouring patterns.
- Operators can evaluate the impacts of special events that may create temporary, significant trip origins or destinations and change route utilization.
Multi-modal system utilization
- Trajectory data can show where and when mode transitions occur and provide operators with the ability to influence traveler decisions based on network utilization patterns.

In addition to realtime uses, trajectory data is also valuable in operations planning. For example, planners can use trajectory data as follows:

Trip patterns between jurisdictions
- Traditional O-D analysis to identify trip origins and destinations, work versus leisure travel, etc.
- Waypoint analysis to determine if traffic in a specific area (State, county, traffic analysis zone, business center) originated in the same area, neighboring area, or another more-distant location. This can identify whether certain corridors are mainly local travel or pass-through corridors, etc.
- Analyze trip clusters to evaluate the effectiveness of existing transit service or to identify areas in need of new transit service.
- Analyze trip patterns that may impact critical freight corridors or ports.

Attributes

For each individual trip, trajectory data usually includes:

Unique device/vehicle identifier.
Unique trip identifier.
Departure time and location (trip origin).
Periodic waypoints during the trip, including:
- Latitude/longitude.
- Timestamp.
- Instantaneous speed/heading.
- Identifier of road segment for waypoint.
Arrival time and location (trip destination).

Identification of each origin, destination, and waypoint uses latitude/longitude pairs and timestamps. Some providers "snap" these latitude and longitude points to a particular road segment. Collection of waypoints at varying intervals depends on the probe vehicle type and location and can occur from once per second up to once every 5 minutes.
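The attributes listed above map naturally onto a small record structure. The sketch below is one possible representation, not a provider schema. The class and field names are hypothetical, and treating the first and last waypoints as the trip origin and destination is a simplifying assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Waypoint:
    lat: float
    lon: float
    timestamp: datetime
    speed_mph: Optional[float] = None     # instantaneous speed, if reported
    heading_deg: Optional[float] = None   # instantaneous heading, if reported
    segment_id: Optional[str] = None      # set only when the provider snaps to a road segment


@dataclass
class Trip:
    device_id: str
    trip_id: str
    waypoints: List[Waypoint] = field(default_factory=list)

    @property
    def origin(self) -> Waypoint:
        return self.waypoints[0]

    @property
    def destination(self) -> Waypoint:
        return self.waypoints[-1]

    def duration_seconds(self) -> float:
        """Elapsed time from departure to arrival."""
        return (self.destination.timestamp - self.origin.timestamp).total_seconds()

    def max_gap_seconds(self) -> float:
        """Largest interval between consecutive waypoints (a rough data-quality check)."""
        gaps = [
            (later.timestamp - earlier.timestamp).total_seconds()
            for earlier, later in zip(self.waypoints, self.waypoints[1:])
        ]
        return max(gaps) if gaps else 0.0


start = datetime(2019, 1, 1, 8, 0, 0)
trip = Trip(
    device_id="device-42",
    trip_id="trip-001",
    waypoints=[
        Waypoint(39.000, -77.000, start),
        Waypoint(39.001, -77.000, start + timedelta(seconds=1)),
        Waypoint(39.020, -77.010, start + timedelta(seconds=301)),
    ],
)
print(trip.duration_seconds(), trip.max_gap_seconds())
```

Checking the largest gap between waypoints is useful because, as noted above, collection intervals can range from once per second to once every 5 minutes.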

Data Availability

Trajectory data is relatively new to the market, and only a few companies are providing access to this information. At least one company provides analytics services on top of other third-party data providers and location-based service providers; however, those services may not provide direct access either to individual trips or to the raw trajectory data. Ride-hailing companies are also starting to make limited O-D datasets available to select researchers in limited metropolitan areas. These data sometimes have heavy restrictions on use, limit the O-D analysis to larger geographic zones, and do not provide route choice analysis. As the demand for this data rises and technology improves, other companies with access to "probe-like" data will probably offer similar data sets.

Pros and Cons

A huge benefit of the trajectory version of O-D data is the ability to construct an entire trip based on origin, destination, and the route taken during the trip. No other data set on the market has the potential to provide such rich information about route and mode choice. Trip data from third parties is more cost effective than travel diaries or roadblock studies. However, there is a 1 to 2 day lag between when the data is collected and when it is available to an agency. For now, this makes the data most useful for after-action reviews (AARs) and for otherwise understanding the greater impacts of operations decisions on travelers. In the near future, this information will be available in realtime or near realtime, thus making it more applicable to realtime signal control, ramp metering, dynamic routing, and more.

One of the challenges is the size of the data set. For example, one year of data for the relatively small State of Maryland includes nearly 100 million individual trip records and over 7 billion waypoints. Analysis of data of this size requires dedicated information technology (IT) infrastructure and expertise, either in-house or provided by a third party.

This illustration is a fictitious map showing a trip route. Waypoints collected as a vehicle moves along the route are indicated with dots, although these waypoints do not always lie directly on the route due to GPS errors and noise. The errant waypoints are automatically snapped to the route so the corrected waypoints lie directly on the specified route. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 18. Illustration. Snapping waypoints to routes can sometimes be a challenge.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

Latitude/longitude pairs identify most waypoints, and only a select few providers pre-map the waypoints to TMC-segments or OpenStreetMap (OSM) segments. This means that data users may need to be able to conflate those locations to an underlying map to be able to map the trips to specific roadways and corridors (figure 18). Because latitude/longitude pairs may not be exactly accurate due to GPS errors and noise, in dense urban areas or dense roadway networks, subsequent waypoints may appear to jump from one road to another and back as the probe vehicle traverses the network. Collecting waypoints at longer time intervals makes it more difficult to determine the exact route taken between one ping and another. This presents a challenge when attempting to construct a trip using those waypoints. A benefit of machine learning algorithms is that they allow users to snap the waypoints to the network with a high degree of accuracy. However, implementing machine learning algorithms requires understanding data science, geographical information systems (GIS), and how machine learning works.
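To make the conflation problem concrete, the sketch below snaps a single waypoint to the nearest road segment by simple geometric projection. It is deliberately naive and is not the machine learning approach described above. The segment identifiers and coordinates are invented for the example, and the flat-earth (equirectangular) approximation is an assumption that is reasonable only over short distances.

```python
import math

EARTH_RADIUS_M = 6371000.0


def _to_xy(lat: float, lon: float, ref_lat: float) -> tuple:
    """Convert latitude/longitude to approximate local planar coordinates in meters."""
    x = math.radians(lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y


def snap_to_nearest_segment(point: tuple, segments: dict) -> tuple:
    """Return (segment_id, distance_m) for the segment closest to point.

    point is (lat, lon); segments maps an identifier to a pair of (lat, lon)
    endpoints. Each segment is treated as a straight line between its endpoints.
    """
    ref_lat = point[0]
    px, py = _to_xy(point[0], point[1], ref_lat)
    best = None
    for segment_id, (start, end) in segments.items():
        ax, ay = _to_xy(start[0], start[1], ref_lat)
        bx, by = _to_xy(end[0], end[1], ref_lat)
        dx, dy = bx - ax, by - ay
        length_sq = dx * dx + dy * dy
        # Fraction along the segment of the closest point, clamped to the endpoints.
        if length_sq == 0.0:
            t = 0.0
        else:
            t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
        distance = math.hypot(px - (ax + t * dx), py - (ay + t * dy))
        if best is None or distance < best[1]:
            best = (segment_id, distance)
    return best


# Two hypothetical parallel streets roughly 110 meters apart.
segments = {
    "main_st": ((39.0000, -77.0000), (39.0000, -76.9900)),
    "oak_ave": ((39.0010, -77.0000), (39.0010, -76.9900)),
}
print(snap_to_nearest_segment((39.0002, -76.9950), segments))
```

Because each waypoint is snapped independently, a few tens of meters of GPS noise is enough to flip the result between parallel streets. This is the "jumping" behavior described above and the reason approaches that consider the whole sequence of waypoints perform better.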

Use Cases for Realtime and Archived Trajectory Data

Use Case: Impact of New Tolling on Travelers and Mode Choice. The Virginia Department of Transportation (VDOT) recently implemented high occupancy tolling on I-66 leading into the District of Columbia. These variable tolls can range from $4 to $47 per trip. Realtime and historic trajectory data can show the impacts of these tolls on route choice. VDOT is using trajectory data to understand which routes motorists took to get into the District prior to the tolling going into effect. They are then evaluating how motorists' routes changed after implementing the tolling. A future application could be to evaluate—in realtime—how changing the rates throughout the day is affecting trips, which arterials are taking the brunt of the extra traffic, and the exact routes that commuters are taking to avoid the tolls.

Use Case: Impact of Construction on Travelers. Several DOTs and metropolitan planning organizations (MPOs) leverage O-D (trip) data with trajectories and routing details for realtime operations. Transportation systems management and operations (TSMO) groups can use this data to understand the impacts of work zone management practices, develop different traveler information and communication strategies, establish more effective signal timing plans, and support freight operations. These same agencies can use the trajectory data to support before-and-after studies to show how the finished construction projects have changed route choice.

Use Case: Maryland DOT Mid-block Signal Timing Analysis. Trajectory data has the power to support signalized arterial applications. Collection of waypoints throughout a trip enables collection of true travel times within cities among a very large number of routes. Whereas standard probe data provides average trip speeds from intersection to intersection, trajectory data can provide mid-block to mid-block travel times, allowing agencies to quickly understand turning movement travel times and overall signal performance (figure 19).

The Origin-Destination data suite provides true travel time midblock analyses and gives mid-block to mid-block travel times. The screen shows the total trip travel time, and the time to traverse various segments along the specified route. The metric is travel time and the parameters specified in this example are data range selection based on year, time of day, days of the week, and months. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 19. Illustration. The Center for Advanced Transportation Technology Laboratory's Origin-Destination Data Suite uses trajectory data to conduct midblock analyses.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.
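As a rough illustration of the mid-block travel-time concept in this use case, the sketch below interpolates the time at which a vehicle passed two arbitrary points along a route. It assumes each waypoint has already been matched to the route and carries a cumulative distance along it; that input format, and the function names, are hypothetical and do not describe the Origin-Destination Data Suite.

```python
from typing import List, Optional, Tuple


def crossing_time(waypoints: List[Tuple[float, float]], mark_m: float) -> Optional[float]:
    """Estimate when the vehicle passed mark_m meters along the route.

    waypoints is a time-ordered list of (time_s, distance_along_route_m).
    The crossing time is linearly interpolated between the two waypoints
    that bracket the mark. Returns None if the mark is outside the trip.
    """
    for (t0, d0), (t1, d1) in zip(waypoints, waypoints[1:]):
        if d0 <= mark_m <= d1:
            if d1 == d0:
                return t0
            return t0 + (mark_m - d0) / (d1 - d0) * (t1 - t0)
    return None


def midblock_travel_time(waypoints: List[Tuple[float, float]],
                         start_m: float, end_m: float) -> Optional[float]:
    """Travel time in seconds between two mid-block points on the route."""
    t_start = crossing_time(waypoints, start_m)
    t_end = crossing_time(waypoints, end_m)
    if t_start is None or t_end is None:
        return None
    return t_end - t_start


# A trip that slows between 100 m and 200 m, for example while waiting at a signal.
trip = [(0.0, 0.0), (10.0, 100.0), (40.0, 200.0), (50.0, 300.0)]
print(midblock_travel_time(trip, start_m=50.0, end_m=250.0))
```

Measuring from the middle of one block to the middle of the next captures the full delay at the intersection in between, which is what makes movement-level comparisons possible.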

The Maryland DOT is funding the development of a mid-block travel-time analysis suite using INRIX trajectory/trips data. This tool will provide a ranked list of all turning movements in a city, allowing signal operators to understand the user delay cost associated with all turning movements in a city (figure 20). The graph in the lower right shows how many trips made it through this particular intersection (left turn) within the first cycle, second cycle, and third cycle.

The screenshot shows a ranked list of all turning movements in an example for a specified zip code for a date range. At the top of the screen is a table with ranked intersections. Intersection information is listed in a table for each intersection, and includes the intersection name, approach, movement, volume, user delay cost, average travel time, and various travel time percentiles. On the bottom left of the screen is a map showing the region. Each intersection selected from the table appears on the map. On the bottom right of the screen is a summary graphic that shows data for a selected intersection, and shows how many trips made it through this particular intersection (left turn) within the first cycle, second cycle, and third cycle. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 20. Screenshot. The Center for Advanced Transportation Technology Laboratory's Origin-Destination Data Suite provides ranked intersection movements by zip code and date range based on trajectory data.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

INRIX trip data comes in several formats with multiple output files that include metadata (table header definitions, source data, device types, waypoint details for each trip, origins, destinations, etc.). While the entirety of the files is too expansive to include in this document, figure 21 is an example output from one of these many files describing trips and the waypoints along a particular trip. This cropped file shows only eight waypoints of a much longer trip. The expanded file shows thousands of additional trips and waypoints.

Screen capture of a table depicting data feeds with information for Trip ID, Waypoint sequence, Capture date, Latitude, Longitude, Segment ID, Zone Name, Frc, Device ID, and Raw speed.

Figure 21. Screenshot. Example data feeds.
Source: INRIX.

INRIX trip data shows actual trajectories and waypoints for individual trips, which allows an agency to conduct true route analysis, travel-time analysis, and more for individual trips. The data is quite voluminous because it contains individual trip data, and third parties have developed a number of tools to help agencies better utilize the data.
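A first processing step with waypoint-level files of this kind is usually to group rows by trip and order them by sequence. The sketch below shows that step with the Python standard library. The column names are loosely modeled on the fields listed for figure 21 and are assumptions rather than the provider's actual header names, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; real files are far larger and may use different headers.
SAMPLE_CSV = """trip_id,waypoint_sequence,capture_date,latitude,longitude,segment_id,raw_speed
t1,1,2019-01-01T08:00:30,39.0010,-77.0000,s1,32
t2,0,2019-01-01T09:15:00,38.9000,-77.0500,s7,18
t1,0,2019-01-01T08:00:00,39.0000,-77.0000,s1,30
t1,2,2019-01-01T08:01:00,39.0020,-77.0000,s2,28
"""


def group_waypoints_by_trip(csv_text: str) -> dict:
    """Return {trip_id: [row, ...]} with each trip's rows ordered by sequence."""
    trips = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        trips[row["trip_id"]].append(row)
    for rows in trips.values():
        rows.sort(key=lambda r: int(r["waypoint_sequence"]))
    return dict(trips)


trips = group_waypoints_by_trip(SAMPLE_CSV)
for trip_id, rows in sorted(trips.items()):
    print(trip_id, [r["segment_id"] for r in rows])
```

At the data volumes described earlier, the same grouping would normally be done in a database or a distributed processing framework rather than in memory.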

Use Case: Maryland DOT Identifying Travelers Impacted by a Project. Leveraging trajectory data enables agencies to perform outreach and education prior to major construction projects, detours, etc. Figure 22 shows how the Maryland DOT-funded O-D Analytics suite displays the origins of trips that passed over a very specific road segment. The same tool also shows the destinations of those same trips.

The O-D Analytics suite illustrates a heatmap drawn over a roadmap of the Washington Metropolitan area that illustrates the origin of trips that traveled on a specific road segment. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 22. Screenshot. The Origin-Destination Analytics Suite illustrates the origins for trips that passed over a very specific road segment.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

Crowdsourced Map Data

Description

Traditionally, public and private sector entities generated map data for specific uses. In transportation, most agencies purchase map data from one of the major map data providers. However, over the last several years, crowdsourced maps have become more prevalent and more viable for use in transportation operations. Public apps and websites create and update crowdsourced maps in near realtime or on a small delay if the contributor community needs to verify the changes. Most of the time, these maps are freely available (figure 23). Private sector map providers also use crowdsourcing to improve their maps, but their map data remains proprietary.

This screenshot shows an example of a map created with crowdsourced data. The emphasis is on transportation operations. Copyright 2019 OpenStreetMap contributors.

Figure 23. Screenshot. Example OpenStreetMap.
Source: OpenStreetMap.

Applications of Crowdsourced Map Data

Crowdsourced map data allows agencies to have access to more accurate and more frequently updated map data than available with traditional maps. It does not necessarily change the way TMCs use maps. Instead, many TMCs use crowdsourced map data in their ATMS or for their traveler information systems.

Attributes

Crowdsourced map data usually consists of the same or similar data elements as any traditional map, including roadway networks, landmarks, businesses, parks, etc. In general, crowdsourced map data is not a different data product, but rather a differently created data product.

Latency and Frequency: Updates to crowdsourced map data occur more frequently than traditional map data because crowdsourced mapping relies on constant input from customers in the field. Crowdsourced map data can sometimes be updated in realtime, allowing users to attach images, reviews, and other information. If not updated in realtime, data updates usually occur with low latency, which typically is the result of community validation or another type of crowdsourced verification to ensure the information is accurate.
Details: The level of detail in crowdsourced maps varies. For example, raw open source data may contain only basic map data, such as roadway geometries, landmarks, etc. Another company provides realtime routing; its map therefore contains a high level of detail related to roadways and speed limits, although it may be lacking when it comes to landmarks and non-transportation assets and information. Other map providers have a diverse customer base and provide many services, including directions and routing by mode (walking, biking, driving, transit, etc.), landmark and business information, traffic conditions, etc. As a result, their data tends to have a high level of detail in many different contexts; however, not all of it is available for purchase by agencies and TMCs.
Quality and Coverage: Similar to crowdsourced incident and congestion data, quality and coverage of crowdsourced map data relates to the number of participants contributing information. This means that urban and densely populated areas usually result in good data quality and coverage, while less-populated areas may have lower quality and coverage. Due to this, crowdsourced map data is considered a supplemental data set rather than a standalone source, unless it is for a specialized purpose, such as a routing application.

Data Availability

There are several crowdsourced map data providers including free, open-source options such as OSM and other private-sector providers. Many of these providers use large user communities that contribute map information, including GPS traces, link creation and identification, landmark and asset information, etc. Others may use technology deployed in vehicles to analyze surroundings and update map information such as speed limit information, restrictions, roadway changes, etc.

While each crowdsourced map data provider offers similar data elements, they collect data in different ways. A large community of global map contributors edits the OSM data. They collect and add map data elements and layers and publish them under the open source license. This approach enables data to be updated frequently and accessible to anyone interested in using the map. Map users are free to enhance the OSM map by adding information to it, with a caveat that users need an open source license to publish added data. While, conceptually, the OSM model appears to be ideal for rich and accurate mapping, the open source caveat often results in users not being willing to contribute their additions to the open source community, since their data is either proprietary, valuable to the contributor, or private in nature.

OSM data is truly free to use. It can be downloaded fully by any agency with few restrictions. While many agencies also think of other maps as being free, one may not actually download the raw basemap data from many common map providers. OSM also includes user-generated map data that goes beyond roads. For example, contributors may have mapped out the locations of nearby items of interest, such as trees, fire hydrants, public restrooms, potholes, etc.

OSM data have some limitations, which may or may not affect traffic operations. Because the OSM system depends on volunteers to produce and edit maps, data quality and consistency can vary from location to location around the world. Metadata can sometimes be lacking, which makes it difficult to know if certain layers are current. The data is also not completely authoritative. While the road network was derived from U.S. Census data, making it fairly trustworthy, other layers (e.g., points of interest, location of trees, etc.) may be less authoritative.

A number of third-party map distributors have entered the market to address the OSM limitations listed above. These distributors enhance raw OSM maps and provide additional data elements, layers, and hosted services at a cost.

Most crowdsourced map data providers offer several different ways to obtain and use this data. TMCs are able to purchase raw map data, such as shapefiles and associated attributes, and integrate that data into their native ATMS. Alternatively, TMCs can procure hosted services in the form of tiling services or other map services. These hosted services allow TMCs to embed links to an external map service or to build maps in realtime by querying tiling services hosted elsewhere.
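Querying a tiling service typically means converting a location and zoom level into tile indices and then requesting that tile by URL. The sketch below uses the zoom/x/y Web Mercator tiling convention commonly used with OSM-based tile servers. The server address in the URL template is a placeholder, not a real service, and any particular provider may use a different URL layout or require an access key.

```python
import math


def lat_lon_to_tile(lat: float, lon: float, zoom: int) -> tuple:
    """Return the (x, y) indices of the Web Mercator tile containing a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(lat: float, lon: float, zoom: int,
             template: str = "https://tiles.example.org/{z}/{x}/{y}.png") -> str:
    """Build a tile request URL from a placeholder template (hypothetical host)."""
    x, y = lat_lon_to_tile(lat, lon, zoom)
    return template.format(z=zoom, x=x, y=y)


print(lat_lon_to_tile(0.0, 0.0, 1))
print(tile_url(0.0, 0.0, 1))
```

An ATMS map client repeats this calculation for every tile in view as the operator pans and zooms.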

OSM data (figure 24) are available as a topological data structure consisting of four primary elements: nodes, ways, relations, and tags. This data is stored in a primary database that hosts all edits and is the primary source of all OSM data output formats. In addition to raw data, users are able to obtain individual GPS traces submitted by contributors.

Note that figure 24 illustrates the data that is used to create the information on the mapping or image tiles. The map is not just pictures that can readily be integrated into a GIS platform.
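The node, way, and tag elements can be read with a standard XML parser. The sketch below parses a very small, hand-written document in the OSM XML layout and extracts ways tagged as roads. The sample content is invented, and the output dictionary structure is simply one convenient choice.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout; values are invented for illustration.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="38.9000" lon="-77.0000"/>
  <node id="2" lat="38.9010" lon="-77.0000"/>
  <node id="3" lat="38.9005" lon="-77.0010">
    <tag k="amenity" v="toilets"/>
  </node>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Example Street"/>
  </way>
</osm>"""


def parse_roads(osm_xml: str) -> list:
    """Return ways that carry a 'highway' tag, with their node coordinates resolved."""
    root = ET.fromstring(osm_xml)
    # Nodes hold the coordinates; ways only reference nodes by id.
    nodes = {
        node.get("id"): (float(node.get("lat")), float(node.get("lon")))
        for node in root.findall("node")
    }
    roads = []
    for way in root.findall("way"):
        tags = {tag.get("k"): tag.get("v") for tag in way.findall("tag")}
        if "highway" not in tags:
            continue
        coords = [nodes[nd.get("ref")] for nd in way.findall("nd") if nd.get("ref") in nodes]
        roads.append({"id": way.get("id"), "tags": tags, "coords": coords})
    return roads


for road in parse_roads(SAMPLE_OSM):
    print(road["id"], road["tags"].get("name"), road["coords"])
```

For a full regional extract, a streaming parser or a dedicated OSM library would be more practical than loading the whole file into memory.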

Screen capture of XML code depicting crowdsourced OpenStreetMap data.

© OpenStreetMap contributors.

Figure 24. Screenshot. Example of OpenStreetMap data in extensible markup language format.
Source: OpenStreetMap. Data is available under the Open Database License.

Pros and Cons

The greatest benefit of crowdsourced map data is in its update frequency. Crowdsourced maps do not have to rely on expensive and infrequent satellite sweeps or physical path traversing. Instead, map users can contribute changes and additions to existing maps to keep them updated in near realtime.

OpenStreetMap copyright and license states: If you alter or build upon our data, you may distribute the result only under the same license.

Figure 25. Screenshot. License rules for OpenStreetMap.
Source: OpenStreetMap.

As with other crowdsourced data, there is some level of unreliability due to the possibility of users intentionally or unintentionally providing erroneous data. However, these map user communities are usually quick to "self-heal" through change moderation or independent verification by other users.

One provider often prompts its users to answer several questions about locations, businesses, and landmarks (e.g., "is there a wheelchair accessible ramp?") to validate and enhance existing data.

In cases of crowdsourced map data that also requires an open data license, the license may force agencies to share any derived data in the public domain. This might violate other licenses and agreements, and the data might not be wanted. For example, the OSM license requires users who derive data from OSM to publish that data back under Open Data Commons Open Database License (figure 25). This may be undesired if derived data is of sensitive or proprietary nature.

The above licensing rule should not restrict agencies from using OSM data. Agencies can still create derivative products, such as additional map layers, without having to share those back. It is only when an agency directly edits the OSM files/database that those data need to be re-shared with the community. This subtle difference in understanding the licensing terms allows third parties to leverage OSM data to create their own derivative products and services that they then sell back to agencies and other companies.

For some specific applications in transportation, crowdsourced map data may not be sufficiently accurate by itself. For example, agency asset-management divisions may need to know details about individual assets that the user community creating the map might not focus on. They could still leverage a crowdsourced map for understanding the road network, but they might have to create their own supplemental layers for certain assets like guardrails, signs, etc. if the user community had not already created those layers.

Use Cases for Crowdsourced Map Data

Use Case: Virginia DOT 511 System. The VDOT 511 traveler information system (figure 26) uses MapBox and OSM as its primary map data. VDOT overlays a number of data elements on top of OSM base data, including event locations and details, dynamic message signs (DMS), color-coded segments based on realtime probe vehicle speed, Waze-reported events, weather conditions, etc.

MapBox builds additional detail on top of OSM and then provides that product and associated services at a cost (figure 27).

This screenshot gives a map of Virginia with major highways. Using the icon layers to the left of the map, users can display or turn off display of various icon layers that show, for example, road work, message signs, events, weather closures, and many other features.

Figure 26. Screenshot. The Virginia Department of Transportation 511 home page.
Source: Virginia Department of Transportation, 511 Website. Available at: http://www.511virginia.org/.

This map shows part of New York City and the location of trees within the city. The tree locations are color coded based on the tree diameters. Copyright 2019 OpenStreetMap contributor Eden Halperin.

Figure 27. Screenshot. Example of a MapBox map visualizing trees in New York City.
Source: Mapbox, Gallery, Website. Available at: https://www.mapbox.com/gallery/#map-0.

This screenshot shows the layout of the Waze map editor screen for data input. Users can report map errors or contribute new information (e.g., new roads, new infrastructure such as red-light cameras, etc.). The input screen shown here lists address, road type, restrictions, direction, speed limit, look, and elevation. Copyright 2019 Waze.

Figure 28. Screenshot. Inputs panel on the Waze Map Editor.
Source: Waze.

Use Case: Port Authority of New York and New Jersey Road Closure Application. PANYNJ uses the Waze crowdsourced map as part of its online road-closure application to indicate which roads are closed. This helps to keep traffic from being routed onto those roadways. Waze collects crowdsourced map data in addition to collecting crowdsourced congestion and incident data as a basis for its maps (figure 28). Each Waze user can report map errors or contribute new information (e.g., new roads, new infrastructure such as red-light cameras, etc.) using the Waze Map Editor. This makes the Waze map very accurate and timely when it comes to the roadway network. However, the Waze map is not available for purchase by third parties at this time, and therefore it is of limited use to TMCs.

Waze allows users to use a map editor to suggest changes in existing map data, and if the user is a known map editor in their area, or other users verify the suggested change, that change becomes permanent in the new map. Waze has defined different user levels based on their level of engagement on the platform. Users logged into the Waze app collect points as they drive more miles and as they report events and have those events validated by other users. Classification of Waze users into different map editor levels depends on the amount of driving they do and the number of edits they submit. Initially, users can only make edits within 1 mile around routes they traversed. Trustworthy Waze users gain the ability to provide edits more broadly based on integration and validation of their map edits. In addition to regular Waze users, different geographic regions have dedicated Waze Map Editors verified by Waze as trusted contributors who can validate other editors' contributions and provide guidance.

General Transit Feed Specification (GTFS)

One of the byproducts of Google Maps was the creation of the Google Transit Feed Specification (later renamed the General Transit Feed Specification). GTFS was created as a way for Google to collect agency transit data (schedules, routes, fares, etc.) and display it on Google Maps. Agencies saw this as a great opportunity to provide transit data to a wider customer base and built their systems to provide GTFS feeds and further enhance Google Maps, opening up new opportunities to add value for their customers.
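A GTFS feed is a set of comma-separated text files, which makes it straightforward to read with general-purpose tools. The sketch below loads stop locations from the contents of a stops.txt file using four of the fields the specification defines for that file. The sample rows are invented, and a real feed contains additional files and fields that are not handled here.

```python
import csv
import io

# Invented sample rows using field names defined for stops.txt in GTFS.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Station,38.9000,-77.0000
S2,Example Plaza,38.9100,-77.0100
"""


def load_stops(stops_txt: str) -> dict:
    """Return {stop_id: (stop_name, latitude, longitude)} from stops.txt content."""
    stops = {}
    for row in csv.DictReader(io.StringIO(stops_txt)):
        stops[row["stop_id"]] = (
            row["stop_name"],
            float(row["stop_lat"]),
            float(row["stop_lon"]),
        )
    return stops


for stop_id, (name, lat, lon) in load_stops(SAMPLE_STOPS_TXT).items():
    print(stop_id, name, lat, lon)
```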

Use Case: California's Bay Area 511 System. 511 SF Bay is the multimodal traveler information system managed by a partnership of agencies led by the Metropolitan Transportation Commission, the California Highway Patrol (CHP), and the California Department of Transportation. The system provides traveler information for Bay Area travelers via the web or phone. 511 SF Bay uses Google Maps online to show a range of realtime and static traffic, carpooling, transit, parking, and bicycling information.

Google Maps (figure 29) is an example of a proprietary map that uses Google-generated map data in combination with user-contributed data. Google collects location information using their own mobile devices, devices running the Android operating system, and devices running Google location services-powered apps. In addition to this, Google has invested in a fleet of Google StreetView vehicles that traverse roadways (and, recently, trails) and generate images of the locations on the map as part of their StreetView service (figure 30). Users of Google Maps are frequently asked to contribute to map data by validating data collected by Google (such as verifying the location a user navigated to is correct, or verifying that hours of operation for a business are correct), as well as contributing supplemental data such as images of the business or landmark, reviews of services, etc. Due to the significant size of the Google Maps user base, it tends to have one of the more accurate and up-to-date map data sets. Still, updating brand new roads can take a little time—sometimes longer than a DOT would like. While agencies can purchase Google Maps for use in their TMCs, it can be expensive and may limit an agency in what maps can be used for and how they can be modified or enhanced.

The Google Maps platform provides three different products: maps, routes, and places. Each product consists of a number of capabilities available via APIs (figure 31). For example, the routes product allows users to generate directions for different modes of transportation, distance matrices that provide travel times and distances between locations, and the roads traveled during a trip.

This Google map shows Manhattan and parts of New Jersey. Google maps can give directions, provide short facts about a location, and show various features of interest. Zoom features allow users to focus on any level of detail. Copyright 2018 Google.

Figure 29. Screenshot. Google Map.
Source: Google.

Google maps view of a street looking along the direction of travel. The roadway has an overlay indicating it is 5th Ave. Copyright 2018 Google.

Figure 30. Screenshot. Google StreetView.
Source: Google.

Screen capture of a stream of XML code. Copyright 2019 Google.

Figure 31. Screenshot. Example routing result code from Google Maps.
Source: Google.
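As a rough illustration of how a TMC application might consume a routing result, the sketch below builds a request URL for the Google Directions web service and totals distance and duration from a response. The sample response is hand-written and trimmed to the fields used here, the API key is a placeholder, and the helper names are hypothetical. No request is actually sent.

```python
import json
from urllib.parse import urlencode

DIRECTIONS_URL = "https://maps.googleapis.com/maps/api/directions/json"


def build_directions_url(origin, destination, mode, api_key):
    """Return a Directions request URL for one origin/destination pair."""
    params = {"origin": origin, "destination": destination,
              "mode": mode, "key": api_key}
    return f"{DIRECTIONS_URL}?{urlencode(params)}"


def summarize_first_route(response):
    """Total distance (meters) and duration (seconds) over the first route's legs."""
    legs = response["routes"][0]["legs"]
    meters = sum(leg["distance"]["value"] for leg in legs)
    seconds = sum(leg["duration"]["value"] for leg in legs)
    return meters, seconds


# Hand-written sample (an assumption for illustration), not real service output.
SAMPLE_RESPONSE = json.loads("""
{"status": "OK",
 "routes": [{"legs": [{"distance": {"value": 16093},
                       "duration": {"value": 1200}}]}]}
""")

url = build_directions_url("Baltimore, MD", "Washington, DC",
                           "driving", "YOUR_API_KEY")
meters, seconds = summarize_first_route(SAMPLE_RESPONSE)
print(url)
print(f"{meters / 1609.344:.1f} miles, {seconds / 60:.0f} minutes")
```

An agency would also need to review the platform's terms of use before storing or redistributing results, as noted above.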

Probe-based Speed Data

Description

Over the last several decades, agencies often used floating car data to supplement speed data collected by traditional static sensors. Floating car runs were intermittent, tedious, and time consuming, so the resulting data proved only marginally useful. Over the last 10 years, the private sector expanded on this concept by using technology built into vehicles and smartphones to transform a large percentage of travelers into probe vehicles. As the number of probes rose, the accuracy and reliability of probe speed data increased to the point where day-to-day transportation operations and planning efforts realized significant benefits.

Probe vehicle data comes from vehicles and people equipped with embedded GPS devices (in their vehicles or smartphones) and provides speed and travel time information. Planning and operations use this information now that it can be aggregated and anonymized.

Another way to generate probe vehicle data is using toll tags, as the Florida DOT (FDOT) does (figure 32). Toll operators use toll tags to identify and re-identify vehicles as they traverse the toll facility and use that information to calculate speed and travel times between re-identification points.

This diagram shows that Florida DOT AVI Units send information along a path that makes the information available to FDOT Users. The AVI Unit information path follows this order: AVI Unit to Roadside CPU to AVI server to OOCEA Travel time server to RTMC server, after which it is made available to FDOT users. Toll operators use toll tags to identify and re-identify vehicles as they traverse the toll facility and use that information to calculate speed and travel times between re-identification points. The process reads toll tags, transmits toll tag reads, computes travel times, and presents travel times to FDOT Users. At each stage, the data is archived as either toll tag reads or travel times.

Figure 32. Diagram. Toll tag travel time calculation.
Source: FHWA, iFlorida Model Deployment Final Evaluation Report, 2009. Available at: https://ops.fhwa.dot.gov/publications/fhwahop08050/chap_4.htm.
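The re-identification step can be sketched as follows. This is a minimal illustration rather than FDOT's implementation: the tag identifiers, timestamps, and the 2-mile reader spacing are made up, and a production system would also anonymize tags and filter outliers such as vehicles that stop between readers.

```python
from datetime import datetime


def match_travel_times(upstream_reads, downstream_reads, distance_miles):
    """Pair reads of the same tag at two readers.

    Each read is a (tag_id, timestamp) tuple. Returns a list of
    (travel_time_seconds, speed_mph) for every tag seen at both readers.
    """
    first_seen = {}
    for tag_id, ts in upstream_reads:
        first_seen.setdefault(tag_id, ts)  # keep the earliest upstream read

    matches = []
    for tag_id, ts in downstream_reads:
        start = first_seen.get(tag_id)
        if start is None or ts <= start:
            continue  # tag never seen upstream, or reads are out of order
        seconds = (ts - start).total_seconds()
        matches.append((seconds, distance_miles * 3600 / seconds))
    return matches


def t(hms):
    """Parse a time of day on an arbitrary (hypothetical) date."""
    return datetime.strptime("2019-01-01 " + hms, "%Y-%m-%d %H:%M:%S")


upstream = [("A1", t("08:00:00")), ("B2", t("08:00:30")), ("C3", t("08:01:00"))]
downstream = [("A1", t("08:02:00")), ("B2", t("08:03:00")), ("D4", t("08:03:10"))]

matches = match_travel_times(upstream, downstream, distance_miles=2.0)
for seconds, mph in matches:
    print(f"{seconds:.0f} s, {mph:.0f} mph")
```

Tags C3 and D4 are each seen at only one reader, so they contribute no travel time.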

Applications of Probe Vehicle Data

Probe vehicle data has many applications, including, but not limited to:

Monitoring realtime congestion.
- Detecting and identifying incidents.
- Issuing traveler information.
- Conducting work zone monitoring and impact analysis activities.
- Detecting the end of the queue.
- Comparing realtime speed information to historical trends.
- Identifying recurring and non-recurring bottlenecks.
Performance management.
- Evaluating performance metrics over time: travel time, buffer time, reliability, planning time, and associated indices.
- Incorporating data into dynamic performance management dashboards.
- Investigating user delay cost.
- Meeting Federal performance reporting requirements, including Moving Ahead for Progress in the 21st Century (MAP-21) third performance measure rule (PM3) reporting.
- Evaluating worst bottlenecks in a region for a period of time.
- Studying trends, including special event, holiday, and seasonal movements.
- Exploring the impacts of capital investments – prior to, during, and after completion of the project.
Planning and research.
- Identifying problems.
- Prioritizing projects.
- Performing safety analyses.
- Implementing public participation/information campaigns.
- Conducting before and after studies.
Traveler information.
- Providing realtime travel time information on DMS.
- Delivering network performance information.
- Distributing special event and holiday guidance.

Attributes

Probe vehicle data is different from traditional ITS sensor data because it is link-based (probe) rather than point-based (sensor). This means that calculation of speed and travel time occurs over some distance on the roadway. Each speed/travel time record is associated with a timestamp and geographic link identifier as well as some form of confidence score that provides insight into the reliability of each data record. In the early days of probe vehicle data, providers exclusively used traffic message channel codes to identify geographic links associated with speed and travel time data. However, in recent years, data providers have increased granularity and coverage of their measurements to be able to provide data in sub-segments and outside of the traffic message channel network. Those smaller links vary across providers as they use proprietary technology and aggregation methods in data generation. Links can be as short as a couple hundred feet.

Latency: Probe vehicle data generally has low latency, with probes reporting at intervals ranging from 10 to 30 seconds. Providers aggregate this data and make it available in feeds, usually within 1 to 2 minutes after collection in the field. This latency is sufficient for most operations and planning purposes.
Details: Probe vehicle data is consistent when it comes to the level of available details. Most providers collect speed, travel time, and quality data per segment of roadway.
Quality and Coverage: The quality and coverage of probe vehicle data have improved continuously over the decade that it has been available. Providers are constantly adding probes and improving collection and aggregation techniques to reduce latency and increase accuracy and coverage. Because providers collect data from different sources, data quality differs somewhat across regions and road classes. Some providers may have better quality data on arterials and in urban areas, but may have deficiencies in rural areas. Overall, providers are comparable in terms of quality and coverage, with slight differences being relevant in specialized applications.

To get the best available data and the largest possible coverage, agencies sometimes purchase complementary data from multiple providers. Many agencies have developed data use agreements with quality expectation clauses that hold providers to minimum quality requirements and help maintain a competitive market. To enforce these clauses, agencies often employ independent validators to analyze the data and compare it to ground truth data to ensure it complies with the minimum required accuracy and quality standards.
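A validation check of this kind might look like the sketch below. It computes two commonly used error measures, average absolute speed error and speed error bias, from paired provider and ground truth speeds. The sample pairs and the acceptance limits are hypothetical placeholders, not values drawn from any agency's agreement.

```python
def speed_error_stats(pairs):
    """Return (average absolute speed error, speed error bias) in mph.

    pairs: (provider_mph, ground_truth_mph) for the same link and time period.
    """
    errors = [provider - truth for provider, truth in pairs]
    if not errors:
        raise ValueError("no paired observations to validate")
    average_absolute_error = sum(abs(e) for e in errors) / len(errors)
    bias = sum(errors) / len(errors)
    return average_absolute_error, bias


# Hypothetical acceptance limits, for illustration only.
MAX_ABSOLUTE_ERROR_MPH = 10.0
MAX_BIAS_MPH = 5.0

sample_pairs = [(52.0, 55.0), (30.0, 26.0), (61.0, 60.0), (18.0, 25.0)]
abs_error, bias = speed_error_stats(sample_pairs)
passes = abs_error <= MAX_ABSOLUTE_ERROR_MPH and abs(bias) <= MAX_BIAS_MPH
print(f"absolute error {abs_error:.2f} mph, bias {bias:+.2f} mph, passes: {passes}")
```

In practice, validators often compute these measures separately for different speed ranges, because errors tend to be larger when traffic is congested.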

Data Availability

Private sector probe vehicle speed data providers work with their partners and service users to collect billions of GPS data points and aggregate them into speed and travel time data records based on underlying map segmentation. Transportation Network Companies (TNCs) collect similar data, but are not currently selling this information to public agencies for operations purposes. TNC data is limited to areas where TNCs operate and have significant penetration, which generally means large metropolitan areas.

Additionally, many toll facility operators collect probe speed data using toll tag readers. For example, the Florida Turnpike Enterprise collects toll tag data and conflates it to their custom link-based map used in conjunction with their sensor data to monitor congestion and detect any issues related to traffic flow on the toll road.

The Florida Turnpike Enterprise uses toll tag reads from different gantries to determine average link speed on its toll facility. It uses its own roadway segmentation, which does not conform to any specific standard but works well with its internal operations map, and calculates speed and travel time across those links based on toll tag identification.

Pros and Cons

The primary benefit of probe vehicle data is ubiquitous coverage of the roadway network. In comparison to traditional ITS sensors deployed on a limited subset of the network, probe vehicle data can be collected anywhere there are equipped vehicles or devices. The current state of technology shows that probe vehicle data covers a large and growing percentage of all roadway networks. Figure 33 shows an example from Georgia, where the triangles represent the location of sensors deployed by the Georgia DOT in Atlanta. Roads with color on them are receiving probe-based speed measurements from a private sector probe-based speed data provider.

This screenshot shows a map of various travel routes in Atlanta, Georgia. Along the highlighted routes, triangles are plotted and represent the location of sensors deployed by the Georgia DOT in Atlanta. Roads with color on them are receiving probe-based speed measurements from a private sector probe-based speed data provider. On colored roadways, the data provider is actively conducting probe-based speed measurements. Clicking a triangle brings up an information box with road location, time and date, type of sensor, average speed, traffic volume, and number of lanes. A graph shows the volume and speed of traffic over 14 hours. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 33. Screenshot. Probe data provides ubiquitous coverage.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

Probe vehicle data has demonstrated a high level of accuracy and quality on highways, especially in metropolitan areas. The quality and accuracy of probe vehicle data on arterial roadways are not as high due to the nature of interrupted flow. However, providers have been working on improving data on the arterial roadway network and have shown some improvements over the last several years.

One of the major challenges associated with probe vehicle data is that it does not have associated volume information. So while speed and travel time information may be accurate, it is not clear how many vehicles may be experiencing that speed and travel time. This information is still mainly generated by the static ITS sensors. In recent months, there has been emerging research providing methods to approximate volumes for probe vehicle data. While not 100 percent accurate, these methods will continue to improve and be a viable option for operations and planning use.

Use Cases for Probe-Based Speed Data

Use Case: Capital Investment and Project Selection. Agencies have a responsibility to identify best uses of limited funds and resources to improve safety and mobility. Previous project selection and capital investment decisions required significant analysis and research influenced by political pressure or vocal groups. With the availability of probe vehicle data, agencies have an opportunity to identify necessary projects based on insights from data.

For example, agencies can use probe vehicle data to generate a list of the most congested spots in a region (municipality, county, State, or multi-State region). Probe vehicle data helps to identify bottlenecks, which are sets of consecutive roadway segments where speeds drop below a certain threshold and remain below that threshold for some period. Properties of each bottleneck include its average length, duration, intensity, and frequency and pattern of occurrence. Significant and repeating bottlenecks can be good indicators of issues that can be resolved through implementation of improved operational strategies or capital investments. Figure 34 shows bottleneck ranking using INRIX data.

This shows an example bottleneck ranking metric table for a specific highway over a two-month period. The table gives location data and the specific time and impact of each event. A location map can display a specific event, and a time spiral shows all bottlenecks, the type of event, and the maximum queue length at the location. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 34. Screenshot. Bottleneck ranking.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.
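The temporal part of the bottleneck definition given earlier (speed drops below a threshold and stays there for some period) can be sketched for a single segment as follows. The threshold of 60 percent of reference speed and the five-interval minimum duration are illustrative assumptions, not values prescribed by any provider or tool. Grouping adjacent congested segments into one bottleneck is omitted for brevity.

```python
def congested_runs(speeds, reference_speed, ratio=0.6, min_intervals=5):
    """Find sustained slowdowns in a time-ordered list of speeds for one segment.

    Returns (start_index, length) for each run of at least `min_intervals`
    consecutive readings below `ratio * reference_speed`.
    """
    cutoff = ratio * reference_speed
    runs = []
    start = None
    for i, speed in enumerate(speeds):
        if speed < cutoff:
            if start is None:
                start = i  # a new slowdown begins
        elif start is not None:
            if i - start >= min_intervals:
                runs.append((start, i - start))
            start = None
    # Close out a slowdown that is still active at the end of the series.
    if start is not None and len(speeds) - start >= min_intervals:
        runs.append((start, len(speeds) - start))
    return runs


# Hypothetical one-minute speeds (mph) on a segment with a 60 mph reference speed.
speeds = [58, 55, 34, 30, 28, 25, 31, 35, 50, 57, 33, 32, 59, 30, 29, 27, 26, 24]
runs = congested_runs(speeds, reference_speed=60)
print(runs)
```

With these inputs, the two-minute dip at indices 10 and 11 is too short to count, while the six-minute and five-minute slowdowns are reported. Run length and how far speed falls below the cutoff correspond to the duration and intensity properties described above.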

Once a particular project location is identified, probe vehicle data can be used to analyze conditions prior to, during, and after implementation of the solution to determine the actual impact of the investment (figure 35). Animating speed data over time shows changes in congestion patterns. Probe vehicle speed data can be combined with other data sets, such as traffic volumes, traffic type classifications (commercial or passenger), and estimated value of travelers' time, to determine user delay costs (figure 36) and cost changes resulting from project implementation. Similar analysis can evaluate major event impact.

This screenshot shows two maps covering the same area during and after a major winter storm. The map on the left shows the relative traffic speed as a percentage. During the storm a number of routes show comparative speed below 50 percent. A text box for a selected road shows the road name, intersection, travel direction, speed, historic average speed, and comparative speed. The map on the right shows the same roads a week after the weather event. Routes are all at 70 percent or above for travel speeds. A text box for a selected road shows the road name, intersection, travel direction, speed, historic average speed, and comparative speed. Copyright 2019 I-95 Corridor Coalition.

Figure 35. Screenshot. Trend map shows a comparison of performance before, during, and after a major event.
Source: I-95 Corridor Coalition, Real-Time Traffic Incident Management Information System User Group Presentation. Available at: www.i95coalition.org.

This screenshot shows the PDA user delay cost estimate due to congestion following the collapse of the I-85 bridge in Atlanta, Georgia. This information is presented as a color-coded table of cost and delay impacts. As an example, two highlighted rows show that user delays increased by approximately 20 percent on the Thursday and Friday following the collapse. The cost impact for the same two days shows a user delay cost of $7.2 million for Thursday and $7.8 million for Friday. The cost for a typical Thursday is $5 million to $6 million. Also, the afternoon rush started two to three hours earlier than normal. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 36. Screenshot. User delay cost resulting from a major event.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory
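The user delay cost calculation described before figures 35 and 36 combines speed, volume, vehicle classification, and value of time. A simplified sketch for one segment and one time period follows. The volume, truck share, and dollar values are hypothetical, and the sketch ignores vehicle occupancy, which a fuller analysis would typically include.

```python
def user_delay_cost(length_miles, speed_mph, reference_mph,
                    volume, commercial_share, value_of_time):
    """Estimate delay cost (dollars) on one segment for one time period.

    Delay per vehicle is the extra travel time relative to the reference
    speed; speeds at or above the reference produce no delay.
    """
    delay_hours = max(0.0, length_miles / speed_mph - length_miles / reference_mph)
    commercial = volume * commercial_share
    passenger = volume - commercial
    return delay_hours * (passenger * value_of_time["passenger"]
                          + commercial * value_of_time["commercial"])


# Hypothetical hourly values of time in dollars per vehicle-hour.
VALUE_OF_TIME = {"passenger": 20.0, "commercial": 100.0}

cost = user_delay_cost(length_miles=2.0, speed_mph=30.0, reference_mph=60.0,
                       volume=3000, commercial_share=0.1,
                       value_of_time=VALUE_OF_TIME)
print(f"${cost:,.0f}")
```

Here each vehicle loses 2 minutes, so 3,000 vehicles accumulate 100 vehicle-hours of delay. Summing this quantity over all segments and periods before and after a project gives the change in user delay cost.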

Use Case: Performance Management and Reporting. Probe vehicle data can help agencies evaluate mobility performance of the transportation system at varying geographic levels, from hyperlocal to regional. Probe vehicle speed and travel time data can provide the ability to calculate valuable metrics such as travel time index, buffer time index, and planning time index, which provide insight into system performance in comparison to average conditions, historical conditions, or ideal free-flow conditions. These metrics can help an agency evaluate its operations and identify areas for improvement.
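As a sketch of how these indices can be computed from archived travel times, the example below uses commonly cited definitions: travel time index as mean travel time over free-flow travel time, planning time index as 95th percentile travel time over free-flow travel time, and buffer time index as the gap between the 95th percentile and the mean, divided by the mean. Agencies sometimes use variations of these definitions, and the sample travel times are made up.

```python
from statistics import mean, quantiles


def reliability_indices(travel_times, free_flow_time):
    """Compute travel time, planning time, and buffer time indices.

    travel_times: observed travel times for one corridor and time period,
    in the same units as free_flow_time.
    """
    average = mean(travel_times)
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    p95 = quantiles(travel_times, n=20, method="inclusive")[18]
    return {
        "travel_time_index": average / free_flow_time,
        "planning_time_index": p95 / free_flow_time,
        "buffer_time_index": (p95 - average) / average,
    }


# Hypothetical weekday peak travel times (minutes) on a 10-minute free-flow corridor.
observed = [10.0] * 10 + [12.0] * 5 + [15.0] * 4 + [20.0, 30.0]
indices = reliability_indices(observed, free_flow_time=10.0)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

A planning time index of 2.0 in this sample means a traveler should budget twice the free-flow time to arrive on time on 19 out of 20 days.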

These metrics feed realtime dynamic performance dashboards (figure 37) that identify issues as they occur, and they serve as inputs into daily, weekly, monthly, quarterly, and annual reports to the legislature, decision makers, and the public. Figure 38 shows an example performance report.

The Travel time widget shows an example of a Pennsylvania turnpike closure. The widget displays travel times to different locations in both directions. Displayed information includes velocity as differential from average travel speeds, the current velocity, and the historic velocity for the particular corridor. It also includes estimated travel time as differential from average travel time, the currently estimated travel time, and the historic travel time. Copyright 2019 Pennsylvania Turnpike Commission.

Figure 37. Screenshot. Example of a performance dashboard travel time widget.
Source: Pennsylvania Turnpike Commission.

Figure 38. Screenshot. Example performance report.
Source: Maryland Department of Transportation, 2015 Maryland Mobility Report. Available at: https://www.roads.maryland.gov/OPPEN/2015%20mobility%20report%20draft_highres_for%20website1.pdf.

Use Case: Traveler Information and Holiday Traffic Forecasting. Archived probe vehicle data provides insight into travel patterns, especially as they relate to holiday traffic. Agencies can use these historical patterns to forecast impacts and inform the traveling public accordingly. For example, the Baltimore Metropolitan Council uses probe vehicle speed and travel time data to identify best days and times for travel during Thanksgiving week and publishes that information on their website to inform the traveling public (figure 39).

This screenshot shows the days from the Tuesday before Thanksgiving to the Monday after Thanksgiving. For each day, a banner states times to avoid travel (or the best days to travel), and insights into locations where traffic might be heaviest and general messages related to travel. Copyright 2016 Maryland Department of Transportation.

Figure 39. Screenshot. Example of an Interstate travel forecast for the Baltimore, MD region during the week of Thanksgiving in 2016.
Source: Baltimore Metropolitan Council.

Use Case: Multiple Uses Across Planning and Operations. State DOTs, MPOs, and cities are using probe-vehicle speed data for a range of planning and operations purposes including traveler information and travel time generation, before and after studies, project prioritization, and performance measurement.

Maryland DOT, Arizona DOT, and Pennsylvania DOT are examples of organizations that use probe speed data from INRIX for these purposes.

INRIX primarily works with freight operators, fleet vehicles (such as taxicabs, United Parcel Service, FedEx, etc.), individual OEMs, and their custom app users to collect location data over time. INRIX aggregates individual probe readings per roadway segment to calculate segment-based speed and travel time in near realtime. In addition to speed readings and travel time calculations, INRIX provides confidence scores that implicitly indicate the number of probes used to generate data and the level of modeled, imputed, or archived data used to supplement a low probe count. INRIX makes this segment-based speed and travel time data available to TMCs as a realtime data feed, an archived raw data dump, or as part of an analytics package that includes both data and a set of data analysis tools.

INRIX provides speed, travel time, and confidence score values per segment of the road at a frequency as low as once per minute. They offer several different roadway segmentation patterns:

Traffic message channel code.
- Traffic message channel standard segmentation that splits the road network into segments of varying length, from a tenth of a mile to several miles (figure 40 and figure 41).
Extreme Definitions (XD).
- INRIX proprietary segmentation scheme that provides broader and more granular coverage than traffic message channel codes (figure 42 and figure 43). There is no relation between XDs and traffic message channels because they represent different segmentation patterns.

The screenshot shows a table of INRIX TMC data. The table columns are TMC, Type, Road Number, Road Name, First Name (Road), Linear TMC, Country, State, County, Zip, Direction, Start Longitude, Start Latitude, End Longitude, End Latitude, and Miles.

Figure 40. Screenshot. Example of INRIX TMC-based metadata.
Source: INRIX.

This shows an example of TMC data as INRIX code. Each line of code contains the TMC code, speed, average, reference, delta, score, travel time minutes, and congestion level.

Figure 41. Screenshot. Example speed and travel time data.
Source: INRIX.

The confidence score indicates the quality of speed and travel time data. The score can be 10, 20, or 30.

Ten usually indicates data that is historical or based on road reference speeds.
Twenty represents medium confidence based on realtime data across multiple segments and/or based on a combination of expected and realtime data.
Thirty indicates a high-confidence measurement that signifies realtime data for a specific segment.
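An application that should act only on realtime measurements can filter on this score. The sketch below is illustrative: the record layout and segment identifiers are hypothetical and only loosely follow the fields described for figure 41.

```python
REALTIME_SCORE = 30  # score described above as realtime data for the specific segment

# Hypothetical records for illustration; not actual provider output.
records = [
    {"segment": "SEG-A", "speed_mph": 62, "score": 30},
    {"segment": "SEG-B", "speed_mph": 41, "score": 20},
    {"segment": "SEG-C", "speed_mph": 55, "score": 10},
    {"segment": "SEG-D", "speed_mph": 23, "score": 30},
]


def realtime_only(rows, min_score=REALTIME_SCORE):
    """Keep records whose confidence score meets the minimum."""
    return [row for row in rows if row["score"] >= min_score]


realtime = realtime_only(records)
realtime_share = len(realtime) / len(records)
print([row["segment"] for row in realtime])
print(f"{realtime_share:.0%} of records are realtime measurements")
```

Tracking the realtime share over time is one simple way to see where and when coverage depends on historical or modeled values.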

Note that the data depicted in figures 42 and 43 is not relevant per se; rather, the purpose of these illustrations is to indicate typical data type and configuration.

Figure 42. Screenshot. Extreme Definition metadata.
Source: INRIX.

This shows an example of TMC data as Extreme Data code. Each line of code contains the Segment code, type, speed, average, reference, score, c-value, travel time minutes, and speed bucket.

Figure 43. Screenshot. Example speed and travel time data expressed in extreme definition format.
Source: INRIX.

North Carolina DOT, FDOT, and Georgia DOT use probe speed data from HERE for these activities (figure 44).

This map shows HERE speed data represented by color-coded roads using three colors to represent speed, travel time, and confidence value per segment of the road. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 44. Screenshot. HERE speed data in Orlando, FL.
Source: INRIX.

HERE works with smartphone manufacturers and cell service providers, as well as OEMs, to collect location data over time. HERE and INRIX data formats are very similar, with slight differences in how they define data quality measures. HERE also provides data access to TMCs via realtime feeds, archived raw data dumps, and analytics platforms that include both data and analysis tools.

Similar to INRIX, HERE provides speed, travel time, and confidence value per segment of the road at frequencies as low as once per minute (figure 45 and figure 46). HERE data use several segmentation patterns as well:

Traffic message channel code – same standardized segmentation that INRIX uses.
Sub-segments:
- Proprietary HERE segmentation that is more granular than traffic message channel codes, but also dynamically defined. This means that sub-segment measurements are not necessarily available at every polling interval; values are assigned only when specific speed conditions are met.
- Sub-segments can also include a per-lane data breakdown.

HERE probe data also comes with a quality measurement in the form of a confidence factor.

A confidence factor of 0.0 to 0.5 (including 0.5) represents lower confidence for reference speed measurements.
A confidence factor of 0.5 to 0.7 (including 0.7) represents medium confidence where measurements are a combination of historic and realtime data.
A confidence factor of 0.7 to 1.0 (including 1.0) represents high confidence measurements based on realtime data for that specific segment.
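Because each range above includes its upper bound, a classifier needs to treat 0.5 as lower confidence and 0.7 as medium confidence. A minimal sketch follows. The band labels and sample factors are assumptions made for illustration.

```python
from collections import Counter


def here_confidence_band(factor):
    """Map a confidence factor in [0.0, 1.0] to a band with inclusive upper bounds."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError(f"confidence factor out of range: {factor}")
    if factor <= 0.5:
        return "low"
    if factor <= 0.7:
        return "medium"
    return "high"


# Hypothetical confidence factors from one polling interval.
sample_factors = [0.3, 0.5, 0.55, 0.7, 0.71, 0.95, 1.0]
band_counts = Counter(here_confidence_band(f) for f in sample_factors)
print(dict(band_counts))
```

Mapping each provider's quality measure to shared bands like these is one way to compare feeds from different providers in a single tool.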

This screenshot shows an example of the per-lane data code.

Figure 45. Screenshot. Example HERE traffic message channel per-lane data.
Source: HERE.

This screenshot shows an example of sub-segment per-lane data code.

Figure 46. Screenshot. Example HERE sub-segment per-lane data.
Source: HERE.

In conjunction with the ICC, Maryland DOT is currently testing the use of data from the private sector vendor, TomTom, for a range of planning and operations functions. TomTom collects probe speed data from its navigation system users, which include both passenger and freight vehicles. Unlike INRIX and HERE, both of which are present globally in some way, TomTom has a strong presence outside of the U.S. and generally larger coverage in Europe. Similar to INRIX and HERE, TomTom collects segment-based speed and travel-time data and provides this data to TMCs as realtime data feeds.

Similar to INRIX and HERE, TomTom provides segment-based speed and travel-time data at frequencies as low as once per minute (figure 47). TomTom data uses two segmentation patterns:

Traffic message channel code – same standardized segmentation that INRIX and HERE use.
OpenLR – Open source dynamic location referencing system using binary format.

Image depicts a screen filled with code based on TomTom data.

Figure 47. Screenshot. Example TomTom data.
Source: TomTom.

High-Resolution Asset Data

Description

The parallel development of higher fidelity sensors, LiDAR, and efficient machine learning algorithms has enabled companies to begin collecting asset data at a high resolution and in realtime. Today, many passenger vehicles are equipped with powerful sensors, and many high-end connected and semi-automated vehicles have high-definition cameras and LiDAR. Companies have found ways to use those sensors and cameras to collect information about a vehicle's environment and use that data to update agency asset-management systems with the most current information.

For example, a vehicle can use machine learning to process an incoming image from an on-board camera like those found on some higher-end semi-autonomous vehicles and recognize a speed limit sign. Not only does the vehicle recognize that an object is a speed limit sign, but it can also determine the listed speed on the sign and compare it to the stored value (either in a remote or local database) and determine if the sign has changed. The use of these cameras on CVs is a form of crowdsourcing that can be used to support asset management.

Specialized vehicles can measure other asset states, such as pavement cracks, potholes, pavement markings, signs, etc. As vehicles collect data on these assets, they can compare them over time to determine if they are deteriorating at an expected, slower, or faster rate.

Applications of High-Resolution Asset Data

High-definition asset data can update asset information for maintenance, operations, and planning purposes. For example, detecting a new sign or a change in an existing sign can provide updated information for internal maps and keep the asset inventory up to date for maintenance and operations purposes. Similarly, pavement conditions can be stored and analyzed over time to determine the rate of deterioration and prioritize maintenance investments.
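One simple way to estimate a deterioration rate from repeated condition measurements is a least-squares line fit, sketched below. The 0 to 100 condition score, the survey values, and the maintenance threshold of 60 are all hypothetical. Real pavement deterioration is often nonlinear, so a straight line is only a first approximation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def years_until(threshold, slope, intercept, current_year):
    """Years from current_year until the fitted line reaches threshold."""
    if slope >= 0:
        return None  # condition is not declining
    return (threshold - intercept) / slope - current_year


# Hypothetical annual condition scores for one pavement section.
survey_years = [0.0, 1.0, 2.0, 3.0]
condition_scores = [90.0, 86.0, 82.0, 78.0]

rate, start_score = fit_line(survey_years, condition_scores)
remaining = years_until(60.0, rate, start_score, current_year=3.0)
print(f"rate {rate:.1f} points/year, about {remaining:.1f} years to threshold")
```

Comparing fitted rates across sections gives a basis for ranking which sections are deteriorating faster than expected.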

Attributes

Data attributes vary based on technology and specific asset focus, but generally, data includes location, timestamp, and asset information. Asset information can be available as point clouds, images, deviation measurements, etc.

Latency: Certain types of asset collection methods can be close to realtime. For example, camera identification of specific features using image processing and machine learning is quick, and results are available shortly after collection. On the other hand, LiDAR or three-dimensional scan data can require extensive processing of raw data to obtain a useful level of information for asset management.
Details: High-resolution asset data generally has a high level of detail. These methods involve collection of high-resolution images or billions of point measurements transformed into very high quality asset information.
Quality and Coverage: Quality of high-resolution data is associated with high cost, which affects the level of coverage possible. For example, LiDAR scans can be expensive and limit the extent of coverage.

Point Clouds and Asset Mapping. Point cloud data is sometimes considered a subset of high-resolution asset data. It is the underlying data on which some (but not all) high-resolution asset data is built. Point cloud data is becoming a more popular way to identify and provide insights into assets on and adjacent to the road (or the condition of the road itself). Sometimes referred to as LiDAR data or three-dimensional scanner data, point cloud data consists of extremely precise x, y, z coordinates that can be combined with timestamps, intensity, red, green, blue (RGB) values, and other attributes to generate three-dimensional views of assets (signs, guard rails, pavement cracking, trees, etc.). A scanned point cloud for even a small area often consists of billions of points. At this scale, many agencies struggle to store and process these point clouds to generate meaningful asset information.
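To give a sense of the storage problem, the sketch below packs one point with the attributes listed above and scales the record size to a large scan. The binary layout is a hypothetical one chosen for illustration. Real formats differ and often use compression or scaled integer coordinates.

```python
import struct

# Hypothetical packed layout (little-endian, no padding):
#   x, y, z, timestamp as 64-bit floats; intensity as a 16-bit unsigned
#   integer; red, green, blue as 8-bit unsigned integers.
POINT_FORMAT = "<4dH3B"
POINT_SIZE = struct.calcsize(POINT_FORMAT)  # 4*8 + 2 + 3 = 37 bytes


def pack_point(x, y, z, timestamp, intensity, red, green, blue):
    """Serialize one point into the hypothetical 37-byte record."""
    return struct.pack(POINT_FORMAT, x, y, z, timestamp, intensity, red, green, blue)


def storage_gigabytes(point_count):
    """Uncompressed size, in decimal gigabytes, of point_count records."""
    return point_count * POINT_SIZE / 1e9


packed = pack_point(512034.125, 3158872.5, 21.75, 1546300800.0, 4200, 250, 250, 245)
unpacked = struct.unpack(POINT_FORMAT, packed)
scan_size_gb = storage_gigabytes(2_000_000_000)
print(POINT_SIZE, "bytes per point")
print(f"{scan_size_gb:.0f} GB for 2 billion points")
```

Even at 37 bytes per point, a 2-billion-point scan needs roughly 74 GB before any indexing or derived products are added.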

Transformation into mapped assets makes point clouds useful, as shown in figure 48. Some companies specialize in extracting meaningful asset information from LiDAR point clouds. For example, one company extracted features such as lanes, guardrails, barriers, signs, street lights, drainage, and other assets and created a mapped asset database used by maintenance and operations at FDOT. Other companies are producing high-resolution maps and developing asset management solutions for DOTs now, all based on point cloud data, which they store, transform, and manage for the agency.

The first of three point cloud images. This image depicts point cloud data that illustrates the road and objects off the side of the road, with objects displayed as different colors, but too many points to show specific and useful information. Copyright 2017 Yang Gao et al.

a) Original point cloud data.

The second of three point cloud images. This image of the same area is filtered to show only the point cluster of pavement markings. This displays as white lines on a black background. Copyright 2017 Yang Gao et al.

b) Point cluster of pavement markings.

Third of three point cloud images. This image is filtered to show only vectorized pavement markings as purple lines on a black background. Copyright 2017 Yang Gao et al.

c) Vectorized pavement markings.

Figure 48. Illustrations. Examples of point clouds and Asset Mapping.
Source: Yang Gao et al. (2017). Automatic extraction of pavement markings on streets from point cloud data of mobile LiDAR. Measurement Science and Technology. 28 085203. Available at https://doi.org/10.1088/1361-6501/aa76a3.

Data Availability

A number of data providers offer high-resolution asset data using new technology and artificial intelligence. Some example providers include the following:

A private sector data provider has marketed a new capability to detect signs using vehicle camera systems and use machine learning algorithms to determine the sign type and contents to inform the vehicle as well as update internal maps to account for the latest asset information.⁸
Iowa State University has developed a method using end-to-end deep learning to identify specific objects with a focus on automated vehicles (figure 49). They used convolutional neural networks to identify and classify objects in near realtime using vehicle cameras. This method can recognize and inventory static assets either using crowdsourced data from the public or specially equipped agency vehicles.
Another private-sector firm uses a system that consists of a series of displacement lasers mounted on a custom-designed semi-trailer that collects pavement structural response at highway speeds. These measurements provide pavement conditions in response to loads to determine if pavement is overloaded or if there is a need for overlay.
One vendor has developed laser scanners that can be used to quickly (1 million points per second) generate high-resolution point clouds and create measurements, profiles, sections, contours, volumes, etc.⁹

This is a photo of an intersection in an urban setting. Objects have rectangles drawn around them with labels indicating the type of objects. This image shows five vehicles labeled as cars and one object labeled as a pedestrian. Copyright 2018 Anuj Sharma.

Figure 49. Illustration. Deep learning object identification presented.
Source: Anuj Sharma, presentation to the 2018 TRB Annual Meeting Session P18-21443.

Pros and Cons

The primary benefit of the new technology and data collection methods for high-definition asset data is that data can be available in near realtime. Previously, asset data collection required significant effort and cost, leading to infrequent use. Now asset data collection occurs at a much higher frequency and by multiple vehicles, making the data available in nearly realtime.

One challenge in using sensors, LiDAR, and cameras to collect asset data is that the collected data can be inaccurate. For example, a database may indicate that there is a speed limit sign at a particular location, but as a vehicle passes that location, a truck in an adjacent lane obstructs the sensor's view of the sign. In this situation, the sensor, camera, or LiDAR may report that the speed limit sign has been removed because it was not visible.

Similarly, some sensors, and many cameras, do not perform as well in adverse weather conditions such as rain, snow, and fog. In this case, data collected by a mobile sensor may be inaccurate or unreliable.

While this data is valuable, it may be difficult to handle in its raw format. For example, until processed, point clouds or contours may not give an agency useful or actionable information. Agencies rarely have the resources or expertise to process massive raw point cloud data sets, so they have to rely on third parties to process that data and package it for use in operations and maintenance.

Wi-Fi and Bluetooth Re-identification Data

Description

This technology provides true, measured travel times and O-D data between two locations. Wi-Fi and Bluetooth re-identification data became a valuable source of actionable information as smartphone penetration increased to the point where almost every driver, passenger, bicyclist, and pedestrian has a smartphone that is equipped with Wi-Fi and/or Bluetooth. In recent years, many vehicles have also been equipped with Wi-Fi or Bluetooth transceivers. Wi-Fi and Bluetooth data is primarily location data collected by static scanners that identify Wi-Fi or Bluetooth devices at different locations and use that information to infer speed and travel time. If deployed extensively throughout a network, the scanners can also support travel pattern and O-D analysis.

Applications of Wi-Fi and Bluetooth Re-identification Data

Some of the key applications of this data set include:

Travel time calculation.
Realtime queue warning and information.
Traffic signal optimization.
O-D analysis.
Work zone management.

Attributes

Most Bluetooth and Wi-Fi data consists of a media access control address (MAC address), a unique address that identifies a device as it moves through the network. For example, a vehicle is detected by one sensor that records the vehicle's MAC address. As that vehicle passes a second sensor, the MAC address is captured again. The time between the two detections is the travel time, and the distance between the two sensors divided by that time gives the speed.
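The re-identification calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the function name, the tuple layout, and the sample detections are all hypothetical, and the sketch assumes each sensor reports a (MAC, timestamp) pair and that the distance between sensors is known.

```python
from datetime import datetime

def match_travel_times(upstream, downstream, distance_miles):
    """Pair detections of the same MAC address at two sensors.

    upstream / downstream: lists of (mac, datetime) tuples (hypothetical layout).
    Returns a list of (mac, travel_time_seconds, speed_mph).
    """
    first_seen = {}
    for mac, ts in upstream:
        # Keep the earliest upstream detection for each device.
        if mac not in first_seen or ts < first_seen[mac]:
            first_seen[mac] = ts

    trips = []
    for mac, ts in downstream:
        start = first_seen.get(mac)
        if start is None or ts <= start:
            continue  # never seen upstream, or detected in the wrong order
        seconds = (ts - start).total_seconds()
        speed_mph = distance_miles / (seconds / 3600.0)
        trips.append((mac, seconds, speed_mph))
    return trips

# Illustrative detections: only device "a1" is seen at both sensors.
upstream = [("a1", datetime(2019, 3, 22, 8, 0, 0)),
            ("b2", datetime(2019, 3, 22, 8, 0, 30))]
downstream = [("a1", datetime(2019, 3, 22, 8, 2, 0)),
              ("c3", datetime(2019, 3, 22, 8, 2, 10))]
trips = match_travel_times(upstream, downstream, distance_miles=2.0)
```

In this example, device "a1" covers 2 miles in 120 seconds, which works out to about 60 miles per hour. Devices seen at only one sensor produce no travel time.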

Data Availability

Several private companies provide this data. Some providers offer Bluetooth roadside sensors that detect Bluetooth devices in realtime to approximate travel time. Another company provides Bluetooth devices that can be used to detect passing vehicles and pedestrians to obtain O-D, travel time, and trip pattern information. At least one company offers remote configurable devices that identify both Bluetooth and Wi-Fi devices and can determine travel patterns in various weather and traffic conditions. Another provides small portable devices as well as complete pole-mount sensor solutions for temporary or permanent data collection.

Pros and Cons

One of the primary benefits of Bluetooth and Wi-Fi data is that it provides direct insight into individual trips between each sensor. Sensors are relatively inexpensive and lightweight, so deployment to address a core data need in an area is easy to accomplish.

The downside of these sensors is that they still require power and communications back to the agency. They are physical devices that must be deployed and maintained in the field. Data is only collected where sensors have been deployed, making them significantly more costly to deploy on a wide scale when compared to probe data. Data can only be collected from vehicles and Wi-Fi/Bluetooth-equipped devices that are turned on and actively broadcasting their signal. Data filtering needs to be built into the device or performed by the agency to remove anomalous data from vehicles that pull over into retail areas and do not return to the road network for an extended time. Finally, Wi-Fi pollution and general interference can result in distortion or data drops.
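One simple way to filter out trips by vehicles that left the road network between sensors is to discard travel times far above the typical value for the same period. The sketch below is only an assumption about how such a filter might look; the function name and the threshold of three times the median are hypothetical choices, not values from any deployed system.

```python
from statistics import median

def filter_stopped_trips(travel_times_s, max_ratio=3.0):
    """Drop travel times more than max_ratio times the median.

    The max_ratio threshold is an illustrative assumption; an agency would
    tune it (or use a different outlier test) for its own corridors.
    """
    if not travel_times_s:
        return []
    typical = median(travel_times_s)
    return [t for t in travel_times_s if t <= max_ratio * typical]

# Five through trips of about two minutes and one vehicle that stopped
# for roughly half an hour between the sensors.
samples = [118, 125, 121, 130, 1900, 119]
kept = filter_stopped_trips(samples)
```

Here the median is 123 seconds, so the 1,900-second trip is removed and the other five are kept.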

Credit Card Transaction Data

The All Hazards Consortium and the Department of Homeland Security use data from credit card point-of-sale machines to detect which businesses are open or closed following significant events (like a hurricane, snowstorm, or terrorist attack). The credit card swipe data serves as a surrogate measure of power outage information. Using the crowdsourced credit card swipes is an easy way to globally detect whether services are being restored properly in an area—which helps traffic operations, first responders, and maintenance crews understand where to focus their response and recovery efforts.

Applications of Credit Card Transaction Data

The All Hazards Consortium's FLEET OPEN/CLOSED Service is designed to help locate open places of business that provide fuel, food, medications and medical supplies, retail stores, and hotel rooms during a prolonged power outage within a city, county, State, region, or across the United States. The basic benefit will include the ability to find fuel, food, pharmacies, and hotel locations during power outages. This information is for commercial use only and is not available to the public, ensuring that non-operational users do not overload the system. Beyond this capability, the system can support collaboration with electric companies on the details of outage areas and the location of critical infrastructure, coordinated through the electric sector liaison or contact person within the response coordination cycle.

The service is available 24 hours per day with electric updates made multiple times every day. This provides the latest information in near realtime. Navigating this site is simple and enables users to zoom in on a particular region, State, or city. This information is encrypted, secure, protected, and restricted to users only.

Attributes

There is no personally identifiable information in these data feeds. The data contains only the number of transactions and the time of each transaction. There is also positioning information (addresses and business names) that ties the transactions back to a specific location.

Pros and Cons

The data is only useful during major events that affect power in a region. Therefore, the agency is paying for data not used on a regular basis. The agency must also manipulate the data into a surrogate for power outages. For example, one can keep a record of transactions for all businesses over time, and then provide warnings to users when zero transactions are occurring during a specific time when businesses should be open and operational.
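The zero-transaction warning described above can be sketched as follows. The data layout, the function name, and the business names are hypothetical; the sketch assumes the agency has already derived each business's normal opening hours from its archive of transactions.

```python
def flag_possible_outages(expected_open_hours, hourly_counts, hour):
    """Return businesses with zero transactions in an hour they are normally open.

    expected_open_hours: {business: set of hours (0-23) normally open},
        assumed to be built from archived transaction records.
    hourly_counts: {business: {hour: transaction_count}} for the current day.
    """
    flagged = []
    for business, open_hours in expected_open_hours.items():
        if hour not in open_hours:
            continue  # normally closed at this hour, so zero is expected
        if hourly_counts.get(business, {}).get(hour, 0) == 0:
            flagged.append(business)
    return sorted(flagged)

# Illustrative inputs for a single hour (10:00-10:59).
expected = {
    "Fuel Stop A": set(range(6, 22)),
    "Pharmacy B": set(range(9, 18)),
    "Diner C": set(range(6, 14)),
}
today = {"Fuel Stop A": {10: 14}, "Pharmacy B": {10: 0}}
warnings = flag_possible_outages(expected, today, hour=10)
```

In this example the pharmacy and the diner are flagged because neither has recorded a transaction during an hour when it would normally be open. A flag is only a surrogate indicator; a business can be quiet for reasons other than a power outage.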

The data provider requires additional levels of security and credentialing to be placed on users of this data. For example, the provider may require a specific level of encryption and a radio frequency identification (RFID)¹⁰ card for users who want to see the data. This is a relatively minor technical barrier, but can be an operational barrier if agencies need to purchase new computer equipment with card readers and RFID badges for employees.

Connected Vehicle Data From Third Parties

Description

Many vehicle manufacturers have assembled millions of vehicles with embedded CV technology (figure 50). Data from these providers can include wiper use, headlight use, heavy braking, traction control, fuel consumption, emissions, travel speed, acceleration and deceleration, and more. Example applications include using wiper activation data for rain and micro-level weather predictions, using heavy braking events for end-of-queue or debris detection, or using traction control engagement events for detecting slippery or icy road conditions. These predictions and detections allow agencies to pre-position maintenance and response crews to mitigate the impacts of events.

This shows an aerial view of traffic data represented by colored points indicating different event types collected from connected vehicles. Some specific locations are labeled with text and include: Fast wiper use by 300 vehicles, six heavy braking events, abnormal fuel consumption, red signal in 2.5 seconds, high percentage of abnormal freight routing, 23 traction control engagements, and rollover. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 50. Illustration. Third-party, connected-vehicle events from the Washington, D.C. region.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

Data Availability

Telematics providers have been making this data available to consumers and freight logisticians for years. However, some mapping and location data companies are on the cusp of making this data available to DOTs as part of their regular incident data feeds. Millions of these vehicles are operating on the Nation's roadways—more vehicles than in any CV Pilot in the United States today—and are reporting events precisely as they happen, exactly where they happen, without any need for installed roadside infrastructure. The data comes to a central server via wireless, cellular technologies embedded in vehicles. The data is anonymized, cleaned, and then provided as an incident feed to agencies and automobile manufacturers.

Applications of Connected Vehicles Data

Surface Weather Event Detection. When a driver turns on windshield wipers, this becomes a surrogate for the arrival of rain and (usually) the beginning of a period with wet and slippery pavement. Plotting exactly where and when dozens of vehicles turn on windshield wipers allows not just knowledge of where it is raining, but realtime calculation of the movement of weather at the micro-level; this will enable predictions about where it will hit next. Furthermore, reporting heavy braking, traction control engagement, travel speed, acceleration, and deceleration to TMCs without latency allows smart networks to react almost reflexively as major flow disruptions occur (e.g., to warn vehicles to slow down early).

Early Event Notifications. Air bag deployments, rollover indicators, and extreme deceleration can immediately alert agencies of a potential incident on the roadway in advance of other detection technologies (like a 911 call, service patrol identification, etc.).

Attributes

Data attributes vary from provider to provider, but look very similar to agency incident data. There is usually an event type, location, time stamp, etc.

Pros and Cons

Because this data is automatically collected over any road network that has access to cellular data service, the DOT can quickly gain enhanced situational awareness on parts of their road network that would otherwise lack coverage. CV data can also alert the DOT of potentially dangerous events and surface conditions long before a scene becomes a true incident. This could allow the agency to be proactive instead of reactive—for example, in responding to slippery surface conditions noticed by traction control engagements in multiple vehicles. Alternatively, the data could potentially be voluminous and distracting to operators—much like crowdsourced data. Therefore, some additional filtering and automated integration with the ATMS is generally recommended.
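One possible form of the additional filtering mentioned above is to raise an alert only when several events of the same type are reported on the same road segment within a short time. The sketch below assumes each event carries an event type, a segment identifier, and a timestamp; the function name, the 5-minute window, and the three-event threshold are illustrative assumptions rather than recommended values.

```python
from collections import defaultdict

def corroborated_alerts(events, window_s=300, min_events=3):
    """Return (event_type, segment_id) pairs reported at least min_events
    times within any window_s-second span.

    events: iterable of (event_type, segment_id, epoch_seconds) tuples
    (a hypothetical layout; real provider feeds will differ).
    """
    by_key = defaultdict(list)
    for event_type, segment_id, ts in events:
        by_key[(event_type, segment_id)].append(ts)

    alerts = []
    for key, stamps in by_key.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Slide the left edge forward until the span fits in the window.
            while stamps[end] - stamps[start] > window_s:
                start += 1
            if end - start + 1 >= min_events:
                alerts.append(key)
                break
    return sorted(alerts)

# Illustrative feed: three traction control engagements on one segment
# within 200 seconds, plus two isolated heavy braking events.
events = [
    ("traction_control", "I-95-N-12", 0),
    ("traction_control", "I-95-N-12", 60),
    ("heavy_braking", "I-95-N-12", 90),
    ("traction_control", "I-95-N-12", 200),
    ("heavy_braking", "US-1-S-3", 400),
]
alerts = corroborated_alerts(events)
```

Only the traction control cluster is passed on to the operator in this example; the isolated heavy braking events are suppressed.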

Realtime Turning Movement Data

Description

Realtime turning movement data describes how many vehicles are taking a particular exit, making a particular turn, or traveling through an intersection. Historically, this data was collected only via deployment of an extensive sensor network on exit ramps and at intersection approaches and exits. However, third parties are now directly measuring or estimating turning movements from a mix of probe data and other sensing technologies.

Applications for Realtime Turning Movement Data

Realtime turning movement data allows transportation agencies to determine driver compliance rates with routing advice. With the combination of high-resolution signal data and the commercial vehicle CV data, a transportation agency can realize greater benefits from integrated corridor management (ICM) concepts that link freeway operations and signalized arterial operations. Freeway and arterial operations will be able to coordinate better through shared information about network flows, rerouting, and the impacts of incidents and events. Agencies could suggest alternate routes and actually measure what percentage of vehicles are taking their advice (or making up their own paths).

Attributes

The attributes vary between data providers, but one provider gives the percentage of traffic taking Exit A versus B at each intersection or turning left versus right versus going straight on arterials.
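As a simple illustration of this attribute, the sketch below converts movement counts at one intersection approach into percentages. The function name and the counts are hypothetical; a provider may deliver the percentages directly rather than the underlying counts.

```python
def turning_shares(counts):
    """Convert per-movement vehicle counts into percentages of the total.

    counts: {movement_name: vehicle_count} for one approach (hypothetical layout).
    """
    total = sum(counts.values())
    if total == 0:
        return {movement: 0.0 for movement in counts}
    return {movement: 100.0 * n / total for movement, n in counts.items()}

# Illustrative counts for one arterial approach over a 15-minute period.
shares = turning_shares({"left": 30, "through": 150, "right": 20})
```

With these counts, 15 percent of vehicles turn left, 75 percent go straight, and 10 percent turn right. Comparing the shares before and after an agency posts routing advice gives a rough indication of how many drivers followed it.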

Pros and Cons

The main drawback is likely that realtime turning movement data is new and the larger transportation community has not fully tested and validated it. Agencies that purchase this data will be able to derive new operational insights, but, since some of the data may be faulty, they need to perform some level of validation to ensure realistic decisions. Despite this caution, there is significant promise for this type of data.

High-Resolution Signal Data

Detailed SPaT plus sensor actuation data has a resolution of 1/10 of a second (10 Hz). This means that all of the fundamental metrics needed to understand how each signal is operating and how it is loaded with traffic will be available to the decision support system in realtime. A new class of small and relatively inexpensive "aggregator" devices makes it possible to interface with the controller inside each cabinet. These devices not only bundle sensor data streams and ship them to the TMC but also receive and implement commands coming from the TMC. This capability provides a low-cost connection for DSS to control precisely the phases and timings of large networks of signals at once. (These devices work with traffic signals of any age or technology, with or without a hardwired communication network, giving them broad usefulness for later nationwide deployment.)

Description

Traditionally, signalized corridors were managed using snapshot-view retiming methods that required manual data collection and modeling. As a result, daily counts were not representative of changing conditions throughout the day. This gave rise to signal timing that was not responsive to conditions.

New data collection technology built into controllers generates high-resolution signal data, recording over 150 different controller events at a very high frequency. This high-resolution data allows operators to evaluate signal system performance in realtime and provides strong support for accurate optimization of the system to manage traffic flow.

Applications of High-Resolution Signal Data

The primary application of high-resolution signal data is in measuring the performance of a signal system in realtime to be able to optimize the system and adapt to the changing conditions in an automated manner. This realtime optimization takes the place of traditional signal operation modeling and estimation. From a planning perspective, archived high-resolution signal data can provide insights into seasonal variations and general changes over time. Similarly, this enables the use of high-resolution data to evaluate queues, turn movement counts, and volume-to-capacity ratios. In addition to mobility improvement applications, high-resolution signal data also provides safety benefits. It allows operators to detect and record red light and right-turn-on-red violations. Examples of the performance measures supported include:

Percentage of Arrivals on Green. Whereas re-identification provides effective metrics for a corridor containing several signals, high-resolution controller data provides detailed information on the performance of each signalized intersection. In addition to mining the high-resolution data for streamlining maintenance, the sensor actuation data combined with SPaT can be used to judge the health of signal coordination by measuring the percentage of vehicles that arrive on the green phase. The Purdue Coordination Diagram can cascade to all signals in a corridor and provides an effective visual of the quality of progression.
Capacity Utilization at Intersections. The inability of a vehicle to clear a light on a single cycle evokes driver dissatisfaction and complaint calls. Techniques to monitor capacity utilization, namely the frequency of occurrence of split failures, are available through high-resolution data. These metrics indicate when a movement on a phase is over capacity and requires more green; if multiple phases are failing simultaneously, they indicate the need for capital improvement or demand management.
Red-Light and Back-of-Queue Collision Reduction. Waze and others in the United States are already piloting efforts to share realtime SPaT data with third-party traveler information providers and navigation systems to alert users of impending red lights and green lights.
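The percentage of arrivals on green described above can be computed by checking each detector actuation against the green intervals of the coordinated phase. The sketch below is a minimal illustration under stated assumptions: the function name and data layout are hypothetical, green intervals are sorted and non-overlapping, and arrival times have already been adjusted to the moment each vehicle would reach the stop bar.

```python
from bisect import bisect_right

def percent_arrivals_on_green(green_intervals, arrivals):
    """Percentage of arrivals that fall inside a green interval.

    green_intervals: sorted, non-overlapping (start, end) times in seconds.
    arrivals: detector actuation times in seconds, assumed to be already
        projected to the stop bar.
    Returns None when there are no arrivals.
    """
    if not arrivals:
        return None
    starts = [start for start, _ in green_intervals]
    on_green = 0
    for t in arrivals:
        i = bisect_right(starts, t) - 1  # last green that began at or before t
        if i >= 0 and t < green_intervals[i][1]:
            on_green += 1
    return 100.0 * on_green / len(arrivals)

# Illustrative data: 30 seconds of green in each 90-second cycle.
greens = [(0.0, 30.0), (90.0, 120.0), (180.0, 210.0)]
arrivals = [5.2, 28.9, 45.0, 95.1, 119.9, 150.3, 181.0, 215.4]
aog = percent_arrivals_on_green(greens, arrivals)
```

In this example, five of the eight arrivals occur during green, giving 62.5 percent arrivals on green.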

Attributes

High-resolution signal data includes more than 150 different elements broken into active phase events, active pedestrian phase events, barrier and ring events, phase control events, overlap events, detector events, preemption events, coordination events, and cabinet/system events. All events are timestamped.

Each collection of events comprises a detailed list of signal and detector states. Collection of controller events at 10 Hz frequency means that operators have visibility into the state of the signal and intersection at a very fine granularity.

Pros and Cons

The primary benefit of high-resolution signal data is that it provides many data points at a high frequency, which allows operators to understand the realtime conditions at the intersection and quickly react to or even predict congestion and poor signal performance. Agencies have identified a number of previously unattainable performance measures calculated using this data.

Additionally, high-resolution signal controllers are inexpensive and can be phased into existing systems, allowing agencies to budget for and deploy high-resolution systems more quickly and affordably. The downside of this data set is that, at such a high frequency, it requires appropriate expertise and infrastructure to store and process.

Air Quality Sensor Data

Description

Air quality sensors are capable of identifying primary pollutants, generated directly by a source, as well as secondary pollutants resulting from chemical reactions in the surrounding air. The sensors collect a variety of measurements, including nitrogen dioxide, ozone, carbon monoxide, hydrogen sulfide, sulfur dioxide, and others. These sensors often can collect additional data elements such as sound and vibration, temperature, and in some cases light intensity. These low-cost sensors provide a hyperlocal view of air quality that can change from block to block and intersection to intersection in a dense urban area.

Applications of Air Quality Sensor Data

Air quality sensor data has several uses, especially in urban areas. For example:

Evaluating the level of air pollutants and providing operators with the ability to direct non-motorized travelers away from poor air quality areas to areas with better air quality, thereby encouraging or discouraging bicycle and pedestrian activity in particular areas.
Determining air quality at intersections for traffic signal system control. Air quality can become an input variable in timing plans to limit the level of pollution from too many vehicles or heavy-pollutant vehicles idling for too long at a signalized intersection. In effect, air quality-influenced traffic signals can indirectly address congested intersections while also considering the environmental impact factor of vehicles in the congestion.
Sound and vibration measurements used in conjunction with air quality measures to identify heavy vehicles moving through an urban area. This can help determine and direct urban freight movement patterns over delivery areas and inform effective parking utilization strategies.

Attributes

Sensors record timestamps and air concentration measures for a variety of pollutants, chemicals, temperature, pressure, and in some cases sound and vibration, as well as light intensity.

Data Availability

Many different vendors manufacture and sell air quality sensors. Several organizations provide localized air quality data. For example, the City of Chicago is equipped with over 100 air quality sensors as part of the Array of Things¹¹ collaborative effort between the Urban Center for Computation and Data (joint initiative of Argonne National Laboratory and University of Chicago) and the City of Chicago along with a number of other universities across the country. As part of this project, the city and University of Chicago have published the collected data for public consumption.

Pros and Cons

Air quality sensors are affordable, but they provide only hyperlocal measurements, which means that in order to get good areawide or regional insight into air quality, an agency would need to deploy a network of air quality sensors. In the case of transportation operations and planning applications, air quality sensors can provide valuable supplemental information, but are rarely sufficient on their own.

Roadway Weather Predictions

Description

While basic National Weather Service prediction data has been available for many years, these predictions cover wide areas and tend to focus more on air temperature and precipitation averaged over a region. Several companies now offer ground-based (i.e., at street level) 48-hour weather predictions that are updated every hour. This information can be used to optimize winter weather response operations; improve snow event readiness; reduce staffing, fuel, and chemical costs; pinpoint treatment applications; and generally keep the roads safer and less congested. Farmers have utilized the weather prediction services of some of these companies for years, and now, seeing a similar operational need, they have branched out into the transportation domain. These companies are able to produce specialized and hyperlocal weather conditions and predictions from a mix of satellites, ground-based radar, ground-sensors, and most recently using connected vehicles' sensors (often from agency-owned plow and maintenance vehicles), and more. Specialized algorithms and data processing then produce more accurate ground-based weather predictions. Operators use this to understand the weather as expected and then experienced by the driver.

Application of Weather Predictions

Weather can quickly and dramatically impact safety and mobility on roads. It can impact visibility. High winds can blow trucks off of roads and bridges. Icy roads can significantly impact vehicle performance. Table 12 shows the impacts of weather on roads, traffic flow, and operational decisions.

Table 12. Weather impacts on roads, traffic, and operational decisions.
Road Weather Variables	Roadway Impacts	Traffic Flow Impacts	Operational Impacts
Air Temperature and Humidity	N/A	N/A	Road treatment strategy (e.g., snow and ice control). Construction planning (e.g., paving and striping).
Wind Speed	Visibility distance (due to blowing snow, dust). Lane obstruction (due to wind-blown snow, debris).	Traffic speed. Travel time delay. Accident risk.	Vehicle performance (e.g., stability). Access control (e.g., restrict vehicle type, close road). Evacuation decision support.
Precipitation (Type, Rate, Start/End Times)	Visibility distance. Pavement friction. Lane obstruction.	Roadway capacity. Traffic speed. Travel time delay. Accident risk.	Vehicle performance (e.g., traction). Driver capabilities/behavior. Road treatment strategy. Traffic signal timing. Speed limit control. Evacuation decision support. Institutional coordination.
Fog	Visibility distance.	Traffic speed. Speed variance. Travel time delay. Accident risk.	Driver capabilities/behavior. Road treatment strategy. Access control. Speed limit control.
Pavement Temperature	Infrastructure damage.	N/A	Road treatment strategy.
Pavement Condition	Pavement friction. Infrastructure damage.	Roadway capacity. Traffic speed. Travel time delay. Accident risk.	Vehicle performance. Driver capabilities/behavior (e.g., route choice). Road treatment strategy. Traffic signal timing. Speed limit control.
Water Level	Lane submersion.	Traffic speed. Travel time delay. Accident risk.	Access control. Evacuation decision support. Institutional coordination.

Source: Federal Highway Administration. Road Weather Management Program. How do Weather Events Impact Roads? Available at: https://ops.fhwa.dot.gov/weather/q1_roadimpact.htm, last accessed March 22, 2019.

Having accurate weather predictions on surface conditions can help agencies alert motorists of unsafe driving conditions, allow for pretreatment, and inform ongoing operational strategies that can dramatically affect safety and mobility.

Attributes

While each data and service provider is different, common data elements and attributes of road-weather prediction providers can include:

Pavement and/or bridge temperatures.
Air temperature.
Wind speed and direction.
Visibility.
Precipitation type, rate, and/or accumulations.
Salinity content of the moisture on the roadway.
Ice/frost warnings.
Comparison of realtime conditions to predicted conditions.

Data Availability

This data is available from the private sector. One company has invested heavily in its weather solutions, which provide pavement and other weather predictions and data for maintenance and operations response. Originally developed for the agricultural industry, these weather solutions come in many forms, including a web-based client, mobile application, and data feeds. This company has even integrated some of its weather prediction products into other products and solutions, like signal systems applications and analytics.

Another company offers a solution that leverages existing DOT (or other) closed circuit television (CCTV) cameras and image-processing technologies. Leveraging thousands of already-deployed cameras, the company processes images from these cameras to detect precipitation, cloud cover, fog, and other environmental conditions, then merges that information with other National Weather Service predictions and observations. Web-based analytics provide alerts to operators when specific conditions are detected.

A few agencies have invested in in-house, full-time meteorologists. UDOT is one example, maintaining four full-time meteorologists in its operations center 24 hours per day, 7 days per week. To reduce costs, these staff are allowed to provide meteorological services to other parts of the country in addition to their Utah geographies.

Coverage varies by vendor. At least one provider leverages existing CCTV infrastructure, and therefore, certain attributes and alerts are more focused on areas where cameras have already been deployed. Another provider generally covers the entire country and all roads.

Pros and Cons

Realtime and predicted road weather information can be extremely beneficial to agencies operating in areas where winter weather, flooding, or other events can dramatically affect mobility and safety. The disadvantages to this data are relatively minimal and include cost, geographical resolution, and operator impacts. For example, while these services are higher resolution than most National Weather Service predictions, they are still limited to a resolution of 1 square kilometer. This resolution is generally good enough for most maintenance applications. Also, high-quality road weather data can be relatively expensive; however, the benefit-cost ratio can be significant because the data can reduce maintenance costs, excess materials used, and fuel consumption and can optimize resource management. It can also have a significant impact on mobility, traveler information, and safety.

Use Cases for Roadway Weather Predictions

Use Case: Indiana DOT. The Indiana DOT uses the Iteris ClearPath weather prediction software to help better manage the deployment of its snow plows and sand/salt trucks. With approximately 1,000 trucks in its fleet, the agency estimates that it can save over $750,000 in material costs alone ($750 per load of salt) if each truck makes one fewer trip. The use of more accurate weather predictions can help achieve this on multiple occasions throughout the year. This type of cost savings (which is conservative in that it only estimates the costs of materials, not operator time, fuel, equipment costs, or the reduction in crashes and safety impacts) quickly justifies the cost of the weather prediction service.
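The arithmetic behind the Indiana DOT estimate can be reproduced directly from the figures quoted above. The variable names are illustrative, and the calculation assumes one load of salt per truck per trip.

```python
# Figures quoted in the use case; one load of salt per truck per trip is assumed.
trucks = 1000           # approximate fleet size
cost_per_load = 750     # dollars of material per load of salt
trips_avoided = 1       # deployments avoided per truck

material_savings = trucks * cost_per_load * trips_avoided
```

This gives $750,000 in material savings for a single avoided fleet-wide deployment, before counting operator time, fuel, or equipment costs.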

Use Case: UDOT. UDOT has taken a different approach to predicting road weather conditions. As noted above, beginning in 2002, the agency employed its own full-time, in-house meteorologist in their traffic operations center (TOC). Unlike most weather reporting, the UDOT team does not provide a percentage chance of precipitation. Instead, they predict whether it will or will not snow, rain, etc. on different sections of roadways.

Established under the TOC's Traffic Management Division, UDOT's weather program has two main components:¹²

Four staff meteorologists stationed in the TOC provide year-round weather support.
An ITS component, which manages approximately 70 RWIS stations and expert systems such as bridge spray systems, high wind alerts, and fog warnings.

The meteorologists provide services like:

Forensic meteorology services (e.g., risk management, filling requests for data made through the Utah Government Records Access and Management Act (GRAMA)).
Forecasting services to UDOT maintenance, construction, and TOC personnel.
RWIS and weather training courses.

Year-round, long-term weather forecasts are provided and used mainly for planning materials, staffing, and equipment. They provide pre-storm, during-storm, and post-storm weather forecasts to the maintenance managers, area supervisors, and local garages. They also provide forecast services for road rehabilitation and avalanche safety. Construction engineers and contractors also receive weather forecasts from the meteorologists for new construction and renovation projects.

A 2008 research study was conducted to estimate the cost-effectiveness of the program on winter maintenance costs.¹³ The study estimated the value and the additional saving potential of the UDOT weather service to be 11–25 percent and 4–10 percent, respectively, of the UDOT winter maintenance costs, which include the costs of labor and materials. Based on this program's cost, the benefit-cost ratio was calculated at more than 11:1.

In addition to maintenance and construction, the meteorologists provide weather forecasts to TOC divisions including:

Signal systems.
ATMS.
Incident management team.
Traffic management.
Department of public safety.

They issue weather forecasts twice a day or more frequently as needed when conditions are changing rapidly.

Additional Use Cases: The Federal Highway Administration's (FHWA's) Road Weather Management Program has documented 27 additional use cases for road weather operations.¹⁴

Computer-Aided Dispatch (CAD) Data

Description

First responders and public safety agencies use CAD systems to identify the location of an emergency, the type of emergency, and then dispatch units to the scene. Generally, a 911 operator or a dispatcher who receives a call will enter basic information into a computerized system that pushes that information out to responders' mobile data terminals. The system can also be used by dispatchers to visualize field unit assignments to different calls. Some CAD systems are integrated with automated vehicle location (AVL) systems as well as telecommunications systems to allow for quicker information sharing and communications.

Traditionally, CAD systems have been closed secure systems used exclusively by public safety personnel to manage field operations. However, most traffic incidents are first reported via cell phone calls to a 911 center. For example, up to 88 percent of incidents in Virginia were first discovered by the Virginia State Police. This means that public safety agencies are often aware of a traffic incident well before a TOC. By the time the DOT becomes aware of an incident, either via a courtesy or procedural call from the partner public safety agency, or by detecting it using CCTV or other ITS equipment, it could be 5–10 minutes or even longer after the incident occurred. That initial time after the incident is critical in achieving efficient and effective medical response and quick clearance.

Over the last 15 years, many agencies have tried to integrate information from CAD systems into TMC operations to provide quicker incident detection and more efficient response. Some initial efforts focused on co-location of public safety and TMC personnel, but even in those instances, the communication and information sharing were inconsistent and limited. In the more recent past, agencies have begun integrating CAD systems data directly with the ATMS in a TOC to automate data exchange and reduce latency between when an incident occurs and when the TMC becomes aware of it. However, even these system integration efforts have encountered challenges related to information overload, sensitivity, and security.

Applications of Computer-Aided Dispatch Data

CAD data is mainly used to identify incidents more quickly and to identify incidents that a TMC would not otherwise be aware of, either due to lack of coverage or jurisdiction. TMCs have integrated CAD data into their systems in several different ways:

TMC/Public Safety Co-Location
- Instead of direct system integration, co-location of staff sometimes allows TMC operators to interact directly with public safety staff and exchange information.
- This type of information exchange is highly dependent on interpersonal relationships and space configuration, and therefore varies in effectiveness.
Read-Only Integration
- In a read-only arrangement, TMCs may have a scrolling screen that just shows CAD messages as they arrive.
- Alternatively, the ATMS may flash an alert when a CAD message arrives and allow the operator to create an ATMS event based on the CAD data and manage it separately.
- This is the most common type of integration and it exposes two key challenges:
  - Operators can be overwhelmed by alerts and start ignoring them, especially if there are a lot of false-positives.
  - Once a TMC operator creates a related event in the ATMS, the two records continue to diverge as CAD data changes, and the operator may or may not include those changes.
Two-Way Integration
- Two-way integration is the least common and most complex way to integrate CAD data. This method includes two-way data exchange between ATMS and CAD systems.
- Generally, CAD events arrive to the TMC and then TMC operators are able to modify and add information to the event, which then gets sent back to the public safety CAD system.
- While this type of integration may provide the most flexibility and the richest data set, it can also be complex to implement and can cause jurisdictional issues, with operators from different agencies entering conflicting data or overwriting each other's changes.
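The divergence problem noted under read-only integration can be pictured with a short sketch. The Python below is illustrative only and assumes a simple link between one ATMS event and one CAD incident; the `LinkedEvent` class, the field names, and the identifiers are hypothetical and do not come from any particular ATMS or CAD product.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedEvent:
    """An ATMS event that an operator created from a CAD message (hypothetical model)."""
    atms_event_id: str
    cad_incident_id: str
    # Snapshot of the CAD fields as they were when the operator last reviewed them.
    reviewed_cad_fields: dict = field(default_factory=dict)


def diverged_fields(link, latest_cad_fields):
    """Return {field: (reviewed_value, latest_value)} for each CAD field that changed."""
    changes = {}
    for name in sorted(set(link.reviewed_cad_fields) | set(latest_cad_fields)):
        old = link.reviewed_cad_fields.get(name)
        new = latest_cad_fields.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


def acknowledge(link, latest_cad_fields):
    """Record that the operator has reviewed the latest CAD values."""
    link.reviewed_cad_fields = dict(latest_cad_fields)


# Example: the CAD record is updated after the operator created the ATMS event.
link = LinkedEvent("ATMS-1", "CAD-1", {"status": "OPEN", "priority": "3"})
update = {"status": "CLOSED", "priority": "3", "disposition": "No further action"}
changes = diverged_fields(link, update)
acknowledge(link, update)
remaining = diverged_fields(link, update)
```

In this sketch the ATMS would surface `changes` to the operator rather than silently overwrite the ATMS event, which keeps the integration read-only while making the drift between the two records visible.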

Attributes

Latency: Information is input into a CAD system in realtime as it occurs. As a call arrives to the 911 system, the dispatcher inputs relevant data into the system. This data is then pushed out to relevant field personnel and partners. However, information flowing to partners has to traverse several networks and firewalls between agencies. In addition to the time needed for information to traverse the path between systems, some agencies and CAD systems have sharing mechanisms that prevent asynchronous information sharing. These systems may rely on some type of timer to dump and consume data, perhaps once a minute or once every several minutes.
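The effect of a timer-based export on latency can be estimated with simple arithmetic. The sketch below assumes the CAD system exports at fixed multiples of a dump interval and that network transit takes a constant time; both are simplifying assumptions, and the function names and sample values are hypothetical.

```python
import math


def visible_at(entered_s, dump_interval_s, transit_s=0.0):
    """Time (seconds) at which a record entered at `entered_s` reaches the TMC.

    Assumes exports fire at 0, dump_interval_s, 2 * dump_interval_s, ...
    """
    next_dump = math.ceil(entered_s / dump_interval_s) * dump_interval_s
    return next_dump + transit_s


def added_latency(entered_s, dump_interval_s, transit_s=0.0):
    """Delay between dispatcher entry and TMC visibility."""
    return visible_at(entered_s, dump_interval_s, transit_s) - entered_s


# A record entered 1 second after a 60-second dump waits almost the full interval.
just_missed = added_latency(61, 60, transit_s=2)
# A record entered exactly at a dump only pays the transit time.
on_the_tick = added_latency(120, 60, transit_s=2)
# With a 5-minute timer the worst case approaches 5 minutes.
slow_timer = added_latency(1, 300)
```

Under these assumptions the added delay ranges from the transit time alone up to nearly one full interval, and averages about half the interval when entries arrive at random times.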

Details: CAD data varies in level of detail and structure depending on agency and vendor. Generally, CAD data includes basic information including incident type, timestamp, and dispatched unit information. Some CAD data feeds also contain information about location, actions taken during the incident, caller information and status, etc.

An example incident record coming from a CAD system to a TMC looks like this:

<Incident>
<Authorization></Authorization>
<IncidentId>121150898573862915</IncidentId>
<TrackingNumber>2018-00494892</TrackingNumber>
<IncidentCreatedTimestamp>2018-11-02T23:01:38</IncidentCreatedTimestamp>
<IncidentEffectiveTimestamp>2018-11-02T23:01:47</IncidentEffectiveTimestamp>
<IncidentLastModifiedTimestamp>2018-11-02T23:01:47</IncidentLastModifiedTimestamp>
<IncidentAddress>123 EXAMPLE HWY </IncidentAddress>
<IncidentStreetPrefix></IncidentStreetPrefix>
<IncidentHouseNum>123</IncidentHouseNum>
<IncidentStreetName>EXAMPLE</IncidentStreetName>
<IncidentStreetSuffix>HWY</IncidentStreetSuffix>
<IncidentPostDirectional></IncidentPostDirectional>
<Building></Building>
<Apartment></Apartment>
<Community>EXAMPLE TOWN</Community>
<ExtendedLocation>EXAMPLE BUSINESS NAME</ExtendedLocation>
<IncidentLatitude>38.89512195241237</IncidentLatitude>
<IncidentLongitude>-77.03649619898232</IncidentLongitude>
<IncidentLocationAreaName>962</IncidentLocationAreaName>
<MsagIntersection>EX ST,EXAMPLE TOWN</MsagIntersection>
<MsagIntersection>US EXAMPLE HWY,EXAMPLE TOWN</MsagIntersection>
<MsagZip></MsagZip>
<MsagState>MD</MsagState>
<MsagOddEven>EVEN</MsagOddEven>
<MsagC1>962</MsagC1>
<MsagC2>1107902</MsagC2>
<MsagC3>500126 &amp; 502333</MsagC3>
<MsagC4>502214 &amp; 503724</MsagC4>
<MsagC5></MsagC5>
<MsagC6></MsagC6>
<MsagC7></MsagC7>
<MsagC8>EXAMPLE TOWN</MsagC8>
<MsagC9></MsagC9>
<MsagC10>EXAMPLE COUNTY</MsagC10>
<MsagC11>6, NORTHERN</MsagC11>
<MsagC12>96-NORTH</MsagC12>
<MsagC13>COUNTY</MsagC13>
<MsagC14>12-02-15</MsagC14>
<MsagC15></MsagC15>
<MsagC16></MsagC16>
<MsagC17></MsagC17>
<MsagC18></MsagC18>
<MsagC19></MsagC19>
<MsagC20></MsagC20>
<IncidentEventRefName>71 PREMISE CHECKS</IncidentEventRefName>
<IncidentEventRefDescription>71 PREMISE CHECKS</IncidentEventRefDescription>
<IncidentPriorityRefName>3</IncidentPriorityRefName>
<CurrentIncidentStatusRefName>CLOSED</CurrentIncidentStatusRefName>
<IncidentCreatorUserName>CAD0001</IncidentCreatorUserName>
<IncidentFirstDispatchTimestamp>2018-11-02T23:01:36</IncidentFirstDispatchTimestamp>
<IncidentFirstEnrouteTimestamp>2018-11-02T23:01:37</IncidentFirstEnrouteTimestamp>
<IncidentFirstOnSceneTimestamp>2018-11-02T23:01:38</IncidentFirstOnSceneTimestamp>
<IncidentCloseTimestamp>2018-11-02T23:01:47</IncidentCloseTimestamp>
<IncidentDisposition>
<IncidentDispositionRefName>C</IncidentDispositionRefName>
<IncidentDispositionRefDescription>No further action</IncidentDispositionRefDescription>
<IncidentDispositionUserName>CAD0001</IncidentDispositionUserName>
<IncidentDispositionRemarks></IncidentDispositionRemarks>
<IncidentDispositionEffectiveTimestamp>2018-11-02T23:01:47</IncidentDispositionEffectiveTimestamp>
</IncidentDisposition>
<CallForService>
<CallerName></CallerName>
<CallerPhoneNumber></CallerPhoneNumber>
<CallerCallBackNumber></CallerCallBackNumber>
<CallSourceRefName>Offc</CallSourceRefName>
<CallStatusRefName>CLOSED</CallStatusRefName>
<CallerAddress></CallerAddress>
<CallerStreetPrefix></CallerStreetPrefix>
<CallerHouseNum></CallerHouseNum>
<CallerStreetName></CallerStreetName>
<CallerStreetSuffix></CallerStreetSuffix>
<CallerPostDirectional></CallerPostDirectional>
<CallerBuilding></CallerBuilding>
<CallerApartment></CallerApartment>
<CallerCommunity></CallerCommunity>
<CallerExtendedLocation></CallerExtendedLocation>
<CallerLatitude></CallerLatitude>
<CallerLongitude></CallerLongitude>
<CallerLanguage></CallerLanguage>
<CallerLanguage></CallerLanguage>
<CallForServiceEffectiveTimestamp>2018-11-02T23:01:47</CallForServiceEffectiveTimestamp>
<CallForServiceLastModifiedTimestamp>2018-11-02T23:01:47</CallForServiceLastModifiedTimestamp>
</CallForService>
<ServiceRequest>
<ServiceRequestPriorityRefName>1</ServiceRequestPriorityRefName>
<ServiceRequestStatusRefName>CLOSED</ServiceRequestStatusRefName>
<ServiceRequestUnitOrgName>96 EXAMPLE TOWN</ServiceRequestUnitOrgName>
<ServiceRequestUnitOrgPath>CAD##96 EXAMPLE TOWN</ServiceRequestUnitOrgPath>
<ServiceRequestUnitOrgEventRefName></ServiceRequestUnitOrgEventRefName>
<ServiceRequestUnitOrgEventRefDescription></ServiceRequestUnitOrgEventRefDescription>
<ServiceRequestUnitOrgEventRefPriorityRefName>1</ServiceRequestUnitOrgEventRefPriorityRefName>
<ServiceRequestEffectiveTimestamp>2018-11-02T23:01:47</ServiceRequestEffectiveTimestamp>
<ServiceRequestLastModifiedTimestamp>2018-11-02T23:01:47</ServiceRequestLastModifiedTimestamp>
<Unit>
<UnitServiceRequestUnitName>0001</UnitServiceRequestUnitName>
<UnitServiceRequestStatusRefName>Available</UnitServiceRequestStatusRefName>
<UnitServiceRequestPreviousStatusRefType>AVAILABLE</UnitServiceRequestPreviousStatusRefType>
<UnitFirstDispatchUnitStatusRefName>Dispatched</UnitFirstDispatchUnitStatusRefName>
<UnitFirstDispatchTimestamp>2018-11-02T23:01:36</UnitFirstDispatchTimestamp>
<UnitFirstEnrouteUnitStatusRefName>EnRoute</UnitFirstEnrouteUnitStatusRefName>
<UnitFirstEnrouteTimestamp>2018-11-02T23:01:37</UnitFirstEnrouteTimestamp>
<UnitFirstOnSceneUnitStatusRefName>OnScene</UnitFirstOnSceneUnitStatusRefName>
<UnitFirstOnSceneTimestamp>2018-11-02T23:01:38</UnitFirstOnSceneTimestamp>
<UnitFirstSecondaryEnrouteUnitStatusRefName></UnitFirstSecondaryEnrouteUnitStatusRefName>
<UnitFirstSecondaryOnSceneUnitStatusRefName></UnitFirstSecondaryOnSceneUnitStatusRefName>
<UnitLastClearUnitStatusRefName>Available</UnitLastClearUnitStatusRefName>
<UnitLastClearTimestamp>2018-11-02T23:01:47</UnitLastClearTimestamp>
<UnitServiceRequestEffectiveTimestamp>2018-11-02T23:01:47</UnitServiceRequestEffectiveTimestamp>
<UnitServiceRequestLastModifiedTimestamp>2018-11-02T23:01:47</UnitServiceRequestLastModifiedTimestamp>
<PersonUnit>
<PersonUnitName>John Doe</PersonUnitName>
<Employee>
<EmployeeGivenName>John</EmployeeGivenName>
<EmployeeMiddleNames></EmployeeMiddleNames>
<EmployeeSurName>Doe</EmployeeSurName>
<EmployeeBadgeNumber>0001</EmployeeBadgeNumber>
<EmployeeIdNumber>0001</EmployeeIdNumber>
</Employee>
</PersonUnit>
<UnitHomeUnitOrgPath>CAD##96 TOWN NAME</UnitHomeUnitOrgPath>
</Unit>
</ServiceRequest>
</Incident>
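A record like the one above can be read with a standard XML parser. The following Python sketch uses the standard library `xml.etree.ElementTree` module on a trimmed copy of the example; the element names come from the example record, while the helper functions and the choice of summary fields are assumptions made for illustration, not part of any agency's integration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of the example record; only the fields used below are kept.
SAMPLE = """
<Incident>
<IncidentId>121150898573862915</IncidentId>
<IncidentLatitude>38.89512195241237</IncidentLatitude>
<IncidentLongitude>-77.03649619898232</IncidentLongitude>
<IncidentEventRefName>71 PREMISE CHECKS</IncidentEventRefName>
<CurrentIncidentStatusRefName>CLOSED</CurrentIncidentStatusRefName>
<IncidentFirstDispatchTimestamp>2018-11-02T23:01:36</IncidentFirstDispatchTimestamp>
<IncidentFirstOnSceneTimestamp>2018-11-02T23:01:38</IncidentFirstOnSceneTimestamp>
<IncidentCloseTimestamp>2018-11-02T23:01:47</IncidentCloseTimestamp>
<CallForService>
<CallerName></CallerName>
</CallForService>
</Incident>
"""


def text_of(root, tag):
    """Return the stripped text of the first matching element, or None if empty or missing."""
    node = root.find(f".//{tag}")
    if node is None or node.text is None:
        return None
    return node.text.strip() or None


def parse_time(value):
    """Parse an ISO 8601 timestamp such as 2018-11-02T23:01:36, passing None through."""
    return datetime.fromisoformat(value) if value else None


def summarize_incident(xml_text):
    """Extract a few fields a TMC might care about from one CAD incident record."""
    root = ET.fromstring(xml_text)
    dispatched = parse_time(text_of(root, "IncidentFirstDispatchTimestamp"))
    on_scene = parse_time(text_of(root, "IncidentFirstOnSceneTimestamp"))
    closed = parse_time(text_of(root, "IncidentCloseTimestamp"))
    lat = text_of(root, "IncidentLatitude")
    lon = text_of(root, "IncidentLongitude")
    return {
        "id": text_of(root, "IncidentId"),
        "type": text_of(root, "IncidentEventRefName"),
        "status": text_of(root, "CurrentIncidentStatusRefName"),
        "lat": float(lat) if lat else None,
        "lon": float(lon) if lon else None,
        "seconds_to_on_scene": (on_scene - dispatched).total_seconds()
        if on_scene and dispatched else None,
        "seconds_to_close": (closed - dispatched).total_seconds()
        if closed and dispatched else None,
    }


summary = summarize_incident(SAMPLE)
caller = text_of(ET.fromstring(SAMPLE), "CallerName")
```

Treating empty elements as `None` matters here because many fields in the example, such as the caller details, are present but blank.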

Data Availability

Most public safety agencies at the State level have CAD systems that were developed and maintained by different vendors.

In a case where public safety and TMCs share information, each public safety agency works directly with a TMC to develop sharing mechanisms through their individual vendor applications. CAD data is not readily available to all TMCs across the country. Agencies are still working with their public safety partners to garner support and develop methods for exchanging this valuable data.

CAD data coverage is directly related to a specific public safety agency jurisdiction. This means that municipal police CAD data may cover an urban area, while county police CAD data may cover arterial and county roads, and State police CAD data may be focused on interstates.

The quality of data varies depending on the agency and system capabilities. Some systems provide a rich set of data elements, while others rely on mostly free-text entry that, while readable to a human, may not be very useful when it comes to automated integration into other systems. Figure 51 and figure 52 show examples of different CAD data that has been integrated into different ATMS platforms. The first two examples (figure 51) are from the State police in California, and the third (figure 52) is from the county police in Maryland. While one agency uses shorthand, abbreviations, and "10-codes" to describe the event, the other uses machine-readable text and "signal codes" to describe the event.

An incident report that is difficult to read, free-text typed into the California Highway Patrol's computer-aided dispatch system (dashed box) and transmitted to Caltrans. It indicates an injuries involved incident, when the report was started, and a description of the incident that is not understandable because it appears to be using numbers that have no meaning to laypersons.

a) Difficult to read, free-text typed into the California Highway Patrol's computer-aided dispatch system (dashed box) and transmitted to Caltrans.
An incident notification that indicates the location, when the incident report was started, and a description that uses understandable, abbreviated text. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory

b) Example text free-typed to describe an incident, including use of verbal shorthand.

Figure 51. Screenshot. A California Highway Patrol computer-aided dispatch message.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.

This incident report text gives the specific incident location, when the report was started and updated, and a clear description. Copyright 2019 University of Maryland Center for Advanced Transportation Technology Laboratory.

Figure 52. Screenshot. Prince George's County Maryland computer-aided dispatch system messages.
Source: University of Maryland Center for Advanced Transportation Technology Laboratory.
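The difficulty of interpreting free-text remarks automatically can be illustrated with a small sketch. The Python below maps words in a remark to incident types and tolerates minor misspellings using the standard library `difflib` module. The keyword table, the category names, and the sample remarks are hypothetical; real shorthand and codes differ by agency and would have to be supplied by the partner public safety agency.

```python
import difflib
import re

# Hypothetical lookup table; not taken from any agency's code list.
KEYWORDS = {
    "collision": "CRASH",
    "crash": "CRASH",
    "overturned": "CRASH",
    "disabled": "DISABLED_VEHICLE",
    "stalled": "DISABLED_VEHICLE",
    "debris": "DEBRIS",
    "fire": "VEHICLE_FIRE",
}


def classify(free_text, cutoff=0.8):
    """Return the set of incident types suggested by a free-text CAD remark."""
    types = set()
    for token in re.findall(r"[a-z]+", free_text.lower()):
        if token in KEYWORDS:
            types.add(KEYWORDS[token])
            continue
        # Fall back to approximate matching so that simple misspellings are not lost.
        close = difflib.get_close_matches(token, list(KEYWORDS), n=1, cutoff=cutoff)
        if close:
            types.add(KEYWORDS[close[0]])
    return types


misspelled = classify("2 VEH COLISION BLKG RT LN")
multiple = classify("DEBRIS IN LN 2, VEH STALLED")
unknown = classify("UNKNOWN REMARK")
```

A keyword approach like this ignores numeric codes and unfamiliar abbreviations entirely, which is one reason structured, machine-readable fields are easier to integrate than free text.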

Pros and Cons

The most valuable contribution of CAD data to a TMC is in the form of faster incident detection and more effective response. TMCs can dispatch appropriate resources more quickly and clear the scene more effectively if they have a better understanding of the type of incident and its severity.

In addition to improvements in incident clearance time, automated information exchange reduces the burden of communication workload and increases situational awareness.

However, CAD data comes with many challenges. Most of the time, these systems were not developed to support transportation operations. This means that when adapted to share information with TMCs, there are several challenges:

Data sensitivity/security:
- CAD systems collect data for all calls, not just transportation incident calls. This means that feeds must be filtered to exclude non-transportation calls, such as calls about criminal activity, which may include personally identifiable information.
Data structure:
- CAD systems are developed for call takers who must interpret incoming information and quickly input it into the system. This means that many CAD systems rely on abbreviations and shorthand as well as free text entry.
- Free text entry, while readable to a human, can be very challenging to process in an automated manner. This means that systems may not be able to automatically determine incident type and extract relevant information accurately. Simple misspellings may invalidate data and be either ignored or thrown away before they even reach a TMC.
Information overload:
- TMC operators already deal with a large amount of information, both incoming and outgoing. They monitor CCTV, radio, and ITS outputs to determine the status of the transportation system. Sometimes they also deal with phone calls and customer inquiries.
- The addition of CAD data may present a challenge, especially in cases where the TMC is already aware of an incident and the system is unable to intelligently handle duplicates or related incidents, or in cases where CAD data volume is so high that the operators cannot keep up with the inflow and triage it appropriately.
- Some TMC systems use CAD data to generate alerts for operators when relevant incidents arrive via CAD. However, CAD design, free text abbreviations, and shorthand can often trigger false-positives, which eventually results in operators ignoring the incoming information.
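Two of the challenges listed above, filtering sensitive records and handling duplicates, can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions: the allow-list of incident types, the record layout, and the distance and time thresholds are hypothetical, and the caller field names are borrowed from the example record shown earlier in this section.

```python
import math

# Hypothetical allow-list; a real one would come from the data-sharing agreement.
TRANSPORTATION_TYPES = {"CRASH", "DISABLED_VEHICLE", "DEBRIS", "VEHICLE_FIRE"}
# Caller fields that should not leave the public safety agency.
PII_FIELDS = {"CallerName", "CallerPhoneNumber", "CallerCallBackNumber", "CallerAddress"}


def filter_for_tmc(record):
    """Return a copy safe to forward to the TMC, or None for non-transportation calls."""
    if record.get("type") not in TRANSPORTATION_TYPES:
        return None
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in meters (haversine formula)."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def likely_duplicate(cad, atms_event, max_m=500.0, max_s=900.0):
    """Heuristic: same place and roughly the same time as an event the TMC already has."""
    near = distance_m(cad["lat"], cad["lon"], atms_event["lat"], atms_event["lon"]) <= max_m
    recent = abs(cad["time_s"] - atms_event["time_s"]) <= max_s
    return near and recent


dropped = filter_for_tmc({"type": "PREMISE_CHECK", "CallerName": "Jane Roe"})
forwarded = filter_for_tmc({"type": "CRASH", "CallerName": "Jane Roe", "lat": 38.9})
existing = {"lat": 38.8953, "lon": -77.0365, "time_s": 400}
same_incident = likely_duplicate({"lat": 38.8951, "lon": -77.0365, "time_s": 100}, existing)
other_incident = likely_duplicate({"lat": 38.9500, "lon": -77.0365, "time_s": 100}, existing)
```

Applying the filter on the public safety side, before data crosses agency boundaries, keeps sensitive fields out of the TMC entirely; the duplicate check is only a heuristic and would still need operator confirmation.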

Use Cases for CAD Data

Use Case: California Highway Patrol Computer Aided Dispatch Data for Public Consumption. CHP is generally the first to become aware of incidents on major California highways through 911 calls. In an effort to keep the general public informed about congestion in major metropolitan areas, media outlets always look for the fastest and most reliable source of incident information. This resulted in CHP receiving a high volume of calls from media, and even their DOT partners, requesting information about current conditions and incidents. CHP found that the amount of time and effort expended on responding to these calls and requests was significant and was impacting the agency's ability to accomplish its primary mission of responding to and clearing incidents.

CHP took a bold step to solve this problem by creating an XML data feed of all its transportation-related CAD data, except confidential information, and publishing it for public consumption. The feed included detailed text about every publicly viewable incident, so anyone interested could obtain the data and use it however they wanted. Initially, some feared that this would result in major liability issues. Instead, the incoming call volume to CHP dropped significantly, which lowered costs and allowed a better focus on incident response.

While the information in the XML feed was detailed, much of it was in the form of free text fields, making automated integration somewhat difficult. Nevertheless, availability of this information is invaluable for media, the public, and DOT partners looking for better situational awareness.

Use Case: Virginia Department of Transportation Realtime Traffic Incident Management Information System. VDOT was interested in improving safety and mobility by sharing existing information currently available to VDOT with its partner public safety agencies. To break down information and communication silos, VDOT worked with their partners to develop a realtime data sharing system, illustrated in figure 54, using automated data extraction and filtering that limited the impact to operators' workload.

This screenshot shows an incident information page listing a number of specific incidents. A table shows the incident number, incident type, location, location description and general area. On each row is a details link that users click to display information on the selected incident. The detail information is listed by time and the specific operation performed at each time. The information also includes the times for each patrol unit assigned to investigate and direct operations.

© California Highway Patrol.

Figure 53. Screenshot. California Highway Patrol computer-aided dispatch log example.
Source: California Highway Patrol public data feed (http://cad.chp.ca.gov/Traffic.aspx).

The Realtime Traffic Incident Management Information System (RTIMIS) collects data from the CAD system and transmits it via secure connection to the integration partners, including VDOT TMC operators. This integration provided a 34-percent reduction in clearance time across 67 miles of I-95 by making TMC operators aware of incidents more quickly than before.¹⁵ In addition to clearance time reduction, RTIMIS CAD integration reduced communication workload as well as improved general situational awareness.

VDOT found that CAD and TMC operators often have different goals, which impacts the quality of data being exchanged. For example, information that TMC operators may consider critical may not be important to CAD operators, so they may not enter it as frequently. Similarly, CAD systems use free text fields and shorthand that make it very difficult to automate integration of valuable CAD data.

Use Case: Florida Highway Patrol and SunGuide Integration. The FDOT ATMS platform, SunGuide, receives a filtered realtime CAD data feed. Through this integration, FDOT has seen a reduction in incident verification, response, management, and clearance time. FDOT operators become aware of incidents more quickly using CAD data and can evaluate incident severity and impact and dispatch appropriate field units and towing units more quickly. In addition to quantifiable improvements in incident response, the Florida Highway Patrol dispatchers have experienced a reduction in their workload as they do not have to take extra steps to notify their FDOT counterparts when requiring traffic incident management support.

FDOT CAD integration is read-only: operators can dismiss alerts, create new events based on an alert, or associate an existing event with an incoming CAD event.

This diagram shows a Regional Participant Internal Network containing a CAD operator and CAD system communicating with a local RTIMIS server, all contained behind a regional participant firewall. Communication from this location goes back and forth through a secure SSL connection via the internet. The secure SSL connection goes through a Regional Sponsor firewall to a RTIMIS web application. This communicates through the firewall to an encrypted internet connection that communicates with a number of integration partners. All communications go both ways so communication can travel between the CAD operator and the integration partners. Copyright 2018 Virginia Department of Transportation.

Figure 54. Virginia Department of Transportation Realtime Traffic Incident Management Information System high-level architecture diagram.
Source: Virginia Department of Transportation, Realtime Traffic Incident Management Information System Presentation, April 23 & 24, 2018. Available at: https://i95coalition.org/wp-content/uploads/2018/06/3-I95CC-CAD-Workshop-Virginia-Presentation-Apr2018.pdf?x70560.

¹ Eric Lipton, "U.S. System for Tracking Traffic Flow Is Faulted," New York Times, December 13, 2009. Available at https://www.nytimes.com/2009/12/14/us/14traffic.html, last accessed March 17, 2019.

² I-95 Corridor Coalition, Closing the Realtime Data Gaps Using Crowdsourced Waze Event Data. Unpublished technical report.

³ Table 6 displays only select States for which the CATT Lab had access to both DOT ATMS and Waze data.

⁴ Wazeopedia, "Realtime closures." Available at: https://wazeopedia.waze.com/wiki/USA/Real_time_closures, last accessed March 17, 2019.

⁵ Wazeopedia, "Realtime closures." Available at https://wazeopedia.waze.com/wiki/USA/Real_time_closures#edit_the_closed_segment, last accessed March 18, 2019.

⁶ For more information about BSM Part 1 and 2, see US Department of Transportation Vehicle Based Data and Availability Presentation at: https://www.its.dot.gov/itspac/october2012/PDF/data_availability.pdf.

⁷ For more details, see B. Cronin, "Vehicle Based Data and Availability," presentation (n.d.). Intelligent Transportation Systems Joint Program Office, Research and Innovative Technology Administration, USDOT. Available at: https://www.its.dot.gov/itspac/october2012/PDF/data_availability.pdf.

⁸ For more information, see https://www.here.com/en/products-services/here-automotive-suite/connected-vehicle-services/here-road-signs.

⁹ For more information, see https://geospatial.trimble.com/products-and-solutions/laser-scanning-solutions.

¹⁰ Radio frequency identification uses electromagnetic or electrostatic coupling in the radio frequency portion of the electromagnetic spectrum to uniquely identify an object, animal, or person. This technology can be used to automatically identify and track tags attached to objects, such as cards.

¹¹ For more information, visit the "Array of Things" web page at https://arrayofthings.github.io/. The Array of Things is a collaborative effort among scientists, universities, local government, and communities in Chicago to collect realtime data on the city's environment, infrastructure, and activity for research and public use.

¹² Zhirui Ye. 2009. Evaluation of the Utah DOT Weather Operations/RWIS Program on Traffic Operations. Iowa Department of Transportation: Ames, IA. Available at: https://westerntransportationinstitute.org/wp-content/uploads/2016/08/4W2324_UDOT_Phase_II_Report.pdf, last accessed March 18, 2019.

¹³ Strong, C. and Shi, X. 2008. "Benefit–Cost Analysis of Weather Information for Winter Maintenance: A Case Study," Transportation Research Record: Journal of the Transportation Research Board 2055: 119-127.

¹⁴ Federal Highway Administration, Road Weather Management Best Practices. Available at: https://ops.fhwa.dot.gov/weather/best_practices/1024x768/transform2.asp?xslname=casestudies_title.xslt&xmlname=casestudies.xml.

¹⁵ Virginia Department of Transportation, Realtime Traffic Incident Management Information System Presentation, April 23 – 24, 2018. Available at: http://i95coalition.org/wp-content/uploads/2018/06/3-I95CC-CAD-Workshop-Virginia-Presentation-Apr2018.pdf?x70560.
