Real-time transit information offers many benefits to transit riders and agencies, including shorter perceived and actual wait times, a lower learning curve for new riders, an increased feeling of safety, and increased ridership. In the last few years, GTFS-realtime, a real-time complement to the General Transit Feed Specification (GTFS) format, has emerged. GTFS-realtime has the potential to standardize real-time data feeds and lead to widespread adoption by transit agencies and multimodal apps. However, GTFS-realtime v1.0 has suffered from a lack of clear documentation and openly available validation tools, which significantly increases the time and effort necessary to create and maintain GTFS-realtime feeds. More importantly, bad data have been shown to have a negative effect on ridership, the rider’s opinion of the agency, and the rider’s satisfaction with multimodal apps.
This project focused on the community-driven creation of the GTFS-realtime v2.0 format, which establishes better guidance for transit agencies, application developers, and automatic vehicle location (AVL) system vendors on which fields are required or optional under various transit use cases. The research team also collaborated with the GTFS community to create GTFS Best Practices. In parallel with these standardization efforts, an open-source GTFS-realtime validation tool (https://github.com/CUTR-at-USF/gtfs-realtime-validator) was developed to allow these same parties to quickly identify and resolve problems in a feed. To demonstrate the utility of the GTFS-realtime Validator and capture the current state of real-time data quality in the industry, the Transit Feed Quality Calculator tool (https://github.com/CUTR-at-USF/transitfeed-quality-calculator) was created to automatically download and validate a large number of agency feeds. An evaluation of 78 transit agency GTFS-realtime feeds showed errors in 54 feeds and warnings in 58 feeds, indicating widespread problems with quality control across many agencies and AVL vendors.
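The feed-level tally reported above can be illustrated with a minimal sketch. It assumes each feed's validation outcome has already been reduced to an error count and a warning count. The `FeedResult` structure and the `summarize` function are hypothetical names introduced only for this illustration; they are not part of the GTFS-realtime Validator or the Transit Feed Quality Calculator, and the feed identifiers are placeholders rather than real agencies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FeedResult:
    """Validation outcome for one feed (hypothetical structure, not the validator's output format)."""
    feed_id: str
    error_count: int
    warning_count: int


def summarize(results: List[FeedResult]) -> Dict[str, float]:
    """Count how many feeds have at least one error or at least one warning."""
    total = len(results)
    with_errors = sum(1 for r in results if r.error_count > 0)
    with_warnings = sum(1 for r in results if r.warning_count > 0)

    def share(n: int) -> float:
        # Percentage of all feeds, rounded to one decimal place; 0.0 for an empty input.
        return round(100.0 * n / total, 1) if total else 0.0

    return {
        "feeds": total,
        "feeds_with_errors": with_errors,
        "feeds_with_warnings": with_warnings,
        "pct_with_errors": share(with_errors),
        "pct_with_warnings": share(with_warnings),
    }


if __name__ == "__main__":
    # Placeholder data for illustration only; not results from the evaluation.
    sample = [
        FeedResult("feed-a", error_count=2, warning_count=0),
        FeedResult("feed-b", error_count=0, warning_count=1),
        FeedResult("feed-c", error_count=0, warning_count=0),
    ]
    print(summarize(sample))
```

Applied to the counts reported in the evaluation, the same arithmetic gives errors in roughly 69% of feeds (54 of 78) and warnings in roughly 74% (58 of 78).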
Future work should focus on encouraging agencies to use GTFS-realtime v2.0 and the GTFS-realtime Validator, especially when specifying requirements in RFPs for new AVL systems. A hosted instance of the GTFS-realtime Validator would be useful for agencies that cannot run the tool themselves (e.g., due to internal IT policies preventing installation of applications). Official GTFS-realtime Best Practices voted upon by the GTFS community, similar to GTFS Best Practices, should also be created. Some gray areas remain in the GTFS-realtime specification that should be clarified via future proposals to the GTFS community, and new validator rules based on these clarifications could also be created. A data dashboard that shows the current quality of industry feeds may help agencies and vendors better understand how they compare with their peers in terms of data quality. GTFS may benefit from a more formal governance structure going forward, provided that it does not abandon key qualities of the grassroots approach to governance that has served the format well to date. Finally, future research could examine how the adoption of GTFS-realtime v2.0 and the GTFS-realtime Validator affects data quality over time, as well as the institutional barriers that may prevent some agencies from acknowledging and resolving errors in the data.