Several strategies are available to transit agencies seeking to better protect passenger data: 1) eliminating “privacy taxes” and empowering passengers to make an informed choice of fare media; 2) anonymizing and aggregating the data that is collected; and 3) designing fare payment systems that collect less personal data in the first place.
Eliminating Privacy Taxes
Cash is the most secure and private fare payment medium, whether it is used to board the bus or to purchase and refill an agency-issued fare card. It is much more difficult to connect a specific fare card’s movement to an individual passenger when the passenger originally paid for the card using cash.
Because paying fares with currency can slow down service and increase fare collection costs compared to other payment methods, some agencies introduce incentives to pay with cards. For instance, free transfers may not be available when riders pay with currency. Advocates refer to a penalty for paying with cash as a “privacy tax.”
Agencies making the switch to open-loop payment should strive to eliminate privacy taxes. Refilling a card with cash should entitle riders to the same fare value as using credit. And agencies need to ensure that a cash payment option remains accessible throughout the system as open-loop payment systems are introduced. Encouragingly, the MTA has promised to expand its network of sales partners to make agency-issued OMNY cards available for cash purchase in neighborhood stores throughout the city.
Managing Data Securely
Privacy advocates like the Electronic Frontier Foundation (EFF) and STOP have outlined ways agencies can manage data more responsibly. One is simply to develop “clear policies on use, retention, deletion, and access/sharing.” Advocates believe that data collected for the purpose of providing transit service, whether for customer service (such as when a rider loses a monthly pass) or for planning bus routes based on ridership patterns, should be used only for that purpose. Unfortunately, it is unclear what transit agencies can do to prevent other government entities from accessing the data for non-transit purposes.
Anonymizing and aggregating data is a promising strategy for agencies that want to use—and even share—transit data for transportation planning purposes. Researchers at the Metropolitan Transportation Commission (MTC) in San Francisco sought to create a “data product” from fare payment transaction data that could be shared between the San Francisco Bay Area’s 20 separate transit providers and even with interested private sector companies. For the MTC, “the highest value aspect of the Clipper [fare card] transaction data is individual trajectories through the transportation network.” In other words, MTC planners wanted to understand individual riders’ origin-destination data over time.
MTC’s anonymizing scheme:
- Separated all personally identifiable information from the database
- Replaced fare cards’ unique ID with a “pseudo-random identification field that persists for one . . . day”
- Selected a sample of 50 percent of unique cards for each day
- For each day of the week, randomly selected only three of the four or five possible days in which that weekday had occurred that month
- Replaced each date with a unique, random number
- Truncated each timestamp to the nearest 10 minutes
However, when soliciting feedback on the final product from internal and external users, MTC staff found that the anonymization scheme had limited the datasets’ usability for planning purposes. In particular, planners wanted to see trends over more than just 24 hours or on days that saw special events, data points that were lost in the anonymization process. If more attention is paid to data privacy in fare payment, further research could continue to refine the anonymization process while maintaining more of the usability found in the original data set.
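The steps in MTC’s scheme can be illustrated with a minimal Python sketch. This is not MTC’s implementation. The field names (`card_id`, `timestamp`, `name`, `email`), the function `anonymize`, and the output keys are hypothetical, and the sketch assumes all transactions fall within a single month. It also assumes that truncating a timestamp means dropping it into the start of its 10-minute bin and that only the time of day is retained.

```python
import random
import secrets
from collections import defaultdict
from datetime import datetime

# Hypothetical names of fields holding personally identifiable information.
PII_FIELDS = {"name", "email"}


def anonymize(transactions, rng=None):
    """Apply the listed steps to one month of fare transactions.

    `transactions` is a list of dicts, each with a 'card_id' string and a
    'timestamp' datetime, plus any other fields (all names are assumptions).
    """
    rng = rng or random.Random()

    # For each weekday, keep three of the dates on which it occurred.
    dates_by_weekday = defaultdict(set)
    for t in transactions:
        day = t["timestamp"].date()
        dates_by_weekday[day.weekday()].add(day)
    kept_dates = set()
    for dates in dates_by_weekday.values():
        ordered = sorted(dates)
        kept_dates.update(rng.sample(ordered, min(3, len(ordered))))

    # Replace each kept date with a unique random number.
    labels = rng.sample(range(10**6), len(kept_dates))
    date_label = dict(zip(sorted(kept_dates), labels))

    # Sample about half of the unique cards seen on each kept date and give
    # each sampled card a fresh pseudonym that is valid for that date only.
    cards_by_date = defaultdict(set)
    for t in transactions:
        day = t["timestamp"].date()
        if day in kept_dates:
            cards_by_date[day].add(t["card_id"])
    pseudonym = {}
    for day, cards in cards_by_date.items():
        ordered = sorted(cards)
        for card in rng.sample(ordered, (len(ordered) + 1) // 2):
            pseudonym[(day, card)] = secrets.token_hex(8)

    anonymized = []
    for t in transactions:
        ts = t["timestamp"]
        key = (ts.date(), t["card_id"])
        if key not in pseudonym:
            continue  # date or card was not sampled
        # Drop PII, the real card ID, and the real timestamp.
        record = {
            k: v for k, v in t.items()
            if k not in PII_FIELDS and k not in ("card_id", "timestamp")
        }
        record["rider"] = pseudonym[key]
        record["day"] = date_label[ts.date()]
        # Truncate the time of day to the start of its 10-minute bin.
        record["time_of_day"] = f"{ts.hour:02d}:{ts.minute - ts.minute % 10:02d}"
        anonymized.append(record)
    return anonymized
```

Because each pseudonym is generated per card per day and the real dates are replaced with random labels, the same rider cannot be followed from one day to the next, and a special-event day cannot be picked out of the output. Those are the same properties that limited the dataset’s usefulness to planners.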
Transparency and Clear Communication
Agencies should clearly communicate their fare data management practices to passengers, and explain how fare payment choices affect the collection of personal data. Good privacy practices and transparency about data management can build public trust in a new fare payment system.
Organizations typically convey this information to users by posting privacy policies on their websites. Unfortunately, privacy policies are often written in jargon that’s hard for people to digest. The Center for Internet and Society’s Jen King calls such policies “documents created by lawyers, for lawyers.”
Secure Alternatives in Fare Payment
Finally, transit agencies can pursue fare payment systems that collect less passenger data by design. Proof-of-payment fare validation systems, where passengers show an inspector a receipt to demonstrate they’ve paid the fare, potentially generate less location-specific data than gated systems. Proof-of-payment systems face separate challenges, such as the potential for unequal enforcement due to racial profiling, but offer a more secure experience when executed correctly.
Unfortunately, when agencies stop collecting this movement data with swipes and taps, they lose a valuable resource for service planning. The most difficult data to replace is origin-destination data, which is particularly important for designing service that is responsive to where passengers need to go. Surveys are a viable alternative for constructing this origin-destination data. Most agencies already conduct surveys, especially for data that is hard to collect through the farebox alone. Replacement data sources can be found for other important service planning metrics, too: Automatic Passenger Counters (APCs) can be used to calculate load on buses and trains in lieu of farebox data.
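As an illustration of how load can be derived from APC counts without any fare data, the short sketch below keeps a running total of boardings minus alightings along a single trip. The function name and input layout are hypothetical, and the sketch assumes the counts are already clean. In practice APC counts contain errors and usually need to be balanced before use.

```python
from itertools import accumulate


def passenger_load(stop_counts):
    """Return the number of passengers on board after each stop.

    `stop_counts` is a list of (boardings, alightings) pairs, one per stop,
    in the order the vehicle serves them (a hypothetical layout).
    """
    return list(accumulate(on - off for on, off in stop_counts))


# Three stops: 10 board; then 5 board and 3 alight; then 12 alight.
print(passenger_load([(10, 0), (5, 3), (0, 12)]))  # prints [10, 12, 0]
```

No card identifier is involved at any point, so this metric can be produced without tracking individual riders.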