Do Not Track: A Guide to Data Privacy For New Transit Fare Media

Press

StreetsblogMass: “Do Not Track Report Offers Privacy Guidance for the T’s New Fare System.”

Introduction

As digital communication technologies proliferate, people can purchase goods and services using an expanding array of payment methods: digital wallets, credit cards equipped with near-field communication (NFC), and other new methods are becoming increasingly common. Public transit agencies nationwide are adapting fare collection systems to accept a wider range of these payment methods, improving the convenience of transactions for passengers. Agencies also see the potential for new payment methods—or fare media—to improve transit operations and service delivery. Possible benefits include faster bus boarding and the integration of fare policy across modes and agencies within a single metro area.

At the same time, new fare media raise legitimate individual privacy concerns. Namely, they have the potential to significantly increase the personal data generated and collected by transit agencies, as well as the private companies agencies contract or partner with. Once collected, the data can be accessed by other government entities, sold to private companies (in the case of private sector data collection), or simply be vulnerable to a data breach. Many riders may not be aware of which data is collected or how it is used. As public service providers, transit agencies must a) limit the collection of passengers’ personal data whenever possible, and b) manage the data they do collect responsibly and in a manner that respects passengers’ privacy. Agencies that enact good privacy practices and transparently share those practices with the public will encourage adoption of new fare media by enabling riders to make informed choices.

This policy brief explores the privacy risks of new transit fare media and recommends four methods agencies can adopt to safeguard riders’ privacy and give them confidence in the fare payment system:

Ensure that riders retain the ability to pay without being linked to a credit card account or other personal identifier, and that these payment options are priced at the same rate as newer payment systems that collect and generate more data. This includes agency-issued fare cards with a stored value that can be refilled with cash at a fare card machine or third-party vendor.
Make secure data management an organizational priority. Agencies should adopt policies for secure data management, and strive to constantly improve data security the same way they actively seek to improve service and operations.
Clearly and transparently communicate privacy policies. Riders should be able to easily find out what data is collected, how that data is used, and which parties can access their data.
Use data sources that protect personal privacy to improve service planning for riders. Many transit agencies use fare payment data to track how riders are using the system. They then use this data to adjust service to best fit rider needs. Other data sources, such as automated passenger counters (APCs) and passenger surveys, may provide similar information while collecting less personal data.

Changes in Fare Payment

The major shift in the fare media landscape is from closed-loop payment systems to open-loop payment systems. Closed-loop payment systems — still the most commonly used by transit agencies — are characterized by fare media used exclusively within the transit system. Examples include physical tokens, punch cards, swipe cards, and even NFC-enabled, agency-issued tap cards. With closed-loop payment systems, the transit agency typically retains control over all data generated by passengers, because the agency controls the fare media. If a passenger uses a fare card with a unique ID, then the agency can collect data on the unique ID, the time that it was used, and the station or stop where it was used. This data is a powerful tool for transit agencies to understand trip patterns and identify where service should be allocated.

Notably, some types of fare media used in closed-loop payment systems—punch cards or tokens, for example—do not provide such detailed data for transit agencies to plan service. This is because punch cards and tokens provide no personally identifying information or unique ID during the transaction process that the agency can associate with a particular user, as well as that user’s travel behavior. The agency can see that a token was used but not who used it.

Open-loop payment systems differ from closed-loop payment systems in that they integrate third-party payment methods, such as NFC-enabled credit cards and digital wallets. These systems can markedly improve convenience for passengers. With open-loop payment, as long as transit users have a credit card or smartphone on hand, they don’t have to worry about purchasing a separate fare card or making sure the card has funds before boarding. These payment systems also can produce dramatic time savings, especially on buses: The MTA estimates that NFC-enabled devices reduce the transaction processing time from about 2.4 seconds swiping a MetroCard to 500 milliseconds, or approximately one minute saved for every 30 passengers who board the bus. For a route like the B6 in Brooklyn with an average daily ridership of 34,000 passengers, this could add up to almost 18 hours of bus service saved per day, which the agency can then reinvest in additional service on the route.
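The arithmetic behind these figures can be checked with a short calculation. The sketch below uses only the numbers quoted above (2.4 seconds per swipe, 0.5 seconds per tap, 34,000 daily riders) and assumes, for simplicity, that every rider on the route switches from swiping to tapping. The function and variable names are illustrative.

```python
# Per-transaction times quoted from the MTA estimate above.
SWIPE_SECONDS = 2.4  # MetroCard swipe
TAP_SECONDS = 0.5    # NFC tap (500 milliseconds)


def seconds_saved(passengers: int) -> float:
    """Total boarding time saved if every passenger taps instead of swiping."""
    return passengers * (SWIPE_SECONDS - TAP_SECONDS)


per_30_boardings = seconds_saved(30)           # about 57 seconds
b6_daily_hours = seconds_saved(34_000) / 3600  # about 17.9 hours

print(f"Saved per 30 passengers: {per_30_boardings:.0f} seconds")
print(f"Saved per day on a 34,000-rider route: {b6_daily_hours:.1f} hours")
```

Both results match the rounded figures in the text: just under one minute per 30 boardings, and just under 18 hours per day.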

The Two Branches of Transit Data Collection

Managing Public Sector Data Collection

Concerns related to public sector collection of passenger data are associated with broader and well-known privacy concerns regarding government surveillance and social control. Privacy advocates argue that the government’s collection of personal data is a method for promoting conformist behavior, as individuals fear scrutiny, judgment, and retribution for acting outside accepted norms. While these concerns have often focused more on internet activity, mobility data is also particularly sensitive and deeply revealing about an individual’s activities. In Carpenter v. United States (2018), the U.S. Supreme Court wrote that time-stamped location data “provides an intimate window into a person’s life, revealing not only his particular movements, but through them his ‘familial, political, professional, religious, and sexual associations.’”

For transit agencies, there is a strong public interest in using mobility data to understand how passengers use the transit system, which informs decisions about how to allocate service. Privacy issues arise, though, when other government entities can access the data that transit agencies collect. In New York, for instance, organizations like the Surveillance Technology Oversight Project (STOP) highlight the risk of allowing MTA travel data to be accessed by NYPD and, in turn, Immigration and Customs Enforcement (ICE): “Transit history data would enable ICE to locate immigrant community members by allowing the agency to track their daily movements. Further, identity-based surveillance using [the fare payment system] OMNY could compromise a rider’s right to anonymous public speech and association.”

This tracking is already possible with the MTA’s MetroCard system, but the MTA estimates that the process of retrieving personal information can take up to two weeks. With OMNY, the process is near-instantaneous, introducing the possibility of real-time social controls.

The impulse to use transit to restrict people’s movement and limit collective expression is well-documented in the U.S. and abroad. During Hong Kong’s 2019 pro-democracy demonstrations, the Mass Transit Railway (MTR) closed metro stations in close proximity to where protestors were gathering, and protestors began paying for transit trips with cash for fear of the government tracing their involvement using transit data. Closer to home, multiple cities shut down transit access to areas where people exercised First Amendment rights during 2020’s Black Lives Matter protests. While those limitations were imposed without access to personal travel data from open-loop systems, that transit access was curtailed in order to restrict movement is concerning. Transit agencies and regulators should take steps to prevent data from open-loop systems leading to more intrusive surveillance and control of individual travel.

How can travel data from new fare payment systems serve the public’s interest in responsive transit planning while preventing that data from being misused by other government agencies? Ultimately, better federal regulation of data privacy similar to the European Union’s General Data Protection Regulation (GDPR) is needed to ensure that transit data is used only for transit-related purposes. Until that time, however, transit agencies must recognize their role in generating and collecting personal mobility data, as well as how that data might be abused. This responsibility extends to the more recent development of open-loop payment systems, through which private companies will gain greater access to transit users’ personal mobility data.

Managing Private Sector Data Collection

Transit data collected through fare payments can contribute to a larger card-data economy, through which private companies can create incredibly detailed profiles of individuals’ personal lives, behaviors, and preferences — all through consumers’ purchasing histories. Tech journalist Geoffrey Fowler tracked his purchase of a single banana at Target and found that “six types of businesses could mine and share elements of [his] purchase, multiplied untold times by other companies they might have passed it to.” While transit agencies cannot be held responsible for the lack of federal regulation that permits this ready exchange of personal data between private companies, they need to understand the role that their fare payment systems play in the monetization and exchange of their riders’ data.

It is helpful to understand what data is generated during each fare payment transaction and who exactly is able to capture this data. Payments expert Stephen Cho describes the “four-party payment systems” regime that is dominant in the United States. These four parties are: 1) the cardholder, 2) the merchant, 3) the card issuer, and 4) the merchant acquirer. In transit, parties 1 and 2 are nearly always the passenger and the transit agency, respectively. The card issuer is the financial institution or bank that has issued the credit or debit card. And the merchant acquirer is “a financial institution that enrolls merchants into programs that accepts cards.” When a passenger (party 1) accesses the transit system using an open-loop payment device, each of the other parties collects data on that transaction.

Once the private companies have collected the data, United States privacy law—in particular, the Gramm-Leach-Bliley Act—imposes few restrictions on how they analyze, share, sell, or otherwise use it. The American Civil Liberties Union (ACLU) warns that “companies could be collecting a vast amount of detail about our lives: how much we spend on travel, restaurants, political or religious donations, liquor stores, sex shops, and on and on,” adding, “that kind of information is more powerful and revealing when combined with other data.” Open-loop payment systems potentially offer private companies more detail on not just “how much we spend on travel,” but when and where we travel, as well.

Much as government access to travel data raises concerns about social control, private companies can use this information to direct consumer spending, target and prey on certain demographic groups, and monetize people’s lives in ways they have not consented to. ABC News has reported that, in at least one instance, a credit card company used “behavioral scoring” to lower a man’s credit limit “because other shoppers at certain stores he patronized had proven to have poor credit records.” Similarly, the marketing firm Affinitiv Inc. “develops scores by crunching data on things such as previous car purchases, whether a household has a teenager, where else a person has shopped and zip codes, which can be used as a proxy for income.” This type of consumer scoring, whether by credit card companies or by other firms that have purchased card data, can clearly lead to a disparate racial impact when factors like zip codes and income are considered.

Public transit is one of many pieces in the broader data privacy and card-economy puzzle. Yet transit agencies are in a uniquely difficult position among government entities in that they straddle a line between public service and profit-oriented business: few other government agencies interact with the public so frequently. If someone has a negative experience at the DMV, the inconvenience is relatively small because it will be months or years before they next need to visit the DMV. For people who rely on public transit, though, transit is a daily necessity. Transit agencies are motivated to provide the best experience for their passengers because they take pride in good service, and also because they must compete with other modes for their passengers’ patronage. Pressure to modernize payment systems derives from the imperative to improve the passenger experience, but in the process agencies must respect passenger privacy.

Protecting Passenger Data: Collection and Anonymization

Several strategies are available to transit agencies as they seek to better protect passenger data: 1) eliminating “privacy taxes” and empowering passengers to make an informed choice when choosing their fare media; 2) anonymizing and aggregating the data that is collected; and 3) designing fare payment systems that collect less personal data in the first place.

Eliminating Privacy Taxes

Cash is the most secure and private fare payment medium, either when used to board the bus or when purchasing and refilling an agency-issued fare card. It is much more difficult to connect a specific fare card’s movement to an individual passenger when the passenger originally paid for the card using cash.

Because paying fares with currency can slow down service and increase fare collection costs compared to other payment methods, some agencies introduce incentives to pay with cards. For instance, free transfers may not be available when riders pay with currency. Advocates refer to a penalty for paying with cash as a “privacy tax.”

Agencies making the switch to open-loop payment should strive to eliminate privacy taxes. Refilling a card with cash should entitle riders to the same fare value as using credit. And agencies need to ensure that a cash payment option remains accessible throughout the system as open-loop payment systems are introduced. Encouragingly, the MTA has promised to expand its network of sales partners to make agency-issued OMNY cards available for cash purchase in neighborhood stores throughout the city.

Managing Data Securely

Privacy advocates like the Electronic Frontier Foundation (EFF) and STOP have outlined ways agencies can manage data more responsibly. One is simply by developing “clear policies on use, retention, deletion, and access/sharing.” Advocates believe that data collected for the purpose of providing transit service—be it for customer service like when a transit user loses their monthly pass, or for planning bus routes based on ridership patterns—should be used only for that purpose. It is unfortunately unclear what transit agencies can do to prevent other government entities from accessing the data for non-transit purposes.

Agencies can, however, commit to not sharing data with any private companies. Transit agencies can also limit the data made available to other government entities (and themselves) through policy that limits how long data can be retained, after which it is deleted permanently. STOP points out that OMNY’s privacy policy “places no explicit temporal limits on the MTA or Cubic’s ability to store usage data or personal information nor does it even explain what statutory limits it might be subject to.” Even for service planning purposes, transit agencies have little reason to store granular transit data much longer than one year.
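A retention limit of this kind is simple to express in code. The following is a minimal sketch, assuming a one-year window (taken from the figure above) and a hypothetical record layout in which each tap is a dictionary with a "timestamp" field. It only filters an in-memory list; permanently deleting data from production databases and backups takes more than this.

```python
from datetime import datetime, timedelta

# Assumed policy value, based on the one-year figure discussed above.
RETENTION = timedelta(days=365)


def purge_expired(records, now):
    """Return only the records that fall inside the retention window."""
    cutoff = now - RETENTION
    return [r for r in records if r["timestamp"] >= cutoff]


now = datetime(2024, 6, 1)
taps = [
    {"stop": "A", "timestamp": now - timedelta(days=400)},  # past the limit
    {"stop": "B", "timestamp": now - timedelta(days=10)},   # still retained
]
retained = purge_expired(taps, now)
print([r["stop"] for r in retained])
```

Running a job like this on a fixed schedule, and publishing the retention window in the privacy policy, gives riders a concrete limit of the sort STOP found missing.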

Anonymizing and aggregating data is a promising strategy for agencies that want to use—and even share—transit data for transportation planning purposes. Researchers at the Metropolitan Transportation Commission (MTC) in San Francisco sought to create a “data product” from fare payment transaction data that could be shared between the San Francisco Bay Area’s 20 separate transit providers and even with interested private sector companies. For the MTC, “the highest value aspect of the Clipper [fare card] transaction data is individual trajectories through the transportation network.” In other words, MTC planners wanted to understand individual riders’ origin-destination data over time.

MTC’s anonymizing scheme:

Separated all personally identifiable information from the database
Replaced fare cards’ unique ID with a “pseudo-random identification field that persists for one . . . day”
Selected a sample of 50 percent of unique cards for each day
For each day of the week, randomly selected only three of the four or five possible days in which that weekday had occurred that month
Replaced each date with a unique, random number
Truncated each timestamp to the nearest 10 minutes
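The steps above can be sketched in code. This is an illustration of the listed steps under stated assumptions, not MTC's actual implementation: the record layout ("card_id", "timestamp", "stop", plus personal fields) is hypothetical, "truncated" is read as rounding down to the 10-minute mark, the 50 percent sample rounds down, and a seeded generator is used so the example is repeatable. A production system would want a cryptographically secure source such as Python's secrets module.

```python
import random
from collections import defaultdict
from datetime import datetime


def anonymize_month(records, seed=None):
    """Apply the six listed steps to one month of fare card taps."""
    rng = random.Random(seed)  # assumption: seeded only for repeatability

    # Separate personal information: keep only the fields needed for planning.
    by_date = defaultdict(list)
    for r in records:
        by_date[r["timestamp"].date()].append(
            {"card_id": r["card_id"], "timestamp": r["timestamp"], "stop": r["stop"]}
        )

    # For each weekday, keep at most three of its occurrences in the month.
    dates_by_weekday = defaultdict(list)
    for d in sorted(by_date):
        dates_by_weekday[d.weekday()].append(d)
    kept_dates = []
    for dates in dates_by_weekday.values():
        kept_dates.extend(rng.sample(dates, min(3, len(dates))))

    # Replace each kept date with a unique random number.
    codes = rng.sample(range(1_000_000), len(kept_dates))
    date_codes = dict(zip(kept_dates, codes))

    output = []
    for d in kept_dates:
        # Sample 50 percent of the unique cards seen that day.
        cards = sorted({r["card_id"] for r in by_date[d]})
        sampled = rng.sample(cards, len(cards) // 2)
        # Give each sampled card a pseudo-random ID that lasts one day only.
        daily_id = {c: f"{rng.getrandbits(64):016x}" for c in sampled}
        for r in by_date[d]:
            if r["card_id"] in daily_id:
                t = r["timestamp"]
                output.append({
                    "rider": daily_id[r["card_id"]],
                    "day_code": date_codes[d],
                    # Round the time down to the 10-minute mark.
                    "time": f"{t.hour:02d}:{t.minute - t.minute % 10:02d}",
                    "stop": r["stop"],
                })
    return output
```

Because each rider ID is regenerated every day and the dates are replaced by random codes, the output cannot be used to follow one card across days.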

However, when soliciting feedback on the final product from internal and external users, MTC staff found that the anonymization scheme had limited the datasets’ usability for planning purposes. In particular, planners wanted to see trends over more than just 24 hours or on days that saw special events, data points that were lost in the anonymization process. If more attention is paid to data privacy in fare payment, further research could continue to refine the anonymization process while maintaining more of the usability found in the original data set.

Transparency and Clear Communication

Agencies should clearly communicate their fare data management practices to passengers, and explain how fare payment choices affect the collection of personal data. Good privacy practices and transparency about data management can build public trust in a new fare payment system.

Organizations typically convey this information to users by posting privacy policies on their websites. Unfortunately, privacy policies are often written in jargon that’s hard for people to digest. The Center for Internet and Society’s Jen King calls such policies “documents created by lawyers, for lawyers.”

Transit agencies’ privacy policies are no exception. For example, on the Flesch-Kincaid scale, a common measure of readability, OMNY’s policy would score at a 15th-grade level, meaning it’s about as readable as Stephen Hawking’s A Brief History of Time or an academic paper. The CTA’s Ventra privacy policy is similarly complex, but the CTA provides simple summary bullets at the top of the webpage. Other agencies might adopt this practice and expand it to include some of the information in this brief (e.g., which law enforcement agencies can access the data without the users’ knowledge, or that filling a fare card with cash prevents personal information from being shared).
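For readers unfamiliar with the measure, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below shows how an agency might run a rough check on its own policy text. It is not the tool used to score the policies above: the syllable counter is a crude vowel-group heuristic, so its results will differ somewhat from dedicated readability software, and both sample sentences are invented for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Grade level = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Invented sample sentences, not taken from any agency's policy.
plain = "We keep your trip data for one year. Then we delete it."
legal = (
    "Notwithstanding the foregoing, the Authority may retain personally "
    "identifiable information for an indeterminate period pursuant to "
    "applicable statutory obligations."
)
print(f"plain summary: grade {flesch_kincaid_grade(plain):.1f}")
print(f"legal wording: grade {flesch_kincaid_grade(legal):.1f}")
```

Short sentences and short words pull the score down, which is why a plain-language summary at the top of a policy can be far more readable than the full legal text.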

Agencies should also guide passengers to their privacy policies through marketing materials and signage throughout the system. One of the concerns about tap-and-go, NFC payment systems is that they smooth the payment process but do not give users the opportunity to understand or opt out of the privacy policy to which they’ve implicitly agreed when they tap into the system. Signage at entrances, in stations, and aboard vehicles can alleviate at least some of these concerns by informing passengers as to where they can access the agency’s privacy policy. QR codes posted at turnstiles can link passengers to the privacy policy using the same smartphone that they are about to tap with. For agencies seeking to assure riders that they can protect personal data, honesty and openness are the best policy.

Secure Alternatives in Fare Payment

Finally, transit agencies can pursue fare payment systems that collect less passenger data by design. Proof-of-payment fare validation systems, where passengers show an inspector a receipt to demonstrate they’ve paid the fare, potentially generate less location-specific data than gated systems. Proof-of-payment systems face separate challenges, such as the potential for unequal enforcement due to racial profiling, but offer a more secure experience when executed correctly.

Unfortunately, when agencies stop collecting this movement data with swipes and taps, they lose a valuable resource for service planning. The most difficult data to replace is origin-destination data, which is particularly important for designing service that is responsive to where passengers need to go. Surveys are a viable alternative for constructing this origin-destination data. Most agencies already conduct surveys, especially for data that is hard to collect through farebox data alone. Replacement data sources can be found for other important service planning metrics, too: automated passenger counters (APCs) can be used to calculate load on buses and trains in lieu of farebox data.
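As a simple illustration of how APC counts yield load without any rider identifier, the sketch below computes the number of passengers on board after each stop as a running total of boardings minus alightings. The stop names and counts are invented, and real APC data usually needs balancing and error correction that this sketch leaves out.

```python
from itertools import accumulate


def passenger_load(stops):
    """Return (stop, load) pairs; load is riders on board when leaving each stop.

    `stops` is a list of (name, boardings, alightings) tuples in trip order.
    """
    net_change = [on - off for _, on, off in stops]
    names = [name for name, _, _ in stops]
    return list(zip(names, accumulate(net_change)))


# Invented counts for a four-stop trip.
trip = [("Stop A", 12, 0), ("Stop B", 5, 3), ("Stop C", 2, 9), ("Stop D", 0, 7)]
loads = passenger_load(trip)
peak_stop, peak_load = max(loads, key=lambda pair: pair[1])
print(loads)
print(f"Peak load of {peak_load} leaving {peak_stop}")
```

The counts record only how many people boarded and alighted at each stop, so the resulting load profile contains nothing that can be tied to an individual rider.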

Conclusion

Open-loop payment systems offer tremendous potential for streamlining fare payments for passengers and agencies alike. And many agencies that are not yet moving to open-loop payment are nonetheless turning to third-party private companies to support their fare payment and service operations, especially through mobile applications. However, the increased involvement of third parties in fare payment underscores the need for better data collection and management policies within transit agencies. Through proactive measures, transit agencies can set the stage for protecting passenger data even as new technologies emerge.