A Federated Approach To Providing User Privacy Rights

Alejo Grigera Sutro
Lyft Engineering
Sep 7, 2022


Our journey of enabling user privacy rights began when the California Consumer Privacy Act (“CCPA”) went into effect in 2020. Since then, new state laws have continued to arrive, including upcoming changes in California under the California Privacy Rights Act, and our early work set us up for success: thanks to the federated design of our architecture, Lyft is positioned to handle these changes seamlessly. In this blog post, we’ll share an overview of the technical strategies Lyft uses to provide important privacy rights like those in the CCPA and describe how we implemented user data export and deletion in our online systems.

To start off, let’s review some of the key challenges we faced in developing our response to user requests regarding privacy rights.

Scaling Challenges

Lyft has a dynamic and fast-paced product and engineering environment. The user data we collect and store changes frequently. In planning our design, we needed to ensure that whatever we built could automatically include datastores as they were created or modified without having to make error-prone manual configuration changes.

Staying Secure

We carefully designed our approach to keep personal data secure, especially the data that we would provide in response to access requests. It was crucial that our design would not create new avenues for abuse of the platform or of our users.

Making Data Useful

Responding to data access requests by dumping unorganized or duplicative datasets in a random assortment of formats would be neither readable nor useful. Instead, we needed to curate the datasets we export so that we present a comprehensive and meaningful view of the data we hold for each user.

Business Needs

We had to be mindful of the operational impacts of data deletion. For example, we shouldn’t allow someone to erase their account in the middle of a ride. We also needed to implement deletion in a way that would allow the business to continue to report on broader metrics such as unique users and how many rides were given in the last year.

Design

Initially, we considered a centralized system for deleting data, but it had daunting downsides if things went wrong. The affected teams, who would have to recover data and fix any business problems caused by a mistake, would be less familiar with a centralized system, which could increase the time required to resume service.

Thus, we decided on a federated design with a central orchestrating service that communicates with all other services involved in exporting/deleting data. We wanted teams to own their data and its lifecycle, and to give them the flexibility to add additional data hygiene tasks into the erasure process.

Figure: Export requests orchestrated by Lyft data infrastructure.

A simple state machine lies at the heart of the orchestration service. A series of asynchronous steps progress the request through states in its lifecycle. This way we can be more flexible in configuration options for services, allowing them to batch requests or add other cleanup and auditing tasks as needed.
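To make this concrete, here is a minimal sketch of such a lifecycle state machine in Go. The state names and transitions are invented for illustration; they are not our exact production states.

```go
package orchestrator

// State is a stage in the lifecycle of an export or erasure request.
type State string

const (
	StateReceived   State = "received"
	StateInProgress State = "in_progress" // services working on the request
	StateStaged     State = "staged"      // data written to the staging area
	StateDone       State = "done"
	StateFailed     State = "failed"
)

// validTransitions encodes which state changes the orchestrator accepts.
var validTransitions = map[State][]State{
	StateReceived:   {StateInProgress},
	StateInProgress: {StateStaged, StateFailed},
	StateStaged:     {StateDone, StateFailed},
}

// Advance moves a request to the next state if the transition is legal,
// otherwise it leaves the request where it is.
func Advance(current, next State) (State, bool) {
	for _, s := range validTransitions[current] {
		if s == next {
			return next, true
		}
	}
	return current, false
}
```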

Some export files are big, so it was also important to give services enough time to load and then write all the data to a staging area. Our asynchronous design consists of queued callbacks for each state transition which provide access points for controlling the flow of requests through the state machine.
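Building on the sketch above, each legal transition could publish an event onto a durable queue that services consume at their own pace. The Queue interface and the deadline window below are assumptions for illustration:

```go
package orchestrator

import (
	"fmt"
	"time"
)

// TransitionEvent is published on every state change so services can pick
// up work asynchronously, batching requests or taking the time needed to
// stage large export files.
type TransitionEvent struct {
	RequestID string
	From, To  State
	Deadline  time.Time // generous deadline for writing big exports
}

// Queue stands in for whatever durable queue the deployment uses.
type Queue interface {
	Enqueue(ev TransitionEvent) error
}

// transitionAndNotify advances the state machine and publishes the event;
// consumers process it whenever they are ready.
func transitionAndNotify(q Queue, id string, from, to State) error {
	if _, ok := Advance(from, to); !ok {
		return fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return q.Enqueue(TransitionEvent{
		RequestID: id,
		From:      from,
		To:        to,
		Deadline:  time.Now().Add(24 * time.Hour), // illustrative window
	})
}
```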

Figure: Erasure requests orchestrated by Lyft data infrastructure.

For erasure, we also introduced the concept of sanctions. Stakeholder teams implement a “sanction endpoint” that can flag an erasure request and prevent us from removing data when we should not. For those special cases, we designed a process to retain certain data that we need for specific allowable purposes, such as a required hold on information related to litigation. We also introduced a wait period after an erasure request to allow transactions to be reconciled. For example, lost and found claims should be settled before records are deleted. Sanctions are checked before and after the wait period, and data deletion is carried out within reasonable timeframes.
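A sanction endpoint can be as simple as an HTTP handler that the owning team implements. The route, response shape, and checks below are hypothetical:

```go
package sanctions

import (
	"encoding/json"
	"net/http"
)

type sanctionResponse struct {
	Sanctioned bool   `json:"sanctioned"`       // true blocks the erasure
	Reason     string `json:"reason,omitempty"` // e.g. "litigation_hold"
}

// handler lets the owning team flag erasure requests that must not proceed,
// such as accounts under a legal hold or with unsettled claims.
func handler(w http.ResponseWriter, r *http.Request) {
	userID := r.URL.Query().Get("user_id")

	resp := sanctionResponse{}
	if underLitigationHold(userID) || hasOpenClaim(userID) {
		resp = sanctionResponse{Sanctioned: true, Reason: "open_obligation"}
	}
	json.NewEncoder(w).Encode(resp)
}

// Placeholders for team-specific business logic.
func underLitigationHold(id string) bool { return false }
func hasOpenClaim(id string) bool        { return false }
```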

The distributed nature of the entire process ensures that every step is fault-tolerant and idempotent. If an erasure job fails at any point, it will be retried at various steps of the flow. This minimizes operational burdens and improves the reliability of the service as a whole. We developed a cron-based layer that scans the request database for requests that aren’t advancing and pushes them back into the state machine. The self-healing nature of this methodology means that any node in the process can temporarily fail and the process will resume smoothly upon recovery, drastically improving reliability and reducing overall maintenance costs.
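As a sketch of that self-healing sweep, a periodic job could look something like this, reusing the Queue and TransitionEvent types from the earlier sketch. The store interface and the six-hour cutoff are assumptions:

```go
package orchestrator

import "time"

// StalledStore finds requests whose last transition is older than cutoff;
// it stands in for the real request database.
type StalledStore interface {
	FindStalled(cutoff time.Time) ([]TransitionEvent, error)
}

// sweep re-enqueues requests that have stopped advancing. Because every
// step is idempotent, re-delivering an event a service already handled is safe.
func sweep(store StalledStore, q Queue) error {
	stalled, err := store.FindStalled(time.Now().Add(-6 * time.Hour))
	if err != nil {
		return err
	}
	for _, ev := range stalled {
		if err := q.Enqueue(ev); err != nil {
			return err
		}
	}
	return nil
}
```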

Key Elements

We developed a set of policies that dictate the lifecycle of all data at Lyft. To make this easier, we split data into various categories and assigned retention and deletion strategies to each of them.
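As an illustration, such a mapping might look like the following; the categories, durations, and strategy names are invented for the example:

```go
package policy

import "time"

type Strategy string

const (
	DeleteOnRequest Strategy = "delete_on_request" // erased via the orchestration flow
	ExpireByTTL     Strategy = "expire_by_ttl"     // aged out automatically
	Aggregate       Strategy = "aggregate"         // kept only in de-identified form
)

type Policy struct {
	Retention time.Duration // zero means "until the user asks"
	Strategy  Strategy
}

// byCategory maps each data category to its lifecycle policy.
var byCategory = map[string]Policy{
	"account_profile": {Retention: 0, Strategy: DeleteOnRequest},
	"ride_telemetry":  {Retention: 90 * 24 * time.Hour, Strategy: ExpireByTTL},
	"usage_metrics":   {Retention: 365 * 24 * time.Hour, Strategy: Aggregate},
}
```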

The Catalog

We built a central catalog of our datastores and added metadata that gives us ownership context and associates them with services. We integrated the catalog into the datastore lifecycle process, ensuring that all new datastores going forward have sensible default policies applied to them, depending on the type of data being stored. This enhanced ownership information gives us finer-grained attribution measures that help to prevent datastores from becoming orphaned as teams and services evolve. The catalog is also an important part of compliance auditing.
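A catalog entry might carry metadata along these lines; the field names are assumptions, not our actual schema:

```go
package catalog

// DatastoreEntry ties a datastore to its owners and lifecycle policy.
type DatastoreEntry struct {
	Name         string // e.g. "rides-dynamodb"
	OwningTeam   string // keeps datastores from becoming orphaned as teams evolve
	Service      string // service the datastore is associated with
	DataCategory string // drives the default retention policy applied at creation
	Policy       string // resolved lifecycle policy, recorded for compliance audits
}
```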

Data Retention Tooling

Engineers needed to be able to easily set retention periods for certain datastores, so we built tooling and processes that make this possible with simple configuration changes.

A good example of this is DynamoDB. AWS provides TTL support for Dynamo, but that requires each row to have an attribute specifying the expiry date, so it’s not as simple as just flipping a switch. We extended the syntax of our protobuf table definitions to introduce annotations detailing retention policies. These annotations feed into our existing server code-generation process to produce code that automatically populates a TTL field for all new records in the relevant database. In practice, it looks like this:

Figure: An example of TTL attributes.
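As a hypothetical sketch of what the generated write path might look like for a table annotated with, say, a 90-day retention policy (none of these names come from our actual code generation):

```go
package ridesdb

import "time"

// retentionPeriod would be emitted by codegen from the table's annotation.
const retentionPeriod = 90 * 24 * time.Hour

type RideRecord struct {
	RideID    string
	UserID    string
	ExpiresAt int64 // DynamoDB TTL attribute, epoch seconds
}

// newRideRecord stamps every new row with its expiry so DynamoDB's native
// TTL support can remove it later without any action from the service.
func newRideRecord(rideID, userID string) RideRecord {
	return RideRecord{
		RideID:    rideID,
		UserID:    userID,
		ExpiresAt: time.Now().Add(retentionPeriod).Unix(),
	}
}
```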

Finally, we built in-house tooling to facilitate migrating existing datastores to these new systems. For systems like DynamoDB which support record-based TTL, the tool helps with the heavy lifting of reliably backfilling the TTL values of existing records without impacting live traffic. For other systems such as Hive, we built tooling to automate partitioning and partition naming schemes to apply DROP PARTITION commands consistently across tables based on the central catalog.
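The backfill idea can be sketched as a rate-limited batch walk over existing records; the Store interface below stands in for the real DynamoDB client and is an assumption:

```go
package backfill

import "time"

type Record struct {
	Key       string
	ExpiresAt int64 // zero means no TTL has been set yet
}

// Store abstracts paginated reads and TTL writes against the datastore.
type Store interface {
	NextBatch(cursor string, limit int) ([]Record, string, error)
	SetTTL(key string, expiresAt int64) error
}

// Run walks all records in small batches, stamping a TTL only where one
// is missing, and throttles itself to avoid impacting live traffic.
func Run(s Store, ttl time.Duration) error {
	cursor := ""
	for {
		batch, next, err := s.NextBatch(cursor, 25)
		if err != nil {
			return err
		}
		for _, r := range batch {
			if r.ExpiresAt == 0 {
				if err := s.SetTTL(r.Key, time.Now().Add(ttl).Unix()); err != nil {
					return err
				}
			}
		}
		if next == "" {
			return nil
		}
		cursor = next
		time.Sleep(100 * time.Millisecond) // illustrative throttle
	}
}
```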

Deletion on Request

Lastly, we needed tooling to help teams quickly and easily integrate with our export and deletion systems, so we developed a library of boilerplate code for interfacing with the orchestrating service. The library facilitates processing requests from a service and reporting completion back to it. This defers complex decisions, such as whether to delete the entire row or just parts of it, to the team that knows the data best. It also allows for some flexibility in terms of setting timeouts, batching operations, and endpoint naming schemes. These features make providing support much easier, too — using a library means we can deploy central updates to the pipeline boilerplate without requiring manual intervention.
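The integration surface of such a library might look like this; the interface and options below are illustrative, not our actual API:

```go
package privacylib

import (
	"context"
	"time"
)

// Handler is what a service team implements. The team decides how to act
// on each request: drop whole rows, scrub individual fields, and so on.
type Handler interface {
	Export(ctx context.Context, userID string) error // write data to staging
	Erase(ctx context.Context, userID string) error  // delete or redact data
}

type Options struct {
	BatchSize int           // teams may batch requests
	Timeout   time.Duration // per-request deadline
	Endpoint  string        // naming scheme is up to the team
}

// Serve wires a team's handler to the orchestrating service: it would
// receive requests, invoke the handler, and report completion back.
// The transport boilerplate is elided in this sketch.
func Serve(h Handler, opts Options) error {
	return nil // placeholder; the real body lives in the shared library
}
```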

We also implemented dynamic alerting that automatically pages a service owner if we are sending them requests that are not being completed. It provides an automatic failsafe so that a centralized team doesn’t need to monitor everything.
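A minimal sketch of that failsafe, with an invented seven-day threshold:

```go
package alerting

import "time"

// Pager abstracts whatever paging system is in use; an assumption here.
type Pager interface {
	Page(team, message string) error
}

// checkService pages the owning team (looked up from the catalog) when its
// oldest pending request has exceeded a completion threshold.
func checkService(p Pager, team string, oldestPending time.Time) error {
	if time.Since(oldestPending) > 7*24*time.Hour { // threshold is illustrative
		return p.Page(team, "privacy requests are not being completed")
	}
	return nil
}
```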

Looking into the Future

This system was primarily built to deliver on export and erasure requests, but it also gives us a solid foundation for continued privacy innovation. We are actively working on enhancements to our privacy offering, data discovery, and unified metadata systems.

Lyft is hiring! If you’re passionate about privacy, security by design, and building the infrastructure that powers it all, come join our team.
