Matchmaking in Lyft Line — Part 1

timothybrownsf
Lyft Engineering
Published in
6 min readFeb 2, 2016

--

Lyft Line is a ridesharing product that automatically pairs riders together with overlapping routes, allowing us to provide a cheaper ride for all passengers. The more overlap two routes have, the larger the discount we can provide. But what makes a good Lyft Line route? We think it’s a cheap, fast, efficient, and friendly ride. And how do we know if we’ve matched a passenger in a route that’s fast and efficient? How can we accurately predict whether two, three, or four passengers together would enjoy the route we’ve selected? That’s the challenge of our matching system, something we’ve been working on for over a year and a half. This is the story of how it’s evolved, and where it’s going, and will be laid out as a 3 part series.

Motivation

We launched Line in August of 2014, though it’s a project that Lyft has long been interested in building given its roots in ridesharing service Zimride. For several years before launch we’ve kept a pulse on the market, but wanted to wait until we had sufficient density before building a true ridesharing product. In May 2014, we ran a simulation on our existing Lyft Classic rides (rides that aren’t shared) and realized that in San Francisco, over 80% of rides shared a significant portion of their route with another ride[1]. This was a staggering number — only a portion of the cars in the city were Lyft rides, and yet potentially the majority of those could be paired together with other passengers. This was a huge opportunity to change transportation and provide more affordable rides to our users.

In May 2014, we ran a simulation on our existing Lyft Classic rides and realized that in San Francisco, over 80% of rides shared a significant portion of their route with another ride.

Design Decisions

In the initial design of Line, passengers would enter in their origin and destination, receive a fare for the ride, and then be put in the matching pool for 1 minute before being assigned a driver. If we didn’t find a good match at the end of that minute, we’d still dispatch a driver to the passenger, and they’d still receive the quoted fare. However, this wasn’t the only design we had considered.

We also contemplated building a bucketed matching system. Instead of being in the matching pool for a fixed period of time (1 minute), we’d have all requests stay in the matching pool until the next clock interval. A ten minute bucket interval would mean a ride requested at 8:03 or 8:09 would remain in the matching pool until 8:10, and a ride requested at 8:11 would be in the pool until 8:20. A longer time in the matching pool meant there would be more requests we could match together at one time and thus a higher likelihood of finding a match. Additionally, a variable bucket interval would allow us to adapt to lower density cities at launch, like Honolulu, as we could increase the bucket interval to be 15 or 20 minutes. However, this approach had some downsides.

Finding drivers for all of our passengers at the same time would add a supply shock to our system as we’d need to have a pool of drivers available on the ten minute mark. And a ten minute interval meant a longer wait for passengers before showing them their route. Given the initial implementation was going to have some rough edges, a ten minute wait might be too long if they were then put into a bad match. Ultimately we decided that the one minute matching pool could provide the best experience for our passengers and was what users had come to expect from Lyft as an on-demand service.

Naive Matching

Because of the opportunity we had to revolutionize transportation, and our engineering philosophy of being agile, we started with a simple greedy haversine matchmaking system. A greedy system is one in which we take the first match we find that satisfies our constraints, as opposed to the best one for the entire system. Choosing a greedy system made sense — optimizing for multiple potential matches wouldn’t be possible until we’d grown the product to a sufficient size. Haversine distances are straight-line distances between two points and are multiplied by the region’s average speed to get a time estimate. A haversine implementation is quick to implement and with tight enough constraints (e.g. detours, time until pickup, etc.) would make sure users had a good match.

The initial implementation compared every passenger with every other passenger in the system (O(n^2)), in all possible orderings. We denoted passengers as letters of the alphabet and every passenger has two stops — a pickup and drop-off — A and A’, B and B’, etc. So when comparing passenger A and B, we looked at 24 potential orderings: ABB’A’, ABA’B’, B’A’BA, B’AA’B, B’ABA’, BAA’B’, AA’B’B, B’BAA’, etc. We were able to reduce the number of permutations down to only 4 given that there would never be a drop-off before a pickup, and an ordering such as AA’BB’ had no overlap and thus wasn’t worth considering. We would look at all four permutations and eliminate ones that didn’t satisfy all of our constraints. We would then choose the most optimal ordering[2], make the match, and notify the passenger. For reference, we will now refer to an ABA’B’ route as ABAB since the second A is inferred to be A’ (drop-off).

def make_matches(all_rides):
for r1 in all_rides:
for r2 in all_rides:
orderings = []
for ordering in get_permutations(r1, r2):
if is_good_match(r1, r2, ordering):
orderings.append(ordering)
best_ordering = get_best_ordering(r1, r2, orderings)
if best_ordering:
make_match(r1, r2, best_ordering)
// etc ...

Our constraints started off with some of the more obvious considerations: detour and pickup time. For a match to be made, the total detour that match added for each passenger would have to be below an absolute threshold, but would also have to be below a proportional threshold. This makes sense as one can imagine a 5 minute detour is much more tolerable on a 30 minute ride than on a 5 minute ride. Detours were simple and efficient to calculate — we would take the estimated time for each passenger to get to their destination without a match and compare that with the total time it would take them to get to their destination when matched. For an ABAB ordering this meant passenger A’s detour was ABA minus AA, and passenger B’s detour was BAB minus BB.

We had similar constraints for additional time until passengers were picked up, and would be calculated as follows for an ABAB route (assuming | represents the driver): |A is A’s pickup time, |AB is B’s pickup time. Clearly, being the second passenger in a Line ride reduced that rider’s average detour, as they might not have a another stop between pickup and drop-off (ABBA), but came with an increased pickup time.

After launch, we quickly started to realize the limitations of our system. Passengers would post examples of their routes that matched them with a stop on the other side of a mountain or in a way that required an additional loop through crowded downtown one-way streets in rush hour. We started learning that passengers didn’t want to go backwards, and in fact the angle of the lines they saw on the map would affect their satisfaction with the match. In ways both obvious to our initial implementation and in other not so rational ways, our system as it was currently built wasn’t going to give the ride quality that we wanted. We had to make some improvements, and we had to do it fast or people were going to get frustrated with their experience. Coming in part 2: improvements to our passenger experience and algorithmic efficiency.

Read part 2 of the series here.

[1] Our initial modeling required rides to share over 50% of their ride and allowed for a maximum detour of 10 minutes when paired.

[2] The most optimal ordering is the one in which the total matched distance is minimized. For example if an ABBA route had a total distance of 3.5 miles, but an ABAB route had a total distance of 3.2 miles, we would select the ABAB route.

If you’re interested in building the future of ridesharing, Lyft is hiring! Shoot me a message on Twitter or email me at tim@lyft.com

--

--