Technical Learning at Lyft: Build a Strong Data Science Team

Shumpei Goke
Published in Lyft Engineering
6 min read · Apr 24, 2024

Written by Shumpei Goke and Jinshu Niu

Image of a collaborative learning environment

Why Technical Learning?

At Lyft, data scientists tackle challenging technical problems every day. To support and empower our data scientists, Lyft’s Technical Learning Council (TLC) provides diverse and high-quality continuous learning opportunities to hone their technical skills. TLC’s mission is “to equip Data Science team members with the technical knowledge and skills that are applicable to their work and helpful to their career advancement.” Investing in technical learning not only aids data scientists in solving complex problems but also contributes to their professional growth, benefiting both Lyft’s business and the individuals involved. We want to foster a culture of continuous learning by providing resources and forums that are easily accessible to everyone. See our previous blog post for more on the motivations behind TLC.

TLC has four main workstreams: Technical Training, CS4DS (“Computer Science for Data Scientists”), Rideshare Seminar, and Science Brown Bag. All four are open to new and existing Lyft data scientists, who are encouraged to choose among them depending on their current skill levels and schedules. Let’s take a closer look at each workstream.

Technical Training

Technical Training offers a rich array of lecture series on data science methodologies and applications. The series are taught by our fellow data scientists and typically run for 6–10 weeks. Data science is a multidisciplinary field that combines knowledge of statistics, computer science, machine learning, causal inference, and more. Business acumen and domain knowledge of specific product areas are critical as well. It is rare for a data scientist to be skilled in everything, especially in the first couple of years of their career. The Technical Training workstream offers Lyft scientists the opportunity to build or brush up on their core data science skills across multiple areas.

Over the years, we have successfully launched lecture series on topics such as experimentation, observational causal inference, structural causal modeling, and reinforcement learning. These lectures typically start with the foundational theory, followed by applications within Lyft. Some recent examples can be found in our tech blog: reinforcement learning and structural causal modeling. This lecture format helps attendees understand each topic deeply, because the theory and the applications complement each other. In 2024, the Technical Training workstream is offering a course on Large Language Models with a focus on both the theoretical fundamentals and their day-to-day applications to Lyft’s business.

One outlier among the past Technical Training offerings is the series titled “How to Build a Rideshare Company.” This series delves into essential aspects of Lyft’s business, such as pricing, incentive campaigns, driver assignment, and mapping, and explains how data science is applied to balance the two-sided marketplace and deliver reliable service to both our riders and drivers. The lecture series attracted wide attention among scientists as an opportunity to develop a holistic view of how Lyft operates its business and where data science plays a role in optimizing our marketplace and driving growth.

Snapshot Photo from a Technical Training Session

CS4DS

Computer science and software engineering are fundamental data science skills. This is especially true at Lyft, where data scientists work very closely with software engineers and read (and write!) production-level code. It is, however, not uncommon for talented data scientists to start their careers without a formal education in computer science.

The CS4DS course addresses this gap. The goal is to teach data scientists foundational knowledge in computer science, in order to elevate their programming abilities and enhance collaboration with engineers. The course covers both theoretical foundations, such as big-O notation and Object Oriented Programming (OOP), and practical software engineering skills like Git, containers, and unit testing. It is an intensive self-serve course with lectures, homework assignments, and mentorship with regular check-ins. Students get feedback on their coding style, which is very hard to get when studying on their own but is essential for improving their coding practices. Office hours are provided for students to get hands-on assistance from volunteers.
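To make those topics concrete, here is a minimal illustrative sketch that touches three of them at once: big-O reasoning, a small class, and a unit-style test. It is not taken from the CS4DS course materials; the `RiderRegistry` class and its methods are hypothetical names invented for this example.

```python
class RiderRegistry:
    """Hypothetical example class: holds rider IDs for membership checks."""

    def __init__(self, rider_ids):
        # A set gives average O(1) membership checks; scanning a list
        # would be O(n) in the number of stored IDs.
        self._ids = set(rider_ids)

    def contains(self, rider_id):
        """Return True if rider_id has been registered."""
        return rider_id in self._ids

    def __len__(self):
        # Duplicates collapse in the set, so this counts unique IDs.
        return len(self._ids)


def test_registry_deduplicates_and_finds_ids():
    """A small unit-style test written with plain assertions."""
    registry = RiderRegistry(["r1", "r2", "r2"])
    assert registry.contains("r1")
    assert not registry.contains("r9")
    assert len(registry) == 2


test_registry_deduplicates_and_finds_ids()
```

Keeping the set behind a small class means the storage choice can change later without touching callers, which is the kind of maintainability that OOP principles aim for.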

The course has attracted great enthusiasm from participants, and we typically see higher completion rates than for other technical training. It has contributed greatly to uplifting the software engineering skills of data scientists. Past participants have found that the course helped them read and understand production-level code faster and write code that respects the principles of OOP and is easier to maintain. We even have a data analyst who transitioned to an ML software engineer role after completing the course!

Landing page of the internal website for the CS4DS course

Rideshare Seminar

Our biweekly Rideshare Seminar invites external guests from academia and industry to deliver talks on their research. It is a great opportunity for data scientists to catch up with the latest research trends that are relevant to Lyft’s business.

The topics are broad and inspiring, ranging from the analysis of the future world with autonomous vehicles (Freund, Lobel, and Zhao, 2022) to the analysis of the gig economy (Lian, Martin, and Ryzin, 2022) to experiment analysis that mitigates bias from marketplace interference (Bright, Delarue, and Lobel, 2023). These seminars provide great opportunities for knowledge sharing and networking, and they have strongly inspired and promoted collaboration across teams to build products with advanced technologies in the industry.

We have successfully run about 30 Rideshare Seminars in the past 2 years with speakers from 16 different universities around the world, attracting a total attendance of over 400.

Image of Rideshare Seminar Sessions

Science Brown Bag

Science Brown Bag provides a forum for Lyft data scientists to showcase their project achievements and promote knowledge sharing. By having open and informal discussions on their work, data scientists can promote their tools and applications for wider adoption and gather valuable feedback from their peers. The Brown Bag also fosters collaboration among groups with similar ideas.

The topics at Science Brown Bag vary and include important advances in our technology and infrastructure, such as a new metric for measuring the supply-demand imbalance, graph-based embeddings, recommendation systems, and our machine learning and data platforms.

Snapshot Photo from a Science Brown Bag Session

Final Words

The Technical Learning Council is proud to offer Lyft’s data scientists a wide spectrum of learning opportunities. At Lyft, data scientists play a pivotal role in driving innovation, and TLC is committed to building a robust data science team, facilitating innovation, and supporting the continuous growth and success of our scientists.

If you are excited to be part of Lyft’s data science team, explore opportunities on our careers page. Join us and let’s work together to improve people’s lives with the world’s best transportation!

Special thanks to the current and previous workstream leads for the Technical Learning Council, including Amber Wang, Baichuan Mo, Frances Huang, Hao Yi Ong, Li Wang, Miriam Leon, Nick Ung, Paul Havard Duclos, Ramon Iglesias, Vicky Liu, and Zhe Hu!
