Lyft engineering blog

Lyft, 1 month ago

Migrating Lyft’s Android Codebase to Kotlin

Lyft completed a multi-year migration of its Android apps from Java to Kotlin. The post covers motivations (conciseness, K2 compiler, Jetpack Compose, coroutines), pre-migration tracking with an internal Migration Tracker, automated migration tooling (Migration Script using Android Studio IDE scripting), challenges with converters and nullability/legacy interfaces, and a post-migration CI lint check that prohibits new Java files.

Lyft, 2 months ago

Intern Experience at Lyft

Lyft intern data scientists describe developing causal difference-in-differences models to quantify the impact of electric vehicle conversions on driver productivity, incorporating external charging infrastructure projections and third-party data to mitigate incomplete driving pattern visibility. They also outline analytical approaches for the Lyft Rewards loyalty program, defining engagement metrics, segmenting drivers by behavior, and optimizing incentive strategies to improve retention efficiency.

Lyft, 3 months ago

Solving Dispatch in a Ridesharing Problem Space

Lyft describes modeling driver–rider dispatch as a bipartite matching problem (ILP/LP), explains why LP relaxations and the Hungarian method are applicable, and highlights the practical challenges of computing optimal matchings in real time (frequent graph generation, weight updates, batching tradeoffs, preprocessing/pruning, forecasting and dynamic rebalancing).
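
As a rough illustration of the matching formulation summarized above, the sketch below finds a minimum-cost driver-to-rider assignment by exhaustive search. It is not the Hungarian method and not Lyft's implementation: brute force is swapped in only because it fits in a few lines. The `best_assignment` function and the cost matrix are hypothetical.

```python
from itertools import permutations


def best_assignment(cost):
    """Return (total_cost, assignment) for the cheapest one-to-one matching.

    cost[d][r] is the cost of sending driver d to rider r (hypothetical
    units, e.g. pickup minutes). assignment[d] is the rider given to driver d.
    Exhaustive search is O(n!), so this only works for a handful of drivers.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[d][perm[d]] for d in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical pickup times in minutes: rows are drivers, columns are riders.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = best_assignment(cost)
print(total, assignment)  # 5 [1, 0, 2]
```

On this matrix, greedily taking the cheapest remaining pair first (driver 1 to rider 1 at cost 0) ends at a total of 6, while the optimal matching costs 5. That gap is the reason for solving the assignment as a whole rather than pair by pair.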

Lyft, 5 months ago

How science inspires our ETA models

Lyft Data Science explains how microscopic, random traffic events aggregate over longer distances to produce more stable ETA estimates. The post ties empirical evidence and intuition to the Central Limit Theorem and mentions plans to analyze the phenomenon via random walks on stochastic networks. It shows that average travel time per road segment becomes more symmetric and predictable as distance increases, and validates this on a Bay Area route. No specific engineering tools are mentioned.
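
The averaging effect described above can be reproduced with a small simulation. This is a sketch under assumed numbers, not the post's data or model: each road segment takes a fixed 30 seconds plus an independent, skewed (exponential) delay with a mean of 10 seconds, and the function names are hypothetical.

```python
import random
import statistics


def per_segment_average(num_segments, rng):
    """Average travel time per segment over one simulated trip."""
    # Assumed model: 30 s base time plus an exponential delay (mean 10 s).
    times = [30.0 + rng.expovariate(1 / 10.0) for _ in range(num_segments)]
    return sum(times) / num_segments


def spread(num_segments, trips=2000, seed=0):
    """Standard deviation of the per-segment average across many trips."""
    rng = random.Random(seed)
    averages = [per_segment_average(num_segments, rng) for _ in range(trips)]
    return statistics.pstdev(averages)


short_trip_spread = spread(5)    # short route: few segments
long_trip_spread = spread(200)   # long route: many segments
print(short_trip_spread, long_trip_spread)
```

Under this model the spread of the per-segment average shrinks roughly as one over the square root of the number of segments, so the long route's average is much more predictable than the short route's.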

Lyft, 5 months ago

Beyond Query Optimization: Aurora Postgres Connection Pooling with SQLAlchemy & RDSProxy

Lyft describes adopting AWS RDS Proxy for Aurora Postgres to centralize connection pooling for Flask/SQLAlchemy applications. They outline why application-level pools (gunicorn + gevent on Kubernetes) scale poorly, how to configure RDS Proxy (VPC, Secrets Manager, IAM, Terraform), how to integrate it with SQLAlchemy, and key gotchas (statement_timeout via connect args, session pinning caused by session-scoped state). Scale tests and CloudWatch metrics show large reductions in DB connections, improved CPU utilization, and better multiplexing ratios after fixing session-pinning behaviors.

Lyft, 6 months ago

Real-Time Spatial Temporal Forecasting @ Lyft

Lyft describes its real-time spatial-temporal forecasting system for minute-level, geohash-level marketplace signals. The article compares classical time-series and deep-learning models, discusses trade-offs between accuracy, latency, and cost, details refitting strategies for real-time use, and outlines the production architecture and tech stack (streaming pipelines, feature store, Lyft MLP, Airflow, SageMaker, ClickHouse, Kafka/MSK, etc.).

Lyft, 6 months ago

From manual fixes to automatic upgrades — building the Codemod Platform at Lyft

Lyft's Frontend Developer Experience team built an internal Codemod Platform to automate large-scale code transformations and dependency upgrades across 100+ frontend microservices. They use jscodeshift to support TS/TSX/JS/JSX and non-JS files (YAML/JSON/.env), expose a CLI via an internal npm package (@lyft/codemod) runnable with npx, integrate transforms into CI and their Refactorator PR automation, test codemods with jscodeshift's defineTest and AST Explorer, and have converted many major upgrades into minor, automated updates.

Lyft, 8 months ago

FacetController: How we made infrastructure changes at Lyft simple

Lyft built FacetController, a Kubernetes controller using CRDs to represent "facets" (deployable parts of microservices). FacetController centralizes facet lifecycle management so infrastructure-level template changes (autoscaling, resource templates, cluster API/version differences) can be rolled out safely and quickly across many services, enabling migrations (e.g., to Karpenter), automatic garbage collection, and improved developer/operator experience without mass redeploys.

Lyft, 9 months ago

Using Marketplace Marginal Values to Address Interference Bias

Lyft presents Marketplace Marginal Values (MMV), a method that uses shadow prices (dual variables) from the dispatch/matching optimization to correct interference (network) bias in marketplace A/B experiments. The article explains the theory (linear relaxation, duality, Taylor approximation), how to compute MMVs by solving the matching/dispatch optimization hourly and extracting the duals, and how to include MMV-corrected metrics in experiment reports (with CUPED). It also covers practical choices (a 1-hour matching cycle), comparisons to alternative designs (time-split, region-split, cluster randomization), empirical backtests showing that MMV alters some launch decisions and typically reduces overestimated effects in supply-constrained settings, and limitations where MMV doesn't apply.

Lyft, 10 months ago

Cartography joins the CNCF

Cartography is an open-source tool that builds a graph of cloud infrastructure assets to identify security risks and attack paths, and it has been refined in production for vulnerability management and IAM analysis. The post shares lessons from its open-source journey—such as setting clear goals, fostering community contributions, maintaining documentation and tests, and appointing external maintainers—to ensure long-term project sustainability. It also explains the rationale and practical changes behind donating Cartography to the CNCF.

Lyft, 10 months ago

Integrating Extensions into Large-Scale iOS apps

Lyft takes a deep dive into integrating a ride-booking Apple Maps extension into a large, modular iOS app. They describe challenges from static linking and transitive dependencies in a Bazel-built codebase, how they used bazel query plus Graphviz/Gephi and a binary-size-diff CI script to identify and measure contributors to binary size and memory usage, and the refactoring choices (extract minimal module vs shared core) that reduced extension binary size from ~45MB to ~15MB. The article also covers SiriKit/Info.plist/GeoJSON deployment caveats and the APPLICATION_EXTENSION_API_ONLY flag constraints.

Lyft, 1 year ago

Protocol Buffer Design: Principles and Practices for Collaborative Development

Lyft engineering explains principles and practices for collaborative Protocol Buffer (proto3) design, emphasizing clarity and extensibility. The article covers schema patterns (oneof, explicit unknown enum values, optional fields), use of google well-known types (Timestamp/Duration/wrappers), validation with protoc-gen-validate (and its successor protovalidate), cross-entity constants via custom options, language-dependent behavior (Python/Swift/Kotlin), and includes example proto and Python validation snippets.

Lyft, 1 year ago

Building Lyft’s Next Emblem — Glow

Lyft describes the IoT architecture for its new Glow emblem: BLE-connected devices use the driver’s phone as an IoT gateway to Lyft backend services. Core backend components (Device Registry, Device Controller, Device Shadow, OTA Manager) handle provisioning/authentication, command delivery (passthrough vs complex), device state management with eventual consistency, and robust OTA firmware updates (boot manager, rollback, recovery image). The post covers sensor data streams, diagnostics, and design choices for reliability and security.

Lyft, 1 year ago

FAQ: Common Questions from Candidates During Lyft Data Science Interviews

This post describes Lyft’s Data Science interview process, distinguishing between Decision and Algorithm Scientist roles and outlining the three stages: recruiter screen, technical phone screen, and virtual onsite interviews with segments like business case, experience, coding, and technical case study. It also covers how candidates are matched to teams, typical timeline expectations, and the hybrid work policy for Data Science roles at Lyft.

Lyft, 1 year ago

ETA (Estimated Time of Arrival) Reliability at Lyft

Lyft describes their approach to estimating ETA reliability prior to a ride request by training a tree-based classification model (gradient boosting) that scores reliability for all possible ETA brackets. The article covers feature engineering (nearby drivers, geohash-level history, marketplace signals), label construction (binary label per ETA bracket), evaluation with AUC, and production concerns using Lyft's ML platform (LyftLearn) for serving, monitoring, drift detection and automated retraining.

Lyft, 1 year ago

Keeping OSM fresh, accurate, and navigation-worthy at Lyft

Lyft describes how its Mapping team uses and contributes to OpenStreetMap (OSM) to keep maps fresh and navigation-ready, leveraging driver telemetry, Nearmap imagery, and an internal automated pipeline to detect and publish corrections. The post covers community engagement (changeset discussions, forums, Slack), training on GitHub, operational metrics (hundreds of thousands of changesets and millions of edits), and future mapping priorities like lanes, turn restrictions, destination data, barriers, and construction updates.

Lyft, 1 year ago

Technical Learning at Lyft: Build a Strong Data Science Team

Written by Shumpei Goke and Jinshu Niu. Why Technical Learning? At Lyft, data scientists tackle challenging technical problems every day. To support and empower...

Lyft, 1 year ago

Crafting Seamless Journeys with Live Activities

Lyft describes their client-focused implementation of Apple Live Activities: server-driven lifecycle control, a compact SDUI (server-driven UI) model using protobuf for small payloads, handling stale_date and Freshness Monitor semantics, and strategies to deliver remote images safely via App Groups and APNs within a 4KB payload constraint. The post emphasizes UI modeling (RichText, ProgressBar, RemoteImage), lifecycle edge cases, and tradeoffs between server-driven and client-driven approaches.

Lyft, 1 year ago

Lyft’s Reinforcement Learning Platform

Lyft describes building an RL platform (focused on contextual bandits) by extending their existing ML training and serving stack. The post covers modeling (Vowpal Wabbit, RLlib, internal RL library), evaluation via Off-Policy Evaluation (Coba, IPS/DM/DR, rejection sampling), and operational pieces (LyftLearn Serving, model registry/CI-CD, S3, data joining from a Data Warehouse), plus lessons and next steps.
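
To make the off-policy evaluation step above concrete, here is a minimal sketch of the inverse propensity scoring (IPS) estimator in plain Python. It does not use Coba or any Lyft library. The log format, the `ips_estimate` and `target_policy` names, and the toy data are all assumptions made for illustration.

```python
def ips_estimate(logs, target_policy):
    """Inverse propensity scoring estimate of a target policy's mean reward.

    logs: list of (context, action, reward, logging_prob) tuples, where
          logging_prob is the probability with which the logging policy
          chose the logged action.
    target_policy(context, action): probability that the policy being
          evaluated would choose that action in that context.
    """
    total = 0.0
    for context, action, reward, logging_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        weight = target_policy(context, action) / logging_prob
        total += weight * reward
    return total / len(logs)


# Hypothetical logs collected under a uniform-random logging policy.
logs = [
    ("rush_hour", "offer_a", 1.0, 0.5),
    ("rush_hour", "offer_b", 0.0, 0.5),
    ("off_peak", "offer_a", 0.0, 0.5),
    ("off_peak", "offer_b", 1.0, 0.5),
]


def target_policy(context, action):
    """Deterministic candidate policy: offer_a at rush hour, else offer_b."""
    best = "offer_a" if context == "rush_hour" else "offer_b"
    return 1.0 if action == best else 0.0


value = ips_estimate(logs, target_policy)
print(value)  # 1.0
```

The logged data has a mean reward of 0.5, while the IPS estimate for the candidate policy is 1.0, all without deploying that policy. In practice, small logging probabilities inflate the weights and the variance, which is one reason the direct method (DM) and doubly robust (DR) estimators are used alongside IPS.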

Lyft, 1 year ago

Postgres Aurora DB major version upgrade with minimal downtime

Lyft describes a blue/green-style upgrade of an Amazon Aurora PostgreSQL cluster from Postgres 10 to 13 for a large transactions DB (~30TB). They cloned the cluster, upgraded the clone, set up native logical replication (publications, subscriptions, replication slots), advanced replication origins to a captured LSN, enabled the subscriptions so replication could catch up, and performed a brief cutover (source DB read-only + Envoy circuit breaker + Route53 DNS switch) that kept downtime to ~7 minutes. Post-cutover steps included resetting sequences and running VACUUM ANALYZE. AWS now offers native blue/green deployments for Aurora, which simplify this process.