System Design Problems
Design Uber
Uber serves 100M+ monthly active users across 70+ countries with real-time ride matching, location tracking, and dynamic pricing. This design covers geospatial indexing, dispatch systems, and ETA estimation.
- Scale — 100M+ MAU, 19M trips/day, 5M+ drivers
- Latency — Driver matching < 5 seconds
- Real-time — Location updates every 4 seconds
Uber's core challenge is solving the real-time matching problem at global scale with sub-second latency.
Requirements Clarification
Functional Requirements
- Request a ride and get matched with a driver
- Real-time driver location tracking
- ETA calculation for pickup and dropoff
- Dynamic pricing (surge pricing)
- Trip management (start, end, payment)
- Driver availability management
- Rating system for riders and drivers
Non-Functional Requirements
- Availability: 99.99% uptime
- Latency: Match driver < 5 seconds
- Consistency: Strong for location, eventual for payments
- Scale: 5M concurrent drivers, 100K ride requests/minute
Uber's key insight: Location data is the core primitive. Everything else (matching, routing, ETA) depends on efficient geospatial indexing and real-time location processing.
Back-of-the-Envelope Estimation
Location Update Rate
Here,
- =Concurrent drivers
- =Update interval
- =Location QPS
Storage Estimation
Location data per driver per day:
- 1 update/4 sec x 86400 sec = 21,600 updates/day
- Each update: 50 bytes (lat, lng, timestamp, driver_id)
- Per driver: 21,600 x 50 = 1.08 MB/day
Total daily location data: 10M drivers x 1.08 MB = 10.8 TB/day
High-Level Architecture
Geospatial Indexing: S2 Geometry
DfS2 Geometry
S2 is Google's spherical geometry library that divides the Earth's surface into hierarchical cells. Uber uses S2 cells at level 15 (~1km x 1km) for driver indexing. This enables efficient range queries for finding nearby drivers.
S2 Cell Lookup
Here,
- =Latitude and longitude
- =Cell resolution (~1km)
- =64-bit cell identifier
S2 cells provide O(1) lookup for "find drivers in this cell." For nearby cell queries, we check the 8 neighboring cells. This is much faster than brute-force distance calculations.
Matching Algorithm
Driver Score
Here,
- =Estimated time to pickup
- =Driver rating (0-5)
- =Current surge multiplier
Surge Pricing
DfDynamic Pricing
Surge pricing is determined by supply-demand ratio in real-time. When demand exceeds supply in a region, prices increase to balance the market. The surge multiplier is calculated every minute for each S2 cell.
Surge Calculation
Here,
- =Ride requests in last 5 min
- =Available drivers in cell
- =Calibration constant
Trip State Machine
Data Model
Trip Schema
Here,
- =Unique trip identifier
- =S2 cell + coordinates
- =State machine state
Practice Exercises
- Geospatial: Design a system to find all drivers within 5km of a rider. What data structure would you use?
- Matching: How would you handle the case where a driver rejects a ride offer? Design the retry logic.
- Surge: Design a surge pricing algorithm that doesn't frustrate riders but balances supply/demand.
- Scale: How would you handle 10x traffic during New Year's Eve?
Key Takeaways:
- S2 Geometry provides efficient geospatial indexing for driver lookup
- Matching uses a scoring function balancing ETA, rating, and surge
- Surge pricing balances supply/demand using real-time ratio calculation
- Trip state machine manages the full lifecycle
- Location updates are the core data stream (1.25M QPS)
What to Learn Next
-> Design Airbnb Marketplace and booking systems.
-> Design Amazon E-commerce at scale.
-> Design WhatsApp Real-time messaging systems.
-> Design Leaderboard Sorted sets and ranking.
-> Saga Pattern Distributed transactions.
-> Idempotency Handling duplicate requests safely.