System Design Problems
Design Twitter
Twitter serves 400M+ monthly active users with 500M+ tweets daily. This design covers tweet distribution, timeline generation, trending topics, and real-time search.
- Scale β 400M MAU, 500M tweets/day, 600K QPS reads
- Timeline β Home timeline vs user timeline with fan-out
- Trends β Real-time trending topics with decay algorithms
Twitter's core challenge is the celebrity problem: a single tweet from a popular account must reach millions instantly.
Requirements Clarification
Functional Requirements
- Post tweets (text, images, videos, polls)
- View home timeline (tweets from followed users)
- View user timeline (tweets from a specific user)
- Follow/unfollow users
- Like, retweet, reply to tweets
- View trending topics
- Search tweets by keywords/hashtags
- Real-time notifications
Non-Functional Requirements
- Availability: 99.99% uptime
- Latency: Timeline loads < 200ms
- Consistency: Eventual consistency acceptable
- Durability: Tweets must never be lost
- Scale: 500M tweets/day, 600K timeline reads QPS
Twitter's fundamental architectural decision: Fan-out on write for home timelines. When a user tweets, the tweet ID is immediately pushed to all followers' timeline caches. This trades write amplification for fast reads.
Back-of-the-Envelope Estimation
Tweet Write QPS
Here,
- =Tweets per day
- =Seconds in a day
Fan-out Storage Estimation
Average user: 200 followers Celebrity user: 10M followers
If 1% of users are celebrities (4M users) and each has 10M followers: Fan-out writes per celebrity tweet = 10M Total fan-out writes for all celebrity tweets per day = 4M Γ 10M Γ tweets_per_celebrity
This is why Twitter uses a hybrid approach: fan-out on write for normal users, pull on read for celebrities.
High-Level Architecture
Timeline Generation Deep Dive
Home Timeline Architecture
DfFan-out on Write Timeline
When a user tweets, the tweet ID is pushed to the timeline cache of all followers. The timeline cache stores the 800 most recent tweet IDs per user. When a user opens their home timeline, we simply read from this cache and hydrate with full tweet data.
Celebrity Handling
For users with millions of followers (celebrities), fan-out on write is too expensive. Instead, Twitter uses a hybrid approach: the home timeline service checks if the author is a celebrity and falls back to pulling their recent tweets at read time.
Celebrity Threshold
Here,
- =Celebrity threshold (typically 10K-100K)
- =Number of followers
Trending Topics
Count-Min Sketch Algorithm
DfCount-Min Sketch
A probabilistic data structure used to estimate frequency counts of items in a stream. Uses multiple hash functions to reduce collision probability. Space-efficient: O(Ξ΅β»ΒΉ Γ ln(Ξ΄β»ΒΉ)) for error Ξ΅ and confidence 1-Ξ΄.
Count-Min Sketch Update
Here,
- =ith hash function
- =ith row, jth column of sketch
Trending Score with Time Decay
Trending Score
Here,
- =Current mention count
- =Decay constant
- =Time since first mention
Trending Score Calculation
Topic A: 10,000 mentions in last hour Topic B: 50,000 mentions but started 6 hours ago
With Ξ» = 0.1: Score_A = 10000 / e^(0.1 Γ 1) = 9048 Score_B = 50000 / e^(0.1 Γ 6) = 50000 / 1.822 = 27,442
Topic B still trends because its total volume outweighs the decay.
Search Architecture
DfInverted Index
Twitter's search uses an inverted index mapping terms to tweet IDs. The index is partitioned by time (recent tweets) and by popularity (all-time tweets). Queries are evaluated using Boolean operations on posting lists.
Data Model
Tweet Schema
Here,
- =Unique tweet ID (Snowflake)
- =Parent tweet ID (null if top-level)
- =Original tweet ID (null if original)
Twitter uses Snowflake IDs: 64-bit IDs encoding timestamp, machine ID, and sequence number. This guarantees uniqueness without coordination and provides rough chronological ordering.
Practice Exercises
-
Fan-out Design: How would you handle a celebrity with 100M followers posting a tweet? Estimate the time and resources needed for fan-out.
-
Timeline Consistency: If a user follows someone who tweets, but the tweet doesn't appear in their timeline for 5 seconds, is this acceptable? Design a system to minimize this delay.
-
Trending Accuracy: How would you detect and filter out artificially inflated trending topics (bot manipulation)?
-
Search Performance: Design a search system that can return results for "machine learning" in under 100ms across 500M tweets.
Key Takeaways:
- Twitter uses fan-out on write with celebrity exception (pull-on-read)
- Snowflake IDs provide globally unique, roughly ordered identifiers
- Trending uses Count-Min Sketch with time decay
- Search relies on inverted indices with time-based partitioning
- The celebrity problem is the central scaling challenge
What to Learn Next
-> Design Instagram Photo sharing and feed generation at scale.
-> Design Facebook Social graph and news feed architecture.
-> Design WhatsApp Messaging systems and real-time delivery.
-> Design Leaderboard Sorted sets and real-time ranking systems.
-> Back Pressure Managing load in message-driven architectures.
-> Caching Strategies Cache invalidation and distributed caching.