Design Facebook

Facebook serves 3B+ monthly active users with diverse features: news feed, groups, pages, marketplace, and messaging. This design focuses on the core social graph and feed generation.

Scale: 3B MAU, 2B daily active users, 500M+ posts/day
Social Graph: roughly 450B friend connections
Feed: Personalized ranking with ML models

Facebook is not just a social network; it is a platform ecosystem that requires a microservice architecture at global scale.

Requirements Clarification

Functional Requirements

Send friend requests and manage friendships
Post status updates, photos, videos
View personalized news feed
Like, comment, share posts
Create and join groups
Create and follow pages
Send messages (Messenger)
Notifications

Non-Functional Requirements

Availability: 99.99% uptime
Latency: News feed < 500ms
Consistency: Eventual consistency for feed, strong for relationships
Scale: 3B users, 500M posts/day, 2B feed reads/day

Facebook's architecture is fundamentally different from Twitter because of the social graph complexity. Facebook has both strong and weak ties, groups with varying sizes, and pages with millions of followers.

Back-of-the-Envelope Estimation

Social Graph Size

\text{Edges} \approx 3B \times 300 \text{ avg friends} \div 2 \approx 450B \text{ edges}

Here,

$3B$ = Monthly active users
$300$ = Average friends per user
$450B$ = Total friend connections

Storage for Social Graph

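At 450B edges, storing each edge as 8 bytes (two 32-bit user IDs): 450B × 8 bytes = 3.6 TB for the raw edge list. An unsigned 32-bit ID tops out near 4.3B, which leaves little headroom over 3B users; 64-bit IDs would double this figure to 7.2 TB.

With metadata (friendship date, status): 3.6 TB × 5 = 18 TB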
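
The estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes), and the constant names are illustrative rather than taken from any real system.

```python
# Back-of-the-envelope check of the social graph estimates.
MAU = 3_000_000_000          # monthly active users
AVG_FRIENDS = 300            # average friends per user
BYTES_PER_EDGE = 8           # two 32-bit user IDs
METADATA_FACTOR = 5          # multiplier for friendship date, status, etc.
TB = 10**12                  # decimal terabyte (assumption)

# Each friendship connects two users, so it is counted once: divide by two.
edges = MAU * AVG_FRIENDS // 2
raw_bytes = edges * BYTES_PER_EDGE
with_metadata_bytes = raw_bytes * METADATA_FACTOR

print(f"edges: {edges / 1e9:.0f}B")
print(f"raw edge storage: {raw_bytes / TB:.1f} TB")
print(f"with metadata: {with_metadata_bytes / TB:.1f} TB")
```

With these inputs the raw edge list comes to 3.6 TB, and 18 TB once the metadata multiplier is applied.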

High-Level Architecture

Social Graph: TAO Architecture

Definition: TAO (The Associations and Objects)

TAO is Facebook's distributed data store for the social graph. It models the graph as objects (users, posts, comments) and associations (friendships, likes, follows). TAO uses MySQL for storage with a write-through cache layer.

TAO Data Model

\text{Graph} = (O, A) \text{ where } O = \text{objects}, A = \text{associations}

Here,

$O$ = Set of objects (users, posts, etc.)
$A$ = Set of associations (edges between objects)
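
To make the objects-and-associations model concrete, here is a minimal in-memory sketch. It is a toy, not Facebook's implementation: the class `ToyGraphStore`, its method names, and the choice to store a friendship as two directed associations are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GraphObject:
    """A node in the graph: a user, post, comment, and so on."""
    obj_id: int
    otype: str
    data: dict = field(default_factory=dict)


class ToyGraphStore:
    """In-memory stand-in for an objects-and-associations store (hypothetical)."""

    def __init__(self):
        self.objects = {}                 # obj_id -> GraphObject
        # (source id, association type) -> {target id: creation time}
        self.assocs = defaultdict(dict)

    def obj_add(self, obj_id, otype, **data):
        self.objects[obj_id] = GraphObject(obj_id, otype, data)

    def assoc_add(self, id1, atype, id2, time):
        self.assocs[(id1, atype)][id2] = time

    def assoc_get(self, id1, atype):
        """Return target ids for (id1, atype), newest first."""
        edges = self.assocs.get((id1, atype), {})
        return sorted(edges, key=edges.get, reverse=True)

    def assoc_count(self, id1, atype):
        return len(self.assocs.get((id1, atype), {}))


store = ToyGraphStore()
store.obj_add(1, "user", name="alice")
store.obj_add(2, "user", name="bob")
store.obj_add(100, "post", text="hello")

# Assumption: a symmetric friendship is stored as two directed associations.
store.assoc_add(1, "friend", 2, time=10)
store.assoc_add(2, "friend", 1, time=10)
store.assoc_add(2, "likes", 100, time=20)

print(store.assoc_get(1, "friend"))
print(store.assoc_count(2, "likes"))
```

Keying associations by (source, type) means that listing one user's friends or counting one post's likes is a single lookup, which is the access pattern the feed depends on.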

Graph Partitioning

Definition: Vertex-Cut Partitioning

TAO uses vertex-cut partitioning: each edge is assigned to a partition by hashing its source vertex. All outgoing edges of a user therefore live on the same partition, so a query such as "list this user's friends" touches a single partition.

Partition Assignment

\text{partition}(u, v) = \text{hash}(u) \mod N

Here,

$u$ = Source vertex (user)
$v$ = Target vertex
$N$ = Number of partitions
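
The assignment rule can be sketched in a few lines. The use of SHA-256 is an assumption made so that the hash is stable across processes and machines; the text does not say which hash function is used in practice.

```python
import hashlib


def partition(u: int, v: int, num_partitions: int) -> int:
    """Map edge (u, v) to a partition using only the source vertex u."""
    # v is accepted to mirror partition(u, v) but does not affect placement.
    digest = hashlib.sha256(str(u).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


N = 16
edges = [(42, 7), (42, 99), (42, 1234), (7, 42)]
placement = {edge: partition(*edge, N) for edge in edges}

# Every edge that starts at user 42 lands on the same partition.
assert len({placement[e] for e in edges if e[0] == 42}) == 1
```

One consequence of this rule is that the two directions of a relationship, (42, 7) and (7, 42), are placed independently and will usually sit on different partitions.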

News Feed Generation

Feed Ranking Pipeline

Feed Ranking Score

S = \alpha \cdot P(\text{engage}) + \beta \cdot \text{recency} + \gamma \cdot \text{affinity} + \delta \cdot \text{diversity}

Here,

$P(\text{engage})$ = ML-predicted engagement probability
$\text{recency}$ = Time decay factor
$\text{affinity}$ = Relationship strength
$\text{diversity}$ = Content variety bonus
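
A small sketch shows how the weighted sum orders candidate posts. The weights, the exponential half-life decay used for recency, and the `Candidate` fields are all hypothetical values chosen for illustration; a production system would learn or tune them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    p_engage: float     # model-predicted engagement probability, 0..1
    age_hours: float    # time since the post was created
    affinity: float     # relationship strength, 0..1
    diversity: float    # content variety bonus, 0..1


# Hypothetical weights (alpha, beta, gamma, delta) chosen for illustration.
ALPHA, BETA, GAMMA, DELTA = 0.5, 0.2, 0.2, 0.1
HALF_LIFE_HOURS = 6.0   # assumed decay half-life


def recency(age_hours: float) -> float:
    """Exponential time decay: 1.0 for a new post, 0.5 after one half-life."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def score(c: Candidate) -> float:
    """S = alpha*P(engage) + beta*recency + gamma*affinity + delta*diversity."""
    return (ALPHA * c.p_engage + BETA * recency(c.age_hours)
            + GAMMA * c.affinity + DELTA * c.diversity)


candidates = [
    Candidate("old_viral", p_engage=0.9, age_hours=24, affinity=0.1, diversity=0.2),
    Candidate("fresh_friend", p_engage=0.4, age_hours=0, affinity=0.9, diversity=0.5),
]
ranked = sorted(candidates, key=score, reverse=True)
print([c.post_id for c in ranked])
```

With these weights the fresh post from a close friend (score 0.63) outranks the older high-engagement post (score about 0.50), which shows how recency and affinity can offset raw engagement probability.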

Group Feed

Definition: Group Feed Generation

Group feeds use a different strategy than personal feeds. For small groups (<500 members), fan-out on write is feasible. For large groups (>5000 members), fan-out on read with caching is preferred.
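
The size-based choice can be written as a small dispatch function. The two thresholds come from the text above; the behavior for mid-sized groups (500 to 5000 members) is not specified there, so the fallback below is an assumption, as are the function and constant names.

```python
SMALL_GROUP_MAX = 500     # below this, fan-out on write is feasible
LARGE_GROUP_MIN = 5000    # above this, fan-out on read with caching

# Assumption: mid-sized groups keep using fan-out on write.
MID_SIZE_DEFAULT = "fan_out_on_write"


def group_feed_strategy(member_count: int) -> str:
    """Pick a feed generation strategy from the group's member count."""
    if member_count < SMALL_GROUP_MAX:
        return "fan_out_on_write"
    if member_count > LARGE_GROUP_MIN:
        return "fan_out_on_read_cached"
    return MID_SIZE_DEFAULT


for size in (50, 2_000, 1_000_000):
    print(size, group_feed_strategy(size))
```

Keeping the thresholds as named constants makes it easy to tune them, or to replace the mid-range fallback with a load-based rule.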

Data Model

User Schema

\text{User} = (user\_id, name, email, profile\_photo, friends\_count, created\_at)

Here,

$user\_id$ = Unique user identifier
$friends\_count$ = Denormalized friend count

Post Schema

\text{Post} = (post\_id, user\_id, group\_id, page\_id, content, media[], visibility, timestamp)

Here,

$group\_id$ = Group ID if posted in a group
$page\_id$ = Page ID if posted by a page
$visibility$ = public/friends/private/group
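
The two schemas translate directly into typed records. The field names follow the tuples above; the concrete types (integer IDs, a string path for the profile photo, a list of media URLs) are assumptions, since the text does not specify them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Visibility(Enum):
    PUBLIC = "public"
    FRIENDS = "friends"
    PRIVATE = "private"
    GROUP = "group"


@dataclass
class User:
    user_id: int
    name: str
    email: str
    profile_photo: str
    friends_count: int          # denormalized friend count
    created_at: datetime


@dataclass
class Post:
    post_id: int
    user_id: int
    content: str
    visibility: Visibility
    timestamp: datetime
    group_id: Optional[int] = None   # set only when posted in a group
    page_id: Optional[int] = None    # set only when posted by a page
    media: List[str] = field(default_factory=list)


now = datetime.now(timezone.utc)
alice = User(1, "Alice", "alice@example.com", "photos/1.jpg", 0, now)
post = Post(post_id=10, user_id=alice.user_id, content="Hello",
            visibility=Visibility.FRIENDS, timestamp=now)
print(post)
```

Making `group_id` and `page_id` optional reflects that most posts belong to neither a group nor a page.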

Scaling Strategies

Write Amplification vs Read Amplification

Facebook's feed uses hybrid fan-out: fan-out on write for most users, pull-on-read for users following celebrities or very active posters. The threshold is dynamic based on system load.

Fan-out Cost

\text{Write cost} = O(f) \text{ per post}, \text{Read cost} = O(1)

Here,

$f$ = Number of followers
$O(f)$ = Proportional to follower count
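
The hybrid approach can be sketched with two in-memory structures: per-user inboxes that are filled at write time, and per-author outboxes that are merged at read time. The class name, the fixed `CELEBRITY_THRESHOLD`, and the tuple layout are hypothetical; the text describes the real threshold as dynamic, which this toy does not model.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3   # hypothetical; kept tiny so the demo triggers it


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.inbox = defaultdict(list)      # user -> items pushed at write time
        self.outbox = defaultdict(list)     # author -> items pulled at read time

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, ts, text):
        """Store a post and return the number of inbox writes performed."""
        item = (ts, author, text)
        if self.is_celebrity(author):
            self.outbox[author].append(item)     # one write, no fan-out
            return 0
        for user in self.followers[author]:      # O(f) writes
            self.inbox[user].append(item)
        return len(self.followers[author])

    def read(self, user):
        """Merge the precomputed inbox with celebrity outboxes, newest first."""
        items = list(self.inbox[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                items.extend(self.outbox[author])
        return sorted(items, reverse=True)


feed = HybridFeed()
for fan in ("a", "b", "c"):
    feed.follow(fan, "star")
feed.follow("a", "bob")

writes_bob = feed.post("bob", 1, "hi from bob")      # pushed to one inbox
writes_star = feed.post("star", 2, "hi from star")   # stored once
print(writes_bob, writes_star)
print(feed.read("a"))
```

A real system would also have to handle authors who cross the threshold after posting; this sketch ignores that case.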

Practice Exercises

Graph Traversal: Design an algorithm to find "People You May Know" using friend-of-friend traversal. What's the time complexity?
Feed Consistency: How do you handle the case where a user unfriends someone, but the unfriended person's posts still appear in the feed? Design a consistency mechanism.
Group Scaling: Design a group with 10M members. How do you handle posts, notifications, and moderation?
Privacy: How would you implement fine-grained privacy controls (e.g., "friends except coworkers") without impacting feed generation performance?

Key Takeaways:

Facebook uses TAO for social graph storage with vertex-cut partitioning
News feed uses ML-based ranking with a multi-stage pipeline
Hybrid fan-out (write for normal, read for celebrities) balances performance
GraphQL gateway provides flexible query capabilities
Group feeds require different strategies based on group size

What to Learn Next

-> Design Instagram: Photo sharing and media delivery at scale.

-> Design Twitter: Real-time feeds and fan-out architectures.

-> Design WhatsApp: Messaging systems and real-time delivery.

-> Design YouTube: Video streaming and content delivery.

-> CAP Theorem: Consistency vs availability trade-offs.

-> Caching Strategies: Distributed caching and invalidation.
