System Design Problems
Design Instagram
Instagram serves 2B+ monthly active users with 100M+ photos uploaded daily. This design covers feed generation, media storage, social graph management, and real-time notifications.
- Scale: 2B MAU, 100M photos/day, 500M stories/day
- Feed: Personalized ranking with sub-100ms latency
- Media: Multi-resolution storage and CDN delivery
Instagram is a masterclass in building systems that balance real-time social interactions with massive media delivery.
Requirements Clarification
Functional Requirements
- Upload photos/videos with captions
- View personalized feed of posts from followed users
- Like, comment, and share posts
- Follow/unfollow users
- View stories (24-hour ephemeral content)
- Direct messaging with media sharing
- Explore/discover trending content
Non-Functional Requirements
- Availability: 99.99% uptime
- Latency: Feed loads < 200ms
- Durability: Media must never be lost
- Consistency: Eventual consistency for the feed; strong consistency for user actions (for example, follows and likes)
- Scale: 100M photo uploads/day, 1B feed reads/day
Key insight: feed generation is the hardest part of this design. When a user follows 1,000+ accounts, computing a personalized feed requires sophisticated ranking algorithms and caching strategies.
Back-of-the-Envelope Estimation
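Feed Read QPS
$$\text{QPS}_{\text{feed}} = \frac{N_{\text{DAU}} \times V}{T}$$
Here,
- $N_{\text{DAU}}$ = Daily active users
- $V$ = Feed views per user per day
- $T$ = Seconds in a day (86,400)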
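As a rough worked example of this estimate, the sketch below computes average feed-read QPS. The 500M daily active users and 2 feed views per user per day are illustrative assumptions, chosen only so that the product matches the 1B feed reads/day requirement stated earlier; the 3x peak factor is likewise hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def feed_read_qps(daily_active_users: int, views_per_user_per_day: float) -> float:
    """Average feed reads per second over a full day."""
    return daily_active_users * views_per_user_per_day / SECONDS_PER_DAY


# Assumed inputs: 500M DAU x 2 views/day = 1B feed reads/day.
average_qps = feed_read_qps(500_000_000, 2)

# Assumed peak factor of 3x to leave headroom above the daily average.
peak_qps = average_qps * 3

print(f"average: {average_qps:,.0f} QPS, peak: {peak_qps:,.0f} QPS")
```

Under these assumptions the average works out to roughly 11.6K reads per second, which is the number the feed cache tier would need to absorb before any peak headroom.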
Storage Estimation
Photo storage:
- Average photo size: 2MB
- 100M photos/day × 2MB = 200TB/day
- With 3 copies (replication) = 600TB/day
- With multiple resolutions: 600TB × 3 = 1.8PB/day
Metadata storage:
- 100M records × 1KB = 100GB/day
- User data: 2B users × 5KB = 10TB total
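The arithmetic above can be checked with a few lines of code. This sketch uses decimal units (1TB = 10^12 bytes) and the figures from the lists above; the constant names are illustrative.

```python
KB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15

PHOTOS_PER_DAY = 100_000_000
AVG_PHOTO_BYTES = 2 * MB
REPLICATION_FACTOR = 3   # three copies of every object
RESOLUTION_FACTOR = 3    # multiplier used above for multiple resolutions

# Photo storage per day
raw_per_day = PHOTOS_PER_DAY * AVG_PHOTO_BYTES
replicated_per_day = raw_per_day * REPLICATION_FACTOR
media_per_day = replicated_per_day * RESOLUTION_FACTOR

# Metadata storage
metadata_per_day = PHOTOS_PER_DAY * 1 * KB
user_data_total = 2_000_000_000 * 5 * KB

print(f"raw photos:      {raw_per_day / TB:.0f} TB/day")
print(f"replicated:      {replicated_per_day / TB:.0f} TB/day")
print(f"all resolutions: {media_per_day / PB:.1f} PB/day")
print(f"post metadata:   {metadata_per_day / GB:.0f} GB/day")
print(f"user data:       {user_data_total / TB:.0f} TB total")
```

Note that multiplying by 3 for resolutions treats every variant as if it were as large as the original. Smaller variants take less space, so this figure is best read as an upper bound.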
High-Level Architecture
Feed Generation Deep Dive
Fan-out on Write vs Fan-out on Read
Fan-out Strategies
Fan-out on write: When a user posts, immediately push the post ID to all followers' feed caches. Fast reads but expensive for users with millions of followers.
Fan-out on read: When a user requests their feed, pull posts from all followed users and merge. Always fresh but slow for users following many accounts.
Instagram uses a hybrid approach: fan-out on write for most users, fan-out on read for celebrities (users with >10K followers).
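A minimal in-memory sketch of the hybrid approach follows. It is not Instagram's implementation: the class name, method names, and data structures are hypothetical, and plain dictionaries stand in for the feed cache and post store. Only the 10K follower threshold comes from the text above.

```python
from collections import defaultdict


class HybridFeedService:
    """Toy hybrid fan-out: push for normal authors, pull for celebrities."""

    def __init__(self, celebrity_threshold=10_000):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)      # author -> users following them
        self.following = defaultdict(set)      # user -> authors they follow
        self.feed_cache = defaultdict(list)    # user -> [(timestamp, post_id)]
        self.author_posts = defaultdict(list)  # author -> [(timestamp, post_id)]

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > self.celebrity_threshold

    def publish(self, author, post_id, timestamp):
        self.author_posts[author].append((timestamp, post_id))
        if not self.is_celebrity(author):
            # Fan-out on write: one cache append per follower.
            for follower in self.followers[author]:
                self.feed_cache[follower].append((timestamp, post_id))

    def read_feed(self, user, limit=10):
        # Start from the precomputed entries pushed at write time.
        entries = set(self.feed_cache[user])
        # Fan-out on read: pull celebrity posts only when the feed is requested.
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.author_posts[author])
        newest_first = sorted(entries, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]
```

Using a set for the merge also removes duplicates if an author crosses the threshold after some of their posts were already pushed to follower caches.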
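Fan-out Cost Comparison
$$W_{\text{push}} = O(F), \qquad R_{\text{push}} = O(1)$$
Here,
- $F$ = Number of followers
- $W_{\text{push}}$ = Write cost proportional to followers
- $R_{\text{push}}$ = Constant read cost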
Fan-out on Read Cost
$$R_{\text{pull}} = O(N)$$
Here,
- $N$ = Number of followed users
- $R_{\text{pull}}$ = Read cost proportional to followed users
Feed Ranking Algorithm
Multi-Stage Ranking
Instagram's feed ranking uses a multi-stage approach:
- Candidate Generation: Retrieve recent posts from followed users
- Initial Scoring: Fast heuristic scoring
- ML Ranking: Deep learning model for personalization
- Blending: Mix with ads and suggested content
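The four stages can be sketched as a chain of small functions. Everything here is a simplified stand-in: the `Candidate` fields, the 48-hour window, the shortlist size, and the ad interval are assumptions, and `predicted_engagement` is a precomputed number that takes the place of a real model's output.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    author: str
    age_hours: float
    predicted_engagement: float  # stand-in for a model score in [0, 1]


def generate_candidates(posts, followed, max_age_hours=48):
    """Stage 1: keep recent posts from followed users."""
    return [p for p in posts if p.author in followed and p.age_hours <= max_age_hours]


def initial_scoring(candidates, keep=500):
    """Stage 2: cheap heuristic (newest first) to shrink the candidate set."""
    return sorted(candidates, key=lambda p: p.age_hours)[:keep]


def ml_ranking(candidates):
    """Stage 3: placeholder for the personalization model."""
    return sorted(candidates, key=lambda p: p.predicted_engagement, reverse=True)


def blend(ranked, ads, ad_interval=3):
    """Stage 4: insert one ad after every `ad_interval` organic posts."""
    feed, ad_iter = [], iter(ads)
    for position, post in enumerate(ranked, start=1):
        feed.append(post)
        if position % ad_interval == 0:
            ad = next(ad_iter, None)
            if ad is not None:
                feed.append(ad)
    return feed


def build_feed(posts, followed, ads):
    shortlisted = initial_scoring(generate_candidates(posts, followed))
    return blend(ml_ranking(shortlisted), ads)
```

The point of the staging is cost control: the expensive model only sees the shortlist that survives the cheap heuristic, not every post from every followed account.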
Feed Score Components
$$\text{Score} = f(A, R, C, E)$$
Here,
- $A$ = Relationship affinity score
- $R$ = Recency score
- $C$ = Content type preference
- $E$ = Predicted engagement
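One simple way to combine these four components is a weighted sum. The sketch below assumes that form purely for illustration: the weights, the half-life, and the function names are hypothetical, and a production ranker would typically learn the combination rather than hard-code it.

```python
# Hypothetical weights; they sum to 1 so the score stays in [0, 1]
# when every component is in [0, 1].
WEIGHTS = {
    "affinity": 0.4,
    "recency": 0.2,
    "content_type": 0.1,
    "engagement": 0.3,
}


def recency_score(age_hours, half_life_hours=6.0):
    """Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def feed_score(affinity, age_hours, content_type_pref, predicted_engagement):
    """Weighted sum of the four components (an assumed combination rule)."""
    components = {
        "affinity": affinity,
        "recency": recency_score(age_hours),
        "content_type": content_type_pref,
        "engagement": predicted_engagement,
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())


# A close friend's hour-old post versus a distant account's day-old post.
close_friend = feed_score(0.9, 1, 0.5, 0.6)
distant_account = feed_score(0.2, 24, 0.5, 0.6)
print(close_friend > distant_account)
```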
Media Processing Pipeline
Data Model
Post Schema
Key fields:
- Post ID: Unique post identifier (ULID)
- Media URLs: Array of media URLs at different resolutions
- Like count: Denormalized like count
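The fields described above could map onto a record like the one below. The field names (`post_id`, `media_urls`, `like_count`, `user_id`, `caption`) are hypothetical, since the original schema listing is not shown here, and `new_ulid` is a minimal hand-rolled generator following the ULID layout (48-bit millisecond timestamp, 80 random bits, Crockford base32) rather than a call into any particular library.

```python
import os
import time
from dataclasses import dataclass, field

CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None):
    """Return a 26-character ULID: 48-bit ms timestamp + 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):  # 26 chars x 5 bits covers the 128-bit value
        chars.append(CROCKFORD_ALPHABET[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


@dataclass
class Post:
    user_id: int                 # author; also the sharding key
    caption: str
    media_urls: list             # one URL per stored resolution
    post_id: str = field(default_factory=new_ulid)
    like_count: int = 0          # denormalized; the likes table is the source of truth


post = Post(
    user_id=42,
    caption="Sunset",
    media_urls=[
        "https://cdn.example.com/p/150.jpg",   # placeholder URLs
        "https://cdn.example.com/p/1080.jpg",
    ],
)
print(post.post_id, post.like_count)
```

Because the timestamp occupies the most significant bits, ULIDs created in different milliseconds sort chronologically as plain strings, which makes "newest posts first" a simple range scan on the identifier.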
Scaling Strategies
Horizontal Sharding
Sharding by User ID
Posts are sharded by user_id using consistent hashing. This ensures all posts from a single user are on the same shard, enabling efficient queries for user profiles.
The challenge with user-based sharding: celebrity users create hotspots. Instagram solves this with read replicas and caching for high-traffic users.
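To make the sharding scheme concrete, here is a small consistent hash ring that maps a user_id to a shard. It is a sketch under assumptions: the class name, shard names, SHA-256 as the hash, and 100 virtual nodes per shard are illustrative choices, not details from Instagram.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to shards; adding a shard only moves keys onto that shard."""

    def __init__(self, shards, virtual_nodes=100):
        self._ring = []  # sorted list of (hash, shard_name)
        for shard in shards:
            self.add_shard(shard, virtual_nodes)

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_shard(self, shard, virtual_nodes=100):
        # Virtual nodes spread each shard around the ring for better balance.
        for i in range(virtual_nodes):
            bisect.insort(self._ring, (self._hash(f"{shard}#{i}"), shard))

    def shard_for(self, user_id):
        point = self._hash(str(user_id))
        # First ring entry at or after the key's hash, wrapping at the end.
        index = bisect.bisect_left(self._ring, (point, ""))
        if index == len(self._ring):
            index = 0
        return self._ring[index][1]


ring = ConsistentHashRing(["shard-0", "shard-1", "shard-2"])
print(ring.shard_for(12345))  # every post by user 12345 lands on this shard
```

Hashing only spreads users evenly; it does not spread load evenly. A single celebrity still maps to one shard, which is why the replicas and caching described above are needed on top of the ring.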
Practice Exercises
- Feed Design: How would you handle the case where a user follows someone who has posted 1000 times in the last hour? How do you prevent feed flooding?
- Media Storage: Design a multi-resolution image storage system. How do you determine which resolutions to generate and store?
- Consistency: When a user likes a post, how do you ensure the like count is eventually consistent across all read replicas while avoiding race conditions?
- Optimization: How would you reduce the latency of feed generation for users who follow 5000+ accounts?
Key Takeaways:
- Instagram uses hybrid fan-out (write for normal users, read for celebrities)
- Feed ranking is a multi-stage ML pipeline
- Media processing requires async pipelines with multiple resolutions
- Sharding by user_id with special handling for celebrity hotspots
- CDN is critical for media delivery performance
What to Learn Next
- Design Twitter: Real-time feeds, fan-out, and timeline generation.
- Design Facebook: Social graph, news feed, and platform architecture.
- Design YouTube: Video streaming and transcoding at scale.
- Design Netflix: Content delivery and recommendation systems.
- CAP Theorem: Understanding consistency vs availability trade-offs.
- Caching Strategies: Write-through, write-back, and cache invalidation.