System Design Problems
Design a Video Streaming Platform
A video streaming platform lets users upload and watch videos; the platform transcodes each upload so it can be served with adaptive bitrate streaming. YouTube serves over 1 billion hours of video daily, which requires sophisticated transcoding pipelines and CDN infrastructure.
- Upload Pipeline: Accept large video files and process them asynchronously
- Transcoding: Convert videos to multiple resolutions and formats
- Adaptive Streaming: Serve the best quality based on network conditions
Video is the most bandwidth-intensive content on the internet. A single 4K video can be 100 GB before transcoding. The system must handle upload, processing, storage, and delivery efficiently.
Requirements
Functional Requirements
- Users can upload videos (up to 4 hours, 4K resolution)
- Transcode videos into multiple resolutions (240p to 4K)
- Adaptive bitrate streaming (HLS/DASH)
- Video playback with buffering and seeking
- Video metadata (title, description, tags)
- Like, comment, and subscribe features
- Content moderation and copyright detection
Non-Functional Requirements
- Upload: Support 500 hours of video uploaded per minute
- Playback: Start playing within 2 seconds
- Availability: 99.99% for video playback
- Latency: First frame < 2 seconds
- Scale: 1 billion hours of video watched daily
Video transcoding is CPU-intensive. Transcoding a 1-hour 4K video takes ~4 hours on a single CPU core. Parallel processing across multiple machines is essential.
Back-of-the-Envelope Estimation
Video Platform Capacity
Upload:
- 500 hours/minute × 60 min × 24 hours = 720,000 hours/day
- Average raw size: 10 GB/hour (1080p)
- Daily upload: 720,000 hours × 10 GB = 7.2 PB/day raw
Transcoding:
- 720,000 hours/day × 4 CPU-hours per video hour = 2,880,000 CPU-hours/day
- 1,000 single-core machines × 24 hours = 24,000 CPU-hours/day
- Need ~120,000 single-core machines, or proportionally fewer multi-core hosts
Storage:
- 10 resolutions × 500 KB/min (assumed average per resolution) = 5 MB per minute of video
- 720,000 hours/day × 60 min × 5 MB = 216 TB/day transcoded
- 1 year retention: ~79 PB
CDN:
- 1 billion hours/day × 5 Mbps average (2.25 GB per hour watched) = ~2.25 EB/day transferred
- CDN cost: ~$45M/day at an assumed $0.02/GB
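The arithmetic behind an estimate like this is easy to get wrong, so it can help to recompute it from the stated inputs. The following is a minimal sketch: the function and constant names are illustrative, the inputs are the assumptions listed above (500 hours uploaded per minute, 10 GB per raw hour, 4 CPU-hours per video hour, 1 billion hours watched at 5 Mbps), and the upload rate is treated as sustained around the clock.

```python
# Back-of-the-envelope capacity estimate. All inputs are assumptions taken
# from the estimation section; names are illustrative, not a real API.

UPLOAD_HOURS_PER_MINUTE = 500
RAW_GB_PER_HOUR = 10               # raw 1080p
CPU_HOURS_PER_VIDEO_HOUR = 4       # single CPU core
WATCH_HOURS_PER_DAY = 1_000_000_000
AVG_STREAM_MBPS = 5


def daily_upload_hours(hours_per_minute: float) -> float:
    """Hours of video uploaded per day at a sustained per-minute rate."""
    return hours_per_minute * 60 * 24


def raw_upload_pb_per_day(upload_hours: float, gb_per_hour: float) -> float:
    """Raw upload volume per day in decimal petabytes (1 PB = 1e6 GB)."""
    return upload_hours * gb_per_hour / 1_000_000


def cores_needed(upload_hours: float, cpu_hours_per_video_hour: float) -> float:
    """Cores required to keep up, each contributing 24 CPU-hours per day."""
    return upload_hours * cpu_hours_per_video_hour / 24


def cdn_egress_pb_per_day(watch_hours: float, mbps: float) -> float:
    """Data delivered per day in decimal petabytes."""
    bytes_per_hour = mbps * 1_000_000 / 8 * 3600
    return watch_hours * bytes_per_hour / 1e15


if __name__ == "__main__":
    hours = daily_upload_hours(UPLOAD_HOURS_PER_MINUTE)
    print("upload hours/day:", hours)
    print("raw PB/day:", raw_upload_pb_per_day(hours, RAW_GB_PER_HOUR))
    print("cores:", cores_needed(hours, CPU_HOURS_PER_VIDEO_HOUR))
    print("CDN PB/day:", cdn_egress_pb_per_day(WATCH_HOURS_PER_DAY, AVG_STREAM_MBPS))
```

With these inputs the sketch gives 720,000 upload hours per day, 7.2 PB of raw uploads, 120,000 cores, and about 2,250 PB (2.25 EB) of daily CDN transfer. Changing any single assumption shows quickly which input dominates the result.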
High-Level Architecture
Detailed Design
Video Upload Pipeline
Definition: Chunked Upload
Chunked upload splits large video files into smaller chunks (e.g., 5 MB) and uploads them in parallel. This enables resumable uploads, parallel transfer, and progress tracking.
- Client requests upload URL from API
- API returns pre-signed S3 URL for each chunk
- Client uploads chunks in parallel to S3
- Upload service tracks chunk completion
- On completion, trigger transcoding pipeline
Use S3 multipart upload for videos > 100 MB. Parts can be uploaded in parallel, and a failed part can be retried on its own without restarting the whole upload.
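The bookkeeping side of the flow above can be sketched without any storage calls. This is a minimal illustration, not the upload service itself: `plan_chunks` and `UploadTracker` are hypothetical names, the 5 MB chunk size is the example value from the definition, and issuing pre-signed URLs and the actual transfers are left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB, the example chunk size used in the text


@dataclass(frozen=True)
class Chunk:
    part_number: int  # 1-based, matching how multipart uploads number parts
    offset: int       # byte offset of the chunk within the file
    length: int       # chunk length in bytes (the last chunk may be shorter)


def plan_chunks(file_size: int, chunk_size: int = CHUNK_SIZE) -> list[Chunk]:
    """Split a file of file_size bytes into consecutive chunks."""
    if file_size <= 0 or chunk_size <= 0:
        raise ValueError("file_size and chunk_size must be positive")
    chunks = []
    offset = 0
    part_number = 1
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        chunks.append(Chunk(part_number, offset, length))
        offset += length
        part_number += 1
    return chunks


@dataclass
class UploadTracker:
    """Tracks which parts have arrived so an upload can resume or complete."""
    total_parts: int
    done: set[int] = field(default_factory=set)

    def mark_done(self, part_number: int) -> bool:
        """Record a finished part; return True once every part has arrived."""
        if not 1 <= part_number <= self.total_parts:
            raise ValueError(f"unknown part {part_number}")
        self.done.add(part_number)  # idempotent, so a retried part is harmless
        return self.is_complete()

    def missing(self) -> list[int]:
        """Parts the client still has to upload, e.g. after a dropped connection."""
        return [p for p in range(1, self.total_parts + 1) if p not in self.done]

    def is_complete(self) -> bool:
        return len(self.done) == self.total_parts


if __name__ == "__main__":
    chunks = plan_chunks(12 * 1024 * 1024)       # a 12 MB file
    tracker = UploadTracker(total_parts=len(chunks))
    tracker.mark_done(1)
    tracker.mark_done(3)                          # parts may finish out of order
    print(tracker.missing())                      # parts left to (re)upload
    if tracker.mark_done(2):
        print("all parts uploaded; trigger transcoding")
```

Because completion is tracked per part, a client that reconnects only needs the list from `missing()` to resume, and the transcoding trigger fires exactly when the last part lands, regardless of arrival order.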
Transcoding Pipeline
Transcoding Time
$$T_{\text{total}} = \frac{D \times R}{M \times G}$$
Here,
- $D$ = Raw video duration
- $R$ = Number of output resolutions (e.g., 10)
- $M$ = Number of parallel transcoding machines
- $G$ = GPU acceleration speedup (e.g., 10x)
Transcoding Time Calculation
1-hour 4K video, 10 output resolutions, 100 machines, 10x GPU speedup:
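T_total = (1 hour × 10) / (100 × 10) = 0.01 hours = 36 seconds
Under this idealized model, which assumes the work divides perfectly across machines with no splitting or merging overhead, the video is transcoded in 36 seconds using 100 machines.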
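The same calculation can be written as a small function to try other cluster sizes. This is a sketch of the idealized model used in the worked example (duration times resolutions, divided by machines times speedup); the function name is illustrative, and real pipelines add overhead for splitting, merging, and scheduling that it ignores.

```python
def transcoding_time_hours(
    duration_hours: float,
    resolutions: int,
    machines: int,
    gpu_speedup: float = 1.0,
) -> float:
    """Idealized wall-clock transcoding time in hours.

    Assumes the work divides evenly across machines with no overhead.
    """
    if machines <= 0 or gpu_speedup <= 0:
        raise ValueError("machines and gpu_speedup must be positive")
    return duration_hours * resolutions / (machines * gpu_speedup)


if __name__ == "__main__":
    hours = transcoding_time_hours(1, 10, machines=100, gpu_speedup=10)
    print(f"{hours * 3600:.0f} seconds")
```

For the example inputs this reproduces the 36-second figure; dropping the GPU speedup to 1 gives 0.1 hours, or 6 minutes, on the same 100 machines.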
Adaptive Bitrate Streaming (HLS)
Definition: HLS (HTTP Live Streaming)
HLS divides each video into short segments (10 seconds in this design) encoded at multiple bitrates. The client switches between bitrates as network conditions change, which keeps playback smooth and minimizes rebuffering.
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
stream_360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720
stream_720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
stream_1080p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=15000000,RESOLUTION=3840x2160
stream_4k.m3u8
The HLS manifest file (.m3u8) lists all available quality variants. The client player measures bandwidth and selects the optimal variant, switching dynamically as conditions change.
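Variant selection on the client can be sketched as follows. This is a simplified illustration rather than how any particular player works: the parser handles only the unquoted `BANDWIDTH` and `RESOLUTION` attributes shown in the manifest above, and the 0.8 safety factor (leaving headroom below the measured throughput) is an assumption. Real players also consider buffer level and recent throughput history.

```python
from __future__ import annotations

from dataclasses import dataclass

MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
stream_360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720
stream_720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
stream_1080p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=15000000,RESOLUTION=3840x2160
stream_4k.m3u8
"""


@dataclass(frozen=True)
class Variant:
    bandwidth: int   # peak bits per second declared in the manifest
    resolution: str
    uri: str


def parse_master_playlist(text: str) -> list[Variant]:
    """Parse variant streams; simplified, no quoted attribute values."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    variants = []
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:") and i + 1 < len(lines):
            attr_text = line.split(":", 1)[1]
            attrs = dict(pair.split("=", 1) for pair in attr_text.split(","))
            variants.append(
                Variant(int(attrs["BANDWIDTH"]), attrs.get("RESOLUTION", ""), lines[i + 1])
            )
    return sorted(variants, key=lambda v: v.bandwidth)


def select_variant(variants: list[Variant], measured_bps: float, safety: float = 0.8) -> Variant:
    """Pick the highest-bitrate variant that fits within the bandwidth budget.

    Falls back to the lowest variant when even that exceeds the budget.
    """
    if not variants:
        raise ValueError("no variants available")
    ordered = sorted(variants, key=lambda v: v.bandwidth)
    budget = measured_bps * safety
    best = ordered[0]
    for variant in ordered:
        if variant.bandwidth <= budget:
            best = variant
    return best


if __name__ == "__main__":
    variants = parse_master_playlist(MASTER_PLAYLIST)
    for throughput in (500_000, 3_000_000, 50_000_000):
        print(throughput, "->", select_variant(variants, throughput).uri)
```

A player would call `select_variant` again after each segment download with an updated throughput estimate, which is what produces the dynamic switching described above.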
Video Metadata
CREATE TABLE videos (
    id UUID PRIMARY KEY,
    user_id BIGINT NOT NULL,
    title VARCHAR(500),
    description TEXT,
    duration INT,                -- seconds
    status VARCHAR(20),          -- processing, ready, failed
    created_at TIMESTAMP,
    view_count BIGINT DEFAULT 0
);
CREATE TABLE video_variants (
    video_id UUID REFERENCES videos(id),
    resolution VARCHAR(10),      -- "1080p"
    bitrate INT,                 -- bps
    codec VARCHAR(10),           -- "h264", "h265"
    file_path TEXT,
    file_size BIGINT,
    -- codec is part of the key so one resolution can be stored in several codecs
    PRIMARY KEY (video_id, resolution, codec)
);
Practice Exercises
- Design: How would you implement a video seeking feature that allows jumping to any point in a 4-hour video within 500ms? What caching and indexing strategies are needed?
- Scale: If YouTube uploads 500 hours of video per minute, estimate the transcoding cluster size needed assuming each machine can transcode 1 hour of video in 2 hours.
- Optimization: How would you implement a video deduplication system that detects re-uploads of the same video with different encodings? What fingerprinting algorithm would you use?
- Cost: Design a tiered storage system that moves old videos from hot (SSD) to warm (HDD) to cold (Glacier) storage based on access patterns.
Key Takeaways:
- Chunked upload with pre-signed URLs enables parallel, resumable video uploads
- Transcoding is CPU-intensive; parallel processing across hundreds of machines is essential
- HLS/DASH adaptive bitrate streaming provides smooth playback across varying network conditions
- CDN delivery is the primary cost driver; edge caching can reduce origin load by 99% or more for popular content
- Metadata storage is separate from video storage; use relational DB for metadata, object storage for videos
What to Learn Next
-> CDNs: Edge caching and global video delivery.
-> Design Object Storage: Storing large video files with high durability.
-> Message Queues: Async transcoding pipeline with Kafka.
-> Load Balancing: Distributing transcoding work across machines.
-> Caching Strategies: Caching popular video segments at CDN edges.
-> Design Realtime Analytics: Real-time video viewing analytics and metrics.