Design YouTube

YouTube serves 2B+ monthly active users with 500+ hours of video uploaded every minute. This design covers video upload pipelines, adaptive streaming, CDN delivery, and recommendation systems.

Scale: 2B MAU, 500+ hours uploaded/minute, 1B hours watched/day
Storage: Exabytes of video content
Streaming: Adaptive bitrate with startup in under 2 seconds

YouTube's core challenge is the asymmetry between its two paths: uploads demand massive storage, while streaming demands massive bandwidth.

Requirements Clarification

Functional Requirements

Upload videos with metadata
Stream videos with adaptive quality
Search videos by title/tags
Like, comment, subscribe
View personalized recommendations
Live streaming
Short-form content (Shorts)

Non-Functional Requirements

Availability: 99.99% uptime
Latency: Video starts in < 2 seconds
Durability: Videos must never be lost
Consistency: Eventual consistency for counts
Scale: 1B hours of video watched daily

YouTube's key insight: Video transcoding is the bottleneck. A 1-hour 4K video can take 4+ hours to transcode. The system must process millions of videos in parallel.

Back-of-the-Envelope Estimation

Upload Rate

\text{Upload rate} = \frac{500 \text{ hours}}{60 \text{ seconds}} \approx 8.3 \text{ hours of video/second}

Here,

$500$ = Hours of video uploaded per minute
$8.3$ = Hours of video uploaded per second (about 8.3 videos/second if the average video runs 1 hour)

Storage Estimation

Per video storage:

Original: 10GB (1-hour 4K)
Transcoded variants: 10GB × 6 resolutions = 60GB
With 3 replicas: 60GB × 3 = 180GB per video

Daily storage, assuming a simplified average of one 1-hour video per second (86,400 videos/day): 180GB × 86,400 ≈ 15.5 PB/day

Annual storage: 15.5 PB × 365 ≈ 5.7 EB/year
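The storage arithmetic above can be replayed in a few lines, which makes it easy to vary the assumptions. This is a minimal sketch: the function name `storage_estimate` and its parameters are illustrative, decimal units are assumed (1 PB = 10^6 GB), and, as in the text, every transcoded variant is treated as being as large as the original, which is an upper bound.

```python
# Back-of-the-envelope storage estimate using the figures from the text.
GB_PER_PB = 1_000_000  # decimal units: 1 PB = 10^6 GB
PB_PER_EB = 1_000      # 1 EB = 10^3 PB


def storage_estimate(original_gb=10, variants=6, replicas=3, videos_per_day=86_400):
    """Return (GB per video, PB per day, EB per year) under the stated assumptions."""
    transcoded_gb = original_gb * variants   # each variant assumed as large as the original
    per_video_gb = transcoded_gb * replicas  # replicated copies of the transcoded set
    daily_pb = per_video_gb * videos_per_day / GB_PER_PB
    annual_eb = daily_pb * 365 / PB_PER_EB
    return per_video_gb, daily_pb, annual_eb


per_video_gb, daily_pb, annual_eb = storage_estimate()
print(f"{per_video_gb} GB/video, {daily_pb:.2f} PB/day, {annual_eb:.2f} EB/year")
```

Changing `videos_per_day` or `variants` shows how sensitive the annual figure is to the upload-rate and resolution-ladder assumptions.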

High-Level Architecture

Video Upload Pipeline

Transcoding Pipeline

Adaptive Bitrate Streaming

Videos are transcoded into multiple resolutions (240p to 8K) and bitrates. The player dynamically switches quality based on network conditions. This ensures smooth playback across varying bandwidth.

Transcoding Cost

\text{Time} = \frac{\text{Video Duration} \times \text{Complexity}}{\text{Parallel Workers}}

Here,

$Complexity$ = Encoding complexity factor (1x-10x)
$Parallel Workers$ = Number of transcoding workers

Transcoding Parallelism

A 1-hour 4K video at 10x complexity = 10 hours of CPU time.

With 100 parallel workers: 10 hours / 100 = 6 minutes

To process 86,400 videos/day (1 per second average) at 10 CPU-hours each: 86,400 × 10 CPU-hours / 24 hours = 36,000 CPU cores running continuously
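The two calculations in this section answer different questions: how long one video takes when its work is spread across workers, and how many cores must stay busy to keep up with the daily workload. A small sketch of both follows; the function names are illustrative, and it assumes the work splits perfectly across workers with no coordination overhead.

```python
def wall_clock_hours(duration_hours, complexity, workers):
    """Time = duration x complexity / workers, assuming perfectly parallel work."""
    return duration_hours * complexity / workers


def cores_needed(videos_per_day, cpu_hours_per_video):
    """Cores that must run around the clock to absorb one day's CPU-hours in 24 hours."""
    return videos_per_day * cpu_hours_per_video / 24


# One 1-hour video at 10x complexity on 100 workers, converted to minutes.
minutes = wall_clock_hours(1, 10, 100) * 60

# Sustained fleet size for 86,400 videos/day at 10 CPU-hours each.
fleet = cores_needed(86_400, 10)
print(f"{minutes:.0f} minutes per video, {fleet:,.0f} cores sustained")
```

The first figure is a latency number for a single upload; the second is a capacity number for the whole fleet, and it scales linearly with both upload volume and per-video cost.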

Adaptive Streaming

ABR Decision

\text{Quality}_n = \max \{ q \in Q : \text{bitrate}(q) \leq \text{bandwidth}_{\text{est}} \}

Here,

$Q$ = Available quality levels, ordered by bitrate
$\text{bandwidth}_{\text{est}}$ = Estimated available bandwidth
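The decision rule can be sketched as a small function: pick the highest-bitrate rendition that fits within the bandwidth estimate, and fall back to the lowest one when nothing fits. The bitrate ladder, the `Rendition` type, and the moving-average estimator below are illustrative assumptions, not YouTube's actual values or algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    bitrate_kbps: int


# Hypothetical ladder; the bitrates are illustrative only.
LADDER = [
    Rendition("240p", 400),
    Rendition("480p", 1_000),
    Rendition("720p", 2_500),
    Rendition("1080p", 5_000),
    Rendition("2160p", 16_000),
]


def estimate_bandwidth(samples_kbps, alpha=0.3):
    """Exponentially weighted moving average over recent throughput samples."""
    estimate = samples_kbps[0]
    for sample in samples_kbps[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


def pick_quality(ladder, bandwidth_est_kbps):
    """Highest-bitrate rendition with bitrate <= estimate, else the lowest rendition."""
    feasible = [r for r in ladder if r.bitrate_kbps <= bandwidth_est_kbps]
    if not feasible:
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(feasible, key=lambda r: r.bitrate_kbps)


choice = pick_quality(LADDER, estimate_bandwidth([3_200, 2_900, 3_100]))
print(choice.name)
```

Production players typically also account for buffer occupancy and avoid switching too often; this sketch implements only the throughput rule stated above.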

HLS vs DASH

Feature	HLS	DASH
Origin	Apple's HTTP Live Streaming	MPEG's Dynamic Adaptive Streaming over HTTP
Segment Format	MPEG-TS (fMP4/CMAF also supported)	fMP4/CMAF
Manifest	m3u8	MPD
DRM	FairPlay	Widevine/PlayReady
Typical Latency	6-30 seconds	2-10 seconds
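To make the manifest row concrete, here is a sketch that builds an HLS master playlist (m3u8) listing one variant stream per rendition. The `#EXTM3U` and `#EXT-X-STREAM-INF` tags with `BANDWIDTH` and `RESOLUTION` attributes are standard HLS; the rendition values and the `<name>/index.m3u8` URI layout are hypothetical.

```python
def master_playlist(renditions):
    """Build an HLS master playlist from (name, bandwidth_bps, width, height) tuples."""
    lines = ["#EXTM3U"]
    for name, bandwidth_bps, width, height in renditions:
        # Each variant is described by a tag line followed by the URI of its media playlist.
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth_bps},RESOLUTION={width}x{height}"
        )
        lines.append(f"{name}/index.m3u8")  # hypothetical path layout
    return "\n".join(lines) + "\n"


# Illustrative renditions; BANDWIDTH is expressed in bits per second.
playlist = master_playlist([
    ("360p", 800_000, 640, 360),
    ("720p", 2_500_000, 1280, 720),
    ("1080p", 5_000_000, 1920, 1080),
])
print(playlist)
```

The player fetches this file first, then requests segments from whichever variant playlist matches its current bandwidth estimate. A DASH MPD carries the same information as XML.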

CDN Architecture

Multi-CDN Strategy

YouTube uses a multi-CDN approach built around Google's own edge caches, Google Global Cache (GGC), which are deployed inside ISP networks. Popular content is cached at these edge nodes close to users, while long-tail content is served from regional data centers.

CDN Cache Hit Ratio

\text{Hit Ratio} = \frac{\text{Requests served from cache}}{\text{Total requests}}

Here,

$Hit Ratio$ = Fraction of requests served from edge
$Target$ = Above 95% for popular content
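The hit ratio depends heavily on how skewed video popularity is relative to cache size. The sketch below measures it for a small LRU edge cache under a Zipf-like request pattern. The LRU eviction policy, catalog size, cache size, and popularity weights are all assumptions chosen for illustration, not a description of how GGC nodes actually work.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used video."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.requests = 0

    def request(self, video_id):
        """Record one request; return True on a cache hit."""
        self.requests += 1
        if video_id in self.items:
            self.items.move_to_end(video_id)  # mark as most recently used
            self.hits += 1
            return True
        self.items[video_id] = True           # miss: fetch from origin and cache it
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the least recently used entry
        return False

    @property
    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0


def simulate(catalog_size=10_000, cache_size=500, requests=50_000, seed=42):
    """Hit ratio of an LRU cache when video at popularity rank r has weight 1/r."""
    rng = random.Random(seed)
    weights = [1 / rank for rank in range(1, catalog_size + 1)]
    cache = LRUCache(cache_size)
    for video_id in rng.choices(range(catalog_size), weights=weights, k=requests):
        cache.request(video_id)
    return cache.hit_ratio


print(f"hit ratio: {simulate():.2%}")
```

Raising `cache_size` or making the popularity distribution more skewed pushes the ratio up, which is why a small edge cache can still absorb most requests for popular content.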

Recommendation System

Collaborative Filtering + Deep Learning

YouTube's recommendation system uses a two-stage approach:

Candidate Generation: Retrieve hundreds of candidate videos using collaborative filtering
Ranking: Score candidates using a deep neural network

Recommendation Score

P(\text{watch}) = f(\text{user history}, \text{video features}, \text{context})

Here,

$user history$ = Watch history, likes, search queries
$video features$ = Title, tags, thumbnail, duration
$context$ = Time of day, device, location
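The two-stage shape can be sketched with toy data: a cheap retrieval step narrows the catalog to a few candidates, then a more expensive scorer orders them. Everything below is illustrative. The histories and features are made up, candidate generation uses simple user-overlap collaborative filtering, and `toy_score` is a hand-written stand-in for the deep ranking network that estimates P(watch).

```python
from collections import Counter

# Hypothetical watch histories: user -> set of watched video ids.
HISTORY = {
    "alice": {"v1", "v2", "v3"},
    "bob": {"v2", "v3", "v4"},
    "carol": {"v3", "v4", "v5"},
}

# Hypothetical per-video features used by the ranking stage.
FEATURES = {
    "v4": {"avg_watch_fraction": 0.6, "duration_s": 1800},
    "v5": {"avg_watch_fraction": 0.5, "duration_s": 300},
}


def generate_candidates(user, history, limit=100):
    """Stage 1: unseen videos from users with overlapping history, weighted by overlap."""
    seen = history[user]
    scores = Counter()
    for other, videos in history.items():
        if other == user:
            continue
        overlap = len(seen & videos)
        if overlap == 0:
            continue
        for video_id in videos - seen:
            scores[video_id] += overlap
    return [video_id for video_id, _ in scores.most_common(limit)]


def toy_score(features, context):
    """Stage 2 stand-in for P(watch); a real system would use a trained model."""
    score = features["avg_watch_fraction"]
    if context["device"] == "mobile" and features["duration_s"] > 1200:
        score *= 0.5  # assumed penalty for long videos on mobile
    return score


def rank(candidates, features, context, score_fn):
    """Order candidates by descending score."""
    return sorted(candidates, key=lambda v: score_fn(features[v], context), reverse=True)


candidates = generate_candidates("alice", HISTORY)
print(rank(candidates, FEATURES, {"device": "mobile"}, toy_score))
```

The split matters for cost: stage 1 must be cheap enough to scan the whole catalog, while stage 2 can afford richer features because it only sees a short candidate list.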

Data Model

Video Schema

\text{Video} = (video\_id, user\_id, title, description, tags[], duration, status, upload\_time)

Here,

$video\_id$ = Unique video identifier
$status$ = processing/ready/failed
$duration$ = Video duration in seconds
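The schema maps naturally onto a typed record. The sketch below mirrors the fields listed above; the default values and the rule that only a `processing` video may change status are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class VideoStatus(Enum):
    PROCESSING = "processing"
    READY = "ready"
    FAILED = "failed"


@dataclass
class Video:
    video_id: str
    user_id: str
    title: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    duration: int = 0  # seconds
    status: VideoStatus = VideoStatus.PROCESSING
    upload_time: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def finish_processing(self, succeeded: bool) -> None:
        """Move from processing to ready or failed (assumed one-way transition)."""
        if self.status is not VideoStatus.PROCESSING:
            raise ValueError(f"video {self.video_id} is already {self.status.value}")
        self.status = VideoStatus.READY if succeeded else VideoStatus.FAILED


video = Video(video_id="abc123", user_id="u42", title="Demo", tags=["demo"], duration=95)
video.finish_processing(succeeded=True)
print(video.status.value)
```

Keeping `status` as an enum rather than a free-form string lets the upload and transcoding services agree on the exact set of states.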

Practice Exercises

Upload Design: Design a resumable upload system that handles 10GB files. How do you handle network failures mid-upload?
Transcoding: How would you prioritize transcoding? Should a viral video be transcoded before an old personal video?
Live Streaming: Design a live streaming system with < 5 second latency. How does this differ from VOD?
Storage Optimization: Propose a storage tiering strategy for videos with decreasing popularity over time.

Key Takeaways:

Video upload uses chunked upload with resumable support
Transcoding is the bottleneck: requires massive parallelism
Adaptive bitrate streaming ensures smooth playback across networks
Multi-CDN with edge caching for low-latency delivery
Recommendation system uses two-stage ML pipeline

What to Learn Next

-> Design Netflix: Content delivery and adaptive streaming.
-> Design Instagram: Media delivery and feed generation.
-> Design Amazon: E-commerce at scale.
-> Design Google Search: Web-scale indexing and ranking.
-> Back Pressure: Managing load in streaming systems.
-> Caching Strategies: CDN and edge caching patterns.
