
Design Netflix

Netflix serves 250M+ subscribers across 190+ countries with 15,000+ titles. This design covers microservice architecture, content delivery with Open Connect CDN, and ML-powered personalization.

  • Scale — 250M+ subscribers, 1B+ hours watched/month
  • CDN — Open Connect appliances in 6000+ ISPs
  • Personalization — Saves $1B/year in reduced churn

Netflix is a masterclass in building a resilient microservice ecosystem with content delivery at planetary scale.

Requirements Clarification

Functional Requirements

  1. Browse and search content catalog
  2. Stream videos with adaptive quality
  3. Personalized recommendations
  4. Multiple profiles per account
  5. Download for offline viewing
  6. Parental controls
  7. Multiple device support

Non-Functional Requirements

  1. Availability: 99.99% uptime
  2. Latency: Video starts in < 2 seconds
  3. Throughput: Sustain traffic on the order of 15% of global downstream internet bandwidth
  4. Consistency: Eventual for recommendations
  5. Scale: 250M subscribers, 1B hours/month

Netflix's architecture is a microservice ecosystem with 1000+ services. The key insight: separate content delivery, metadata, recommendations, and billing into independent services.

Back-of-the-Envelope Estimation

Bandwidth Estimation

$$\text{Average BW} = \frac{10^9 \text{ h} \times 3600 \text{ s/h} \times 5 \text{ Mbps}}{30 \times 24 \times 3600 \text{ s}} \approx 6.94 \text{ Tbps}$$

Here,

  • 10^9 h = Hours watched per month (1B)
  • 5 Mbps = Average bitrate
  • 6.94 Tbps = Average throughput; peak-hour demand is higher

Storage Estimation

Content library:

  • 15,000 titles x 50 versions = 750,000 files
  • Average file size: 5GB
  • Total: 750,000 x 5GB = 3.75 PB

With 3x replication: 11.25 PB
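
The estimates in this section can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope helper: the function names and the decimal unit conversions (1 PB = 1,000,000 GB, 1 Tbps = 1,000,000 Mbps) are assumptions made for illustration, and the inputs are the figures quoted above. Hours watched are converted to seconds so that the units cancel against the seconds in a 30-day month.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_MONTH = 30 * 24 * SECONDS_PER_HOUR  # 30-day month, as in the estimate


def average_bandwidth_tbps(hours_per_month: float = 1e9,
                           bitrate_mbps: float = 5.0) -> float:
    """Average egress bandwidth in Tbps for a month of viewing."""
    seconds_streamed = hours_per_month * SECONDS_PER_HOUR
    total_megabits = seconds_streamed * bitrate_mbps
    average_mbps = total_megabits / SECONDS_PER_MONTH
    return average_mbps / 1_000_000  # Mbps -> Tbps (decimal units)


def library_storage_pb(titles: int = 15_000,
                       versions_per_title: int = 50,
                       avg_file_gb: float = 5.0,
                       replication: int = 1) -> float:
    """Content library size in PB, optionally including replication."""
    total_gb = titles * versions_per_title * avg_file_gb * replication
    return total_gb / 1_000_000  # GB -> PB (decimal units)


if __name__ == "__main__":
    print(f"Average bandwidth: {average_bandwidth_tbps():.2f} Tbps")
    print(f"Library size:      {library_storage_pb():.2f} PB")
    print(f"With 3x copies:    {library_storage_pb(replication=3):.2f} PB")
```

Changing the bitrate or the number of encoded versions per title shows how sensitive both estimates are to those two assumptions.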

High-Level Architecture

[Architecture diagram] Components shown:

  • Netflix Clients
  • Open Connect CDN
  • API Gateway (Zuul)
  • Services: User Svc, Catalog Svc, Recommend Svc, Streaming Svc, Billing Svc, Profile Svc
  • Event Bus (Kafka)
  • Data stores: Cassandra, MySQL, Elasticsearch, S3 (Content), Redis (Cache)

Open Connect CDN

Definition: Open Connect Architecture

Netflix deploys custom appliances (OCAs) inside 6000+ ISP networks globally. Each OCA stores popular content and serves it directly to end users. The top 1000 titles account for ~90% of traffic.

Netflix pre-positions content on OCAs based on popularity predictions. OCAs are refreshed during off-peak hours. This reduces internet backbone traffic and improves streaming quality.
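
One way to picture popularity-based pre-positioning is as a capacity-constrained selection problem. The sketch below is a simplified greedy heuristic, not Netflix's actual fill algorithm: the `TitleFile` type, the `predicted_views` field, and the "views per GB" ranking are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TitleFile:
    title_id: str
    size_gb: float
    predicted_views: float  # hypothetical popularity forecast for this OCA's region


def plan_oca_fill(catalog: List[TitleFile], capacity_gb: float) -> List[TitleFile]:
    """Greedily pick files with the most predicted views per GB until the disk is full."""
    ranked = sorted(catalog,
                    key=lambda f: f.predicted_views / f.size_gb,
                    reverse=True)
    chosen: List[TitleFile] = []
    used_gb = 0.0
    for f in ranked:
        if used_gb + f.size_gb <= capacity_gb:
            chosen.append(f)
            used_gb += f.size_gb
    return chosen


def expected_hit_ratio(chosen: List[TitleFile], catalog: List[TitleFile]) -> float:
    """Share of predicted views that the OCA could serve locally."""
    total = sum(f.predicted_views for f in catalog)
    return sum(f.predicted_views for f in chosen) / total if total else 0.0


if __name__ == "__main__":
    catalog = [
        TitleFile("A", size_gb=5, predicted_views=1000),
        TitleFile("B", size_gb=5, predicted_views=100),
        TitleFile("C", size_gb=2, predicted_views=300),
    ]
    plan = plan_oca_fill(catalog, capacity_gb=8)
    print([f.title_id for f in plan], expected_hit_ratio(plan, catalog))
```

A plan like this would be computed ahead of time and applied during the off-peak refresh window, so that the fill traffic itself does not compete with viewers.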

Recommendation System

Definition: Two-Tier Recommendation

Netflix uses a two-tier approach: (1) Row Generation determines which rows to show ("Trending Now", "Because you watched X"), and (2) Ranking orders the items within each row by predicted engagement.

Engagement Prediction

$$P(\text{watch}) = \sigma(W \cdot f(\text{user}, \text{title}, \text{context}))$$

Here,

  • P(watch) = Probability of watching
  • σ = Sigmoid activation
  • W = Learned weight vector
  • f = Feature embedding function
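
The ranking tier can be sketched as a sigmoid applied to a weighted sum of features, followed by a sort within each row. This is a toy illustration only: the feature vectors stand in for the output of f(user, title, context), the weights are made up, and the row generation tier is assumed to have already produced the candidate rows.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def p_watch(weights: Sequence[float], features: Sequence[float]) -> float:
    """P(watch) = sigmoid(W . f), where `features` plays the role of f(user, title, context)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def rank_row(candidates: Dict[str, Sequence[float]],
             weights: Sequence[float]) -> List[str]:
    """Order the titles in one row by predicted engagement, highest first."""
    return sorted(candidates,
                  key=lambda title: p_watch(weights, candidates[title]),
                  reverse=True)


def build_page(rows: Dict[str, Dict[str, Sequence[float]]],
               weights: Sequence[float]) -> Dict[str, List[str]]:
    """Apply the ranking tier to every row chosen by the row generation tier."""
    return {row_name: rank_row(cands, weights) for row_name, cands in rows.items()}


if __name__ == "__main__":
    weights = [1.0, -1.0]  # hypothetical learned weights
    rows = {"Trending Now": {"A": [2.0, 0.0], "B": [0.0, 2.0], "C": [1.0, 1.0]}}
    print(build_page(rows, weights))
```

Because the sigmoid is monotonic, sorting by the raw weighted sum would give the same order; the probability matters when scores are compared across rows or against a threshold.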

Chaos Engineering

Definition: Chaos Monkey

Netflix pioneered chaos engineering with Chaos Monkey, which randomly terminates production instances. The Simian Army includes Latency Monkey, Conformity Monkey, and Security Monkey for comprehensive resilience testing.

Netflix runs a "Day 2" test every morning simulating region failure. This ensures their multi-region architecture works under real failure conditions.
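
The core idea of random instance termination can be sketched against an in-memory fleet. This is not the real Chaos Monkey code or its API: the function names, the `terminate` callback, the opt-out set, and the rule of sparing services with a single instance are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple


def pick_victim(fleet: Dict[str, List[str]],
                rng: random.Random,
                opted_out: FrozenSet[str] = frozenset()) -> Optional[Tuple[str, str]]:
    """Pick one random instance from one random eligible service.

    Assumption: services that opted out or have a single instance are spared.
    """
    eligible = {svc: ids for svc, ids in fleet.items()
                if svc not in opted_out and len(ids) > 1}
    if not eligible:
        return None
    service = rng.choice(sorted(eligible))
    return service, rng.choice(eligible[service])


def run_round(fleet: Dict[str, List[str]],
              terminate: Callable[[str, str], None],
              rng: random.Random,
              opted_out: FrozenSet[str] = frozenset()) -> Optional[Tuple[str, str]]:
    """Terminate at most one instance and remove it from the fleet view."""
    victim = pick_victim(fleet, rng, opted_out)
    if victim is not None:
        service, instance = victim
        terminate(service, instance)  # in a real system this would call the cloud API
        fleet[service].remove(instance)
    return victim


if __name__ == "__main__":
    fleet = {"catalog": ["c1", "c2", "c3"], "billing": ["b1"]}
    victim = run_round(fleet,
                       terminate=lambda s, i: print(f"terminating {s}/{i}"),
                       rng=random.Random())
    print(victim, fleet)
```

The point of such a tool is less the termination itself than what it forces: every service has to tolerate losing an instance at any time, so redundancy and automatic recovery get exercised continuously instead of only during real incidents.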

Data Model

Content Schema

$$\text{Title} = (title\_id, name, type, genres[], rating, cast[], release\_year)$$

Here,

  • title_id = Unique title identifier
  • type = movie/series/documentary
  • genres[] = Array of genre tags
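
As a concrete illustration, the Title tuple could be modeled as a typed record. The field types below are assumptions, since the schema above lists only field names: `rating` is treated as a maturity-rating string and `title_id` as an opaque string. The `to_record` helper is likewise hypothetical.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict, List


class TitleType(Enum):
    MOVIE = "movie"
    SERIES = "series"
    DOCUMENTARY = "documentary"


@dataclass
class Title:
    title_id: str        # unique title identifier
    name: str
    type: TitleType      # movie / series / documentary
    genres: List[str]    # array of genre tags
    rating: str          # assumed to be a maturity rating such as "TV-14"
    cast: List[str]
    release_year: int


def to_record(title: Title) -> Dict[str, Any]:
    """Flatten a Title into a plain dict suitable for a document or wide-column store."""
    record = asdict(title)
    record["type"] = title.type.value  # store the enum as its string value
    return record


if __name__ == "__main__":
    example = Title(
        title_id="t-001",
        name="Example Series",
        type=TitleType.SERIES,
        genres=["drama", "thriller"],
        rating="TV-14",
        cast=["Actor One", "Actor Two"],
        release_year=2020,
    )
    print(to_record(example))
```

Keeping `genres` and `cast` as arrays on the title record favors read-heavy catalog lookups; a search index would typically be built from the same record.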

Practice Exercises

  1. CDN Design: How does Netflix decide which content to pre-position on each OCA?
  2. Recommendations: How would you handle the cold-start problem for new users?
  3. Resilience: Design a graceful degradation strategy when the recommendation service is down.
  4. Streaming: Compare Netflix's Open Connect with YouTube's multi-CDN approach.

Key Takeaways:

  • Netflix uses 1000+ microservices with Zuul API gateway
  • Open Connect CDN deploys appliances inside ISPs for low-latency delivery
  • Recommendation system uses two-tier row generation + ranking
  • Chaos engineering ensures resilience through deliberate failure injection
  • Multi-region active-active deployment with automatic failover

What to Learn Next

-> Design YouTube: Video streaming and transcoding pipelines.

-> Design Uber: Real-time location and dispatch systems.

-> Circuit Breaker: Preventing cascade failures.

-> Sidecar Pattern: Service mesh and sidecar proxies.

-> Saga Pattern: Distributed transactions.

-> Back Pressure: Load management in streaming systems.
