πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Design a Recommendation System

System Design ProblemsMachine Learning Infrastructure🟒 Free Lesson

Advertisement

System Design Problems

Design a Recommendation System

A recommendation system predicts user preferences and suggests relevant items. Netflix saves $1B/year through its recommendation engine, which drives 80% of content watched. The system combines collaborative filtering, content-based filtering, and deep learning.

  • Collaborative Filtering β€” "Users like you also liked..."
  • Content-Based Filtering β€” "Items similar to ones you've liked..."
  • Hybrid Approach β€” Combine signals for better accuracy

The challenge is balancing relevance (what users want) with diversity (what users don't know they want) at massive scale with sub-second latency.

Requirements

Functional Requirements

  • "Because you watched X" recommendations (item-to-item)
  • "Users like you also liked" (user-to-user)
  • Trending and popular items
  • Personalized homepage feed
  • Real-time updates based on user actions
  • A/B testing framework for algorithms

Non-Functional Requirements

  • Latency: Recommendations in < 200ms
  • Scale: 200M users, 100M items
  • Freshness: User actions reflected within minutes
  • Accuracy: Click-through rate > 5%
  • Diversity: Avoid filter bubbles; introduce serendipity

Recommendation systems face the cold-start problem: new users have no history, and new items have no interactions. The system must handle both gracefully using content-based features.

Back-of-the-Envelope Estimation

Recommendation System Capacity

  • 200M users Γ— 100 interactions/day = 20B interactions/day
  • Interaction storage: 20B Γ— 100 bytes = 2 TB/day
  • Item catalog: 100M items Γ— 1 KB features = 100 GB
  • User profiles: 200M Γ— 10 KB features = 2 TB
  • Recommendation cache: 200M Γ— 50 recommendations Γ— 100 bytes = 1 TB

Recommendation Algorithms

Collaborative FilteringUser-Item Interaction MatrixMatrix Factorization (SVD)Nearest Neighbor SearchNo content features neededContent-BasedItem Feature VectorsCosine SimilarityTF-IDF / EmbeddingsHandles cold-startHybrid (Deep Learning)Neural Collaborative FilteringTwo-Tower ModelTransformer + AttentionBest accuracy βœ“

Collaborative Filtering

DfCollaborative Filtering

Collaborative filtering recommends items based on user behavior patterns. It assumes users who agreed in the past will agree in the future. The user-item interaction matrix is factorized into latent user and item embeddings.

Matrix Factorization

r^ui=puTΓ—qi+bu+bi\hat{r}_{ui} = p_u^T \times q_i + b_u + b_i

Here,

  • hatruihat{r}_{ui}=Predicted rating for user u on item i
  • pup_u=User latent factor vector
  • qiq_i=Item latent factor vector
  • bub_u=User bias
  • bib_i=Item bias

Matrix Factorization Intuition

User vector: [0.8, -0.2, 0.5, 0.1] (likes action, dislikes romance, likes sci-fi, neutral comedy) Item vector: [0.7, 0.1, 0.6, -0.3] (action-heavy, slight romance, sci-fi, anti-comedy)

Predicted rating = dot product = 0.56 - 0.02 + 0.30 - 0.03 = 0.81 (high match)

Content-Based Filtering

DfContent-Based Filtering

Content-based filtering recommends items similar to those a user has previously liked, based on item features (genre, director, keywords). It uses cosine similarity between item feature vectors.

Cosine Similarity

similarity=Aβ‹…B∣∣Aβˆ£βˆ£Γ—βˆ£βˆ£B∣∣similarity = \frac{A \cdot B}{||A|| \times ||B||}

Here,

  • AA=Feature vector of item A
  • BB=Feature vector of item B

Content-Based Similarity

Movie A features: [action:0.9, sci-fi:0.8, comedy:0.1] Movie B features: [action:0.7, sci-fi:0.9, comedy:0.0]

similarity = (0.9Γ—0.7 + 0.8Γ—0.9 + 0.1Γ—0.0) / (√(0.81+0.64+0.01) Γ— √(0.49+0.81+0)) = (0.63 + 0.72) / (1.21 Γ— 1.14) = 1.35 / 1.38 = 0.978 (very similar)

Two-Tower Deep Learning Model

DfTwo-Tower Model

A two-tower model encodes users and items into separate embedding spaces using neural networks. At serving time, pre-computed item embeddings are retrieved via approximate nearest neighbor (ANN) search, enabling sub-second recommendations.

User TowerUser FeaturesHistory EmbedMLP Layers128-dim EmbeddingItem TowerItem FeaturesContent EmbedMLP Layers128-dim EmbeddingCosine Similarity(dot product for ranking)

High-Level Architecture

User ActionsKafkaRecommendation SvcCandidate GenerationRanking ModelRe-rankingDiversity FilterA/B Test AssignItem Embeddings(ANN Index)User Profiles(Redis)Recommendation System Architecture

Three-Phase Pipeline

DfRecommendation Pipeline

The recommendation pipeline has three phases: candidate generation (retrieve thousands of candidates), ranking (score candidates with ML model), and re-ranking (apply business rules, diversity, and freshness).

PhaseCandidatesLatencyAlgorithm
Candidate Generation1000+10msANN search, collaborative filtering
Ranking10050msDeep learning scoring
Re-ranking1010msBusiness rules, diversity

Use approximate nearest neighbor (ANN) libraries like FAISS or ScaNN for sub-millisecond candidate retrieval from millions of item embeddings.

Handling Cold Start

For new users and items:

Cold Start Strategy

score=Ξ±β‹…collaborative+Ξ²β‹…content_based+Ξ³β‹…popularityscore = \alpha \cdot collaborative + \beta \cdot content\_based + \gamma \cdot popularity

Here,

  • alphaalpha=Weight for collaborative signal (0 if no history)
  • eta=Weight for content-based signal (higher for cold start)
  • gammagamma=Weight for popularity (fallback for new items)

Cold Start Weighting

New user (no history): Ξ±=0, Ξ²=0.7, Ξ³=0.3 β†’ content-based + popular Established user: Ξ±=0.5, Ξ²=0.3, Ξ³=0.2 β†’ collaborative + content + popular

Practice Exercises

  1. Algorithm: Implement cosine similarity between two item feature vectors. What is the time complexity for finding the top-K most similar items?

  2. Scale: If you have 100M items with 128-dimensional embeddings, estimate the memory and query time for an FAISS index.

  3. Diversity: Design a re-ranking algorithm that ensures the top 10 recommendations don't all belong to the same genre. How do you balance relevance vs. diversity?

  4. Evaluation: Design an A/B testing framework for recommendation algorithms. What metrics would you track and how long should experiments run?

Key Takeaways:

  • Collaborative filtering uses user behavior patterns; content-based uses item features
  • Two-tower deep learning models pre-compute embeddings for sub-second ANN retrieval
  • Three-phase pipeline: candidate generation β†’ ranking β†’ re-ranking
  • Cold start is handled by shifting weights toward content-based and popularity signals
  • FAISS/ScaNN enable sub-millisecond nearest neighbor search over millions of embeddings

What to Learn Next

-> Design Search Engine Inverted indices and relevance ranking.

-> Design News Feed Personalized content ranking and fan-out.

-> Design Realtime Analytics Real-time user interaction tracking.

-> Design Search Autocomplete Trie-based typeahead with popularity ranking.

-> Databases Storing user profiles and item features.

-> Caching Strategies Caching pre-computed recommendations.

⭐

Premium Content

Design a Recommendation System

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert System Design Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement