Recommendation Systems — Collaborative and Content-Based Filtering

Recommendation Systems — The Algorithm Behind 'You Might Also Like'

Recommendation systems predict what users will like based on past behavior, powering billions of dollars in e-commerce and content revenue.

  • Collaborative Filtering — finds patterns in user behavior to recommend items liked by similar users
  • Content-Based Filtering — recommends items similar to what a user has already enjoyed using item features
  • Matrix Factorization — decomposes sparse user-item matrices into dense latent factor representations

"Our head is a recommendation engine." — Jeff Bezos

Mathematical Foundations

Cosine Similarity

$$\cos(\mathbf{u}, \mathbf{v}) = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \cdot \|\mathbf{v}\|}$$
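As a rough illustration of the cosine similarity formula, the sketch below computes it with NumPy. The function name `cosine_similarity` and the zero-vector fallback are choices made for this example, and the three vectors reuse the illustrative movie feature values from the lesson's comparison figure.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; returns 0.0 if either vector is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Illustrative feature vectors (values taken from the lesson's figure)
movie_a = [0.9, 0.2, 0.1]  # Action, Sci-Fi
movie_b = [0.8, 0.7, 0.1]  # Action, Thriller
movie_c = [0.1, 0.1, 0.9]  # Romance, Drama

sim_ab = cosine_similarity(movie_a, movie_b)  # roughly 0.88
sim_ac = cosine_similarity(movie_a, movie_c)  # roughly 0.24
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

The two action movies score much closer to 1 than the action/romance pair, which is the behavior the formula is meant to capture.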

Matrix Factorization Objective

$$\min_{P, Q} \sum_{(i,j) \in \text{observed}} (r_{ij} - \mathbf{p}_i^T \mathbf{q}_j)^2 + \lambda(\|P\|^2 + \|Q\|^2)$$
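To make the objective concrete, here is a minimal sketch that evaluates it for fixed factor matrices. It assumes missing ratings are stored as NaN and that the norms are squared Frobenius norms (sum of squared entries); the name `mf_objective` and the tiny matrices are hypothetical.

```python
import numpy as np

def mf_objective(R, P, Q, lam):
    """Squared error over observed entries plus L2 penalty on both factor matrices."""
    observed = ~np.isnan(R)                 # mask of known ratings
    errors = (R - P @ Q.T)[observed]        # r_ij - p_i^T q_j for observed (i, j)
    penalty = np.sum(P ** 2) + np.sum(Q ** 2)
    return float(np.sum(errors ** 2) + lam * penalty)

# Two users, two items, one latent factor (toy values)
R = np.array([[5.0, np.nan],
              [np.nan, 1.0]])
P = np.array([[1.0], [2.0]])   # row i is p_i
Q = np.array([[4.0], [1.0]])   # row j is q_j

loss = mf_objective(R, P, Q, lam=0.1)
print(loss)  # errors are 1 and -1, penalty is 22, so 2 + 0.1 * 22 = 4.2
```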

Precision@K

$$\text{Precision@K} = \frac{|\{\text{relevant items}\} \cap \{\text{top K recommended}\}|}{K}$$
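The definition translates almost directly into code. The sketch below assumes recommendations arrive as a ranked list (best first) with no duplicates and that relevance is a simple set of item IDs; the function name and item labels are made up for illustration.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in the relevant set."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    return hits / k

recommended = ["A", "B", "C", "D", "E"]  # ranked list, best first
relevant = {"B", "D", "F"}               # items the user actually liked

p_at_3 = precision_at_k(recommended, relevant, k=3)  # only B is in the top 3
p_at_5 = precision_at_k(recommended, relevant, k=5)  # B and D are in the top 5
print(p_at_3, p_at_5)
```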

NDCG@K

$$\text{NDCG@K} = \frac{\text{DCG@K}}{\text{IDCG@K}}, \quad \text{DCG@K} = \sum_{i=1}^{K} \frac{2^{r_i} - 1}{\log_2(i+1)}$$
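A small sketch of the NDCG formula follows. It takes the graded relevance of each recommended item in ranked order, and it assumes the ideal ranking (IDCG) can be obtained by sorting those same relevance values, which holds only when every relevant item appears somewhere in the list. Function names are illustrative.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with the exponential gain 2^r - 1."""
    return sum((2 ** r - 1) / math.log2(i + 1)
               for i, r in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

ranked_relevances = [3, 2, 0, 1]  # relevance of the items at ranks 1..4
score = ndcg_at_k(ranked_relevances, k=4)
print(f"NDCG@4 = {score:.4f}")  # close to 1: only the last two items are out of order
```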

Types

Definition: Content-Based Filtering

Content-based filtering recommends items similar to what a user has liked, using item features.

  • Advantage: No cold-start for new items
  • Limitation: Filter bubble problem
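One simple way to sketch content-based filtering is to average the feature vectors of the items a user liked into a profile, then rank unseen items by cosine similarity to that profile. This is only one of several possible designs; the function name, the profile-averaging step, and the feature values (borrowed from the lesson's figure) are assumptions for illustration.

```python
import numpy as np

# Illustrative item feature vectors (values from the lesson's figure)
item_features = {
    "Movie A": np.array([0.9, 0.2, 0.1]),
    "Movie B": np.array([0.8, 0.7, 0.1]),
    "Movie C": np.array([0.1, 0.1, 0.9]),
}

def recommend_content_based(liked, item_features, top_n=1):
    """Rank items the user has not liked yet by similarity to the user's profile."""
    profile = np.mean([item_features[name] for name in liked], axis=0)
    scores = {}
    for name, vec in item_features.items():
        if name in liked:
            continue  # do not recommend what the user already has
        denom = np.linalg.norm(profile) * np.linalg.norm(vec)
        scores[name] = float(profile @ vec / denom) if denom > 0 else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

recs = recommend_content_based(["Movie A"], item_features)
print(recs)  # Movie B shares the action profile, so it ranks first
```

Because the ranking depends only on item features, a brand-new item can be recommended as soon as its features are known, and the same mechanism explains the filter bubble: the user keeps seeing items close to the existing profile.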

Definition: Collaborative Filtering

Collaborative filtering recommends based on similar users, using user-item interactions.

  • Advantage: No feature engineering needed
  • Limitation: Cold-start problem for new users/items

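Definition: Hybrid Approaches

Hybrid systems combine content-based and collaborative signals so that each offsets the other's weakness: item features cover new items that have no interactions yet, while interaction data surfaces items beyond a user's existing tastes. Most production systems use hybrid methods.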
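The lesson does not say how the two approaches are combined, so the following is just one common option: a weighted blend of per-item scores. It assumes both scorers output values on a comparable scale (here 0 to 1); the function name, the `alpha` weight, and the scores are hypothetical.

```python
def hybrid_scores(cf_scores, cb_scores, alpha=0.5):
    """Blend collaborative (cf) and content-based (cb) scores; missing scores count as 0."""
    items = set(cf_scores) | set(cb_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0) + (1 - alpha) * cb_scores.get(item, 0.0)
        for item in items
    }

cf_scores = {"A": 0.9, "B": 0.4}   # from a collaborative model (made-up values)
cb_scores = {"B": 0.8, "C": 0.6}   # from a content-based model (made-up values)

blended = hybrid_scores(cf_scores, cb_scores, alpha=0.5)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)  # B benefits from both signals; C is reachable through content alone
```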

Collaborative vs Content-Based Filtering

[Figure: Collaborative vs Content-Based Filtering]

Collaborative Filtering: "Users like you also liked..."

  • Example: in a user-item matrix over items A-D, User 1 and User 3 have similar histories, so item D is recommended to User 3
  • Uses: User-item interaction matrix
  • Problem: Cold-start for new users

Content-Based Filtering: "Items similar to what you liked..."

  • Example feature vectors: Movie A (Action, Sci-Fi) = [0.9, 0.2, 0.1]; Movie B (Action, Thriller) = [0.8, 0.7, 0.1]; Movie C (Romance, Drama) = [0.1, 0.1, 0.9]
  • Similar features → high similarity
  • Uses: Item metadata (genre, tags)
  • Problem: Filter bubble, no discovery

Collaborative Filtering

Definition: User-Based Collaborative Filtering

"Users similar to you also liked..."

  • Similarity: Cosine similarity between user vectors
  • Prediction: Weighted average of similar users' ratings
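The two steps above can be sketched in a few lines of NumPy. As a simplification, unrated entries are stored as 0 and included in the cosine computation; real systems often mean-center ratings or compare only co-rated items. The rating matrix and function names are made up for illustration.

```python
import numpy as np

# Rows are users, columns are items; 0 means "not rated" (toy data)
R = np.array([
    [5, 4, 0, 5],
    [1, 0, 5, 1],
    [5, 4, 0, 0],
], dtype=float)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def predict_user_based(R, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num, den = 0.0, 0.0
    for other in range(R.shape[0]):
        if other != user and R[other, item] > 0:
            sim = cosine(R[user], R[other])
            num += sim * R[other, item]
            den += sim
    return num / den if den > 0 else None  # None when no neighbor rated the item

pred = predict_user_based(R, user=2, item=3)
print(f"{pred:.2f}")  # pulled toward 5 because user 0 is the closer neighbor
```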

Definition: Item-Based Collaborative Filtering

"Items similar to what you liked..."

  • Similarity: Cosine similarity between item vectors
  • Prediction: Weighted average of similar items' ratings
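Item-based filtering applies the same idea to the columns of the rating matrix. In this sketch, item vectors are the columns, unrated entries are stored as 0 (a simplification), and the prediction averages the target user's own ratings weighted by item similarity. The data and function names are illustrative.

```python
import numpy as np

# Rows are users, columns are items; 0 means "not rated" (toy data)
R = np.array([
    [5, 4, 0, 5],
    [1, 0, 5, 1],
    [5, 4, 0, 0],
], dtype=float)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def predict_item_based(R, user, item):
    """Average of the user's own ratings, weighted by similarity to the target item."""
    num, den = 0.0, 0.0
    for other in range(R.shape[1]):
        if other != item and R[user, other] > 0:
            sim = cosine(R[:, item], R[:, other])
            num += sim * R[user, other]
            den += sim
    return num / den if den > 0 else None  # None when the user has rated nothing similar

pred = predict_item_based(R, user=2, item=3)
print(f"{pred:.2f}")  # lies between the user's ratings of 5 and 4 for items 0 and 1
```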

Matrix Factorization Diagram

[Figure: Matrix Factorization, Decomposing the User-Item Matrix]

User-item matrix R (5 users × 3 items, ? = missing rating):

    5 3 ?
    4 ? 2
    ? 1 5
    2 ? 4
    ? 4 ?

User factors P (5 × k) × item factors Q^T (k × 3) = predicted ratings R̂ (5 × 3)

$$R \approx P Q^T, \quad \min \sum (r_{ij} - \mathbf{p}_i^T \mathbf{q}_j)^2 + \lambda(\|\mathbf{p}_i\|^2 + \|\mathbf{q}_j\|^2)$$

k = number of latent factors (typically 50-200); optimized with SVD or ALS

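Definition: Matrix Factorization

Matrix factorization decomposes the user-item matrix into low-dimensional latent factors for users and items; a predicted rating is the dot product of a user's and an item's factor vectors.

  • Methods: SVD-style factorization, alternating least squares (ALS), or neural networks
  • The Netflix Prize-winning solution relied heavily on matrix factorization as part of a larger ensemble
  • Handles sparse matrices well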
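For intuition about what a factorization library does internally, here is a minimal sketch that fits the factors with stochastic gradient descent on the observed entries only. SGD is swapped in here because it is short to write; it is not the only option (ALS is another). The 5 × 3 matrix mirrors the lesson's diagram, and the hyperparameters (`k`, `lr`, `lam`, `epochs`) are arbitrary choices for a toy problem.

```python
import numpy as np

def factorize(R, k=2, lr=0.01, lam=0.02, epochs=2000, seed=0):
    """Fit P (users x k) and Q (items x k) so that P @ Q.T matches observed ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    observed = [(i, j) for i in range(n_users) for j in range(n_items)
                if not np.isnan(R[i, j])]
    for _ in range(epochs):
        for i, j in observed:
            err = R[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            # Gradient step on the regularized squared error (constant 2 folded into lr)
            P[i] += lr * (err * Q[j] - lam * P[i])
            Q[j] += lr * (err * p_old - lam * Q[j])
    return P, Q

nan = np.nan
R = np.array([[5, 3, nan],
              [4, nan, 2],
              [nan, 1, 5],
              [2, nan, 4],
              [nan, 4, nan]])

P, Q = factorize(R)
R_hat = P @ Q.T                      # dense predictions, including the missing cells
mask = ~np.isnan(R)
train_mae = float(np.mean(np.abs(R_hat[mask] - R[mask])))
print(np.round(R_hat, 2), train_mae)
```

Only the observed cells contribute to the updates, which is why the approach copes with sparse matrices: the missing cells are never treated as zeros, and they are filled in afterwards by `P @ Q.T`.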

Example: SVD for Recommendations

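from surprise import SVD, Dataset, Reader
from surprise.model_selection import cross_validate

# ratings_df is assumed to be a pandas DataFrame with one row per rating
# and the columns userId, itemId, and rating (on a 1-5 scale).
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(ratings_df[['userId', 'itemId', 'rating']], reader)

# Train SVD with 50 latent factors and report 5-fold cross-validated error
model = SVD(n_factors=50, random_state=42)
results = cross_validate(model, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)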

Cold-Start Problem

New User Cold-Start

  • No interaction history
  • Cannot find similar users
  • Collaborative filtering fails
  • Solution: Use content-based filtering, or ask for preferences in an onboarding survey

New Item Cold-Start

  • No ratings yet
  • Cannot find similar items from interactions
  • Content-based filtering works
  • Solution: Use item features (metadata, description text) for similarity

System Cold-Start

  • Brand-new system
  • No data at all
  • Needs bootstrapping
  • Solution: Start with popularity-based recommendations, then transition to collaborative filtering as data grows
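One way to act on these cold-start cases is a small dispatcher that picks a recommendation strategy from how much data is available. The sketch below is purely illustrative: the function name, the strategy labels, and both thresholds are hypothetical and would need tuning for any real system.

```python
def pick_strategy(user_history_size, total_interactions,
                  min_user_history=5, min_total=1000):
    """Choose a fallback based on how much interaction data exists (thresholds are assumptions)."""
    if total_interactions < min_total:
        # System cold-start: too little data overall for collaborative filtering
        return "popularity"
    if user_history_size < min_user_history:
        # New-user cold-start: rely on item features or stated preferences
        return "content_based_or_onboarding_survey"
    return "collaborative"

print(pick_strategy(user_history_size=0, total_interactions=50))
print(pick_strategy(user_history_size=2, total_interactions=50_000))
print(pick_strategy(user_history_size=40, total_interactions=50_000))
```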

Evaluation

Definition: Recommendation Metrics

Common evaluation metrics:

  • RMSE: $\sqrt{\frac{\sum (\text{predicted} - \text{actual})^2}{n}}$
  • MAE: $\frac{\sum |\text{predicted} - \text{actual}|}{n}$
  • Precision@K: Of top K recommended, how many are relevant?
  • Recall@K: Of all relevant items, how many are in top K?
  • MAP: Mean Average Precision across users
  • NDCG: Normalized Discounted Cumulative Gain

Offline evaluation splits the interaction data into training and held-out sets and computes these metrics on the held-out portion. Online evaluation uses A/B testing to measure click-through rate (CTR) and engagement.
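The rating-error metrics and Recall@K listed above are short enough to write by hand. The sketch below uses made-up predictions and held-out ratings; the function names are illustrative rather than taken from any particular library.

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(diff)))

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    return len(set(recommended[:k]) & set(relevant)) / len(relevant)

predicted = [4, 3, 5]   # model outputs on held-out ratings (toy values)
actual = [5, 3, 3]      # true held-out ratings

err_rmse = rmse(predicted, actual)   # sqrt((1 + 0 + 4) / 3)
err_mae = mae(predicted, actual)     # (1 + 0 + 2) / 3
rec = recall_at_k(["A", "B", "C", "D", "E"], {"B", "D", "F"}, k=3)
print(err_rmse, err_mae, rec)
```

RMSE penalizes the single error of 2 more heavily than MAE does, which is why the two values differ on the same predictions.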


Key Takeaways

Summary: Recommendation Systems

  • Collaborative filtering uses user behavior patterns
  • Content-based uses item features
  • Matrix factorization handles sparse data well
  • Cold-start (new users, new items, or a new system) is a central challenge
  • Hybrid approaches combine both methods
  • Implicit feedback (clicks, views) is easier to collect than explicit ratings
  • Deep learning models (NeuMF, Transformers) can improve performance
  • Evaluation requires both offline metrics and A/B testing

What to Learn Next

-> Clustering: Group similar users or items using K-Means, DBSCAN, and hierarchical methods.

-> Dimensionality Reduction: Reduce sparse user-item matrices to dense representations with PCA and autoencoders.

-> Neural Networks: Build deep learning models for neural collaborative filtering and representation learning.

-> Model Evaluation: Master precision, recall, and ranking metrics for evaluating recommendation quality.

-> A/B Testing: Design online experiments to measure the real-world impact of recommendation changes.

-> NLP Fundamentals: Process item descriptions and user reviews with text mining for content-based recommendations.
