Google & Meta Interview

MLOps: Pipeline Automation, Monitoring & CI/CD for ML

Operationalizing machine learning at scale

Interview Question

"Design an end-to-end MLOps pipeline for a production ML system. How do you handle versioning, monitoring, and continuous training? What are the key differences between MLOps and traditional DevOps?"

Difficulty: Hard | Frequently asked at Google, Meta, Amazon

Theoretical Foundation

What is MLOps?

MLOps combines Machine Learning, DevOps, and Data Engineering to operationalize ML systems. It covers the entire lifecycle from data preparation to model monitoring.

Key Components of MLOps

Data Management: Versioning, validation, lineage
Model Development: Experiment tracking, hyperparameter tuning
Model Deployment: Serving, A/B testing, rollback
Monitoring: Performance, drift, alerts
Governance: Compliance, audit trails, explainability

MLOps vs Traditional DevOps

Aspect	DevOps	MLOps
Artifacts	Code	Code + Data + Models
Versioning	Git (code)	Git + DVC (data) + Model Registry
Testing	Unit/Integration	+ Data/Model/Drift testing
Monitoring	System metrics	+ Model performance, drift
Reproducibility	Deterministic builds	Stochastic training
Feedback Loop	User feedback	Prediction feedback

ML Pipeline Components

Data Pipeline

Ingestion: Collect data from sources
Validation: Check schema, quality, drift
Preprocessing: Transform, normalize, feature engineering
Splitting: Train/validation/test sets
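
The validation and splitting steps above can be sketched roughly as follows. This is a minimal illustration, not a production validator: the `SCHEMA` columns, the `MAX_MISSING_RATE` tolerance, and the function names are all assumptions made for the example. Hash-based splitting is one common way to keep a record in the same split across pipeline runs.

```python
import hashlib

# Hypothetical schema: column name -> expected Python type.
SCHEMA = {"user_id": str, "age": int, "clicked": int}
MAX_MISSING_RATE = 0.1  # assumed tolerance for missing values per column


def validate(rows, schema=SCHEMA, max_missing_rate=MAX_MISSING_RATE):
    """Return a list of problems; an empty list means the batch passes."""
    if not rows:
        return ["batch is empty"]
    problems = []
    for column, expected_type in schema.items():
        values = [row.get(column) for row in rows]
        missing = sum(v is None for v in values)
        if missing / len(rows) > max_missing_rate:
            problems.append(f"{column}: missing rate {missing / len(rows):.0%} too high")
        if any(v is not None and not isinstance(v, expected_type) for v in values):
            problems.append(f"{column}: expected {expected_type.__name__}")
    return problems


def assign_split(key, val_pct=10, test_pct=10):
    """Deterministically map a record key to train/validation/test via hashing."""
    bucket = int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


rows = [
    {"user_id": "u1", "age": 31, "clicked": 1},
    {"user_id": "u2", "age": "n/a", "clicked": 0},  # wrong type on purpose
]
print(validate(rows))      # reports the type problem in "age"
print(assign_split("u1"))  # the same key always lands in the same split
```

In a real pipeline a failed validation would typically stop the run before training rather than just print the problems.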

Training Pipeline

Feature Store: Centralized feature management
Training: Distributed training, hyperparameter tuning
Evaluation: Model metrics, fairness checks
Registry: Version models, metadata, lineage

Deployment Pipeline

Serving: Batch, real-time, edge deployment
Traffic Management: Canary, blue-green, shadow
A/B Testing: Statistical comparison
Rollback: Quick recovery from issues
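
As a rough sketch of traffic splitting and the statistical comparison behind an A/B test: the function names, the `salt` value, and the counts below are illustrative assumptions, and a two-proportion z-test is only one of several tests that could be used here.

```python
import hashlib
import math


def assign_variant(user_id, treatment_pct=10, salt="exp-001"):
    """Hash-based assignment: a given user always sees the same variant."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return "treatment" if int(digest, 16) % 100 < treatment_pct else "control"


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in conversion rates; returns (z, p_value)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Illustrative counts, not real experiment data.
z, p = two_proportion_z_test(success_a=500, n_a=10_000, success_b=570, n_b=10_000)
print(f"z={z:.2f}, p={p:.4f}")
```

Raising `treatment_pct` in steps gives a simple canary rollout, and setting it back to 0 acts as a rollback.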

Monitoring Pipeline

Performance Monitoring: Accuracy, latency, throughput
Data Monitoring: Drift detection, quality checks
Business Monitoring: ROI, conversion rates
Alerting: Automated notifications

Feature Store

Purpose: Centralized feature management for ML.

Benefits:

Feature reuse across teams
Consistent training/serving features
Feature versioning and lineage
Low-latency serving

Examples: Feast, Tecton, Hopsworks
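
To make the "consistent training/serving features" point concrete, here is a toy in-memory sketch. It is not the API of Feast, Tecton, or Hopsworks; the class and method names are hypothetical. It only illustrates the idea that training reads the value known at a past timestamp while serving reads the latest value from the same data.

```python
from bisect import bisect_right
from collections import defaultdict


class InMemoryFeatureStore:
    """Toy store: each (entity, feature) keeps a time-ordered list of values."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # (entity_id, feature) -> sorted timestamps
        self._values = defaultdict(list)      # (entity_id, feature) -> values, same order

    def write(self, entity_id, feature, timestamp, value):
        key = (entity_id, feature)
        index = bisect_right(self._timestamps[key], timestamp)
        self._timestamps[key].insert(index, timestamp)
        self._values[key].insert(index, value)

    def get_as_of(self, entity_id, feature, timestamp):
        """Training path: latest value at or before `timestamp` (avoids label leakage)."""
        key = (entity_id, feature)
        index = bisect_right(self._timestamps[key], timestamp)
        return self._values[key][index - 1] if index else None

    def get_online(self, entity_id, feature):
        """Serving path: the most recent value, read from the same data."""
        values = self._values[(entity_id, feature)]
        return values[-1] if values else None


store = InMemoryFeatureStore()
store.write("user_42", "purchases_7d", timestamp=100, value=1)
store.write("user_42", "purchases_7d", timestamp=200, value=3)

print(store.get_as_of("user_42", "purchases_7d", timestamp=150))  # value known at t=150
print(store.get_online("user_42", "purchases_7d"))                # latest value
```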

Model Registry

Purpose: Version control for models.

Metadata:

Model version and artifact location
Training data version
Hyperparameters and metrics
Lineage and dependencies
Approval status

Examples: MLflow, SageMaker Model Registry
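
The metadata listed above could be modeled along these lines. This is a simplified sketch with hypothetical class names, stage labels, and paths, and it does not reflect the MLflow or SageMaker APIs. It shows how versioning, approval status, and rollback fit together.

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    name: str
    version: int
    artifact_uri: str            # where the serialized model lives
    training_data_version: str   # e.g. a data snapshot tag or hash
    hyperparameters: dict
    metrics: dict
    stage: str = "staging"       # assumed stages: staging, production, archived


class ModelRegistry:
    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion

    def register(self, name, **metadata):
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name=name, version=len(versions) + 1, **metadata)
        versions.append(entry)
        return entry

    def promote(self, name, version):
        """Mark one version as production and archive the previous production version."""
        for entry in self._versions[name]:
            if entry.stage == "production":
                entry.stage = "archived"
        target = self._versions[name][version - 1]
        target.stage = "production"
        return target

    def production(self, name):
        return next((e for e in self._versions.get(name, []) if e.stage == "production"), None)


registry = ModelRegistry()
common = {"training_data_version": "snapshot-2024-01", "hyperparameters": {"max_depth": 6}}
v1 = registry.register("churn", artifact_uri="models/churn/1", metrics={"auc": 0.81}, **common)
v2 = registry.register("churn", artifact_uri="models/churn/2", metrics={"auc": 0.84}, **common)
registry.promote("churn", 2)
registry.promote("churn", 1)  # rollback: re-promote the earlier version
print(registry.production("churn").version)
```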

CI/CD for ML

Continuous Integration:

Code testing
Data validation
Model training tests
Integration tests

Continuous Deployment:

Model serving deployment
A/B test configuration
Rollback procedures
Monitoring setup

Continuous Training:

Scheduled retraining
Triggered retraining (drift)
Feature store updates
Model performance validation
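
A continuous-training trigger that combines the scheduled and drift-based cases might look like the sketch below. The thresholds and the function name are assumptions chosen for illustration and would need tuning per use case.

```python
from datetime import datetime, timedelta

# Assumed policy values; tune per use case.
MAX_MODEL_AGE = timedelta(days=7)
DRIFT_THRESHOLD = 0.2    # e.g. a PSI score
MIN_NEW_LABELS = 1_000


def retraining_reason(last_trained, drift_score, new_labels, now):
    """Return why retraining should start ("drift" or "schedule"), or None."""
    if new_labels < MIN_NEW_LABELS:
        return None  # not enough fresh labeled data to learn from
    if drift_score > DRIFT_THRESHOLD:
        return "drift"
    if now - last_trained > MAX_MODEL_AGE:
        return "schedule"
    return None


now = datetime(2024, 1, 15)
print(retraining_reason(datetime(2024, 1, 14), drift_score=0.35, new_labels=5_000, now=now))
print(retraining_reason(datetime(2024, 1, 1), drift_score=0.05, new_labels=5_000, now=now))
```

Returning the reason rather than a boolean makes it easier to log why each retraining run started.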

Key Insight: MLOps is not just about deployment. It's about creating a feedback loop where model performance in production informs data collection and model improvement.

Monitoring and Observability

Key Metrics to Monitor:

Model Metrics:
- Accuracy, precision, recall, F1
- AUC-ROC, AUC-PR
- Calibration error

System Metrics:
- Latency (p50, p95, p99)
- Throughput (QPS)
- Error rates

Data Metrics:
- Feature distributions
- Missing value rates
- Data drift (PSI, KS test)

Business Metrics:
- Conversion rates
- Revenue impact
- User satisfaction
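
Two of the metrics above, latency percentiles and calibration error, can be computed with a few lines of standard-library Python. This is a sketch under simple assumptions: nearest-rank percentiles, and a binary-classification form of expected calibration error with equal-width bins. The sample values are made up.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def expected_calibration_error(probabilities, labels, n_bins=10):
    """Weighted average gap between mean predicted probability and observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for prob, label in zip(probabilities, labels):
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((prob, label))
    total = len(probabilities)
    ece = 0.0
    for members in bins:
        if members:
            mean_prob = sum(p for p, _ in members) / len(members)
            positive_rate = sum(y for _, y in members) / len(members)
            ece += len(members) / total * abs(mean_prob - positive_rate)
    return ece


latencies_ms = [12, 15, 11, 14, 13, 90, 16, 12, 13, 250]  # illustrative samples
print({p: percentile(latencies_ms, p) for p in (50, 95, 99)})
print(expected_calibration_error([0.9, 0.9, 0.1, 0.1], [1, 1, 0, 0]))
```

The gap between p50 and p99 in the sample shows why tail percentiles are tracked separately from the median.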

Drift Detection Strategies

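Statistical Tests: KS test (continuous features), Chi-squared test (categorical features), PSI
Model-based: Train a classifier to distinguish old vs new data; accuracy well above chance indicates drift
Performance-based: Monitor prediction accuracy as ground-truth labels arrive
Automated Retraining: Trigger retraining when drift is detected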
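
The statistical tests can be sketched without external libraries. The version below is a minimal illustration that assumes equal-width bins for PSI and returns only the KS statistic (no p-value). In practice a library routine such as SciPy's `ks_2samp` would usually be preferred. A PSI above roughly 0.2 is a commonly quoted rule of thumb for a significant shift, but the right threshold depends on the feature.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a new sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            index = int((x - lo) / width)
            counts[min(max(index, 0), n_bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(sample), eps) for c in counts]  # eps avoids log(0)

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    max_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        max_gap = max(max_gap, abs(i / len(a) - j / len(b)))
    return max_gap


reference = [i / 100 for i in range(100)]  # synthetic training-time feature values
shifted = [x + 0.5 for x in reference]     # synthetic production values with a mean shift
print(f"PSI={psi(reference, shifted):.2f}, KS={ks_statistic(reference, shifted):.2f}")
```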

Google Interview Tip: Be prepared to discuss the tradeoffs between automated retraining and manual review. Automated retraining is faster but can propagate issues. Manual review adds latency but catches problems.
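
One way to combine the two approaches is a promotion gate that auto-promotes small, plausible improvements and routes anything unusual to a human. The sketch below assumes a single offline metric (`auc`) and made-up thresholds; the function name and return values are hypothetical.

```python
def promotion_decision(candidate, production, max_auto_gain=0.05, min_gain=0.0):
    """Return "reject", "auto_promote", or "manual_review" for a retrained model.

    Both arguments are dicts of offline metrics, e.g. {"auc": 0.83}.
    """
    gain = candidate["auc"] - production["auc"]
    if gain < min_gain:
        return "reject"         # worse than what is already serving
    if gain > max_auto_gain:
        return "manual_review"  # suspiciously large jump: possible leakage or bad data
    return "auto_promote"


print(promotion_decision({"auc": 0.83}, {"auc": 0.82}))
print(promotion_decision({"auc": 0.95}, {"auc": 0.82}))
```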

Code Implementation
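
A minimal sketch of how the pipeline stages described earlier could be chained end to end. Everything here is a stand-in chosen so the example runs with the standard library only: the data is synthetic, the "model" is a single threshold, and the function names and the `min_accuracy` gate are assumptions.

```python
import hashlib
import json
import random


def ingest(seed=0, n=200):
    """Stand-in for data ingestion: synthetic (feature, label) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        rows.append({"x": x, "label": int(x + rng.gauss(0, 0.1) > 0.5)})
    return rows


def validate(rows):
    """Fail fast on empty batches, out-of-range features, or non-binary labels."""
    if not rows:
        raise ValueError("no data ingested")
    if not all(0.0 <= r["x"] <= 1.0 and r["label"] in (0, 1) for r in rows):
        raise ValueError("schema or range check failed")
    return rows


def accuracy(threshold, rows):
    return sum(int(r["x"] > threshold) == r["label"] for r in rows) / len(rows)


def train(rows):
    """Toy model: the threshold on x that maximizes training accuracy."""
    candidates = [i / 20 for i in range(21)]
    return max(candidates, key=lambda t: accuracy(t, rows))


def run_pipeline(min_accuracy=0.8):
    rows = validate(ingest())
    train_rows, test_rows = rows[:150], rows[150:]
    threshold = train(train_rows)
    record = {  # what would be written to a model registry
        "data_hash": hashlib.sha256(json.dumps(rows).encode("utf-8")).hexdigest()[:12],
        "params": {"threshold": threshold},
        "test_accuracy": accuracy(threshold, test_rows),
    }
    record["approved"] = record["test_accuracy"] >= min_accuracy  # evaluation gate
    return record


print(run_pipeline())
```

Because ingestion is seeded and the data hash is stored with the result, two runs of `run_pipeline()` produce the same record, which is the property a registry entry relies on.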

Real-World Applications

Google: ML Platform

Vertex AI: End-to-end ML platform
TFX: ML pipeline framework
TF Serving: Model serving at scale
Continuous training: Automated retraining

Meta: Production ML

FBLearner Flow: Internal ML platform
Online learning: Real-time model updates
A/B testing: Large-scale experimentation
Model governance: Compliance and audit

Google Interview Tip: Be prepared to discuss the maturity levels in Google's MLOps model: Level 0 (manual process), Level 1 (ML pipeline automation with continuous training), and Level 2 (CI/CD pipeline automation).

Common Follow-Up Questions

Q1: What are the key differences between batch and real-time ML systems? Batch systems process data periodically (hourly/daily) with higher latency tolerance. Real-time systems require sub-second latency and handle streaming data.

Q2: How do you handle feature engineering in production? Use a feature store to ensure consistency between training and serving. Compute features in real-time or pre-compute and store for fast lookup.

Q3: What is shadow deployment and when should you use it? Shadow deployment sends a copy of live traffic to the new model alongside production, but its predictions are only logged and never returned to users. Use it to validate model behavior on real requests without affecting users.
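
A shadow deployment can be sketched as a thin wrapper around the serving call. The `serve` function and the list-based `shadow_log` are hypothetical names for illustration. A real system would typically call the shadow model asynchronously so it cannot add latency.

```python
import logging

logger = logging.getLogger("shadow")


def serve(request, production_model, shadow_model, shadow_log):
    """Return the production prediction; record the shadow prediction for offline comparison."""
    response = production_model(request)
    try:
        shadow_log.append(
            {"request": request, "production": response, "shadow": shadow_model(request)}
        )
    except Exception:
        # A failing shadow model must never affect the user-facing response.
        logger.exception("shadow model failed")
    return response


shadow_log = []
result = serve(3, production_model=lambda x: x * 2, shadow_model=lambda x: x * 3,
               shadow_log=shadow_log)
print(result, shadow_log)
```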

Q4: How do you ensure reproducibility in ML pipelines? Version everything: code (Git), data (DVC), models (MLflow), environment (Docker). Fix random seeds and pin dependency versions so that a rerun with the same inputs produces the same model.
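
As a small illustration of the seeding and versioning idea, the sketch below fingerprints a run's config and data and draws all randomness from one seeded generator. The function names and config keys are assumptions, and the "training" step is a stand-in.

```python
import hashlib
import json
import random


def run_fingerprint(config, data_rows):
    """Stable hash of config and data: identical inputs give an identical fingerprint."""
    payload = json.dumps({"config": config, "data": data_rows}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def train_with_seed(data_rows, seed):
    """Stand-in for training: all randomness comes from one explicitly seeded generator."""
    rng = random.Random(seed)
    shuffled = data_rows[:]
    rng.shuffle(shuffled)
    return shuffled[: len(shuffled) // 2]  # e.g. a sampled mini-batch


config = {"seed": 13, "learning_rate": 0.01}
data = [1, 2, 3, 4, 5, 6]
print(train_with_seed(data, config["seed"]) == train_with_seed(data, config["seed"]))
print(run_fingerprint(config, data))
```

Storing the fingerprint next to the model in the registry makes it possible to check later whether two models were trained from the same inputs.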
