Amazon & Uber Interview

Ensemble Methods: Bagging, Boosting, Stacking & Voting

Combining models for superior performance

Interview Question

"Compare bagging, boosting, and stacking. When would you use each method? How do you build an effective ensemble and what are the theoretical guarantees behind ensemble methods?"

Difficulty: Medium-Hard | Frequently asked at Amazon, Uber, Google

Theoretical Foundation

Why Ensembles?

The Wisdom of Crowds principle: combining multiple diverse models often outperforms any individual model.

Mathematical Foundation:

For $M$ models whose errors $\epsilon_m$ each have variance $\sigma^2$ and average pairwise correlation $\rho$, the variance of the averaged prediction is:

$$\text{Ensemble Error} = \rho \sigma^2 + \frac{1-\rho}{M}\sigma^2$$

As $M \to \infty$: $\text{Ensemble Error} \to \rho \sigma^2$

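Key Insight: Reducing correlation $\rho$ between models is crucial. Adding more models only shrinks the second term, so the floor $\rho \sigma^2$ is set by how correlated the models are. Diversity therefore matters as much as individual model accuracy.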
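The formula above can be checked numerically. The following is a minimal sketch, not part of the original article: the function names are hypothetical, and the simulation assumes Gaussian errors built from one shared component and one private component per model (which gives variance $\sigma^2$ and pairwise correlation $\rho$ for $0 \le \rho \le 1$).

```python
import math
import random


def ensemble_variance(rho, sigma2, m):
    """Closed-form variance of the average of m equally correlated predictors."""
    return rho * sigma2 + (1.0 - rho) / m * sigma2


def simulate_ensemble_variance(rho, sigma2, m, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity (assumes 0 <= rho <= 1).

    Each model's error mixes one shared draw with one private draw, so every
    error has variance sigma2 and every pair has correlation rho.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    averages = []
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)
        errors = [
            sigma * (math.sqrt(rho) * shared
                     + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
            for _ in range(m)
        ]
        averages.append(sum(errors) / m)
    mean = sum(averages) / trials
    return sum((a - mean) ** 2 for a in averages) / (trials - 1)


if __name__ == "__main__":
    # Closed form next to the simulated value for a growing ensemble.
    for m in (1, 5, 20):
        print(m, ensemble_variance(0.5, 1.0, m),
              simulate_ensemble_variance(0.5, 1.0, m))
```

With $\rho = 0.5$ the closed form approaches 0.5 as $M$ grows, which is the floor that no amount of extra models removes.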

Bagging (Bootstrap Aggregating)

Algorithm:

Draw $B$ bootstrap samples
Train model on each sample
Average predictions (regression) or majority vote (classification)

Properties:

Reduces variance by averaging
Parallelizable
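Each bootstrap sample contains ~63.2% of the distinct original examples (the rest are repeats); the left-out examples can be used for out-of-bag validation
Random Forest is the most popular bagging method (bagged trees plus random feature selection at each split)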

When to use: High variance models (deep decision trees)
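The bagging steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the article's own implementation: the base learner is a 1-nearest-neighbour regressor (chosen because it is a simple high-variance model), the data are synthetic noisy samples of `sin(x)`, and all function names are hypothetical.

```python
import math
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly with replacement."""
    return [rng.choice(data) for _ in data]


def predict_1nn(train, x):
    """1-nearest-neighbour regression: return the y of the closest training x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def bagged_predict(samples, x):
    """Average the 1-NN predictions of the models built on each bootstrap sample."""
    return sum(predict_1nn(s, x) for s in samples) / len(samples)


def mse(predict, xs, truth):
    """Mean squared error of a predictor against the noise-free target."""
    return sum((predict(x) - truth(x)) ** 2 for x in xs) / len(xs)


def run_demo(n=150, n_models=30, seed=0):
    rng = random.Random(seed)
    truth = math.sin
    train = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        train.append((x, truth(x) + rng.gauss(0.0, 0.5)))  # noisy targets
    samples = [bootstrap_sample(train, rng) for _ in range(n_models)]
    test_xs = [6.0 * i / 99 for i in range(100)]
    single = mse(lambda x: predict_1nn(train, x), test_xs, truth)
    bagged = mse(lambda x: bagged_predict(samples, x), test_xs, truth)
    unique_frac = len(set(samples[0])) / n  # expected to be near 0.632
    return single, bagged, unique_frac


if __name__ == "__main__":
    single, bagged, unique_frac = run_demo()
    print("single 1-NN MSE:", single)
    print("bagged 1-NN MSE:", bagged)
    print("distinct fraction in one bootstrap sample:", unique_frac)
```

Because each bootstrap sample sometimes misses the nearest neighbour, the bagged prediction is effectively a weighted average over several nearby points, which is where the variance reduction comes from in this toy setting.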

Boosting

Algorithm:

Start with a weak model
Iteratively add models that correct previous errors
Weighted combination of all models

Properties:

Primarily reduces bias; can also reduce variance, but variance may rise again with too many rounds
Sequential (harder to parallelize)
Can overfit with too many iterations

When to use: Weak learners, when you need to reduce bias
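The boosting loop above can be sketched as follows. This is a minimal sketch of one specific variant, gradient boosting for squared loss with depth-1 trees (stumps) on a single feature; it is not AdaBoost, and the function names, learning rate, and synthetic data are assumptions made for illustration.

```python
import math
import random


def fit_stump(xs, residuals):
    """Find the threshold split on x that minimises squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[1:]:  # skipping the smallest x keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def boost(xs, ys, n_rounds=30, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps, history = [], []
    for _ in range(n_rounds):
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

    def predict(x):
        # Weighted combination of all stumps, as in the final step of the algorithm.
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict, history


def make_data(n=80, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_data()
    predict, history = boost(xs, ys)
    print("training MSE before boosting:", history[0])
    print("training MSE after boosting: ", history[-1])
```

The training error in `history` never increases from round to round, which is exactly why training error alone cannot tell you when to stop; a held-out validation set is needed for that.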

Stacking (Stacked Generalization)

Algorithm:

Train base models (level-0)
Train meta-model on out-of-fold base model predictions (level-1)
Use meta-model for final prediction

Properties:

Combines different model types
Learns combination weights from data
Risk of overfitting at meta-level

When to use: When you have diverse, strong base models
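The stacking procedure above can be sketched in plain Python. This is a simplified illustration under several assumptions: the two level-0 models are a least-squares line and a k-nearest-neighbour regressor, the level-1 model is reduced to a single blend weight in [0, 1] fitted by least squares, and the meta-model is trained on out-of-fold predictions to limit overfitting. All names and the synthetic data are hypothetical.

```python
import math
import random


def fit_linear(train):
    """Level-0 model A: ordinary least squares line in closed form."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sum((x - mx) * (y - my) for x, y in train) / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(train, k=5):
    """Level-0 model B: mean y of the k closest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict


def out_of_fold(data, fit, n_folds=5):
    """Predict every point with a model that did not see it during training."""
    preds = [0.0] * len(data)
    for fold in range(n_folds):
        model = fit([p for i, p in enumerate(data) if i % n_folds != fold])
        for i, (x, _) in enumerate(data):
            if i % n_folds == fold:
                preds[i] = model(x)
    return preds


def fit_blend_weight(a, b, ys):
    """Level-1 model: weight w in [0, 1] minimising the squared error of w*a + (1-w)*b."""
    num = sum((ai - bi) * (y - bi) for ai, bi, y in zip(a, b, ys))
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(1.0, max(0.0, num / den)) if den > 0 else 0.5


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def fit_stack(data):
    ys = [y for _, y in data]
    oof_lin = out_of_fold(data, fit_linear)
    oof_knn = out_of_fold(data, fit_knn)
    w = fit_blend_weight(oof_lin, oof_knn, ys)
    lin, knn = fit_linear(data), fit_knn(data)  # refit level-0 models on all data
    return (lambda x: w * lin(x) + (1.0 - w) * knn(x)), w, oof_lin, oof_knn


def make_data(n=120, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    return [(x, 0.5 * x + math.sin(2.0 * x) + rng.gauss(0.0, 0.3)) for x in xs]


if __name__ == "__main__":
    data = make_data()
    predict, w, oof_lin, oof_knn = fit_stack(data)
    print("blend weight on the linear model:", w)
    print("stacked prediction at x = 3.0:", predict(3.0))
```

A real meta-model would usually be a regularised linear or logistic regression over many base models; the single clipped weight here keeps the example short while preserving the key idea of fitting the combination on predictions the base models made for unseen data.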

Voting

Hard Voting: Majority vote of classifiers
Soft Voting: Average of class probabilities

Properties:

Simple to implement
Works with any mix of model types; soft voting requires models that output class probabilities, ideally well calibrated
No learning involved
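Hard and soft voting can disagree on the same example. The sketch below shows a small case; the helper names and the three probability vectors are made up for illustration.

```python
from collections import Counter


def hard_vote(labels):
    """Return the most common predicted label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas):
    """Average per-class probabilities across models and return the argmax class."""
    n_classes = len(probas[0])
    avg = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


# Three hypothetical binary classifiers: [P(class 0), P(class 1)] for one example.
probas = [[0.55, 0.45], [0.55, 0.45], [0.10, 0.90]]
labels = [max(range(2), key=p.__getitem__) for p in probas]  # [0, 0, 1]

print(hard_vote(labels))  # 0: two of the three models prefer class 0
print(soft_vote(probas))  # 1: averaged probabilities are [0.4, 0.6]
```

Soft voting lets one confident model outweigh two lukewarm ones, which helps only if the reported probabilities are trustworthy.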

Ensemble Methods Comparison

Method	Bias	Variance	Parallel	Overfitting Risk
Bagging	No change	Decreases	Yes	Low
Boosting	Decreases	May increase	No	High
Stacking	May decrease	Decreases	Partial	Medium
Voting	No change	Decreases	Yes	Low

Key Insight: Bagging mainly reduces variance, boosting mainly reduces bias, and stacking learns how to weight the base models. Choose based on whether your base model has high bias or high variance.

Code Implementation

Real-World Applications

Amazon: Product Recommendations

Random Forest: Feature importance for product ranking
Gradient Boosting: CTR prediction for ad targeting
Stacking: Combining multiple recommendation algorithms

Uber: Demand Prediction

XGBoost: ETA prediction for ride requests
LightGBM: Surge pricing optimization
Ensemble: Combining multiple models for robust predictions

Amazon Interview Tip: Be prepared to discuss when ensembles fail. If base models are highly correlated, ensembles provide diminishing returns. Diversity is key.

Common Follow-Up Questions

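Q1: Why does bagging reduce variance but not bias? Bagging trains the same kind of model on different bootstrap samples and averages the results. Each model has the same expected prediction, so the average has the same bias. Averaging shrinks variance by a factor of $1/M$ only if the models are independent; because bootstrap samples overlap, the models are correlated and the reduction is smaller.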

Q2: Why can boosting overfit? Boosting sequentially corrects errors. With too many iterations, it can fit noise in the training data. Early stopping is crucial.
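Early stopping can be sketched as a patience rule on a validation loss curve. This is a minimal illustration; the function name, the `patience` parameter, and the loss values are hypothetical rather than taken from any particular library.

```python
def best_round(val_losses, patience=2):
    """Return the index of the best round seen before stopping.

    Scanning stops once `patience` consecutive rounds fail to improve on the
    best validation loss so far.
    """
    best_idx, best_loss, waited = 0, float("inf"), 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = i, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx


# Made-up validation losses: improving, then rising as the model starts to fit noise.
losses = [1.00, 0.80, 0.70, 0.72, 0.75, 0.90]
print(best_round(losses, patience=2))  # 2
```

The model is then truncated to the rounds up to and including the returned index.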

Q3: How do you ensure diversity in ensembles?

Use different algorithms
Train on different features
Use different hyperparameters
Use different training data (bootstrap)
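One way to check whether these steps actually produced diversity is to measure how correlated the models' errors are on held-out data. The sketch below computes the mean pairwise Pearson correlation of residuals; the helper names and the residual values are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_pairwise_correlation(residuals_by_model):
    """Average error correlation over all pairs of models (lower means more diverse)."""
    pairs = list(combinations(residuals_by_model, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Made-up validation residuals (prediction minus target) for three models.
model_a = [1.0, -1.0, 2.0, -2.0]
model_b = [1.0, -1.0, 2.0, -2.0]   # makes the same mistakes as model_a
model_c = [-1.0, 1.0, -2.0, 2.0]   # errs in the opposite direction

print(pearson(model_a, model_b))   # close to 1: adding model_b brings no diversity
print(mean_pairwise_correlation([model_a, model_b, model_c]))
```

A value near 1 suggests the ensemble will gain little from averaging, whatever the individual accuracies are.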

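Q4: What is the theoretical guarantee of ensemble methods? The error bound shows ensemble error depends on both model accuracy and diversity. In boosting theory, weak learners (accuracy slightly > 0.5) can be combined into a learner with arbitrarily low training error. For majority voting, a similar result (the Condorcet jury theorem) holds only when the models' errors are independent.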
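For majority voting, the effect of combining weak classifiers can be computed directly under an idealized assumption: every classifier is correct with the same probability and errors are fully independent. Real models rarely satisfy this, so the numbers below are an upper bound on what to expect. The function name is hypothetical.

```python
from math import comb


def majority_accuracy(p, m):
    """Probability that a majority of m independent voters is correct.

    Each voter is correct with probability p; m should be odd to avoid ties.
    """
    need = m // 2 + 1
    return sum(comb(m, k) * p ** k * (1 - p) ** (m - k)
               for k in range(need, m + 1))


if __name__ == "__main__":
    for m in (1, 3, 11, 101):
        print(m, majority_accuracy(0.6, m))
```

With `p = 0.6`, three voters already reach 0.648, and accuracy keeps rising with `m`; with `p` below 0.5 the same formula shows the majority getting worse, not better.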
