Multilevel (Hierarchical) Linear Models

Statistics

Modeling Nested Data With Random Effects

Multilevel models account for data hierarchies — students within schools, patients within hospitals — by allowing intercepts and slopes to vary across groups. They produce unbiased estimates when observations are correlated within clusters.

Education Research — Compare teaching methods across schools with varying baseline performance
Healthcare — Analyze treatment effects while accounting for hospital-level variation
Economics — Study firm behavior across industries with different competitive structures

When data has layers, each layer needs its own part of the story.

Multilevel models (also called hierarchical or mixed-effects models) handle data with nested structures — students within schools, patients within hospitals, repeated measures within individuals.

DfMultilevel Model

A regression model that includes random effects to account for the hierarchical structure of data, allowing both intercepts and slopes to vary across groups.

Why Multilevel Models?

Key Problem with Standard Regression

Standard regression assumes observations are independent. When data are nested (e.g., students in schools), this assumption is violated. Students in the same school share teachers, resources, and policies — their outcomes are correlated.

| Issue | Consequence | Solution |

|-------|------------|---------|

| Non-independence | Standard errors too small | Random effects |

| Varying intercepts | Different group baselines | Random intercept model |

| Varying slopes | Different group effects | Random slope model |

| Crossed effects | Multiple grouping factors | Crossed random effects |

Random Intercept Model

Y_{ij} = \beta_0 + u_{0j} + \beta_1 X_{ij} + \varepsilon_{ij}

Here,

$Y_{ij}$ =Outcome for individual i in group j
$\beta_0$ =Fixed intercept (overall mean)
$u_{0j}$ =Random intercept for group j
$\beta_1$ =Fixed slope
$\varepsilon_{ij}$ =Level-1 residual (within-group)

Random Effects Distribution

u_{0j} \sim N(0, \sigma_u^2)

Here,

$\sigma_u^2$ =Between-group variance (random intercept variance)
$\sigma_e^2$ =Within-group variance (residual variance)

Random Slope Model

Allows the effect of X to vary across groups:

Random Slope Model

Y_{ij} = \beta_0 + u_{0j} + (\beta_1 + u_{1j})X_{ij} + \varepsilon_{ij}

Here,

$u_{1j}$ =Random slope for group j
$u_{0j}, u_{1j}$ =Jointly normal: $\sim N(0, \mathbf{G})$

Intraclass Correlation (ICC)

The ICC measures the proportion of total variance that lies between groups.

Intraclass Correlation

ICC = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}

Here,

$\sigma_u^2$ =Between-group variance
$\sigma_e^2$ =Within-group variance

| ICC | Interpretation |

|-----|---------------|

| 0.00 - 0.05 | Negligible clustering — standard regression may suffice |

| 0.05 - 0.15 | Moderate clustering — multilevel modeling recommended |

| 0.15 - 0.25 | Substantial clustering — multilevel modeling essential |

| > 0.25 | Very strong clustering — must model group structure |

Variance Partition Coefficient

The VPC (same as ICC in the random intercept model) answers: How much does the outcome vary between groups?

Sample Size Consideration

You need sufficient groups (at least 20-30) to reliably estimate random effects. With too few groups, the variance estimates are unstable.

Estimation Methods

| Method | Description | When to Use |

|--------|-------------|-------------|

| REML (Restricted Maximum Likelihood) | Unbiased variance estimates | Default; preferred |

| ML (Maximum Likelihood) | Biased variance estimates | When comparing models with different fixed effects |

| Bayesian | Full posterior distribution | Complex models; small samples |

Model Comparison

Likelihood Ratio Test

\chi^2 = -2(\ln L_0 - \ln L_1)

Here,

$L_0$ =Likelihood of reduced model
$L_1$ =Likelihood of full model

Test the significance of random effects by comparing models with and without the random effect.

Python Implementation


import numpy as np

import pandas as pd

import statsmodels.api as sm

from statsmodels.regression.mixed_linear_model import MixedLM

from scipy import stats

import matplotlib.pyplot as plt



np.random.seed(42)



# Simulate hierarchical data: students in schools

n_schools = 30

n_students_per = 20

n = n_schools * n_students_per



school_id = np.repeat(np.arange(n_schools), n_students_per)

school_effect = np.random.randn(n_schools) * 2  # Random intercepts

u_0 = school_effect[school_id]



X = np.random.randn(n)

Y = 5 + u_0 + 0.8 * X + np.random.randn(n) * 1.5



df = pd.DataFrame({'Y': Y, 'X': X, 'school': school_id})



# ICC calculation

from statsmodels.formula.api import ols

null_model = ols('Y ~ 1', data=df).fit()

total_var = np.var(null_model.resid)



# Between-school variance

school_means = df.groupby('school')['Y'].mean()

between_var = np.var(school_means)



# Within-school variance

within_var = total_var - between_var * n_students_per / (n_students_per - 1)

ICC = between_var / (between_var + within_var)

print(f"ICC: {ICC:.3f}")



# Random intercept model

ri_model = MixedLM.from_formula('Y ~ X', groups='school', data=df)

ri_result = ri_model.fit()

print(f"\nRandom Intercept Model:")

print(ri_result.summary())



# Random slope model

rs_model = MixedLM.from_formula('Y ~ X', groups='school', re_formula='~X', data=df)

rs_result = rs_model.fit()

print(f"\nRandom Slope Model:")

print(rs_result.summary())



# Compare models using AIC

print(f"\nRandom Intercept AIC: {ri_result.aic:.1f}")

print(f"Random Slope AIC: {rs_result.aic:.1f}")

Worked Example

Example: Student Achievement Across Schools

Examining math scores across 50 schools:

|-------|-------------|----------|-----|--------------|----------|

| Null | Intercept | 72.5 | 1.2 | School intercept | 18.3 |

| Random Intercept | Intercept | 72.4 | 1.1 | School intercept | 16.8 |

| | SES | 3.2 | 0.4 | Residual | 45.2 |

| Random Slope | Intercept | 72.3 | 1.0 | School intercept | 15.1 |

| | SES | 3.1 | 0.3 | School SES slope | 2.4 |

ICC = 18.3 / (18.3 + 52.1) = 0.26 -> 26% of variance in scores is between schools

Conclusion: Schools explain a substantial portion of achievement differences. The random slope model shows that the effect of SES varies across schools.

Key Takeaways

Summary: Multilevel Modeling

Use multilevel models when data have a nested/hierarchical structure
ICC measures between-group variance relative to total variance
Random intercept model: different baselines for each group
Random slope model: different effects of predictors across groups
REML is preferred for variance estimation; ML for model comparison
Test random effects using likelihood ratio tests
Ensure sufficient groups (=30) for stable variance estimates

Multilevel (Hierarchical) Linear Models

Multilevel (Hierarchical) Linear Models

Modeling Nested Data With Random Effects

DfMultilevel Model

Why Multilevel Models?

Random Intercept Model

Random Intercept Model

Random Effects Distribution

Random Slope Model

Random Slope Model

Intraclass Correlation (ICC)

Intraclass Correlation

Variance Partition Coefficient

Estimation Methods

Model Comparison

Likelihood Ratio Test

Python Implementation

Worked Example

Example: Student Achievement Across Schools

Key Takeaways

Summary: Multilevel Modeling

Related Topics

Premium Content

Need Expert Statistics Help?