πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Multilevel (Hierarchical) Linear Models

StatisticsAdvanced Regression🟒 Free Lesson

Advertisement

Multilevel (Hierarchical) Linear Models

Statistics

Modeling Nested Data With Random Effects

Multilevel models account for data hierarchies β€” students within schools, patients within hospitals β€” by allowing intercepts and slopes to vary across groups. They produce unbiased estimates when observations are correlated within clusters.

  • Education Research β€” Compare teaching methods across schools with varying baseline performance

  • Healthcare β€” Analyze treatment effects while accounting for hospital-level variation

  • Economics β€” Study firm behavior across industries with different competitive structures

When data has layers, each layer needs its own part of the story.


Multilevel models (also called hierarchical or mixed-effects models) handle data with nested structures β€” students within schools, patients within hospitals, repeated measures within individuals.

DfMultilevel Model

A regression model that includes random effects to account for the hierarchical structure of data, allowing both intercepts and slopes to vary across groups.


Why Multilevel Models?

Key Problem with Standard Regression

Standard regression assumes observations are independent. When data are nested (e.g., students in schools), this assumption is violated. Students in the same school share teachers, resources, and policies β€” their outcomes are correlated.

| Issue | Consequence | Solution |

|-------|------------|---------|

| Non-independence | Standard errors too small | Random effects |

| Varying intercepts | Different group baselines | Random intercept model |

| Varying slopes | Different group effects | Random slope model |

| Crossed effects | Multiple grouping factors | Crossed random effects |


Random Intercept Model

Random Intercept Model

Yij=Ξ²0+u0j+Ξ²1Xij+Ξ΅ijY_{ij} = \beta_0 + u_{0j} + \beta_1 X_{ij} + \varepsilon_{ij}

Here,

  • YijY_{ij}=Outcome for individual i in group j
  • Ξ²0\beta_0=Fixed intercept (overall mean)
  • u0ju_{0j}=Random intercept for group j
  • Ξ²1\beta_1=Fixed slope
  • Ξ΅ij\varepsilon_{ij}=Level-1 residual (within-group)

Random Effects Distribution

u0j∼N(0,Οƒu2)u_{0j} \sim N(0, \sigma_u^2)

Here,

  • Οƒu2\sigma_u^2=Between-group variance (random intercept variance)
  • Οƒe2\sigma_e^2=Within-group variance (residual variance)

Random Slope Model

Allows the effect of X to vary across groups:

Random Slope Model

Yij=Ξ²0+u0j+(Ξ²1+u1j)Xij+Ξ΅ijY_{ij} = \beta_0 + u_{0j} + (\beta_1 + u_{1j})X_{ij} + \varepsilon_{ij}

Here,

  • u1ju_{1j}=Random slope for group j
  • u0j,u1ju_{0j}, u_{1j}=Jointly normal: $\sim N(0, \mathbf{G})$

Intraclass Correlation (ICC)

The ICC measures the proportion of total variance that lies between groups.

Intraclass Correlation

ICC=Οƒu2Οƒu2+Οƒe2ICC = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}

Here,

  • Οƒu2\sigma_u^2=Between-group variance
  • Οƒe2\sigma_e^2=Within-group variance

| ICC | Interpretation |

|-----|---------------|

| 0.00 - 0.05 | Negligible clustering β€” standard regression may suffice |

| 0.05 - 0.15 | Moderate clustering β€” multilevel modeling recommended |

| 0.15 - 0.25 | Substantial clustering β€” multilevel modeling essential |

| > 0.25 | Very strong clustering β€” must model group structure |


Variance Partition Coefficient

The VPC (same as ICC in the random intercept model) answers: How much does the outcome vary between groups?

Sample Size Consideration

You need sufficient groups (at least 20-30) to reliably estimate random effects. With too few groups, the variance estimates are unstable.


Estimation Methods

| Method | Description | When to Use |

|--------|-------------|-------------|

| REML (Restricted Maximum Likelihood) | Unbiased variance estimates | Default; preferred |

| ML (Maximum Likelihood) | Biased variance estimates | When comparing models with different fixed effects |

| Bayesian | Full posterior distribution | Complex models; small samples |


Model Comparison

Likelihood Ratio Test

Ο‡2=βˆ’2(ln⁑L0βˆ’ln⁑L1)\chi^2 = -2(\ln L_0 - \ln L_1)

Here,

  • L0L_0=Likelihood of reduced model
  • L1L_1=Likelihood of full model

Test the significance of random effects by comparing models with and without the random effect.


Python Implementation


import numpy as np

import pandas as pd

import statsmodels.api as sm

from statsmodels.regression.mixed_linear_model import MixedLM

from scipy import stats

import matplotlib.pyplot as plt



np.random.seed(42)



# Simulate hierarchical data: students in schools

n_schools = 30

n_students_per = 20

n = n_schools * n_students_per



school_id = np.repeat(np.arange(n_schools), n_students_per)

school_effect = np.random.randn(n_schools) * 2  # Random intercepts

u_0 = school_effect[school_id]



X = np.random.randn(n)

Y = 5 + u_0 + 0.8 * X + np.random.randn(n) * 1.5



df = pd.DataFrame({'Y': Y, 'X': X, 'school': school_id})



# ICC calculation

from statsmodels.formula.api import ols

null_model = ols('Y ~ 1', data=df).fit()

total_var = np.var(null_model.resid)



# Between-school variance

school_means = df.groupby('school')['Y'].mean()

between_var = np.var(school_means)



# Within-school variance

within_var = total_var - between_var * n_students_per / (n_students_per - 1)

ICC = between_var / (between_var + within_var)

print(f"ICC: {ICC:.3f}")



# Random intercept model

ri_model = MixedLM.from_formula('Y ~ X', groups='school', data=df)

ri_result = ri_model.fit()

print(f"\nRandom Intercept Model:")

print(ri_result.summary())



# Random slope model

rs_model = MixedLM.from_formula('Y ~ X', groups='school', re_formula='~X', data=df)

rs_result = rs_model.fit()

print(f"\nRandom Slope Model:")

print(rs_result.summary())



# Compare models using AIC

print(f"\nRandom Intercept AIC: {ri_result.aic:.1f}")

print(f"Random Slope AIC: {rs_result.aic:.1f}")

Worked Example

Example: Student Achievement Across Schools

Examining math scores across 50 schools:

| Model | Fixed Effect | Estimate | SE | Random Effect | Variance |

|-------|-------------|----------|-----|--------------|----------|

| Null | Intercept | 72.5 | 1.2 | School intercept | 18.3 |

| Random Intercept | Intercept | 72.4 | 1.1 | School intercept | 16.8 |

| | SES | 3.2 | 0.4 | Residual | 45.2 |

| Random Slope | Intercept | 72.3 | 1.0 | School intercept | 15.1 |

| | SES | 3.1 | 0.3 | School SES slope | 2.4 |

ICC = 18.3 / (18.3 + 52.1) = 0.26 -> 26% of variance in scores is between schools

Conclusion: Schools explain a substantial portion of achievement differences. The random slope model shows that the effect of SES varies across schools.


Key Takeaways

Summary: Multilevel Modeling

  • Use multilevel models when data have a nested/hierarchical structure

  • ICC measures between-group variance relative to total variance

  • Random intercept model: different baselines for each group

  • Random slope model: different effects of predictors across groups

  • REML is preferred for variance estimation; ML for model comparison

  • Test random effects using likelihood ratio tests

  • Ensure sufficient groups (=30) for stable variance estimates


Related Topics

⭐

Premium Content

Multilevel (Hierarchical) Linear Models

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert Statistics Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement