
Panel Data Analysis — Fixed and Random Effects


Leveraging Cross-Sectional and Time-Series Dimensions

Panel data follows the same units over time, enabling control for unobserved heterogeneity. Fixed effects eliminate time-invariant confounders, while random effects exploit efficiency gains when assumptions hold.

  • Labor Economics — Estimate wage growth effects while controlling for individual ability

  • Public Policy — Evaluate policy changes using within-state variation over time

  • Finance — Analyze firm performance across years with entity fixed effects

The Hausman test decides: absorb individual differences or borrow strength across groups.


Panel data combines cross-sectional and time-series dimensions — the same units are observed over multiple time periods. This structure allows controlling for unobserved heterogeneity.

Definition: Panel Data

Data that follow the same individuals, firms, countries, or other units over time. Also called longitudinal data.


Panel Data Structure

| Entity | Time 1 | Time 2 | Time 3 |
|--------|--------|--------|--------|
| Unit 1 | $Y_{11}$ | $Y_{12}$ | $Y_{13}$ |
| Unit 2 | $Y_{21}$ | $Y_{22}$ | $Y_{23}$ |
| Unit 3 | $Y_{31}$ | $Y_{32}$ | $Y_{33}$ |

Notation: $Y_{it}$ is the outcome for unit $i$ at time $t$.


The Pooled OLS Problem

Omitted Variable Bias

If unobserved factors (e.g., ability, culture) are correlated with both the dependent and independent variables, pooled OLS gives biased estimates. Panel data methods address this.
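
To make the bias concrete, here is a minimal simulation sketch. The data-generating process, the coefficient 0.8 linking the regressor to the unobserved effect, and all variable names are assumptions chosen for illustration. Under this setup the pooled OLS slope converges to roughly $1 + 0.8 / 1.64 \approx 1.49$ instead of the true value of 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods = 500, 5
true_beta = 1.0

# Unobserved entity effect (e.g., ability), repeated for each period
alpha = rng.normal(size=n_entities)
alpha_it = np.repeat(alpha, n_periods)

# The regressor is correlated with the unobserved effect by construction
x = 0.8 * alpha_it + rng.normal(size=n_entities * n_periods)
y = alpha_it + true_beta * x + rng.normal(size=n_entities * n_periods)

# Pooled OLS: regress y on a constant and x, ignoring the panel structure
design = np.column_stack([np.ones_like(x), x])
pooled_beta = np.linalg.lstsq(design, y, rcond=None)[0][1]

print(f"true beta = {true_beta:.2f}, pooled OLS = {pooled_beta:.2f}")
```

The slope absorbs part of the entity effect because the regression cannot tell the two apart without using the panel structure.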


Fixed Effects (FE) Model

Fixed Effects Model

$$Y_{it} = \alpha_i + \beta X_{it} + \varepsilon_{it}$$

Here,

  • $\alpha_i$ = Entity-specific intercept (fixed effect)
  • $\beta$ = Common slope across entities
  • $X_{it}$ = Time-varying covariate
  • $\varepsilon_{it}$ = Idiosyncratic error

How FE Works

Fixed effects are estimated by demeaning — subtracting each entity's mean from all its observations:

$$(Y_{it} - \bar{Y}_i) = \beta(X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

Because $\alpha_i$ does not vary over time, its entity mean is $\alpha_i$ itself, so it cancels in the subtraction. This eliminates all time-invariant unobserved heterogeneity ($\alpha_i$).
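
The demeaning step can be sketched by hand with pandas. This is an illustration on simulated data, not a replacement for a panel library; the data-generating process and the names `demeaned` and `beta_fe` are assumptions introduced here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_entities, n_periods = 200, 6
entity = np.repeat(np.arange(n_entities), n_periods)

alpha = rng.normal(scale=2.0, size=n_entities)[entity]  # entity effect alpha_i
x = 0.5 * alpha + rng.normal(size=entity.size)          # X_it, correlated with alpha_i
y = alpha + 0.8 * x + rng.normal(size=entity.size)      # true beta = 0.8

df = pd.DataFrame({"entity": entity, "x": x, "y": y})

# Within transformation: subtract each entity's own mean from its observations
entity_means = df.groupby("entity")[["x", "y"]].transform("mean")
demeaned = df[["x", "y"]] - entity_means

# OLS on the demeaned data (no intercept: demeaning removes it)
beta_fe = np.sum(demeaned["x"] * demeaned["y"]) / np.sum(demeaned["x"] ** 2)
print(f"within estimate of beta: {beta_fe:.3f}")
```

Even though `x` is correlated with the entity effect, the within estimate should land close to 0.8 because only within-entity variation is used.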


Random Effects (RE) Model

Random Effects Model

$$Y_{it} = \beta X_{it} + (\alpha_i + \varepsilon_{it})$$

Here,

  • $\alpha_i$ = Random effect: $\alpha_i \sim N(0, \sigma_\alpha^2)$
  • $\varepsilon_{it}$ = Idiosyncratic error: $\varepsilon_{it} \sim N(0, \sigma_\varepsilon^2)$

RE Assumption

RE assumes the entity-specific effects $\alpha_i$ are uncorrelated with the regressors $X_{it}$. If this assumption fails, RE estimates are biased and inconsistent.
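
The GLS estimation behind RE can be written as a partial (quasi-) demeaning. The following is a sketch for a balanced panel with $T$ periods per entity; the symbols $\theta$ and $u_{it}$ are notation introduced here for illustration.

$$\theta = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_\alpha^2}}$$

$$Y_{it} - \theta\bar{Y}_i = \beta(X_{it} - \theta\bar{X}_i) + (u_{it} - \theta\bar{u}_i), \qquad u_{it} = \alpha_i + \varepsilon_{it}$$

When $\sigma_\alpha^2 = 0$, $\theta = 0$ and RE reduces to pooled OLS. As $T\sigma_\alpha^2$ grows relative to $\sigma_\varepsilon^2$, $\theta$ approaches 1 and RE approaches the FE (full demeaning) estimator. In practice the variance components are unknown and are replaced by estimates.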


FE vs RE Comparison

| Feature | Fixed Effects | Random Effects |
|---------|--------------|----------------|
| Assumption | Allows $\alpha_i$ to be correlated with $X$ | Requires $\alpha_i$ uncorrelated with $X$ |
| Time-invariant variables | Cannot estimate | Can estimate |
| Efficiency | Less efficient | More efficient |
| Consistency | Consistent even if $\alpha_i$ is correlated with $X$ | Consistent only if uncorrelated |
| Estimation | Demeaning / LSDV | GLS |


Hausman Test

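The Hausman test compares the FE and RE estimates to determine which model is appropriate. Under the null hypothesis $H_0$ that $\alpha_i$ is uncorrelated with $X_{it}$, both estimators are consistent and should give similar results, so a large difference between them is evidence against RE.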

Hausman Test

$$H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'\left(\text{Var}(\hat{\beta}_{FE}) - \text{Var}(\hat{\beta}_{RE})\right)^{-1}(\hat{\beta}_{FE} - \hat{\beta}_{RE})$$

Here,

  • $H$ = Test statistic (asymptotically $\chi^2_k$ under $H_0$, where $k$ is the number of coefficients compared)
  • $\hat{\beta}_{FE}$ = Fixed effects estimates
  • $\hat{\beta}_{RE}$ = Random effects estimates

| Decision | Interpretation |
|---------|---------------|
| Reject $H_0$ | Use Fixed Effects (correlation exists) |
| Fail to reject $H_0$ | Use Random Effects (more efficient) |
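
The statistic above is a short matrix computation. Below is a minimal sketch; the function name `hausman` and the numeric inputs are hypothetical and chosen only for illustration. Note that in finite samples the difference of the two covariance matrices is not guaranteed to be positive definite, in which case the statistic is unreliable.

```python
import numpy as np
from scipy import stats


def hausman(b_fe, b_re, v_fe, v_re):
    """Return the Hausman statistic, its degrees of freedom, and the p-value."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    # diff' (V_FE - V_RE)^{-1} diff, computed via a linear solve
    stat = float(diff @ np.linalg.solve(v_diff, diff))
    dof = diff.size
    p_value = float(stats.chi2.sf(stat, dof))
    return stat, dof, p_value


# Hypothetical estimates and covariance matrices for two slope coefficients
b_fe = [0.80, 1.50]
b_re = [0.95, 1.45]
v_fe = [[0.010, 0.000], [0.000, 0.020]]
v_re = [[0.006, 0.000], [0.000, 0.015]]

stat, dof, p_value = hausman(b_fe, b_re, v_fe, v_re)
print(f"H = {stat:.3f}, df = {dof}, p = {p_value:.4f}")
```

With fitted models, the inputs would be the coefficient vectors and covariance matrices restricted to the coefficients that both models estimate (typically the slopes on time-varying regressors, excluding the intercept).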


First Differences Alternative

First Difference Estimator

$$\Delta Y_{it} = \beta \Delta X_{it} + \Delta \varepsilon_{it}$$

Here,

  • $\Delta$ = First difference operator: $\Delta Y_{it} = Y_{it} - Y_{i,t-1}$

Differencing eliminates the entity-specific effect, just like demeaning.
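
A first-difference estimate can be sketched with a grouped `diff` in pandas. The simulated data and the names `diffs` and `beta_fd` are assumptions for illustration; rows must be sorted by time within each entity before differencing.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n_entities, n_periods = 300, 4
entity = np.repeat(np.arange(n_entities), n_periods)
time = np.tile(np.arange(n_periods), n_entities)

alpha = rng.normal(scale=2.0, size=n_entities)[entity]  # entity effect alpha_i
x = 0.5 * alpha + rng.normal(size=entity.size)          # correlated with alpha_i
y = alpha + 0.8 * x + rng.normal(size=entity.size)      # true beta = 0.8

df = pd.DataFrame({"entity": entity, "time": time, "x": x, "y": y})
df = df.sort_values(["entity", "time"])

# Difference within each entity; the first period of each entity becomes NaN
diffs = df.groupby("entity")[["x", "y"]].diff().dropna()

# OLS of delta-y on delta-x without an intercept
beta_fd = np.sum(diffs["x"] * diffs["y"]) / np.sum(diffs["x"] ** 2)
print(f"first-difference estimate of beta: {beta_fd:.3f}")
```

Each entity loses one observation to differencing, so the regression here uses 300 × 3 rows.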


Time Fixed Effects

Time fixed effects control for factors that change over time but are constant across entities (e.g., economic shocks, policy changes).

Two-Way Fixed Effects

$$Y_{it} = \alpha_i + \lambda_t + \beta X_{it} + \varepsilon_{it}$$

Here,

  • $\alpha_i$ = Entity fixed effects
  • $\lambda_t$ = Time fixed effects
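
For a balanced panel, the two-way model can be estimated by subtracting entity means and time means and adding back the overall mean. The sketch below assumes a balanced panel (the simple formula does not carry over to unbalanced data); the helper `two_way_demean` and the simulated data are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n_entities, n_periods = 200, 8
entity = np.repeat(np.arange(n_entities), n_periods)
time = np.tile(np.arange(n_periods), n_entities)

alpha = rng.normal(scale=2.0, size=n_entities)[entity]  # alpha_i
lam = rng.normal(scale=1.5, size=n_periods)[time]       # lambda_t
x = 0.5 * alpha + 0.5 * lam + rng.normal(size=entity.size)
y = alpha + lam + 0.8 * x + rng.normal(size=entity.size)  # true beta = 0.8

df = pd.DataFrame({"entity": entity, "time": time, "x": x, "y": y})


def two_way_demean(frame, col):
    """Remove entity and time means from one column (balanced panels only)."""
    by_entity = frame.groupby("entity")[col].transform("mean")
    by_time = frame.groupby("time")[col].transform("mean")
    return frame[col] - by_entity - by_time + frame[col].mean()


x_dd = two_way_demean(df, "x")
y_dd = two_way_demean(df, "y")

beta_twfe = np.sum(x_dd * y_dd) / np.sum(x_dd ** 2)
print(f"two-way FE estimate of beta: {beta_twfe:.3f}")
```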

Python Implementation


```python
import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS, RandomEffects, compare

np.random.seed(42)

# Simulate a balanced panel: 100 entities observed over 10 periods
n_entities = 100
n_periods = 10
n = n_entities * n_periods

entity_id = np.repeat(np.arange(n_entities), n_periods)
time_id = np.tile(np.arange(n_periods), n_entities)

# Entity effects (drawn independently of X, so the RE assumption holds here)
alpha = np.random.randn(n_entities) * 2
alpha_panel = alpha[entity_id]

# Covariate and outcome (true slope = 0.8)
X = np.random.randn(n)
Y = 5 + alpha_panel + 0.8 * X + np.random.randn(n) * 1.5

# linearmodels expects a MultiIndex of (entity, time)
df = pd.DataFrame({
    'Y': Y, 'X': X,
    'entity': entity_id,
    'time': time_id
}).set_index(['entity', 'time'])

# Fixed Effects: with from_formula, entity effects are requested in the formula
fe_model = PanelOLS.from_formula('Y ~ 1 + X + EntityEffects', data=df)
fe_result = fe_model.fit()
print("Fixed Effects:")
print(fe_result.summary.tables[1])

# Random Effects
re_model = RandomEffects.from_formula('Y ~ 1 + X', data=df)
re_result = re_model.fit()
print("\nRandom Effects:")
print(re_result.summary.tables[1])

# Compare the two sets of estimates side by side
print("\nModel Comparison:")
print(compare({'FE': fe_result, 'RE': re_result}))
```

Worked Example

Example: Wage Determinants

Panel data: 500 workers over 5 years, examining effects of education and experience on wages.

| Model | Education Coef | Experience Coef | R² (within) |
|-------|---------------|----------------|-------------|
| Pooled OLS | 2.85*** | 0.42*** | 0.28 |
| Fixed Effects | 2.12*** | 0.38*** | 0.15 |
| Random Effects | 2.65*** | 0.40*** | 0.32 |

Hausman test: $\chi^2 = 28.4$, $p < 0.001$ -> Use Fixed Effects

The FE estimate of education (2.12) is smaller than the pooled OLS estimate (2.85), which is consistent with upward omitted variable bias in pooled OLS (e.g., ability correlated with both education and wages).


Key Takeaways

Summary: Panel Data Analysis

  • Panel data tracks the same units over time, enabling control for unobserved heterogeneity

  • Fixed Effects eliminate entity-specific intercepts through demeaning

  • Random Effects assume entity effects are uncorrelated with regressors

  • Hausman test determines whether FE or RE is more appropriate

  • Time fixed effects control for temporal shocks constant across entities

  • FE cannot estimate time-invariant covariates (e.g., gender, race)

  • Always check for serial correlation and heteroscedasticity in panel data
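
One common response to serial correlation and heteroscedasticity is to cluster standard errors by entity. The sketch below compares conventional and cluster-robust standard errors for a within estimator on simulated data with AR(1) errors. The data-generating process, the helper `ar1`, and the omission of small-sample corrections are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n_entities, n_periods = 200, 10
entity = np.repeat(np.arange(n_entities), n_periods)


def ar1(rho):
    """Simulate an AR(1) series per entity, flattened in entity-major order."""
    out = np.zeros((n_entities, n_periods))
    out[:, 0] = rng.normal(size=n_entities)
    for t in range(1, n_periods):
        out[:, t] = rho * out[:, t - 1] + rng.normal(size=n_entities)
    return out.ravel()


x = ar1(0.8)              # persistent regressor
y = 0.8 * x + ar1(0.8)    # true beta = 0.8, serially correlated errors
df = pd.DataFrame({"entity": entity, "x": x, "y": y})

# Within (fixed effects) estimator
dm = df[["x", "y"]] - df.groupby("entity")[["x", "y"]].transform("mean")
sxx = np.sum(dm["x"] ** 2)
beta = np.sum(dm["x"] * dm["y"]) / sxx
resid = dm["y"] - beta * dm["x"]

# Conventional SE: assumes errors are i.i.d. within and across entities
dof = len(df) - n_entities - 1
se_conventional = np.sqrt(np.sum(resid ** 2) / dof / sxx)

# Cluster-robust SE: sum the scores x*e within each entity before squaring
scores = (dm["x"] * resid).groupby(df["entity"]).sum()
se_clustered = np.sqrt(np.sum(scores ** 2)) / sxx

print(f"beta = {beta:.3f}")
print(f"conventional SE = {se_conventional:.4f}, clustered SE = {se_clustered:.4f}")
```

With persistent regressors and errors, the conventional standard error tends to understate the uncertainty, which is why the clustered figure comes out larger in this setup.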

