Difference-in-Differences Estimation

Comparing Changes Over Time Between Treatment and Control

Difference-in-differences estimates causal effects by comparing outcome changes over time between treated and control groups. The parallel trends assumption states that, without treatment, the outcomes of both groups would have moved in parallel, so the control group's change stands in for the change the treated group would have experienced.

  • Policy Evaluation: Assess minimum wage effects on employment across states

  • Healthcare: Evaluate insurance expansions using state-by-state rollout timing

  • Business: Measure marketing campaign impact using geographic staggered launches

The double difference, taken in changes rather than levels, eliminates time-invariant confounders.


Difference-in-Differences (DiD) estimates causal effects by comparing changes in outcomes over time between a treatment group and a control group.

Definition: Difference-in-Differences

A quasi-experimental method that uses the double difference (difference in changes over time) between treated and control units to estimate the causal effect of a policy or intervention.


Basic DiD Formula

DiD Estimator

$$\hat{\tau}_{DiD} = \left(E[Y_{post} \mid D=1] - E[Y_{pre} \mid D=1]\right) - \left(E[Y_{post} \mid D=0] - E[Y_{pre} \mid D=0]\right)$$

Here,

  • $D$ = Treatment group indicator (1 = treated, 0 = control)
  • $Y_{pre}$ = Outcome before treatment
  • $Y_{post}$ = Outcome after treatment
  • $\hat{\tau}_{DiD}$ = Estimated treatment effect
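
The estimator above is just arithmetic on four group-by-period means. A minimal sketch of that calculation follows; the function name `did_from_means`, the column names `d`, `post`, and `y`, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import pandas as pd

def did_from_means(df, outcome="y", group="d", period="post"):
    """Return the 2x2 DiD estimate from the four group-by-period means."""
    means = df.groupby([group, period])[outcome].mean()
    treated_change = means.loc[(1, 1)] - means.loc[(1, 0)]
    control_change = means.loc[(0, 1)] - means.loc[(0, 0)]
    return treated_change - control_change

# Hypothetical toy data: two observations in each of the four cells
toy = pd.DataFrame({
    "d":    [1, 1, 1, 1, 0, 0, 0, 0],
    "post": [0, 0, 1, 1, 0, 0, 1, 1],
    "y":    [10.0, 12.0, 15.0, 17.0, 8.0, 10.0, 10.0, 12.0],
})

# Treated means go 11 -> 16 (change 5); control means go 9 -> 11 (change 2)
estimate = did_from_means(toy)
print(estimate)  # 3.0
```

The treated group's change (5) minus the control group's change (2) gives the estimate of 3.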

Parallel Trends Assumption

Key Assumption

The parallel trends assumption states that, in the absence of treatment, the treatment and control groups would have experienced the same trends over time, so the gap between their untreated outcomes stays constant across periods:

$$E[Y_t(0) \mid D=1] - E[Y_t(0) \mid D=0] = \text{constant}$$

This assumption cannot be tested directly, because the treated group's untreated outcome $Y_t(0)$ is never observed once treatment begins.
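
To see why this assumption carries the identification, the sketch below rewrites the DiD estimand in potential-outcome notation. It additionally assumes no anticipation (the treated group's pre-period outcome equals its untreated outcome), which is not stated in the text above, and it uses $ATT$ as shorthand, introduced here, for the average treatment effect on the treated.

$$
\begin{aligned}
\tau_{DiD} &= \left(E[Y_{post}(1) \mid D=1] - E[Y_{pre}(0) \mid D=1]\right) - \left(E[Y_{post}(0) \mid D=0] - E[Y_{pre}(0) \mid D=0]\right) \\
&= \underbrace{E[Y_{post}(1) - Y_{post}(0) \mid D=1]}_{ATT} + \underbrace{E[Y_{post}(0) - Y_{pre}(0) \mid D=1] - E[Y_{post}(0) - Y_{pre}(0) \mid D=0]}_{= \, 0 \text{ under parallel trends}}
\end{aligned}
$$

The second line adds and subtracts $E[Y_{post}(0) \mid D=1]$. If the untreated trends differ between groups, the second term is nonzero and the DiD estimate is biased by exactly that amount.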

Testing Parallel Trends

Pre-treatment periods can provide indirect evidence:

  • Plot pre-treatment trends for both groups

  • Test whether pre-treatment differences are stable

  • If pre-treatment trends diverge, the assumption is questionable
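
The checks listed above can be sketched with a few lines of pandas and NumPy. This is a minimal illustration on simulated data in which parallel trends hold by construction; the panel size, coefficients, and variable names are assumptions made for the example, and a flat pre-period gap is only indirect evidence about the post-period.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical panel: 200 units, 5 pre-treatment periods, common trend of 0.3
n_units, n_pre = 200, 5
treated = np.repeat(rng.integers(0, 2, n_units), n_pre)
time = np.tile(np.arange(n_pre), n_units)
y = 5 + 1.5 * treated + 0.3 * time + rng.normal(0, 1, n_units * n_pre)
pre = pd.DataFrame({"y": y, "treated": treated, "time": time})

# Gap between treated and control means in each pre-treatment period
means = pre.groupby(["time", "treated"])["y"].mean().unstack("treated")
gap = means[1] - means[0]

# Slope of the gap over time; a value near zero is consistent with parallel pre-trends
gap_slope = np.polyfit(gap.index.to_numpy(), gap.to_numpy(), deg=1)[0]

print(gap.round(2))
print(f"slope of pre-period gap: {gap_slope:.3f}")
```

Here the gap should hover around the simulated level difference of 1.5 with a slope close to zero. A clearly nonzero slope in real data would suggest the groups were already diverging before treatment.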


Regression Specification

DiD Regression

$$Y_{it} = \alpha + \beta D_i + \gamma \text{Post}_t + \tau (D_i \times \text{Post}_t) + \varepsilon_{it}$$

Here,

  • $\beta$ = Pre-existing difference between groups
  • $\gamma$ = Common time trend
  • $\tau$ = DiD estimate (causal effect)
  • $D_i \times \text{Post}_t$ = Interaction term (the DiD estimator)
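
The link between this regression and the four-means formula can be made explicit by writing out the conditional mean of each cell. The sketch assumes the error term has mean zero within every group-period cell.

$$
\begin{aligned}
E[Y \mid D=0, \text{Post}=0] &= \alpha \\
E[Y \mid D=1, \text{Post}=0] &= \alpha + \beta \\
E[Y \mid D=0, \text{Post}=1] &= \alpha + \gamma \\
E[Y \mid D=1, \text{Post}=1] &= \alpha + \beta + \gamma + \tau \\
\underbrace{\left[(\alpha + \beta + \gamma + \tau) - (\alpha + \beta)\right]}_{\text{treated change}} - \underbrace{\left[(\alpha + \gamma) - \alpha\right]}_{\text{control change}} &= (\gamma + \tau) - \gamma = \tau
\end{aligned}
$$

The coefficient on the interaction term is therefore the same double difference as the estimator defined from group means.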

Event Study Design

An event study extends DiD by estimating dynamic effects, that is, how the treatment effect evolves over time.

Event Study

$$Y_{it} = \alpha_i + \lambda_t + \sum_{k \neq -1} \delta_k \mathbb{1}(t - t_i^* = k) + \varepsilon_{it}$$

Here,

  • $\alpha_i$ = Unit fixed effects
  • $\lambda_t$ = Time fixed effects
  • $\delta_k$ = Effect at event time $k$ (relative to $k = -1$)
  • $t_i^*$ = Time when unit $i$ receives treatment

Event Study Plot

Plot $\delta_k$ against $k$. Pre-treatment $\delta_k$ ($k < -1$) should be close to zero and statistically insignificant, which is consistent with parallel trends but does not prove them. Post-treatment $\delta_k$ ($k \geq 0$) trace out the dynamic treatment effect.


Staggered Adoption

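In many applications, units adopt treatment at different times. Recent research shows that the two-way fixed effects (TWFE) estimator can be biased under staggered adoption when treatment effects differ across cohorts or change over time, because already-treated units end up serving as controls for later-treated ones.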

| Method | Notes |
|--------|-------|
| TWFE | Negative weights; can give wrong sign |
| Callaway-Sant'Anna | Robust estimator for staggered DiD |
| Sun-Abraham | Interaction-weighted estimator |
| de Chaisemartin-D'Haultfœuille | Robust DiD with heterogeneous effects |
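
The idea shared by the robust estimators is to avoid using already-treated units as controls. The sketch below illustrates that idea in its simplest form: a separate 2x2 DiD for each adoption cohort and period, compared against never-treated units, with the period just before adoption as the base. It is a simplified illustration, not the implementation of any method in the table; the function `att_gt`, the column names, and the noise-free effect sizes are hypothetical.

```python
import pandas as pd

# Hypothetical noise-free panel: cohorts first treated in period 3 or 5; g = 0 means never treated
periods = range(7)
rows = []
for unit, g in enumerate([3, 3, 5, 5, 0, 0]):
    for t in periods:
        on = g > 0 and t >= g
        # Assumed effects: cohort 3 grows by 1 per treated period, cohort 5 is a flat 2
        effect = (t - g + 1) * 1.0 if (on and g == 3) else (2.0 if on else 0.0)
        rows.append({"unit": unit, "g": g, "t": t, "y": unit + 0.5 * t + effect})
panel = pd.DataFrame(rows)

def att_gt(panel, g, t):
    """DiD for cohort g at period t, using never-treated units and base period g - 1."""
    mean_y = panel.groupby(["g", "t"])["y"].mean()
    cohort_change = mean_y.loc[(g, t)] - mean_y.loc[(g, g - 1)]
    never_change = mean_y.loc[(0, t)] - mean_y.loc[(0, g - 1)]
    return cohort_change - never_change

estimates = {(g, t): att_gt(panel, g, t) for g in (3, 5) for t in periods if t >= g}
for (g, t), value in estimates.items():
    print(f"cohort {g}, period {t}: {value:.1f}")
```

Because the toy data contain no noise, each cohort-period estimate equals the effect built into the simulation (1, 2, 3, 4 for cohort 3 and 2, 2 for cohort 5). With real data these estimates would be noisy and would usually be averaged into summary measures.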


Difference-in-Differences with Covariates

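DiD with Covariates

$$Y_{it} = \alpha + \beta D_i + \gamma \text{Post}_t + \tau (D_i \times \text{Post}_t) + X_{it}'\theta + \varepsilon_{it}$$

Here,

  • $X_{it}$ = Time-varying covariates
  • $\theta$ = Covariate coefficients

Including covariates can improve precision, and it is not required for identification when parallel trends hold unconditionally. Covariates that are themselves affected by the treatment should be left out, because controlling for them biases $\tau$.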
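
The precision gain can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are a simulated repeated cross-section, the covariate `x` is hypothetical and independent of treatment by construction, the true effect is set to 2.0, and the helper `fit_ols` is introduced here for illustration. The regression includes group, period, and interaction terms alongside the covariate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
d = rng.integers(0, 2, n)       # treatment group indicator
post = rng.integers(0, 2, n)    # post-period indicator
x = rng.normal(0, 1, n)         # hypothetical covariate, unrelated to treatment
y = 5 + 1.5 * d + 0.3 * post + 2.0 * d * post + 3.0 * x + rng.normal(0, 1, n)

def fit_ols(columns, y):
    """Return OLS coefficients and the residual standard deviation."""
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, resid.std(ddof=X.shape[1])

ones = np.ones(n)
coef_plain, sd_plain = fit_ols([ones, d, post, d * post], y)
coef_cov, sd_cov = fit_ols([ones, d, post, d * post, x], y)

# Index 3 is the interaction term, i.e. the DiD estimate
print(f"tau without covariate: {coef_plain[3]:.2f} (residual sd {sd_plain:.2f})")
print(f"tau with covariate:    {coef_cov[3]:.2f} (residual sd {sd_cov:.2f})")
```

Both regressions target the same effect, but the covariate absorbs outcome variation and shrinks the residual standard deviation, which in turn tightens the standard error of the interaction coefficient.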


Python Implementation


```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

np.random.seed(42)

# Simulate a balanced panel: 200 units observed over 10 periods
n_units = 200
n_periods = 10
unit = np.repeat(np.arange(n_units), n_periods)
treated = np.repeat(np.random.binomial(1, 0.5, n_units), n_periods)
time = np.tile(np.arange(n_periods), n_units)
post = (time >= 5).astype(int)   # treatment starts in period 5
treatment = treated * post       # interaction D_i x Post_t

# True effect = 2.0
Y = 5 + 1.5*treated + 0.3*time + 2.0*treatment + np.random.randn(n_units*n_periods)*2

df = pd.DataFrame({'Y': Y, 'unit': unit, 'treated': treated, 'time': time,
                   'post': post, 'treatment': treatment})

# Cluster standard errors by unit, since each unit is observed repeatedly
cluster = {'cov_type': 'cluster', 'cov_kwds': {'groups': df['unit']}}

# Basic DiD: the coefficient on `treatment` is the DiD estimate
model = smf.ols('Y ~ treated + post + treatment', data=df).fit(**cluster)
print("Basic DiD:")
print(model.summary().tables[1])

# Event study: estimate all event-time effects in ONE regression,
# omitting k = -1 as the reference period (event time = time - 5, so k runs -5..4)
event_ks = [k for k in range(-5, 5) if k != -1]
names = []
for k in event_ks:
    name = f"ev_m{-k}" if k < 0 else f"ev_p{k}"   # formula-safe column names
    df[name] = ((df['time'] - 5 == k) & (df['treated'] == 1)).astype(int)
    names.append(name)

formula = 'Y ~ C(time) + treated + ' + ' + '.join(names)
event_model = smf.ols(formula, data=df).fit(**cluster)

event_df = pd.DataFrame({'k': event_ks,
                         'coef': event_model.params[names].values,
                         'se': event_model.bse[names].values})
event_df['ci_lower'] = event_df['coef'] - 1.96*event_df['se']
event_df['ci_upper'] = event_df['coef'] + 1.96*event_df['se']

plt.figure(figsize=(8, 5))
plt.errorbar(event_df['k'], event_df['coef'],
             yerr=1.96*event_df['se'], fmt='o-', capsize=4)
plt.axhline(y=0, color='gray', linestyle='--', alpha=0.5)
plt.axvline(x=0, color='red', linestyle='--', alpha=0.5)
plt.xlabel('Event Time (relative to treatment)')
plt.ylabel('Treatment Effect')
plt.title('Event Study Plot')
plt.show()
```

Worked Example

Example: Minimum Wage Increase

Evaluating the effect of a minimum wage increase on average hourly wages in State A (treated) vs. State B (control):

| Period | State A | State B | Difference |
|--------|---------|---------|------------|
| Pre (before) | $10.20 | $9.80 | $0.40 |
| Post (after) | $10.85 | $9.95 | $0.90 |

DiD = (10.85 - 10.20) - (9.95 - 9.80) = 0.65 - 0.15 = $0.50

Under parallel trends, the minimum wage increase raised wages by approximately $0.50/hour above what would have happened without the policy. The same number appears as the change in the between-state gap: 0.90 - 0.40 = $0.50.


Key Takeaways

Summary: Difference-in-Differences

  • DiD uses a double difference to estimate causal effects

  • The key assumption is parallel trends: the groups would have had similar trends without treatment

  • Event study plots provide indirect evidence for parallel trends

  • Staggered adoption requires special estimators (not standard TWFE)

  • Include pre-treatment periods in the event study to validate the design

  • Always check for pre-trends; if they diverge, the design is compromised

  • DiD is widely used for policy evaluation and natural experiments

