
Regression Discontinuity Design


Exploiting Threshold Rules for Causal Estimation

Regression discontinuity exploits cutoff-based treatment assignment. Units just above and just below the threshold are nearly identical, so the jump in outcomes at the cutoff reveals the causal effect.

  • Education: Estimate scholarship effects using GPA eligibility cutoffs

  • Policy Evaluation: Assess income-based benefit thresholds on employment outcomes

  • Healthcare: Evaluate age-based screening programs at eligibility boundaries

Provided units cannot precisely manipulate the running variable, treatment assignment is as good as random in a small neighborhood of the cutoff, so the discontinuity in outcomes identifies the causal effect there.


Regression discontinuity (RD) exploits a threshold rule that assigns treatment based on whether a running variable crosses a cutoff. Units just above and just below the cutoff are assumed to be comparable.

Definition: Regression Discontinuity Design

A quasi-experimental method where treatment is assigned based on a running variable $X$ relative to a cutoff $c$. The causal effect is estimated as the discontinuity in the outcome at the cutoff.


Sharp RD


$$T_i = \mathbb{1}(X_i \geq c)$$

Here,

  • $T_i$ = Treatment indicator for unit $i$
  • $X_i$ = Running variable (forcing variable)
  • $c$ = Cutoff value
  • $\mathbb{1}(\cdot)$ = Indicator function

Treatment is deterministically assigned: every unit at or above the cutoff is treated, and every unit below it is not.
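The assignment rule can be sketched in a few lines. The scores and the cutoff of 70 below are hypothetical values chosen for illustration, and `assign_sharp` is a helper name introduced here. Note that a unit exactly at the cutoff counts as treated, matching the "greater than or equal" in the rule.

```python
def assign_sharp(x, cutoff):
    """Sharp RD rule: treated (1) if x is at or above the cutoff, else 0."""
    return int(x >= cutoff)

# Hypothetical running-variable values around a cutoff of 70
scores = [64.5, 69.9, 70.0, 70.1, 82.0]
treatment = [assign_sharp(s, cutoff=70.0) for s in scores]
print(treatment)
```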


Fuzzy RD


$$P(T_i = 1 \mid X_i) \text{ is discontinuous at } c$$

Here,

  • $P(T_i = 1 \mid X_i)$ = Probability of treatment given running variable

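Treatment assignment is probabilistic, but the probability of treatment jumps at the cutoff. The effect is estimated with a local instrumental-variables approach: the indicator for crossing the cutoff serves as the instrument for treatment, so the jump in the outcome is scaled by the jump in treatment probability.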
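As a rough illustration of the IV-like idea, the sketch below divides the estimated jump in the outcome by the estimated jump in treatment take-up, with both jumps taken from simple linear fits inside a window around the cutoff. Everything here is an assumption made for illustration: the simulated data (take-up of 20% below and 80% above the cutoff, true effect 2.0), the uniform kernel, the bandwidth `h=0.3`, and the helper names `fit_at_cutoff`, `jump`, and `fuzzy_rd`. It is not how any particular library implements fuzzy RD.

```python
import random

def fit_at_cutoff(xs, ys, cutoff, h, side):
    """Fit a line to points within h of the cutoff on one side (uniform kernel)
    and return the fitted value at the cutoff."""
    if side == "right":
        pts = [(x - cutoff, y) for x, y in zip(xs, ys) if 0 <= x - cutoff <= h]
    else:
        pts = [(x - cutoff, y) for x, y in zip(xs, ys) if -h <= x - cutoff < 0]
    n = len(pts)
    mean_d = sum(d for d, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pts)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in pts)
    slope = sxy / sxx
    return mean_y - slope * mean_d  # intercept = fitted value at x = cutoff

def jump(xs, ys, cutoff, h):
    """Estimated discontinuity at the cutoff: right-side fit minus left-side fit."""
    return fit_at_cutoff(xs, ys, cutoff, h, "right") - fit_at_cutoff(xs, ys, cutoff, h, "left")

def fuzzy_rd(xs, ts, ys, cutoff, h):
    """Wald-type ratio: jump in the outcome divided by jump in treatment take-up."""
    return jump(xs, ys, cutoff, h) / jump(xs, ts, cutoff, h)

# Simulated fuzzy design (hypothetical numbers): crossing the cutoff raises the
# probability of treatment from 0.2 to 0.8; the true treatment effect is 2.0.
random.seed(0)
n = 5000
xs = [random.uniform(-1, 1) for _ in range(n)]
ts = [int(random.random() < (0.8 if x >= 0 else 0.2)) for x in xs]
ys = [2.0 * t + 3.0 * x + random.gauss(0, 0.5) for x, t in zip(xs, ts)]

# Should land near the simulated effect of 2.0, up to sampling noise
print(fuzzy_rd(xs, ts, ys, cutoff=0.0, h=0.3))
```

In the sharp design the denominator equals 1, so the same ratio reduces to the plain jump in the outcome.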


Key Assumption

Continuity Assumption

In the absence of treatment, the conditional expectation $E[Y_i(0) \mid X_i]$ would be continuous at the cutoff $c$. In other words, units just above and just below the cutoff are comparable on average in everything except treatment.
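To see why continuity delivers identification in the sharp design, the short derivation below may help. It additionally assumes that $E[Y_i(1) \mid X_i]$ is continuous at $c$, which is not stated above but is the usual companion condition. The symbol $\tau_{RD}$ (without a hat) is introduced here for the effect at the cutoff.

$$
\begin{aligned}
\lim_{x \downarrow c} E[Y_i \mid X_i = x] - \lim_{x \uparrow c} E[Y_i \mid X_i = x]
&= \lim_{x \downarrow c} E[Y_i(1) \mid X_i = x] - \lim_{x \uparrow c} E[Y_i(0) \mid X_i = x] \\
&= E[Y_i(1) \mid X_i = c] - E[Y_i(0) \mid X_i = c] \\
&= E[Y_i(1) - Y_i(0) \mid X_i = c] = \tau_{RD}
\end{aligned}
$$

The first line uses the sharp assignment rule (observed outcomes are $Y_i(1)$ above the cutoff and $Y_i(0)$ below it); the second uses continuity of both conditional expectations at $c$.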

| Violation | Consequence |
|-----------|-------------|
| Manipulation of running variable | Bias: units sort around the cutoff |
| Discrete running variable | Binning required; may introduce bias |
| Covariate imbalance at cutoff | Suggests manipulation or confounding |


Local Estimation

RD Estimator (Sharp)

$$\hat{\tau}_{RD} = \lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]$$

Here,

  • $\hat{\tau}_{RD}$ = Local Average Treatment Effect at the cutoff

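In practice, fit a local polynomial (typically linear) regression separately on each side of the cutoff, using only observations close to it, and take the difference between the two fitted values at the cutoff.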
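A minimal sketch of that procedure is given below: a weighted linear fit on each side of the cutoff, with the estimate taken as the gap between the two intercepts. The triangular kernel, the fixed bandwidth `h=0.3`, the simulated data, and the helper names `local_linear_at_cutoff` and `sharp_rd` are all assumptions for illustration; dedicated packages additionally handle bandwidth choice and bias-corrected inference.

```python
import random

def local_linear_at_cutoff(xs, ys, cutoff, h, side):
    """Weighted least-squares line on one side of the cutoff, evaluated at the cutoff.

    Weights come from a triangular kernel: 1 at the cutoff, falling to 0 at distance h.
    """
    sw = swd = swy = swdd = swdy = 0.0
    for x, y in zip(xs, ys):
        d = x - cutoff
        on_side = d >= 0 if side == "right" else d < 0
        w = max(0.0, 1.0 - abs(d) / h)
        if not on_side or w == 0.0:
            continue
        sw += w
        swd += w * d
        swy += w * y
        swdd += w * d * d
        swdy += w * d * y
    slope = (sw * swdy - swd * swy) / (sw * swdd - swd ** 2)
    return (swy - slope * swd) / sw  # intercept = fitted value at the cutoff

def sharp_rd(xs, ys, cutoff, h):
    """Sharp RD estimate: right-side fit minus left-side fit at the cutoff."""
    right = local_linear_at_cutoff(xs, ys, cutoff, h, "right")
    left = local_linear_at_cutoff(xs, ys, cutoff, h, "left")
    return right - left

# Simulated sharp design (hypothetical numbers): true jump of 2.0 at a cutoff of 0
random.seed(42)
xs = [random.uniform(-1, 1) for _ in range(1000)]
ys = [2.0 * (x >= 0) + 3.0 * x + random.gauss(0, 0.5) for x in xs]

# Should land near the simulated jump of 2.0, up to sampling noise
print(sharp_rd(xs, ys, cutoff=0.0, h=0.3))
```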


Bandwidth Selection

The bandwidth $h$ determines the window around the cutoff used for estimation.

Bandwidth Trade-off

  • Small $h$: Less bias but higher variance (fewer observations)

  • Large $h$: Lower variance but more bias (includes distant observations)

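Data-driven bandwidth selectors (e.g., Imbens-Kalyanaraman; Calonico-Cattaneo-Titiunik, CCT) balance this trade-off, typically by minimizing an approximation to the mean squared error of the estimate.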
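The trade-off can be made concrete with a small simulation. The sketch below repeatedly draws data from a hypothetical curved outcome function with a true jump of 2.0 and reports the mean and spread of a simple local linear estimate at three bandwidths. The outcome function, sample sizes, bandwidths, and helper names (`rd_estimate`, `mean_outcome`) are assumptions for illustration; this is not an optimal-bandwidth procedure.

```python
import random
import statistics

def rd_estimate(xs, ys, cutoff, h):
    """Sharp RD estimate: difference of one-sided linear-fit intercepts (uniform kernel)."""
    def intercept(pts):
        n = len(pts)
        mean_d = sum(d for d, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        sxx = sum((d - mean_d) ** 2 for d, _ in pts)
        sxy = sum((d - mean_d) * (y - mean_y) for d, y in pts)
        return mean_y - (sxy / sxx) * mean_d
    right = [(x - cutoff, y) for x, y in zip(xs, ys) if 0 <= x - cutoff <= h]
    left = [(x - cutoff, y) for x, y in zip(xs, ys) if -h <= x - cutoff < 0]
    return intercept(right) - intercept(left)

def mean_outcome(x):
    """Hypothetical outcome curve: jump of 2.0 at 0, with curvature away from the cutoff."""
    return 2.0 * (x >= 0) + 3.0 * x + 4.0 * x * abs(x)

random.seed(1)
for h in (0.1, 0.3, 0.9):
    estimates = []
    for _ in range(200):  # 200 simulated datasets per bandwidth
        xs = [random.uniform(-1, 1) for _ in range(500)]
        ys = [mean_outcome(x) + random.gauss(0, 0.5) for x in xs]
        estimates.append(rd_estimate(xs, ys, 0.0, h))
    print(f"h={h}: mean={statistics.mean(estimates):.2f}, "
          f"sd={statistics.stdev(estimates):.2f}")
```

With this setup, the narrow window should give estimates centered close to 2.0 but widely scattered, while the wide window should give tightly clustered estimates that are pulled away from 2.0 by the curvature.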


Covariate Balance Check

Before interpreting results, check that baseline covariates are continuous at the cutoff:

Covariate Balance

$$\lim_{x \downarrow c} E[Z_i \mid X_i = x] = \lim_{x \uparrow c} E[Z_i \mid X_i = x]$$

Here,

  • $Z_i$ = Baseline covariates

If covariates show discontinuities at the cutoff, the identifying assumption may be violated.


McCrary Density Test

Tests for manipulation of the running variable by checking whether its density is discontinuous at the cutoff. If people can precisely control their running variable, they may sort around the cutoff, which shows up as a jump in the density.

Manipulation Test

A significant McCrary test suggests the running variable is manipulated, which undermines the RD design. This is a critical validity check.
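The intuition behind a density test can be sketched with a much cruder check: count observations in a narrow window just below and just above the cutoff and ask whether the split is compatible with 50/50. This is a simplified stand-in, not McCrary's estimator (which smooths a histogram with local linear regression on each side), and it assumes the density is roughly flat inside the window. The function name `density_imbalance_test`, the window width, and the simulated sorting behavior are all hypothetical.

```python
import math
import random

def density_imbalance_test(xs, cutoff, h):
    """Crude sorting check: compare counts within h below vs. above the cutoff.

    If the density is smooth (and roughly flat) near the cutoff, each observation
    in the window falls above the cutoff with probability close to 0.5, so a
    normal approximation to the binomial gives a two-sided p-value.
    Returns (n_below, n_above, p_value).
    """
    below = sum(1 for x in xs if cutoff - h <= x < cutoff)
    above = sum(1 for x in xs if cutoff <= x <= cutoff + h)
    n = below + above
    if n == 0:
        raise ValueError("no observations inside the window")
    z = (above - n / 2) / math.sqrt(n / 4)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return below, above, p_value

random.seed(2)
honest = [random.uniform(-1, 1) for _ in range(4000)]
# Hypothetical sorting: every unit within 0.05 below the cutoff nudges itself just above it
manipulated = [x + 0.05 if -0.05 <= x < 0 else x for x in honest]

print("no sorting:", density_imbalance_test(honest, cutoff=0.0, h=0.1))
print("sorting:   ", density_imbalance_test(manipulated, cutoff=0.0, h=0.1))
```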


Python Implementation


```python
import numpy as np
import matplotlib.pyplot as plt
from rdrobust import rdrobust
from rddensity import rddensity  # shipped as its own package, not part of rdrobust

np.random.seed(42)

# Simulate sharp RD data
n = 1000
X = np.random.uniform(-1, 1, n)  # Running variable
T = (X >= 0).astype(int)  # Treatment: 1 at or above the cutoff of 0
Y = 2.0 * T + 3.0 * X + np.random.randn(n) * 0.5  # True jump at the cutoff is 2.0

# Main RD estimate
result = rdrobust(Y, X, c=0)
print("RD Estimate:")
print(result)

# Covariate balance check: Z is unrelated to treatment, so no jump is expected
Z = np.random.randn(n)  # Covariate
print("\nCovariate balance at cutoff:")
rd_z = rdrobust(Z, X, c=0)
print(rd_z)

# Density (manipulation) test. rddensity implements the Cattaneo-Jansson-Ma
# local polynomial density test, a refinement of McCrary's original test.
density = rddensity(X, c=0)
print("\nDensity test:")
print(density)

# Plot
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Outcome
axes[0].scatter(X, Y, alpha=0.3, s=10)
axes[0].axvline(x=0, color='red', linestyle='--')
axes[0].set_title('Outcome vs Running Variable')
axes[0].set_xlabel('Running Variable')
axes[0].set_ylabel('Outcome')

# Density
axes[1].hist(X, bins=50, edgecolor='black')
axes[1].axvline(x=0, color='red', linestyle='--')
axes[1].set_title('Density of Running Variable')
axes[1].set_xlabel('Running Variable')
axes[1].set_ylabel('Count')

plt.tight_layout()
plt.show()
```

Worked Example

Example: Scholarship Eligibility

Students scoring 70 or above on an entrance exam receive a scholarship ($T$). The outcome is GPA at graduation.

| Bandwidth | Estimate | SE | 95% CI |
|-----------|----------|------|--------------|
| 5 points | 0.45 | 0.12 | [0.21, 0.69] |
| 10 points | 0.38 | 0.09 | [0.20, 0.56] |
| 20 points | 0.32 | 0.07 | [0.18, 0.46] |

McCrary test: p = 0.42, so there is no evidence of manipulation.

Covariate balance: All p-values > 0.30, so there is no evidence of a discontinuity in baseline characteristics at the cutoff.

Conclusion: The scholarship has a positive effect on GPA (roughly 0.3 to 0.45 grade points, depending on bandwidth) for students near the eligibility threshold.
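The intervals in the table are consistent with the usual normal approximation, estimate ± 1.96 × SE. That formula is an assumption here, since the lesson does not say how the intervals were computed; the sketch below simply re-derives them from the reported estimates and standard errors.

```python
# (bandwidth label, point estimate, standard error) as reported in the table
rows = [("5 points", 0.45, 0.12), ("10 points", 0.38, 0.09), ("20 points", 0.32, 0.07)]
Z_95 = 1.96  # normal critical value for a 95% interval (assumed)

intervals = {}
for label, estimate, se in rows:
    lower = estimate - Z_95 * se
    upper = estimate + Z_95 * se
    intervals[label] = (round(lower, 2), round(upper, 2))
    print(f"{label}: [{lower:.2f}, {upper:.2f}]")
```

All three intervals exclude zero and overlap one another, which is the pattern one hopes to see when checking sensitivity to the bandwidth.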


Key Takeaways

Summary: Regression Discontinuity

  • RD identifies causal effects using a threshold rule for treatment assignment

  • Sharp RD: Treatment is deterministic based on the cutoff

  • Fuzzy RD: Treatment probability jumps at the cutoff (like IV)

  • The key assumption is continuity of potential outcomes at the cutoff

  • Use covariate balance checks and McCrary test to validate the design

  • Bandwidth selection balances the bias-variance trade-off

  • RD provides local treatment effects at the cutoff only

