
Propensity Score Matching


Creating Pseudo-Experiments From Observational Data

Propensity score matching pairs treated and control units with similar treatment probabilities, mimicking randomization. It balances observed covariates, reducing selection bias in observational studies.

  • Healthcare: Compare outcomes for patients who chose different treatments

  • Education: Evaluate school choice effects by matching applicants with similar backgrounds

  • Marketing: Assess campaign effectiveness when exposure was not randomly assigned

Matching on the propensity score reduces many dimensions of confounding to a single number.


Propensity score matching (PSM) creates pseudo-experimental conditions in observational studies by matching treated and control units with similar probabilities of receiving treatment.

Definition: Propensity Score

The probability that a unit receives treatment, given its observed covariates:

$$e(X_i) = P(T_i = 1 \mid X_i)$$

Here,

  • $X_i$ = Observed covariates for unit i
  • $T_i$ = Treatment indicator
  • $e(X_i)$ = Propensity score

Key Theorem

Rosenbaum-Rubin Theorem (1983)

If treatment assignment is unconfounded given $X$, it is also unconfounded given the propensity score $e(X)$:

$$T \perp (Y(0), Y(1)) \mid X \implies T \perp (Y(0), Y(1)) \mid e(X)$$

Matching on the scalar propensity score therefore balances all observed covariates in expectation, so units with the same score can be compared even when their individual covariate values differ.
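The theorem follows from a short conditioning argument, sketched here in the lesson's notation. The first step uses the tower property (valid because $e(X)$ is a function of $X$), the second uses unconfoundedness given $X$, and the third is the definition of the propensity score. This is an outline of the standard argument rather than a full measure-theoretic proof.

```latex
\begin{aligned}
P\big(T = 1 \mid Y(0), Y(1), e(X)\big)
  &= E\big[\, P(T = 1 \mid Y(0), Y(1), X) \;\big|\; Y(0), Y(1), e(X) \,\big] \\
  &= E\big[\, P(T = 1 \mid X) \;\big|\; Y(0), Y(1), e(X) \,\big] \\
  &= E\big[\, e(X) \;\big|\; Y(0), Y(1), e(X) \,\big] \\
  &= e(X)
\end{aligned}
```

Because the result does not depend on $Y(0)$ or $Y(1)$, treatment is independent of the potential outcomes once $e(X)$ is held fixed.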


Assumptions

| Assumption | Meaning | Testable? |
|-----------|---------|-----------|
| Unconfoundedness | No unobserved confounders | No |
| Overlap | $0 < e(X) < 1$ for all $X$ | Yes |
| SUTVA | No interference between units | Partially |
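Overlap is the one assumption that can be checked directly from estimated scores. The following is a minimal sketch of one simple rule (min/max trimming), assuming propensity scores have already been estimated. The function name `common_support` and the toy scores are made up for illustration, and other trimming rules exist.

```python
import numpy as np

def common_support(ps, treated):
    """Return a boolean mask of units inside the region of common support.

    The region is [largest group minimum, smallest group maximum] of the
    propensity scores. This is one simple trimming rule among several.
    """
    ps = np.asarray(ps, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    lower = max(ps[treated].min(), ps[~treated].min())
    upper = min(ps[treated].max(), ps[~treated].max())
    return (ps >= lower) & (ps <= upper)

# Toy propensity scores (illustrative values, not from the lesson)
ps = np.array([0.05, 0.20, 0.40, 0.60, 0.30, 0.55, 0.80, 0.95])
treated = np.array([0, 0, 0, 0, 1, 1, 1, 1])

mask = common_support(ps, treated)
print("Units kept:", int(mask.sum()), "of", len(ps))
print("Dropped scores:", ps[~mask])
```

Units outside the mask have no comparable counterpart in the other group, so any effect estimate for them would rest on extrapolation.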


Estimation Steps

| Step | Action |
|------|--------|
| 1 | Estimate propensity score (logistic regression, ML) |
| 2 | Check overlap (common support) |
| 3 | Match treated to control units |
| 4 | Assess covariate balance |
| 5 | Estimate treatment effect |
| 6 | Conduct sensitivity analysis |


Matching Methods

| Method | Description |
|--------|------------|
| Nearest neighbor | Match each treated unit to the closest control on the propensity score |
| Caliper | Only match if propensity scores are within the caliper distance |
| Full matching | Create matched sets that partition all units |
| Kernel matching | Weight all controls by a kernel function of the propensity score distance |

Caliper Matching

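A common caliper is 0.2 × SD(propensity score), often computed on the logit of the score. Caliper matching reduces the bias from poor matches but may leave some treated units unmatched, which changes the population the estimate describes.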
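A caliper can be layered on top of nearest-neighbor matching, as in the sketch below. It assumes 1:1 matching with replacement on the raw propensity score and uses the 0.2 × SD rule stated above. The function name `caliper_match` and the toy scores are hypothetical.

```python
import numpy as np

def caliper_match(ps_treated, ps_control, caliper):
    """1:1 nearest-neighbor matching with replacement, within a caliper.

    Returns, for each treated unit, the index of its matched control,
    or -1 when no control lies within the caliper.
    """
    ps_treated = np.asarray(ps_treated, dtype=float)
    ps_control = np.asarray(ps_control, dtype=float)
    # Pairwise absolute distances: rows = treated, columns = controls
    dist = np.abs(ps_treated[:, None] - ps_control[None, :])
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(len(ps_treated)), nearest]
    return np.where(nearest_dist <= caliper, nearest, -1)

# Toy propensity scores (illustrative values)
ps_treated = np.array([0.30, 0.55, 0.90])
ps_control = np.array([0.10, 0.32, 0.52, 0.60])

# Caliper = 0.2 x SD of the pooled propensity scores
caliper = 0.2 * np.std(np.concatenate([ps_treated, ps_control]), ddof=1)
matches = caliper_match(ps_treated, ps_control, caliper)

print(f"Caliper width: {caliper:.3f}")
print("Matched control index per treated unit:", matches)
print("Unmatched treated units:", int((matches == -1).sum()))
```

In this toy data the treated unit with score 0.90 has no control within the caliper, so it is dropped rather than paired with a poor match.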


Covariate Balance

After matching, check that covariates are balanced between groups.

Standardized Mean Difference

$$\text{SMD} = \frac{\bar{X}_T - \bar{X}_C}{\sqrt{(s_T^2 + s_C^2)/2}}$$

Here,

  • $\bar{X}_T$ = Mean in treated group
  • $\bar{X}_C$ = Mean in control group
  • $s_T, s_C$ = Standard deviations

| SMD | Interpretation |
|-----|---------------|
| < 0.1 | Excellent balance |
| 0.1 - 0.2 | Adequate balance |
| > 0.2 | Poor balance (matching failed) |
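The formula and thresholds above can be wrapped in a small helper. This is a sketch, assuming sample variances (`ddof=1`) and the absolute value of the SMD when applying the thresholds. The names `smd` and `balance_label` and the toy values are illustrative.

```python
import numpy as np

def smd(x_treated, x_control):
    """Standardized mean difference using the pooled standard deviation."""
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    pooled_sd = np.sqrt((x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2)
    return (x_treated.mean() - x_control.mean()) / pooled_sd

def balance_label(value):
    """Map |SMD| to the interpretation used in the table above."""
    magnitude = abs(value)
    if magnitude < 0.1:
        return "excellent"
    if magnitude <= 0.2:
        return "adequate"
    return "poor"

# Toy covariate values (illustrative)
x_treated = [2, 4, 6, 8]
x_control = [1, 3, 5, 7]

value = smd(x_treated, x_control)
label = balance_label(value)
print(f"SMD = {value:.3f} ({label} balance)")
```

Because the SMD divides by a pooled standard deviation, it does not depend on sample size, unlike a t-test p-value.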


ATT Estimation

ATT via IPW

$$\hat{\tau}_{ATT} = \frac{\sum_{i:T_i=1} Y_i}{n_T} - \frac{\sum_{j:T_j=0} w_j Y_j}{\sum_{j:T_j=0} w_j}$$

Here,

  • $w_j$ = Weight for control unit j (its matching weight, or $e(X_j) / (1 - e(X_j))$ under inverse probability weighting)
  • $n_T$ = Number of treated units
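The estimator above is the treated mean minus a weighted control mean. The sketch below evaluates it with inverse probability weights for the ATT, where each control is weighted by its odds of treatment, e(X) / (1 - e(X)). The function name `att_weighted` and all numbers are made up for illustration, and the propensity scores are assumed to be already estimated.

```python
import numpy as np

def att_weighted(y, t, w):
    """ATT as treated mean minus weighted control mean.

    w is only used for control units; matching weights or
    IPW odds weights can both be plugged in.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=bool)
    w = np.asarray(w, dtype=float)
    treated_mean = y[t].mean()
    control_mean = np.sum(w[~t] * y[~t]) / np.sum(w[~t])
    return treated_mean - control_mean

# Toy data (illustrative values)
y = np.array([10.0, 12.0, 5.0, 7.0, 9.0])
t = np.array([1, 1, 0, 0, 0])
ps = np.array([0.60, 0.80, 0.20, 0.50, 0.75])

# IPW weights for the ATT: odds of treatment for each unit
w = ps / (1 - ps)
att = att_weighted(y, t, w)
print(f"ATT estimate: {att:.3f}")
```

Controls that resemble treated units (high propensity score) receive large weights, so the weighted control group mimics the covariate distribution of the treated group.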

Sensitivity Analysis

Unconfoundedness is Untestable

PSM assumes no unmeasured confounders. Sensitivity analysis (e.g., Rosenbaum bounds) assesses how strong an unmeasured confounder would need to be to change the conclusion.

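Rosenbaum's Gamma

For two units i and j with the same observed covariates $X$ but possibly different unobserved $U$, $\Gamma$ bounds the ratio of their treatment odds:

$$\frac{1}{\Gamma} \le \frac{P(T_i=1 \mid X, U_i) / P(T_i=0 \mid X, U_i)}{P(T_j=1 \mid X, U_j) / P(T_j=0 \mid X, U_j)} \le \Gamma$$

Here,

  • $\Gamma$ = Bound on the degree of hidden bias
  • $\Gamma = 1$ = No hidden bias
  • Large critical $\Gamma$ (the value at which the conclusion changes) = Result is robust to hidden bias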
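One way to see how Γ is used in practice is the sign test on matched pairs. Under no treatment effect and hidden bias of at most Γ, the chance that the treated unit has the higher outcome in a pair is at most Γ / (1 + Γ), which gives a worst-case p-value. The sketch below assumes 1:1 matched pairs with no ties and a one-sided test. The function name and the pair counts are hypothetical.

```python
from math import comb

def sign_test_upper_pvalue(n_pairs, n_positive, gamma):
    """Worst-case one-sided sign-test p-value under hidden bias gamma.

    n_positive is the number of pairs in which the treated unit has
    the higher outcome. gamma = 1 gives the usual sign test.
    """
    p_plus = gamma / (1 + gamma)
    return sum(
        comb(n_pairs, k) * p_plus**k * (1 - p_plus)**(n_pairs - k)
        for k in range(n_positive, n_pairs + 1)
    )

# Hypothetical study: treated outcome is higher in 17 of 20 matched pairs
n_pairs, n_positive = 20, 17

p_gamma_1 = sign_test_upper_pvalue(n_pairs, n_positive, gamma=1.0)
p_gamma_2 = sign_test_upper_pvalue(n_pairs, n_positive, gamma=2.0)

for gamma in (1.0, 1.5, 2.0, 2.5):
    p = sign_test_upper_pvalue(n_pairs, n_positive, gamma)
    print(f"Gamma = {gamma:.1f}: worst-case p-value = {p:.4f}")
```

The critical Γ is the smallest value at which the worst-case p-value crosses the chosen significance level. The larger it is, the stronger an unmeasured confounder would have to be to explain away the result.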

Python Implementation


import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

np.random.seed(42)

# Simulate observational data
n = 1000
X1 = np.random.randn(n)
X2 = np.random.binomial(1, 0.5, n)

# True propensity depends on X1 and X2, so treatment is confounded
propensity = 1 / (1 + np.exp(-(0.5*X1 + 0.3*X2)))
T = np.random.binomial(1, propensity)

# Outcome: constant treatment effect, so true ATE = true ATT = 2.0
Y0 = 3*X1 + 2*X2 + np.random.randn(n)
Y1 = Y0 + 2.0
Y = T * Y1 + (1 - T) * Y0

df = pd.DataFrame({'Y': Y, 'T': T, 'X1': X1, 'X2': X2})

# Estimate propensity score
# (scikit-learn's LogisticRegression applies L2 regularization by default)
logit = LogisticRegression().fit(df[['X1', 'X2']], df['T'])
df['ps'] = logit.predict_proba(df[['X1', 'X2']])[:, 1]

# Match each treated unit to its nearest control on the propensity score
# (1:1 with replacement, so a control can be reused)
treated_idx = df[df['T'] == 1].index
control_idx = df[df['T'] == 0].index

nn = NearestNeighbors(n_neighbors=1, metric='euclidean')
nn.fit(df.loc[control_idx, ['ps']])
distances, matches = nn.kneighbors(df.loc[treated_idx, ['ps']])

# Row labels of the matched controls, one per treated unit
matched_control = control_idx[matches.flatten()]

# Balance check: SMD before and after matching
treated = df.loc[treated_idx]
controls = df.loc[control_idx]
matched = df.loc[matched_control]

for col in ['X1', 'X2']:
    before_smd = abs(treated[col].mean() - controls[col].mean()) / \
                 np.sqrt((treated[col].var() + controls[col].var()) / 2)
    after_smd = abs(treated[col].mean() - matched[col].mean()) / \
                np.sqrt((treated[col].var() + matched[col].var()) / 2)
    print(f"{col}: Before SMD={before_smd:.3f}, After SMD={after_smd:.3f}")

# ATT estimate: mean treated outcome minus mean matched-control outcome
att = treated['Y'].mean() - matched['Y'].mean()
print(f"\nATT estimate: {att:.3f} (true: 2.0)")

Worked Example

Example: Effect of Smoking on Birth Weight

Observational study comparing birth weights of smokers vs non-smokers:

Before matching:

| Covariate | SMD |
|-----------|-----|
| Age | 0.35 |
| Income | 0.52 |
| Education | 0.28 |

After matching:

| Covariate | SMD |
|-----------|-----|
| Age | 0.04 |
| Income | 0.08 |
| Education | 0.05 |

ATT estimate: -245 grams (95% CI: [-310, -180])

Among smokers, smoking is estimated to reduce birth weight by approximately 245 grams. Rosenbaum's Γ = 2.5: the conclusion holds unless an unmeasured confounder changes the odds of treatment by a factor of more than 2.5.


Key Takeaways

Summary: Propensity Score Matching

  • PSM creates pseudo-experimental conditions from observational data

  • The propensity score $e(X) = P(T=1 \mid X)$ summarizes all observed confounders

  • Match on the propensity score, not raw covariates

  • Check covariate balance (SMD < 0.1) after matching

  • Overlap assumption: propensity scores must overlap between groups

  • Unconfoundedness is untestable: use sensitivity analysis (Rosenbaum bounds)

  • Common caliper: 0.2 × SD(propensity score)

