
Meta-Analysis


Combining Evidence Across Studies for Stronger Conclusions

Meta-analysis statistically synthesizes results from multiple studies to produce a single summary estimate with greater precision than any individual study. Fixed-effect models assume one common true effect, while random-effects models allow for heterogeneity across studies.

  • Clinical medicine — Combine trial results to establish definitive treatment guidelines
  • Education — Synthesize intervention studies to identify effective teaching strategies
  • Environmental policy — Aggregate epidemiological evidence for regulatory decision-making

Meta-analysis transforms a forest of individual studies into a clear, quantitative conclusion.


Definition: Meta-Analysis

A meta-analysis is a statistical procedure that combines results from multiple independent studies to produce a single summary estimate of an effect size. It quantifies the overall evidence, assesses consistency across studies, and identifies sources of heterogeneity.

"The goal of meta-analysis is not to produce a single number, but to understand the structure of evidence across studies." — Higgins & Green, Cochrane Handbook


Why Meta-Analysis?

Individual studies may be:

  • Underpowered: Too small to detect the true effect
  • Conflicting: Some studies find significance, others do not
  • Context-specific: Results vary by population, intervention, or setting

Meta-analysis addresses these issues by:

  1. Increasing statistical power through pooled sample sizes
  2. Quantifying heterogeneity across studies
  3. Identifying moderators that explain variability
  4. Providing a transparent, replicable summary of evidence

Effect Size Measures

Before pooling, each study's result must be converted to a common metric.

Standardized Mean Difference (Cohen's d)

$$d = \frac{\bar{X}_T - \bar{X}_R}{S_{\text{pooled}}}$$

where $S_{\text{pooled}} = \sqrt{\frac{(n_T - 1)S_T^2 + (n_R - 1)S_R^2}{n_T + n_R - 2}}$

Hedges' g (Bias-Corrected)

$$g = d \cdot \left(1 - \frac{3}{4(n_T + n_R) - 9}\right)$$
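The two formulas above can be sketched in a few lines of Python. The function names and the summary statistics below are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def cohens_d(mean_t, mean_r, sd_t, sd_r, n_t, n_r):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_r - 1) * sd_r**2) / (n_t + n_r - 2)
    return (mean_t - mean_r) / math.sqrt(pooled_var)

def hedges_g(d, n_t, n_r):
    """Apply the small-sample correction factor to Cohen's d."""
    correction = 1 - 3 / (4 * (n_t + n_r) - 9)
    return d * correction

# Hypothetical summary statistics for one study (illustrative numbers only)
d = cohens_d(mean_t=105.0, mean_r=100.0, sd_t=10.0, sd_r=10.0, n_t=20, n_r=20)
g = hedges_g(d, n_t=20, n_r=20)
print(f"d = {d:.3f}, g = {g:.3f}")
```

With 20 participants per group the correction shrinks d by about 2%; the adjustment becomes negligible as sample sizes grow.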

For binary outcomes:

Odds Ratio

$$\text{OR} = \frac{a \cdot d}{b \cdot c}$$

where $a, b, c, d$ are the cell frequencies in a 2×2 table:

|           | Event | No Event |
|-----------|-------|----------|
| Treatment | a     | b        |
| Control   | c     | d        |
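In practice odds ratios are usually pooled on the log scale, using the large-sample variance $1/a + 1/b + 1/c + 1/d$. That step is not spelled out in this lesson, so the sketch below should be read as one common convention rather than the only option. The function name and the counts are hypothetical, and the code assumes no cell is zero.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Return the log odds ratio and its large-sample variance for a 2x2 table.

    a, b: events / non-events in the treatment group
    c, d: events / non-events in the control group
    Assumes all four cells are non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

# Hypothetical counts (illustrative only)
log_or, var = log_odds_ratio(a=15, b=85, c=30, d=70)
se = math.sqrt(var)

# 95% confidence interval, back-transformed to the odds-ratio scale
ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
print(f"OR = {math.exp(log_or):.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The pair (`log_or`, `var`) is what an inverse-variance pooling routine would take as the effect size and sampling variance for this study.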

Fixed-Effect Model

Definition: Fixed-Effect Model

The fixed-effect model assumes all studies share a common true effect size $\theta$. Variation across studies is due solely to sampling error. Study $i$'s observed effect is:

$$Y_i = \theta + \varepsilon_i, \quad \varepsilon_i \sim N(0, v_i)$$

The pooled estimate is the inverse-variance weighted mean:

Fixed-Effect Pooled Estimate

$$\hat{\theta}_{FE} = \frac{\sum_{i=1}^{K} w_i Y_i}{\sum_{i=1}^{K} w_i}, \quad w_i = \frac{1}{v_i}$$

The variance of the pooled estimate:

$$\text{Var}(\hat{\theta}_{FE}) = \frac{1}{\sum_{i=1}^{K} w_i}$$

Cochran's Q

Definition: Cochran's Q

Cochran's Q tests whether the observed effects are consistent with a common true effect:

$$Q = \sum_{i=1}^{K} w_i (Y_i - \hat{\theta}_{FE})^2$$

Under the fixed-effect null hypothesis, $Q \sim \chi^2_{K-1}$.

Interpretation of Q

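A significant Q ($p < 0.10$) suggests heterogeneity beyond sampling error, indicating that a random-effects model may be more appropriate. The threshold is conventionally 0.10 rather than 0.05 because the test has low power when the number of studies is small.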


Random-Effects Model

Definition: Random-Effects Model

The random-effects model assumes true effect sizes vary across studies:

$$Y_i = \mu + u_i + \varepsilon_i$$

where $u_i \sim N(0, \tau^2)$ is the study-specific deviation from the overall mean $\mu$, and $\varepsilon_i \sim N(0, v_i)$ is the sampling error.

The total variance of study ii is:

Random-Effects Variance

$$\text{Var}(Y_i) = v_i + \tau^2 = \sigma_i^2$$

The pooled estimate:

Random-Effects Pooled Estimate

$$\hat{\theta}_{RE} = \frac{\sum_{i=1}^{K} w_i^* Y_i}{\sum_{i=1}^{K} w_i^*}, \quad w_i^* = \frac{1}{v_i + \hat{\tau}^2}$$

Key Difference

In random-effects models, studies with larger variance (smaller samples) receive relatively more weight than under the fixed-effect model, since the added $\tau^2$ term dilutes the precision advantage of large studies.
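A small numeric sketch makes the weight shift concrete. The sampling variances and the value of $\tau^2$ below are hypothetical, picked so that one study is much larger than the other two.

```python
import numpy as np

# Hypothetical sampling variances: one large study and two small ones
variances = np.array([0.01, 0.10, 0.10])
tau2 = 0.05  # assumed between-study variance, for illustration only

w_fe = 1.0 / variances            # fixed-effect weights
w_re = 1.0 / (variances + tau2)   # random-effects weights

# Express each study's weight as a percentage of the total
pct_fe = 100 * w_fe / w_fe.sum()
pct_re = 100 * w_re / w_re.sum()

for i, (f, r) in enumerate(zip(pct_fe, pct_re), start=1):
    print(f"Study {i}: fixed-effect {f:.1f}%, random-effects {r:.1f}%")
```

Under these assumptions the large study's share of the total weight drops from roughly 83% to roughly 56%, while each small study's share rises.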


Estimating τ²

DerSimonian-Laird Method

Definition: DerSimonian-Laird Estimator

The most common method for estimating $\tau^2$:

$$\hat{\tau}^2_{DL} = \frac{Q - (K - 1)}{\sum_{i=1}^{K} w_i - \frac{\sum_{i=1}^{K} w_i^2}{\sum_{i=1}^{K} w_i}}$$

where $w_i = 1/v_i$ are the fixed-effect weights. If $Q \leq K - 1$, then $\hat{\tau}^2_{DL} = 0$.

Other Estimators

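| Method       | Description                      | Property                      |
|--------------|----------------------------------|-------------------------------|
| REML         | Restricted maximum likelihood    | Less biased, often preferred  |
| Paule-Mandel | Iterative matching of expected Q | Good small-sample properties  |
| Hedges       | Moment-based                     | Simple closed-form            |
| PL           | Profile likelihood               | Better coverage in simulation |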
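As an illustration of the "iterative matching of expected Q" idea behind Paule-Mandel, the sketch below searches for the $\tau^2$ at which a generalized Q statistic (computed with weights $1/(v_i + \tau^2)$) equals its expected value $K - 1$. The function names and data are hypothetical, and root-finding with `scipy.optimize.brentq` is just one convenient way to do the matching, not a reference implementation.

```python
import numpy as np
from scipy.optimize import brentq

def generalized_q(tau2, effects, variances):
    """Q statistic computed with random-effects weights 1 / (v_i + tau2)."""
    w = 1.0 / (variances + tau2)
    theta = np.sum(w * effects) / np.sum(w)
    return np.sum(w * (effects - theta) ** 2)

def paule_mandel_tau2(effects, variances):
    """Find tau2 >= 0 such that the generalized Q equals K - 1."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    df = len(effects) - 1

    def excess(t2):
        return generalized_q(t2, effects, variances) - df

    # No excess heterogeneity at tau2 = 0: the estimate is truncated at zero
    if excess(0.0) <= 0:
        return 0.0

    # Q decreases toward 0 as tau2 grows, so expand until the root is bracketed
    upper = 1.0
    while excess(upper) > 0:
        upper *= 2.0
    return brentq(excess, 0.0, upper)

# Hypothetical effect sizes and sampling variances (illustrative only)
effects = np.array([0.10, 0.30, 0.35, 0.65, 0.45, 0.15])
variances = np.array([0.03, 0.03, 0.05, 0.01, 0.05, 0.02])
tau2_pm = paule_mandel_tau2(effects, variances)
print(f"Paule-Mandel tau^2: {tau2_pm:.4f}")
```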

Heterogeneity Measures

I² Statistic

Definition: I² Statistic

$$I^2 = \max\left(0, \frac{Q - (K - 1)}{Q}\right) \times 100\%$$

$I^2$ represents the percentage of total variability in the observed effects that is due to heterogeneity rather than sampling error; it is set to 0% when $Q < K - 1$. Rough benchmarks:

  • $I^2 = 0\%$: No observed heterogeneity
  • $I^2 = 25\%$: Low heterogeneity
  • $I^2 = 50\%$: Moderate heterogeneity
  • $I^2 = 75\%$: High heterogeneity
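As a worked example with hypothetical numbers (not taken from any real meta-analysis), suppose $K = 11$ studies give $Q = 20$:

```latex
I^2 = \frac{Q - (K - 1)}{Q} \times 100\%
    = \frac{20 - 10}{20} \times 100\%
    = 50\%
```

If the same 11 studies instead gave $Q = 8$, the ratio would be negative and $I^2$ would be reported as 0%.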

τ² and τ

  • $\tau^2$ is the between-study variance (absolute heterogeneity)
  • $\tau = \sqrt{\tau^2}$ is the standard deviation of true effects

Prediction Interval

Prediction Interval

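A 95% prediction interval for the true effect in a new study:

$$\hat{\mu} \pm t_{K-2,\,0.025} \cdot \sqrt{\hat{\tau}^2 + \text{Var}(\hat{\mu})}$$

where $\hat{\mu} = \hat{\theta}_{RE}$ is the random-effects pooled estimate and $t_{K-2,\,0.025}$ is the upper 2.5% critical value of a $t$ distribution with $K - 2$ degrees of freedom, so at least three studies are required. This is wider than the confidence interval and quantifies the range of effects we might expect in a future study.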


Publication Bias

Definition: Publication Bias

Publication bias occurs when the likelihood of a study being published depends on its results. Studies with statistically significant or positive findings are more likely to be published, inflating the meta-analytic estimate.

Funnel Plot

Definition: Funnel Plot

A scatter plot of effect sizes (x-axis) against a measure of precision (y-axis, typically standard error). In the absence of bias, studies should form a symmetric inverted funnel shape centered on the pooled estimate.

Egger's Test

Definition: Egger's Test

Egger's test regresses the standardized effect sizes on precision:

$$\frac{Y_i}{\sqrt{v_i}} = \beta_0 + \beta_1 \cdot \frac{1}{\sqrt{v_i}} + \varepsilon_i$$

A significant intercept ($\beta_0 \neq 0$) at $p < 0.10$ suggests small-study effects (asymmetry).
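The regression above can be sketched with ordinary least squares and a t-test on the intercept. The function name `eggers_test` and the data are hypothetical. The example is constructed so that smaller studies report larger effects, which is the pattern the test is meant to flag.

```python
import numpy as np
from scipy import stats

def eggers_test(effects, variances):
    """Egger's regression: standardized effect on precision, testing the intercept."""
    effects = np.asarray(effects, dtype=float)
    se = np.sqrt(np.asarray(variances, dtype=float))
    y = effects / se                      # standardized effects Y_i / sqrt(v_i)
    x = 1.0 / se                          # precision 1 / sqrt(v_i)
    X = np.column_stack([np.ones_like(x), x])

    # Ordinary least squares fit and residual variance
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - 2
    sigma2 = resid @ resid / dof
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)

    # Two-sided t-test on the intercept beta_0
    t_stat = beta[0] / np.sqrt(cov_beta[0, 0])
    p_value = 2 * stats.t.sf(abs(t_stat), dof)
    return {'intercept': beta[0], 'slope': beta[1], 't': t_stat, 'p': p_value}

# Hypothetical data (illustrative only): smaller studies report larger effects
effects = np.array([0.80, 0.65, 0.55, 0.40, 0.35, 0.30])
std_errors = np.array([0.40, 0.30, 0.25, 0.15, 0.10, 0.08])
result = eggers_test(effects, std_errors**2)
print(f"Intercept: {result['intercept']:.3f}, p = {result['p']:.4f}")
```

With only a handful of studies the test has little power, so a non-significant intercept should not be read as evidence that bias is absent.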

Trim-and-Fill

Definition: Trim-and-Fill

The trim-and-fill method (Duval & Tweedie, 2000) imputes missing studies to restore funnel plot symmetry:

  1. Estimate the number of missing studies ($m$) by trimming extreme values
  2. Impute $m$ studies on the sparse side of the funnel
  3. Recompute the pooled estimate including imputed studies

Limitations

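Trim-and-fill assumes asymmetry is solely due to publication bias. Asymmetry may also result from true heterogeneity, small-study effects, or methodological differences, so treat the adjusted estimate as a sensitivity analysis and compare it with other diagnostics such as the funnel plot and Egger's test.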


Moderator Analysis (Meta-Regression)

Definition: Meta-Regression

Meta-regression extends meta-analysis by modeling the relationship between study-level covariates and effect sizes:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip} + u_i + \varepsilon_i$$

where $X_{ij}$ are study-level characteristics (e.g., dose, duration, population age).

The proportion of heterogeneity explained:

$$R^2 = \frac{\hat{\tau}^2_{\text{null}} - \hat{\tau}^2_{\text{model}}}{\hat{\tau}^2_{\text{null}}}$$
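A minimal sketch of the fitting step, assuming the residual between-study variance $\tau^2$ has already been estimated and is passed in as `tau2`: the coefficients then follow from weighted least squares with weights $1/(v_i + \tau^2)$. The function names, the dose covariate, and all numbers are hypothetical. A full analysis would estimate $\tau^2$ jointly with the coefficients (for example by REML).

```python
import numpy as np

def meta_regression(effects, variances, covariates, tau2=0.0):
    """Weighted least squares meta-regression with weights 1 / (v_i + tau2)."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    # Design matrix: intercept column plus study-level covariates
    X = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    W = np.diag(1.0 / (v + tau2))
    cov_beta = np.linalg.inv(X.T @ W @ X)
    beta = cov_beta @ X.T @ W @ y
    return beta, np.sqrt(np.diag(cov_beta))

def heterogeneity_r2(tau2_null, tau2_model):
    """Proportion of between-study variance explained by the moderators."""
    return (tau2_null - tau2_model) / tau2_null

# Hypothetical data: effect sizes, sampling variances, and dose (mg) per study
effects = [0.21, 0.28, 0.43, 0.47, 0.62]
variances = [0.02, 0.03, 0.02, 0.04, 0.03]
dose = [10, 20, 30, 40, 50]

beta, se = meta_regression(effects, variances, dose, tau2=0.01)
print(f"Intercept: {beta[0]:.3f} (SE {se[0]:.3f})")
print(f"Dose slope: {beta[1]:.4f} (SE {se[1]:.4f})")

# Assumed tau^2 estimates without and with the moderator (illustrative only)
print(f"R^2: {heterogeneity_r2(tau2_null=0.04, tau2_model=0.01):.2f}")
```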

Network Meta-Analysis

Definition: Network Meta-Analysis

Network meta-analysis (NMA), also called mixed treatment comparisons, allows indirect comparisons of multiple treatments using a network of randomized trials. If Treatment A vs B and Treatment B vs C have been studied, NMA can estimate A vs C without direct head-to-head evidence.

Consistency Assumption

Definition: Consistency

The consistency assumption states that direct and indirect evidence agree:

$$\theta_{AC} = \theta_{AB} + \theta_{BC}$$

Inconsistency is assessed using node-splitting models or design-by-treatment interaction models.
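Under the consistency equation, a simple indirect estimate of A vs C adds the two direct estimates. If the A vs B and B vs C estimates come from independent sets of trials, their variances add as well. The sketch below shows that calculation for a single indirect path, with a hypothetical function name and made-up inputs. It is not a full network model.

```python
import math

def indirect_comparison(theta_ab, var_ab, theta_bc, var_bc):
    """Indirect A-vs-C estimate from independent A-vs-B and B-vs-C estimates."""
    theta_ac = theta_ab + theta_bc
    var_ac = var_ab + var_bc          # variances add for independent estimates
    se_ac = math.sqrt(var_ac)
    ci = (theta_ac - 1.96 * se_ac, theta_ac + 1.96 * se_ac)
    return theta_ac, se_ac, ci

# Hypothetical pooled estimates on a common scale (e.g. log odds ratios)
theta_ac, se_ac, ci_ac = indirect_comparison(
    theta_ab=-0.30, var_ab=0.02, theta_bc=-0.20, var_bc=0.03
)
print(f"Indirect A vs C: {theta_ac:.2f} (SE {se_ac:.3f})")
print(f"95% CI: ({ci_ac[0]:.3f}, {ci_ac[1]:.3f})")
```

Because the variances add, the indirect estimate is always less precise than either of the direct estimates it is built from.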

SUCRA (Surface Under the Cumulative Ranking)

SUCRA

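$$\text{SUCRA}_j = \frac{\sum_{k=1}^{K-1} \text{cum}_{jk}}{K-1} \times 100\%$$

where $K$ here denotes the number of treatments in the network and $\text{cum}_{jk}$ is the cumulative probability that treatment $j$ is ranked among the best $k$ treatments.

SUCRA ranges from 0% (worst) to 100% (best), summarizing the probability that a treatment is ranked among the best.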
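SUCRA is usually computed from a matrix of ranking probabilities: for each treatment, average the cumulative probabilities of being ranked among the best $k$ treatments, for $k$ from 1 up to one less than the number of treatments. The sketch below assumes such a matrix is already available (for example from posterior samples of an NMA). The function name and the probabilities are hypothetical.

```python
import numpy as np

def sucra(rank_probs):
    """SUCRA (in %) for each treatment.

    rank_probs[j, k] is the probability that treatment j has rank k + 1,
    where rank 1 is best. Each row must sum to 1.
    """
    P = np.asarray(rank_probs, dtype=float)
    n_treatments = P.shape[0]
    cumulative = np.cumsum(P, axis=1)
    # Average the cumulative probabilities over ranks 1 .. n_treatments - 1
    return cumulative[:, :-1].sum(axis=1) / (n_treatments - 1) * 100

# Hypothetical ranking probabilities for treatments A, B, C (illustrative only)
rank_probs = np.array([
    [0.7, 0.2, 0.1],   # A: most likely ranked first
    [0.2, 0.5, 0.3],   # B
    [0.1, 0.3, 0.6],   # C: most likely ranked last
])

for name, value in zip("ABC", sucra(rank_probs)):
    print(f"Treatment {name}: SUCRA = {value:.0f}%")
```

A treatment that is certain to rank first gets 100% and one that is certain to rank last gets 0%.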


Python Implementation

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# --- Fixed-effect meta-analysis (inverse-variance method) ---
def fixed_effect_meta_analysis(effects, variances):
    """
    Fixed-effect meta-analysis using inverse-variance weighting.
    
    Parameters:
        effects: array of effect sizes (one per study)
        variances: array of sampling variances
    Returns:
        dict with pooled estimate, CI, Q statistic, I²
    """
    effects = np.asarray(effects)
    variances = np.asarray(variances)
    weights = 1.0 / variances
    K = len(effects)
    
    theta_hat = np.sum(weights * effects) / np.sum(weights)
    var_theta = 1.0 / np.sum(weights)
    se_theta = np.sqrt(var_theta)
    
    # Cochran's Q
    Q = np.sum(weights * (effects - theta_hat)**2)
    df = K - 1
    p_Q = stats.chi2.sf(Q, df)  # upper-tail probability; more accurate than 1 - cdf for tiny p
    
    # I²
    I2 = max(0, (Q - df) / Q * 100) if Q > 0 else 0
    
    # 95% CI
    ci_lower = theta_hat - 1.96 * se_theta
    ci_upper = theta_hat + 1.96 * se_theta
    
    return {
        'theta': theta_hat, 'se': se_theta,
        'ci_95': (ci_lower, ci_upper),
        'Q': Q, 'df': df, 'p_Q': p_Q, 'I2': I2,
        'weights': weights / np.sum(weights) * 100
    }

# --- Random-effects meta-analysis (DerSimonian-Laird) ---
def random_effects_meta_analysis(effects, variances):
    """
    Random-effects meta-analysis using DerSimonian-Laird.
    """
    effects = np.asarray(effects)
    variances = np.asarray(variances)
    K = len(effects)
    
    # Fixed-effect for Q calculation
    w_fe = 1.0 / variances
    theta_fe = np.sum(w_fe * effects) / np.sum(w_fe)
    Q = np.sum(w_fe * (effects - theta_fe)**2)
    
    # DerSimonian-Laird tau²
    C = np.sum(w_fe) - np.sum(w_fe**2) / np.sum(w_fe)
    tau2 = max(0, (Q - (K - 1)) / C)
    
    # Random-effects weights
    w_re = 1.0 / (variances + tau2)
    theta_re = np.sum(w_re * effects) / np.sum(w_re)
    var_re = 1.0 / np.sum(w_re)
    se_re = np.sqrt(var_re)
    
    tau = np.sqrt(tau2)
    
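    # Prediction interval (t distribution with K - 2 df, so it needs K >= 3)
    if K >= 3:
        t_crit = stats.t.ppf(0.975, K - 2)
        half_width = t_crit * np.sqrt(tau2 + var_re)
        pred_lower, pred_upper = theta_re - half_width, theta_re + half_width
    else:
        pred_lower = pred_upper = np.nan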
    
    return {
        'theta': theta_re, 'se': se_re,
        'ci_95': (theta_re - 1.96*se_re, theta_re + 1.96*se_re),
        'tau2': tau2, 'tau': tau,
        'Q': Q, 'df': K-1, 'I2': max(0, (Q-(K-1))/Q*100) if Q > 0 else 0,
        'pred_interval': (pred_lower, pred_upper),
        'weights': w_re / np.sum(w_re) * 100
    }

# --- Funnel plot ---
def funnel_plot(effects, se, labels=None):
    """Create a funnel plot for publication bias assessment."""
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.scatter(effects, se, s=50, c='steelblue', edgecolors='black', alpha=0.7)
    
    theta_pooled = np.average(effects, weights=1/np.array(se)**2)
    ax.axvline(x=theta_pooled, color='red', linestyle='--', label='Pooled estimate')
    
    # Pseudo 95% CI funnel
    se_range = np.linspace(0.01, max(se)*1.1, 100)
    for z in [1.96, -1.96]:
        ax.plot(theta_pooled + z * se_range, se_range, 'gray', linestyle=':', alpha=0.5)
    
    ax.set_xlabel('Effect Size')
    ax.set_ylabel('Standard Error')
    ax.set_title('Funnel Plot')
    ax.invert_yaxis()
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.savefig('funnel_plot.png', dpi=150)
    plt.show()

# --- Example: 10 studies on drug efficacy ---
np.random.seed(42)
K = 10
true_effect = 0.40  # True standardized mean difference
true_tau = 0.15

# Generate study effects
true_effects = np.random.normal(true_effect, true_tau, K)
sample_sizes = np.random.randint(30, 200, K)
variances = 2 / sample_sizes + true_tau**2 * np.random.uniform(0.5, 1.5, K)
observed_effects = np.random.normal(true_effects, np.sqrt(variances))

print("=== Fixed-Effect Meta-Analysis ===")
fe = fixed_effect_meta_analysis(observed_effects, variances)
print(f"Pooled effect: {fe['theta']:.3f} (SE: {fe['se']:.3f})")
print(f"95% CI: ({fe['ci_95'][0]:.3f}, {fe['ci_95'][1]:.3f})")
print(f"Cochran's Q: {fe['Q']:.2f}, df={fe['df']}, p={fe['p_Q']:.4f}")
print(f"I²: {fe['I2']:.1f}%")

print("\n=== Random-Effects Meta-Analysis (DerSimonian-Laird) ===")
re = random_effects_meta_analysis(observed_effects, variances)
print(f"Pooled effect: {re['theta']:.3f} (SE: {re['se']:.3f})")
print(f"95% CI: ({re['ci_95'][0]:.3f}, {re['ci_95'][1]:.3f})")
print(f"τ²: {re['tau2']:.4f}, τ: {re['tau']:.3f}")
print(f"I²: {re['I2']:.1f}%")
print(f"Prediction interval: ({re['pred_interval'][0]:.3f}, {re['pred_interval'][1]:.3f})")

# Forest plot
fig, ax = plt.subplots(figsize=(10, 7))
y_positions = np.arange(K, 0, -1)
for i in range(K):
    ci_lower = observed_effects[i] - 1.96 * np.sqrt(variances[i])
    ci_upper = observed_effects[i] + 1.96 * np.sqrt(variances[i])
    ax.plot([ci_lower, ci_upper], [y_positions[i], y_positions[i]], 'b-', linewidth=1.5)
    # Label only the first marker so the legend shows one entry for all studies
    ax.plot(observed_effects[i], y_positions[i], 'bs', markersize=8,
            label='Individual studies' if i == 0 else None)

# Pooled estimate
ax.plot(re['theta'], 0, 'rD', markersize=10, label='Random-effects pooled')
ax.plot([re['ci_95'][0], re['ci_95'][1]], [0, 0], 'r-', linewidth=2)
ax.axvline(x=0, color='gray', linestyle='-', linewidth=0.5)

ax.set_yticks(y_positions.tolist() + [0])
ax.set_yticklabels([f'Study {i+1}' for i in range(K)] + ['Pooled'])
ax.set_xlabel('Effect Size (Standardized Mean Difference)')
ax.set_title('Forest Plot')
ax.legend(loc='lower right')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('forest_plot.png', dpi=150)
plt.show()

# Funnel plot
funnel_plot(observed_effects, np.sqrt(variances))

Key Takeaways

Summary: Meta-Analysis

  1. Fixed-effect models assume a common true effect; random-effects models allow true effects to vary across studies.
  2. DerSimonian-Laird is the most common method for estimating the between-study variance $\tau^2$.
  3. Cochran's Q tests for heterogeneity; $I^2$ quantifies the percentage of variability due to heterogeneity.
  4. Publication bias can be assessed via funnel plots, Egger's test, and trim-and-fill methods.
  5. Meta-regression identifies study-level moderators that explain heterogeneity.
  6. Network meta-analysis enables indirect comparisons across multiple treatments using a connected evidence network.
  7. Always report prediction intervals alongside confidence intervals to convey the range of effects in future studies.
