Bootstrap Confidence Intervals — Resampling-Based Inference
Foundations of Statistics
When Theory Meets Reality
Bootstrap methods use resampling to construct confidence intervals without parametric distributional assumptions, making them applicable to a very wide range of statistics. They often provide usable inference when closed-form standard errors are unavailable or traditional formulas break down.
- Finance — Constructing confidence intervals for risk measures like VaR
- Ecology — Estimating uncertainty for biodiversity indices
- Machine Learning — Quantifying variability in model performance metrics
The bootstrap lets the data speak for itself.
What Are Bootstrap Confidence Intervals?
Definition: Bootstrap Confidence Intervals
Bootstrap confidence intervals use resampling — repeatedly drawing samples with replacement from the observed data — to estimate the sampling distribution of a statistic without relying on distributional assumptions.
Core Concept
Bootstrap Resampling
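$$\hat{\theta}^{*b} = s(\mathbf{x}^{*b}), \qquad b = 1, \dots, B$$
Here,
- $\hat{\theta}^{*b}$ = Bootstrap estimate from the $b$-th resample
- $s(\cdot)$ = Statistic function applied to the resample
- $\mathbf{x}^{*b}$ = The $b$-th bootstrap sample (drawn with replacement)
- $B$ = Total number of bootstrap resamples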
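As a minimal sketch of the resampling step, the function below draws $B$ resamples with replacement and applies the statistic to each one. The name `bootstrap_estimates`, the seed, and the toy data are illustrative assumptions, not part of the original material.

```python
import numpy as np

def bootstrap_estimates(data, statistic, B=2000, seed=0):
    """Return B bootstrap replicates of statistic(data).

    Hypothetical helper for illustration; `statistic` must accept a 1-D array.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    estimates = np.empty(B)
    for b in range(B):
        resample = rng.choice(data, size=n, replace=True)  # the b-th bootstrap sample
        estimates[b] = statistic(resample)                 # the b-th bootstrap estimate
    return estimates

# Toy data, made up for this example
sample = np.array([4.1, 5.3, 2.8, 6.0, 3.7, 4.9, 5.5, 3.2])
boot = bootstrap_estimates(sample, np.median)
print(boot.shape, boot.std(ddof=1))
```

The spread of `boot` serves as a bootstrap standard error, and its quantiles feed the interval methods described in this section.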
Theoretical Foundation
Theorem: Bootstrap Principle (Efron, 1979)
The empirical distribution $\hat{F}_n$ places mass $1/n$ at each observed data point $x_i$. The bootstrap approximates the true sampling distribution of $\hat{\theta}$ under $F$ by the distribution of $\hat{\theta}^*$ under $\hat{F}_n$, where $\hat{\theta}^*$ is the statistic computed on a resample drawn from $\hat{F}_n$. As $n \to \infty$, $\sup_x |\hat{F}_n(x) - F(x)| \to 0$ almost surely (by the Glivenko-Cantelli theorem), so for sufficiently smooth statistics the bootstrap distribution converges to the true sampling distribution.
Key insight: We replace the unknown population $F$ with the known empirical $\hat{F}_n$, then use the computer to simulate what we cannot compute analytically.
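The convergence of the empirical distribution to the population distribution can be checked numerically. The sketch below assumes an Exponential(1) population (chosen only because its CDF has a closed form) and computes the largest gap between the empirical and true CDFs for two sample sizes. The helper name is hypothetical.

```python
import numpy as np

def sup_distance_to_exponential(n, rng):
    """Kolmogorov distance sup_x |F_n(x) - F(x)| for an Exponential(1) sample of size n."""
    x = np.sort(rng.exponential(1.0, size=n))
    true_cdf = 1.0 - np.exp(-x)
    upper = np.arange(1, n + 1) / n   # empirical CDF just after each jump
    lower = np.arange(0, n) / n       # empirical CDF just before each jump
    return max(np.max(upper - true_cdf), np.max(true_cdf - lower))

rng = np.random.default_rng(0)
d_small = sup_distance_to_exponential(100, rng)
d_large = sup_distance_to_exponential(10_000, rng)
print(f"n = 100:    sup distance = {d_small:.4f}")
print(f"n = 10000:  sup distance = {d_large:.4f}")
```

The gap shrinks roughly like $1/\sqrt{n}$, which is why resampling from the empirical distribution becomes a better stand-in for sampling from the population as the sample grows.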
Three Bootstrap CI Methods
Percentile Bootstrap CI
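$$\text{CI}_{1-\alpha} = \left[\hat{\theta}^*_{(\alpha/2)},\ \hat{\theta}^*_{(1-\alpha/2)}\right]$$
Here,
- $\hat{\theta}^*_{(\alpha/2)}$ = The $\alpha/2$ quantile of bootstrap estimates
- $\hat{\theta}^*_{(1-\alpha/2)}$ = The $1-\alpha/2$ quantile of bootstrap estimates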
Limitations of the Percentile Method
The percentile interval is transformation-respecting, but it is only first-order accurate and can have poor coverage when the statistic is biased or its sampling distribution is skewed. It works well only when the sampling distribution is approximately symmetric and the statistic is close to unbiased.
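A percentile interval is simply a pair of empirical quantiles of the bootstrap estimates. The sketch below assumes a small right-skewed toy sample (made up for illustration) and a hypothetical helper `percentile_ci`.

```python
import numpy as np

def percentile_ci(data, statistic, alpha=0.05, B=5000, seed=0):
    """Percentile bootstrap interval: the alpha/2 and 1 - alpha/2 quantiles of the replicates."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    boot = np.array([statistic(rng.choice(data, size=n, replace=True))
                     for _ in range(B)])
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Right-skewed toy data, made up for this example
waits = np.array([0.2, 0.5, 0.7, 1.1, 1.3, 1.9, 2.4, 3.8, 5.2, 9.6])
lo, hi = percentile_ci(waits, np.mean)
print(f"mean = {waits.mean():.3f}, percentile CI = [{lo:.3f}, {hi:.3f}]")
print(f"distance below: {waits.mean() - lo:.3f}, distance above: {hi - waits.mean():.3f}")
```

Comparing the two distances shows how the interval inherits the shape of the bootstrap distribution; no correction for bias or skewness is applied.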
BCA (Bias-Corrected and Accelerated) CI
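$$\text{CI}_{1-\alpha} = \left[\hat{\theta}^*_{(\alpha_1)},\ \hat{\theta}^*_{(\alpha_2)}\right]$$
Here,
- $\alpha_1$ = $\Phi(\hat{z}_0 + (\hat{z}_0 + z_{\alpha/2})/(1 - \hat{a}(\hat{z}_0 + z_{\alpha/2})))$
- $\alpha_2$ = $\Phi(\hat{z}_0 + (\hat{z}_0 + z_{1-\alpha/2})/(1 - \hat{a}(\hat{z}_0 + z_{1-\alpha/2})))$
- $\hat{z}_0$ = Bias correction: $\Phi^{-1}$ of the fraction of bootstrap estimates $< \hat{\theta}$
- $\hat{a}$ = Acceleration: measures skewness of the statistic's distribution
where $\Phi$ is the standard normal CDF and $z_p = \Phi^{-1}(p)$.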
When to Use BCA
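BCA adjusts for both bias ($\hat{z}_0$) and skewness ($\hat{a}$). When both are zero, the adjusted levels equal $\alpha/2$ and $1-\alpha/2$ and the interval reduces to the percentile interval. BCA is generally preferred over the percentile method and usually has better finite-sample coverage properties.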
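The BCA adjustment can be sketched in a few lines. This is one possible implementation under stated assumptions, not a reference version: the helper name `bca_interval`, the toy data, and the clamping of the bias fraction (to keep the normal quantile finite) are all choices made for this example. The standard library's `statistics.NormalDist` supplies $\Phi$ and $\Phi^{-1}$.

```python
from statistics import NormalDist
import numpy as np

def bca_interval(data, statistic, alpha=0.05, B=5000, seed=0):
    """Return (lo, hi, z0, a) for a BCA bootstrap interval."""
    rng = np.random.default_rng(seed)
    nd = NormalDist()
    data = np.asarray(data)
    n = len(data)
    theta_hat = statistic(data)
    boot = np.array([statistic(rng.choice(data, size=n, replace=True))
                     for _ in range(B)])

    # Bias correction: normal quantile of the fraction of replicates below theta_hat.
    frac = float(np.mean(boot < theta_hat))
    frac = min(max(frac, 1 / (B + 1)), B / (B + 1))  # assumption: clamp away from 0 and 1
    z0 = nd.inv_cdf(frac)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = np.array([statistic(np.delete(data, i)) for i in range(n)])
    d = jack.mean() - jack
    denom = 6.0 * np.sum(d ** 2) ** 1.5
    a = float(np.sum(d ** 3) / denom) if denom > 0 else 0.0

    def adjusted_level(p):
        z = nd.inv_cdf(p)
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    lo, hi = np.quantile(boot, [adjusted_level(alpha / 2), adjusted_level(1 - alpha / 2)])
    return lo, hi, z0, a

# Right-skewed toy data, made up for this example
waits = np.array([0.2, 0.5, 0.7, 1.1, 1.3, 1.9, 2.4, 3.8, 5.2, 9.6])
lo, hi, z0, a = bca_interval(waits, np.mean)
print(f"z0 = {z0:.3f}, a = {a:.3f}, BCA CI = [{lo:.3f}, {hi:.3f}]")
```

For the mean of right-skewed data the acceleration comes out positive, which shifts both adjusted quantile levels upward relative to the percentile interval.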
Studentized (Pivotal) Bootstrap CI
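$$\text{CI}_{1-\alpha} = \left[\hat{\theta} - t^*_{(1-\alpha/2)}\,\hat{\text{se}},\ \hat{\theta} - t^*_{(\alpha/2)}\,\hat{\text{se}}\right]$$
Here,
- $t^*_{(p)}$ = $p$-th quantile of bootstrap $t$-statistics $T^{*b} = (\hat{\theta}^{*b} - \hat{\theta})/\hat{\text{se}}^{*b}$
- $\hat{\text{se}}$ = Bootstrap standard error estimate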
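The key requirement of the studentized method is a standard error for every resample. As a sketch, the example below uses the sample mean, where the plug-in formula $s/\sqrt{n}$ gives that standard error directly, so no nested bootstrap is needed. Using the formula in place of a bootstrap standard error is an assumption made to keep the example short. The helper name and toy data are hypothetical.

```python
import numpy as np

def studentized_ci_mean(data, alpha=0.05, B=5000, seed=0):
    """Studentized (bootstrap-t) interval for the mean, using s/sqrt(n) as the standard error."""
    rng = np.random.default_rng(seed)
    x = np.asarray(data, dtype=float)
    n = len(x)
    theta_hat = x.mean()
    se_hat = x.std(ddof=1) / np.sqrt(n)

    samples = rng.choice(x, size=(B, n), replace=True)   # one resample per row
    theta_star = samples.mean(axis=1)
    se_star = samples.std(axis=1, ddof=1) / np.sqrt(n)   # standard error of each resample

    keep = se_star > 0                                   # drop degenerate resamples
    t_star = (theta_star[keep] - theta_hat) / se_star[keep]
    t_lo, t_hi = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])

    # The quantiles swap places: the upper t-quantile sets the lower limit.
    return theta_hat - t_hi * se_hat, theta_hat - t_lo * se_hat

# Right-skewed toy data, made up for this example
waits = np.array([0.2, 0.5, 0.7, 1.1, 1.3, 1.9, 2.4, 3.8, 5.2, 9.6])
lo, hi = studentized_ci_mean(waits)
print(f"mean = {waits.mean():.3f}, studentized CI = [{lo:.3f}, {hi:.3f}]")
```

For statistics without a closed-form standard error, `se_star` would have to come from an inner bootstrap on each resample, which multiplies the computational cost.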
Worked Example: Median Bootstrap CI
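Given data $x_1, \dots, x_n$, compute a 95% BCA bootstrap CI for the population median.
Step 1: Compute the sample median: $\hat{\theta} = \text{median}(x_1, \dots, x_n)$.
Step 2: Draw $B$ bootstrap samples, compute the median of each.
Step 3: Apply the BCA correction:
- $\hat{a}$ is estimated via the jackknife:
$$\hat{a} = \frac{\sum_{i=1}^{n} \left(\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^3}{6 \left[\sum_{i=1}^{n} \left(\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^2\right]^{3/2}}$$
where $\hat{\theta}_{(i)}$ is the median computed on the sample with the $i$-th observation deleted and $\hat{\theta}_{(\cdot)}$ is the mean of the $\hat{\theta}_{(i)}$.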
Step 4: Adjusted quantiles give the BCA interval.
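The jackknife part of Step 3 can be sketched on its own. The example reuses the 12-point data array from the Python implementation section of this document. The variable names are illustrative.

```python
import numpy as np

data = np.array([2.1, 3.5, 4.2, 1.8, 5.6, 3.9, 2.7, 4.8, 3.2, 6.1, 2.9, 5.0])
n = len(data)

# Leave-one-out medians: delete observation i, recompute the median.
jack = np.array([np.median(np.delete(data, i)) for i in range(n)])
d = jack.mean() - jack

# Jackknife estimate of the acceleration constant.
a_hat = np.sum(d ** 3) / (6.0 * np.sum(d ** 2) ** 1.5)

print("leave-one-out medians:", jack)
print("acceleration:", a_hat)
# With an even sample size, deleting any of the six smallest points gives 3.9 and
# deleting any of the six largest gives 3.5, so the deviations are symmetric and
# the acceleration is zero up to floating-point rounding.
```

The leave-one-out medians take only two distinct values here, which illustrates how coarse the jackknife can be for a non-smooth statistic such as the median.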
Why the Median is Hard
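The median is a non-smooth statistic: its influence function is discontinuous. The CLT applies (the sample median is asymptotically normal when the density is positive at the median), but the asymptotic variance depends on the unknown density at the median, and finite-sample distributions can be skewed and discrete. The bootstrap sidesteps this by estimating the sampling distribution directly, without requiring the analyst to derive or estimate the asymptotic variance.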
Python Implementation: Three Bootstrap Methods
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = np.array([2.1, 3.5, 4.2, 1.8, 5.6, 3.9, 2.7, 4.8, 3.2, 6.1, 2.9, 5.0])
n = len(data)
theta_hat = np.median(data)
B = 10000

# Generate B bootstrap samples (one per row) and compute the median of each
boot_samples = rng.choice(data, size=(B, n), replace=True)
boot_medians = np.median(boot_samples, axis=1)

# Percentile CI
ci_percentile = np.percentile(boot_medians, [2.5, 97.5])

# BCA CI (using jackknife for acceleration)
jack_medians = np.array([np.median(np.delete(data, i)) for i in range(n)])
jack_mean = np.mean(jack_medians)
a_hat = np.sum((jack_mean - jack_medians)**3) / (6 * np.sum((jack_mean - jack_medians)**2)**1.5)
z0 = stats.norm.ppf(np.mean(boot_medians < theta_hat))
z_alpha = stats.norm.ppf([0.025, 0.975])
alpha_adj = stats.norm.cdf(z0 + (z0 + z_alpha) / (1 - a_hat * (z0 + z_alpha)))
ci_bca = np.percentile(boot_medians, alpha_adj * 100)

# Studentized CI: every resample needs its own standard error (nested bootstrap)
def se_median(x, B_inner=200):
    """Bootstrap standard error of the median of x."""
    inner = rng.choice(x, size=(B_inner, len(x)), replace=True)
    return np.std(np.median(inner, axis=1), ddof=1)

se_hat = np.std(boot_medians, ddof=1)
B_t = 1000  # fewer outer resamples keep the nested bootstrap cheap
se_star = np.array([se_median(boot_samples[b]) for b in range(B_t)])
valid = se_star > 0  # guard against degenerate resamples
boot_t = (boot_medians[:B_t][valid] - theta_hat) / se_star[valid]
ci_studentized = theta_hat - np.percentile(boot_t, [97.5, 2.5]) * se_hat

print(f"Sample median: {theta_hat:.3f}")
print(f"Percentile CI: [{ci_percentile[0]:.3f}, {ci_percentile[1]:.3f}]")
print(f"BCA CI: [{ci_bca[0]:.3f}, {ci_bca[1]:.3f}]")
print(f"Studentized CI:[{ci_studentized[0]:.3f}, {ci_studentized[1]:.3f}]")
Coverage Comparison
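Theorem: Bootstrap Coverage Accuracy
Under regularity conditions, the studentized bootstrap CI achieves correct asymptotic coverage:
$$P\left(\theta \in \text{CI}_{\text{stud}}\right) = 1 - \alpha + O(n^{-1})$$
with coverage error of order $O(n^{-1})$ even for one-sided limits, faster than the $O(n^{-1/2})$ one-sided rate of the percentile method.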
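Coverage claims of this kind can be probed by simulation. The sketch below assumes an Exponential(1) population with true mean 1, small samples, and the sample mean as the statistic, with $s/\sqrt{n}$ as the per-resample standard error. All of these settings, and the function name `coverage`, are assumptions chosen for illustration. Results depend on the seed and the settings, so no specific numbers are quoted here.

```python
import numpy as np

def coverage(n=15, reps=1000, B=999, alpha=0.05, seed=0):
    """Monte Carlo coverage of percentile and studentized intervals for the mean."""
    rng = np.random.default_rng(seed)
    true_mean = 1.0  # mean of the Exponential(1) population
    hits_pct = 0
    hits_stud = 0
    for _ in range(reps):
        x = rng.exponential(1.0, size=n)
        theta_hat = x.mean()
        se_hat = x.std(ddof=1) / np.sqrt(n)

        xs = rng.choice(x, size=(B, n), replace=True)
        theta_star = xs.mean(axis=1)
        se_star = xs.std(axis=1, ddof=1) / np.sqrt(n)

        # Percentile interval
        lo, hi = np.quantile(theta_star, [alpha / 2, 1 - alpha / 2])
        hits_pct += int(lo <= true_mean <= hi)

        # Studentized interval
        t_star = (theta_star - theta_hat) / se_star
        t_lo, t_hi = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])
        lo_s = theta_hat - t_hi * se_hat
        hi_s = theta_hat - t_lo * se_hat
        hits_stud += int(lo_s <= true_mean <= hi_s)
    return hits_pct / reps, hits_stud / reps

cov_pct, cov_stud = coverage(reps=300, B=499)
print(f"percentile coverage:  {cov_pct:.3f}")
print(f"studentized coverage: {cov_stud:.3f}")
```

Raising `reps` reduces Monte Carlo noise in the estimated coverage, and varying `n` gives a rough picture of how quickly each method approaches the nominal 95%.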
Bootstrap Failures
The bootstrap can fail when:
- The statistic is not pivotal and the sample size is small
- The data are not exchangeable (e.g., time series with dependence)
- The parameter lies on the boundary of the parameter space (e.g., variance components)
In these cases, use specialized bootstrap methods (block bootstrap, subsampling).
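For dependent data, one common variant is the moving block bootstrap, which resamples contiguous blocks so that short-range dependence is preserved within each block. The sketch below is a minimal version under stated assumptions: the function name, the AR(1) toy series, and the block length of 20 are all illustrative choices, and the block length in practice needs tuning.

```python
import numpy as np

def moving_block_resample(series, block_len, rng):
    """Resample a series by concatenating randomly chosen overlapping blocks."""
    x = np.asarray(series)
    n = len(x)
    n_blocks = -(-n // block_len)                              # ceiling division
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)  # valid block start indices
    idx = (starts[:, None] + np.arange(block_len)[None, :]).ravel()[:n]
    return x[idx]

# Toy AR(1) series with positive autocorrelation (made up for this example)
rng = np.random.default_rng(0)
eps = rng.normal(size=300)
y = np.empty(300)
y[0] = eps[0]
for t in range(1, 300):
    y[t] = 0.7 * y[t - 1] + eps[t]

block_means = np.array([moving_block_resample(y, 20, rng).mean() for _ in range(2000)])
iid_means = np.array([rng.choice(y, size=len(y), replace=True).mean() for _ in range(2000)])
print(f"block bootstrap SE of the mean: {block_means.std(ddof=1):.3f}")
print(f"naive bootstrap SE of the mean: {iid_means.std(ddof=1):.3f}")
```

The naive resampler treats the observations as exchangeable and so ignores the autocorrelation, which is the failure mode listed above.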
Key Takeaways
Summary: Bootstrap Confidence Intervals
- Bootstrap CIs use resampling to estimate the sampling distribution without distributional assumptions
- Percentile: simplest, but only reliable for symmetric, approximately unbiased statistics
- BCA: adjusts for bias and skewness; preferred for general use
- Studentized: best theoretical properties ($O(n^{-1})$ coverage error) but requires computing the standard error for each resample
- Requires a large number of resamples $B$ (typically in the thousands; the example above uses $B = 10000$) for stable percentile and BCA intervals
- Not a panacea: fails with dependent data, boundary parameters, or very small samples