Confidence Intervals for Proportions — Estimating p
Foundations of Statistics
Measuring Uncertainty in Binary Outcomes
Confidence intervals for proportions quantify uncertainty in estimated percentages, from conversion rates to disease prevalence. Different interval methods handle edge cases and small samples with varying accuracy.
- Political Polling — Reporting election results with appropriate margins of error
- Marketing — Estimating true conversion rates from A/B test samples
- Public Health — Tracking disease prevalence with quantified uncertainty
Proportion intervals are essential whenever outcomes are yes/no or success/failure.
Core Concepts
Confidence intervals for proportions estimate the true population proportion from sample data. The Wald interval is the most common, but the Wilson interval performs better for extreme proportions.
Definition: Confidence Interval for a Proportion
A confidence interval for the population proportion $p$ is centered at the sample proportion $\hat{p}$, with a margin of error based on the standard error of $\hat{p}$. The methods (Wald, Wilson, Clopper-Pearson) differ in how they handle boundary behavior and discreteness.
Wald Interval
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Here,
- $\hat{p}$ = Sample proportion (successes/$n$)
- $z_{\alpha/2}$ = Critical value from standard normal
- $n$ = Sample size
Wald Limitations
The Wald interval performs poorly when $\hat{p}$ is near 0 or 1, or when $n$ is small. Coverage can drop well below the nominal level. The Wilson interval should be preferred in practice.
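A minimal Python sketch of the Wald formula, using only the standard library. The function name `wald_interval` and the counts in the calls are illustrative assumptions, not part of the original text.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


print(wald_interval(290, 500))  # illustrative counts: 290 successes in 500 trials
print(wald_interval(0, 20))     # p_hat = 0 collapses to a zero-width interval
```

The second call illustrates the limitation noted above: with no observed successes the estimated standard error is zero, so the interval has no width at all.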
Derivation of the Wald Interval
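Theorem: Normal Approximation for Proportions
Let $X \sim \text{Binomial}(n, p)$ and $\hat{p} = X/n$. By the CLT:
$$\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \xrightarrow{d} N(0, 1)$$
The Wald interval replaces $p$ with $\hat{p}$ in the standard error, giving:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$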
Why Wald Fails
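The approximation is poor when $p$ is near 0 or 1 because:
- The binomial distribution is highly asymmetric for extreme $p$
- $\hat{p}(1-\hat{p})$ can be 0 when $\hat{p} = 0$ or $\hat{p} = 1$, giving zero-width intervals
- The normal approximation to the binomial requires $np \geq 10$ and $n(1-p) \geq 10$ (a common rule of thumb; some texts use 5)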
Wilson Score Interval
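Wilson Interval
$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n}}{1 + \frac{z_{\alpha/2}^2}{n}} \pm \frac{z_{\alpha/2}}{1 + \frac{z_{\alpha/2}^2}{n}}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}$$
Here,
- $\hat{p}$ = Sample proportion
- $z_{\alpha/2}$ = Critical value (e.g., 1.96 for 95%)
- $n$ = Sample size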
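The Wilson formula can be sketched in Python the same way. The function name `wilson_interval` and the example counts are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z**2 / n
    # The center is p_hat pulled toward 1/2 by an amount that shrinks as n grows.
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half_width, center + half_width


print(wilson_interval(290, 500))  # illustrative counts: 290 successes in 500 trials
print(wilson_interval(0, 20))     # still has positive width when p_hat = 0
```

Because of the extra $z^2/(4n^2)$ term under the square root, the half-width stays positive even when the sample contains no successes or no failures.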
Theorem: Derivation of the Wilson Interval
The Wilson interval is derived by inverting the score test. We seek values of $p$ that satisfy:
$$\left|\frac{\hat{p} - p}{\sqrt{p(1-p)/n}}\right| \leq z_{\alpha/2}$$
This is a quadratic inequality in $p$. Solving for $p$ and rearranging yields the Wilson interval. Unlike the Wald interval, the Wilson interval:
- Never produces empty or zero-width intervals
- Is centered at a point pulled from $\hat{p}$ toward $1/2$, which matters most when $\hat{p}$ is extreme
- Has coverage close to the nominal level even for small $n$
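The "solving and rearranging" step of the derivation can be sketched as follows, writing $z$ as shorthand for the critical value (a notational assumption made here). Squaring the score inequality and multiplying through by $p(1-p)/n$ gives

$$(\hat{p} - p)^2 \leq \frac{z^2}{n}\,p(1-p)$$

Collecting powers of $p$ turns this into a quadratic inequality:

$$\left(1 + \frac{z^2}{n}\right)p^2 - \left(2\hat{p} + \frac{z^2}{n}\right)p + \hat{p}^2 \leq 0$$

The inequality holds between the two roots, which the quadratic formula gives as

$$p = \frac{\hat{p} + \frac{z^2}{2n} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}$$

These two roots are the endpoints of the Wilson interval.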
Wilson vs Wald
Simulation studies (Brown, Cai, and DasGupta, 2001) show that the Wilson interval has coverage probability closest to the nominal level across the entire range of $p$. The Wald interval can have coverage as low as 70% when the nominal level is 95%.
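Coverage claims like these can be checked without simulation, since for fixed $n$ and $p$ the coverage is a finite sum of binomial probabilities. The sketch below is not the procedure from the cited study. It is a small exact calculation, with the helper names and the chosen values of $n$ and $p$ as assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def wald(x, n, z=Z95):
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def wilson(x, n, z=Z95):
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half


def exact_coverage(interval, n, p):
    """Sum the Binomial(n, p) probabilities of every count x whose interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


# Illustrative grid: a small sample and a few true proportions.
for p in (0.05, 0.15, 0.5):
    print(p, round(exact_coverage(wald, 20, p), 3), round(exact_coverage(wilson, 20, p), 3))
```

Exact coverage is a jagged function of $p$, so results at single points should be read as spot checks rather than as a summary of either method.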
Sample Size for Desired Margin of Error
Sample Size for Proportion
$$n = \frac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{E^2}$$
Here,
- $E$ = Desired margin of error
- $z_{\alpha/2}$ = Critical value
- $\hat{p}$ = Estimated proportion (use 0.5 if unknown)
Conservative Sample Size
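When $p$ is unknown, use $p = 0.5$ for the most conservative (largest) sample size, since $p(1-p)$ is maximized at $p = 0.5$. For a desired margin of error $E$ and critical value $z_{\alpha/2}$:
$$n = \frac{z_{\alpha/2}^2}{4E^2}$$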
For example, for a 95% CI with margin of error $E = 0.03$: $n = 1.96^2 / (4 \times 0.03^2) = 3.8416 / 0.0036 \approx 1067.1$, so round up to $n = 1068$.
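The sample size calculation can be sketched in a few lines of Python. The function name `sample_size` and the margins passed to it are illustrative assumptions. The result is always rounded up, since rounding down would miss the target margin.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin: float, p: float = 0.5, confidence: float = 0.95) -> int:
    """Smallest n whose Wald margin of error is at most `margin` for planning value p."""
    if not 0 < margin < 1 or not 0 <= p <= 1:
        raise ValueError("need 0 < margin < 1 and 0 <= p <= 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Default p = 0.5 maximizes p * (1 - p), giving the conservative size.
    return ceil(z**2 * p * (1 - p) / margin**2)


print(sample_size(0.03))          # conservative choice, p = 0.5
print(sample_size(0.03, p=0.58))  # smaller n if a prior estimate of p is trusted
```

Supplying a prior estimate of $p$ away from 0.5 reduces the required $n$, at the risk of under-sizing the study if that estimate is wrong.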
Worked Example: Wald Interval
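A poll of $n = 500$ voters finds $\hat{p} = 0.58$ support for a candidate. Construct a 95% CI.
Step 1. Check conditions: $n\hat{p} = 290 \geq 10$ and $n(1-\hat{p}) = 210 \geq 10$. ✓
Step 2. Standard error: $\sqrt{0.58 \times 0.42 / 500} = \sqrt{0.0004872} \approx 0.0221$.
Step 3. Margin of error: $1.96 \times 0.0221 \approx 0.0433$.
Step 4. The 95% Wald CI: $0.58 \pm 0.0433 = (0.537, 0.623)$.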
Interpretation: We are 95% confident that between 53.7% and 62.3% of voters support the candidate.
Worked Example: Wilson Interval
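For the same data ($n = 500$, $\hat{p} = 0.58$), compute the Wilson interval.
Step 1. Compute the components: $z^2 = 1.96^2 = 3.8416$, $z^2/n = 0.00768$.
Step 2. The center: $\frac{\hat{p} + z^2/(2n)}{1 + z^2/n} = \frac{0.58 + 0.00384}{1.00768} \approx 0.5794$.
Step 3. The margin term: $\frac{z}{1 + z^2/n}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}} = \frac{1.96}{1.00768}\sqrt{0.0004872 + 0.0000038} \approx 0.0431$.
Step 4. The Wilson CI: $0.5794 \pm 0.0431$.
Lower: $0.5794 - 0.0431 = 0.5363$. Upper: $0.5794 + 0.0431 = 0.6225$.
The Wilson CI is $(0.5363, 0.6225)$, very close to the Wald interval here because $n$ is large and $\hat{p}$ is not extreme.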
Comparison of Methods
Method Comparison for $n = 20$, $\hat{p} = 0.15$
| Method | 95% CI | Coverage (simulated) |
|---|---|---|
| Wald | $(-0.006, 0.306)$ | 82% |
| Wilson | $(0.052, 0.360)$ | 93% |
| Clopper-Pearson | $(0.032, 0.379)$ | 97% |
The Wald interval has terrible coverage for small $n$ and extreme $p$. The Wilson interval is the best general-purpose choice. Clopper-Pearson is conservative (guaranteed coverage).
Clopper-Pearson (Exact) Interval
Exact Method
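The Clopper-Pearson interval inverts the binomial test. For $x$ successes in $n$ trials, the CI is:
$$\left[\left(1 + \frac{n - x + 1}{x\,F_{\alpha/2;\,2x,\,2(n-x+1)}}\right)^{-1},\ \left(1 + \frac{n - x}{(x+1)\,F_{1-\alpha/2;\,2(x+1),\,2(n-x)}}\right)^{-1}\right]$$
where $F_{q;\,d_1,\,d_2}$ is the $q$-quantile of the F-distribution with $d_1$ and $d_2$ degrees of freedom. This interval is always valid but tends to be conservative (wider than necessary).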
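The exact interval can also be computed without F-quantiles, by solving the two binomial tail equations it inverts. The sketch below does this with plain bisection and the standard library. It is an equivalent numerical route rather than the closed form in the text, and the function names are assumptions.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f, iters: int = 60) -> float:
    """Bisection on [0, 1] for a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    alpha = 1 - confidence
    # Lower limit: the p at which P(X >= x) equals alpha / 2 (defined as 0 when x = 0).
    lower = 0.0 if x == 0 else _root(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2 (defined as 1 when x = n).
    upper = 1.0 if x == n else _root(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


print(clopper_pearson(3, 20))  # illustrative counts: 3 successes in 20 trials
```

Bisection is slower than a quantile lookup but makes the defining property visible: each endpoint is the value of $p$ at which the observed count sits exactly on the edge of a tail of probability $\alpha/2$.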
Key Takeaways
Summary: Confidence Intervals for Proportions
- Wald interval: $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$; simple but unreliable for extreme $p$
- Wilson interval: more accurate, especially for small or extreme proportions; derived from inverting the score test
- Clopper-Pearson: exact but conservative; guarantees coverage
- Use $\hat{p} = 0.5$ for conservative sample size calculation (maximum variance)
- Check $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ for valid normal approximation (Wald)
- Common in polling, A/B testing, and quality control