Confidence Intervals for Proportions — Estimating p

Foundations of Statistics

Measuring Uncertainty in Binary Outcomes

Confidence intervals for proportions quantify uncertainty in estimated percentages, from conversion rates to disease prevalence. Different interval methods handle edge cases and small samples with varying accuracy.

Political Polling — Reporting election results with appropriate margins of error
Marketing — Estimating true conversion rates from A/B test samples
Public Health — Tracking disease prevalence with quantified uncertainty

Proportion intervals are essential whenever outcomes are yes/no or success/failure.

Core Concepts

Confidence intervals for proportions estimate the true population proportion $p$ from sample data. The Wald interval is the most common, but the Wilson interval performs better for extreme proportions.

Definition: Confidence Interval for a Proportion

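A $(1-\alpha)\times 100\%$ confidence interval for the population proportion $p$ is a range of plausible values for $p$ computed from the sample. The simplest version (Wald) is centered at the sample proportion $\hat{p}$, with a margin of error based on the estimated standard error of $\hat{p}$. The methods covered here (Wald, Wilson, Clopper-Pearson) differ in how they handle behavior near the boundaries 0 and 1 and the discreteness of the binomial distribution.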

Wald Interval

\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}

Here,

$\hat{p}$ = Sample proportion (successes/$n$)
$z_{\alpha/2}$ = Critical value from the standard normal distribution
$n$ = Sample size
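The Wald formula above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `wald_interval` and the example counts (45 successes in 100 trials) are hypothetical and chosen only for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    # Two-sided critical value z_{alpha/2} from the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 45 successes in 100 trials
low, high = wald_interval(successes=45, n=100)
print(f"95% Wald CI: ({low:.3f}, {high:.3f})")
```

Note that this sketch does not clip the endpoints to $[0, 1]$, so it can return limits outside the valid range for small counts.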

Wald Limitations

The Wald interval performs poorly when $\hat{p}$ is near 0 or 1, or when $n$ is small. Coverage can drop well below the nominal level. The Wilson interval should be preferred in practice.

Derivation of the Wald Interval

Theorem: Normal Approximation for Proportions

Let $X \sim \text{Binomial}(n, p)$ and $\hat{p} = X/n$. By the central limit theorem (CLT), as $n \to \infty$:

\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \xrightarrow{d} N(0, 1)

The Wald interval replaces $p$ with $\hat{p}$ in the standard error, giving:

P\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq p \leq \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right) \approx 1 - \alpha

Why Wald Fails

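Replacing $\sqrt{p(1-p)/n}$ with $\sqrt{\hat{p}(1-\hat{p})/n}$ works poorly when $p$ is near 0 or 1, or when $n$ is small, because:

The binomial distribution is highly asymmetric for extreme $p$, so a symmetric interval around $\hat{p}$ fits it badly
$\hat{p}(1-\hat{p})$ equals 0 when $\hat{p} \in \{0, 1\}$, giving a zero-width interval
The normal approximation to the binomial is usually considered adequate only when $np \geq 10$ and $n(1-p) \geq 10$ (a common rule of thumb), which extreme $p$ violates unless $n$ is large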
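Two of these failure modes are easy to reproduce directly. The sketch below uses hypothetical small-sample counts (0 and 1 successes out of 20) and the rounded critical value 1.96; the helper name `wald_interval` is illustrative.

```python
from math import sqrt

Z_95 = 1.96  # rounded 95% critical value (assumed for this illustration)


def wald_interval(successes, n, z=Z_95):
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Zero successes: the estimated standard error is 0, so the
# interval collapses to the single point p_hat = 0.
zero_case = wald_interval(0, 20)

# One success: the lower limit falls below 0, outside the
# range of values a proportion can take.
small_case = wald_interval(1, 20)

print("x = 0:", zero_case)
print("x = 1:", small_case)
```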

Wilson Score Interval

Wilson Interval

\frac{\hat{p} + \frac{z^2}{2n} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}

Here,

$\hat{p}$ = Sample proportion
$z$ = Critical value $z_{\alpha/2}$ (e.g., 1.96 for 95%)
$n$ = Sample size
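As a rough sketch, the Wilson formula translates into Python as follows. The function name `wilson_interval` is hypothetical, and the example (0 successes in 20 trials) is chosen to show that the interval keeps a positive width even where the Wald interval would collapse.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval, written as center +/- half-width."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical data: no successes observed in 20 trials
low, high = wilson_interval(0, 20)
print(f"95% Wilson CI: ({low:.3f}, {high:.3f})")
```

With zero successes the lower limit is 0 up to floating-point rounding, while the upper limit remains strictly positive.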

Theorem: Derivation of the Wilson Interval

The Wilson interval is derived by inverting the score test. We seek values of $p$ that satisfy:

\frac{|\hat{p} - p|^2}{p(1-p)/n} \leq z_{\alpha/2}^2

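This is a quadratic inequality in $p$: expanding and collecting terms gives $(1 + z_{\alpha/2}^2/n)\,p^2 - (2\hat{p} + z_{\alpha/2}^2/n)\,p + \hat{p}^2 \leq 0$. Its two roots are the endpoints of the Wilson interval. Unlike the Wald interval, the Wilson interval:

Never produces an empty or zero-width interval, and its endpoints always stay within $[0, 1]$
Is centered at a point pulled from $\hat{p}$ toward $0.5$, with the shift most noticeable when $\hat{p}$ is extreme or $n$ is small
Has coverage close to the nominal level even for small $n$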
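The link between the score-test inequality and the closed-form interval can be checked numerically. The sketch below rewrites the inequality as $a p^2 + b p + c \leq 0$ with $a = 1 + z^2/n$, $b = -(2\hat{p} + z^2/n)$, $c = \hat{p}^2$, solves it with the quadratic formula, and compares the roots with the Wilson formula. Function names are hypothetical and $z = 1.96$ is assumed.

```python
from math import isclose, sqrt


def wilson_by_quadratic(p_hat, n, z=1.96):
    """Endpoints as roots of (1 + k) p^2 - (2 p_hat + k) p + p_hat^2 = 0, k = z^2/n."""
    k = z**2 / n
    a = 1 + k
    b = -(2 * p_hat + k)
    c = p_hat**2
    root_disc = sqrt(b**2 - 4 * a * c)
    return (-b - root_disc) / (2 * a), (-b + root_disc) / (2 * a)


def wilson_closed_form(p_hat, n, z=1.96):
    """Endpoints from the closed-form Wilson expression."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


roots = wilson_by_quadratic(0.58, 500)
closed = wilson_closed_form(0.58, 500)
print(roots, closed)
```

Both routes should agree to floating-point precision, which is a quick sanity check on the algebra.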

Wilson vs Wald

Simulation studies (Brown, Cai, and DasGupta, 2001) show that the Wilson interval has coverage probability closest to the nominal level across the entire range of $p$ . The Wald interval can have coverage as low as 70% when the nominal level is 95%.
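Coverage for a given $n$ and $p$ does not require simulation: it can be computed exactly by summing binomial probabilities over the outcomes whose interval contains $p$. The sketch below is an illustration, not a reproduction of the cited study; the setting $n = 40$, $p = 0.05$, the rounded $z = 1.96$, and the function names are all assumptions.

```python
from math import comb, sqrt

Z_95 = 1.96  # rounded 95% critical value (assumed)


def wald(x, n, z=Z_95):
    p_hat = x / n
    m = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - m, p_hat + m


def wilson(x, n, z=Z_95):
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def exact_coverage(interval, n, p):
    """P(interval contains p) when X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        low, high = interval(x, n)
        if low <= p <= high:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


wald_cov = exact_coverage(wald, 40, 0.05)
wilson_cov = exact_coverage(wilson, 40, 0.05)
print(f"Wald: {wald_cov:.3f}, Wilson: {wilson_cov:.3f}")
```

In this setting the Wald coverage comes out noticeably below the nominal 0.95, while the Wilson coverage stays close to it. Other choices of $n$ and $p$ give different values, since exact coverage oscillates with both.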

Sample Size for Desired Margin of Error

Sample Size for Proportion

n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \hat{p}(1-\hat{p})

Here,

$E$ = Desired margin of error
$z_{\alpha/2}$ = Critical value
$\hat{p}$ = Estimated proportion (use 0.5 if unknown)

Conservative Sample Size

When $p$ is unknown, use $\hat{p} = 0.5$ for the most conservative (largest) sample size, since $p(1-p)$ is maximized at $p = 0.5$:

n = \left(\frac{z_{\alpha/2}}{2E}\right)^2

For example, for a 95% CI with margin of error $E = 0.03$: $n = (1.96/0.06)^2 \approx 1067.1$, which rounds up to $n = 1068$.
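The sample size calculation, including the round-up step, can be sketched as a small helper. The function name `sample_size` and its parameter names are hypothetical; the default `p_guess=0.5` corresponds to the conservative choice.

```python
from math import ceil
from statistics import NormalDist


def sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin for the guessed p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would exceed the target margin
    return ceil((z / margin) ** 2 * p_guess * (1 - p_guess))


n_conservative = sample_size(margin=0.03)             # p unknown
n_informed = sample_size(margin=0.03, p_guess=0.2)    # hypothetical prior guess
print(n_conservative, n_informed)
```

A prior guess away from 0.5 reduces the required sample size, but only helps if the guess is trustworthy.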

Worked Example: Wald Interval

A poll of $n = 500$ voters finds $\hat{p} = 0.58$ support for a candidate. Construct a 95% CI.

Step 1. Check conditions: $n\hat{p} = 290 \geq 10$ and $n(1-\hat{p}) = 210 \geq 10$. ✓

Step 2. Standard error: $\text{SE} = \sqrt{0.58 \times 0.42 / 500} = \sqrt{0.000487} = 0.0221$.

Step 3. Margin of error: $E = 1.96 \times 0.0221 = 0.0433$.

Step 4. The 95% Wald CI: $0.58 \pm 0.043 = (0.537, 0.623)$.

Interpretation: We are 95% confident that between 53.7% and 62.3% of voters support the candidate.

Worked Example: Wilson Interval

For the same data ($n = 500$, $\hat{p} = 0.58$), compute the Wilson interval.

Step 1. Compute the components: $z^2/(2n) = 1.96^2/1000 = 0.00384$, $z^2/n = 0.00768$.

Step 2. The center: $\frac{0.58 + 0.00384}{1 + 0.00768} = \frac{0.58384}{1.00768} = 0.5794$.

Step 3. The standard error term: $\sqrt{\frac{0.58 \times 0.42}{500} + \frac{1.96^2}{4 \times 500^2}} = \sqrt{0.000487 + 0.000004} = 0.02216$.

Step 4. The Wilson CI: $\frac{0.58384 \pm 1.96 \times 0.02216}{1.00768} = \frac{0.58384 \pm 0.0434}{1.00768}$.

Lower: $(0.58384 - 0.0434)/1.00768 = 0.5363$. Upper: $(0.58384 + 0.0434)/1.00768 = 0.6225$.

The Wilson CI is $(0.536, 0.622)$ — very close to the Wald interval here because $n$ is large and $\hat{p}$ is not extreme.
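Both worked examples can be reproduced with a short script. This sketch uses the rounded critical value $z = 1.96$ to match the hand calculations; the function names are hypothetical.

```python
from math import sqrt

Z = 1.96          # rounded 95% critical value, as in the worked examples
N, P_HAT = 500, 0.58


def wald(p_hat, n, z=Z):
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def wilson(p_hat, n, z=Z):
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


wald_ci = wald(P_HAT, N)
wilson_ci = wilson(P_HAT, N)
print("Wald:   (%.3f, %.3f)" % wald_ci)
print("Wilson: (%.3f, %.3f)" % wilson_ci)
```

The two intervals differ by less than 0.001 at each endpoint, which illustrates why the choice of method matters little for large $n$ and moderate $\hat{p}$.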

Comparison of Methods

Method Comparison for $n = 20$, $\hat{p} = 0.15$

Method	95% CI	Coverage (simulated)
Wald	$(0.00, 0.31)$	82%
Wilson	$(0.05, 0.36)$	93%
Clopper-Pearson	$(0.03, 0.38)$	97%

The Wald interval undercovers badly for small $n$ and extreme $p$, and its lower limit here has to be truncated at 0. The Wilson interval is the best general-purpose choice. Clopper-Pearson is conservative: its coverage is guaranteed to be at least 95%, at the cost of a wider interval.

Clopper-Pearson (Exact) Interval

Exact Method

The Clopper-Pearson interval inverts the binomial test. For $x$ successes in $n$ trials, the $(1-\alpha)$ CI is:

\left(\frac{x}{x + (n-x+1) \cdot F_{1-\alpha/2, 2(n-x+1), 2x}}, \quad \frac{(x+1) \cdot F_{1-\alpha/2, 2(x+1), 2(n-x)}}{(n-x) + (x+1) \cdot F_{1-\alpha/2, 2(x+1), 2(n-x)}}\right)

where $F$ is the F-distribution quantile. This interval is always valid but tends to be conservative (wider than necessary).
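The Python standard library has no F-distribution quantile function, so the sketch below takes a different but equivalent route: it solves the defining tail-probability equations of the inverted binomial test by bisection. The lower limit is the $p$ with $P(X \geq x) = \alpha/2$ and the upper limit is the $p$ with $P(X \leq x) = \alpha/2$. All function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(g, target, iters=60):
    """Find p in [0, 1] with g(p) = target, for g increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    # Lower limit: P(X >= x | p) = alpha/2; defined as 0 when x == 0
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: P(X <= x | p) = alpha/2, i.e. P(X > x | p) = 1 - alpha/2;
    # defined as 1 when x == n
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


low, high = clopper_pearson(3, 20)
low0, high0 = clopper_pearson(0, 20)
print(f"x=3, n=20: ({low:.4f}, {high:.4f})")
```

Bisection is slower than a quantile lookup but avoids any dependency beyond the standard library. A statistics package with beta or F quantiles would normally be the more practical choice.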

Key Takeaways

Summary: Confidence Intervals for Proportions

Wald interval: $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$ — simple but unreliable for extreme $\hat{p}$
Wilson interval — more accurate, especially for small $n$ or extreme proportions; derived from inverting the score test
Clopper-Pearson — exact but conservative; guarantees coverage $\geq 1-\alpha$
Use $\hat{p} = 0.5$ for conservative sample size calculation (maximum variance)
Check $n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$ for valid normal approximation (Wald)
Common in polling, A/B testing, and quality control
