Confidence Intervals for Two Samples — Comparing Groups
Foundations of Statistics
Comparing Groups with Precision
Two-sample intervals estimate the difference between population parameters, providing the foundation for comparing treatments, groups, or conditions. They answer the practical question: how different are these groups really?
- A/B Testing — Quantifying the true difference in website conversion rates
- Clinical Trials — Estimating treatment versus control group differences
- Social Science — Measuring effect sizes in observational studies
The difference between groups is often more important than the groups themselves.
Core Concepts
Two-sample confidence intervals estimate the difference between two population parameters (means or proportions). They are the foundation for comparing groups.
Definition: Two-Sample Confidence Interval
A confidence interval for $\mu_1 - \mu_2$ (or $p_1 - p_2$) quantifies the uncertainty in the difference between two population parameters.
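CI for Difference of Means (Independent, Equal Variance)
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\, n_1+n_2-2}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Here,
- $\bar{X}_1, \bar{X}_2$ = Sample means
- $s_p$ = Pooled standard deviation
- $n_1, n_2$ = Sample sizes
- $t_{\alpha/2,\, n_1+n_2-2}$ = Critical value with pooled df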
Pooled Standard Deviation
$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$
Here,
- $s_1, s_2$ = Sample standard deviations
- $n_1, n_2$ = Sample sizes
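As a minimal sketch of the pooled interval, the Python below computes the pooled standard deviation and the resulting interval from summary statistics. The function names, the `t_crit` parameter (the t critical value with $n_1 + n_2 - 2$ degrees of freedom, supplied by the caller from a table or a statistics library), and the input numbers are all illustrative assumptions, not values from this section.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation under the equal-variance assumption."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """CI for mu1 - mu2; t_crit is the t critical value with n1 + n2 - 2 df."""
    sp = pooled_sd(s1, n1, s2, n2)
    se = sp * math.sqrt(1 / n1 + 1 / n2)  # standard error of the difference
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se

# Made-up summary statistics for illustration only.
# 2.101 is the 0.975 quantile of the t distribution with 18 df.
low, high = pooled_t_interval(10.0, 2.0, 10, 8.0, 2.0, 10, t_crit=2.101)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Passing the critical value in explicitly keeps the sketch free of third-party dependencies; in practice it would usually come from a t-table or a library quantile function.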
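Welch's t-Interval (Unequal Variance)
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2,\, \nu} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Welch's Approximate Degrees of Freedom
$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}}$$
Here,
- $s_1^2, s_2^2$ = Sample variances
- $n_1, n_2$ = Sample sizes
- $\nu$ = Effective degrees of freedom (not necessarily integer)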
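Welch's approximate degrees of freedom and the unpooled interval can be sketched the same way. As before, the function names, the caller-supplied `t_crit`, and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Welch's approximate degrees of freedom (generally not an integer)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def welch_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """CI for mu1 - mu2 without pooling; t_crit is the t critical value at Welch's df."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se

# Made-up summary statistics with clearly unequal spreads.
df = welch_df(1.0, 10, 3.0, 10)
# 2.228 is the 0.975 quantile of t with 10 df: the df is rounded down here,
# which is an assumption of this sketch and a slightly conservative choice.
low, high = welch_interval(10.0, 1.0, 10, 8.0, 3.0, 10, t_crit=2.228)
print(f"df = {df:.2f}, 95% CI: ({low:.3f}, {high:.3f})")
```

With equal standard deviations and equal sample sizes, `welch_df` returns exactly $n_1 + n_2 - 2$, so the Welch and pooled intervals coincide in that case.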
When to Use Welch's t
Welch's t-interval is the safer default because it does not assume equal population variances. The equal-variance pooled t-interval is only appropriate when there is strong evidence that $\sigma_1^2 = \sigma_2^2$.
Derivation: The Two-Sample Pivot
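Theorem: Two-Sample t-Pivot (Equal Variance)
Let $X_{11}, \dots, X_{1n_1} \sim N(\mu_1, \sigma^2)$ and $X_{21}, \dots, X_{2n_2} \sim N(\mu_2, \sigma^2)$ be independent samples. Then
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1+n_2-2}$$
where $s_p^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ is the pooled variance.
Proof sketch: The numerator $(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)$ is normal with mean 0 and variance $\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$. Standardize: $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}} \sim N(0, 1)$. By Cochran's theorem, $\dfrac{(n_1+n_2-2)\,s_p^2}{\sigma^2} \sim \chi^2_{n_1+n_2-2}$, independent of $\bar{X}_1$ and $\bar{X}_2$. Therefore $T = \dfrac{Z}{\sqrt{V/(n_1+n_2-2)}}$ where $V = \dfrac{(n_1+n_2-2)\,s_p^2}{\sigma^2}$, giving $T \sim t_{n_1+n_2-2}$ by definition.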
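To connect the pivot to the interval, the following sketch inverts the probability statement for $T$. Here $T$ denotes the pivot from the theorem, $\bar{X}_1, \bar{X}_2$ the sample means, $s_p$ the pooled standard deviation, and $t^{*} = t_{\alpha/2,\, n_1+n_2-2}$ is shorthand introduced only for this derivation.

```latex
\begin{aligned}
1 - \alpha
  &= P\left(-t^{*} \le T \le t^{*}\right) \\
  &= P\left(-t^{*} \le
       \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}
            {s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
       \le t^{*}\right) \\
  &= P\left((\bar{X}_1 - \bar{X}_2) - t^{*}\, s_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}
       \;\le\; \mu_1 - \mu_2 \;\le\;
       (\bar{X}_1 - \bar{X}_2) + t^{*}\, s_p\sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}\right)
\end{aligned}
```

The endpoints in the last line are exactly the pooled t-interval for $\mu_1 - \mu_2$.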
CI for Difference of Proportions
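$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
Here,
- $\hat{p}_1, \hat{p}_2$ = Sample proportions
- $n_1, n_2$ = Sample sizes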
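A minimal Python sketch of this Wald-type interval for a difference of proportions follows. The function name and the counts are made up for illustration; the normal critical value comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def wald_diff_interval(x1, n1, x2, n2, conf=0.95):
    """Wald CI for p1 - p2 from success counts x1, x2 and sample sizes n1, n2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided normal critical value
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical A/B test: 60 of 200 conversions versus 40 of 200.
low, high = wald_diff_interval(60, 200, 40, 200)
print(f"95% CI for p1 - p2: ({low:.4f}, {high:.4f})")
```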
Wilson Score Interval
The standard Wald interval above can have poor coverage when a proportion is near 0 or 1. The Wilson score interval provides better finite-sample coverage by adjusting both the width and the center of the interval.
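For illustration, the sketch below implements the single-proportion Wilson score interval and then combines two of them into an interval for a difference using Newcombe's hybrid score method. That combination is swapped in here as an assumption: it is one common way to carry Wilson intervals over to two samples and may not match the exact form this section has in mind. All function names and counts are hypothetical.

```python
import math
from statistics import NormalDist

def wilson_interval(x, n, conf=0.95):
    """Wilson score interval for a single proportion x / n."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom  # center is pulled toward 1/2
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def newcombe_diff_interval(x1, n1, x2, n2, conf=0.95):
    """Newcombe hybrid score interval for p1 - p2 (assumed combination rule)."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, conf)
    l2, u2 = wilson_interval(x2, n2, conf)
    diff = p1 - p2
    low = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    high = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return low, high

# Hypothetical small samples where the Wald interval is least reliable.
low, high = newcombe_diff_interval(1, 12, 5, 14)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

Unlike the Wald interval, the Wilson interval never extends below 0 or above 1, even when the observed count is 0 or $n$.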
Worked Example: Clinical Trial
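Two drugs are compared for blood pressure reduction. Drug A: sample size $n_1$, mean reduction $\bar{X}_1$ mmHg, standard deviation $s_1$. Drug B: sample size $n_2$, mean reduction $\bar{X}_2$ mmHg, standard deviation $s_2$. Construct a 95% CI for $\mu_1 - \mu_2$.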
Step 1: Check whether equal variance is reasonable by comparing the spreads of the two samples. Their ratio is less than 2, so the pooled t-interval is reasonable.
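Step 2: Compute pooled standard deviation: $s_p = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
Step 3: Compute standard error: $SE = s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
Step 4: Critical value: $t_{0.025,\, n_1+n_2-2}$ (interpolating from t-table).
Step 5: Construct CI: $(\bar{X}_1 - \bar{X}_2) \pm t_{0.025,\, n_1+n_2-2} \cdot SE = 1.7 \pm 1.37 = (0.33,\ 3.07)$ mmHg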
Interpretation
Since the 95% CI for $\mu_1 - \mu_2$ does not contain 0, we conclude Drug A provides significantly greater blood pressure reduction than Drug B at the $\alpha = 0.05$ level. The difference is estimated at 1.7 mmHg with a margin of error of 1.37 mmHg.
Key Takeaways
Summary: Confidence Intervals for Two Samples
- Estimates the difference between two population parameters
- Means (equal variance): pooled t-interval with $df = n_1 + n_2 - 2$
- Means (unequal variance): Welch's t-interval with approximate df (safer default)
- Proportions: $z$-interval based on difference of sample proportions; prefer Wilson score for small $n$
- If the CI includes 0, the difference is not statistically significant at that level
- The pooled t-interval is not robust to unequal variances; always check or use Welch