
Confidence Intervals for Two Samples — Comparing Groups


Foundations of Statistics

Comparing Groups with Precision

Two-sample intervals estimate the difference between population parameters, providing the foundation for comparing treatments, groups, or conditions. They answer the practical question: how different are these groups really?

  • A/B Testing — Quantifying the true difference in website conversion rates
  • Clinical Trials — Estimating treatment versus control group differences
  • Social Science — Measuring effect sizes in observational studies

The difference between groups is often more important than the groups themselves.


Core Concepts

Two-sample confidence intervals estimate the difference between two population parameters (means or proportions). They are the foundation for comparing groups.

Definition: Two-Sample Confidence Interval

A $(1-\alpha)\times 100\%$ confidence interval for $\mu_1 - \mu_2$ (or $p_1 - p_2$) quantifies the uncertainty in the difference between two population parameters.

CI for Difference of Means (Independent, Equal Variance)

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\, n_1+n_2-2} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Here,

  • $\bar{x}_1, \bar{x}_2$ = Sample means
  • $s_p$ = Pooled standard deviation
  • $n_1, n_2$ = Sample sizes
  • $t_{\alpha/2,\, n_1+n_2-2}$ = Critical value with pooled df

Pooled Standard Deviation

$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$

Here,

  • $s_1, s_2$ = Sample standard deviations
  • $n_1, n_2$ = Sample sizes
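The pooled interval can be computed directly from summary statistics. The following is a minimal Python sketch, assuming SciPy is available for the t critical value; the function name `pooled_t_interval` and its parameter names are illustrative, not part of any library.

```python
from math import sqrt
from scipy import stats

def pooled_t_interval(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Equal-variance (pooled) t-interval for mu1 - mu2 from summary statistics."""
    df = n1 + n2 - 2
    # Pooled standard deviation: df-weighted average of the two sample variances.
    sp = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    # Standard error of the difference in sample means.
    se = sp * sqrt(1 / n1 + 1 / n2)
    # Two-sided critical value t_{alpha/2, df}.
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se
```

The function returns the lower and upper limits as a tuple, so it can be unpacked as `lower, upper = pooled_t_interval(...)`.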

Welch's t-Interval (Unequal Variance)

Welch's Approximate Degrees of Freedom

$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$$

Here,

  • $s_1^2, s_2^2$ = Sample variances
  • $n_1, n_2$ = Sample sizes
  • $\nu$ = Effective degrees of freedom (not necessarily an integer)
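Welch's interval replaces the pooled standard error with $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ and uses $\nu$ as the degrees of freedom, giving $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\sqrt{s_1^2/n_1 + s_2^2/n_2}$. A minimal Python sketch under that standard form, assuming SciPy is available; the function name `welch_t_interval` is illustrative.

```python
from math import sqrt
from scipy import stats

def welch_t_interval(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Welch (unequal-variance) t-interval for mu1 - mu2 from summary statistics."""
    # Per-group variance of the sample mean: s_i^2 / n_i.
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = sqrt(v1 + v2)
    # Welch-Satterthwaite approximate degrees of freedom (usually not an integer).
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # scipy.stats.t.ppf accepts non-integer degrees of freedom.
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, nu)
    diff = mean1 - mean2
    return (diff - t_crit * se, diff + t_crit * se), nu
```

Returning $\nu$ alongside the interval makes it easy to see how far the effective degrees of freedom fall below $n_1 + n_2 - 2$; with equal sample sizes and equal sample variances the two coincide.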

When to Use Welch's t

Welch's t-interval is the safer default because it does not assume equal population variances. The equal-variance pooled t-interval is only appropriate when there is strong evidence that $\sigma_1^2 = \sigma_2^2$.


Derivation: The Two-Sample Pivot

Theorem: Two-Sample t-Pivot (Equal Variance)

Let $X_1, \ldots, X_{n_1} \sim \mathcal{N}(\mu_1, \sigma^2)$ and $Y_1, \ldots, Y_{n_2} \sim \mathcal{N}(\mu_2, \sigma^2)$ be independent samples. Then

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}} \sim t_{n_1+n_2-2}$$

where $S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}$ is the pooled variance.

Proof sketch: The numerator $\bar{X} - \bar{Y} - (\mu_1 - \mu_2)$ is normal with mean 0 and variance $\sigma^2(1/n_1 + 1/n_2)$. Standardize: $Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}} \sim \mathcal{N}(0,1)$. By Cochran's theorem, $Q = (n_1-1)S_1^2/\sigma^2 + (n_2-1)S_2^2/\sigma^2 \sim \chi^2_{n_1+n_2-2}$, independent of $\bar{X}$ and $\bar{Y}$. Since $S_p^2 = \sigma^2 Q/(n_1+n_2-2)$, the unknown $\sigma$ cancels and $T = Z/\sqrt{Q/(n_1+n_2-2)}$, which is $t_{n_1+n_2-2}$ by definition.
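One practical consequence of the pivot is that the pooled interval should cover $\mu_1 - \mu_2$ in about $(1-\alpha)\times 100\%$ of repeated samples when both populations are normal with a common variance. The Monte Carlo sketch below checks this empirically, assuming NumPy and SciPy are available; the function name `pooled_coverage` and the default settings (5000 replications, seed 0) are illustrative choices.

```python
import numpy as np
from scipy import stats

def pooled_coverage(mu1, mu2, sigma, n1, n2, reps=5000, confidence=0.95, seed=0):
    """Fraction of simulated pooled t-intervals that contain mu1 - mu2."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated pair of samples with a common sigma.
    x = rng.normal(mu1, sigma, size=(reps, n1))
    y = rng.normal(mu2, sigma, size=(reps, n2))
    df = n1 + n2 - 2
    # Pooled variance S_p^2 for every replication.
    sp2 = ((n1 - 1) * x.var(axis=1, ddof=1) + (n2 - 1) * y.var(axis=1, ddof=1)) / df
    se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    diff = x.mean(axis=1) - y.mean(axis=1)
    # The interval covers the truth exactly when |T| <= t_crit.
    covered = np.abs(diff - (mu1 - mu2)) <= t_crit * se
    return covered.mean()
```

With normal data and equal variances the returned fraction should sit close to the nominal confidence level, up to simulation noise.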


CI for Difference of Proportions

Wald Interval for the Difference of Proportions

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Here,

  • $\hat{p}_1, \hat{p}_2$ = Sample proportions
  • $n_1, n_2$ = Sample sizes
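The Wald formula translates almost line for line into code. A minimal Python sketch using only the standard library (`statistics.NormalDist` for the normal quantile); the function name `wald_diff_interval` and the count-based arguments `x1`, `x2` (numbers of successes) are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def wald_diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Wald z-interval for p1 - p2 from success counts x1, x2 and sample sizes n1, n2."""
    p1, p2 = x1 / n1, x2 / n2
    # Two-sided critical value z_{alpha/2}.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error of the difference in sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se
```

Note that when both sample proportions are exactly 0 or 1 the standard error is 0 and the interval collapses to a single point, which is one symptom of the coverage problems of the Wald interval.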

Wilson Score Interval

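The standard Wald interval above can have poor coverage when $\hat{p}_1$ or $\hat{p}_2$ is near 0 or 1, or when the samples are small. The Wilson score interval provides better finite-sample coverage. For a single proportion it is

$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n}}{1 + \frac{z_{\alpha/2}^2}{n}} \pm \frac{z_{\alpha/2}}{1 + \frac{z_{\alpha/2}^2}{n}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}$$

For the difference $p_1 - p_2$, Newcombe's hybrid score method computes the Wilson limits $(l_1, u_1)$ and $(l_2, u_2)$ for each proportion and combines them:

$$\left[(\hat{p}_1 - \hat{p}_2) - \sqrt{(\hat{p}_1 - l_1)^2 + (u_2 - \hat{p}_2)^2},\ \ (\hat{p}_1 - \hat{p}_2) + \sqrt{(u_1 - \hat{p}_1)^2 + (\hat{p}_2 - l_2)^2}\right]$$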
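One common way to carry the Wilson score idea over to a difference of proportions is Newcombe's hybrid score method: compute a Wilson interval for each proportion, then combine the distances from each estimate to its limits. The Python sketch below assumes that method and uses only the standard library; the function names `wilson_interval` and `newcombe_diff_interval` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(x, n, z):
    """Wilson score interval for a single proportion with x successes out of n."""
    p = x / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

def newcombe_diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Newcombe hybrid score interval for p1 - p2 (assumed method, see lead-in)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1 - p2
    # Lower limit pairs the lower side of group 1 with the upper side of group 2.
    lower = diff - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    # Upper limit pairs the upper side of group 1 with the lower side of group 2.
    upper = diff + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper
```

Unlike the Wald interval, this construction still has positive width when a sample proportion is exactly 0 or 1, and its limits stay within $[-1, 1]$.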


Worked Example: Clinical Trial

Two drugs are compared for blood pressure reduction. Drug A: $n_1 = 40$, $\bar{x}_1 = 8.2$ mmHg, $s_1 = 3.1$. Drug B: $n_2 = 35$, $\bar{x}_2 = 6.5$ mmHg, $s_2 = 2.8$. Construct a 95% CI for $\mu_1 - \mu_2$.

Step 1: Check whether equal variance is reasonable: $s_1^2/s_2^2 = 9.61/7.84 = 1.23$. The ratio is below 2, a common rule of thumb, so the pooled t-interval is reasonable here.

Step 2: Compute pooled standard deviation:

$$s_p = \sqrt{\frac{39 \times 9.61 + 34 \times 7.84}{73}} = \sqrt{\frac{374.79 + 266.56}{73}} = \sqrt{8.786} = 2.964$$

Step 3: Compute standard error:

$$SE = 2.964\sqrt{\frac{1}{40} + \frac{1}{35}} = 2.964\sqrt{0.025 + 0.02857} = 2.964 \times 0.2315 = 0.686$$

Step 4: Critical value: $t_{0.025, 73} \approx 1.993$ (interpolating from a t-table).

Step 5: Construct CI:

$$(8.2 - 6.5) \pm 1.993 \times 0.686 = 1.7 \pm 1.367 = [0.333,\ 3.067]$$

Interpretation

Since the 95% CI for $\mu_1 - \mu_2$ does not contain 0, we conclude Drug A provides significantly greater blood pressure reduction than Drug B at the $\alpha = 0.05$ level. The effect size is estimated at 1.7 mmHg with margin of error 1.37 mmHg.
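The hand calculation in Steps 2 to 5 can be reproduced in a few lines. This Python sketch uses the summary statistics from the example and assumes SciPy is available for the critical value (in place of the t-table lookup); the variable names are illustrative.

```python
from math import sqrt
from scipy import stats

# Summary statistics from the worked example (Drug A vs Drug B).
n1, mean1, sd1 = 40, 8.2, 3.1
n2, mean2, sd2 = 35, 6.5, 2.8

df = n1 + n2 - 2

# Step 2: pooled standard deviation.
sp = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)

# Step 3: standard error of the difference in means.
se = sp * sqrt(1 / n1 + 1 / n2)

# Step 4: exact critical value instead of table interpolation.
t_crit = stats.t.ppf(0.975, df)

# Step 5: 95% confidence interval for mu1 - mu2.
diff = mean1 - mean2
lower, upper = diff - t_crit * se, diff + t_crit * se

print(f"sp = {sp:.3f}, SE = {se:.3f}, t = {t_crit:.3f}")
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

Small differences in the last digit relative to the hand calculation can arise because the code carries full precision through every step rather than rounding intermediate values.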


Key Takeaways

Summary: Confidence Intervals for Two Samples

  • Estimates the difference between two population parameters
  • Means (equal variance): pooled t-interval with $df = n_1 + n_2 - 2$
  • Means (unequal variance): Welch's t-interval with approximate df (safer default)
  • Proportions: $z$-interval based on difference of sample proportions; prefer Wilson score for small $n$
  • If the CI includes 0, the difference is not statistically significant at that level
  • The pooled t-interval is not robust to unequal variances; always check or use Welch
