
Confidence Intervals for Variance — Chi-Square Interval


Quantifying Uncertainty in Variability

Confidence intervals for a variance are built from the chi-square distribution, whose skewness places the bounds at unequal distances from the point estimate. Understanding this asymmetry is crucial for interpreting the precision of variability estimates.

  • Manufacturing — Assessing process consistency and setting tolerance specifications
  • Finance — Estimating volatility ranges for risk management
  • Quality Engineering — Monitoring measurement system variability

Variance intervals reveal that precision itself is uncertain.


Core Concepts

Confidence intervals for variance use the chi-square distribution. Unlike intervals for the mean, these intervals are asymmetric — the lower and upper bounds are not equidistant from the point estimate.

Definition: Chi-Square Confidence Interval for Variance

A $(1-\alpha)\times 100\%$ confidence interval for the population variance $\sigma^2$ is based on the pivotal quantity $(n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$.

Confidence Interval for Variance

$$\left[\frac{(n-1)s^2}{\chi^2_{\alpha/2,\, n-1}}, \quad \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\, n-1}}\right]$$

Here,

  • $n$ = Sample size
  • $s^2$ = Sample variance
  • $\chi^2_{\alpha/2,\, n-1}$ = Upper critical value (upper-tail area $\alpha/2$)
  • $\chi^2_{1-\alpha/2,\, n-1}$ = Lower critical value (upper-tail area $1-\alpha/2$)
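
As a minimal sketch of the formula above, assuming SciPy is available, the interval can be computed from summary statistics. The function name `variance_ci` and its parameters are illustrative and not part of the lesson. Note that `scipy.stats.chi2.ppf` takes a lower-tail probability, so the upper critical value $\chi^2_{\alpha/2,\, n-1}$ corresponds to `ppf(1 - alpha / 2)`.

```python
from scipy import stats


def variance_ci(s2, n, alpha=0.05):
    """Equal-tailed chi-square CI for a normal population variance.

    s2 is the sample variance (n - 1 denominator), n the sample size.
    """
    df = n - 1
    # ppf works with lower-tail probabilities, so the critical value with
    # upper-tail area alpha/2 is the (1 - alpha/2) quantile.
    chi2_upper = stats.chi2.ppf(1 - alpha / 2, df)
    chi2_lower = stats.chi2.ppf(alpha / 2, df)
    # Larger critical value in the denominator gives the smaller bound.
    return df * s2 / chi2_upper, df * s2 / chi2_lower


# Illustrative inputs only
lower, upper = variance_ci(s2=4.0, n=10)
print(f"95% CI for sigma^2: [{lower:.3f}, {upper:.3f}]")
```

Mixing up the two quantiles is the most common slip with this interval, which is why the sketch names them by size rather than by which bound they produce.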

Asymmetric Intervals

The chi-square interval is not symmetric about $s^2$. The chi-square distribution is right-skewed and the critical values enter the bounds through reciprocals, so the upper bound lies farther from $s^2$ than the lower bound does, making the interval wider on the right.
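
To see the asymmetry numerically, the sketch below expresses each bound's distance from $s^2$ as a fraction of $s^2$. These fractions depend only on the degrees of freedom and $\alpha$. The helper name `relative_gaps` is an assumption made for illustration.

```python
from scipy import stats


def relative_gaps(df, alpha=0.05):
    """Distances of the CI bounds from s^2, as fractions of s^2."""
    chi2_upper = stats.chi2.ppf(1 - alpha / 2, df)
    chi2_lower = stats.chi2.ppf(alpha / 2, df)
    below = 1 - df / chi2_upper   # (s^2 - lower bound) / s^2
    above = df / chi2_lower - 1   # (upper bound - s^2) / s^2
    return below, above


for df in (5, 24, 100, 1000):
    below, above = relative_gaps(df)
    print(f"df={df:5d}  below={below:.3f}  above={above:.3f}")
```

The gap above $s^2$ should exceed the gap below it at every value of `df` tried, with the two moving closer together as the degrees of freedom grow.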


Confidence Interval for Standard Deviation

CI for Standard Deviation

$$\left[\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\, n-1}}}, \quad \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\, n-1}}}\right]$$

Here,

  • $s$ = Sample standard deviation
  • $n$ = Sample size
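
A short sketch of the same calculation starting from raw data rather than summary statistics. The function name `std_ci` and the measurements are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy import stats


def std_ci(data, alpha=0.05):
    """Chi-square CI for sigma: square roots of the variance CI endpoints."""
    x = np.asarray(data, dtype=float)
    df = x.size - 1
    s2 = x.var(ddof=1)  # sample variance with n - 1 denominator
    var_lo = df * s2 / stats.chi2.ppf(1 - alpha / 2, df)
    var_hi = df * s2 / stats.chi2.ppf(alpha / 2, df)
    return np.sqrt(var_lo), np.sqrt(var_hi)


# Hypothetical measurements, for illustration only
sample = [9.8, 10.2, 10.1, 9.9, 10.4, 9.7, 10.0, 10.3]
sd_lo, sd_hi = std_ci(sample)
print(f"s = {np.std(sample, ddof=1):.4f}")
print(f"95% CI for sigma: [{sd_lo:.4f}, {sd_hi:.4f}]")
```

Because the square root is monotone increasing, transforming the endpoints preserves the coverage of the variance interval.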

Derivation from the Sampling Distribution

Theorem: Chi-Square Pivot for Variance

Let $X_1, X_2, \ldots, X_n$ be i.i.d. $\mathcal{N}(\mu, \sigma^2)$. Define the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$. Then the pivotal quantity

$$Q = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$

has a chi-square distribution with $n-1$ degrees of freedom, independent of $\mu$.

Proof sketch: Standardize each observation: $Z_i = (X_i - \mu)/\sigma \sim \mathcal{N}(0,1)$. The sum of squared standard normals is $\sum Z_i^2 \sim \chi^2_n$. Decompose this sum: $\sum Z_i^2 = \sum (X_i - \bar{X})^2/\sigma^2 + n(\bar{X} - \mu)^2/\sigma^2$. The first term on the right is $(n-1)S^2/\sigma^2$ and the second is $\chi^2_1$. For normal samples, $\bar{X}$ and $S^2$ are independent (by Cochran's theorem, or alternatively Basu's theorem), so the two terms are independent and the first must be $\chi^2_{n-1}$.
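
The step from the pivot to the interval is not spelled out above. A short sketch, using the lesson's convention that $\chi^2_{p,\, n-1}$ denotes the value with upper-tail area $p$:

```latex
\begin{aligned}
1-\alpha
  &= P\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) \\
  &= P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)
\end{aligned}
```

Taking reciprocals reverses the inequalities, which is why the upper critical value ends up in the lower bound and the lower critical value in the upper bound.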

Why the Chi-Square Distribution Appears

The chi-square distribution arises as the distribution of a sum of squared independent standard normals. The key insight is that the sample variance, when properly scaled, is such a sum, but the degrees of freedom are reduced by 1 because $\bar{X}$ is estimated from the data rather than known.
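
A quick simulation can make the pivot result tangible. The sketch below (the values of `n`, `mu`, `sigma`, and the seed are arbitrary choices for illustration) draws many normal samples, computes $Q = (n-1)S^2/\sigma^2$ for each, and compares the result with a $\chi^2_{n-1}$ distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, mu, sigma = 10, 5.0, 2.0   # arbitrary illustrative values
reps = 20_000

# One row per simulated sample of size n
samples = rng.normal(mu, sigma, size=(reps, n))
q = (n - 1) * samples.var(axis=1, ddof=1) / sigma**2  # pivot Q per sample

# A chi-square with n - 1 df has mean n - 1 and variance 2(n - 1)
print(f"mean of Q: {q.mean():.3f} (theory {n - 1})")
print(f"var of Q:  {q.var():.3f} (theory {2 * (n - 1)})")

# Kolmogorov-Smirnov distance between simulated Q and chi2(n - 1)
ks = stats.kstest(q, "chi2", args=(n - 1,))
print(f"KS statistic: {ks.statistic:.4f}")
```

Changing `mu` should leave the output essentially unchanged, since the sample variance does not depend on the location of the data.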


Worked Example: Quality Control

A quality control engineer measures the diameter of 25 ball bearings. The sample variance is $s^2 = 0.0036\ \text{mm}^2$. Construct a 95% CI for $\sigma^2$.

Step 1: Identify parameters: $n = 25$, $s^2 = 0.0036$, $\alpha = 0.05$, $df = 24$.

Step 2: Find chi-square critical values:

$$\chi^2_{0.025,\, 24} = 39.364, \quad \chi^2_{0.975,\, 24} = 12.401$$

Step 3: Compute the interval:

$$\left[\frac{24 \times 0.0036}{39.364}, \quad \frac{24 \times 0.0036}{12.401}\right] = [0.00219, \quad 0.00697]$$

Step 4: For standard deviation, take square roots:

$$[\sqrt{0.00219}, \quad \sqrt{0.00697}] = [0.0468, \quad 0.0835]\ \text{mm}$$

Interpretation

We are 95% confident that the true variance lies in $[0.00219, 0.00697]\ \text{mm}^2$ and the true standard deviation lies in $[0.0468, 0.0835]\ \text{mm}$. Note the asymmetry: the variance interval extends about $0.0014$ below $s^2 = 0.0036$ but about $0.0034$ above it, and the upper bound is $3.18\times$ the lower bound.
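
The hand calculation can be checked in a few lines, assuming SciPy is available. The variable names are illustrative, and the printed values should agree with Steps 2 to 4 up to rounding.

```python
from math import sqrt

from scipy import stats

n, s2, alpha = 25, 0.0036, 0.05
df = n - 1

# Step 2: critical values (ppf takes lower-tail probabilities)
chi2_upper = stats.chi2.ppf(1 - alpha / 2, df)  # upper-tail area 0.025
chi2_lower = stats.chi2.ppf(alpha / 2, df)      # upper-tail area 0.975

# Step 3: interval for the variance, in mm^2
var_lo, var_hi = df * s2 / chi2_upper, df * s2 / chi2_lower

# Step 4: interval for the standard deviation, in mm
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)

print(f"critical values: {chi2_upper:.3f}, {chi2_lower:.3f}")
print(f"variance CI:     [{var_lo:.5f}, {var_hi:.5f}]")
print(f"std dev CI:      [{sd_lo:.4f}, {sd_hi:.4f}]")
```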


Sensitivity to Non-Normality

Theorem: Robustness Failure of Chi-Square CI

The chi-square confidence interval for $\sigma^2$ is not robust to departures from normality. If the underlying distribution has excess kurtosis $\kappa_4 > 0$, the actual coverage probability can be substantially lower than the nominal $1 - \alpha$.

Proof sketch: For a non-normal population with excess kurtosis $\kappa_4$, the statistic $(n-1)S^2/\sigma^2$ no longer exactly follows $\chi^2_{n-1}$. In particular, $\mathrm{Var}(S^2) = \sigma^4\left[\frac{2}{n-1} + \frac{\kappa_4}{n}\right]$, which exceeds the normal-theory value whenever $\kappa_4 > 0$. A Cornish-Fisher expansion shows the leading correction is proportional to $\kappa_4 / n$. For heavy-tailed distributions (e.g., $t_5$ with $\kappa_4 = 6$), the true coverage can be around 90% when 95% is nominal.

Practical Consequence

Unlike confidence intervals for the mean (which are robust via the CLT), the variance interval requires normality. With skewed or heavy-tailed data, use bootstrap methods instead.
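
The under-coverage described above can be explored with a small Monte Carlo sketch. The helper `chi2_ci_coverage`, the sample size, the seed, and the number of replications are all assumptions made for illustration. The exact coverage figures will vary with these choices.

```python
import numpy as np
from scipy import stats


def chi2_ci_coverage(draw, true_var, n=25, alpha=0.05, reps=5000):
    """Fraction of chi-square intervals that contain true_var.

    draw(size) must return an array of i.i.d. draws with the given shape.
    """
    df = n - 1
    s2 = draw((reps, n)).var(axis=1, ddof=1)  # one sample variance per row
    lo = df * s2 / stats.chi2.ppf(1 - alpha / 2, df)
    hi = df * s2 / stats.chi2.ppf(alpha / 2, df)
    return np.mean((lo <= true_var) & (true_var <= hi))


rng = np.random.default_rng(1)

# Normal data: the interval's assumptions hold
cov_normal = chi2_ci_coverage(lambda size: rng.normal(size=size), true_var=1.0)

# t with 5 df: heavy tails; its variance is nu / (nu - 2) = 5/3
cov_t5 = chi2_ci_coverage(lambda size: rng.standard_t(5, size=size), true_var=5 / 3)

print(f"coverage, normal data: {cov_normal:.3f}")
print(f"coverage, t5 data:     {cov_t5:.3f}")
```

With normal data the estimated coverage should sit close to the nominal 95%, while the heavy-tailed case should fall noticeably short of it.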


Python Implementation: Bootstrap Comparison

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 25
sigma_true = 1.0
alpha = 0.05
data = rng.normal(loc=0.0, scale=sigma_true, size=n)
s2 = np.var(data, ddof=1)  # sample variance with n - 1 denominator

# Chi-square CI (parametric). ppf takes lower-tail probabilities:
# the larger quantile gives the lower bound, the smaller the upper bound.
chi2_upper = stats.chi2.ppf(1 - alpha / 2, df=n-1)
chi2_lower = stats.chi2.ppf(alpha / 2, df=n-1)
ci_parametric = [(n-1)*s2 / chi2_upper, (n-1)*s2 / chi2_lower]
print(f"Parametric CI for σ²: [{ci_parametric[0]:.4f}, {ci_parametric[1]:.4f}]")

# Bootstrap percentile CI (non-parametric): resample rows with replacement
B = 10000
boot_vars = np.var(rng.choice(data, size=(B, n), replace=True), axis=1, ddof=1)
ci_bootstrap = np.percentile(boot_vars, [2.5, 97.5])
print(f"Bootstrap CI for σ²:  [{ci_bootstrap[0]:.4f}, {ci_bootstrap[1]:.4f}]")

# Compare coverage over M repeated normal samples
coverage_param = 0
coverage_boot = 0
M = 1000
for _ in range(M):
    sample = rng.normal(0, sigma_true, n)
    sv = np.var(sample, ddof=1)
    # Critical values depend only on n and alpha, so reuse them
    lo_p, hi_p = (n-1)*sv/chi2_upper, (n-1)*sv/chi2_lower
    if lo_p <= sigma_true**2 <= hi_p:
        coverage_param += 1
    # 1000 bootstrap resamples of this sample, one per row
    boot_v = np.var(rng.choice(sample, size=(1000, n), replace=True), axis=1, ddof=1)
    lo_b, hi_b = np.percentile(boot_v, [2.5, 97.5])
    if lo_b <= sigma_true**2 <= hi_b:
        coverage_boot += 1
print(f"Parametric coverage: {coverage_param/M:.3f}")
print(f"Bootstrap coverage:  {coverage_boot/M:.3f}")

Key Takeaways

Summary: Confidence Intervals for Variance

  • Based on $(n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$
  • Asymmetric interval: lower and upper bounds are not equidistant from $s^2$
  • Requires the population to be normally distributed (sensitive to non-normality)
  • For standard deviation, take square roots of the variance interval endpoints
  • Not robust to kurtosis: heavy tails cause under-coverage; prefer bootstrap for non-normal data
  • Critical values are not symmetric about the chi-square mean $\nu$: $\chi^2_{\alpha/2,\nu} + \chi^2_{1-\alpha/2,\nu} \neq 2\nu$, which contributes to the asymmetry of the interval
