Chi-Square Distribution — Sum of Squared Normals

The Foundation of Variance-Based Inference

The chi-square distribution arises from summing squared normal variables, making it essential for testing variances and independence. Its applications span from quality control to genetics, wherever squared deviations matter.

  • Genetics — Testing Hardy-Weinberg equilibrium in population studies
  • Manufacturing — Quality control through variance testing and goodness-of-fit
  • Market Research — Analyzing survey response patterns against expected distributions

The chi-square distribution connects normal theory to categorical data analysis.


Core Concepts

The chi-square distribution arises as the sum of squared independent standard normal random variables. It is fundamental to tests of variance and independence.

Definition: Chi-Square Distribution

If $Z_1, Z_2, \ldots, Z_\nu$ are independent standard normals, then $Q = Z_1^2 + Z_2^2 + \cdots + Z_\nu^2$ follows a chi-square distribution with $\nu$ degrees of freedom, written $Q \sim \chi^2_\nu$.

PDF of Chi-Square Distribution

$$f(x) = \frac{x^{\nu/2 - 1} e^{-x/2}}{2^{\nu/2}\, \Gamma(\nu/2)}, \quad x > 0$$

Here,

  • $\nu$ = degrees of freedom
  • $\Gamma$ = gamma function
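As a quick sanity check on the density formula, the following sketch evaluates it on the log scale and integrates it numerically. The function names `chi2_pdf` and `midpoint_integral` are illustrative choices, not part of the lesson, and the integration range [0, 100] assumes a small $\nu$ so that the truncated tail is negligible.

```python
import math

def chi2_pdf(x: float, nu: float) -> float:
    """Chi-square density with nu degrees of freedom (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    # Work on the log scale so that large nu does not overflow the gamma function.
    log_f = ((nu / 2 - 1) * math.log(x) - x / 2
             - (nu / 2) * math.log(2) - math.lgamma(nu / 2))
    return math.exp(log_f)

def midpoint_integral(f, a: float, b: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

nu = 4
total_mass = midpoint_integral(lambda x: chi2_pdf(x, nu), 0.0, 100.0)
mean = midpoint_integral(lambda x: x * chi2_pdf(x, nu), 0.0, 100.0)
print(f"total mass ~ {total_mass:.6f}, mean ~ {mean:.6f}")
```

For $\nu = 4$ the total mass should come out very close to 1 and the mean very close to $\nu$, which is a useful check that the normalizing constant $2^{\nu/2}\Gamma(\nu/2)$ has been typed correctly.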

Special Cases

  • $\chi^2_1$ = distribution of the square of a single standard normal, $Z^2$
  • $\chi^2_2$ = exponential distribution with rate $1/2$ (i.e., $\chi^2_2 \sim \text{Exp}(1/2)$, which has mean 2)
  • When $\nu$ is large, $\chi^2_\nu \approx N(\nu, 2\nu)$ (mean $\nu$, variance $2\nu$) by the CLT
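The special cases can be checked directly against the density formula. The sketch below compares the chi-square density with the $\text{Exp}(1/2)$ density for $\nu = 2$, with the density of $Z^2$ for $\nu = 1$, and with the $N(\nu, 2\nu)$ density at $x = \nu$ for a large $\nu$. The helper name `chi2_pdf` and the choice $\nu = 200$ are assumptions made for illustration.

```python
import math

def chi2_pdf(x: float, nu: float) -> float:
    """Chi-square density with nu degrees of freedom, for x > 0."""
    log_f = ((nu / 2 - 1) * math.log(x) - x / 2
             - (nu / 2) * math.log(2) - math.lgamma(nu / 2))
    return math.exp(log_f)

points = (0.5, 1.0, 3.0, 7.5)

# nu = 2: the density should equal that of an exponential with rate 1/2.
exp_match = all(
    math.isclose(chi2_pdf(x, 2), 0.5 * math.exp(-x / 2), rel_tol=1e-9)
    for x in points
)

# nu = 1: the density should equal that of Z^2, namely exp(-x/2) / sqrt(2*pi*x).
z2_match = all(
    math.isclose(chi2_pdf(x, 1), math.exp(-x / 2) / math.sqrt(2 * math.pi * x),
                 rel_tol=1e-9)
    for x in points
)

# Large nu: compare with the N(nu, 2*nu) density, both evaluated at x = nu.
big_nu = 200
normal_at_mean = 1.0 / math.sqrt(2 * math.pi * 2 * big_nu)
ratio = chi2_pdf(big_nu, big_nu) / normal_at_mean

print(exp_match, z2_match, ratio)
```

The ratio in the last comparison should be close to 1, and it moves closer as `big_nu` grows.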

Interactive Visualization

[Interactive figure: Chi-Square Distribution Explorer. Density f(x) of χ²(k = 4) for x from 0 to 20, with markers at the mean μ = 4.00, median Md = 3.37, and mode Mo = 2.00. Readout: Mean (μ) = 4.0000, Var = 8.0000, σ = 2.8284.]

[Interactive figure: Effect of Degrees of Freedom on Chi-Square Distribution, shown for χ²(k = 4) with the same readout.]

Mean, Variance, and Higher Moments

Chi-Square Mean and Variance

$$E[\chi^2_\nu] = \nu, \quad \text{Var}(\chi^2_\nu) = 2\nu$$

Here,

  • $\nu$ = degrees of freedom

Proof

Mean: Since $E[Z_i^2] = \text{Var}(Z_i) + (E[Z_i])^2 = 1 + 0 = 1$, we have $E[Q] = \sum_{i=1}^\nu E[Z_i^2] = \nu$.

Variance: Since the $Z_i^2$ are independent and $\text{Var}(Z_i^2) = E[Z_i^4] - (E[Z_i^2])^2 = 3 - 1 = 2$ (using the fourth moment of the standard normal, $E[Z^4] = 3$):

$$\text{Var}(Q) = \sum_{i=1}^\nu \text{Var}(Z_i^2) = 2\nu$$

The skewness is $\gamma_1 = \sqrt{8/\nu}$ and the excess kurtosis is $\gamma_2 = 12/\nu$, both decreasing to 0 as $\nu \to \infty$.
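The moment formulas can be checked by simulation. The sketch below builds chi-square draws directly from the definition (a sum of squared standard normals) and compares the sample mean, variance, and skewness with $\nu$, $2\nu$, and $\sqrt{8/\nu}$. The seed, the sample size, and the helper name `simulate_chi2` are arbitrary illustrative choices.

```python
import math
import random

def simulate_chi2(nu: int, rng: random.Random) -> float:
    """One chi-square draw: the sum of nu squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))

rng = random.Random(12345)  # fixed seed so the run is reproducible
nu = 4
draws = [simulate_chi2(nu, rng) for _ in range(100_000)]

n = len(draws)
sample_mean = sum(draws) / n
m2 = sum((d - sample_mean) ** 2 for d in draws) / n   # second central moment
m3 = sum((d - sample_mean) ** 3 for d in draws) / n   # third central moment
sample_var = m2 * n / (n - 1)                         # unbiased variance
sample_skew = m3 / m2 ** 1.5

print(f"mean {sample_mean:.3f} vs {nu}")
print(f"var  {sample_var:.3f} vs {2 * nu}")
print(f"skew {sample_skew:.3f} vs {math.sqrt(8 / nu):.3f}")
```

The simulated values will not match the theoretical ones exactly; they should agree to within Monte Carlo error, which shrinks as the number of draws grows.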


Relationship to Normal and Gamma

Connections

  • $\chi^2_\nu = \text{Gamma}(\alpha = \nu/2, \beta = 1/2)$, where $\beta$ is the rate parameter
  • $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$ for normal samples (used in variance tests)
  • Sum of independent chi-squares: if $Q_i \sim \chi^2_{\nu_i}$ independently, then $\sum_i Q_i \sim \chi^2_{\sum_i \nu_i}$
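The Gamma connection and the reproductive property can both be illustrated by simulation. One caveat matters here: Python's `random.gammavariate(alpha, beta)` takes `beta` as a scale parameter, so a rate of $1/2$ corresponds to `beta = 2`. The degrees of freedom (3 and 5), the seed, and the helper names are illustrative assumptions.

```python
import random

def chi2_via_gamma(nu: float, rng: random.Random) -> float:
    """Chi-square draw as Gamma(shape = nu/2, rate = 1/2), i.e. scale = 2."""
    return rng.gammavariate(nu / 2, 2.0)

def mean_and_var(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, v

rng = random.Random(7)  # fixed seed for reproducibility
reps = 100_000

# Reproductive property: chi2(3) + chi2(5) should behave like chi2(8).
sums = [chi2_via_gamma(3, rng) + chi2_via_gamma(5, rng) for _ in range(reps)]
sum_mean, sum_var = mean_and_var(sums)

print(f"mean {sum_mean:.3f} vs 8, variance {sum_var:.3f} vs 16")
```

Matching the first two moments does not prove the distributions are equal, but a mean near 8 and a variance near 16 is what $\chi^2_8$ predicts.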

Derivation: Sample Variance and Chi-Square

Theorem: Distribution of Sample Variance

If $X_1, \ldots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$, then $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$.

Proof Sketch

Step 1. Standardize: $Z_i = (X_i - \mu)/\sigma \sim N(0,1)$, so $\sum Z_i^2 \sim \chi^2_n$.

Step 2. Rewrite using the identity $\sum Z_i^2 = n\bar{Z}^2 + \sum(Z_i - \bar{Z})^2$ (the cross term vanishes because $\sum (Z_i - \bar{Z}) = 0$).

Step 3. By Fisher's lemma, $\bar{Z}$ and the deviations $(Z_i - \bar{Z})$ are independent.

Step 4. $\sqrt{n}\,\bar{Z} \sim N(0,1)$, so $n\bar{Z}^2 = (\sqrt{n}\,\bar{Z})^2 \sim \chi^2_1$. Because $n\bar{Z}^2$ and $\sum(Z_i - \bar{Z})^2$ are independent and their sum is $\chi^2_n$, the moment generating functions factor as $(1-2t)^{-n/2} = (1-2t)^{-1/2}\, M(t)$, where $M(t)$ is the MGF of $\sum(Z_i - \bar{Z})^2$. Hence $M(t) = (1-2t)^{-(n-1)/2}$, which is the MGF of $\chi^2_{n-1}$, so $\sum(Z_i - \bar{Z})^2 \sim \chi^2_{n-1}$.

Step 5. Since $X_i - \bar{X} = \sigma(Z_i - \bar{Z})$, we have $s^2 = \frac{\sigma^2}{n-1}\sum(Z_i - \bar{Z})^2$, and therefore $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$.
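The decomposition in Step 2 and the conclusion of Step 5 can both be checked numerically. The sketch below verifies the sum-of-squares identity on one random sample, then simulates $(n-1)s^2/\sigma^2$ many times and compares its mean and variance with those of $\chi^2_{n-1}$. The values $\mu = 10$, $\sigma = 2$, $n = 8$, the seed, and the function name are illustrative assumptions.

```python
import random

rng = random.Random(2024)  # fixed seed for reproducibility
mu, sigma, n = 10.0, 2.0, 8

# Step 2 identity: sum Z_i^2 = n * Zbar^2 + sum (Z_i - Zbar)^2
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
z_bar = sum(z) / n
lhs = sum(v * v for v in z)
rhs = n * z_bar ** 2 + sum((v - z_bar) ** 2 for v in z)
identity_gap = abs(lhs - rhs)

def scaled_sample_variance(rng: random.Random) -> float:
    """Draw a normal sample of size n and return (n - 1) * s^2 / sigma^2."""
    x = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(x) / n
    s2 = sum((v - x_bar) ** 2 for v in x) / (n - 1)
    return (n - 1) * s2 / sigma ** 2

reps = 50_000
q = [scaled_sample_variance(rng) for _ in range(reps)]
q_mean = sum(q) / reps
q_var = sum((v - q_mean) ** 2 for v in q) / (reps - 1)

print(f"identity gap {identity_gap:.2e}")
print(f"mean {q_mean:.3f} vs {n - 1}, variance {q_var:.3f} vs {2 * (n - 1)}")
```

The simulated mean lands near $n - 1$ and not near $n$, which reflects the one degree of freedom spent estimating the mean.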


Worked Example

A quality engineer tests whether a filling machine has variance $\sigma^2 = 4$ mL$^2$. From $n = 25$ bottles, the sample variance is $s^2 = 6.3$ mL$^2$.

Step 1. Compute the chi-square test statistic:

$$Q = \frac{(n-1)s^2}{\sigma_0^2} = \frac{24 \times 6.3}{4} = 37.8$$

Step 2. Under $H_0$, $Q \sim \chi^2_{24}$ with $E[Q] = 24$ and $\text{Var}(Q) = 48$, so $\text{SD}(Q) = \sqrt{48} \approx 6.93$.

Step 3. The observed value 37.8 is $(37.8 - 24)/6.93 \approx 1.99$ standard deviations above the mean.

Step 4. For a two-sided test at $\alpha = 0.05$, the critical values are $\chi^2_{0.975, 24} = 12.401$ and $\chi^2_{0.025, 24} = 39.364$ (subscripts denote upper-tail probabilities). Since $12.401 < 37.8 < 39.364$, we fail to reject $H_0$.

Step 5. Because the chi-square distribution is asymmetric, a two-sided p-value cannot be obtained by reflecting the statistic about the mean, as it can for z and t tests. A common convention doubles the smaller tail probability: the p-value is $2 \times \min(P(Q \leq 37.8), P(Q > 37.8))$.
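The tail probabilities in Step 5 can be computed without a statistics library when $\nu$ is even, because the upper tail then has the closed form $P(Q > x) = e^{-x/2} \sum_{k=0}^{\nu/2 - 1} (x/2)^k / k!$. The sketch below uses that identity for $\nu = 24$. The function name `chi2_sf_even` is a hypothetical helper, and the approach does not extend to odd $\nu$.

```python
import math

def chi2_sf_even(x: float, nu: int) -> float:
    """P(Q > x) for Q ~ chi-square with an even number nu of degrees of freedom."""
    if nu <= 0 or nu % 2 != 0:
        raise ValueError("this closed form requires a positive even nu")
    if x <= 0:
        return 1.0
    half = x / 2
    term = 1.0    # k = 0 term of the finite sum
    total = 1.0
    for k in range(1, nu // 2):
        term *= half / k   # (x/2)^k / k!, built up incrementally
        total += term
    return math.exp(-half) * total

n, s2, sigma0_sq = 25, 6.3, 4.0
q_stat = (n - 1) * s2 / sigma0_sq           # test statistic from Step 1
upper_tail = chi2_sf_even(q_stat, n - 1)    # P(Q > 37.8)
lower_tail = 1.0 - upper_tail               # P(Q <= 37.8)
p_value = 2 * min(lower_tail, upper_tail)
reject = p_value < 0.05

print(f"Q = {q_stat:.1f}, upper tail = {upper_tail:.4f}, p-value = {p_value:.4f}")
```

The upper-tail probability falls between 0.025 and 0.05, so the doubled p-value is above 0.05, in agreement with the critical-value comparison in Step 4.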

Asymmetry Matters

Unlike the normal distribution, the chi-square is right-skewed, so the two-sided rejection region is not symmetric about $\nu$. For $\nu = 24$, the lower critical value is 12.4 and the upper is 39.4, which lie 11.6 below and 15.4 above the mean of 24.
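The critical values quoted in the worked example can be reproduced by inverting the tail probability numerically. The sketch below uses bisection on the closed-form upper tail for even $\nu$. The helper names and the search bracket `10 * nu + 100` are assumptions chosen for this illustration.

```python
import math

def chi2_sf_even(x: float, nu: int) -> float:
    """P(Q > x) for chi-square with even nu, via the finite Poisson-type sum."""
    if x <= 0:
        return 1.0
    half, term, total = x / 2, 1.0, 1.0
    for k in range(1, nu // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def upper_tail_quantile(p: float, nu: int) -> float:
    """Value x with P(Q > x) = p, found by bisection (the tail is decreasing in x)."""
    lo, hi = 0.0, 10.0 * nu + 100.0   # bracket assumed wide enough for typical p
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_even(mid, nu) > p:
            lo = mid   # tail still too heavy, move right
        else:
            hi = mid
    return (lo + hi) / 2

nu = 24
lower_crit = upper_tail_quantile(0.975, nu)
upper_crit = upper_tail_quantile(0.025, nu)
print(f"lower {lower_crit:.3f}, upper {upper_crit:.3f}")
print(f"distances from mean: {nu - lower_crit:.2f} below, {upper_crit - nu:.2f} above")
```

The two distances from the mean differ by several units, which is the asymmetry described above.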


Normal Approximation

Theorem: Wilson–Hilferty Transformation

For large $\nu$, the chi-square distribution can be approximated by a normal via:

$$\left(\frac{\chi^2_\nu}{\nu}\right)^{1/3} \approx N\left(1 - \frac{2}{9\nu},\; \frac{2}{9\nu}\right)$$

This approximation is remarkably accurate even for $\nu$ as small as 5.
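To see how well the transformation performs, the sketch below compares the Wilson–Hilferty CDF approximation with the exact CDF, using the closed-form tail available for even $\nu$. The test points, the choice of $\nu = 6$ and $\nu = 24$, and the function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def chi2_cdf_even(x: float, nu: int) -> float:
    """Exact P(Q <= x) for chi-square with even nu."""
    if x <= 0:
        return 0.0
    half, term, total = x / 2, 1.0, 1.0
    for k in range(1, nu // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total

def wilson_hilferty_cdf(x: float, nu: float) -> float:
    """Approximate P(Q <= x) by treating (Q/nu)^(1/3) as normally distributed."""
    mean = 1 - 2 / (9 * nu)
    sd = math.sqrt(2 / (9 * nu))    # the variance in the formula is 2 / (9 * nu)
    return NormalDist(mean, sd).cdf((x / nu) ** (1 / 3))

cases = [(6, 1.0), (6, 5.35), (6, 12.59), (24, 12.401), (24, 23.34), (24, 37.8)]
errors = []
for nu, x in cases:
    exact = chi2_cdf_even(x, nu)
    approx = wilson_hilferty_cdf(x, nu)
    errors.append(abs(exact - approx))
    print(f"nu={nu:2d} x={x:6.3f} exact={exact:.4f} approx={approx:.4f}")

max_error = max(errors)
```

At these test points the absolute error in the CDF stays small, including at $\nu = 6$, which is consistent with the accuracy claim for small $\nu$.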


Key Takeaways

Summary: Chi-Square Distribution

  • Sum of $\nu$ squared independent standard normals: $\chi^2_\nu$
  • Mean: $\nu$, Variance: $2\nu$, Skewness: $\sqrt{8/\nu}$
  • Always positive and right-skewed; approaches normal for large $\nu$
  • Used in tests of variance: $(n-1)s^2/\sigma^2 \sim \chi^2_{n-1}$
  • Foundation for chi-square tests of independence and goodness-of-fit
  • Reproductive property: sum of independent chi-squares is chi-square
