t-Distribution — When σ is Unknown
Foundations of Statistics
The Real-World Workhorse for Means
The t-distribution accounts for the extra uncertainty when estimating σ with s, making it the standard for real-world mean comparisons. Its heavier tails provide more conservative inference than the normal distribution.
- Quality Control — Comparing process means when population variance is unknown
- Clinical Research — Testing treatment effects with small sample sizes
- Business Analytics — A/B testing with limited data to make faster decisions
When σ is unknown, the t-distribution is your trusted companion.
Core Concepts
The t-distribution arises when we estimate the population standard deviation $\sigma$ with the sample standard deviation $s$. It has heavier tails than the normal, reflecting the additional uncertainty from estimating $\sigma$.
Definition: t-Distribution
Let $Z \sim N(0, 1)$ and $V \sim \chi^2_\nu$ be independent. Then $T = Z / \sqrt{V/\nu}$ follows a t-distribution with $\nu$ degrees of freedom, written $T \sim t_\nu$.
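PDF of t-Distribution
$$f(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$$
Here,
- $\nu$ = Degrees of freedom
- $\Gamma$ = Gamma function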
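To make the density concrete, here is a minimal sketch that evaluates the t density with $\nu$ degrees of freedom using only the Python standard library. The function names `t_pdf` and `normal_pdf` are illustrative choices, not part of the original material. The computation runs on the log scale, which is one reasonable way to keep the gamma functions from overflowing when $\nu$ is large.

```python
import math


def t_pdf(t: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom at t."""
    # Log of the constant Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2));
    # lgamma avoids overflow when nu is large.
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    # Kernel (1 + t^2/nu)^(-(nu+1)/2), also on the log scale.
    log_kernel = -(nu + 1) / 2 * math.log1p(t * t / nu)
    return math.exp(log_const + log_kernel)


def normal_pdf(z: float) -> float:
    """Standard normal density, used as the reference curve."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


# Compare the two densities at the center and three units out in the tail.
for nu in (1, 5, 30):
    print(f"nu={nu:>2}  f(0)={t_pdf(0.0, nu):.4f}  f(3)={t_pdf(3.0, nu):.5f}  "
          f"normal f(3)={normal_pdf(3.0):.5f}")
```

For $\nu = 1$ the density reduces to the Cauchy density, so `t_pdf(0.0, 1)` should equal $1/\pi$, which is a convenient sanity check.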
Heavy Tails
The t-distribution has heavier tails than the normal, meaning more probability in the extremes. This reflects the additional uncertainty from estimating $\sigma$. As $\nu \to \infty$, the t-distribution approaches $N(0, 1)$.
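The heavier tails can be quantified by comparing $P(|T| > 3)$ under $t_\nu$ with the same probability under the standard normal. The sketch below is one possible stdlib-only approach: it integrates the t density with composite Simpson's rule. The helper names and the choice of 2000 subintervals are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def t_pdf(t: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom at t."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(t * t / nu))


def t_two_sided_tail(c: float, nu: float, steps: int = 2000) -> float:
    """P(|T| > c) for T ~ t_nu and c >= 0, via composite Simpson's rule on [0, c]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = c / steps
    total = t_pdf(0.0, nu) + t_pdf(c, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    half_center = total * h / 3        # P(0 < T < c)
    return 1.0 - 2.0 * half_center     # by symmetry of the density


normal_tail = 2 * (1 - NormalDist().cdf(3.0))
for nu in (2, 5, 30, 1000):
    tail = t_two_sided_tail(3.0, nu)
    print(f"nu={nu:>4}  P(|T|>3)={tail:.5f}  ratio to normal={tail / normal_tail:.2f}")
```

The ratio to the normal tail should shrink toward 1 as $\nu$ grows, which is the heavy-tail effect fading out.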
Derivation: Why the t-Distribution Appears
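Theorem: Origin of the t-Statistic
If $X_1, \dots, X_n \overset{\text{iid}}{\sim} N(\mu, \sigma^2)$, then:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$
where $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$.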
Proof Sketch
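Step 1. Define $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ by properties of the normal.
Step 2. By Fisher's lemma, $\bar{X}$ and $S^2$ are independent for normal samples. Moreover, $V = \dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$.
Step 3. Therefore $T = \dfrac{Z}{\sqrt{V/(n-1)}} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$, which is the definition of $t_{n-1}$.
The independence of $\bar{X}$ and $S^2$ is specific to the normal distribution. It fails for other distributions, which is one reason the t-test can be sensitive to non-normality for small $n$.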
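A quick Monte Carlo check can make the derivation tangible. The sketch below draws repeated normal samples, then looks at two things: the correlation between the sample mean and sample variance (which should be near zero, a necessary but not sufficient sign of independence) and the variance of the simulated t-statistics (which should be near $\nu/(\nu-2)$ with $\nu = n - 1$). The values $\mu = 5$, $\sigma = 2$, $n = 10$, the seed, and the function names are arbitrary choices for illustration.

```python
import math
import random


def simulate(n: int, reps: int, mu: float = 5.0, sigma: float = 2.0, seed: int = 0):
    """Draw `reps` normal samples of size n; return (means, variances, t-statistics)."""
    rng = random.Random(seed)
    means, variances, t_stats = [], [], []
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # unbiased sample variance
        means.append(xbar)
        variances.append(s2)
        t_stats.append((xbar - mu) / math.sqrt(s2 / n))
    return means, variances, t_stats


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def sample_variance(v):
    """Unbiased sample variance of a list."""
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)


n = 10
nu = n - 1
means, variances, t_stats = simulate(n, reps=20000)
print(f"corr(mean, variance) = {correlation(means, variances):+.4f}")
print(f"var of t-statistics  = {sample_variance(t_stats):.4f}  (theory: {nu / (nu - 2):.4f})")
```

Swapping `rng.gauss` for a skewed generator such as `rng.expovariate` is a simple way to see the correlation move away from zero.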
Degrees of Freedom and Tail Behavior
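t-Statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad \nu = n - 1$$
Here,
- $\bar{x}$ = Sample mean
- $\mu_0$ = Hypothesized population mean
- $s$ = Sample standard deviation
- $n$ = Sample size
- $\nu$ = Degrees of freedom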
Why Degrees of Freedom Matter
With $\nu = n - 1$ degrees of freedom, the estimator $s^2$ uses $n - 1$ independent pieces of information (one is lost estimating $\mu$ with $\bar{x}$). Fewer degrees of freedom means more uncertainty about $\sigma$, hence heavier tails. The variance of $t_\nu$ is $\nu/(\nu - 2)$ for $\nu > 2$, which exceeds 1 (the normal variance) and decreases to 1 as $\nu \to \infty$.
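The variance formula follows in a few lines from the representation $T = Z/\sqrt{V/\nu}$ with $Z \sim N(0,1)$ and $V \sim \chi^2_\nu$ independent. The sketch below takes as given the standard fact that $E[1/V] = 1/(\nu - 2)$ for $\nu > 2$, which is not derived in this section.

```latex
\begin{align*}
\operatorname{Var}(T) &= E[T^2] - (E[T])^2 = E[T^2]
  && \text{($E[T] = 0$ by symmetry, for $\nu > 1$)} \\
&= E\!\left[\frac{\nu Z^2}{V}\right] = \nu \, E[Z^2] \, E\!\left[\frac{1}{V}\right]
  && \text{(independence of $Z$ and $V$)} \\
&= \nu \cdot 1 \cdot \frac{1}{\nu - 2} = \frac{\nu}{\nu - 2},
  && \nu > 2.
\end{align*}
```

For instance, $\nu = 5$ gives $5/3 \approx 1.67$, while $\nu = 30$ gives $30/28 \approx 1.07$, already close to the normal variance of 1.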
Critical Values
Common t-Critical Values
| $\nu$ | $t_{0.025,\nu}$ (95%) | $t_{0.005,\nu}$ (99%) | $z_{0.025}$ (normal) |
|---|---|---|---|
| 5 | 2.571 | 4.032 | 1.960 |
| 10 | 2.228 | 3.169 | 1.960 |
| 29 | 2.045 | 2.756 | 1.960 |
| 100 | 1.984 | 2.626 | 1.960 |
| $\infty$ | 1.960 | 2.576 | 1.960 |
As $\nu$ increases, t-critical values converge to z-critical values. The difference is substantial for small $\nu$.
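Two-sided critical values like those tabulated here can be recomputed without a statistics package. The sketch below is one possible approach, not the method used to produce the table: it integrates the t density with Simpson's rule to get the CDF, then bisects for the point where the CDF reaches $1 - \alpha/2$. The function names, the 400 subintervals, and the 50 bisection steps are illustrative assumptions.

```python
import math


def t_pdf(t: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom at t."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(t * t / nu))


def t_cdf(t: float, nu: float, steps: int = 400) -> float:
    """P(T <= t) for t >= 0: 0.5 plus Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def t_critical(nu: float, alpha: float = 0.05) -> float:
    """Two-sided critical value c such that P(|T| > c) = alpha."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < target:   # expand until the target is bracketed
        hi *= 2
    for _ in range(50):             # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for nu in (5, 10, 29, 100):
    print(f"nu={nu:>3}  95%: {t_critical(nu, 0.05):.3f}  99%: {t_critical(nu, 0.01):.3f}")
```

In practice a library routine (for example an inverse CDF from a statistics package) would be the usual choice; the hand-rolled version is only meant to show what the table entries are.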
Worked Example
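A biochemist measures enzyme reaction rates (in μmol/min) for $n$ samples, obtaining sample mean $\bar{x}$ and sample standard deviation $s$. Test $H_0: \mu = \mu_0$ vs $H_1: \mu \neq \mu_0$ at $\alpha = 0.05$.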
Step 1. Compute the t-statistic:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Step 2. With $\nu = n - 1$ degrees of freedom, the two-sided critical values are $\pm t_{0.025,\,n-1}$.
Step 3. Since $|t| < t_{0.025,\,n-1}$, we fail to reject $H_0$. The observed difference is not statistically significant at the 5% level.
Step 4. For comparison, if we had used the normal approximation, the same statistic would be compared with the critical value $z_{0.025} = 1.960$. We would still fail to reject, but the normal approximation underestimates the tail probability. The exact p-value from $t_{n-1}$ is 0.133, while the normal approximation gives 0.113.
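The four steps can be sketched as a small function that works from summary statistics and reports both the t-based and normal-based two-sided p-values. The input numbers below are hypothetical, chosen only to exercise the code; they are not the values from the biochemist example. The function names and the Simpson's-rule integration are likewise illustrative assumptions.

```python
import math
from statistics import NormalDist


def t_pdf(t: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom at t."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(t * t / nu))


def t_two_sided_p(t_stat: float, nu: float, steps: int = 2000) -> float:
    """Two-sided p-value P(|T| > |t_stat|), via Simpson's rule on [0, |t_stat|]."""
    c = abs(t_stat)
    h = c / steps  # steps must be even
    total = t_pdf(0.0, nu) + t_pdf(c, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 1.0 - 2.0 * total * h / 3


def one_sample_t(xbar: float, s: float, n: int, mu0: float):
    """Return (t statistic, degrees of freedom, t p-value, normal p-value)."""
    t_stat = (xbar - mu0) / (s / math.sqrt(n))            # Step 1
    nu = n - 1                                            # Step 2
    p_t = t_two_sided_p(t_stat, nu)                       # Step 3: exact under normality
    p_z = 2 * (1 - NormalDist().cdf(abs(t_stat)))         # Step 4: normal approximation
    return t_stat, nu, p_t, p_z


# Hypothetical summary statistics (NOT the values from the worked example).
t_stat, nu, p_t, p_z = one_sample_t(xbar=10.4, s=1.2, n=12, mu0=10.0)
print(f"t = {t_stat:.3f}, df = {nu}, p (t) = {p_t:.3f}, p (normal) = {p_z:.3f}")
print("reject H0 at 5%" if p_t < 0.05 else "fail to reject H0 at 5%")
```

Whatever inputs are used, the normal-based p-value comes out smaller than the t-based one for the same statistic, which is the pattern Step 4 describes.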
Small Sample Consequence
With a sample this small, the t-distribution is noticeably wider than the normal. Using the normal approximation would understate the p-value by about 15% in this case (0.113 versus 0.133). Always use the t-distribution when $\sigma$ is unknown and $n$ is small.
Convergence to Normal
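Theorem: Asymptotic Normality of t
As $\nu \to \infty$, $t_\nu \xrightarrow{d} N(0, 1)$. More precisely, by Slutsky's theorem:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \cdot \frac{\sigma}{S} \xrightarrow{d} N(0, 1)$$
since $S^2 \xrightarrow{p} \sigma^2$, and hence $\sigma/S \xrightarrow{p} 1$, by the law of large numbers.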
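The convergence can be illustrated numerically (this is an illustration, not a proof) by measuring the largest gap between the t density and the standard normal density as the degrees of freedom grow. The grid range of $[-6, 6]$, the 1201 grid points, and the function names are arbitrary choices for this sketch.

```python
import math


def t_pdf(t: float, nu: float) -> float:
    """Density of the t-distribution with nu degrees of freedom at t."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(t * t / nu))


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def max_density_gap(nu: float, lo: float = -6.0, hi: float = 6.0, points: int = 1201) -> float:
    """Largest |t_pdf - normal_pdf| over an evenly spaced grid on [lo, hi]."""
    step = (hi - lo) / (points - 1)
    return max(abs(t_pdf(lo + i * step, nu) - normal_pdf(lo + i * step))
               for i in range(points))


gaps = {nu: max_density_gap(nu) for nu in (1, 5, 30, 100, 1000)}
for nu, gap in gaps.items():
    print(f"nu={nu:>4}  max density gap = {gap:.5f}")
```

The gap should fall steadily with $\nu$, consistent with the rule of thumb that the two distributions are hard to tell apart once $\nu$ reaches a few dozen.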
Key Takeaways
Summary: t-Distribution
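- Used when $\sigma$ is unknown and estimated by $s$
- $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$ for normal populations
- Heavier tails than normal (more uncertainty); approaches normal as $\nu \to \infty$
- Degrees of freedom: $\nu = n - 1$ (one lost estimating $\mu$)
- The variance of $t_\nu$ is $\nu/(\nu - 2)$ for $\nu > 2$, always $> 1$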
- Foundation for t-tests and t-intervals for the mean
- Derived from the independence of $\bar{X}$ and $S^2$ under normality