Sampling Distribution of the Mean — Foundation of Inference
Foundations of Statistics
Why Sample Means Behave Predictably
The sampling distribution of the mean explains why means computed from repeated samples cluster around the population mean, and it underpins confidence intervals and hypothesis tests about the mean. Understanding this concept is the key to mastering statistical inference.
- Clinical Trials — Determining whether drug effects are real or due to sampling variation
- Polling — Estimating margin of error for survey results
- Manufacturing — Quality control through process monitoring and control charts
The sampling distribution transforms individual randomness into collective predictability.
Core Concepts
The sampling distribution of the mean describes how the sample mean $\bar{X}$ varies across all possible samples of size $n$.
Definition: Sampling Distribution of the Mean
The sampling distribution of $\bar{X}$ is the probability distribution of the statistic $\bar{X}$ computed over all possible samples of size $n$ from a population. It is the theoretical basis for confidence intervals and hypothesis tests.
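Mean and Standard Error
$$E[\bar{X}] = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Here,
- $\mu$ = Population mean
- $\sigma$ = Population standard deviation
- $n$ = Sample size
- $\sigma_{\bar{X}}$ = Standard error of the mean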
Key Insight
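The standard error decreases as $n$ increases: larger samples give more precise estimates of the mean. Because it scales as $1/\sqrt{n}$, halving the standard error requires four times as many observations.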
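The relationship between sample size and standard error can be checked by simulation. The sketch below is illustrative only: the function name `empirical_se`, the standard normal population, the seed, and the number of repetitions are all assumptions chosen for the demonstration.

```python
import random
import statistics


def empirical_se(n, reps=2000, mu=0.0, sigma=1.0, seed=0):
    """Standard deviation of `reps` simulated sample means,
    each computed from n draws of a Normal(mu, sigma) population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)


# Compare the simulated spread of the sample mean with sigma / sqrt(n).
for n in (4, 16, 64):
    theory = 1.0 / n ** 0.5
    print(f"n={n:3d}  simulated SE={empirical_se(n):.3f}  theory={theory:.3f}")
```

Each fourfold increase in `n` should roughly halve the simulated standard error, in line with the formula.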
Central Limit Theorem
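Theorem: Central Limit Theorem (CLT)
For any population with mean $\mu$ and finite variance $\sigma^2$, let $X_1, X_2, \dots, X_n$ be i.i.d. random variables. Then as $n \to \infty$:
$$\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1)$$
Equivalently, for large $n$, $\bar{X}_n \approx N(\mu, \sigma^2/n)$ regardless of the population distribution shape.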
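To see the theorem at work on a clearly non-normal population, the following sketch standardizes sample means drawn from an exponential population with rate 1 (so the mean and standard deviation are both 1) and checks how often they fall within $\pm 1.96$. The choice of population, the function name `coverage_of_standardized_means`, and the simulation settings are assumptions made for illustration.

```python
import math
import random
import statistics


def coverage_of_standardized_means(n, reps=5000, z=1.96, seed=1):
    """Fraction of standardized sample means (from an Exp(1) population)
    that fall inside (-z, z). Under the CLT this approaches about 0.95."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exp(1)
    inside = 0
    for _ in range(reps):
        xbar = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        standardized = (xbar - mu) / (sigma / math.sqrt(n))
        if abs(standardized) < z:
            inside += 1
    return inside / reps


for n in (5, 50):
    print(n, coverage_of_standardized_means(n))
```

Despite the strong right skew of the exponential distribution, the fraction for moderate `n` should land close to the 0.95 that a standard normal would give.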
Proof Sketch (Lindeberg–Lévy CLT)
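Step 1. Assume $\mu = 0$ and $\sigma = 1$ (otherwise standardize each $X_i$). Define $Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i = \sqrt{n}\,\bar{X}_n$, so $E[Z_n] = 0$ and $\mathrm{Var}(Z_n) = 1$.
Step 2. The moment generating function of $Z_n$ is $M_{Z_n}(t) = \left[ M_X(t/\sqrt{n}) \right]^n$, assuming $M_X$ exists in a neighborhood of 0.
Step 3. Taylor-expand: $M_X(t/\sqrt{n}) = 1 + \frac{t^2}{2n} + o(1/n)$ as $n \to \infty$. Thus $M_{Z_n}(t) \to e^{t^2/2}$.
Step 4. Since $e^{t^2/2}$ is the MGF of $N(0, 1)$, by Lévy's continuity theorem, $Z_n \xrightarrow{d} N(0, 1)$. (When the MGF does not exist, the same argument goes through with characteristic functions, which is the setting in which Lévy's continuity theorem is stated.)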
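The limit in the proof sketch can be checked numerically for one concrete case. Assume, purely for illustration, that each $X_i$ equals $+1$ or $-1$ with probability $1/2$, so that $\mu = 0$, $\sigma = 1$, and $M_X(t) = \cosh(t)$. With $Z_n = \frac{1}{\sqrt{n}} \sum X_i$, the MGF of $Z_n$ is $\cosh(t/\sqrt{n})^n$, which should approach $e^{t^2/2}$. The function name below is hypothetical.

```python
import math


def mgf_of_scaled_sum(t, n):
    """MGF of Z_n = (X_1 + ... + X_n) / sqrt(n) when each X_i is +1 or -1
    with probability 1/2, i.e. cosh(t / sqrt(n)) ** n."""
    return math.cosh(t / math.sqrt(n)) ** n


t = 1.0
for n in (1, 10, 100, 10_000):
    print(n, mgf_of_scaled_sum(t, n))
print("limit exp(t^2 / 2):", math.exp(t * t / 2))
```

The printed values increase toward the limit as `n` grows, matching the Taylor-expansion argument.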
Rule of Thumb
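The CLT approximation is generally valid when $n \ge 30$. For highly skewed or heavy-tailed populations, larger $n$ may be needed. The Berry–Esseen theorem quantifies the rate: $\sup_x \left| F_n(x) - \Phi(x) \right| \le \frac{C \rho}{\sigma^3 \sqrt{n}}$, where $\rho = E|X - \mu|^3$ is assumed finite, $F_n$ is the CDF of the standardized sample mean, $\Phi$ is the standard normal CDF, and $C$ is an absolute constant.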
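The $1/\sqrt{n}$ rate can be observed exactly, without simulation, when the population is Bernoulli, because the sum of the observations is then binomial. The sketch below computes the largest gap between the CDF of the standardized sample mean and the standard normal CDF. The Bernoulli population, the success probability 0.3, and the function name are assumptions for illustration; the code is kept to moderate `n` because the binomial coefficients are converted to floats.

```python
import math
from statistics import NormalDist


def sup_distance_bernoulli(n, p):
    """Exact sup over x of |F_n(x) - Phi(x)|, where F_n is the CDF of the
    standardized mean of n Bernoulli(p) draws. F_n is a step function, so
    the supremum is attained at a jump (just before or at the jump point)."""
    phi = NormalDist().cdf
    sd = math.sqrt(n * p * (1 - p))
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        z = (k - n * p) / sd          # standardized location of the jump
        normal = phi(z)
        worst = max(worst, abs(cdf - normal))   # left limit at the jump
        cdf += math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        worst = max(worst, abs(cdf - normal))   # value at the jump
    return worst


for n in (10, 40, 160):
    d = sup_distance_bernoulli(n, 0.3)
    print(n, d, "sqrt(n) * distance =", math.sqrt(n) * d)
```

If the distance shrinks at the Berry–Esseen rate, the product of $\sqrt{n}$ and the distance should stay roughly stable as `n` grows.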
Formal Properties of $\bar{X}$
Theorem: Unbiasedness and Minimum Variance
The sample mean $\bar{X}$ is an unbiased estimator of $\mu$: $E[\bar{X}] = \mu$. Moreover, among all linear unbiased estimators, $\bar{X}$ has the minimum variance (Gauss–Markov theorem for the i.i.d. case).
Proof Sketch
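Unbiasedness: $E[\bar{X}] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \frac{1}{n} \cdot n\mu = \mu$.
Variance: $\mathrm{Var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n}$ by independence. Any other linear combination $\sum_{i} a_i X_i$ with $\sum_{i} a_i = 1$ has variance $\sigma^2 \sum_{i} a_i^2 \ge \frac{\sigma^2}{n}$ by Cauchy–Schwarz, with equality iff all $a_i = 1/n$.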
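The minimum-variance claim is easy to verify for a small case. For i.i.d. observations with variance $\sigma^2$, a weighted mean $\sum a_i X_i$ with weights summing to 1 has variance $\sigma^2 \sum a_i^2$. The sketch below compares equal weights with one arbitrary unequal choice; the function name and the specific weights are illustrative assumptions.

```python
def variance_of_weighted_mean(weights, sigma2=1.0):
    """Variance of sum(a_i * X_i) for i.i.d. X_i with variance sigma2.
    The weights must sum to 1 so that the estimator is unbiased."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1 for an unbiased estimator")
    return sigma2 * sum(a * a for a in weights)


n = 4
equal = [1.0 / n] * n              # the sample mean
unequal = [0.4, 0.3, 0.2, 0.1]     # another unbiased linear estimator

print("equal weights:  ", variance_of_weighted_mean(equal))
print("unequal weights:", variance_of_weighted_mean(unequal))
```

With four observations the equal-weight variance is $\sigma^2/4$, and any departure from equal weights, such as the one shown, gives a larger value.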
Worked Example
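Suppose the heights of adult males in a city are normally distributed with mean $\mu$ cm and standard deviation $\sigma$ cm. A researcher samples $n = 64$ men.
Step 1. The sampling distribution of $\bar{X}$ is exactly:
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{64}\right)$$
Step 2. The standard error is $\sigma_{\bar{X}} = \sigma/\sqrt{64} = \sigma/8$ cm.
Step 3. Probability the sample mean exceeds 177 cm:
$$P(\bar{X} > 177) = P\!\left(Z > \frac{177 - \mu}{\sigma/8}\right) = 1 - \Phi\!\left(\frac{8\,(177 - \mu)}{\sigma}\right)$$
Step 4. Even though individual heights have standard deviation $\sigma$, the sample mean of 64 observations has standard error $\sigma/8$. The sampling distribution is 8 times narrower than the population distribution, a direct consequence of averaging.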
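The calculation in Step 3 can be sketched as a small function. The population values used in the call below (mean 175 cm, standard deviation 8 cm) are hypothetical numbers chosen only to make the example run; they are not taken from the worked example. The function name `prob_mean_exceeds` is likewise an assumption.

```python
import math
from statistics import NormalDist


def prob_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean > threshold) when the population is Normal(mu, sigma)
    and the sample has n independent observations."""
    se = sigma / math.sqrt(n)              # standard error of the mean
    return 1.0 - NormalDist(mu, se).cdf(threshold)


# Hypothetical inputs: mu = 175 cm and sigma = 8 cm are illustrative only.
p = prob_mean_exceeds(177.0, mu=175.0, sigma=8.0, n=64)
print(f"P(sample mean > 177) = {p:.4f}")
```

With these assumed inputs the standard error is 1 cm, so the threshold sits two standard errors above the mean and the probability is the upper tail of a standard normal beyond 2.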
Key Takeaways
Summary: Sampling Distribution of the Mean
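- Describes variability of $\bar{X}$ across all samples of size $n$
- Mean: $E[\bar{X}] = \mu$, Standard Error: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
- CLT: $\bar{X} \approx N(\mu, \sigma^2/n)$ for large $n$ (typically $n \ge 30$)
- Standard error decreases with $n$ at rate $1/\sqrt{n}$; larger samples are more precise
- $\bar{X}$ is the UMVUE (uniformly minimum variance unbiased estimator) of $\mu$ under normality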
- Foundation for all confidence intervals and hypothesis tests about the mean