Normal Distribution — The Bell Curve and Its Properties
Foundations of Statistics
The Universal Language of Randomness
The normal distribution is the cornerstone of statistical theory and practice, appearing everywhere from natural phenomena to financial markets. Its mathematical elegance makes complex probability calculations tractable and enables powerful inferential techniques.
- Quality Control — Manufacturing processes use normal distributions to set tolerance limits and detect defects
- Finance — Asset returns and risk models rely on normal distribution assumptions for portfolio optimization
- Social Sciences — Test scores, heights, and measurement errors follow approximately normal distributions
Understanding the bell curve unlocks the door to nearly all of classical statistics.
Why the Normal Distribution is Central
The normal (Gaussian) distribution is the most important probability distribution in all of statistics and the natural sciences. Three fundamental reasons account for its centrality:
- The Central Limit Theorem guarantees that suitably standardized sums and averages of many independent, identically distributed random variables with finite variance converge in distribution to a normal distribution, whatever the shape of the underlying distribution.
- Maximum entropy: among all distributions with a fixed mean and variance, the normal has the largest entropy, which makes it the "least informative" assumption.
- Mathematical tractability — closed-form expressions exist for its moments, moment-generating function, and convolutions.
Definition and Probability Density Function
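Definition: Normal Distribution
A continuous random variable $X$ is said to have a normal distribution with mean $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$, if its probability density function is:
Parameters of the Normal Distribution
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$
Here,
- $\mu$ = Location parameter: controls the center of the distribution
- $\sigma$ = Scale parameter: controls the spread ($\sigma > 0$)
- $\sigma^2$ = Variance: the second central moment
- $e$ = Base of the natural exponential function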
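The density formula can be evaluated directly. The sketch below is a minimal illustration using only Python's standard library; the function name `normal_pdf` and the chosen parameter values are assumptions made for this example, not part of the original text.

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x, computed directly from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Peak height of the standard normal is 1 / sqrt(2 * pi)
peak_standard = normal_pdf(0.0)
# Doubling sigma halves the peak height
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)

print(f"f(0) for N(0, 1): {peak_standard:.4f}")
print(f"f(0) for N(0, 4): {peak_wide:.4f}")
```

The same values should be obtainable from `scipy.stats.norm.pdf`, which the Python section of this page uses.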
Fundamental Properties
Theorem: Properties of the Normal Distribution
- Symmetry: $f(\mu + x) = f(\mu - x)$ for all $x$. The distribution is symmetric about $\mu$.
- Unimodal: The single mode occurs at $x = \mu$.
- Inflection points: The density changes curvature at $x = \mu \pm \sigma$.
- Total probability: $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
- Mean = Median = Mode: All three measures of central tendency coincide at $\mu$.
- The $k\sigma$ rule: $P(|X - \mu| \le k\sigma)$ depends only on $k$.
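Several of these properties can be checked numerically. The following is a rough sketch using `statistics.NormalDist` from the standard library; the parameter values, the trapezoidal grid, and the finite-difference step are arbitrary choices for illustration.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0  # illustrative values
dist = NormalDist(mu, sigma)

# Symmetry: f(mu + d) equals f(mu - d)
d = 1.7
symmetric = abs(dist.pdf(mu + d) - dist.pdf(mu - d)) < 1e-12

# Total probability: trapezoidal rule over mu +/- 10 sigma
n = 20_000
a, b = mu - 10 * sigma, mu + 10 * sigma
h = (b - a) / n
area = h * (0.5 * dist.pdf(a)
            + sum(dist.pdf(a + i * h) for i in range(1, n))
            + 0.5 * dist.pdf(b))

# Inflection: the second derivative (central difference) changes sign at mu + sigma
def second_diff(x: float, eps: float = 1e-3) -> float:
    return (dist.pdf(x + eps) - 2 * dist.pdf(x) + dist.pdf(x - eps)) / eps**2

inside = second_diff(mu + 0.9 * sigma)   # concave region: negative
outside = second_diff(mu + 1.1 * sigma)  # convex region: positive

# The k-sigma probability does not depend on mu or sigma
k = 2
p_here = dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)
p_standard = NormalDist().cdf(k) - NormalDist().cdf(-k)

print(symmetric, round(area, 6), inside < 0 < outside, abs(p_here - p_standard) < 1e-12)
```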
The Normalizing Constant
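The factor $\frac{1}{\sigma\sqrt{2\pi}}$ ensures the total area under the curve equals 1. It arises from the Gaussian integral:
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$$
This identity is fundamental to probability theory and connects to the Gamma function: $\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}$.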
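As a quick numerical sanity check of the Gaussian integral and its link to the Gamma function, the sketch below applies a plain trapezoidal rule. The integration range and grid size are assumptions chosen for illustration.

```python
import math

# Trapezoidal approximation of the integral of exp(-x^2) over [-8, 8];
# the tails beyond |x| = 8 contribute less than 1e-27.
n = 16_000
a, b = -8.0, 8.0
h = (b - a) / n
integral = h * (0.5 * math.exp(-a * a)
                + sum(math.exp(-(a + i * h) ** 2) for i in range(1, n))
                + 0.5 * math.exp(-b * b))

print(f"numerical integral : {integral:.10f}")
print(f"sqrt(pi)           : {math.sqrt(math.pi):.10f}")
print(f"Gamma(1/2)         : {math.gamma(0.5):.10f}")
```

Substituting $x = (t - \mu)/(\sigma\sqrt{2})$ into the integral turns $\sqrt{\pi}$ into $\sigma\sqrt{2\pi}$, which is the reciprocal of the normalizing constant.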
The Standard Normal Distribution
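Definition: Standard Normal Distribution
The standard normal is the special case $\mu = 0$, $\sigma = 1$, written $Z \sim N(0, 1)$, with PDF:
$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$
Any normal random variable $X \sim N(\mu, \sigma^2)$ can be standardized via the transformation $Z = (X - \mu)/\sigma$.
Standardization (Z-score transformation)
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$
Here,
- $Z$ = Standard normal random variable
- $X$ = Original normal random variable with $X \sim N(\mu, \sigma^2)$
- $\mu$ = Mean of $X$
- $\sigma$ = Standard deviation of $X$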
Why Standardization Matters
Standardization converts any normal distribution to the standard normal, enabling the use of a single z-table (cumulative probability table) for all probability calculations. This is the foundation of all normal-based inference.
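A small worked example may help. The numbers below (exam scores with mean 70 and standard deviation 10, and a score of 85) are hypothetical and chosen only to illustrate the z-score transformation; `statistics.NormalDist` stands in for a printed z-table.

```python
from statistics import NormalDist

mu, sigma = 70.0, 10.0   # hypothetical exam-score distribution
x = 85.0                 # hypothetical observed score

z = (x - mu) / sigma                     # z-score
p_via_z = NormalDist().cdf(z)            # lookup in the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)  # same probability without standardizing

print(f"z = {z:.2f}")
print(f"P(X <= {x}) via z-score: {p_via_z:.4f}")
print(f"P(X <= {x}) directly   : {p_direct:.4f}")
```

Both routes give the same probability, which is the point of standardization: one table for $\Phi$ serves every choice of $\mu$ and $\sigma$.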
Interactive Visualization
[Interactive figure: PDF of the normal distribution with adjustable mean μ and standard deviation σ. The shaded region marks the central 95% probability area (μ ± 1.96σ); vertical lines mark the mean, median, and mode, which coincide for the normal distribution.]
Cumulative Distribution Function
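The CDF of the standard normal has no closed-form expression in terms of elementary functions; it is defined by an integral:
Standard Normal CDF
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$$
Here,
- $\Phi(z)$ = CDF of the standard normal
- $z$ = z-score
Key values from the standard normal table:
| $z$ | $\Phi(z)$ | Interpretation |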
|---|---|---|
| 0 | 0.5000 | 50% of area is below the mean |
| 1 | 0.8413 | 84.13% below |
| 1.645 | 0.9500 | 95% below (one-sided) |
| 1.960 | 0.9750 | 97.50% below |
| 2 | 0.9772 | 97.72% below |
| 2.576 | 0.9950 | 99.50% below |
| 3 | 0.9987 | 99.87% below |
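Although $\Phi$ has no elementary closed form, it can be written with the error function as $\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}\!\left(z/\sqrt{2}\right)\right]$. The sketch below uses `math.erf` to recompute the table entries; the helper name `phi` is an assumption for this example.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Recompute the tabulated values, rounded to four decimals
table = {z: round(phi(z), 4) for z in (0, 1, 1.645, 1.960, 2, 2.576, 3)}

for z, p in table.items():
    print(f"{z:>5}  {p:.4f}")
```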
The Empirical Rule (68-95-99.7)
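Theorem: Empirical Rule
For $X \sim N(\mu, \sigma^2)$:
$$\begin{aligned}
P(\mu - \sigma \le X \le \mu + \sigma) &\approx 0.6827 \\
P(\mu - 2\sigma \le X \le \mu + 2\sigma) &\approx 0.9545 \\
P(\mu - 3\sigma \le X \le \mu + 3\sigma) &\approx 0.9973
\end{aligned}$$
This is the foundation of the $3\sigma$ rule: for normally distributed data, about 99.7% of observations lie within 3 standard deviations of the mean. Observations beyond this range are potential outliers.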
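The rule can also be seen empirically in simulated data. The sketch below draws samples with `random.gauss` from the standard library; the seed, sample size, and parameter values are arbitrary assumptions, and the observed fractions will only approximate 68%, 95%, and 99.7%.

```python
import random

rng = random.Random(0)          # fixed seed so the run is repeatable
mu, sigma = 100.0, 15.0         # illustrative parameters
n = 100_000
samples = [rng.gauss(mu, sigma) for _ in range(n)]

# Fraction of samples within k standard deviations of the mean
fractions = {}
for k in (1, 2, 3):
    inside = sum(1 for x in samples if abs(x - mu) <= k * sigma)
    fractions[k] = inside / n
    print(f"within {k} sd: {fractions[k]:.4f}")
```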
Comparing Normal Distributions
Understanding the Spread Parameter
As σ increases, the distribution becomes wider and shorter (more spread out). The area under each curve is still 1, but the probability is distributed over a larger range. The peak height is $1/(\sigma\sqrt{2\pi})$, so doubling σ halves the peak; this is why σ controls the "width" of the bell curve.
Moment-Generating Function
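Moment-Generating Function of Normal Distribution
$$M_X(t) = E\!\left[e^{tX}\right] = \exp\!\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$$
Here,
- $M_X(t)$ = Moment-generating function
- $\mu$ = Mean
- $\sigma^2$ = Variance
- $t$ = Real argument (for the normal distribution the MGF exists for every real $t$)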
Why the MGF is Powerful
The MGF uniquely determines the distribution. If two random variables have the same MGF (in a neighborhood of 0), they have the same distribution. The moments are recovered via $E[X^n] = M_X^{(n)}(0)$, the $n$-th derivative evaluated at $t = 0$.
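As a short worked derivation, the first two moments of $X \sim N(\mu, \sigma^2)$ can be read off by differentiating the normal MGF $M_X(t) = \exp\!\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$:

```latex
\begin{aligned}
M_X'(t)  &= (\mu + \sigma^2 t)\, M_X(t)
          &&\Rightarrow\quad E[X] = M_X'(0) = \mu, \\
M_X''(t) &= \left[\sigma^2 + (\mu + \sigma^2 t)^2\right] M_X(t)
          &&\Rightarrow\quad E[X^2] = M_X''(0) = \sigma^2 + \mu^2, \\
\operatorname{Var}(X) &= E[X^2] - \left(E[X]\right)^2 = \sigma^2 .
\end{aligned}
```

This confirms that the two parameters of the density are the mean and the variance.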
Reproductive Property
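Theorem: Linear Combinations of Normals
If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent, then for constants $a$ and $b$:
$$aX_1 + bX_2 \sim N\!\left(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2\right)$$
More generally, if $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \dots, n$, are independent, then:
$$\sum_{i=1}^{n} a_i X_i \sim N\!\left(\sum_{i=1}^{n} a_i \mu_i,\; \sum_{i=1}^{n} a_i^2 \sigma_i^2\right)$$
This property is one reason the normal distribution is so pervasive: linear combinations of independent normal random variables are again normal, so the family is closed under such combinations.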
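The mean and variance of a linear combination can be checked against a simulation. In the sketch below the distributions, coefficients, seed, and sample size are illustrative assumptions; `statistics.NormalDist` supports `+` and scalar `*`, and its addition treats the two operands as independent.

```python
import random
from statistics import NormalDist, fmean

x1 = NormalDist(mu=1.0, sigma=2.0)     # illustrative X1 ~ N(1, 4)
x2 = NormalDist(mu=-3.0, sigma=1.5)    # illustrative X2 ~ N(-3, 2.25)
a, b = 2.0, -1.0

# Theoretical distribution of a*X1 + b*X2 for independent X1, X2
combo = x1 * a + x2 * b

# Monte Carlo check with independent draws
rng = random.Random(1)
draws = [a * rng.gauss(x1.mean, x1.stdev) + b * rng.gauss(x2.mean, x2.stdev)
         for _ in range(100_000)]
sim_mean = fmean(draws)
sim_var = fmean((d - sim_mean) ** 2 for d in draws)

print(f"theory    : mean={combo.mean:.3f}, variance={combo.variance:.3f}")
print(f"simulation: mean={sim_mean:.3f}, variance={sim_var:.3f}")
```

Here the theoretical mean is $2(1) + (-1)(-3) = 5$ and the variance is $4(4) + 1(2.25) = 18.25$; the simulated values should land close to these but will not match exactly.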
Normal Approximation to the Binomial
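Normal Approximation to Binomial
$$X \sim \text{Binomial}(n, p) \quad\Longrightarrow\quad X \overset{\text{approx.}}{\sim} N\!\left(np,\; np(1-p)\right)$$
Here,
- $n$ = Number of trials
- $p$ = Probability of success
- $np$ = Mean of the binomial
- $np(1-p)$ = Variance of the binomial
The approximation improves as $n$ increases. A standard rule of thumb: apply it when $np$ and $n(1-p)$ are both at least 5 (some texts require 10). A continuity correction ($\pm 0.5$), for example approximating $P(X \le k)$ by $P(Y \le k + 0.5)$ with $Y$ normal, improves accuracy for finite $n$.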
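To see how the continuity correction helps, the sketch below compares an exact binomial probability with two normal approximations. The values $n = 50$, $p = 0.3$, and $k = 18$ are hypothetical, chosen so that $np = 15$ and $n(1-p) = 35$ satisfy the rule of thumb.

```python
import math
from statistics import NormalDist

n, p = 50, 0.3          # illustrative binomial parameters
k = 18                  # target: P(X <= k)

# Exact binomial probability by summing the PMF
exact = sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Normal approximation with matching mean and variance
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx_plain = approx_dist.cdf(k)        # no continuity correction
approx_cc = approx_dist.cdf(k + 0.5)     # with continuity correction

print(f"exact binomial        : {exact:.4f}")
print(f"normal, no correction : {approx_plain:.4f}")
print(f"normal, with +0.5     : {approx_cc:.4f}")
```

In this setting the corrected value is noticeably closer to the exact probability than the uncorrected one.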
Python Implementation
Example: Working with Normal Distribution
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
# Create normal distribution
mu, sigma = 0, 1
normal_dist = stats.norm(loc=mu, scale=sigma)
# Calculate statistics
mean = normal_dist.mean()
var = normal_dist.var()
std = normal_dist.std()
print(f"Mean: {mean:.4f}")
print(f"Variance: {var:.4f}")
print(f"Standard Deviation: {std:.4f}")
# Generate random samples (random_state makes the draw reproducible)
samples = normal_dist.rvs(size=10000, random_state=42)
# Plot histogram vs theoretical PDF
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=50, density=True, alpha=0.7, label='Samples')
x = np.linspace(-4, 4, 1000)
plt.plot(x, normal_dist.pdf(x), 'r-', lw=2, label='Theoretical PDF')
plt.title(f'Normal Distribution (μ={mu}, σ={sigma})')
plt.xlabel('x')
plt.ylabel('Density')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
# Calculate probabilities
print(f"\nP(-1 ≤ X ≤ 1) = {normal_dist.cdf(1) - normal_dist.cdf(-1):.4f}")
print(f"P(-2 ≤ X ≤ 2) = {normal_dist.cdf(2) - normal_dist.cdf(-2):.4f}")
print(f"P(-3 ≤ X ≤ 3) = {normal_dist.cdf(3) - normal_dist.cdf(-3):.4f}")
# Percentiles
print(f"\n95th percentile: {normal_dist.ppf(0.95):.4f}")
print(f"99th percentile: {normal_dist.ppf(0.99):.4f}")
Key Takeaways
Summary: Normal Distribution
- Symmetric, bell-shaped density centered at $\mu$ with spread $\sigma$
- Standardization: $Z = (X - \mu)/\sigma$ converts any normal to the standard normal
- Empirical rule: approximately 68%, 95%, 99.7% within 1, 2, 3 standard deviations
- Reproductive property: linear combinations of independent normals are normal
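- Central Limit Theorem: standardized sums/means of many i.i.d. random variables with finite variance converge to normal
- MGF uniquely determines the distribution: $M_X(t) = \exp\!\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$
- Foundation for inference: z-tests, t-tests, ANOVA, and classical regression inference rely on normality assumptions, either exactly or through large-sample approximations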