Normal Distribution — The Bell Curve and Its Properties

The Universal Language of Randomness

The normal distribution is the cornerstone of statistical theory and practice, appearing everywhere from natural phenomena to financial markets. Its mathematical elegance makes complex probability calculations tractable and enables powerful inferential techniques.

  • Quality Control — Manufacturing processes use normal distributions to set tolerance limits and detect defects
  • Finance — Asset returns and risk models rely on normal distribution assumptions for portfolio optimization
  • Social Sciences — Test scores, heights, and measurement errors follow approximately normal distributions

Understanding the bell curve unlocks the door to nearly all of classical statistics.


Why the Normal Distribution is Central

The normal (Gaussian) distribution is the most important probability distribution in all of statistics and the natural sciences. Three fundamental reasons account for its centrality:

  1. The Central Limit Theorem guarantees that suitably standardized sums and averages of many independent, identically distributed random variables with finite variance converge in distribution to a normal distribution, whatever the shape of the underlying distribution.
  2. Maximum entropy among all distributions with fixed mean and variance — it is the "least informative" assumption.
  3. Mathematical tractability — closed-form expressions exist for its moments, moment-generating function, and convolutions.
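
The first reason can be checked empirically. The following is a minimal sketch, assuming Uniform(0, 1) draws as the underlying distribution and a sample size of 30 (both arbitrary choices made for illustration): it standardizes many sample means and compares their behavior with the standard normal.

```python
import random
import statistics

def standardized_mean(n, rng):
    """Standardized mean of n Uniform(0, 1) draws."""
    mu, sigma = 0.5, (1 / 12) ** 0.5  # mean and std of Uniform(0, 1)
    xbar = sum(rng.random() for _ in range(n)) / n
    return (xbar - mu) / (sigma / n ** 0.5)

rng = random.Random(0)  # fixed seed so the run is reproducible
z_values = [standardized_mean(30, rng) for _ in range(20000)]

clt_mean = statistics.fmean(z_values)
clt_std = statistics.pstdev(z_values)
share_within_1 = sum(abs(z) <= 1 for z in z_values) / len(z_values)

print(f"mean = {clt_mean:.3f}, std = {clt_std:.3f}")
print(f"share within 1 std: {share_within_1:.3f} (standard normal: 0.6827)")
```

The standardized means should have mean near 0, standard deviation near 1, and roughly 68% of their mass within one unit of zero, even though the individual draws are uniform rather than normal.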

Definition and Probability Density Function

Definition: Normal Distribution

A continuous random variable $X$ is said to have a normal distribution with mean $\mu$ and variance $\sigma^2$, written $X \sim \mathcal{N}(\mu, \sigma^2)$, if its probability density function is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \quad x \in \mathbb{R}$$

Parameters of the Normal Distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Here,

  • $\mu$ = Location parameter: controls the center of the distribution
  • $\sigma$ = Scale parameter: controls the spread ($\sigma > 0$)
  • $\sigma^2$ = Variance: the second central moment
  • $\exp(\cdot)$ = Natural exponential function
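
The density formula translates directly into code. The sketch below is one possible implementation using only the standard library; the function name `normal_pdf` is chosen here for illustration, and the result is cross-checked against `statistics.NormalDist.pdf`.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x, written out from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

peak = normal_pdf(0.0)  # height of the standard normal at its center
print(f"f(0) for N(0, 1): {peak:.4f}")  # prints 0.3989, i.e. 1/sqrt(2*pi)

# Cross-check against the standard library for arbitrary parameters
reference = NormalDist(mu=2.0, sigma=3.0).pdf(1.0)
pdf_gap = abs(normal_pdf(1.0, mu=2.0, sigma=3.0) - reference)
print(pdf_gap < 1e-12)
```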

Fundamental Properties

Theorem: Properties of the Normal Distribution

  1. Symmetry: $f(\mu + x) = f(\mu - x)$ for all $x$. The distribution is symmetric about $\mu$.
  2. Unimodal: The single mode occurs at $x = \mu$.
  3. Inflection points: The density changes curvature at $x = \mu \pm \sigma$.
  4. Total probability: $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
  5. Mean = Median = Mode: All three measures of central tendency coincide at $\mu$.
  6. The $\pm k\sigma$ rule: $P(\mu - k\sigma \leq X \leq \mu + k\sigma)$ depends only on $k$.
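
Several of these properties can be spot-checked numerically. The sketch below assumes an arbitrary example distribution N(50, 8²) and uses a second difference as a stand-in for the sign of the second derivative; it is a sanity check, not a proof.

```python
from statistics import NormalDist

def central_prob(dist, k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a NormalDist."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)

def second_difference(dist, x, h=0.01):
    """Discrete stand-in for the sign of f''(x)."""
    return dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)

standard = NormalDist(mu=0.0, sigma=1.0)
shifted = NormalDist(mu=50.0, sigma=8.0)  # arbitrary illustrative parameters

# Property 1: symmetry about the mean
symmetric = abs(shifted.pdf(50.0 + 3.0) - shifted.pdf(50.0 - 3.0)) < 1e-12

# Property 2: the density at the mean exceeds the density at nearby points
peak_at_mean = shifted.pdf(50.0) > max(shifted.pdf(49.5), shifted.pdf(50.5))

# Property 3: concave just inside mu + sigma, convex just outside
concave_inside = second_difference(shifted, 50.0 + 0.9 * 8.0) < 0
convex_outside = second_difference(shifted, 50.0 + 1.1 * 8.0) > 0

# Property 6: the k-sigma probability is the same for both distributions
gap = abs(central_prob(standard, 1.96) - central_prob(shifted, 1.96))

print(symmetric, peak_at_mean, concave_inside, convex_outside, gap < 1e-12)
```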

The Normalizing Constant

The factor $\frac{1}{\sigma\sqrt{2\pi}}$ ensures the total area under the curve equals 1. It arises from the Gaussian integral:

$$\int_{-\infty}^{\infty} e^{-t^2/2}\,dt = \sqrt{2\pi}$$

This identity is fundamental to probability theory and connects to the Gamma function: $\Gamma(1/2) = \sqrt{\pi}$.
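
Both identities are easy to confirm numerically. The following sketch approximates the Gaussian integral with a trapezoidal rule on the truncated interval [-10, 10] (the truncation point and step count are assumptions; the tails beyond that range are negligible) and compares `math.gamma(0.5)` with the square root of pi.

```python
import math

def gaussian_integral(limit=10.0, steps=20000):
    """Trapezoidal estimate of the integral of exp(-t^2/2) over [-limit, limit]."""
    h = 2 * limit / steps
    total = math.exp(-limit * limit / 2)  # the two endpoint terms, each weighted 1/2
    for i in range(1, steps):
        t = -limit + i * h
        total += math.exp(-t * t / 2)
    return total * h

estimate = gaussian_integral()
print(f"integral estimate: {estimate:.6f}")
print(f"sqrt(2*pi):        {math.sqrt(2 * math.pi):.6f}")
print(f"gamma(1/2):        {math.gamma(0.5):.6f}")
print(f"sqrt(pi):          {math.sqrt(math.pi):.6f}")
```

Substituting $t = (x - \mu)/\sigma$ in the integral of the unnormalized density gives $\sigma\sqrt{2\pi}$, which is why dividing by that factor makes the total area 1.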


The Standard Normal Distribution

Definition: Standard Normal Distribution

The standard normal is the special case $Z \sim \mathcal{N}(0, 1)$, with PDF:

$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$$

Any normal random variable can be standardized via the transformation $Z = \frac{X - \mu}{\sigma}$.

Standardization (Z-score transformation)

$$Z = \frac{X - \mu}{\sigma} \sim \mathcal{N}(0, 1)$$

Here,

  • $Z$ = Standard normal random variable
  • $X$ = Original normal random variable with $X \sim \mathcal{N}(\mu, \sigma^2)$
  • $\mu$ = Mean of $X$
  • $\sigma$ = Standard deviation of $X$

Why Standardization Matters

Standardization converts any normal distribution to the standard normal, enabling the use of a single z-table (cumulative probability table) for all probability calculations. This is the foundation of all normal-based inference.
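
As a small worked example, suppose exam scores follow N(70, 10²) (hypothetical numbers chosen only for illustration) and we want the probability of a score of at most 85. Standardizing first and then evaluating the standard normal CDF gives the same answer as working with the original distribution directly.

```python
from statistics import NormalDist

mu, sigma = 70.0, 10.0  # hypothetical exam-score distribution
x = 85.0

z = (x - mu) / sigma                      # z-score of the observation
p_via_z = NormalDist().cdf(z)             # look up Phi(z) on the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)   # same probability without standardizing

print(f"z = {z:.2f}, P(X <= {x:.0f}) = {p_via_z:.4f}")  # z = 1.50, P(X <= 85) = 0.9332
```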


Interactive Visualization

Normal Distribution — Interactive Explorer
[Figure: density of Normal(0, 1²) with μ = 0, σ = 1, Var = 1; horizontal axis x from -4 to 4, vertical axis f(x)]

How to Use This Visualization

The interactive visualization above shows the normal distribution PDF. The shaded region represents the 95% probability area (±1.96σ). You can adjust the parameters μ (mean) and σ (standard deviation) to see how they affect the shape of the distribution. The vertical lines show the mean, median, and mode (all equal for the normal distribution).


Cumulative Distribution Function

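The CDF of the standard normal has no closed-form expression in terms of elementary functions; it is defined by an integral and evaluated numerically or from tables: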

Standard Normal CDF

$$\Phi(z) = P(Z \leq z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$

Here,

  • $\Phi(z)$ = CDF of the standard normal
  • $z$ = z-score

Key values from the standard normal table:

| $z$ | $\Phi(z)$ | Interpretation |
|---|---|---|
| 0 | 0.5000 | 50% of area is below the mean |
| 1 | 0.8413 | 84.13% below $\mu + \sigma$ |
| 1.645 | 0.9500 | 95% below $\mu + 1.645\sigma$ (one-sided) |
| 1.960 | 0.9750 | 97.50% below $\mu + 1.96\sigma$ |
| 2 | 0.9772 | 97.72% below $\mu + 2\sigma$ |
| 2.576 | 0.9950 | 99.50% below $\mu + 2.576\sigma$ |
| 3 | 0.9987 | 99.87% below $\mu + 3\sigma$ |
Standard Normal CDF — P(Z ≤ z)
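
In practice the table is rarely needed, because $\Phi$ can be written in terms of the error function as $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The sketch below uses that identity with `math.erf` to recompute the table entries, and `NormalDist.inv_cdf` for the reverse lookup; the helper name `phi` is chosen here for illustration.

```python
import math
from statistics import NormalDist

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Recompute the table entries, rounded to four decimals
table = {z: round(phi(z), 4) for z in (0, 1, 1.645, 1.96, 2, 2.576, 3)}
for z, p in table.items():
    print(f"{z:>6}  {p:.4f}")

# Reverse lookup: the z-score with 97.5% of the area below it
z_975 = NormalDist().inv_cdf(0.975)
print(f"z for 0.975: {z_975:.3f}")
```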

The Empirical Rule (68-95-99.7)

Theorem: Empirical Rule

For $X \sim \mathcal{N}(\mu, \sigma^2)$:

$$P(\mu - \sigma \leq X \leq \mu + \sigma) = 2\Phi(1) - 1 \approx 0.6827$$
$$P(\mu - 2\sigma \leq X \leq \mu + 2\sigma) = 2\Phi(2) - 1 \approx 0.9545$$
$$P(\mu - 3\sigma \leq X \leq \mu + 3\sigma) = 2\Phi(3) - 1 \approx 0.9973$$

This is the foundation of the $3\sigma$ rule: for normally distributed data, about 99.7% of observations lie within 3 standard deviations of the mean. Observations beyond this range are potential outliers.
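
The expression $2\Phi(k) - 1$ follows from standardization and the symmetry of the standard normal. A short derivation, using only the definitions given earlier in this lesson:

```latex
\begin{aligned}
P(\mu - k\sigma \leq X \leq \mu + k\sigma)
  &= P(-k \leq Z \leq k) && \text{standardize: } Z = (X - \mu)/\sigma \\
  &= \Phi(k) - \Phi(-k) \\
  &= \Phi(k) - \bigl(1 - \Phi(k)\bigr) && \text{symmetry: } \Phi(-k) = 1 - \Phi(k) \\
  &= 2\Phi(k) - 1
\end{aligned}
```

Plugging in four-decimal table values can differ from the figures above in the last digit: for example, $2(0.8413) - 1 = 0.6826$, whereas the unrounded $\Phi(1)$ gives 0.6827.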


Comparing Normal Distributions

Effect of Standard Deviation on Normal Distribution
[Figure: normal density f(x) plotted against x from -4 to 4; curve shown: Normal(0, 1²) with μ = 0, σ = 1, Var = 1]

Understanding the Spread Parameter

As σ increases, the distribution becomes wider and shorter (more spread out). The area under each curve is still 1, but the probability is distributed over a larger range. This visualization shows why σ controls the "width" of the bell curve.


Moment-Generating Function

Moment-Generating Function of Normal Distribution

$$M_X(t) = E[e^{tX}] = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)$$

Here,

  • $M_X(t)$ = Moment-generating function
  • $\mu$ = Mean
  • $\sigma^2$ = Variance
  • $t$ = Real argument of the MGF (for the normal distribution, $M_X(t)$ is finite for every real $t$)

Why the MGF is Powerful

The MGF uniquely determines the distribution. If two random variables have the same MGF (in a neighborhood of 0), they have the same distribution. The moments are recovered via $E[X^k] = M_X^{(k)}(0)$, the $k$-th derivative evaluated at $t = 0$.
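
To see moment recovery in action, the sketch below differentiates the normal MGF numerically at $t = 0$ using central finite differences (an approximation chosen to avoid symbolic libraries; the step size and the parameters μ = 1.5, σ = 2 are arbitrary). The first derivative should be close to $\mu$ and the second close to $\mu^2 + \sigma^2$.

```python
import math

def mgf(t, mu, sigma):
    """MGF of N(mu, sigma^2): exp(mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

def derivative_at_zero(f, order, h=1e-3):
    """Central finite-difference estimate of the first or second derivative at 0."""
    if order == 1:
        return (f(h) - f(-h)) / (2 * h)
    if order == 2:
        return (f(h) - 2 * f(0.0) + f(-h)) / h**2
    raise ValueError("only orders 1 and 2 are supported in this sketch")

mu, sigma = 1.5, 2.0  # arbitrary illustrative parameters
m1 = derivative_at_zero(lambda t: mgf(t, mu, sigma), 1)  # approximates E[X]
m2 = derivative_at_zero(lambda t: mgf(t, mu, sigma), 2)  # approximates E[X^2]
variance = m2 - m1**2

print(f"E[X]   = {m1:.4f} (exact: {mu})")
print(f"E[X^2] = {m2:.4f} (exact: {mu**2 + sigma**2})")
print(f"Var(X) = {variance:.4f} (exact: {sigma**2})")
```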


Reproductive Property

Theorem: Linear Combinations of Normals

If $X_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $X_2 \sim \mathcal{N}(\mu_2, \sigma_2^2)$ are independent, then:

$$aX_1 + bX_2 \sim \mathcal{N}(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2)$$

More generally, if $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ are independent, then:

$$\sum_{i=1}^n a_i X_i \sim \mathcal{N}\left(\sum_{i=1}^n a_i \mu_i, \; \sum_{i=1}^n a_i^2 \sigma_i^2\right)$$

This property is one reason the normal distribution is so pervasive: sums of independent normal random variables are always normal, so the family is closed under linear combinations.
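
A quick simulation illustrates the parameter formulas. The sketch below assumes arbitrary example values (a = 2, b = -1, X₁ ~ N(1, 2²), X₂ ~ N(-3, 1.5²)) and compares the sample mean and variance of the combination with the theoretical values; the helper name `combination_params` is hypothetical.

```python
import random
import statistics

def combination_params(a, mu1, sigma1, b, mu2, sigma2):
    """Mean and variance of a*X1 + b*X2 for independent normals X1, X2."""
    return a * mu1 + b * mu2, a**2 * sigma1**2 + b**2 * sigma2**2

a, b = 2.0, -1.0
mu1, sigma1 = 1.0, 2.0
mu2, sigma2 = -3.0, 1.5

theory_mean, theory_var = combination_params(a, mu1, sigma1, b, mu2, sigma2)

rng = random.Random(1)  # fixed seed so the run is reproducible
draws = [a * rng.gauss(mu1, sigma1) + b * rng.gauss(mu2, sigma2)
         for _ in range(100_000)]
sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)

print(f"mean: theory {theory_mean:.2f}, simulated {sample_mean:.2f}")
print(f"var:  theory {theory_var:.2f}, simulated {sample_var:.2f}")
```

The simulation checks only the mean and variance; that the combination is itself normal is the content of the theorem and is not verified by this code.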


Normal Approximation to the Binomial

Normal Approximation to Binomial

$$X \sim \text{Bin}(n, p) \;\approx\; Y \sim \mathcal{N}(np, \, np(1-p))$$

Here,

  • $n$ = Number of trials
  • $p$ = Probability of success
  • $np$ = Mean of the binomial
  • $np(1-p)$ = Variance of the binomial

The approximation improves as $n$ increases. A standard rule of thumb: apply when $np \geq 10$ and $n(1-p) \geq 10$. A continuity correction ($\pm 0.5$) improves accuracy for finite $n$.
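
The effect of the continuity correction can be seen by comparing against the exact binomial probability. The sketch below assumes n = 100, p = 0.3, and the event X ≤ 35 (illustrative values that satisfy the rule of thumb) and computes the exact CDF by direct summation.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p), by summing the probability mass function."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, k = 100, 0.3, 35  # illustrative values; np = 30 and n(1-p) = 70
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

exact = binomial_cdf(k, n, p)
plain = approx.cdf(k)            # normal approximation without correction
corrected = approx.cdf(k + 0.5)  # continuity correction for P(X <= k)

print(f"exact binomial:         {exact:.4f}")
print(f"normal, no correction:  {plain:.4f}")
print(f"normal, with +0.5:      {corrected:.4f}")
```

For an upper-tail probability such as P(X ≥ k), the correction goes the other way: evaluate the normal at k - 0.5.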


Python Implementation

Example: Working with Normal Distribution

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Create normal distribution
mu, sigma = 0, 1
normal_dist = stats.norm(loc=mu, scale=sigma)

# Calculate statistics
mean = normal_dist.mean()
var = normal_dist.var()
std = normal_dist.std()

print(f"Mean: {mean:.4f}")
print(f"Variance: {var:.4f}")
print(f"Standard Deviation: {std:.4f}")

# Generate random samples
np.random.seed(42)
samples = normal_dist.rvs(size=10000)

# Plot histogram vs theoretical PDF
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=50, density=True, alpha=0.7, label='Samples')
x = np.linspace(-4, 4, 1000)
plt.plot(x, normal_dist.pdf(x), 'r-', lw=2, label='Theoretical PDF')
plt.title('Normal Distribution (μ=0, σ=1)')
plt.xlabel('x')
plt.ylabel('Density')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Calculate probabilities
print(f"\nP(-1 ≤ X ≤ 1) = {normal_dist.cdf(1) - normal_dist.cdf(-1):.4f}")
print(f"P(-2 ≤ X ≤ 2) = {normal_dist.cdf(2) - normal_dist.cdf(-2):.4f}")
print(f"P(-3 ≤ X ≤ 3) = {normal_dist.cdf(3) - normal_dist.cdf(-3):.4f}")

# Percentiles
print(f"\n95th percentile: {normal_dist.ppf(0.95):.4f}")
print(f"99th percentile: {normal_dist.ppf(0.99):.4f}")

Key Takeaways

Summary: Normal Distribution

  • Symmetric, bell-shaped density centered at $\mu$ with spread $\sigma$
  • Standardization: $Z = (X - \mu)/\sigma \sim \mathcal{N}(0,1)$ converts any normal to the standard normal
  • Empirical rule: approximately 68%, 95%, 99.7% within 1, 2, 3 standard deviations
  • Reproductive property: linear combinations of independent normals are normal
  • Central Limit Theorem: sums/means of many i.i.d. random variables converge to normal
  • MGF uniquely determines the distribution: $M_X(t) = \exp(\mu t + \sigma^2 t^2/2)$
  • Foundation for inference: z-tests, t-tests, ANOVA, and regression inference all rely on normality assumptions, either exactly or approximately through the Central Limit Theorem
