The Empirical Rule — 68-95-99.7 for Normal Distributions


The 68-95-99.7 Rule Every Analyst Needs

The empirical rule provides instant intuition about data spread in normal distributions, enabling quick assessments without complex calculations. This simple framework is the foundation for outlier detection, quality control, and rapid data analysis.

  • Quality Assurance — Six Sigma methodologies use the rule to identify process deviations
  • Risk Management — Financial institutions apply it to estimate VaR and expected loss ranges
  • Clinical Research — Researchers quickly assess whether patient measurements fall within expected ranges

Three numbers that capture the essence of normal variability.


Core Concepts

The empirical rule states how much probability mass a normal distribution places within one, two, and three standard deviations of its mean. The familiar figures 68, 95, and 99.7 are rounded; the exact percentages follow directly from the Gaussian CDF.

Definition: Empirical Rule

For a normal distribution $X \sim N(\mu, \sigma^2)$:

  • Approximately 68.27% of data falls within $\mu \pm \sigma$
  • Approximately 95.45% of data falls within $\mu \pm 2\sigma$
  • Approximately 99.73% of data falls within $\mu \pm 3\sigma$

Empirical Rule (Exact Form)

$$P(\mu - k\sigma \leq X \leq \mu + k\sigma) = 2\Phi(k) - 1$$

Here,

  • $\mu$ = Mean of the distribution
  • $\sigma$ = Standard deviation
  • $k$ = Number of standard deviations from the mean
  • $\Phi(k)$ = Standard normal CDF evaluated at $k$

Rigorous Derivation

Theorem: Proof of the 68-95-99.7 Rule

For $X \sim N(\mu, \sigma^2)$, standardize: $Z = (X - \mu)/\sigma \sim N(0,1)$.

$$P(\mu - k\sigma \leq X \leq \mu + k\sigma) = P(-k \leq Z \leq k) = \Phi(k) - \Phi(-k)$$

By symmetry of the standard normal, $\Phi(-k) = 1 - \Phi(k)$:

$$P(-k \leq Z \leq k) = \Phi(k) - (1 - \Phi(k)) = 2\Phi(k) - 1$$

Numerical evaluation (with $\Phi$ to six decimals, because four-decimal table values shift the last digit of the result):

  • $k = 1$: $2\Phi(1) - 1 = 2(0.841345) - 1 \approx 0.6827$
  • $k = 2$: $2\Phi(2) - 1 = 2(0.977250) - 1 \approx 0.9545$
  • $k = 3$: $2\Phi(3) - 1 = 2(0.998650) - 1 \approx 0.9973$
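The evaluation of $2\Phi(k) - 1$ can be reproduced with Python's standard library. This is a minimal sketch; the helper name `normal_coverage` is an illustrative choice, not part of the lesson.

```python
from statistics import NormalDist


def normal_coverage(k: float) -> float:
    """Return P(|Z| <= k) = 2*Phi(k) - 1 for a standard normal Z."""
    phi_k = NormalDist().cdf(k)  # standard normal CDF, Phi(k)
    return 2.0 * phi_k - 1.0


for k in (1, 2, 3):
    phi_k = NormalDist().cdf(k)
    print(f"k={k}: Phi(k)={phi_k:.6f}, coverage={normal_coverage(k):.4f}")
```

Printing $\Phi(k)$ to six decimals alongside the coverage makes it easy to see how rounding the CDF value too early affects the fourth decimal of the result.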

Connection to the Error Function

The normal CDF can be written using the error function:

$$\Phi(z) = \frac{1}{2}\left[1 + \text{erf}\left(\frac{z}{\sqrt{2}}\right)\right]$$

So the empirical rule becomes $P(|Z| \leq k) = \text{erf}(k/\sqrt{2})$. For $k=1$: $\text{erf}(1/\sqrt{2}) \approx 0.6827$.
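As a sanity check on the error-function form, the sketch below integrates the standard normal density over $[-k, k]$ with composite Simpson's rule and compares the result with `math.erf`. The function names and the choice of 1000 subintervals are assumptions made for illustration.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def coverage_by_integration(k: float, n: int = 1000) -> float:
    """Approximate P(|Z| <= k) with composite Simpson's rule (n must be even)."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = 2.0 * k / n
    total = std_normal_pdf(-k) + std_normal_pdf(k)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0  # odd nodes get 4, even interior nodes get 2
        total += weight * std_normal_pdf(-k + i * h)
    return total * h / 3.0


def coverage_by_erf(k: float) -> float:
    """P(|Z| <= k) = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(k, round(coverage_by_integration(k), 6), round(coverage_by_erf(k), 6))
```

The two columns should agree to many decimal places, which is a quick way to confirm that the closed form and the integral describe the same probability.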


Chebyshev's Inequality (The Universal Bound)

Chebyshev's Inequality

$$P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$$

Here,

  • $k$ = Number of standard deviations ($k > 0$; the bound is informative only for $k > 1$)
  • $\mu, \sigma$ = Mean and standard deviation of $X$ (any distribution with finite variance)

Theorem: Proof of Chebyshev's Inequality

Let $Y = |X - \mu|$. We want $P(Y \geq k\sigma) \leq 1/k^2$.

By Markov's inequality applied to $Y^2$ (since $Y^2 \geq 0$):

$$P(Y^2 \geq k^2\sigma^2) \leq \frac{E[Y^2]}{k^2\sigma^2} = \frac{\text{Var}(X)}{k^2\sigma^2} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}$$

Since $Y \geq k\sigma \iff Y^2 \geq k^2\sigma^2$, we have $P(|X-\mu| \geq k\sigma) \leq 1/k^2$.
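To see that the $1/k^2$ bound cannot be improved in general, the sketch below builds a three-point distribution that attains it exactly. The construction ($X = \pm k$ with probability $1/(2k^2)$ each, and $X = 0$ otherwise) is a standard textbook example, not something taken from this lesson, and the function names are illustrative.

```python
from fractions import Fraction


def three_point_distribution(k: int) -> dict:
    """X = -k or +k with probability 1/(2k^2) each, and X = 0 otherwise."""
    tail = Fraction(1, 2 * k * k)
    return {-k: tail, 0: 1 - 2 * tail, k: tail}


def mean_and_variance(dist: dict):
    """Exact mean and variance of a finite discrete distribution."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var


def tail_probability(dist: dict, mean, threshold) -> Fraction:
    """P(|X - mean| >= threshold)."""
    return sum(p for x, p in dist.items() if abs(x - mean) >= threshold)


k = 2
dist = three_point_distribution(k)
mean, var = mean_and_variance(dist)      # mean 0 and variance 1, so sigma = 1
tail = tail_probability(dist, mean, k)   # threshold is k * sigma with sigma = 1
print(mean, var, tail, Fraction(1, k * k))
```

Exact rational arithmetic via `fractions.Fraction` avoids floating-point noise, so the tail probability and the Chebyshev bound can be compared for equality.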

Empirical Rule vs. Chebyshev

The empirical rule gives exact probabilities for normal data, while Chebyshev gives a loose but universal lower bound for any distribution with finite variance. For $k=2$:

  • Normal: $P(|X-\mu| \leq 2\sigma) \approx 0.9545$ (exact value, rounded)
  • Chebyshev: $P(|X-\mu| \leq 2\sigma) \geq 1 - 1/4 = 0.75$ (guaranteed for any distribution)

The gap between 75% and 95.45% shows how much structure the normal distribution provides.
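The same comparison can be tabulated for several values of $k$. This is a small sketch using only the standard library; the function names and the particular values of $k$ are illustrative choices.

```python
import math


def normal_coverage(k: float) -> float:
    """Exact P(|X - mu| <= k*sigma) for a normal distribution."""
    return math.erf(k / math.sqrt(2.0))


def chebyshev_lower_bound(k: float) -> float:
    """Chebyshev's guaranteed lower bound on P(|X - mu| <= k*sigma)."""
    return max(0.0, 1.0 - 1.0 / (k * k))


print(f"{'k':>4} {'normal':>10} {'chebyshev':>10}")
for k in (1.0, 1.5, 2.0, 3.0, 4.0):
    print(f"{k:>4} {normal_coverage(k):>10.4f} {chebyshev_lower_bound(k):>10.4f}")
```

At $k = 1$ the Chebyshev bound is 0, which says nothing, while the normal coverage is already about 68%.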

[Figure: Empirical Rule vs Chebyshev, $P(|X-\mu| \leq k\sigma)$]

Higher-Order Concentration: The 6σ Rule

Six Sigma Quality Control

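In Six Sigma methodology, "6σ" refers to the probability that a measurement falls more than six standard deviations from the mean, $P(|X - \mu| > 6\sigma)$. For a normal distribution:

$$P(|Z| > 6) = 2(1 - \Phi(6)) \approx 2(9.866 \times 10^{-10}) \approx 1.973 \times 10^{-9}$$

This is approximately 2 defects per billion opportunities for a perfectly centered process. Industrial Six Sigma practice usually quotes 3.4 defects per million instead, because it allows for a 1.5σ long-term shift in the process mean.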
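Tail probabilities this small are best computed without subtracting a number close to 1 from 1. A minimal sketch, using the identity $P(|Z| > k) = \text{erfc}(k/\sqrt{2})$ that follows from the error-function form of $\Phi$; the function name is illustrative.

```python
import math


def two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))


p6 = two_sided_tail(6.0)
print(f"P(|Z| > 6) = {p6:.4e}")
print(f"defects per billion opportunities = {p6 * 1e9:.3f}")
```

Using `math.erfc` rather than `1 - erf(...)` keeps full relative precision in the far tail, where the direct subtraction would lose most of its significant digits.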


Generalization: Higher-Order Concentration Inequalities

Theorem: Vysochanskii-Petunin Inequality (Unimodal Case)

For a unimodal distribution with mean $\mu$ and finite variance $\sigma^2$, and $k > \sqrt{8/3} \approx 1.633$:

$$P(|X - \mu| \geq k\sigma) \leq \frac{4}{9k^2}$$

This is tighter than Chebyshev's $1/k^2$ bound for unimodal distributions, but still looser than the empirical rule for normals.
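The ordering of the three tail statements can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the Vysochanskii-Petunin helper deliberately refuses values of $k$ outside the range stated in the theorem.

```python
import math

VP_MIN_K = math.sqrt(8.0 / 3.0)  # the bound below is stated only for k above this


def chebyshev_tail_bound(k: float) -> float:
    """Upper bound on P(|X - mu| >= k*sigma) for any finite-variance distribution."""
    return min(1.0, 1.0 / (k * k))


def vp_tail_bound(k: float) -> float:
    """Vysochanskii-Petunin upper bound for unimodal distributions."""
    if k <= VP_MIN_K:
        raise ValueError("bound stated only for k > sqrt(8/3)")
    return 4.0 / (9.0 * k * k)


def normal_tail(k: float) -> float:
    """Exact P(|Z| >= k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2.0))


for k in (2.0, 3.0, 4.0):
    print(f"k={k}: normal={normal_tail(k):.5f}, "
          f"VP<={vp_tail_bound(k):.5f}, Chebyshev<={chebyshev_tail_bound(k):.5f}")
```

For every $k$ in the loop the exact normal tail sits below the unimodal bound, which in turn sits below the Chebyshev bound by the constant factor 4/9.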


Worked Example

Example: Exam Scores

Suppose exam scores are $X \sim N(75, 100)$ (mean 75, variance 100, so $\sigma = 10$).

Q: What fraction of students score between 65 and 85?

$$P(65 \leq X \leq 85) = P(-1 \leq Z \leq 1) = 2\Phi(1) - 1 \approx 0.6827$$

So about 68.3% of students score within one standard deviation of the mean.

Q: What fraction score above 95?

$$P(X > 95) = P(Z > 2) = 1 - \Phi(2) = 1 - 0.9772 = 0.0228$$

About 2.3% score above 95.
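Both answers can be reproduced with `statistics.NormalDist`. Note that its second parameter is the standard deviation (10), not the variance (100). The variable names below are illustrative.

```python
from statistics import NormalDist

# Exam scores: mean 75, standard deviation 10 (variance 100).
scores = NormalDist(mu=75, sigma=10)

within_one_sd = scores.cdf(85) - scores.cdf(65)  # P(65 <= X <= 85)
above_95 = 1.0 - scores.cdf(95)                  # P(X > 95)

print(f"P(65 <= X <= 85) = {within_one_sd:.4f}")
print(f"P(X > 95)        = {above_95:.4f}")
```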


Relationship to the Normal Distribution Family

The empirical rule is a special property of the Gaussian family. Other distributions have their own concentration behavior:

  • Laplace ($\text{Laplace}(\mu, b)$, with $\sigma = b\sqrt{2}$): $P(|X-\mu| \leq k\sigma) = 1 - e^{-k\sqrt{2}}$
  • Uniform ($\text{Unif}(a,b)$, with $\sigma = (b-a)/\sqrt{12}$): $P(|X-\mu| \leq k\sigma) = \min(1, k/\sqrt{3})$, which is exactly 100% for $k \geq \sqrt{3}$
  • Cauchy: Neither the mean nor the variance exists, so the empirical rule doesn't apply at all
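Coverage within $k$ standard deviations for the Laplace and uniform cases can be computed directly from each distribution's CDF. The sketch below assumes the usual parameterizations (Laplace scale $b$ with $\sigma = b\sqrt{2}$; uniform on $[a, b]$ with $\sigma = (b-a)/\sqrt{12}$); the function names and default parameters are illustrative.

```python
import math


def laplace_cdf(x: float, mu: float, b: float) -> float:
    """CDF of the Laplace distribution with location mu and scale b."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def uniform_cdf(x: float, a: float, b: float) -> float:
    """CDF of the uniform distribution on [a, b]."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


def laplace_coverage(k: float, mu: float = 0.0, b: float = 1.0) -> float:
    sigma = b * math.sqrt(2.0)
    return laplace_cdf(mu + k * sigma, mu, b) - laplace_cdf(mu - k * sigma, mu, b)


def uniform_coverage(k: float, a: float = 0.0, b: float = 1.0) -> float:
    mu = 0.5 * (a + b)
    sigma = (b - a) / math.sqrt(12.0)
    return uniform_cdf(mu + k * sigma, a, b) - uniform_cdf(mu - k * sigma, a, b)


for k in (1, 2, 3):
    print(f"k={k}: Laplace={laplace_coverage(k):.4f}, Uniform={uniform_coverage(k):.4f}")
```

Neither distribution reproduces 68-95-99.7: the Laplace puts more mass within one standard deviation than the normal, and the uniform reaches full coverage before $k = 2$.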

Specific Applications

  1. Six Sigma manufacturing: Defect rates are computed from $P(|Z| > 6) = 2(1 - \Phi(6))$, the same formula as the empirical rule with $k = 6$.
  2. Process capability indices: $C_p$ and $C_{pk}$ are defined in terms of $k\sigma$ tolerance limits.
  3. Outlier detection: Values beyond $3\sigma$ are flagged as potential outliers (0.27% false positive rate for normal data).
  4. Standardized testing: IQ scores ($\mu=100, \sigma=15$): about 68% score between 85 and 115, and about 95% between 70 and 130.
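The outlier-detection use can be sketched with the IQ parameters from the list above. This example assumes $\mu$ and $\sigma$ are known in advance rather than estimated from the sample; the function name and the score values are made up for illustration.

```python
def flag_outliers(values, mu: float, sigma: float, k: float = 3.0) -> list:
    """Return the values lying more than k standard deviations from mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [x for x in values if abs(x - mu) / sigma > k]


# Hypothetical IQ scores, checked against mu = 100 and sigma = 15.
iq_scores = [98, 112, 87, 150, 101, 54, 130]
print(flag_outliers(iq_scores, mu=100, sigma=15))
```

With these parameters the $3\sigma$ limits are 55 and 145, so a score of 130 (two standard deviations out) is not flagged. If $\mu$ and $\sigma$ were instead estimated from a small sample, a single extreme value would inflate the estimated $\sigma$ and could escape the threshold.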

Key Takeaways

Summary: Empirical Rule

  • 68.27% within $\mu \pm \sigma$, 95.45% within $\mu \pm 2\sigma$, 99.73% within $\mu \pm 3\sigma$
  • Exact formula: $P(|Z| \leq k) = 2\Phi(k) - 1 = \text{erf}(k/\sqrt{2})$
  • Applies only to normal distributions; for arbitrary distributions use Chebyshev ($\geq 1-1/k^2$) or Vysochanskii-Petunin ($\geq 1-4/(9k^2)$ for unimodal)
  • Six Sigma: $P(|Z| > 6) \approx 2 \times 10^{-9}$ defects per opportunity
  • Foundation for outlier detection, process control, and standardized testing
