The Empirical Rule — 68-95-99.7 for Normal Distributions

The 68-95-99.7 Rule Every Analyst Needs

The empirical rule provides instant intuition about data spread in normal distributions, enabling quick assessments without complex calculations. This simple framework is the foundation for outlier detection, quality control, and rapid data analysis.

Quality Assurance — Six Sigma methodologies use the rule to identify process deviations
Risk Management — Financial institutions apply it to estimate VaR and expected loss ranges
Clinical Research — Researchers quickly assess whether patient measurements fall within expected ranges

Three numbers that capture the essence of normal variability.

Core Concepts

The empirical rule states what share of the probability mass lies within one, two, and three standard deviations of the mean in a normal distribution. The percentages follow directly from the Gaussian pdf and are the same for every choice of $\mu$ and $\sigma$.

Definition: Empirical Rule

For a normal distribution $X \sim N(\mu, \sigma^2)$:

Approximately 68.27% of data falls within $\mu \pm \sigma$
Approximately 95.45% of data falls within $\mu \pm 2\sigma$
Approximately 99.73% of data falls within $\mu \pm 3\sigma$

Empirical Rule (Exact Form)

$$P(\mu - k\sigma \leq X \leq \mu + k\sigma) = 2\Phi(k) - 1$$

Here,

$\mu$ = Mean of the distribution
$\sigma$ = Standard deviation
$k$ = Number of standard deviations from the mean
$\Phi(k)$ = Standard normal CDF evaluated at $k$

Rigorous Derivation

Proof of the 68-95-99.7 Rule

For $X \sim N(\mu, \sigma^2)$, standardize: $Z = (X - \mu)/\sigma \sim N(0,1)$.

$$P(\mu - k\sigma \leq X \leq \mu + k\sigma) = P(-k \leq Z \leq k) = \Phi(k) - \Phi(-k)$$

By symmetry of the standard normal, $\Phi(-k) = 1 - \Phi(k)$:

$$P(-k \leq Z \leq k) = \Phi(k) - (1 - \Phi(k)) = 2\Phi(k) - 1$$

Numerical evaluation:

$k = 1$: $2\Phi(1) - 1 \approx 2(0.84134) - 1 \approx 0.6827$
$k = 2$: $2\Phi(2) - 1 \approx 2(0.97725) - 1 \approx 0.9545$
$k = 3$: $2\Phi(3) - 1 \approx 2(0.99865) - 1 \approx 0.9973$
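
The numerical evaluation above can be reproduced in a few lines. This is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` for $\Phi$; the helper name `coverage` is chosen here for illustration and is not part of the original text.

```python
from statistics import NormalDist


def coverage(k: float) -> float:
    """P(|Z| <= k) for a standard normal Z, computed as 2*Phi(k) - 1."""
    return 2.0 * NormalDist().cdf(k) - 1.0


# Evaluate the exact form at the three classic multiples of sigma.
for k in (1, 2, 3):
    print(f"k = {k}: {coverage(k):.4f}")
```

Carrying more digits of $\Phi(k)$ than a four-decimal table provides avoids small rounding discrepancies in the last digit.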

Connection to the Error Function

The normal CDF can be written using the error function:

$$\Phi(z) = \frac{1}{2}\left[1 + \text{erf}\left(\frac{z}{\sqrt{2}}\right)\right]$$

So the empirical rule becomes: $P(|Z| \leq k) = 2\Phi(k) - 1 = \text{erf}(k/\sqrt{2})$. For $k=1$: $\text{erf}(1/\sqrt{2}) \approx 0.6827$.
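
As a quick check of the error-function identity, the sketch below compares `math.erf(k / sqrt(2))` with $2\Phi(k) - 1$ computed from `statistics.NormalDist`. Both are standard-library calls; the function name `coverage_erf` is an illustrative choice.

```python
import math
from statistics import NormalDist


def coverage_erf(k: float) -> float:
    """P(|Z| <= k) written directly in terms of the error function."""
    return math.erf(k / math.sqrt(2.0))


# The two expressions should agree up to floating-point error.
for k in (1, 2, 3):
    via_cdf = 2.0 * NormalDist().cdf(k) - 1.0
    print(f"k = {k}: erf form {coverage_erf(k):.6f}, CDF form {via_cdf:.6f}")
```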

Chebyshev's Inequality (The Universal Bound)

Chebyshev's Inequality

$$P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$$

Here,

$k$ = Number of standard deviations ($k > 0$; the bound is informative only for $k > 1$)
$\mu, \sigma$ = Mean and standard deviation of $X$ (finite variance required)

Proof of Chebyshev's Inequality

Let $Y = |X - \mu|$. We want $P(Y \geq k\sigma) \leq 1/k^2$.

By Markov's inequality applied to $Y^2$ (since $Y^2 \geq 0$):

$$P(Y^2 \geq k^2\sigma^2) \leq \frac{E[Y^2]}{k^2\sigma^2} = \frac{\text{Var}(X)}{k^2\sigma^2} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}$$

Since $Y \geq k\sigma \iff Y^2 \geq k^2\sigma^2$, we have $P(|X-\mu| \geq k\sigma) \leq 1/k^2$.
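
Because the bound holds for any distribution with finite variance, it can be checked empirically on clearly non-normal data. The sketch below uses an exponential distribution with rate 1 (mean 1, standard deviation 1) purely as an illustrative choice; the seed, sample size, and helper name `tail_frequency` are assumptions made for this example.

```python
import random


def tail_frequency(samples, mu, sigma, k):
    """Fraction of samples lying at least k standard deviations from mu."""
    return sum(abs(x - mu) >= k * sigma for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Exponential with rate 1: mean 1, standard deviation 1, strongly skewed.
samples = [rng.expovariate(1.0) for _ in range(100_000)]

for k in (1.5, 2, 3):
    observed = tail_frequency(samples, mu=1.0, sigma=1.0, k=k)
    print(f"k = {k}: observed tail {observed:.4f}, Chebyshev bound {1 / k**2:.4f}")
```

The observed tail frequencies should sit well below $1/k^2$, which illustrates how conservative the bound is even for a skewed distribution.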

Empirical Rule vs. Chebyshev

The empirical rule gives exact probabilities for normal data, while Chebyshev gives loose but universal bounds that hold for any distribution with finite variance. For $k=2$:

Normal: $P(|X-\mu| \leq 2\sigma) \approx 0.9545$ (exact value, rounded to four decimals)
Chebyshev: $P(|X-\mu| \leq 2\sigma) \geq 1 - 1/4 = 0.75$ (guaranteed for any distribution)

The gap between 75% and 95.45% shows how much structure the normal distribution provides.

[Figure: Empirical Rule vs. Chebyshev, $P(|X-\mu| \leq k\sigma)$ as a function of $k$]
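
The comparison can also be tabulated directly. The following is a small sketch using only the standard library; the function names and the particular values of $k$ are illustrative choices rather than anything prescribed by the text.

```python
from statistics import NormalDist


def normal_coverage(k: float) -> float:
    """Exact P(|X - mu| <= k*sigma) for a normal distribution."""
    return 2.0 * NormalDist().cdf(k) - 1.0


def chebyshev_lower_bound(k: float) -> float:
    """Chebyshev's guaranteed lower bound on the same probability."""
    return max(0.0, 1.0 - 1.0 / k**2)


print(f"{'k':>4} {'normal':>8} {'Chebyshev':>10}")
for k in (1, 1.5, 2, 3, 4):
    print(f"{k:>4} {normal_coverage(k):>8.4f} {chebyshev_lower_bound(k):>10.4f}")
```

At $k = 1$ the Chebyshev bound is 0, which says nothing at all, while the normal value is already about 68%.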

Higher-Order Concentration: The 6σ Rule

Six Sigma Quality Control

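In Six Sigma methodology, "6σ" refers to the tail probability $P(|X - \mu| > 6\sigma)$, the chance that a measurement lands more than six standard deviations from the mean. For a normal distribution:

$$P(|Z| > 6) = 2(1 - \Phi(6)) \approx 2(9.866 \times 10^{-10}) \approx 1.973 \times 10^{-9}$$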

This is approximately 2 defects per billion opportunities — the gold standard for manufacturing quality.
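
Computing $1 - \Phi(6)$ by subtraction loses most of its significant digits in floating point, since $\Phi(6)$ is so close to 1. One way around this, sketched here, is the identity $P(|Z| > k) = \text{erfc}(k/\sqrt{2})$, which follows from the error-function form of $\Phi$. The function name `two_sided_tail` is an illustrative choice.

```python
import math


def two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))


p = two_sided_tail(6)
print(f"P(|Z| > 6) = {p:.3e}")
print(f"defects per billion opportunities = {p * 1e9:.2f}")
```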

Generalization: Higher-Order Concentration Inequalities

Theorem: Vysochanskii-Petunin Inequality (Unimodal Case)

For a unimodal distribution with mean $\mu$ and finite variance $\sigma^2$, and $k > \sqrt{8/3} \approx 1.633$:

$$P(|X - \mu| \geq k\sigma) \leq \frac{4}{9k^2}$$

This is tighter than Chebyshev's $1/k^2$ bound for unimodal distributions, but still looser than the empirical rule for normals.
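
To see the ordering of the three results side by side, the sketch below evaluates the exact normal tail, the Vysochanskii-Petunin bound, and the Chebyshev bound at a few values of $k$ above the $\sqrt{8/3}$ threshold. The function names are illustrative, and the normal tail is computed with `math.erfc` as an implementation choice.

```python
import math

VP_MIN_K = math.sqrt(8.0 / 3.0)  # the bound is stated only for k above this


def normal_tail(k: float) -> float:
    """Exact P(|X - mu| >= k*sigma) for a normal distribution."""
    return math.erfc(k / math.sqrt(2.0))


def chebyshev_tail_bound(k: float) -> float:
    """Chebyshev upper bound, valid for any finite-variance distribution."""
    return min(1.0, 1.0 / k**2)


def vp_tail_bound(k: float) -> float:
    """Vysochanskii-Petunin upper bound for unimodal distributions."""
    if k <= VP_MIN_K:
        raise ValueError("bound is stated only for k > sqrt(8/3)")
    return 4.0 / (9.0 * k**2)


for k in (2, 3, 4):
    print(f"k = {k}: normal {normal_tail(k):.5f}, "
          f"VP {vp_tail_bound(k):.5f}, Chebyshev {chebyshev_tail_bound(k):.5f}")
```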

Worked Example

Example: Exam Scores

Suppose exam scores are $X \sim N(75, 100)$ (mean 75, variance 100, so $\sigma = 10$ ).

Q: What fraction of students score between 65 and 85?

$$P(65 \leq X \leq 85) = P(-1 \leq Z \leq 1) = 2\Phi(1) - 1 \approx 0.6827$$

So about 68.3% of students score within one standard deviation of the mean.

Q: What fraction score above 95?

$$P(X > 95) = P(Z > 2) = 1 - \Phi(2) \approx 1 - 0.9772 = 0.0228$$

About 2.3% score above 95.
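
Both answers can be checked without standardizing by hand. This sketch assumes the standard-library `statistics.NormalDist`, parameterized by the standard deviation (10) rather than the variance (100); the variable names are illustrative.

```python
from statistics import NormalDist

# NormalDist takes the standard deviation, not the variance.
scores = NormalDist(mu=75, sigma=10)

between_65_and_85 = scores.cdf(85) - scores.cdf(65)
above_95 = 1.0 - scores.cdf(95)

print(f"P(65 <= X <= 85) = {between_65_and_85:.4f}")
print(f"P(X > 95)        = {above_95:.4f}")
```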

Relationship to the Normal Distribution Family

The empirical rule is a special property of the Gaussian family. Other distributions have their own concentration behavior:

Laplace ($\text{Laplace}(\mu, b)$, with $\sigma = b\sqrt{2}$): $P(|X-\mu| \leq k\sigma) = 1 - e^{-k\sqrt{2}}$
Uniform ($\text{Unif}(a,b)$, with $\sigma = (b-a)/\sqrt{12}$): $P(|X-\mu| \leq k\sigma) = \min(1, k/\sqrt{3})$, which is exactly 100% for $k \geq \sqrt{3}$
Cauchy: Neither the mean nor the variance exists, so the empirical rule cannot even be stated
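
The Laplace and uniform cases can be sanity-checked by simulation. The sketch below takes the closed forms as $1 - e^{-k\sqrt{2}}$ for the Laplace (from $P(|X-\mu| \leq t) = 1 - e^{-t/b}$ with $\sigma = b\sqrt{2}$) and $\min(1, k/\sqrt{3})$ for the uniform (from $\sigma = (b-a)/\sqrt{12}$). The seed, sample size, parameter values, and helper names are assumptions made for illustration; Laplace draws are generated as the difference of two independent exponentials.

```python
import math
import random


def laplace_coverage(k: float) -> float:
    """P(|X - mu| <= k*sigma) for a Laplace distribution."""
    return 1.0 - math.exp(-k * math.sqrt(2.0))


def uniform_coverage(k: float) -> float:
    """P(|X - mu| <= k*sigma) for a uniform distribution."""
    return min(1.0, k / math.sqrt(3.0))


def simulated_coverage(samples, mu, sigma, k):
    """Fraction of samples within k standard deviations of mu."""
    return sum(abs(x - mu) <= k * sigma for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
b = 1.0
# Difference of two independent exponentials with mean b is Laplace(0, b).
laplace = [rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b) for _ in range(n)]
# Unif(0, 1) has mean 0.5 and standard deviation 1/sqrt(12).
uniform = [rng.random() for _ in range(n)]

for k in (1, 2):
    lap_sim = simulated_coverage(laplace, 0.0, b * math.sqrt(2.0), k)
    uni_sim = simulated_coverage(uniform, 0.5, 1.0 / math.sqrt(12.0), k)
    print(f"k = {k}: Laplace {laplace_coverage(k):.4f} (sim {lap_sim:.4f}), "
          f"Uniform {uniform_coverage(k):.4f} (sim {uni_sim:.4f})")
```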

Specific Applications

Six Sigma manufacturing — Defect rates are computed using $P(|Z| > 6)$ from the empirical rule.
Process capability indices — $C_p$ and $C_{pk}$ are defined in terms of $k\sigma$ tolerance limits.
Outlier detection — Values beyond $3\sigma$ are flagged as potential outliers (0.27% false positive rate for normal data).
Standardized testing — IQ scores ( $\mu=100, \sigma=15$ ): 68% score between 85–115, 95% between 70–130.
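
The outlier-detection use can be sketched as a simple threshold on distance from the mean. This example assumes the reference mean and standard deviation are known in advance (here the IQ parameters from the list above) rather than estimated from the data being screened; the function name `flag_outliers` and the sample scores are made up for illustration.

```python
def flag_outliers(values, mu, sigma, k=3.0):
    """Return the values lying more than k standard deviations from mu."""
    return [x for x in values if abs(x - mu) > k * sigma]


# Hypothetical IQ scores screened against mu = 100, sigma = 15.
iq_scores = [85, 100, 115, 130, 150, 52]
print(flag_outliers(iq_scores, mu=100, sigma=15))  # prints [150, 52]
```

When $\mu$ and $\sigma$ are instead estimated from the same small sample, a single extreme value inflates the estimated $\sigma$ and can mask itself, so the $3\sigma$ threshold is most reliable with known or robustly estimated parameters.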

Key Takeaways

Summary: Empirical Rule

68.27% within $\mu \pm \sigma$ , 95.45% within $\mu \pm 2\sigma$ , 99.73% within $\mu \pm 3\sigma$
Exact formula: $P(|Z| \leq k) = 2\Phi(k) - 1 = \text{erf}(k/\sqrt{2})$
Applies only to normal distributions; for arbitrary distributions use Chebyshev ( $\geq 1-1/k^2$ ) or Vysochanskii-Petunin ( $\geq 1-4/(9k^2)$ for unimodal)
Six Sigma: $P(|Z| > 6) \approx 2 \times 10^{-9}$ defects per opportunity
Foundation for outlier detection, process control, and standardized testing
