Variance of a Random Variable

Measuring Spread — How Far Values Deviate from the Mean

Variance quantifies the average squared deviation of a random variable from its mean. It is one of the most widely used measures of dispersion in statistics.

Foundation: variance underpins standard deviation, covariance, correlation, and many statistical tests
Chebyshev's inequality: bounds tail probabilities using only the mean and variance
Portfolio theory: in mean-variance portfolio analysis, the variance of returns serves as the measure of risk that investors seek to minimize for a given expected return
Quality control: Six Sigma aims to reduce process variance so that defects become rare

Without variance, we cannot quantify uncertainty — and without uncertainty, statistics has no purpose.

What is Variance?

Definition

Variance is the expected squared deviation of a random variable from its mean. It measures the average spread of the distribution around its center.

"The variance is the moment of inertia of the probability distribution about its center of mass." — Persi Diaconis

Mathematical Formulation

Definition of Variance

\text{Var}(X) = E\!\left[(X - \mu)^2\right] \quad \text{where } \mu = E[X]

Here,

$X$ = random variable
$\mu$ = mean (expected value) of $X$
$(X - \mu)^2$ = squared deviation from the mean

Computational Formula

\text{Var}(X) = E[X^2] - (E[X])^2

Here,

$E[X^2]$ = second raw moment of $X$
$(E[X])^2$ = square of the first moment

Derivation of the Computational Formula

Expand the definition directly:

E[(X-\mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu\,E[X] + \mu^2 = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - \mu^2.

This identity is essential for computation: you only need the first two raw moments.
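As a quick check of the identity, the sketch below evaluates both forms on a small discrete distribution using exact rational arithmetic. The PMF values and the helper name `expectation` are illustrative assumptions, not part of the derivation above.

```python
from fractions import Fraction

# Hypothetical PMF chosen for illustration: value -> probability
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 4: Fraction(1, 4)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete PMF given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expectation(pmf)

# Definition: E[(X - mu)^2]
var_definition = expectation(pmf, lambda x: (x - mu) ** 2)

# Computational formula: E[X^2] - (E[X])^2
var_computational = expectation(pmf, lambda x: x ** 2) - mu ** 2

print(var_definition, var_computational)
```

Using `Fraction` avoids floating-point rounding, so the two results can be compared with exact equality.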

Properties of Variance

Theorem: Properties of Variance

Let $X$ be a random variable with $\text{Var}(X) < \infty$, and let $a, b$ be constants. Then:

(i) $\text{Var}(aX + b) = a^2\,\text{Var}(X)$

(ii) $\text{Var}(X) = 0$ if and only if $X$ is almost surely (a.s.) constant

(iii) If $X$ and $Y$ are independent and $\text{Var}(Y) < \infty$, then $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$

Proof Sketch of (i)

\text{Var}(aX+b) = E\!\left[((aX+b) - (a\mu+b))^2\right] = E\!\left[a^2(X-\mu)^2\right] = a^2\,\text{Var}(X).

The shift $b$ cancels inside the squared deviation — translation does not affect spread.
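Property (i) can also be checked exactly on a small example. The sketch below is one possible check, assuming an illustrative PMF and illustrative constants `a` and `b`; the helper `variance` is a hypothetical name introduced here.

```python
from fractions import Fraction

# Illustrative PMF (value -> probability) and constants
pmf = {1: Fraction(1, 5), 2: Fraction(1, 2), 3: Fraction(3, 10)}
a, b = 3, 5

def variance(pmf):
    """Var(X) = E[(X - mu)^2] for a discrete PMF."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# PMF of aX + b: same probabilities on transformed values
# (a != 0, so distinct values of X stay distinct)
pmf_transformed = {a * x + b: p for x, p in pmf.items()}

lhs = variance(pmf_transformed)   # Var(aX + b)
rhs = a ** 2 * variance(pmf)      # a^2 Var(X)
shift_only = variance({x + b: p for x, p in pmf.items()})  # Var(X + b)

print(lhs, rhs, shift_only)
```

The shift-only case returns the same value as `variance(pmf)`, which is the translation invariance stated above.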

Proof Sketch of (iii)

\text{Var}(X+Y) = E[(X+Y)^2] - (E[X+Y])^2.

Expand: $E[X^2 + 2XY + Y^2] = E[X^2] + 2E[X]E[Y] + E[Y^2]$ (using independence, $E[XY]=E[X]E[Y]$). Subtracting $(E[X]+E[Y])^2 = (E[X])^2 + 2E[X]E[Y] + (E[Y])^2$ yields $\text{Var}(X)+\text{Var}(Y)$.

Non-Independence Case

For any $X, Y$ with finite variances, independent or not:

\text{Var}(X+Y) = \text{Var}(X) + \text{Var}(Y) + 2\,\text{Cov}(X,Y).

Independence implies $\text{Cov}(X,Y)=0$, which recovers property (iii), but the converse is false: zero covariance does not imply independence.
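A standard counterexample makes the last point concrete: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The sketch below, with the hypothetical helper `E`, computes the covariance exactly and then shows that the joint probabilities do not factor.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X^2, so Y is completely determined by X
third = Fraction(1, 3)
joint = {(x, x * x): third for x in (-1, 0, 1)}  # joint PMF of (X, Y)

def E(g):
    """Expectation of g(X, Y) under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Independence would require P(X=x, Y=y) = P(X=x) P(Y=y) for every pair
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_joint_00 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov)
print("P(X=0, Y=0) =", p_joint_00, "vs P(X=0) P(Y=0) =", p_x0 * p_y0)
```

The covariance is zero because $E[XY] = E[X^3] = 0$ and $E[X] = 0$, yet the joint probability at $(0, 0)$ differs from the product of the marginals.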

Standard Deviation

The standard deviation $\sigma = \sqrt{\text{Var}(X)}$ restores the original scale of measurement, making it interpretable in the same units as $X$.

Standard Deviation

\sigma = \sqrt{\text{Var}(X)}, \quad \sigma^2 = \text{Var}(X)

Here,

$\sigma$ = standard deviation of $X$
$\sigma^2$ = variance (square of the standard deviation)

Chebyshev's Inequality

Theorem: Chebyshev's Inequality

For any random variable $X$ with finite mean $\mu$ and finite variance $\sigma^2 > 0$, and for any $k > 0$:

P\!\left(|X - \mu| \geq k\sigma\right) \leq \frac{1}{k^2}.

Proof Sketch

Let $Y = (X-\mu)^2$. Then $E[Y] = \sigma^2$. Since $Y \geq 0$, Markov's inequality applies to $Y$ with threshold $a = (k\sigma)^2$:

P(Y \geq a) \leq \frac{E[Y]}{a} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}.

But $Y \geq a$ is equivalent to $|X-\mu| \geq k\sigma$, which gives the claim.

This inequality is remarkably general: it requires no assumption about the shape of the distribution beyond a finite variance. For $k=2$, it says at most $25\%$ of the probability mass lies 2 or more standard deviations from the mean.

$k$	Maximum probability beyond $k\sigma$	Practical Meaning
1	100%	Trivial bound
2	25%	At least 75% within 2 SD
3	11.1%	At least 88.9% within 3 SD
4	6.25%	At least 93.75% within 4 SD
5	4%	At least 96% within 5 SD
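The bounds in the table follow directly from $1/k^2$. A minimal sketch that regenerates them is shown below; the function name `chebyshev_bound` is a hypothetical helper, and the cap at 1 simply reflects that a probability cannot exceed 1 when $k < 1$.

```python
def chebyshev_bound(k):
    """Upper bound on P(|X - mu| >= k*sigma) from Chebyshev's inequality."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k < 1 the raw value 1/k^2 exceeds 1, so the bound is uninformative
    return min(1.0, 1.0 / k ** 2)

rows = [(k, chebyshev_bound(k), 1.0 - chebyshev_bound(k)) for k in range(1, 6)]

for k, beyond, within in rows:
    print(f"k={k}: at most {beyond:.2%} beyond, at least {within:.2%} within")
```

Because these are worst-case bounds over all distributions with finite variance, the actual tail probability for a specific distribution is often much smaller.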

Worked Example: Discrete Random Variable

Example: Finding Variance from a PMF

Let $X$ have PMF: $P(X=1) = 0.2$, $P(X=2) = 0.5$, $P(X=3) = 0.3$.

Step 1: Compute $E[X]$:

E[X] = 1(0.2) + 2(0.5) + 3(0.3) = 0.2 + 1.0 + 0.9 = 2.1.

Step 2: Compute $E[X^2]$:

E[X^2] = 1^2(0.2) + 2^2(0.5) + 3^2(0.3) = 0.2 + 2.0 + 2.7 = 4.9.

Step 3: Apply the computational formula:

\text{Var}(X) = 4.9 - (2.1)^2 = 4.9 - 4.41 = 0.49.

Step 4: Standard deviation: $\sigma = \sqrt{0.49} = 0.7$.

Verification via definition: $E[(X-\mu)^2] = (1-2.1)^2(0.2) + (2-2.1)^2(0.5) + (3-2.1)^2(0.3) = 1.21(0.2) + 0.01(0.5) + 0.81(0.3) = 0.242 + 0.005 + 0.243 = 0.49.$ ✓
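The same four steps can be reproduced in a few lines of plain Python. This is a sketch of the hand calculation above, with variable names chosen here for illustration.

```python
import math

values = [1, 2, 3]
probs = [0.2, 0.5, 0.3]

# A valid PMF must sum to 1
assert math.isclose(sum(probs), 1.0)

mean = sum(x * p for x, p in zip(values, probs))                # Step 1: E[X]
second_moment = sum(x ** 2 * p for x, p in zip(values, probs))  # Step 2: E[X^2]
variance = second_moment - mean ** 2                            # Step 3: computational formula
std_dev = math.sqrt(variance)                                   # Step 4: standard deviation

# Cross-check against the definition E[(X - mu)^2]
variance_by_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

print(f"E[X] = {mean:.2f}, E[X^2] = {second_moment:.2f}")
print(f"Var(X) = {variance:.2f}, sigma = {std_dev:.2f}")
```

Floating-point arithmetic introduces tiny rounding differences, so the two variance values should be compared with a tolerance rather than with `==`.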

Worked Example: Continuous Random Variable

Example: Variance of Uniform(a,b)

Let $X \sim \text{Uniform}(a, b)$ with $f(x) = \frac{1}{b-a}$ for $x \in [a,b]$.

Step 1: $E[X] = \frac{a+b}{2}$.

Step 2: $E[X^2] = \int_a^b \frac{x^2}{b-a}\,dx = \frac{1}{b-a}\cdot\frac{x^3}{3}\Big|_a^b = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2+ab+b^2}{3}.$

Step 3:

\text{Var}(X) = \frac{a^2+ab+b^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{4(a^2+ab+b^2) - 3(a+b)^2}{12} = \frac{(b-a)^2}{12}.

This elegant result shows variance depends only on the width $b-a$ , not the location — consistent with the translation invariance property.
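The closed form can be sanity-checked numerically. The sketch below approximates the two integrals with a midpoint Riemann sum; the function name `uniform_variance_numeric`, the endpoints, and the number of subintervals are illustrative assumptions.

```python
def uniform_variance_numeric(a, b, n=100_000):
    """Approximate Var(X) for X ~ Uniform(a, b) with a midpoint Riemann sum."""
    width = (b - a) / n
    density = 1.0 / (b - a)
    mean = 0.0
    second_moment = 0.0
    for i in range(n):
        x = a + (i + 0.5) * width                 # midpoint of the i-th subinterval
        mean += x * density * width               # accumulates E[X]
        second_moment += x * x * density * width  # accumulates E[X^2]
    return second_moment - mean ** 2

a, b = 2.0, 5.0  # illustrative endpoints
numeric = uniform_variance_numeric(a, b)
closed_form = (b - a) ** 2 / 12

# Shifting the interval by 10 keeps the width, so the variance should not change
shifted = uniform_variance_numeric(a + 10, b + 10)

print(f"numeric = {numeric:.6f}, closed form = {closed_form:.6f}, shifted = {shifted:.6f}")
```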

Worked Example: Real Data — Exam Scores

Example: Variance of Exam Scores

A class of 10 students scored: $\{72, 85, 91, 68, 78, 94, 82, 76, 88, 80\}$.

Step 1: Compute the mean:

\bar{x} = \frac{72 + 85 + 91 + 68 + 78 + 94 + 82 + 76 + 88 + 80}{10} = \frac{814}{10} = 81.4

Step 2: Compute squared deviations:

Score $x_i$	$x_i - \bar{x}$	$(x_i - \bar{x})^2$
72	-9.4	88.36
85	3.6	12.96
91	9.6	92.16
68	-13.4	179.56
78	-3.4	11.56
94	12.6	158.76
82	0.6	0.36
76	-5.4	29.16
88	6.6	43.56
80	-1.4	1.96

Step 3: Compute the sample variance (dividing by $n-1$, which treats the 10 scores as a sample from a larger population):

s^2 = \frac{\sum(x_i - \bar{x})^2}{n-1} = \frac{618.40}{9} \approx 68.71

Step 4: Standard deviation: $s = \sqrt{68.71} \approx 8.29$

Interpretation: Scores typically deviate about 8.3 points from the class average of 81.4.

Python Implementation

import numpy as np

np.random.seed(42)

# Demonstrate variance properties with a Bernoulli(p) random variable
p = 0.6
n = 10000
samples = np.random.binomial(1, p, size=n)

# Empirical variance vs theoretical
empirical_var = np.var(samples, ddof=0)
theoretical_var = p * (1 - p)
print(f"Bernoulli(p={p}): empirical Var = {empirical_var:.4f}, theoretical Var = {theoretical_var:.4f}")

# Show Var(aX + b) = a^2 Var(X)
a, b_const = 3, 5
transformed = a * samples + b_const
print(f"Var({a}X + {b_const}) = {np.var(transformed, ddof=0):.4f}")
print(f"{a}^2 * Var(X)      = {a**2 * empirical_var:.4f}")

# Sum of independent RVs: Var(X+Y) = Var(X) + Var(Y)
samples_y = np.random.binomial(1, 0.3, size=n)
sum_var = np.var(samples + samples_y, ddof=0)
print(f"Var(X+Y) = {sum_var:.4f}")
print(f"Var(X) + Var(Y) = {np.var(samples, ddof=0) + np.var(samples_y, ddof=0):.4f}")

Python Implementation: Chebyshev Verification

import numpy as np

np.random.seed(42)

# Use an exponential distribution (skewed, not normal) to test Chebyshev
lam = 1.0
n = 100000
samples = np.random.exponential(1/lam, size=n)

mu = np.mean(samples)
sigma = np.std(samples)

# Empirical P(|X - mu| >= k*sigma) vs Chebyshev bound 1/k^2
for k in [1.5, 2, 3, 4]:
    empirical = np.mean(np.abs(samples - mu) >= k * sigma)
    bound = 1 / k**2
    print(f"k={k}: empirical P = {empirical:.4f}, Chebyshev bound = {bound:.4f}")

Python Implementation: Real Data Example

import numpy as np

# Exam scores from worked example
scores = np.array([72, 85, 91, 68, 78, 94, 82, 76, 88, 80])

# Population variance (divide by n)
pop_var = np.var(scores)
# Sample variance (divide by n-1)
sample_var = np.var(scores, ddof=1)

print(f"Mean: {np.mean(scores):.1f}")
print(f"Population variance: {pop_var:.2f}")
print(f"Sample variance:     {sample_var:.2f}")
print(f"Standard deviation:  {np.std(scores, ddof=1):.2f}")

# Manual computation for verification
mean = np.mean(scores)
manual_var = np.sum((scores - mean)**2) / (len(scores) - 1)
print(f"\nManual computation: {manual_var:.2f}")

Variance of Common Distributions

Reference Table

Distribution	PMF/PDF	$E[X]$	$\text{Var}(X)$
Bernoulli $(p)$	$p^x(1-p)^{1-x}$	$p$	$p(1-p)$
Binomial $(n,p)$	$\binom{n}{k}p^k(1-p)^{n-k}$	$np$	$np(1-p)$
Geometric $(p)$	$(1-p)^{k-1}p$	$1/p$	$(1-p)/p^2$
Poisson $(\lambda)$	$\frac{\lambda^k e^{-\lambda}}{k!}$	$\lambda$	$\lambda$
Uniform $(a,b)$	$\frac{1}{b-a}$	$\frac{a+b}{2}$	$\frac{(b-a)^2}{12}$
Normal $(\mu,\sigma^2)$	$\frac{1}{\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)}$	$\mu$	$\sigma^2$
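The discrete rows of the table can be checked by summing over the PMF directly. The sketch below does this for the binomial, geometric, and Poisson cases using only the standard library; the parameter values and the helper `moments` are illustrative assumptions, and infinite supports are truncated where the remaining tail mass is negligible.

```python
import math

def moments(pmf_pairs):
    """Return (mean, variance) from an iterable of (value, probability) pairs."""
    pairs = list(pmf_pairs)
    mean = sum(k * p for k, p in pairs)
    var = sum((k - mean) ** 2 * p for k, p in pairs)
    return mean, var

n, p, lam = 10, 0.3, 4.0  # illustrative parameter values

binomial = ((k, math.comb(n, k) * p**k * (1 - p) ** (n - k)) for k in range(n + 1))
# Truncated supports: the omitted tails are far below floating-point precision here
geometric = ((k, (1 - p) ** (k - 1) * p) for k in range(1, 400))
poisson = ((k, lam**k * math.exp(-lam) / math.factorial(k)) for k in range(100))

# Each entry pairs the computed (mean, variance) with the table's closed form
results = {
    "Binomial": (moments(binomial), (n * p, n * p * (1 - p))),
    "Geometric": (moments(geometric), (1 / p, (1 - p) / p**2)),
    "Poisson": (moments(poisson), (lam, lam)),
}

for name, ((m, v), (m_ref, v_ref)) in results.items():
    print(f"{name}: mean {m:.4f} (table {m_ref:.4f}), var {v:.4f} (table {v_ref:.4f})")
```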

Variance in Machine Learning

ML Application	Variance Usage	Why It Matters / Examples
Bias-variance tradeoff	Variance of model predictions across training sets	High variance indicates overfitting
Feature selection	Variance threshold	Near-constant features carry little information and can be removed
Ensemble methods	Reduce variance via averaging	Bagging, random forests
Regularization	Shrink coefficients to lower the variance of the fitted model	Ridge, Lasso regression
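The "reduce variance via averaging" row follows from properties (i) and (iii): the mean of $n$ independent predictions, each with variance $\sigma^2$, has variance $\sigma^2/n$. The simulation below is a minimal sketch of that idealized case, assuming independent normally distributed predictions; the values of `sigma2`, `n_models`, and `n_trials` are illustrative. Real ensemble members are usually correlated, so the reduction seen in practice is smaller.

```python
import numpy as np

rng = np.random.default_rng(0)

sigma2 = 4.0       # variance of a single model's prediction (illustrative)
n_models = 25      # ensemble size (illustrative)
n_trials = 50_000  # number of simulated prediction rounds

# Each row holds n_models independent predictions scattered around the same target 0
predictions = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(n_trials, n_models))

single_var = predictions[:, 0].var()            # variance of one model alone
ensemble_var = predictions.mean(axis=1).var()   # variance of the averaged prediction

print(f"single model variance  = {single_var:.3f}")
print(f"ensemble mean variance = {ensemble_var:.3f} (theory: {sigma2 / n_models:.3f})")
```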

Key Takeaways

Variance measures spread: $\text{Var}(X) = E[(X-\mu)^2] = E[X^2] - (E[X])^2$

Translation invariant: $\text{Var}(X+b) = \text{Var}(X)$; scale equivariant: $\text{Var}(aX) = a^2\text{Var}(X)$

Independence additivity: $\text{Var}(X+Y) = \text{Var}(X) + \text{Var}(Y)$ when $X \perp Y$

$\text{Var}(X) = 0 \iff X$ is a.s. constant

Chebyshev's inequality bounds tail probabilities using only $\mu$ and $\sigma^2$

Standard deviation $\sigma$ returns to original units; variance $\sigma^2$ is in squared units

"Variance is the price we pay for uncertainty." — Harry Markowitz
