Continuous Random Variables

Probability Theory

From Counts to Measurements — The World of Densities

Continuous random variables take values on uncountable sets — heights, weights, times, temperatures. Every individual outcome has probability zero; only intervals carry probability.

Heights — $P(X = 170 \text{ cm}) = 0$ , but $P(168 < X < 172) > 0$
Time — the exact moment of an event has zero probability
Temperature — measured on a continuous scale
Money — can be modeled continuously for large amounts

The density function $f(x)$ is not a probability — it is a rate of probability accumulation.

Core Concepts

Formally, a continuous random variable takes values in an uncountable set, typically an interval of $\mathbb{R}$. Because every individual outcome has probability zero, probability can only be assigned to sets such as intervals. This makes the density function the fundamental object of study.

Definition: Probability Density Function (PDF)

A function $f: \mathbb{R} \to [0, \infty)$ is the PDF of a continuous random variable $X$ if:

P(X \in A) = \int_A f(x)\,dx \quad \text{for every Borel set } A \subseteq \mathbb{R}.

In particular, $f$ satisfies: (i) $f(x) \geq 0$ for all $x$, and (ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$. Conversely, any function with these two properties is the PDF of some continuous random variable.

PDF vs Probability

The PDF value $f(x)$ is not a probability — it is a density. It is possible for $f(x) > 1$ as long as the total integral is 1. For example, $X \sim \text{Uniform}(0, 0.1)$ has $f(x) = 10$ for $x \in [0, 0.1]$ . The probability of any single point is always zero: $P(X = c) = \int_c^c f(x)\,dx = 0$ .
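The Uniform(0, 0.1) example can be checked numerically. The sketch below is illustrative only: the helper name `uniform_pdf` and the midpoint-rule grid size are assumptions made for this example.

```python
import numpy as np

def uniform_pdf(x, low=0.0, high=0.1):
    """Density of Uniform(low, high): 1/(high - low) on [low, high], 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= low) & (x <= high), 1.0 / (high - low), 0.0)

# The density at an interior point is 10, which is larger than 1.
density_at_point = float(uniform_pdf(0.05))

# Midpoint-rule approximation of the integral of f over [0, 0.1].
n_cells = 1000
width = 0.1 / n_cells
midpoints = (np.arange(n_cells) + 0.5) * width
total_area = float(np.sum(uniform_pdf(midpoints)) * width)

# For a constant density, an interval probability is density times length.
prob_subinterval = density_at_point * (0.05 - 0.02)

print(f"f(0.05) = {density_at_point:.1f}")
print(f"integral of f = {total_area:.6f}")
print(f"P(0.02 <= X <= 0.05) = {prob_subinterval:.2f}")
```

The density value exceeds 1, yet the area under the curve is still 1 because the support has length 0.1.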

Cumulative Distribution Function (CDF)

Definition: CDF

The cumulative distribution function of $X$ is:

F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\,dt.

Theorem: Properties of the CDF

For any CDF $F: \mathbb{R} \to [0,1]$ :

(i) $F$ is non-decreasing: $x_1 \leq x_2 \implies F(x_1) \leq F(x_2)$

(ii) $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$

(iii) $F$ is right-continuous: $\lim_{x \to a^+} F(x) = F(a)$

(iv) $P(a < X \leq b) = F(b) - F(a)$

(v) $P(X = c) = 0$ for all $c$ when $X$ is continuous (since $F$ is continuous)

Probability Over an Interval

P(a \leq X \leq b) = \int_a^b f(x)\,dx = F(b) - F(a)

Here,

$f(x)$ = Probability density function
$F(x)$ = Cumulative distribution function
$a, b$ = Interval endpoints with $a < b$
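As a quick numerical check of the interval formula, the sketch below compares the integral of a density with the difference of CDF values. The exponential density with rate 2 and the endpoints 0.5 and 1.5 are assumptions chosen only for illustration.

```python
import numpy as np

lam = 2.0        # illustrative rate parameter (assumption)
a, b = 0.5, 1.5  # illustrative interval endpoints with a < b (assumption)

def pdf(x):
    """Exponential density lam * exp(-lam * x) for x >= 0."""
    return lam * np.exp(-lam * x)

def cdf(x):
    """Exponential CDF 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - np.exp(-lam * x)

# Route 1: midpoint-rule approximation of the integral of f over [a, b].
n_cells = 100_000
width = (b - a) / n_cells
midpoints = a + (np.arange(n_cells) + 0.5) * width
prob_by_integral = float(np.sum(pdf(midpoints)) * width)

# Route 2: difference of CDF values.
prob_by_cdf = float(cdf(b) - cdf(a))

print(f"integral of f over [a, b]: {prob_by_integral:.8f}")
print(f"F(b) - F(a):               {prob_by_cdf:.8f}")
```

Both routes agree up to the discretization error of the midpoint rule.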

The Fundamental Theorem of Calculus Connection

PDF-CDF Relationship

When $f$ is continuous at $x$ , the fundamental theorem of calculus gives:

F'(x) = f(x).

This means the PDF is the derivative of the CDF. In cases where $F$ has jump discontinuities (mixed distributions), we use the generalized derivative, which includes Dirac delta contributions.
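The relationship $F'(x) = f(x)$ can be illustrated with a finite-difference check. This is a minimal sketch assuming a standard normal distribution; the helper names and the step size `h` are illustrative choices.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def central_difference(func, x, h=1e-5):
    """Symmetric finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

points = [-1.5, 0.0, 0.7, 2.0]
for x in points:
    slope = central_difference(normal_cdf, x)
    print(f"x = {x:5.2f}  F'(x) ~ {slope:.8f}  f(x) = {normal_pdf(x):.8f}")

# Largest gap between the numerical slope of F and the density f.
max_error = max(abs(central_difference(normal_cdf, x) - normal_pdf(x)) for x in points)
```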

Expectation

Continuous Expectation

E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx

Here,

$f(x)$ = PDF of $X$
$E[X]$ = Expected value (first moment)

Derivation of the Change of Variables Formula

For a measurable function $g$ with $\int_{-\infty}^{\infty} |g(x)|\,f(x)\,dx < \infty$, the law of the unconscious statistician states:

E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx.

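Proof sketch: For a simple function $g = \sum_i c_i \mathbf{1}_{A_i}$, this follows from the definition of the integral. For nonnegative $g$, approximate from below by simple functions and use monotone convergence. For general $g$, split into positive and negative parts, $g = g^+ - g^-$, and apply the nonnegative case to each.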

This is powerful: to find $E[g(X)]$ , you don't need the distribution of $Y = g(X)$ — you integrate $g(x)$ against the PDF of $X$ directly.
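The sketch below illustrates this point by computing $E[g(X)]$ two ways: integrating $g(x) f(x)$ directly, and averaging $g$ over simulated draws of $X$. The exponential distribution with rate 2, the choice $g(x) = x^2$, the truncation point, and the seed are all assumptions made for illustration.

```python
import numpy as np

lam = 2.0  # illustrative rate parameter (assumption)

def g(x):
    """Illustrative transformation g(x) = x^2 (assumption)."""
    return x ** 2

def pdf(x):
    """Exponential density lam * exp(-lam * x) for x >= 0."""
    return lam * np.exp(-lam * x)

# Route 1: integrate g(x) * f(x) with the midpoint rule on [0, 30].
# The tail beyond 30 is negligible for this rate.
n_cells = 300_000
width = 30.0 / n_cells
midpoints = (np.arange(n_cells) + 0.5) * width
lotus_value = float(np.sum(g(midpoints) * pdf(midpoints)) * width)

# Route 2: Monte Carlo average of g over draws of X.
rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0 / lam, size=200_000)
mc_value = float(np.mean(g(samples)))

exact_value = 2.0 / lam ** 2  # second moment of Exp(lam)
print(f"integral of g*f: {lotus_value:.6f}")
print(f"Monte Carlo:     {mc_value:.4f}")
print(f"exact 2/lam^2:   {exact_value:.6f}")
```

Neither route requires deriving the distribution of $Y = g(X)$.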

Variance

Continuous Variance

\text{Var}(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E[X^2] - (E[X])^2

Here,

$\mu = E[X]$ = Mean of $X$
$E[X^2]$ = Second raw moment

The computational formula $\text{Var}(X) = E[X^2] - (E[X])^2$ is identical to the discrete case, derived in the same way from the definition.
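For completeness, the derivation can be written out in a few lines, using only linearity of expectation and $\mu = E[X]$:

```latex
\begin{aligned}
\text{Var}(X) &= E[(X - \mu)^2] \\
&= E[X^2 - 2\mu X + \mu^2] \\
&= E[X^2] - 2\mu\,E[X] + \mu^2 && \text{(linearity of expectation)} \\
&= E[X^2] - 2\mu^2 + \mu^2 && \text{(since } \mu = E[X]\text{)} \\
&= E[X^2] - (E[X])^2.
\end{aligned}
```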

Moments and Moment Generating Functions

k-th Moment

E[X^k] = \int_{-\infty}^{\infty} x^k f(x)\,dx

Here,

$k$ = Moment order (positive integer)

Moment Generating Function

M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx

Here,

$t$ = Real parameter in a neighborhood of 0

Why MGFs Matter

If $M_X(t)$ exists in a neighborhood of $t = 0$ , it uniquely determines the distribution. All moments can be recovered:

E[X^k] = M_X^{(k)}(0) = \frac{d^k}{dt^k}M_X(t)\bigg|_{t=0}.

Furthermore, if $X$ and $Y$ are independent, $M_{X+Y}(t) = M_X(t) \cdot M_Y(t)$ — the convolution becomes multiplication.
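Moment recovery from the MGF can be sketched numerically with finite differences at $t = 0$. The example assumes the exponential MGF $\lambda/(\lambda - t)$ with $\lambda = 2$; the function name and the step size `h` are illustrative choices.

```python
lam = 2.0  # illustrative rate; the MGF below is finite only for t < lam

def mgf_exponential(t):
    """Closed-form MGF of Exp(lam), valid for t < lam."""
    if t >= lam:
        raise ValueError("MGF of Exp(lam) is finite only for t < lam")
    return lam / (lam - t)

h = 1e-3  # finite-difference step (assumption)

# First derivative at 0 approximates E[X].
first_moment = (mgf_exponential(h) - mgf_exponential(-h)) / (2.0 * h)

# Second derivative at 0 approximates E[X^2].
second_moment = (
    mgf_exponential(h) - 2.0 * mgf_exponential(0.0) + mgf_exponential(-h)
) / h ** 2

variance = second_moment - first_moment ** 2

print(f"E[X]   ~ {first_moment:.6f}  (exact: {1 / lam:.6f})")
print(f"E[X^2] ~ {second_moment:.6f}  (exact: {2 / lam ** 2:.6f})")
print(f"Var(X) ~ {variance:.6f}  (exact: {1 / lam ** 2:.6f})")
```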

Quantile Function and Inverse CDF

Definition: Quantile Function

The quantile function (inverse CDF) of $X$ is:

F^{-1}(p) = \inf\{x \in \mathbb{R} : F(x) \geq p\}, \quad p \in (0,1).

It satisfies $P(X \leq F^{-1}(p)) \geq p$ and $P(X \geq F^{-1}(p)) \geq 1-p$ .

Probability Integral Transform

If $X$ has continuous CDF $F$ , then $U = F(X) \sim \text{Uniform}(0,1)$ . Conversely, if $U \sim \text{Uniform}(0,1)$ and $F$ is any CDF, then $X = F^{-1}(U)$ has CDF $F$ . This is the foundation of inverse transform sampling for random variate generation.
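Inverse transform sampling can be sketched as follows. The example assumes an exponential target with rate 2, whose quantile function is $F^{-1}(p) = -\ln(1 - p)/\lambda$; the seed and sample size are arbitrary illustrative choices.

```python
import numpy as np

lam = 2.0  # illustrative rate parameter (assumption)
rng = np.random.default_rng(0)

def exponential_quantile(p):
    """Inverse CDF of Exp(lam): solves 1 - exp(-lam * x) = p for x."""
    return -np.log1p(-p) / lam

# Step 1: draw U ~ Uniform(0, 1).  Step 2: set X = F^{-1}(U).
u = rng.uniform(0.0, 1.0, size=100_000)
x = exponential_quantile(u)

sample_mean = float(np.mean(x))
empirical_cdf_at_half = float(np.mean(x <= 0.5))
theoretical_cdf_at_half = float(1.0 - np.exp(-lam * 0.5))

print(f"sample mean:    {sample_mean:.4f}  (theory: {1 / lam:.4f})")
print(f"P(X <= 0.5):    {empirical_cdf_at_half:.4f}  (theory: {theoretical_cdf_at_half:.4f})")
```

The same two-step recipe applies to any distribution whose quantile function can be evaluated.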

Worked Example: Exponential Distribution

Example: Full Analysis of Exp($\lambda$)

Let $X \sim \text{Exp}(\lambda)$ with rate $\lambda > 0$ and $f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$ (and $f(x) = 0$ for $x < 0$).

CDF: $F(x) = 1 - e^{-\lambda x}$ for $x \geq 0$ .

Mean: $E[X] = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}$ (integration by parts).

Second moment: $E[X^2] = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda^2}$ (two applications of integration by parts).

Variance: $\text{Var}(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$ .

MGF: $M_X(t) = \int_0^{\infty} e^{tx}\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t}$ for $t < \lambda$ .

Memoryless property: $P(X > s+t \mid X > s) = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)$ .

Hazard rate: $h(x) = \frac{f(x)}{1-F(x)} = \frac{\lambda e^{-\lambda x}}{e^{-\lambda x}} = \lambda$ (constant).

This shows the exponential distribution is the continuous analogue of the geometric distribution: both are memoryless with constant hazard rates.
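The memoryless property can also be seen in simulation. In this sketch the rate, the times `s` and `t`, the seed, and the sample size are all illustrative assumptions.

```python
import numpy as np

lam, s, t = 2.0, 0.5, 0.3  # illustrative rate and time offsets (assumptions)
rng = np.random.default_rng(1)
samples = rng.exponential(scale=1.0 / lam, size=200_000)

# Conditional probability P(X > s + t | X > s), estimated from the
# draws that have already survived past s.
survivors = samples[samples > s]
conditional = float(np.mean(survivors > s + t))

# Unconditional probability P(X > t), estimated and exact.
unconditional_empirical = float(np.mean(samples > t))
unconditional_theory = float(np.exp(-lam * t))

print(f"P(X > s+t | X > s) ~ {conditional:.4f}")
print(f"P(X > t)           ~ {unconditional_empirical:.4f}")
print(f"exp(-lam * t)      = {unconditional_theory:.4f}")
```

Having already waited `s` units does not change the distribution of the remaining waiting time.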

Worked Example: Beta Distribution

Example: Beta$(\alpha, \beta)$ on [0,1]

Let $X \sim \text{Beta}(\alpha, \beta)$ with $f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$ for $x \in [0,1]$ , where $B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$ .

Mean: $E[X] = \frac{\alpha}{\alpha+\beta}$ .

Variance: $\text{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ .

Special cases:

$\text{Beta}(1,1) = \text{Uniform}(0,1)$
$\text{Beta}(\alpha, \alpha)$ is symmetric about $1/2$ for all $\alpha > 0$
As $\alpha, \beta \to \infty$ with $\alpha/\beta$ fixed, the distribution concentrates at $\alpha/(\alpha+\beta)$

The Beta distribution is the conjugate prior for the Binomial likelihood in Bayesian inference.
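The mean and variance formulas for the Beta distribution are stated above without derivation. The sketch below checks them numerically by integrating the density. The parameter values $\alpha = 2$, $\beta = 5$ and the midpoint-rule grid are illustrative assumptions.

```python
import math
import numpy as np

alpha, beta = 2.0, 5.0  # illustrative shape parameters (assumption)

def beta_pdf(x):
    """Beta(alpha, beta) density with B(alpha, beta) built from gamma functions."""
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / norm

# Midpoint-rule grid on [0, 1].
n_cells = 100_000
width = 1.0 / n_cells
midpoints = (np.arange(n_cells) + 0.5) * width
density = beta_pdf(midpoints)

total_area = float(np.sum(density) * width)
mean_numeric = float(np.sum(midpoints * density) * width)
second_moment = float(np.sum(midpoints ** 2 * density) * width)
var_numeric = second_moment - mean_numeric ** 2

mean_formula = alpha / (alpha + beta)
var_formula = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

print(f"area:     {total_area:.8f}")
print(f"mean:     {mean_numeric:.8f}  (formula: {mean_formula:.8f})")
print(f"variance: {var_numeric:.8f}  (formula: {var_formula:.8f})")
```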

Worked Example: Change of Variables

Example: Linear Transformation

Let $X$ have PDF $f_X(x)$ and let $Y = aX + b$ with $a \neq 0$ . Then:

f_Y(y) = f_X\!\left(\frac{y-b}{a}\right) \cdot \frac{1}{|a|}.

Application: If $X \sim N(\mu, \sigma^2)$ , then $Z = \frac{X-\mu}{\sigma} \sim N(0,1)$ :

f_Z(z) = f_X(\sigma z + \mu) \cdot \sigma = \frac{1}{\sigma\sqrt{2\pi}}e^{-(\sigma z + \mu - \mu)^2/(2\sigma^2)} \cdot \sigma = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}.

This is the standard normal PDF: the Jacobian factor $\sigma$ cancels the $1/\sigma$ in the normalizing constant of $f_X$, leaving $1/\sqrt{2\pi}$.
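The linear transformation formula can be checked pointwise. This sketch assumes a normal $X$ and uses the standard fact that $Y = aX + b$ is then $N(a\mu + b, a^2\sigma^2)$; the parameter values and helper names are illustrative.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

mu, sigma = 1.0, 0.5  # parameters of X (assumptions)
a, b = -2.0, 3.0      # Y = a*X + b with a != 0 (assumptions)

def density_via_formula(y):
    """f_Y(y) = f_X((y - b) / a) / |a|."""
    return normal_pdf((y - b) / a, mu, sigma) / abs(a)

def density_direct(y):
    """Density of N(a*mu + b, a^2 * sigma^2), the known law of Y."""
    return normal_pdf(y, a * mu + b, abs(a) * sigma)

test_points = [-1.0, 0.0, 1.0, 2.5]
for y in test_points:
    print(f"y = {y:5.2f}  formula: {density_via_formula(y):.8f}  direct: {density_direct(y):.8f}")

# Largest disagreement between the two computations.
max_gap = max(abs(density_via_formula(y) - density_direct(y)) for y in test_points)
```

A negative `a` is used on purpose to show why the formula needs the absolute value $|a|$.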

Python Implementation

import numpy as np
from scipy import stats
from scipy.integrate import quad

np.random.seed(42)

# Demonstrate PDF properties with exponential distribution
# (scipy parameterizes expon by scale = 1/lambda)
lam = 2.0
x = np.linspace(0, 4, 1000)
pdf_values = stats.expon.pdf(x, scale=1/lam)
cdf_values = stats.expon.cdf(x, scale=1/lam)

# Verify PDF integrates to 1
integral, _ = quad(lambda t: stats.expon.pdf(t, scale=1/lam), 0, np.inf)
print(f"Exponential(lambda={lam})")
print(f"  PDF integral: {integral:.6f}  (should be 1.0)")

# Verify mean and variance
mean_theory = 1/lam
var_theory = 1/lam**2
print(f"  Mean: {mean_theory:.4f}, Variance: {var_theory:.4f}")

# Verify P(X = c) = 0 for continuous RV
print(f"  P(X = 1.0): {stats.expon.cdf(1.0, scale=1/lam) - stats.expon.cdf(1.0, scale=1/lam):.6f}")

# Demonstrate probability integral transform
samples = np.random.exponential(1/lam, size=5000)
u_samples = stats.expon.cdf(samples, scale=1/lam)
print(f"\nProbability Integral Transform:")
print(f"  Mean of F(X): {np.mean(u_samples):.4f}  (should be 0.5)")
print(f"  Variance of F(X): {np.var(u_samples, ddof=0):.4f}  (should be 1/12 ≈ 0.0833)")

Python Implementation: MGF and Moments

import numpy as np
from scipy import stats
from scipy.integrate import quad
from scipy.special import factorial2

# Compute moments numerically for a standard normal (mu = 0, sigma = 1)

# E[X^k] for k = 1, 2, 3, 4
print("Standard Normal Moments:")
for k in range(1, 5):
    moment, _ = quad(lambda x: x**k * stats.norm.pdf(x), -np.inf, np.inf)
    # Odd moments vanish by symmetry; even moments equal the double factorial (k-1)!!
    theory = 0 if k % 2 == 1 else factorial2(k - 1, exact=True)
    print(f"  E[X^{k}] = {moment:.6f}  (theoretical: {theory})")

# Verify MGF: M_X(t) = exp(t^2/2) for standard normal
t_values = [0.1, 0.5, 1.0]
print("\nStandard Normal MGF:")
for t in t_values:
    mgf_numerical, _ = quad(lambda x: np.exp(t*x) * stats.norm.pdf(x), -np.inf, np.inf)
    mgf_theory = np.exp(t**2 / 2)
    print(f"  M({t}) = {mgf_numerical:.6f}  (theoretical: {mgf_theory:.6f})")

# Change of variables: if X ~ N(0,1), then Y = 2X + 3 ~ N(3, 4)
np.random.seed(42)  # fixed seed so the printed estimates are reproducible
samples_x = np.random.standard_normal(10000)
samples_y = 2 * samples_x + 3
print(f"\nLinear transformation Y = 2X + 3:")
print(f"  E[Y] = {np.mean(samples_y):.4f}  (theoretical: 3)")
print(f"  Var(Y) = {np.var(samples_y, ddof=0):.4f}  (theoretical: 4)")

Key Takeaways

Summary: Continuous Random Variables

PDF $f(x) \geq 0$ with $\int f(x)\,dx = 1$ ; probability is area under the curve: $P(a \leq X \leq b) = \int_a^b f(x)\,dx$
CDF: $F(x) = P(X \leq x) = \int_{-\infty}^x f(t)\,dt$ ; satisfies $F'(x) = f(x)$ where $f$ is continuous
$P(X = c) = 0$ for any exact value $c$ — probability is only meaningful over intervals
Expectation: $E[g(X)] = \int g(x)f(x)\,dx$ (law of the unconscious statistician)
Variance: $\text{Var}(X) = E[X^2] - (E[X])^2$ (same formula as discrete case)
MGF: $M_X(t) = E[e^{tX}]$ uniquely determines the distribution and converts convolution to multiplication
Probability integral transform: $F(X) \sim \text{Uniform}(0,1)$ — the basis for random variate generation
Change of variables: $f_Y(y) = f_X(g^{-1}(y)) \cdot \left|\frac{d}{dy}g^{-1}(y)\right|$ for $Y = g(X)$
