
Continuous Random Variables — PDF and CDF


From Counts to Measurements — The World of Densities

Continuous random variables take values on uncountable sets — heights, weights, times, temperatures. Every individual outcome has probability zero; only intervals carry probability.

  • Heights: $P(X = 170 \text{ cm}) = 0$, but $P(168 < X < 172) > 0$
  • Time: the exact moment of an event has zero probability
  • Temperature: measured on a continuous scale
  • Money: can be modeled continuously for large amounts

The density function $f(x)$ is not a probability; it is a rate of probability accumulation.


Core Concepts

Continuous random variables take values in an uncountable set (typically an interval of $\mathbb{R}$). Unlike discrete random variables, every individual outcome has probability zero, so probability is only meaningful over intervals. This makes the density function the fundamental object of study.

Definition: Probability Density Function (PDF)

A function $f: \mathbb{R} \to [0, \infty)$ is the PDF of a continuous random variable $X$ if:

$$P(X \in A) = \int_A f(x)\,dx \quad \text{for every Borel set } A \subseteq \mathbb{R}.$$

In particular, $f$ satisfies: (i) $f(x) \geq 0$ for all $x$, and (ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$. Conversely, any function with these two properties is the PDF of some continuous random variable.

PDF vs Probability

The PDF value $f(x)$ is not a probability; it is a density. It is possible for $f(x) > 1$ as long as the total integral is 1. For example, $X \sim \text{Uniform}(0, 0.1)$ has $f(x) = 10$ for $x \in [0, 0.1]$. The probability of any single point is always zero: $P(X = c) = \int_c^c f(x)\,dx = 0$.
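The Uniform(0, 0.1) example can be checked numerically. The sketch below assumes SciPy's `loc`/`scale` parameterization of the uniform distribution (support `[loc, loc + scale]`); the variable names are illustrative only.

```python
from scipy import stats
from scipy.integrate import quad

# Uniform(0, 0.1): scipy parameterizes the uniform as [loc, loc + scale]
X = stats.uniform(loc=0.0, scale=0.1)

# A density value: allowed to exceed 1
density_at_midpoint = X.pdf(0.05)

# Total area under the density over its support
total_area, _ = quad(X.pdf, 0.0, 0.1)

# An actual probability: the area over the interval [0.04, 0.06]
prob_small_interval = X.cdf(0.06) - X.cdf(0.04)

print(f"f(0.05)               = {density_at_midpoint:.4f}")
print(f"integral of f         = {total_area:.4f}")
print(f"P(0.04 <= X <= 0.06)  = {prob_small_interval:.4f}")
```

The density is 10 everywhere on the support, yet every probability computed from it stays between 0 and 1.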


Cumulative Distribution Function (CDF)

Definition: CDF

The cumulative distribution function of $X$ is:

$$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\,dt.$$

Theorem: Properties of the CDF

For any CDF $F: \mathbb{R} \to [0,1]$:

(i) $F$ is non-decreasing: $x_1 \leq x_2 \implies F(x_1) \leq F(x_2)$

(ii) $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$

(iii) $F$ is right-continuous: $\lim_{x \to a^+} F(x) = F(a)$

(iv) $P(a < X \leq b) = F(b) - F(a)$

(v) $P(X = c) = 0$ for all $c$ when $X$ is continuous (since $F$ is continuous)

Probability Over an Interval

$$P(a \leq X \leq b) = \int_a^b f(x)\,dx = F(b) - F(a)$$

Here,

  • $f(x)$ = probability density function
  • $F(x)$ = cumulative distribution function
  • $a, b$ = interval endpoints with $a < b$
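As a quick illustration of the interval formula, the following sketch computes the same probability two ways for a standard normal: by integrating the PDF and by differencing the CDF. The choice of distribution and endpoints is an assumption made for the example.

```python
from scipy import stats
from scipy.integrate import quad

a, b = -1.0, 1.0
X = stats.norm(loc=0.0, scale=1.0)

# Route 1: area under the density between a and b
area_under_pdf, _ = quad(X.pdf, a, b)

# Route 2: difference of CDF values
cdf_difference = X.cdf(b) - X.cdf(a)

print(f"integral of f over [a, b]: {area_under_pdf:.6f}")
print(f"F(b) - F(a):               {cdf_difference:.6f}")
```

Both routes give roughly 0.6827, the familiar "within one standard deviation" probability.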

The Fundamental Theorem of Calculus Connection

PDF-CDF Relationship

When $f$ is continuous at $x$, the fundamental theorem of calculus gives:

$$F'(x) = f(x).$$

This means the PDF is the derivative of the CDF. In cases where $F$ has jump discontinuities (mixed distributions), we use the generalized derivative, which includes Dirac delta contributions.
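One way to see the relationship $F'(x) = f(x)$ is to differentiate the CDF numerically and compare with the PDF. This is a minimal sketch using a central difference on the standard normal; the step size `h` and the grid are arbitrary choices.

```python
import numpy as np
from scipy import stats

X = stats.norm()
x = np.linspace(-3.0, 3.0, 13)
h = 1e-5  # step for the central difference

# Central-difference approximation of F'(x)
numerical_derivative = (X.cdf(x + h) - X.cdf(x - h)) / (2 * h)

# Largest discrepancy between F'(x) and f(x) on the grid
max_abs_error = np.max(np.abs(numerical_derivative - X.pdf(x)))
print(f"max |F'(x) - f(x)| on the grid: {max_abs_error:.2e}")
```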


Expectation

Continuous Expectation

$$E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

Here,

  • $f(x)$ = PDF of $X$
  • $E[X]$ = expected value (first moment)

Derivation of the Change of Variables Formula

For a measurable function $g$ with $\int_{-\infty}^{\infty} |g(x)|\,f(x)\,dx < \infty$, the law of the unconscious statistician states:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx.$$

Proof sketch: For a simple function $g = \sum_i c_i \mathbf{1}_{A_i}$, this follows from the definition of the integral. For nonnegative $g$, approximate by simple functions and use monotone convergence. For general $g$, split into positive and negative parts.

This is powerful: to find $E[g(X)]$, you don't need the distribution of $Y = g(X)$. You integrate $g(x)$ against the PDF of $X$ directly.
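To illustrate, the sketch below takes $X \sim \text{Uniform}(0,1)$ and $g(x) = x^2$ (both chosen for the example). It computes $E[g(X)]$ by integrating against the PDF of $X$, and again the long way using the density of $Y = X^2$, which is $f_Y(y) = 1/(2\sqrt{y})$ on $(0,1)$.

```python
import numpy as np
from scipy.integrate import quad

def g(x):
    return x**2

def f_X(x):
    # Uniform(0, 1) density on its support
    return 1.0

def f_Y(y):
    # Density of Y = X^2 on (0, 1), derived by change of variables
    return 1.0 / (2.0 * np.sqrt(y))

# Law of the unconscious statistician: integrate g against f_X
lotus_value, _ = quad(lambda x: g(x) * f_X(x), 0.0, 1.0)

# Direct route: integrate y against the density of Y
direct_value, _ = quad(lambda y: y * f_Y(y), 0.0, 1.0)

print(f"E[g(X)] via f_X: {lotus_value:.6f}")
print(f"E[Y]    via f_Y: {direct_value:.6f}")
```

Both integrals evaluate to 1/3, but only the second required deriving $f_Y$.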


Variance

Continuous Variance

$$\text{Var}(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E[X^2] - (E[X])^2$$

Here,

  • $\mu = E[X]$ = mean of $X$
  • $E[X^2]$ = second raw moment

The computational formula $\text{Var}(X) = E[X^2] - (E[X])^2$ is identical to the discrete case, derived in the same way from the definition.
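The two variance expressions can be compared numerically. This sketch uses a hypothetical triangular density $f(x) = 2x$ on $[0,1]$, chosen only because its moments are easy to verify by hand ($E[X] = 2/3$, $E[X^2] = 1/2$, $\text{Var}(X) = 1/18$).

```python
from scipy.integrate import quad

def f(x):
    # Illustrative density: f(x) = 2x on [0, 1]
    return 2.0 * x

mean, _ = quad(lambda x: x * f(x), 0.0, 1.0)
second_moment, _ = quad(lambda x: x**2 * f(x), 0.0, 1.0)

# Definition: E[(X - mu)^2]
var_definition, _ = quad(lambda x: (x - mean) ** 2 * f(x), 0.0, 1.0)

# Computational formula: E[X^2] - (E[X])^2
var_shortcut = second_moment - mean**2

print(f"Var via definition: {var_definition:.6f}")
print(f"Var via shortcut:   {var_shortcut:.6f}")
```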


Moments and Moment Generating Functions

k-th Moment

$$E[X^k] = \int_{-\infty}^{\infty} x^k f(x)\,dx$$

Here,

  • $k$ = moment order (positive integer)

Moment Generating Function

$$M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$$

Here,

  • $t$ = real parameter in a neighborhood of 0

Why MGFs Matter

If $M_X(t)$ exists in a neighborhood of $t = 0$, it uniquely determines the distribution. All moments can be recovered:

$$E[X^k] = M_X^{(k)}(0) = \frac{d^k}{dt^k}M_X(t)\bigg|_{t=0}.$$

Furthermore, if $X$ and $Y$ are independent, $M_{X+Y}(t) = M_X(t) \cdot M_Y(t)$, so the convolution of densities becomes a product of MGFs.
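As a rough check of moment recovery, the sketch below differentiates the closed-form MGF of an exponential distribution, $M_X(t) = \lambda/(\lambda - t)$, at $t = 0$ using finite differences. The rate `lam` and step `h` are illustrative choices, and the finite-difference derivatives are approximations rather than exact values.

```python
lam = 2.0
h = 1e-3  # finite-difference step

def mgf(t):
    # MGF of Exp(lam), valid for t < lam
    return lam / (lam - t)

# Central difference for M'(0), which should approximate E[X] = 1/lam
first_moment = (mgf(h) - mgf(-h)) / (2 * h)

# Second-order central difference for M''(0), approximating E[X^2] = 2/lam^2
second_moment = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2

print(f"M'(0)  ~ {first_moment:.6f}  (theory: {1 / lam:.6f})")
print(f"M''(0) ~ {second_moment:.6f}  (theory: {2 / lam**2:.6f})")
```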


Quantile Function and Inverse CDF

Definition: Quantile Function

The quantile function (inverse CDF) of $X$ is:

$$F^{-1}(p) = \inf\{x \in \mathbb{R} : F(x) \geq p\}, \quad p \in (0,1).$$

It satisfies $P(X \leq F^{-1}(p)) \geq p$ and $P(X \geq F^{-1}(p)) \geq 1-p$.
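For a strictly increasing continuous CDF the infimum reduces to the ordinary inverse. The sketch below checks this for an exponential distribution, where solving $1 - e^{-\lambda x} = p$ gives $F^{-1}(p) = -\ln(1-p)/\lambda$. In SciPy the quantile function is exposed as `ppf`; the rate `lam` is an illustrative choice.

```python
import numpy as np
from scipy import stats

lam = 2.0
X = stats.expon(scale=1 / lam)  # scipy uses scale = 1/lambda
p = np.array([0.1, 0.5, 0.9])

# Closed-form inverse of F(x) = 1 - exp(-lam * x)
quantiles_closed_form = -np.log(1 - p) / lam

# SciPy's quantile function (percent point function)
quantiles_scipy = X.ppf(p)

# Applying F to the quantiles should return the original probabilities
round_trip = X.cdf(quantiles_scipy)

print("closed form:", quantiles_closed_form)
print("scipy ppf:  ", quantiles_scipy)
print("F(F^-1(p)): ", round_trip)
```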

Probability Integral Transform

If $X$ has continuous CDF $F$, then $U = F(X) \sim \text{Uniform}(0,1)$. Conversely, if $U \sim \text{Uniform}(0,1)$ and $F$ is any CDF, then $X = F^{-1}(U)$ has CDF $F$. This is the foundation of inverse transform sampling for random variate generation.
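A minimal sketch of inverse transform sampling, assuming an exponential target with rate `lam` (an illustrative choice): draw uniforms, push them through the closed-form quantile function $F^{-1}(u) = -\ln(1-u)/\lambda$, and compare sample moments with theory.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
n = 100_000

# Step 1: uniform draws on [0, 1)
u = rng.uniform(size=n)

# Step 2: apply the exponential quantile function F^{-1}(u) = -ln(1 - u) / lam
x = -np.log1p(-u) / lam

sample_mean = x.mean()
sample_var = x.var()

print(f"sample mean: {sample_mean:.4f}  (theory: {1 / lam:.4f})")
print(f"sample var:  {sample_var:.4f}  (theory: {1 / lam**2:.4f})")
```

Using `np.log1p(-u)` rather than `np.log(1 - u)` is a small numerical nicety for values of `u` close to 0.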


Worked Example: Exponential Distribution

Example: Full Analysis of Exp($\lambda$)

Let $X \sim \text{Exp}(\lambda)$ with $f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$.

CDF: $F(x) = 1 - e^{-\lambda x}$ for $x \geq 0$.

Mean: $E[X] = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}$ (integration by parts).

Second moment: $E[X^2] = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda^2}$ (two applications of integration by parts).

Variance: $\text{Var}(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.

MGF: $M_X(t) = \int_0^{\infty} e^{tx}\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t}$ for $t < \lambda$.

Memoryless property: $P(X > s+t \mid X > s) = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)$ for $s, t \geq 0$.

Hazard rate: $h(x) = \frac{f(x)}{1-F(x)} = \frac{\lambda e^{-\lambda x}}{e^{-\lambda x}} = \lambda$ (constant).

This shows the exponential distribution is the continuous analogue of the geometric distribution: both are memoryless with constant hazard rates.
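The memoryless property and the constant hazard rate can both be checked with SciPy's survival function `sf`, which returns $1 - F(x)$. The values of `lam`, `s`, and `t` below are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

lam, s, t = 2.0, 0.7, 1.3
X = stats.expon(scale=1 / lam)  # scipy uses scale = 1/lambda

# P(X > s + t | X > s) = P(X > s + t) / P(X > s)
conditional_survival = X.sf(s + t) / X.sf(s)

# P(X > t)
unconditional_survival = X.sf(t)

# Hazard rate h(x) = f(x) / (1 - F(x)) on a grid of points
x = np.linspace(0.0, 5.0, 11)
hazard = X.pdf(x) / X.sf(x)

print(f"P(X > s+t | X > s) = {conditional_survival:.6f}")
print(f"P(X > t)           = {unconditional_survival:.6f}")
print("hazard on grid:", hazard)
```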


Worked Example: Beta Distribution

Example: Beta$(\alpha, \beta)$ on [0,1]

Let $X \sim \text{Beta}(\alpha, \beta)$ with $\alpha, \beta > 0$ and $f(x) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$ for $x \in [0,1]$, where $B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$.

Mean: $E[X] = \frac{\alpha}{\alpha+\beta}$.

Variance: $\text{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$.

Special cases:

  • $\text{Beta}(1,1) = \text{Uniform}(0,1)$
  • $\text{Beta}(\alpha, \alpha)$ is symmetric about $1/2$ for all $\alpha$
  • As $\alpha, \beta \to \infty$ with $\alpha/\beta$ fixed, the distribution concentrates at $\alpha/(\alpha+\beta)$

The Beta distribution is the conjugate prior for the Binomial likelihood in Bayesian inference.
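The mean and variance formulas, along with the Beta(1,1) special case, can be compared against `scipy.stats.beta`. The shape parameters below are illustrative; `beta_param` is named that way only to avoid shadowing the distribution object.

```python
import numpy as np
from scipy import stats

alpha, beta_param = 2.0, 5.0
X = stats.beta(alpha, beta_param)

# Closed-form mean and variance from the formulas above
mean_formula = alpha / (alpha + beta_param)
var_formula = (alpha * beta_param) / (
    (alpha + beta_param) ** 2 * (alpha + beta_param + 1)
)

print(f"mean: formula {mean_formula:.6f}, scipy {X.mean():.6f}")
print(f"var:  formula {var_formula:.6f}, scipy {X.var():.6f}")

# Beta(1, 1) should have the flat Uniform(0, 1) density
grid = np.linspace(0.1, 0.9, 5)
beta11_pdf = stats.beta(1.0, 1.0).pdf(grid)
print("Beta(1,1) density on grid:", beta11_pdf)
```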


Worked Example: Change of Variables

Example: Linear Transformation

Let $X$ have PDF $f_X(x)$ and let $Y = aX + b$ with $a \neq 0$. Then:

$$f_Y(y) = f_X\!\left(\frac{y-b}{a}\right) \cdot \frac{1}{|a|}.$$

Application: If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X-\mu}{\sigma} \sim N(0,1)$:

$$f_Z(z) = f_X(\sigma z + \mu) \cdot \sigma = \frac{1}{\sigma\sqrt{2\pi}}e^{-(\sigma z + \mu - \mu)^2/(2\sigma^2)} \cdot \sigma = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}.$$

This is the standard normal PDF. The normalization constant $1/\sqrt{2\pi}$ emerges naturally from the transformation.
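The linear transformation formula can be compared with a known result. If $X \sim N(0,1)$ and $Y = aX + b$, then $Y \sim N(b, a^2)$, so the formula should reproduce that normal density. The sketch below uses a negative `a` (an illustrative choice) to show why the absolute value matters.

```python
import numpy as np
from scipy import stats

a, b = -2.0, 3.0
y = np.linspace(-5.0, 11.0, 17)

# Change-of-variables formula: f_Y(y) = f_X((y - b) / a) / |a|
density_from_formula = stats.norm.pdf((y - b) / a) / abs(a)

# Known result: Y = aX + b is N(b, a^2), i.e. standard deviation |a|
density_known = stats.norm.pdf(y, loc=b, scale=abs(a))

max_abs_diff = np.max(np.abs(density_from_formula - density_known))
print(f"max difference between the two densities: {max_abs_diff:.2e}")
```

Dropping the absolute value would give a negative "density" whenever `a < 0`.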


Python Implementation

import numpy as np
from scipy import stats
from scipy.integrate import quad

np.random.seed(42)

# Demonstrate PDF properties with the exponential distribution.
# Note: scipy parameterizes expon by scale = 1/lambda.
lam = 2.0
x = np.linspace(0, 4, 1000)
pdf_values = stats.expon.pdf(x, scale=1/lam)
cdf_values = stats.expon.cdf(x, scale=1/lam)

# Verify PDF integrates to 1
integral, _ = quad(lambda t: stats.expon.pdf(t, scale=1/lam), 0, np.inf)
print(f"Exponential(lambda={lam})")
print(f"  PDF integral: {integral:.6f}  (should be 1.0)")

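# Compare the theoretical mean and variance with scipy's values
mean_theory = 1/lam
var_theory = 1/lam**2
mean_scipy = stats.expon.mean(scale=1/lam)
var_scipy = stats.expon.var(scale=1/lam)
print(f"  Mean: {mean_theory:.4f} (scipy: {mean_scipy:.4f})")
print(f"  Variance: {var_theory:.4f} (scipy: {var_scipy:.4f})")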

# Verify P(X = c) = 0 for continuous RV
print(f"  P(X = 1.0): {stats.expon.cdf(1.0, scale=1/lam) - stats.expon.cdf(1.0, scale=1/lam):.6f}")

# Demonstrate probability integral transform
samples = np.random.exponential(1/lam, size=5000)
u_samples = stats.expon.cdf(samples, scale=1/lam)
print(f"\nProbability Integral Transform:")
print(f"  Mean of F(X): {np.mean(u_samples):.4f}  (should be 0.5)")
print(f"  Variance of F(X): {np.var(u_samples, ddof=0):.4f}  (should be 1/12 ≈ 0.0833)")

Python Implementation: MGF and Moments

import numpy as np
from scipy import stats
from scipy.integrate import quad
from scipy.special import factorial2

np.random.seed(42)  # make the sampled results below reproducible

# Compute moments numerically for a standard normal (mu=0, sigma=1)

# E[X^k] for k = 1, 2, 3, 4
print("Standard Normal Moments:")
for k in range(1, 5):
    moment, _ = quad(lambda x: x**k * stats.norm.pdf(x), -np.inf, np.inf)
    # Odd moments vanish; even moments equal the double factorial (k-1)!!
    theory = 0 if k % 2 == 1 else factorial2(k - 1, exact=True)
    print(f"  E[X^{k}] = {moment:.6f}  (theoretical: {theory})")

# Verify MGF: M_X(t) = exp(t^2/2) for standard normal
t_values = [0.1, 0.5, 1.0]
print("\nStandard Normal MGF:")
for t in t_values:
    mgf_numerical, _ = quad(lambda x: np.exp(t*x) * stats.norm.pdf(x), -np.inf, np.inf)
    mgf_theory = np.exp(t**2 / 2)
    print(f"  M({t}) = {mgf_numerical:.6f}  (theoretical: {mgf_theory:.6f})")

# Change of variables: if X ~ N(0,1), then Y = 2X + 3 ~ N(3, 4)
samples_x = np.random.standard_normal(10000)
samples_y = 2 * samples_x + 3
print(f"\nLinear transformation Y = 2X + 3:")
print(f"  E[Y] = {np.mean(samples_y):.4f}  (theoretical: 3)")
print(f"  Var(Y) = {np.var(samples_y, ddof=0):.4f}  (theoretical: 4)")

Key Takeaways

Summary: Continuous Random Variables

  • PDF: $f(x) \geq 0$ with $\int f(x)\,dx = 1$; probability is area under the curve: $P(a \leq X \leq b) = \int_a^b f(x)\,dx$
  • CDF: $F(x) = P(X \leq x) = \int_{-\infty}^x f(t)\,dt$; satisfies $F'(x) = f(x)$ where $f$ is continuous
  • $P(X = c) = 0$ for any exact value $c$; probability is only meaningful over intervals
  • Expectation: $E[g(X)] = \int g(x)f(x)\,dx$ (law of the unconscious statistician)
  • Variance: $\text{Var}(X) = E[X^2] - (E[X])^2$ (same formula as discrete case)
  • MGF: $M_X(t) = E[e^{tX}]$, when it exists in a neighborhood of 0, uniquely determines the distribution and converts convolution to multiplication
  • Probability integral transform: $F(X) \sim \text{Uniform}(0,1)$ for continuous $F$, the basis for random variate generation
  • Change of variables: $f_Y(y) = f_X(g^{-1}(y)) \cdot \left|\frac{d}{dy}g^{-1}(y)\right|$ for $Y = g(X)$ with $g$ strictly monotone and differentiable
