Poisson Distribution — Modeling Rare Events

The Law of Rare Events — Counting Occurrences

The Poisson distribution models the number of events occurring in a fixed interval when events happen independently at a constant average rate. It is the mathematics of rarity.

  • Customer arrivals — calls per hour at a call center
  • Defect detection — potholes per mile of highway
  • Particle physics — radioactive decay counts
  • Network traffic — packets arriving at a server per millisecond

The Poisson distribution is the language of random arrivals and rare occurrences.


What is the Poisson Distribution?

Definition

A random variable $X$ has a Poisson distribution with parameter $\lambda > 0$, written $X \sim \text{Pois}(\lambda)$, if its probability mass function is:

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$

where $\lambda$ is the average rate of events per interval.

Poisson PMF

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$

Here,

  • $\lambda$ = Rate parameter: expected number of events per interval
  • $k$ = Number of events (non-negative integer)
  • $e$ = Euler's number ≈ 2.71828
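
As a quick check on the formula, the PMF can be evaluated directly. This is a minimal sketch using only the standard library; the helper name `poisson_pmf` and the choice $\lambda = 5$ are illustrative assumptions, not part of the lesson.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam), computed straight from the formula."""
    if k < 0:
        return 0.0
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 5.0  # illustrative rate, chosen for this sketch
probs = [poisson_pmf(k, lam) for k in range(60)]
total = sum(probs)  # the mass beyond k = 59 is negligible for lam = 5

print(f"P(X = 5)       = {poisson_pmf(5, lam):.4f}")
print(f"sum over k<60  = {total:.6f}")
```

The probabilities sum to 1 up to floating-point error, which is a useful sanity check whenever a PMF is coded by hand.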

Derivation from First Principles

Theorem: Poisson as a Limit of the Binomial

Consider $n$ independent Bernoulli trials, each with success probability $p = \lambda/n$. As $n \to \infty$ with $\lambda = np$ fixed:

$$P(X = k) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k} \to \frac{\lambda^k e^{-\lambda}}{k!}$$

Proof. Expand $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ and take the limit. The key steps:

  • $\frac{n!}{(n-k)!} \cdot n^{-k} = \frac{n(n-1)\cdots(n-k+1)}{n^k} \to 1$
  • $(1 - \lambda/n)^n \to e^{-\lambda}$
  • $(1 - \lambda/n)^{-k} \to 1$

Therefore $P(X = k) \to \frac{\lambda^k}{k!} e^{-\lambda}$. $\square$

This derivation also explains when the Poisson approximation to the binomial is valid: when $n$ is large and $p$ is small, with $\lambda = np$ moderate.
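
The limit can also be observed numerically. The sketch below holds $\lambda = np$ fixed and lets $n$ grow; the values $\lambda = 3$ and $k = 2$ are arbitrary choices for illustration, and `binom_pmf` and `poisson_pmf` are helper names introduced here.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam, k = 3.0, 2  # illustrative values
target = poisson_pmf(k, lam)

errors = []
for n in (10, 100, 1000, 10000):
    approx = binom_pmf(k, n, lam / n)  # p = lam / n keeps np = lam fixed
    errors.append(abs(approx - target))
    print(f"n = {n:>5}: Bin = {approx:.6f}, Pois = {target:.6f}, "
          f"abs error = {errors[-1]:.2e}")
```

The absolute error shrinks roughly in proportion to $1/n$, which matches the intuition that the approximation improves as trials become more numerous and individually less likely to succeed.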


Moments

Theorem: Mean and Variance of Poisson

For $X \sim \text{Pois}(\lambda)$:

$$E[X] = \lambda$$
$$\text{Var}(X) = \lambda$$

The mean equals the variance: this is the defining equidispersion property of the Poisson distribution.

Poisson Mean and Variance

$$E[X] = \text{Var}(X) = \lambda$$

Here,

  • $\lambda$ = Rate parameter (simultaneously mean and variance)
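
For readers who want to see where these values come from, one standard route (a sketch, not necessarily the derivation the lesson has in mind) uses the series $e^{\lambda} = \sum_{j \ge 0} \lambda^j / j!$ together with the factorial moment $E[X(X-1)]$:

```latex
\begin{aligned}
E[X] &= \sum_{k=0}^{\infty} k \,\frac{\lambda^k e^{-\lambda}}{k!}
      = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
      = \lambda e^{-\lambda} e^{\lambda} = \lambda, \\
E[X(X-1)] &= \sum_{k=2}^{\infty} k(k-1)\,\frac{\lambda^k e^{-\lambda}}{k!}
      = \lambda^2 e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!}
      = \lambda^2, \\
\text{Var}(X) &= E[X(X-1)] + E[X] - \left(E[X]\right)^2
      = \lambda^2 + \lambda - \lambda^2 = \lambda.
\end{aligned}
```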

Equidispersion Test

If the sample variance significantly exceeds the sample mean, the data are overdispersed and the Poisson model is inappropriate. Alternatives include the negative binomial or quasi-Poisson models.
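
An informal version of this check is the dispersion index, the ratio of sample variance to sample mean. The sketch below uses two small made-up data sets (both hypothetical, invented purely for illustration) and the helper name `dispersion_index`, which is an assumption of this example.

```python
import statistics


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 under a Poisson model."""
    return statistics.variance(counts) / statistics.mean(counts)


# Hypothetical count data, invented for this sketch
near_poisson = [2, 7, 4, 5, 1, 6, 3, 4, 8, 3, 5, 2]
overdispersed = [0, 0, 1, 14, 0, 2, 0, 11, 1, 0, 9, 0]

for name, data in [("near_poisson", near_poisson),
                   ("overdispersed", overdispersed)]:
    print(f"{name:>14}: mean = {statistics.mean(data):.2f}, "
          f"variance = {statistics.variance(data):.2f}, "
          f"index = {dispersion_index(data):.2f}")
```

An index far above 1 points toward overdispersion. For a formal test, $(n-1)$ times the index is commonly compared against a chi-square distribution with $n-1$ degrees of freedom; with samples this small the ratio should be read only as a rough indicator.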


Higher Moments and Skewness

Moment-Generating Function

$$M_X(t) = E[e^{tX}] = \exp\left(\lambda(e^t - 1)\right)$$

Here,

  • $M_X(t)$ = Moment-generating function
  • $\lambda$ = Rate parameter

From the MGF:

$$E[X^r] = \sum_{i=1}^r S(r, i) \, \lambda^i$$

where $S(r, i)$ are Stirling numbers of the second kind. Specifically:

  • Skewness: $\gamma_1 = 1/\sqrt{\lambda}$ (always positively skewed)
  • Kurtosis: $\gamma_2 = 1/\lambda$ (excess kurtosis)

As $\lambda \to \infty$, the skewness and kurtosis approach 0, and the Poisson converges to the normal.
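
The moment formula and the skewness and kurtosis values can be cross-checked numerically. The sketch below computes raw moments two ways, via Stirling numbers and via direct summation over the PMF, then converts them to standardized moments. The function names and the choice $\lambda = 4$ are assumptions made for this example.

```python
import math


def stirling2(r: int, i: int) -> int:
    """Stirling number of the second kind via S(r,i) = i*S(r-1,i) + S(r-1,i-1)."""
    if r == i:
        return 1
    if i == 0 or i > r:
        return 0
    return i * stirling2(r - 1, i) + stirling2(r - 1, i - 1)


def raw_moment_formula(r: int, lam: float) -> float:
    """E[X^r] from the Stirling-number expression (r >= 1)."""
    return sum(stirling2(r, i) * lam ** i for i in range(1, r + 1))


def raw_moment_direct(r: int, lam: float, terms: int = 200) -> float:
    """E[X^r] by summing k^r * P(X = k) over the first `terms` values of k."""
    p = math.exp(-lam)  # P(X = 0)
    total = 0.0
    for k in range(terms):
        total += k ** r * p
        p *= lam / (k + 1)  # P(X = k+1) from P(X = k)
    return total


lam = 4.0  # illustrative rate
m1, m2, m3, m4 = (raw_moment_formula(r, lam) for r in (1, 2, 3, 4))

# Central moments from raw moments
var = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4

skewness = mu3 / var ** 1.5
excess_kurtosis = mu4 / var ** 2 - 3

print(f"skewness        = {skewness:.4f}  (1/sqrt(lam) = {1 / math.sqrt(lam):.4f})")
print(f"excess kurtosis = {excess_kurtosis:.4f}  (1/lam = {1 / lam:.4f})")
```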


Normal Approximation

Normal Approximation to Poisson

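$$X \sim \text{Pois}(\lambda) \;\approx\; Y \sim \mathcal{N}(\lambda, \lambda)$$

Here,

  • $\lambda$ = Rate parameter (the normal approximation uses it as both mean and variance)

A common rule of thumb is that the approximation is adequate when $\lambda \geq 20$. Apply a continuity correction ($\pm 0.5$) for improved accuracy: for example, approximate $P(X \leq k)$ by $P(Y \leq k + 0.5)$.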
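
To see what the continuity correction buys, the sketch below compares the exact Poisson CDF with the normal approximation, with and without the correction. The values $\lambda = 20$ and $k = 25$ are illustrative, and the helpers `poisson_cdf` and `normal_cdf` are written here with the standard library rather than taken from any particular package.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Pois(lam), summing the PMF term by term."""
    p = math.exp(-lam)  # P(X = 0)
    total = p
    for i in range(1, k + 1):
        p *= lam / i
        total += p
    return total


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of N(mu, sigma^2) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


lam, k = 20.0, 25  # illustrative values
sigma = math.sqrt(lam)

exact = poisson_cdf(k, lam)
plain = normal_cdf(k, lam, sigma)            # no continuity correction
corrected = normal_cdf(k + 0.5, lam, sigma)  # with continuity correction

print(f"exact P(X <= {k})      = {exact:.4f}")
print(f"normal, uncorrected   = {plain:.4f}  (error {abs(plain - exact):.4f})")
print(f"normal, corrected     = {corrected:.4f}  (error {abs(corrected - exact):.4f})")
```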


Assumptions

| Assumption | What It Means | Example Violation |
| --- | --- | --- |
| Events occur independently | One event doesn't trigger or prevent another | Earthquake aftershocks |
| Constant rate | $\lambda$ doesn't change over the interval | Bus arrivals (rush hour vs. midnight) |
| No simultaneous events | Two events can't happen at exactly the same time | Web server requests (may need $\epsilon$-model) |
| Countable events | Events are countable (not continuous) | Waiting times (use exponential instead) |

Worked Example: Call Center

Example: Customer Calls Per Hour

A call center receives an average of $\lambda = 12$ calls per hour. Assuming calls arrive independently:

What is the probability of receiving exactly 15 calls in an hour?

$$P(X = 15) = \frac{12^{15} e^{-12}}{15!} = \frac{12^{15} \times 6.144 \times 10^{-6}}{1.3077 \times 10^{12}} = 0.0724$$

What is the probability of receiving more than 15 calls?

$$P(X > 15) = 1 - P(X \leq 15) = 1 - \sum_{k=0}^{15} \frac{12^k e^{-12}}{k!} \approx 1 - 0.8444 = 0.1556$$

Expected calls per 8-hour shift: $E[X] = 12 \times 8 = 96$ calls

Standard deviation per shift: $\sigma = \sqrt{96} = 9.80$ calls

So during an 8-hour shift, we expect $96 \pm 9.80$ calls (approximately).
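
The call-center figures can be reproduced with a few lines of standard-library Python. This is a sketch for checking the arithmetic; the helper names are introduced here and are not part of the lesson.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Pois(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


calls_per_hour = 12.0
shift_hours = 8

p_exactly_15 = poisson_pmf(15, calls_per_hour)
p_more_than_15 = 1.0 - poisson_cdf(15, calls_per_hour)

# Counts over a longer interval are Poisson with the rate scaled by its length
shift_mean = calls_per_hour * shift_hours
shift_sd = math.sqrt(shift_mean)

print(f"P(X = 15)  = {p_exactly_15:.4f}")
print(f"P(X > 15)  = {p_more_than_15:.4f}")
print(f"shift mean = {shift_mean:.0f}, shift sd = {shift_sd:.2f}")
```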


Worked Example: Rare Disease Screening

Example: Disease Incidence

A rare disease affects 2 people per 10,000 in the population ($\lambda = 2$ per 10,000).

In a city of 500,000, how many cases do we expect?

$$\lambda_{\text{city}} = \frac{2}{10000} \times 500000 = 100 \text{ cases}$$

What is the probability of observing fewer than 80 cases?

Using the normal approximation ($\lambda = 100 \geq 20$):

$$P(X < 80) \approx P\left(Z < \frac{79.5 - 100}{\sqrt{100}}\right) = P(Z < -2.05) \approx 0.0202$$

There is only about a 2% chance of observing fewer than 80 cases. A count that low would be unusual under the model and worth investigating (under-reporting, migration, etc.).
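
Because $\lambda = 100$ is still small enough to sum the PMF directly, the normal approximation can be compared with the exact Poisson tail. The sketch below does both with the standard library; the function names are assumptions of this example.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Pois(lam), built up term by term to avoid overflow."""
    p = math.exp(-lam)  # P(X = 0)
    total = p
    for i in range(1, k + 1):
        p *= lam / i
        total += p
    return total


def standard_normal_cdf(z: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


rate_per_person = 2 / 10_000
population = 500_000
lam_city = rate_per_person * population  # expected cases in the city

# P(X < 80) = P(X <= 79); the normal version uses the 79.5 continuity correction
exact = poisson_cdf(79, lam_city)
z = (79.5 - lam_city) / math.sqrt(lam_city)
approx = standard_normal_cdf(z)

print(f"expected cases        = {lam_city:.0f}")
print(f"normal approximation  = {approx:.4f}")
print(f"exact Poisson         = {exact:.4f}")
```

The two values need not agree to many decimal places: the Poisson distribution is slightly right-skewed even at $\lambda = 100$, so a symmetric approximation carries some error in the tails.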


Worked Example: Software Bugs

Example: Bug Detection in Code Modules

A software company finds an average of $\lambda = 3.5$ bugs per 1,000 lines of code. A new module has 4,500 lines.

Expected bugs in the module: $\lambda_{\text{module}} = 3.5 \times 4.5 = 15.75$

Probability of zero bugs (perfect code):

$$P(X = 0) = e^{-15.75} \approx 1.44 \times 10^{-7}$$

Essentially impossible! This illustrates why code review and testing are essential.

Probability of more than 20 bugs (needs major refactoring):

$$P(X > 20) = 1 - P(X \leq 20) \approx 1 - 0.882 = 0.118$$

About a 12% chance, which is not negligible for critical software.
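
The module-level figures follow from rescaling the rate to the module size and then evaluating the Poisson PMF and CDF. A minimal sketch with the standard library (helper and variable names are introduced here for illustration):

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Pois(lam)."""
    p = math.exp(-lam)  # P(X = 0)
    total = p
    for i in range(1, k + 1):
        p *= lam / i
        total += p
    return total


bugs_per_kloc = 3.5
module_lines = 4500

# Scale the rate from "per 1,000 lines" to the size of this module
lam_module = bugs_per_kloc * module_lines / 1000

p_zero = math.exp(-lam_module)                 # P(X = 0)
p_more_than_20 = 1.0 - poisson_cdf(20, lam_module)

print(f"expected bugs = {lam_module:.2f}")
print(f"P(X = 0)      = {p_zero:.3e}")
print(f"P(X > 20)     = {p_more_than_20:.4f}")
```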


Relationship to Other Distributions

| Distribution | Relationship |
| --- | --- |
| Binomial | Poisson is the limit as $n \to \infty$, $p \to 0$, $np = \lambda$ |
| Exponential | Inter-arrival times of a Poisson process are $\text{Exp}(\lambda)$ |
| Gamma | Sum of $k$ i.i.d. $\text{Exp}(\lambda)$ is $\text{Gamma}(k, \lambda)$ |
| Normal | Limit as $\lambda \to \infty$ (CLT) |
| Negative Binomial | Alternative for overdispersed count data |
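
The exponential row of the table can be illustrated by simulation: generating exponential gaps between events and counting how many events land in each unit interval should give counts whose mean and variance are both close to $\lambda$. The sketch below assumes a rate of 4 events per unit time and 20,000 intervals; the function name, the seed, and both of those numbers are arbitrary choices for this example.

```python
import random
import statistics


def simulate_counts(rate: float, n_intervals: int, seed: int = 0) -> list:
    """Count events per unit interval when gaps between events are Exp(rate)."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    t = rng.expovariate(rate)          # time of the first event
    while t < n_intervals:
        counts[int(t)] += 1            # event falls in interval [int(t), int(t)+1)
        t += rng.expovariate(rate)     # add the next exponential gap
    return counts


rate = 4.0             # illustrative events per unit time
counts = simulate_counts(rate, n_intervals=20_000)

sample_mean = statistics.mean(counts)
sample_var = statistics.variance(counts)

print(f"sample mean     = {sample_mean:.3f}")
print(f"sample variance = {sample_var:.3f}")
```

Both statistics should land near the chosen rate, which ties the exponential inter-arrival view back to the equidispersion property.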

Python Implementation

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

np.random.seed(42)

# Poisson distribution: lambda = 5
lam = 5
x = np.arange(0, 20)
pmf = stats.poisson.pmf(x, lam)

print(f"Poisson(λ={lam})")
print(f"Mean:     {stats.poisson.mean(lam):.1f}")
print(f"Variance: {stats.poisson.var(lam):.2f}")
print(f"Std Dev:  {stats.poisson.std(lam):.2f}")
print(f"P(X=5):   {stats.poisson.pmf(5, lam):.4f}")
print(f"P(X<=5):  {stats.poisson.cdf(5, lam):.4f}")

# Compare different lambda values
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# PMF for different lambdas
for lam_val, color in [(2, '#ef4444'), (5, '#6366f1'), (10, '#22c55e')]:
    x_vals = np.arange(0, 25)
    pmf_vals = stats.poisson.pmf(x_vals, lam_val)
    axes[0].bar(x_vals, pmf_vals, alpha=0.6, label=f'λ={lam_val}', color=color)
axes[0].set_xlabel('k')
axes[0].set_ylabel('P(X=k)')
axes[0].set_title('Poisson Distribution for Different λ')
axes[0].legend()

# Mean = Variance check
np.random.seed(42)
lambdas = [1, 3, 5, 10, 20, 50]
means = []
vars_ = []
for lam_i in lambdas:
    samples = np.random.poisson(lam_i, 10000)
    means.append(np.mean(samples))
    vars_.append(np.var(samples, ddof=1))  # ddof=1 gives the unbiased sample variance

axes[1].scatter(means, vars_, s=100, c='steelblue', edgecolors='black')
axes[1].plot([0, 55], [0, 55], 'r--', lw=2, label='y=x (Mean=Var)')
axes[1].set_xlabel('Sample Mean')
axes[1].set_ylabel('Sample Variance')
axes[1].set_title('Poisson: Mean = Variance')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('poisson_distribution.png', dpi=150)
plt.show()

Key Takeaways

Models count of events in a fixed interval at constant rate $\lambda$

PMF: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$

Mean = Variance = $\lambda$ (equidispersion), a quick diagnostic check

Derived as the limit of $\text{Bin}(n, p)$ as $n \to \infty$, $p \to 0$, $np = \lambda$

Normal approximation valid for $\lambda \geq 20$

Always positively skewed with skewness $= 1/\sqrt{\lambda}$

"The Poisson distribution is the true distribution of the rare." — L.J. Bortkiewicz
