Poisson Distribution
The Law of Rare Events — Counting Occurrences
The Poisson distribution models the number of events occurring in a fixed interval when events happen independently at a constant average rate. It is the mathematics of rarity.
- Customer arrivals — calls per hour at a call center
- Defect detection — potholes per mile of highway
- Particle physics — radioactive decay counts
- Network traffic — packets arriving at a server per millisecond
The Poisson distribution is the language of random arrivals and rare occurrences.
What is the Poisson Distribution?
Definition
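A random variable $X$ has a Poisson distribution with parameter $\lambda > 0$, written $X \sim \text{Poisson}(\lambda)$, if its probability mass function is:
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots$$
where $\lambda$ is the average rate of events per interval.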
Poisson PMF
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
Here,
- $\lambda$ = Rate parameter: expected number of events per interval
- $k$ = Number of events (non-negative integer)
- $e$ = Euler's number ≈ 2.71828
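As a quick sanity check, the PMF can be evaluated directly from its formula and compared with SciPy's `stats.poisson.pmf`. This is a minimal sketch: the helper name `poisson_pmf`, the rate of 5, and the truncation at 60 terms are illustrative assumptions, not part of the definition.

```python
import math

import numpy as np
from scipy import stats


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), computed directly from the formula."""
    if k < 0:
        return 0.0
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 5.0               # illustrative rate
ks = np.arange(0, 60)   # truncation point; the tail beyond it is negligible here

direct = np.array([poisson_pmf(int(k), lam) for k in ks])
library = stats.poisson.pmf(ks, lam)

max_abs_diff = float(np.max(np.abs(direct - library)))
total_mass = float(direct.sum())

print(f"largest difference from SciPy: {max_abs_diff:.2e}")
print(f"sum of probabilities over k = 0..59: {total_mass:.12f}")
```

The direct formula is fine for small `k`, but the power and the factorial grow quickly; for large counts a log-space computation (which library routines typically use) is the safer choice.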
Derivation from First Principles
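Theorem: Poisson as a Limit of the Binomial
Consider $n$ independent Bernoulli trials, each with success probability $p = \lambda/n$. As $n \to \infty$ with $\lambda = np$ fixed:
$$\binom{n}{k} p^k (1-p)^{n-k} \;\longrightarrow\; \frac{\lambda^k e^{-\lambda}}{k!}$$
Proof. Substitute $p = \lambda/n$, expand, and take the limit. The key steps:
$$\binom{n}{k}\left(\frac{\lambda}{n}\right)^{k}\left(1-\frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^k}{k!}\cdot\frac{n(n-1)\cdots(n-k+1)}{n^k}\cdot\left(1-\frac{\lambda}{n}\right)^{n}\cdot\left(1-\frac{\lambda}{n}\right)^{-k}$$
For fixed $k$, as $n \to \infty$ the second factor tends to 1, the third tends to $e^{-\lambda}$, and the fourth tends to 1.
Therefore $\text{Bin}(n, \lambda/n) \to \text{Poisson}(\lambda)$.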
This derivation also explains when the Poisson approximation to the binomial is valid: when $n$ is large and $p$ is small, with $np = \lambda$ moderate.
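The limit can also be observed numerically. The sketch below holds the product of the number of trials and the success probability fixed at 3 (an arbitrary illustrative value) and tracks the largest pointwise gap between the binomial and Poisson PMFs as the number of trials grows; the grid of trial counts is likewise an assumption.

```python
import numpy as np
from scipy import stats

lam = 3.0                      # illustrative value of n * p, held fixed
k = np.arange(0, 30)           # counts at which the two PMFs are compared
poisson_pmf = stats.poisson.pmf(k, lam)

max_errors = {}
for n in [10, 100, 1_000, 10_000]:
    # Binomial with n trials and success probability lam / n
    binom_pmf = stats.binom.pmf(k, n, lam / n)
    max_errors[n] = float(np.max(np.abs(binom_pmf - poisson_pmf)))
    print(f"n = {n:>6}: max |Bin - Poisson| = {max_errors[n]:.6f}")
```

The gap shrinks roughly in proportion to one over the number of trials, which is why the approximation is usually reserved for large trial counts with a small success probability.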
Moments
Theorem: Mean and Variance of Poisson
For $X \sim \text{Poisson}(\lambda)$:
$$E[X] = \lambda, \qquad \text{Var}(X) = \lambda$$
The mean equals the variance — this is the defining equidispersion property of the Poisson distribution.
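For readers who want the reasoning behind the equal mean and variance, one standard derivation works directly from the PMF, using the series for the exponential function and the factorial moment $E[X(X-1)]$. It is offered as a sketch of a common textbook argument, not necessarily the route the source had in mind.

```latex
\begin{align*}
E[X] &= \sum_{k=0}^{\infty} k \,\frac{\lambda^{k} e^{-\lambda}}{k!}
      = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
      = \lambda e^{-\lambda} e^{\lambda} = \lambda, \\
E[X(X-1)] &= \sum_{k=0}^{\infty} k(k-1) \,\frac{\lambda^{k} e^{-\lambda}}{k!}
      = \lambda^{2} e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!}
      = \lambda^{2}, \\
\operatorname{Var}(X) &= E[X(X-1)] + E[X] - \big(E[X]\big)^{2}
      = \lambda^{2} + \lambda - \lambda^{2} = \lambda .
\end{align*}
```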
Poisson Mean and Variance
$$E[X] = \text{Var}(X) = \lambda$$
Here,
- $\lambda$ = Rate parameter (simultaneously mean and variance)
Equidispersion Test
If the sample variance significantly exceeds the sample mean, the data are overdispersed and the Poisson model is inappropriate. Alternatives include the negative binomial or quasi-Poisson models.
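One common way to make this check concrete is the index-of-dispersion test: under a Poisson model, (n - 1) times the ratio of sample variance to sample mean is approximately chi-square distributed with n - 1 degrees of freedom. The sketch below is an illustration under that assumption; the function name `dispersion_test`, the seed, and the simulated data sets are all hypothetical.

```python
import numpy as np
from scipy import stats


def dispersion_test(counts):
    """Return (dispersion index, one-sided p-value for overdispersion)."""
    counts = np.asarray(counts)
    n = counts.size
    index = counts.var(ddof=1) / counts.mean()   # close to 1 for Poisson data
    statistic = (n - 1) * index                  # approx. chi-square, n - 1 df
    p_value = float(stats.chi2.sf(statistic, df=n - 1))
    return float(index), p_value


rng = np.random.default_rng(0)

# Equidispersed data: Poisson with mean 4
poisson_counts = rng.poisson(4.0, size=2_000)

# Overdispersed data: negative binomial with mean 4 and variance 12
nb_counts = rng.negative_binomial(2, 1 / 3, size=2_000)

poisson_index, poisson_p = dispersion_test(poisson_counts)
nb_index, nb_p = dispersion_test(nb_counts)

print(f"Poisson sample:           index = {poisson_index:.2f}, p = {poisson_p:.3f}")
print(f"Negative binomial sample: index = {nb_index:.2f}, p = {nb_p:.3g}")
```

A small p-value suggests the variance is too large for a Poisson model; the test says nothing about which alternative fits better.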
Higher Moments and Skewness
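Moment-Generating Function
$$M_X(t) = E\!\left[e^{tX}\right] = \exp\!\big(\lambda(e^{t} - 1)\big)$$
Here,
- $M_X(t)$ = Moment-generating function
- $\lambda$ = Rate parameter
From the MGF, the raw moments are:
$$E[X^m] = \sum_{j=0}^{m} S(m, j)\,\lambda^{j}$$
where $S(m, j)$ are Stirling numbers of the second kind. Specifically:
- Skewness: $\gamma_1 = 1/\sqrt{\lambda}$ (always positively skewed)
- Kurtosis: $\gamma_2 = 1/\lambda$ (excess kurtosis)
As $\lambda \to \infty$, the skewness and excess kurtosis approach 0, and the Poisson converges to the normal.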
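These shape measures can be checked against SciPy, which reports mean, variance, skewness, and excess kurtosis through `stats.poisson.stats(..., moments="mvsk")`. The sketch below also compares the third raw moment with the Stirling-number expansion (the coefficients for the third moment are 1, 3, 1); the grid of rates is an arbitrary choice.

```python
from math import sqrt

from scipy import stats

rows = []
for lam in [1.0, 4.0, 25.0, 100.0]:
    mean, var, skew, excess_kurt = (
        float(v) for v in stats.poisson.stats(lam, moments="mvsk")
    )
    third_raw = float(stats.poisson.moment(3, lam))   # E[X^3]
    rows.append((lam, mean, var, skew, excess_kurt, third_raw))
    print(
        f"lambda = {lam:>5}: skewness = {skew:.4f} (1/sqrt(lambda) = {1 / sqrt(lam):.4f}), "
        f"excess kurtosis = {excess_kurt:.4f} (1/lambda = {1 / lam:.4f})"
    )
```

Both measures shrink toward zero as the rate grows, consistent with the distribution becoming more symmetric and bell-shaped.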
Normal Approximation
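Normal Approximation to Poisson
$$X \sim \text{Poisson}(\lambda) \;\approx\; N(\mu = \lambda,\ \sigma^2 = \lambda)$$
Here,
- $\lambda$ = Rate parameter
The approximation is adequate when $\lambda$ is sufficiently large. Apply a continuity correction ($\pm 0.5$) for improved accuracy.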
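To see what the continuity correction buys, the sketch below compares the exact Poisson CDF with the normal approximation, with and without the half-unit shift. The rate of 30 and the helper name `normal_approx_cdf` are illustrative assumptions.

```python
import numpy as np
from scipy import stats


def normal_approx_cdf(k, lam, continuity=True):
    """Approximate P(X <= k) for X ~ Poisson(lam) with a normal CDF."""
    shift = 0.5 if continuity else 0.0
    return stats.norm.cdf((k + shift - lam) / np.sqrt(lam))


lam = 30.0                 # illustrative rate
k = np.arange(0, 80)
exact = stats.poisson.cdf(k, lam)

err_plain = float(np.max(np.abs(normal_approx_cdf(k, lam, continuity=False) - exact)))
err_cc = float(np.max(np.abs(normal_approx_cdf(k, lam, continuity=True) - exact)))

print(f"max CDF error without correction: {err_plain:.4f}")
print(f"max CDF error with correction:    {err_cc:.4f}")
```

The corrected version evaluates the normal CDF half a unit above each integer, matching the fact that the Poisson CDF jumps at the integers.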
Assumptions
| Assumption | What It Means | Example Violation |
|---|---|---|
| Events occur independently | One event doesn't trigger or prevent another | Earthquake aftershocks |
| Constant rate | $\lambda$ doesn't change over the interval | Bus arrivals (rush hour vs. midnight) |
| No simultaneous events | Two events can't happen at exactly the same time | Web server requests (may need a batch-arrival model) |
| Countable events | Events are countable (not continuous) | Waiting times (use exponential instead) |
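A small simulation illustrates what a violated constant-rate assumption looks like in data. In this sketch, half of the intervals are "quiet" and half are "busy", so the average rate is unchanged but the counts become overdispersed. The rates 2, 10, and 6, the seed, and the sample size are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Constant rate: every interval has rate 6
constant = rng.poisson(6.0, size=n)

# Varying rate (hypothetical): each interval is quiet (rate 2) or busy (rate 10)
rates = rng.choice([2.0, 10.0], size=n)
mixed = rng.poisson(rates)

constant_ratio = float(constant.var(ddof=1) / constant.mean())
mixed_ratio = float(mixed.var(ddof=1) / mixed.mean())

print(f"constant rate: mean = {constant.mean():.2f}, variance/mean = {constant_ratio:.2f}")
print(f"varying rate:  mean = {mixed.mean():.2f}, variance/mean = {mixed_ratio:.2f}")
```

Both samples have about the same mean, but only the constant-rate sample keeps the variance close to the mean; the extra variance in the mixed sample comes from the variation in the rate itself.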
Worked Example: Call Center
Example: Customer Calls Per Hour
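A call center receives an average of $\lambda$ calls per hour. Assuming calls arrive independently, the hourly count is $X \sim \text{Poisson}(\lambda)$.
What is the probability of receiving exactly 15 calls in an hour?
$$P(X = 15) = \frac{\lambda^{15} e^{-\lambda}}{15!}$$
What is the probability of receiving more than 15 calls?
$$P(X > 15) = 1 - \sum_{k=0}^{15} \frac{\lambda^{k} e^{-\lambda}}{k!}$$
Expected calls per 8-hour shift: $8\lambda$ calls. Standard deviation per shift: $\sqrt{8\lambda}$ calls.
So during an 8-hour shift, we expect $8\lambda \pm \sqrt{8\lambda}$ calls (approximately).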
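The same calculations are straightforward with SciPy. The hourly rate below is a hypothetical placeholder chosen only so the code runs; it is an assumption, not the figure from the example. Note that `sf(k)` returns the probability of strictly more than `k` events.

```python
from math import sqrt

from scipy import stats

lam_per_hour = 12.0   # HYPOTHETICAL hourly rate, for illustration only
hours = 8

# Single-hour probabilities
p_exactly_15 = float(stats.poisson.pmf(15, lam_per_hour))
p_more_than_15 = float(stats.poisson.sf(15, lam_per_hour))   # P(X > 15)

# Counts over a longer window are Poisson with the rate scaled by the window length
shift_mean = hours * lam_per_hour
shift_sd = sqrt(shift_mean)

print(f"P(X = 15) = {p_exactly_15:.4f}")
print(f"P(X > 15) = {p_more_than_15:.4f}")
print(f"calls per {hours}-hour shift: {shift_mean:.0f} +/- {shift_sd:.1f}")
```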
Worked Example: Rare Disease Screening
Example: Disease Incidence
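A rare disease affects 2 people per 10,000 in the population ($\lambda = 2$ per 10,000).
In a city of 500,000, how many cases do we expect?
$$\lambda_{\text{city}} = 500{,}000 \times \frac{2}{10{,}000} = 100$$
What is the probability of observing fewer than 80 cases?
Using the normal approximation ($\mu = 100$, $\sigma = \sqrt{100} = 10$) with a continuity correction:
$$P(X < 80) = P(X \le 79) \approx \Phi\!\left(\frac{79.5 - 100}{10}\right) = \Phi(-2.05) \approx 0.02$$
Only about a 2% chance of observing fewer than 80 cases. If we do, something is likely wrong (under-reporting, migration, etc.).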
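The approximation can be checked against the exact Poisson tail. The sketch below uses only the figures stated in the example (2 per 10,000, a city of 500,000, a threshold of 80); the variable names are illustrative.

```python
from math import sqrt

from scipy import stats

lam = 500_000 * 2 / 10_000          # expected cases in the city

# "Fewer than 80" means X <= 79
exact = float(stats.poisson.cdf(79, lam))
approx_cc = float(stats.norm.cdf((79.5 - lam) / sqrt(lam)))    # with continuity correction
approx_plain = float(stats.norm.cdf((80 - lam) / sqrt(lam)))   # without correction

print(f"expected cases:           {lam:.0f}")
print(f"exact Poisson P(X < 80):  {exact:.4f}")
print(f"normal, with correction:  {approx_cc:.4f}")
print(f"normal, no correction:    {approx_plain:.4f}")
```

Because the Poisson is right-skewed, its lower tail is somewhat thinner than the normal's, so the exact probability comes out a little below the approximation while staying near the 2% mark.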
Worked Example: Software Bugs
Example: Bug Detection in Code Modules
A software company finds an average of $\lambda$ bugs per 1,000 lines of code. A new module has 4,500 lines.
Expected bugs in the module:
$$\mu = 4.5\,\lambda$$
Probability of zero bugs (perfect code):
$$P(X = 0) = e^{-\mu}$$
Essentially impossible! This illustrates why code review and testing are essential.
Probability of more than 20 bugs (needs major refactoring):
$$P(X > 20) = 1 - \sum_{k=0}^{20} \frac{\mu^{k} e^{-\mu}}{k!}$$
About an 8% chance, which is not negligible for critical software.
Relationship to Other Distributions
| Distribution | Relationship |
|---|---|
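| Binomial | Poisson is the limit as $n \to \infty$, $p \to 0$, $np \to \lambda$ |
| Exponential | Inter-arrival times of a Poisson process are $\text{Exp}(\lambda)$ |
| Gamma | Sum of $n$ i.i.d. $\text{Exp}(\lambda)$ is $\text{Gamma}(n, \lambda)$ |
| Normal | Limit as $\lambda \to \infty$ (CLT) |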
| Negative Binomial | Alternative for overdispersed count data |
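The exponential link in the table can be illustrated by building a Poisson process from its inter-arrival times: draw exponential gaps, accumulate them into arrival times, and count arrivals per unit interval. This is a minimal sketch; the rate of 4, the seed, and the 20,000-interval horizon are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
lam = 4.0            # illustrative rate: events per unit interval
horizon = 20_000     # number of unit intervals to simulate

# Draw about 20% more gaps than expected so the arrivals cover the whole horizon
n_gaps = int(1.2 * lam * horizon)
gaps = rng.exponential(scale=1.0 / lam, size=n_gaps)   # Exp(lam) has mean 1/lam
arrival_times = np.cumsum(gaps)
assert arrival_times[-1] > horizon, "draw more gaps to cover the horizon"

# Keep arrivals inside the horizon and count them per unit interval
arrival_times = arrival_times[arrival_times < horizon]
counts = np.bincount(arrival_times.astype(int), minlength=horizon)

count_mean = float(counts.mean())
count_ratio = float(counts.var(ddof=1) / counts.mean())

print(f"mean count per interval: {count_mean:.3f} (rate = {lam})")
print(f"variance / mean:         {count_ratio:.3f}")
```

The per-interval counts behave like Poisson draws with the chosen rate, and the arrival time of the n-th event is a sum of n exponential gaps, which is the Gamma relationship listed in the table.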
Python Implementation
```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

np.random.seed(42)  # make the random samples reproducible

# Poisson distribution: lambda = 5
lam = 5
print(f"Poisson(λ={lam})")
print(f"Mean: {stats.poisson.mean(lam):.1f}")
print(f"Variance: {stats.poisson.var(lam):.2f}")
print(f"Std Dev: {stats.poisson.std(lam):.2f}")
print(f"P(X=5): {stats.poisson.pmf(5, lam):.4f}")
print(f"P(X<=5): {stats.poisson.cdf(5, lam):.4f}")

# Compare different lambda values
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Left panel: PMF for different lambdas
x_vals = np.arange(0, 25)
for lam_val, color in [(2, '#ef4444'), (5, '#6366f1'), (10, '#22c55e')]:
    pmf_vals = stats.poisson.pmf(x_vals, lam_val)
    axes[0].bar(x_vals, pmf_vals, alpha=0.6, label=f'λ={lam_val}', color=color)
axes[0].set_xlabel('k')
axes[0].set_ylabel('P(X=k)')
axes[0].set_title('Poisson Distribution for Different λ')
axes[0].legend()

# Right panel: Mean = Variance check on simulated samples
lambdas = [1, 3, 5, 10, 20, 50]
means = []
variances = []
for rate in lambdas:
    samples = np.random.poisson(rate, 10000)
    means.append(np.mean(samples))
    variances.append(np.var(samples, ddof=1))  # unbiased sample variance

axes[1].scatter(means, variances, s=100, c='steelblue', edgecolors='black')
axes[1].plot([0, 55], [0, 55], 'r--', lw=2, label='y=x (Mean=Var)')
axes[1].set_xlabel('Sample Mean')
axes[1].set_ylabel('Sample Variance')
axes[1].set_title('Poisson: Mean = Variance')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('poisson_distribution.png', dpi=150)
plt.show()
```
Key Takeaways
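- Models the count of events in a fixed interval at a constant rate
- PMF: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
- Mean = Variance = $\lambda$ (equidispersion), a quick diagnostic check
- Derived as the limit of $\text{Bin}(n, p)$ as $n \to \infty$, $p \to 0$, $np = \lambda$
- Normal approximation valid for large $\lambda$
- Always positively skewed with skewness $1/\sqrt{\lambda}$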
"The Poisson distribution is the true distribution of the rare." — L.J. Bortkiewicz