Negative Binomial Distribution
Waiting for the r-th Success — Overdispersed Counts
The negative binomial distribution generalizes the geometric: instead of waiting for the first success, we wait for the $r$-th success. It is the go-to model for overdispersed count data.
- Insurance — claims per policyholder (variance > mean)
- Epidemiology — disease cases per region (heterogeneous rates)
- Ecology — species counts per quadrat (aggregated populations)
- Transportation — passengers per bus (bursty arrivals)
When the Poisson's mean-equals-variance assumption fails, the negative binomial saves the day.
Core Concepts
The negative binomial distribution generalizes the geometric distribution: instead of waiting for the first success, we wait for the $r$-th success. It arises naturally as a sum of $r$ independent geometric random variables and serves as a flexible model for overdispersed count data.
Definition: Negative Binomial Distribution (Number of Failures)
A random variable $X$ follows a negative binomial distribution with parameters $(r, p)$, written $X \sim \text{NB}(r, p)$, if its PMF is:
$$P(X = k) = \binom{k + r - 1}{k} p^r (1 - p)^k, \quad k = 0, 1, 2, \ldots$$
Here $X$ counts the number of failures before the $r$-th success, with success probability $p$ per trial.
Alternative Parametrization
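Some authors use $Y = X + r$ to count the total number of trials (successes + failures) needed to achieve $r$ successes. Under that convention, $Y$ has PMF
$$P(Y = n) = \binom{n - 1}{r - 1} p^r (1 - p)^{n - r} \quad \text{for } n = r, r + 1, \ldots$$
Be careful to check which convention a textbook uses.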
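The two conventions describe the same event, so their PMFs should agree once the support is shifted by $r$. The sketch below checks this for integer $r$; the helper names nb_pmf_failures and nb_pmf_trials are illustrative, not from any library. To my knowledge scipy.stats.nbinom uses the number-of-failures convention, which the third column is meant to confirm.

```python
from math import comb

from scipy import stats


def nb_pmf_failures(k, r, p):
    """P(X = k): k failures before the r-th success (integer r)."""
    return comb(k + r - 1, k) * p**r * (1 - p) ** k


def nb_pmf_trials(n, r, p):
    """P(Y = n): n total trials needed to reach r successes (integer r)."""
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


r, p = 3, 0.4
for k in range(5):
    failures = nb_pmf_failures(k, r, p)
    trials = nb_pmf_trials(k + r, r, p)  # same event, counted as total trials
    scipy_value = stats.nbinom.pmf(k, r, p)
    print(f"k={k}: failures={failures:.6f}  trials={trials:.6f}  scipy={scipy_value:.6f}")
```

All three columns should match row by row; a mismatch usually means the two conventions have been mixed.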
PMF Derivation
Why This PMF Is Correct
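To observe exactly $k$ failures before the $r$-th success, two conditions must hold:
- The last trial must be a success (the $r$-th success). This contributes probability $p$.
- Among the first $k + r - 1$ trials, there must be exactly $r - 1$ successes and $k$ failures. The number of ways to arrange these is $\binom{k + r - 1}{k}$, and each arrangement has probability $p^{r-1}(1 - p)^k$.
Combining:
$$P(X = k) = \binom{k + r - 1}{k} p^{r-1} (1 - p)^k \cdot p = \binom{k + r - 1}{k} p^r (1 - p)^k$$
Verification via the negative binomial series, writing $q = 1 - p$:
$$\sum_{k=0}^{\infty} \binom{k + r - 1}{k} p^r q^k = p^r (1 - q)^{-r} = p^r \cdot p^{-r} = 1$$
This uses the negative binomial expansion $(1 - q)^{-r} = \sum_{k=0}^{\infty} \binom{k + r - 1}{k} q^k$, valid for $|q| < 1$.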
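The normalization argument can also be checked numerically. The sketch below, assuming integer $r$ and using the hypothetical helper partial_sum, adds up PMF terms and shows the partial sums approaching 1 as more terms are included.

```python
from math import comb


def partial_sum(r, p, k_max):
    """Sum of the NB(r, p) PMF over k = 0, ..., k_max (integer r)."""
    q = 1 - p
    return sum(comb(k + r - 1, k) * p**r * q**k for k in range(k_max + 1))


r, p = 3, 0.4
for k_max in (5, 20, 100):
    print(f"k_max={k_max:>3}: partial sum = {partial_sum(r, p, k_max):.10f}")
```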
Mean and Variance
Negative Binomial Mean and Variance
$$E[X] = \frac{rq}{p}, \qquad \text{Var}(X) = \frac{rq}{p^2}$$
Here,
- $r$ = Target number of successes
- $p$ = Success probability
- $q = 1 - p$ = Failure probability
Derivation
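$X = G_1 + G_2 + \cdots + G_r$ is the sum of $r$ independent geometric random variables $G_i$, each counting failures before one success, with $E[G_i] = q/p$ and $\text{Var}(G_i) = q/p^2$. By linearity:
$$E[X] = \sum_{i=1}^{r} E[G_i] = \frac{rq}{p}$$
By independence:
$$\text{Var}(X) = \sum_{i=1}^{r} \text{Var}(G_i) = \frac{rq}{p^2}$$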
Variance-to-Mean Ratio
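Writing $\mu = E[X] = rq/p$, the mean-variance relationship is:
$$\text{Var}(X) = \mu + \frac{\mu^2}{r}, \qquad \frac{\text{Var}(X)}{E[X]} = \frac{1}{p} = 1 + \frac{\mu}{r}$$
This shows $\text{Var}(X) > E[X]$ (overdispersion) for all finite $r$. As $r \to \infty$ with $\mu$ fixed, $\text{Var}(X) \to \mu$, recovering the Poisson limit.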
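To see the Poisson limit concretely, the sketch below holds the mean fixed and increases $r$, setting $p = r/(r + \mu)$ so that the mean stays at $\mu$. The helper nb_vs_poisson is illustrative; it returns the variance-to-mean ratio and the total variation distance to Poisson($\mu$), truncated at k_max (an assumption that is harmless when the tail mass beyond k_max is negligible).

```python
import numpy as np
from scipy import stats


def nb_vs_poisson(r, mu, k_max=200):
    """Return (Var/Mean, total variation distance to Poisson(mu)) for NB with mean mu."""
    p = r / (r + mu)  # keeps the NB mean equal to mu
    mean, var = stats.nbinom.stats(r, p, moments="mv")
    ks = np.arange(k_max + 1)
    gap = np.abs(stats.nbinom.pmf(ks, r, p) - stats.poisson.pmf(ks, mu))
    return float(var) / float(mean), 0.5 * float(gap.sum())


mu = 5.0
for r in (1, 5, 50, 5000):
    ratio, tv = nb_vs_poisson(r, mu)
    print(f"r={r:>5}: Var/Mean={ratio:.4f}  TV distance to Poisson={tv:.5f}")
```

The ratio should shrink toward 1 and the distance toward 0 as $r$ grows.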
Cumulant Generating Function
Moment Generating Function
$$M_X(t) = \left(\frac{p}{1 - q e^t}\right)^r, \qquad t < -\ln q$$
Here,
- $r$ = Number of successes
- $p$ = Success probability
- $q = 1 - p$
Derivation
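Since $X = \sum_{i=1}^{r} G_i$ with $G_i$ iid, and $M_{G_i}(t) = \dfrac{p}{1 - q e^t}$:
$$M_X(t) = \left[M_{G_1}(t)\right]^r = \left(\frac{p}{1 - q e^t}\right)^r$$
The cumulant generating function is:
$$K_X(t) = \ln M_X(t) = r\left[\ln p - \ln\left(1 - q e^t\right)\right]$$
yielding cumulants:
$$\kappa_1 = K_X'(0) = \frac{rq}{p}, \qquad \kappa_2 = K_X''(0) = \frac{rq}{p^2}, \qquad \kappa_3 = K_X'''(0) = \frac{rq(1 + q)}{p^3}$$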
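A quick numerical check of the generating functions, as a sketch under stated assumptions: the MGF of the failures-count negative binomial is taken to be $(p / (1 - q e^t))^r$ for $t < -\ln q$, and the first two cumulants are taken to be the mean $rq/p$ and variance $rq/p^2$. The helpers nb_mgf and nb_cgf are illustrative names. The code compares the closed form against a direct sum of $e^{tk} P(X = k)$ and recovers the cumulants by finite differences of the CGF at 0.

```python
import numpy as np
from scipy import stats


def nb_mgf(t, r, p):
    """Closed-form MGF of NB(r, p); finite only for t < -ln(q)."""
    q = 1.0 - p
    if t >= -np.log(q):
        raise ValueError("MGF is finite only for t < -ln(q)")
    return (p / (1.0 - q * np.exp(t))) ** r


def nb_cgf(t, r, p):
    """Cumulant generating function K(t) = ln M(t)."""
    return np.log(nb_mgf(t, r, p))


r, p, t = 6, 0.4, 0.2
ks = np.arange(500)  # truncation; the terms decay geometrically for this t
mgf_direct = float(np.sum(np.exp(t * ks) * stats.nbinom.pmf(ks, r, p)))
print(f"MGF closed form: {nb_mgf(t, r, p):.8f}, direct sum: {mgf_direct:.8f}")

# Central finite differences of K at 0 approximate the first two cumulants
h = 1e-4
kappa1 = (nb_cgf(h, r, p) - nb_cgf(-h, r, p)) / (2 * h)
kappa2 = (nb_cgf(h, r, p) - 2 * nb_cgf(0.0, r, p) + nb_cgf(-h, r, p)) / h**2
print(f"kappa1 ~ {kappa1:.5f} (mean rq/p = {r * (1 - p) / p:.5f})")
print(f"kappa2 ~ {kappa2:.5f} (variance rq/p^2 = {r * (1 - p) / p**2:.5f})")
```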
The Negative Binomial as a Poisson-Gamma Mixture
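Theorem: Gamma-Poisson Mixture Representation
If $X \mid \Lambda \sim \text{Poisson}(\Lambda)$ and $\Lambda \sim \text{Gamma}(r, \beta)$ (shape $r$, rate $\beta$), then the marginal distribution of $X$ is $\text{NB}\left(r, \frac{\beta}{1 + \beta}\right)$.
Proof
$$P(X = k) = \int_0^\infty \frac{e^{-\lambda} \lambda^k}{k!} \cdot \frac{\beta^r}{\Gamma(r)} \lambda^{r-1} e^{-\beta\lambda} \, d\lambda = \frac{\beta^r}{k!\,\Gamma(r)} \cdot \frac{\Gamma(k + r)}{(1 + \beta)^{k + r}} = \frac{\Gamma(k + r)}{k!\,\Gamma(r)} \left(\frac{\beta}{1 + \beta}\right)^r \left(\frac{1}{1 + \beta}\right)^k$$
Setting $p = \beta/(1 + \beta)$ (so $q = 1/(1 + \beta)$) gives the $\text{NB}(r, p)$ PMF.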
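The mixture integral can be evaluated numerically and compared with the negative binomial PMF. The sketch below assumes the shape/rate Gamma parametrization (so SciPy's scale is 1/rate) and the mapping p = rate / (1 + rate); the helper mixture_pmf is an illustrative name. A non-integer shape is used on purpose, since the mixture view does not require $r$ to be an integer.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad


def mixture_pmf(k, r, beta):
    """P(X = k) when X | lam ~ Poisson(lam) and lam ~ Gamma(shape=r, rate=beta)."""

    def integrand(lam):
        return stats.poisson.pmf(k, lam) * stats.gamma.pdf(lam, a=r, scale=1.0 / beta)

    value, _abs_err = quad(integrand, 0, np.inf)
    return value


r, beta = 2.5, 0.8
p = beta / (1 + beta)  # success probability implied by the Gamma rate
for k in range(6):
    mixed = mixture_pmf(k, r, beta)
    direct = stats.nbinom.pmf(k, r, p)
    print(f"k={k}: mixture={mixed:.6f}  NB pmf={direct:.6f}")
```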
Why This Matters
This representation explains overdispersion in count data. If counts are Poisson but the rate $\Lambda$ varies across observations (heterogeneity), the marginal distribution becomes negative binomial. The parameter $r$ controls the degree of heterogeneity: smaller $r$ means more variation in $\Lambda$, hence more overdispersion.
Overdispersion in Practice
Poisson vs Negative Binomial
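For Poisson data, $\text{Var}(X) = E[X]$. When the empirical variance substantially exceeds the mean, the Poisson model is inadequate and the negative binomial provides a better fit.
Rule of thumb: If $s^2 / \bar{x} \gg 1$, consider the negative binomial.
Estimation: Given data $x_1, \ldots, x_n$ with $s^2 > \bar{x}$, the method of moments estimators are:
$$\hat{p} = \frac{\bar{x}}{s^2}, \qquad \hat{r} = \frac{\bar{x}\,\hat{p}}{1 - \hat{p}} = \frac{\bar{x}^2}{s^2 - \bar{x}}$$
where $\bar{x}$ is the sample mean and $s^2$ is the sample variance.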
Worked Example: Call Center Modeling
Example: Customer Calls Per Hour
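A call center receives an average of $\bar{x} = 8$ calls per hour, but the variance is $s^2 = 18$ (overdispersed). We model calls $X \sim \text{NB}(r, p)$.
Method of moments:
$$\hat{p} = \frac{\bar{x}}{s^2} = \frac{8}{18} = \frac{4}{9} \approx 0.444, \qquad \hat{r} = \frac{\bar{x}^2}{s^2 - \bar{x}} = \frac{64}{10} = 6.4$$
Verification: $E[X] = \hat{r}(1 - \hat{p})/\hat{p} = 6.4 \times \frac{5/9}{4/9} = 8$ ✓
$\text{Var}(X) = E[X]/\hat{p} = 8 \times \frac{9}{4} = 18$ ✓
Compare with Poisson: A Poisson(8) model would predict $\text{Var}(X) = 8$, severely underestimating the true variability of 18. This has direct consequences for staffing: the negative binomial predicts more extreme fluctuations (busy and idle periods).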
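The staffing consequence can be made concrete by comparing tail probabilities under the two models. This is a sketch using the mean 8 and variance 18 from the example; the thresholds 12, 16, and 20 calls per hour are arbitrary illustrative choices, not values from the example.

```python
from scipy import stats

mean, var = 8.0, 18.0

# Method of moments fit, as in the worked example
p_hat = mean / var
r_hat = mean**2 / (var - mean)

nb_model = stats.nbinom(r_hat, p_hat)
poisson_model = stats.poisson(mean)

print(f"r_hat = {r_hat:.3f}, p_hat = {p_hat:.4f}")
print(f"NB mean = {nb_model.mean():.3f}, NB variance = {nb_model.var():.3f}")

# sf(c) = P(X > c): chance that an hour brings more than c calls
for threshold in (12, 16, 20):
    print(
        f"P(X > {threshold}): Poisson = {poisson_model.sf(threshold):.5f}, "
        f"NB = {nb_model.sf(threshold):.5f}"
    )
```

If the negative binomial tail probabilities come out noticeably larger, a staffing plan sized from the Poisson model would be overwhelmed more often than it predicts.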
Python Implementation
import numpy as np
from scipy import stats
np.random.seed(42)
# Negative binomial parameters
r, p = 6, 0.4
n = 10000
# Simulate
samples = np.random.negative_binomial(r, p, size=n)
# Verify mean and variance
mean_theory = r * (1 - p) / p
var_theory = r * (1 - p) / p**2
print(f"NB(r={r}, p={p}):")
print(f" Theoretical mean: {mean_theory:.4f}, variance: {var_theory:.4f}")
print(f" Empirical mean: {np.mean(samples):.4f}, variance: {np.var(samples, ddof=0):.4f}")
# Show relationship to geometric sum
geom_samples = np.random.geometric(p, size=(n, r))
geom_sum = geom_samples.sum(axis=1) - r # convert from "trials" to "failures"
print(f"\n Sum of {r} Geometric(p={p}): mean={np.mean(geom_sum):.4f}, var={np.var(geom_sum, ddof=0):.4f}")
print(f" (Should match NB values above)")
# Show Poisson-Gamma mixture
lam = np.random.gamma(shape=r, scale=(1-p)/p, size=n)
poisson_samples = np.random.poisson(lam)
print(f"\n Poisson-Gamma mixture: mean={np.mean(poisson_samples):.4f}, var={np.var(poisson_samples, ddof=0):.4f}")
Python Implementation: Overdispersion Detection
import numpy as np
from scipy import stats
np.random.seed(42)
# Generate overdispersed count data (NB instead of Poisson)
true_mu = 5
true_alpha = 2.0 # dispersion parameter: r = 1/alpha
r = 1 / true_alpha
p = r / (r + true_mu)
n = 500
data = np.random.negative_binomial(r, p, size=n)
sample_mean = np.mean(data)
sample_var = np.var(data, ddof=1)
dispersion_ratio = sample_var / sample_mean
print(f"Overdispersion Test")
print(f" Sample mean: {sample_mean:.4f}")
print(f" Sample var: {sample_var:.4f}")
print(f" Var/Mean: {dispersion_ratio:.4f}")
print(f" (Var/Mean > 1 suggests overdispersion; Poisson requires Var/Mean ≈ 1)")
# Method of moments estimates for NB
p_hat = sample_mean / sample_var
r_hat = sample_mean * p_hat / (1 - p_hat)
print(f"\n Method of moments estimates:")
print(f" p̂ = {p_hat:.4f}, r̂ = {r_hat:.4f}")
print(f" Estimated mean: {r_hat*(1-p_hat)/p_hat:.4f}")
print(f" Estimated var: {r_hat*(1-p_hat)/p_hat**2:.4f}")
Key Takeaways
Summary: Negative Binomial Distribution
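- Counts failures before the $r$-th success: $P(X = k) = \binom{k + r - 1}{k} p^r q^k$
- Mean: $rq/p$; Variance: $rq/p^2$
- Sum of iid geometrics: $X = \sum_{i=1}^{r} G_i$ with $G_i \sim \text{Geom}(p)$ counting failures
- When $r = 1$, reduces to the geometric distribution
- Poisson-Gamma mixture: arises from Poisson with Gamma-distributed rate
- Models overdispersed count data where variance exceeds mean: $\text{Var}(X) > E[X]$
- As $r \to \infty$ with $\mu$ fixed, converges to Poisson (law of large numbers for Gamma)
- Variance-to-mean ratio: $1/p = 1 + \mu/r$, controlled by $r$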