Point Estimation — Estimating Population Parameters
Foundations of Statistics
The Art of Single-Number Guesswork
Point estimation provides the best single guess for unknown population parameters, forming the basis for all statistical inference. Understanding estimator properties ensures your estimates are trustworthy and meaningful.
- Survey Research — Producing point estimates of population characteristics from samples
- Finance — Estimating expected returns and volatility from historical data
- Manufacturing — Calculating process parameters for quality control
Good estimation is the foundation of good statistical practice.
What Is Point Estimation?
Definition: Point Estimation
A point estimator $\hat{\theta} = T(X_1, \dots, X_n)$ is a function of the data that produces a single value as a guess for an unknown population parameter $\theta$. The goal is to find estimators with desirable properties: they should be close to the true value on average, have low variability, and converge to the truth as data accumulates.
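The properties listed above (closeness on average and low variability) can be checked by simulation. The following is a minimal sketch, assuming an exponential population with mean 4 and two illustrative candidate estimators of that mean; the population, sample size, and the choice of the sample median as a comparison are assumptions made for the example, not part of the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 4.0          # assumed "unknown" parameter, fixed here so we can measure error
n, reps = 30, 20_000     # sample size and number of simulated samples

# Each row is one simulated sample of size n from an exponential population
samples = rng.exponential(scale=true_mean, size=(reps, n))

# Two candidate point estimators of the population mean
est_mean = samples.mean(axis=1)           # sample mean
est_median = np.median(samples, axis=1)   # sample median (targets the median, not the mean)

for name, est in [("mean", est_mean), ("median", est_median)]:
    bias = est.mean() - true_mean         # average error across repeated samples
    variance = est.var()                  # spread of the estimator across repeated samples
    print(f"{name:>6}: bias = {bias:+.3f}, variance = {variance:.3f}")
```

In this setup the sample mean should show a bias near zero, while the sample median is systematically too low because the exponential distribution is skewed.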
The Method of Moments
Theorem: Method of Moments (MoM)
Set the first $k$ sample moments equal to the first $k$ population moments and solve for the $k$ unknown parameters. That is, solve:
$$\frac{1}{n}\sum_{i=1}^{n} X_i^j = E[X^j] \quad \text{for } j = 1, \dots, k.$$
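Worked example: Exponential distribution. Let $X_1, \dots, X_n \sim \text{Exp}(\lambda)$. We have $E[X] = 1/\lambda$ and first sample moment $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Setting $\bar{X} = 1/\lambda$ gives $\hat{\lambda}_{MoM} = 1/\bar{X}$.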
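Worked example: Normal distribution. For $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, match $E[X] = \mu$ and $E[X^2] = \mu^2 + \sigma^2$:
$$\hat{\mu}_{MoM} = \bar{X}, \qquad \hat{\sigma}^2_{MoM} = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2.$$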
MoM vs MLE
MoM is simpler (just solving equations) but generally less efficient than MLE. MoM does not require specifying the full likelihood, only the first few moments. MoM estimates are also useful as starting values for iterative MLE algorithms.
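To illustrate the starting-value idea, here is a minimal sketch that assumes a Gamma model, where the MLE has no closed form. The Gamma distribution, the simulated data, and the helper name `neg_log_lik` are assumptions made for this example. The MoM estimates come from matching the mean (shape × scale) and the variance (shape × scale²), and they seed a numerical optimizer.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
true_shape, true_scale = 3.0, 2.0            # assumed values used to simulate data
x = rng.gamma(true_shape, true_scale, size=500)

# Method of moments for the Gamma: mean = shape*scale, variance = shape*scale**2
m, v = x.mean(), x.var()
mom_shape, mom_scale = m**2 / v, v / m

def neg_log_lik(log_params):
    """Negative Gamma log-likelihood; parameters are passed on the log scale to stay positive."""
    shape, scale = np.exp(log_params)
    return -np.sum(stats.gamma.logpdf(x, a=shape, scale=scale))

# Start the optimizer at the MoM estimates
start = np.log([mom_shape, mom_scale])
result = optimize.minimize(neg_log_lik, x0=start, method="Nelder-Mead")
mle_shape, mle_scale = np.exp(result.x)

print(f"MoM: shape = {mom_shape:.3f}, scale = {mom_scale:.3f}")
print(f"MLE: shape = {mle_shape:.3f}, scale = {mle_scale:.3f}")
```

Because the MoM estimates are already consistent, the optimizer starts near the optimum and typically needs few iterations.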
Maximum Likelihood Estimation
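Maximum Likelihood Estimator
$$\hat{\theta}_{MLE} = \arg\max_{\theta} L(\theta), \qquad L(\theta) = \prod_{i=1}^{n} f(X_i; \theta)$$
Here,
- $L(\theta)$ = Likelihood function
- $f(X_i; \theta)$ = Probability density (or mass) function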
Equivalently, maximize the log-likelihood:
Log-Likelihood
$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(X_i; \theta)$$
Here,
- $\ell(\theta)$ = Log-likelihood function
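Theorem: MLE for the Normal Distribution
For $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, the log-likelihood is:
$$\ell(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i - \mu)^2$$
Setting $\partial\ell/\partial\mu = 0$: $\hat{\mu}_{MLE} = \bar{X}$.
Setting $\partial\ell/\partial\sigma^2 = 0$: $\hat{\sigma}^2_{MLE} = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$.
Note: $\hat{\sigma}^2_{MLE}$ is biased because it divides by $n$, not $n - 1$; its expectation is $\frac{n-1}{n}\sigma^2$.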
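The bias of the divide-by-$n$ variance estimator is easy to see in a small simulation. This is a sketch under assumed values ($\mu = 5$, $\sigma^2 = 9$, $n = 10$); the average of the divide-by-$n$ estimator over many samples should sit near $\frac{n-1}{n}\sigma^2$, while the divide-by-$(n-1)$ version should sit near $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 9.0              # assumed true variance
n, reps = 10, 50_000      # a small n makes the bias visible

samples = rng.normal(5.0, np.sqrt(sigma2), size=(reps, n))

mle_var = samples.var(axis=1, ddof=0)        # divides by n
unbiased_var = samples.var(axis=1, ddof=1)   # divides by n - 1

print(f"mean of divide-by-n estimates:     {mle_var.mean():.3f}")
print(f"theoretical (n-1)/n * sigma^2:     {(n - 1) / n * sigma2:.3f}")
print(f"mean of divide-by-(n-1) estimates: {unbiased_var.mean():.3f}")
```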
Worked Example: MLE for the Poisson Distribution
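Let $X_1, \dots, X_n \sim \text{Poisson}(\lambda)$ be i.i.d. The PMF is $P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$.
Step 1: Write the log-likelihood:
$$\ell(\lambda) = \sum_{i=1}^{n}\left(X_i \log\lambda - \lambda - \log(X_i!)\right) = \log\lambda \sum_{i=1}^{n} X_i - n\lambda - \sum_{i=1}^{n}\log(X_i!)$$
Step 2: Differentiate and set to zero:
$$\frac{d\ell}{d\lambda} = \frac{1}{\lambda}\sum_{i=1}^{n} X_i - n = 0 \implies \hat{\lambda}_{MLE} = \bar{X}$$
Step 3: Verify it's a maximum: $\frac{d^2\ell}{d\lambda^2} = -\frac{1}{\lambda^2}\sum_{i=1}^{n} X_i < 0$ (whenever at least one $X_i > 0$).
The MLE for $\lambda$ is the sample mean, the same as the MoM estimator for the Poisson.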
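The closed-form result can be cross-checked numerically. The sketch below assumes simulated Poisson counts with rate 4.2 (an arbitrary choice) and minimizes the negative log-likelihood with `scipy.optimize.minimize_scalar`; the numerical maximizer should agree with the sample mean up to optimizer tolerance.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(3)
counts = rng.poisson(lam=4.2, size=200)   # assumed rate, used only to simulate data

def neg_log_lik(lam):
    """Negative Poisson log-likelihood of the observed counts at rate lam."""
    return -np.sum(stats.poisson.logpmf(counts, mu=lam))

# Bounded search over a plausible range of rates
res = optimize.minimize_scalar(neg_log_lik, bounds=(1e-6, 50.0), method="bounded")

numeric_mle = res.x
closed_form = counts.mean()
print(f"numerical MLE = {numeric_mle:.5f}, sample mean = {closed_form:.5f}")
```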
Asymptotic Properties of MLEs
Theorem: Consistency of MLE
Under regularity conditions, the MLE is consistent: $\hat{\theta}_{MLE} \xrightarrow{p} \theta$ as $n \to \infty$.
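Theorem: Asymptotic Normality of MLE
Under regularity conditions, the MLE is asymptotically normal:
$$\sqrt{n}\,(\hat{\theta}_{MLE} - \theta) \xrightarrow{d} N\!\left(0, \frac{1}{I(\theta)}\right)$$
where $I(\theta)$ is the Fisher information per observation.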
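A simulation can make the limiting distribution concrete. The sketch below assumes an exponential model with rate $\lambda = 2.5$, for which the per-observation Fisher information is $1/\lambda^2$, so the scaled error $\sqrt{n}(\hat{\lambda} - \lambda)$ should have variance close to $\lambda^2$ and roughly 95% of its values within $\pm 1.96\lambda$. The sample size and replication count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
lam = 2.5                  # assumed true rate
n, reps = 400, 10_000

# MLE of the exponential rate for each simulated sample: 1 / sample mean
samples = rng.exponential(scale=1 / lam, size=(reps, n))
lam_hat = 1.0 / samples.mean(axis=1)

# Scaled estimation error; the theorem predicts approximately N(0, lam**2)
z = np.sqrt(n) * (lam_hat - lam)
coverage = np.mean(np.abs(z) < 1.96 * lam)

print(f"variance of scaled error = {z.var():.3f} (theory: {lam**2:.3f})")
print(f"share within ±1.96·λ     = {coverage:.3f} (theory: about 0.95)")
```

At finite $n$ the match is only approximate; the exponential-rate MLE has a small positive bias that shrinks as $n$ grows.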
Theorem: Asymptotic Efficiency of MLE
The MLE achieves the Cramér-Rao lower bound asymptotically: among all regular estimators, the MLE has the smallest possible asymptotic variance $\frac{1}{n I(\theta)}$.
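Proof sketch (Cramér-Rao): For any unbiased estimator $\hat{\theta}$, the Cauchy-Schwarz inequality applied to $\text{Cov}(\hat{\theta}, S_n(\theta))$, where $S_n(\theta) = \partial\ell(\theta)/\partial\theta$ is the score, gives $\text{Var}(\hat{\theta}) \geq \frac{1}{n I(\theta)}$. The MLE achieves equality asymptotically because $\hat{\theta}_{MLE} - \theta$ is asymptotically a linear function of the score, which is exactly the case in which Cauchy-Schwarz holds with equality.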
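The Cauchy-Schwarz step can be written out as follows. This is a sketch for i.i.d. data that assumes the usual regularity conditions (in particular, that differentiation and integration can be interchanged); $S_n(\theta)$ denotes the score, the derivative of the log-likelihood.

```latex
\begin{align*}
S_n(\theta) &= \frac{\partial}{\partial\theta}\,\ell(\theta),
  \qquad E[S_n(\theta)] = 0,
  \qquad \operatorname{Var}\big(S_n(\theta)\big) = n\,I(\theta), \\
\operatorname{Cov}\big(\hat{\theta}, S_n(\theta)\big)
  &= E\big[\hat{\theta}\,S_n(\theta)\big]
   = \frac{\partial}{\partial\theta}\,E[\hat{\theta}]
   = \frac{\partial}{\partial\theta}\,\theta = 1
   \quad \text{(unbiasedness)}, \\
1 = \operatorname{Cov}\big(\hat{\theta}, S_n(\theta)\big)^2
  &\le \operatorname{Var}(\hat{\theta})\,\operatorname{Var}\big(S_n(\theta)\big)
   = \operatorname{Var}(\hat{\theta})\; n\,I(\theta), \\
\Longrightarrow \quad \operatorname{Var}(\hat{\theta}) &\ge \frac{1}{n\,I(\theta)}.
\end{align*}
```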
Fisher Information
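Fisher Information
$$I_n(\theta) = E\!\left[\left(\frac{\partial \ell(\theta)}{\partial\theta}\right)^{2}\right] = -E\!\left[\frac{\partial^2 \ell(\theta)}{\partial\theta^2}\right] = n\,I(\theta)$$
Here,
- $I_n(\theta)$ = Total Fisher information for sample of size $n$
- $I(\theta)$ = Fisher information per observation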
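Example: For $X_1, \dots, X_n \sim N(\mu, \sigma^2)$ with $\sigma^2$ known:
$$\log f(x; \mu) = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{(x - \mu)^2}{2\sigma^2}, \qquad \frac{\partial^2}{\partial\mu^2}\log f(x; \mu) = -\frac{1}{\sigma^2}$$
So $I(\mu) = 1/\sigma^2$ and the Cramér-Rao bound gives $\text{Var}(\hat{\mu}) \geq \sigma^2/n$. Since $\text{Var}(\bar{X}) = \sigma^2/n$, the sample mean achieves the bound, so it is the MVUE for $\mu$.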
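The normal example can be checked numerically. In the sketch below (assumed values $\mu = 5$, $\sigma = 3$, $n = 50$), the per-observation Fisher information is estimated as the variance of the score $(x - \mu)/\sigma^2$, and the resulting Cramér-Rao bound is compared with the simulated variance of the sample mean.

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma = 5.0, 3.0       # assumed true parameters (sigma treated as known)
n, reps = 50, 20_000

# Fisher information per observation = variance of the score d/dmu log f(x; mu)
x = rng.normal(mu, sigma, size=200_000)
score = (x - mu) / sigma**2
fisher_per_obs = score.var()

# Simulated variance of the sample mean across many samples of size n
means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
var_of_mean = means.var()
cr_bound = 1.0 / (n * fisher_per_obs)

print(f"estimated I(mu)        = {fisher_per_obs:.4f} (theory: {1 / sigma**2:.4f})")
print(f"Cramér-Rao bound       = {cr_bound:.4f}")
print(f"variance of sample mean = {var_of_mean:.4f} (theory: {sigma**2 / n:.4f})")
```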
Python Implementation: Comparing MoM and MLE
import numpy as np
from scipy import stats
np.random.seed(42)
n = 50
# --- Exponential distribution ---
true_lambda = 2.5
data = np.random.exponential(1/true_lambda, size=n)
# MoM estimator
mom_lambda = 1 / np.mean(data)
# MLE (same form for exponential)
mle_lambda = 1 / np.mean(data) # MoM = MLE for exponential
print(f"Exponential: true λ = {true_lambda}")
print(f" MoM = MLE = {mle_lambda:.4f}")
# --- Normal distribution ---
true_mu, true_sigma = 5.0, 3.0
data_normal = np.random.normal(true_mu, true_sigma, size=n)
# MoM
mom_mu = np.mean(data_normal)
mom_sigma2 = np.mean(data_normal**2) - mom_mu**2
# MLE
mle_mu = np.mean(data_normal)
mle_sigma2 = np.mean((data_normal - mle_mu)**2) # biased (divides by n)
# Unbiased
unbiased_sigma2 = np.var(data_normal, ddof=1) # divides by n-1
print(f"\nNormal: true μ = {true_mu}, σ² = {true_sigma**2}")
print(f" MoM μ̂ = {mom_mu:.4f}, MLE μ̂ = {mle_mu:.4f}")
print(f" MoM σ̂² = {mom_sigma2:.4f}, MLE σ̂² = {mle_sigma2:.4f}, Unbiased = {unbiased_sigma2:.4f}")
# --- Demonstrate MLE consistency ---
print(f"\nConsistency demo (MLE σ̂² vs true σ² = {true_sigma**2}):")
# The error should tend to shrink as the sample size grows
for sample_size in [10, 50, 200, 1000, 5000]:
    samples = np.random.normal(true_mu, true_sigma, size=sample_size)
    mle_var = np.mean((samples - np.mean(samples))**2)
    print(f" n={sample_size:5d}: MLE σ̂² = {mle_var:.4f} (error = {abs(mle_var - true_sigma**2):.4f})")
Key Takeaways
Summary: Point Estimation
- Method of Moments: match sample moments to population moments; simple but generally less efficient
- MLE: maximize the likelihood function; asymptotically efficient and consistent
- MLE for $\sigma^2$ divides by $n$ (biased); divide by $n - 1$ for an unbiased estimator
- Fisher information $I(\theta)$ quantifies how much the data tell us about $\theta$
- The Cramér-Rao bound sets the minimum variance for unbiased estimators: $\text{Var}(\hat{\theta}) \geq \frac{1}{n I(\theta)}$
- MLE achieves this bound asymptotically, making it efficient among regular estimators in large samples