Spatial Statistics
Advanced Statistical Methods
Analyzing Data Where Location Matters
Spatial statistics accounts for geographic proximity and autocorrelation in data, using tools like Moran's I, variograms, and kriging to model spatial dependencies that standard methods ignore.
- Epidemiology — Map disease clusters and identify environmental risk factors by location
- Precision agriculture — Optimize fertilizer application using spatially-varying soil measurements
- Real estate — Model housing prices with spatial lag and spatial error models
Spatial statistics ensure your analysis respects the first law of geography: everything is related to nearby things.
Spatial data arise whenever observations are indexed by geographical location. Unlike classical statistics, spatial observations are typically not independent: nearby locations tend to exhibit similar values—a phenomenon known as spatial autocorrelation. Ignoring this structure leads to invalid standard errors, inflated Type I error rates, and inefficient predictions. Spatial statistics provides the formal apparatus for modeling, estimating, and predicting under dependence.
Spatial Autocorrelation
Spatial autocorrelation measures the degree to which nearby observations are similar (positive) or dissimilar (negative).
DfSpatial Autocorrelation
Given observations at locations , spatial autocorrelation quantifies the statistical relationship between and as a function of their separation vector .
Moran's I
Moran's I Statistic
where are spatial weights and . Under the null hypothesis of no spatial autocorrelation, , and for large , is approximately standard normal after standardization.
Geary's C
Geary's C Statistic
Values indicate positive spatial autocorrelation; indicates negative autocorrelation. Unlike Moran's I (which is correlation-like), Geary's C is based on squared differences and is more sensitive to local variability.
Spatial Weights Matrix
The spatial weights matrix encodes the neighborhood structure. Common constructions include:
- Contiguity-based: if and share a boundary (rook or queen)
- Distance-based: if for some cutoff
- K-nearest neighbors: if is among the closest to
- Inverse distance:
Rows are typically row-standardized: .
Testing Moran's I
For a dataset of soil samples with , the expected value under is . The variance of under normality is:
where , , . The z-score is , giving —strong evidence of positive spatial autocorrelation.
Variogram Estimation
The variogram is the fundamental tool for quantifying spatial continuity.
DfSemivariogram
The theoretical semivariogram is defined as:
for a second-order stationary process with .
For an intrinsically stationary process (where and ), the semivariogram exists even when the process is non-stationary in the strict sense.
Empirical Semivariogram
where is the set of pairs within a tolerance window of the lag vector .
Parametric Variogram Models
Common Variogram Models
- Spherical: for , otherwise
- Exponential:
- Gaussian:
- Matérn:
where is the nugget, is the sill, and is the range parameter. The Matérn family parameterized by nests the exponential () and Gaussian () models.
DfAnisotropy
- Geometric anisotropy: The range varies with direction. Addressed by applying a coordinate transformation (affine rotation/scaling) to achieve isotropy.
- Zonal anisotropy: The sill varies with direction while the range remains constant. Cannot be corrected by simple transformation; requires separate variogram models per direction.
- Geometric + Zonal: Combined structure, common in practice.
Kriging
Kriging is the best linear unbiased prediction (BLUP) for spatially correlated data.
Simple Kriging
Simple Kriging Predictor
where is known, is the covariance matrix, and is the covariance between the target and observed locations. The kriging variance is .
Ordinary Kriging
When the mean is unknown but assumed constant, ordinary kriging jointly estimates and :
Ordinary Kriging System
The weights satisfy (unbiasedness), and the kriging variance is .
Universal Kriging
When the mean is modeled as a spatial trend , universal kriging accounts for this drift:
Universal Kriging
where is the generalized least squares estimator and is the residual covariance matrix.
Spatial Regression
Spatial Lag Model
Spatial Error Model
Omitted Variable Bias
Ignoring spatial dependence in the error terms leads to underestimation of standard errors by a factor of approximately , inflating t-statistics and increasing false positive rates. Lagrange multiplier tests (LM-lag and LM-error) help diagnose the appropriate model specification.
Point Process Models
For data recorded as event locations in a study region , point process models characterize the stochastic mechanism generating these events.
DfPoisson Point Process
A homogeneous Poisson point process on with intensity satisfies:
- For any Borel set ,
- For disjoint , and are independent
The inhomogeneous extension allows to vary over space, with .
Quadrat Analysis
Quadrat Analysis
Divide into quadrats of area . Under complete spatial randomness (CSR), for each quadrat . The variance-to-mean ratio (VMR) is:
VMR = 1 indicates CSR; VMR > 1 suggests clustering; VMR < 1 suggests regularity. The chi-squared statistic tests against the Poisson assumption.
Python Implementation
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt
np.random.seed(42)
# --- Generate spatial data with autocorrelation ---
n = 200
coords = np.column_stack([
np.random.uniform(0, 10, n),
np.random.uniform(0, 10, n)
])
# True surface with spatial trend
def true_surface(x, y):
return 3 * np.sin(x / 3) + 2 * np.cos(y / 4) + np.random.normal(0, 0.3, len(x))
z = true_surface(coords[:, 0], coords[:, 1])
# --- Spatial weights matrix (KNN) ---
k = 8
nn = NearestNeighbors(n_neighbors=k + 1).fit(coords)
distances, indices = nn.kneighbors(coords)
W = np.zeros((n, n))
for i in range(n):
for j_idx in range(1, k + 1):
j = indices[i, j_idx]
W[i, j] = 1.0 / max(distances[i, j_idx], 1e-10)
# Row-standardize
row_sums = W.sum(axis=1, keepdims=True)
W = W / row_sums
# --- Moran's I ---
z_centered = z - z.mean()
numerator = n * z_centered @ W @ z_centered
denominator = W.sum() * (z_centered ** 2).sum()
morans_i = numerator / denominator
# Expected value and variance under H0
S0 = W.sum()
S1 = 0.5 * ((W + W.T) ** 2).sum()
S2 = ((W.sum(axis=1) + W.sum(axis=0)) ** 2).sum()
EI = -1.0 / (n - 1)
VI = ((n**2 - 3*n + 3)*S1 - n*S2 + 3*S0) / ((n**2 - 1)*S0) - 1.0/(n-1)**2
z_moran = (morans_i - EI) / np.sqrt(VI)
print(f"Moran's I: {morans_i:.4f}")
print(f"Expected I: {EI:.4f}")
print(f"z-score: {z_moran:.4f}")
print(f"p-value: {2 * (1 - stats.norm.cdf(abs(z_moran))):.6f}")
# --- Empirical Semivariogram ---
def empirical_variogram(coords, z, n_bins=15, max_dist=5.0):
from scipy.spatial.distance import pdist, squareform
D = squareform(pdist(coords))
pairs = np.triu_indices(len(z), k=1)
dists = D[pairs]
diffs_sq = (z[pairs[0]] - z[pairs[1]]) ** 2
bins = np.linspace(0, max_dist, n_bins + 1)
bin_centers = 0.5 * (bins[:-1] + bins[1:])
gamma = np.zeros(n_bins)
for i in range(n_bins):
mask = (dists >= bins[i]) & (dists < bins[i + 1])
if mask.sum() > 0:
gamma[i] = 0.5 * diffs_sq[mask].mean()
else:
gamma[i] = np.nan
return bin_centers, gamma
h, gamma_hat = empirical_variogram(coords, z)
valid = ~np.isnan(gamma_hat)
# Fit exponential variogram via least squares
def exponential_variogram(h, nugget, sill, range_param):
return nugget + sill * (1 - np.exp(-3 * h / range_param))
from scipy.optimize import curve_fit
popt, pcov = curve_fit(
exponential_variogram, h[valid], gamma_hat[valid],
p0=[0.1, 2.0, 2.0], maxfev=10000
)
print(f"\nFitted variogram: nugget={popt[0]:.3f}, sill={popt[1]:.3f}, range={popt[2]:.3f}")
# --- Simple Kriging prediction at a new point ---
def sk_predict(coords, z, new_point, cov_func, mu, **kwargs):
from scipy.spatial.distance import cdist
c_0 = cov_func(0, **kwargs)
C = cov_func(cdist(coords, coords), **kwargs)
c = cov_func(cdist(new_point, coords), **kwargs)
C_inv = np.linalg.inv(C + 1e-8 * np.eye(len(C)))
weights = C_inv @ c
pred = mu + weights @ (z - mu)
variance = c_0 - c @ C_inv @ c
return pred, variance
def cov_exponential(dist, sill, range_param):
return sill * np.exp(-3 * dist / range_param)
new_pt = np.array([[5.0, 5.0]])
mu_hat = z.mean()
pred, var = sk_predict(coords, z, new_pt, cov_exponential, mu_hat,
sill=popt[1], range_param=popt[2])
print(f"\nSimple Kriging prediction at (5,5): {pred[0]:.3f} ± {np.sqrt(max(var, 0)):.3f}")
Summary
Key Takeaways: Spatial Statistics
- Spatial autocorrelation invalidates classical independence assumptions; test with Moran's I or Geary's C before proceeding.
- The semivariogram quantifies spatial continuity: nugget, sill, and range are the three key parameters. The Matérn family provides flexible, nested modeling.
- Kriging is the BLUP under Gaussian assumptions: simple kriging (known mean), ordinary kriging (constant unknown mean), and universal kriging (trend + residual).
- Spatial regression models (lag and error) account for spatial dependence in outcome or residuals; omitting it inflates Type I error.
- Point process models and quadrat analysis characterize event locations; the Poisson process is the null model for complete spatial randomness.