🎉 75% of content is free forever — Unlock Premium from $10/mo →
CW
Search courses…
💼 Servicesℹ️ About✉️ ContactView Pricing Plansfrom $10

Spatial Statistics

Advanced Statistical MethodsSpatial Methods🟢 Free Lesson

Advertisement

Spatial Statistics

Advanced Statistical Methods

Analyzing Data Where Location Matters

Spatial statistics accounts for geographic proximity and autocorrelation in data, using tools like Moran's I, variograms, and kriging to model spatial dependencies that standard methods ignore.

  • Epidemiology — Map disease clusters and identify environmental risk factors by location
  • Precision agriculture — Optimize fertilizer application using spatially-varying soil measurements
  • Real estate — Model housing prices with spatial lag and spatial error models

Spatial statistics ensure your analysis respects the first law of geography: everything is related to nearby things.


Spatial data arise whenever observations are indexed by geographical location. Unlike classical statistics, spatial observations are typically not independent: nearby locations tend to exhibit similar values—a phenomenon known as spatial autocorrelation. Ignoring this structure leads to invalid standard errors, inflated Type I error rates, and inefficient predictions. Spatial statistics provides the formal apparatus for modeling, estimating, and predicting under dependence.


Spatial Autocorrelation

Spatial autocorrelation measures the degree to which nearby observations are similar (positive) or dissimilar (negative).

DfSpatial Autocorrelation

Given observations {z(si)}i=1n\{z(\mathbf{s}_i)\}_{i=1}^{n} at locations siDRd\mathbf{s}_i \in \mathcal{D} \subset \mathbb{R}^d, spatial autocorrelation quantifies the statistical relationship between z(si)z(\mathbf{s}_i) and z(sj)z(\mathbf{s}_j) as a function of their separation vector sisj\mathbf{s}_i - \mathbf{s}_j.

Moran's I

Moran's I Statistic

I=nijwijijwij(zizˉ)(zjzˉ)i(zizˉ)2I = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}\,(z_i - \bar{z})(z_j - \bar{z})}{\sum_{i}(z_i - \bar{z})^2}

where wijw_{ij} are spatial weights and zˉ=1nizi\bar{z} = \frac{1}{n}\sum_i z_i. Under the null hypothesis of no spatial autocorrelation, E[I]1n1\mathbb{E}[I] \approx -\frac{1}{n-1}, and for large nn, II is approximately standard normal after standardization.

Geary's C

Geary's C Statistic

C=(n1)ijwij(zizj)22ijwiji(zizˉ)2C = \frac{(n-1)\sum_{i}\sum_{j} w_{ij}\,(z_i - z_j)^2}{2\,\sum_{i}\sum_{j} w_{ij} \cdot \sum_{i}(z_i - \bar{z})^2}

Values C<1C < 1 indicate positive spatial autocorrelation; C>1C > 1 indicates negative autocorrelation. Unlike Moran's I (which is correlation-like), Geary's C is based on squared differences and is more sensitive to local variability.

Spatial Weights Matrix

The spatial weights matrix W=[wij]\mathbf{W} = [w_{ij}] encodes the neighborhood structure. Common constructions include:

  • Contiguity-based: wij=1w_{ij} = 1 if ii and jj share a boundary (rook or queen)
  • Distance-based: wij=1w_{ij} = 1 if sisj<d\|\mathbf{s}_i - \mathbf{s}_j\| < d for some cutoff dd
  • K-nearest neighbors: wij=1w_{ij} = 1 if jj is among the kk closest to ii
  • Inverse distance: wij=sisjαw_{ij} = \|\mathbf{s}_i - \mathbf{s}_j\|^{-\alpha}

Rows are typically row-standardized: wij=wij/kwikw_{ij}^* = w_{ij} / \sum_k w_{ik}.

Testing Moran's I

For a dataset of n=100n = 100 soil samples with I=0.42I = 0.42, the expected value under H0H_0 is E[I]=1/99=0.0101\mathbb{E}[I] = -1/99 = -0.0101. The variance of II under normality is:

Var(I)=1S0[(n23n+3)S1nS2+3S0(n21)S0]1(n1)2\text{Var}(I) = \frac{1}{S_0}\left[\frac{(n^2 - 3n + 3)S_1 - nS_2 + 3S_0}{(n^2 - 1)S_0}\right] - \frac{1}{(n-1)^2}

where S0=ijwijS_0 = \sum_{ij}w_{ij}, S1=12ij(wij+wji)2S_1 = \frac{1}{2}\sum_{ij}(w_{ij}+w_{ji})^2, S2=i(jwij+jwji)2S_2 = \sum_i(\sum_j w_{ij} + \sum_j w_{ji})^2. The z-score is z=(IE[I])/Var(I)z = (I - \mathbb{E}[I])/\sqrt{\text{Var}(I)}, giving z12.4z \approx 12.4—strong evidence of positive spatial autocorrelation.


Variogram Estimation

The variogram is the fundamental tool for quantifying spatial continuity.

DfSemivariogram

The theoretical semivariogram is defined as:

γ(h)=12Var[Z(s+h)Z(s)]\gamma(\mathbf{h}) = \frac{1}{2}\,\text{Var}\big[Z(\mathbf{s} + \mathbf{h}) - Z(\mathbf{s})\big]

for a second-order stationary process {Z(s)}\{Z(\mathbf{s})\} with Var[Z(s)]=c(0)\text{Var}[Z(\mathbf{s})] = c(0).

For an intrinsically stationary process (where E[Z(s+h)Z(s)]=0\mathbb{E}[Z(\mathbf{s}+\mathbf{h}) - Z(\mathbf{s})] = 0 and Var[Z(s+h)Z(s)]=2γ(h)\text{Var}[Z(\mathbf{s}+\mathbf{h}) - Z(\mathbf{s})] = 2\gamma(\mathbf{h})), the semivariogram exists even when the process is non-stationary in the strict sense.

Empirical Semivariogram

γ^(h)=12N(h)(i,j)N(h)(z(si)z(sj))2\hat{\gamma}(\mathbf{h}) = \frac{1}{2|N(\mathbf{h})|}\sum_{(i,j) \in N(\mathbf{h})} \big(z(\mathbf{s}_i) - z(\mathbf{s}_j)\big)^2

where N(h)={(i,j):sisjh}N(\mathbf{h}) = \{(i,j) : \mathbf{s}_i - \mathbf{s}_j \approx \mathbf{h}\} is the set of pairs within a tolerance window of the lag vector h\mathbf{h}.

Parametric Variogram Models

Common Variogram Models

  • Spherical: γ(h)=c0+c1[3h2a12(ha)3]\gamma(h) = c_0 + c_1\left[\frac{3h}{2a} - \frac{1}{2}\left(\frac{h}{a}\right)^3\right] for hah \leq a, γ(h)=c0+c1\gamma(h) = c_0 + c_1 otherwise
  • Exponential: γ(h)=c0+c1[1exp(3ha)]\gamma(h) = c_0 + c_1\left[1 - \exp\left(-\frac{3h}{a}\right)\right]
  • Gaussian: γ(h)=c0+c1[1exp(3h2a2)]\gamma(h) = c_0 + c_1\left[1 - \exp\left(-\frac{3h^2}{a^2}\right)\right]
  • Matérn: γ(h)=c0+c1[121νΓ(ν)(2νha)νKν(2νha)]\gamma(h) = c_0 + c_1\left[1 - \frac{2^{1-\nu}}{\Gamma(\nu)}\left(\frac{\sqrt{2\nu}\,h}{a}\right)^\nu K_\nu\left(\frac{\sqrt{2\nu}\,h}{a}\right)\right]

where c0c_0 is the nugget, c0+c1c_0 + c_1 is the sill, and aa is the range parameter. The Matérn family parameterized by ν\nu nests the exponential (ν=0.5\nu = 0.5) and Gaussian (ν\nu \to \infty) models.

DfAnisotropy

  • Geometric anisotropy: The range varies with direction. Addressed by applying a coordinate transformation (affine rotation/scaling) to achieve isotropy.
  • Zonal anisotropy: The sill varies with direction while the range remains constant. Cannot be corrected by simple transformation; requires separate variogram models per direction.
  • Geometric + Zonal: Combined structure, common in practice.

Kriging

Kriging is the best linear unbiased prediction (BLUP) for spatially correlated data.

Simple Kriging

Simple Kriging Predictor

Z^(s0)=μ+cC1(Zμ1)\hat{Z}(\mathbf{s}_0) = \mu + \mathbf{c}^\top \mathbf{C}^{-1}(\mathbf{Z} - \mu\mathbf{1})

where μ=E[Z(s)]\mu = \mathbb{E}[Z(\mathbf{s})] is known, C=[c(sisj)]\mathbf{C} = [c(\mathbf{s}_i - \mathbf{s}_j)] is the covariance matrix, and c=[c(s0si)]\mathbf{c} = [c(\mathbf{s}_0 - \mathbf{s}_i)] is the covariance between the target and observed locations. The kriging variance is σSK2=c(0)cC1c\sigma^2_{SK} = c(0) - \mathbf{c}^\top \mathbf{C}^{-1}\mathbf{c}.

Ordinary Kriging

When the mean is unknown but assumed constant, ordinary kriging jointly estimates μ\mu and Z^(s0)\hat{Z}(\mathbf{s}_0):

Ordinary Kriging System

(C110)(λψ)=(c1)\begin{pmatrix} \mathbf{C} & \mathbf{1} \\ \mathbf{1}^\top & 0 \end{pmatrix} \begin{pmatrix} \boldsymbol{\lambda} \\ \psi \end{pmatrix} = \begin{pmatrix} \mathbf{c} \\ 1 \end{pmatrix}

The weights λ\boldsymbol{\lambda} satisfy iλi=1\sum_i \lambda_i = 1 (unbiasedness), and the kriging variance is σOK2=c(0)cλ\sigma^2_{OK} = c(0) - \mathbf{c}^\top\boldsymbol{\lambda}.

Universal Kriging

When the mean is modeled as a spatial trend m(s)=k=0Kfk(s)βkm(\mathbf{s}) = \sum_{k=0}^{K} f_k(\mathbf{s})\beta_k, universal kriging accounts for this drift:

Universal Kriging

Z^(s0)=f(s0)β^+c0Ψ1(ZFβ^)\hat{Z}(\mathbf{s}_0) = \mathbf{f}(\mathbf{s}_0)^\top\hat{\boldsymbol{\beta}} + \mathbf{c}_0^\top \mathbf{\Psi}^{-1}(\mathbf{Z} - \mathbf{F}\hat{\boldsymbol{\beta}})

where β^=(FΨ1F)1FΨ1Z\hat{\boldsymbol{\beta}} = (\mathbf{F}^\top\mathbf{\Psi}^{-1}\mathbf{F})^{-1}\mathbf{F}^\top\mathbf{\Psi}^{-1}\mathbf{Z} is the generalized least squares estimator and Ψ\mathbf{\Psi} is the residual covariance matrix.


Spatial Regression

Spatial Lag Model

y=ρWy+Xβ+ε,εN(0,σ2In)y = \rho W y + X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_n)

Spatial Error Model

y=Xβ+u,u=λWu+ε,εN(0,σ2In)y = X\beta + u, \quad u = \lambda W u + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_n)

Omitted Variable Bias

Ignoring spatial dependence in the error terms leads to underestimation of standard errors by a factor of approximately (1ρ^)2(1 - \hat{\rho})^{-2}, inflating t-statistics and increasing false positive rates. Lagrange multiplier tests (LM-lag and LM-error) help diagnose the appropriate model specification.


Point Process Models

For data recorded as event locations {s1,,sn}\{\mathbf{s}_1, \dots, \mathbf{s}_n\} in a study region D\mathcal{D}, point process models characterize the stochastic mechanism generating these events.

DfPoisson Point Process

A homogeneous Poisson point process on D\mathcal{D} with intensity λ>0\lambda > 0 satisfies:

  1. For any Borel set BDB \subset \mathcal{D}, N(B)Poisson(λB)N(B) \sim \text{Poisson}(\lambda |B|)
  2. For disjoint B1,B2B_1, B_2, N(B1)N(B_1) and N(B2)N(B_2) are independent

The inhomogeneous extension allows λ(s)\lambda(\mathbf{s}) to vary over space, with N(B)Poisson(Bλ(s)ds)N(B) \sim \text{Poisson}\left(\int_B \lambda(\mathbf{s})\,d\mathbf{s}\right).

Quadrat Analysis

Quadrat Analysis

Divide D\mathcal{D} into mm quadrats of area aa. Under complete spatial randomness (CSR), NiPoisson(λa)N_i \sim \text{Poisson}(\lambda a) for each quadrat ii. The variance-to-mean ratio (VMR) is:

VMR=s2Nˉ=1m1i(NiNˉ)21miNi\text{VMR} = \frac{s^2}{\bar{N}} = \frac{\frac{1}{m-1}\sum_i(N_i - \bar{N})^2}{\frac{1}{m}\sum_i N_i}

VMR = 1 indicates CSR; VMR > 1 suggests clustering; VMR < 1 suggests regularity. The chi-squared statistic χ2=(m1)VMR\chi^2 = (m-1) \cdot \text{VMR} tests against the Poisson assumption.


Python Implementation

import numpy as np
import pandas as pd
from scipy import stats
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt

np.random.seed(42)

# --- Generate spatial data with autocorrelation ---
n = 200
coords = np.column_stack([
    np.random.uniform(0, 10, n),
    np.random.uniform(0, 10, n)
])

# True surface with spatial trend
def true_surface(x, y):
    return 3 * np.sin(x / 3) + 2 * np.cos(y / 4) + np.random.normal(0, 0.3, len(x))

z = true_surface(coords[:, 0], coords[:, 1])

# --- Spatial weights matrix (KNN) ---
k = 8
nn = NearestNeighbors(n_neighbors=k + 1).fit(coords)
distances, indices = nn.kneighbors(coords)

W = np.zeros((n, n))
for i in range(n):
    for j_idx in range(1, k + 1):
        j = indices[i, j_idx]
        W[i, j] = 1.0 / max(distances[i, j_idx], 1e-10)

# Row-standardize
row_sums = W.sum(axis=1, keepdims=True)
W = W / row_sums

# --- Moran's I ---
z_centered = z - z.mean()
numerator = n * z_centered @ W @ z_centered
denominator = W.sum() * (z_centered ** 2).sum()
morans_i = numerator / denominator

# Expected value and variance under H0
S0 = W.sum()
S1 = 0.5 * ((W + W.T) ** 2).sum()
S2 = ((W.sum(axis=1) + W.sum(axis=0)) ** 2).sum()
EI = -1.0 / (n - 1)
VI = ((n**2 - 3*n + 3)*S1 - n*S2 + 3*S0) / ((n**2 - 1)*S0) - 1.0/(n-1)**2
z_moran = (morans_i - EI) / np.sqrt(VI)

print(f"Moran's I: {morans_i:.4f}")
print(f"Expected I: {EI:.4f}")
print(f"z-score: {z_moran:.4f}")
print(f"p-value: {2 * (1 - stats.norm.cdf(abs(z_moran))):.6f}")

# --- Empirical Semivariogram ---
def empirical_variogram(coords, z, n_bins=15, max_dist=5.0):
    from scipy.spatial.distance import pdist, squareform
    D = squareform(pdist(coords))
    pairs = np.triu_indices(len(z), k=1)
    dists = D[pairs]
    diffs_sq = (z[pairs[0]] - z[pairs[1]]) ** 2

    bins = np.linspace(0, max_dist, n_bins + 1)
    bin_centers = 0.5 * (bins[:-1] + bins[1:])
    gamma = np.zeros(n_bins)

    for i in range(n_bins):
        mask = (dists >= bins[i]) & (dists < bins[i + 1])
        if mask.sum() > 0:
            gamma[i] = 0.5 * diffs_sq[mask].mean()
        else:
            gamma[i] = np.nan

    return bin_centers, gamma

h, gamma_hat = empirical_variogram(coords, z)
valid = ~np.isnan(gamma_hat)

# Fit exponential variogram via least squares
def exponential_variogram(h, nugget, sill, range_param):
    return nugget + sill * (1 - np.exp(-3 * h / range_param))

from scipy.optimize import curve_fit
popt, pcov = curve_fit(
    exponential_variogram, h[valid], gamma_hat[valid],
    p0=[0.1, 2.0, 2.0], maxfev=10000
)
print(f"\nFitted variogram: nugget={popt[0]:.3f}, sill={popt[1]:.3f}, range={popt[2]:.3f}")

# --- Simple Kriging prediction at a new point ---
def sk_predict(coords, z, new_point, cov_func, mu, **kwargs):
    from scipy.spatial.distance import cdist
    c_0 = cov_func(0, **kwargs)
    C = cov_func(cdist(coords, coords), **kwargs)
    c = cov_func(cdist(new_point, coords), **kwargs)

    C_inv = np.linalg.inv(C + 1e-8 * np.eye(len(C)))
    weights = C_inv @ c
    pred = mu + weights @ (z - mu)
    variance = c_0 - c @ C_inv @ c
    return pred, variance

def cov_exponential(dist, sill, range_param):
    return sill * np.exp(-3 * dist / range_param)

new_pt = np.array([[5.0, 5.0]])
mu_hat = z.mean()
pred, var = sk_predict(coords, z, new_pt, cov_exponential, mu_hat,
                        sill=popt[1], range_param=popt[2])
print(f"\nSimple Kriging prediction at (5,5): {pred[0]:.3f} ± {np.sqrt(max(var, 0)):.3f}")

Summary

Key Takeaways: Spatial Statistics

  1. Spatial autocorrelation invalidates classical independence assumptions; test with Moran's I or Geary's C before proceeding.
  2. The semivariogram quantifies spatial continuity: nugget, sill, and range are the three key parameters. The Matérn family provides flexible, nested modeling.
  3. Kriging is the BLUP under Gaussian assumptions: simple kriging (known mean), ordinary kriging (constant unknown mean), and universal kriging (trend + residual).
  4. Spatial regression models (lag and error) account for spatial dependence in outcome or residuals; omitting it inflates Type I error.
  5. Point process models and quadrat analysis characterize event locations; the Poisson process is the null model for complete spatial randomness.

Premium Content

Spatial Statistics

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
💼Interview Prep
📜Certificates
🤝Community Access

Already a member? Log in

Need Expert Statistics Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement