Permutation Tests

Hypothesis Testing

Let the Data Generate the Null Distribution

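Permutation tests build the null distribution by rearranging the observed data, so they need no assumption about the shape of the population distribution. The one requirement is that the observations are exchangeable under the null hypothesis. Enumerating every rearrangement gives an exact p-value, random sampling of rearrangements gives a Monte Carlo approximation of it, and the procedure works with any test statistic.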

Non-Normal Data — Testing hypotheses when distributional assumptions are violated
Small Samples — Obtaining valid inference with limited observations
Complex Statistics — Testing custom statistics that lack standard distributional theory

Permutation tests are a dependable fallback when the assumptions behind standard parametric tests do not hold.

Definition: Permutation Tests

Permutation tests (randomization tests) generate the null distribution by permuting the data itself, making no distributional assumptions.

How They Work

1. Compute the observed test statistic
2. Repeatedly shuffle (permute) the group labels
3. Compute the test statistic for each permutation
4. P-value = proportion of permuted statistics at least as extreme as the observed one
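
The four steps above can be sketched as one reusable function. This is a minimal illustration, not a reference implementation: the names `permutation_pvalue` and `mean_diff`, the fixed seed, and the small five-per-group dataset are assumptions made for the example.

```python
import numpy as np

def permutation_pvalue(a, b, statistic, n_permutations=10_000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a two-group statistic.

    Hypothetical helper written for this sketch.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    combined = np.concatenate([a, b])
    n_a = len(a)

    # Step 1: statistic on the data as observed
    observed = statistic(a, b)

    # Steps 2-3: shuffle the pooled data (equivalent to shuffling labels),
    # split it at the original group size, and recompute the statistic
    perm_stats = np.empty(n_permutations)
    for i in range(n_permutations):
        shuffled = rng.permutation(combined)
        perm_stats[i] = statistic(shuffled[:n_a], shuffled[n_a:])

    # Step 4: share of permuted statistics at least as extreme as observed
    p_value = np.mean(np.abs(perm_stats) >= np.abs(observed))
    return observed, p_value, perm_stats

def mean_diff(a, b):
    return a.mean() - b.mean()

observed, p_value, perm_stats = permutation_pvalue(
    [23, 28, 31, 25, 27], [30, 35, 33, 28, 37], mean_diff
)
print(f"observed = {observed:.2f}, p = {p_value:.4f}")
```

Passing the statistic as a function keeps the shuffling logic independent of what is being tested, so the same loop serves for means, medians, or any other two-group summary.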

Python Implementation

Permutation Test for Two Groups

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

np.random.seed(42)

# Example: Do two groups have different means?
# (Without assuming normality)
group_a = np.array([23, 28, 31, 25, 27, 29, 24, 30, 26, 28])
group_b = np.array([30, 35, 33, 28, 37, 32, 34, 36, 31, 33])

observed_diff = group_a.mean() - group_b.mean()
print(f"Observed difference: {observed_diff:.4f}")

# Permutation test
n_permutations = 10000
combined = np.concatenate([group_a, group_b])
n_a = len(group_a)

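# Shuffle once per iteration, then split that same shuffle into the two groups
perm_diffs = np.empty(n_permutations)
for i in range(n_permutations):
    shuffled = np.random.permutation(combined)
    perm_diffs[i] = shuffled[:n_a].mean() - shuffled[n_a:].mean()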

# Two-tailed p-value
p_perm = (np.abs(perm_diffs) >= np.abs(observed_diff)).mean()

# Compare with Welch's t-test (equal_var=False; the default is Student's t-test)
t_stat, p_ttest = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Permutation test p-value: {p_perm:.4f}")
print(f"Welch's t-test p-value:   {p_ttest:.4f}")
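
As a cross-check on the hand-written loop, recent SciPy releases include a built-in `scipy.stats.permutation_test` (added in SciPy 1.8, to the best of my knowledge). The sketch below assumes that function is available and reuses the two groups from the example; `mean_diff` is a helper name introduced here for illustration.

```python
import numpy as np
from scipy import stats

group_a = np.array([23, 28, 31, 25, 27, 29, 24, 30, 26, 28])
group_b = np.array([30, 35, 33, 28, 37, 32, 34, 36, 31, 33])

def mean_diff(x, y):
    # Statistic for the two-sample test: difference in group means
    return np.mean(x) - np.mean(y)

res = stats.permutation_test(
    (group_a, group_b),
    mean_diff,
    permutation_type='independent',  # reassign observations between groups
    vectorized=False,                # mean_diff handles one resample at a time
    n_resamples=9999,
    alternative='two-sided',
)

print(f"Observed difference:       {res.statistic:.4f}")
print(f"SciPy permutation p-value: {res.pvalue:.4f}")
```

As I understand the SciPy documentation, its randomized p-value adds one to both the count of extreme resamples and the number of resamples, so it can differ slightly from a plain proportion and is never exactly zero.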

Visualization

# Visualize permutation distribution
fig, ax = plt.subplots(figsize=(10, 5))
ax.hist(perm_diffs, bins=50, density=True, color='lightblue', edgecolor='black', alpha=0.7)
ax.axvline(observed_diff, color='red', linewidth=2, linestyle='--',
           label=f'Observed = {observed_diff:.3f}')
ax.axvline(-observed_diff, color='red', linewidth=2, linestyle='--')
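# Shade both tails; axvspan fills the full height, so no hard-coded y-range is needed
x_max = 1.1 * max(np.abs(perm_diffs).max(), abs(observed_diff))
ax.axvspan(-x_max, -abs(observed_diff), alpha=0.3, color='red', label=f'p = {p_perm:.4f}')
ax.axvspan(abs(observed_diff), x_max, alpha=0.3, color='red')
ax.set_xlim(-x_max, x_max)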
ax.set_title(f'Permutation Distribution of Difference in Means\n(n_perm={n_permutations:,})')
ax.set_xlabel('Difference in Means (A - B)')
ax.legend()
plt.tight_layout()
plt.savefig('permutation_test.png', dpi=150)
plt.show()

Permutation Test for Correlation

Distribution-Free Correlation Test

# Permutation test for correlation (also distribution-free)
x = np.random.normal(0, 1, 30)
y = 0.5 * x + np.random.normal(0, 1, 30)

obs_r, _ = stats.pearsonr(x, y)
perm_r = [stats.pearsonr(np.random.permutation(x), y)[0] for _ in range(10000)]
p_corr_perm = (np.abs(perm_r) >= np.abs(obs_r)).mean()

print(f"\nCorrelation: r = {obs_r:.4f}")
print(f"Permutation p-value: {p_corr_perm:.4f}")

Advantages of Permutation Tests

Permutation vs Parametric Tests

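Aspect	Parametric Test	Permutation Test
Distributional assumptions	Required (e.g., normality)	None beyond exchangeability under the null
Works for any statistic	Limited to statistics with known sampling distributions	Yes
Exactness	Exact only when its assumptions hold	Exact with full enumeration; Monte Carlo approximation otherwise
Computational cost	Fast	Slower, though usually fast on modern hardware
Small samples	Can be unreliable when assumptions fail	Works well, though the smallest attainable p-value is limited by the number of permutations
Custom statistics	Difficult	Easy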
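
To illustrate the "Custom statistics" row, the sketch below swaps the difference in means for a difference in medians, a statistic with no simple textbook null distribution. It reuses the two groups from the earlier example; the helper name `median_diff` and the fixed seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

group_a = np.array([23, 28, 31, 25, 27, 29, 24, 30, 26, 28])
group_b = np.array([30, 35, 33, 28, 37, 32, 34, 36, 31, 33])

def median_diff(a, b):
    # Custom statistic: difference in group medians
    return np.median(a) - np.median(b)

observed = median_diff(group_a, group_b)
combined = np.concatenate([group_a, group_b])
n_a = len(group_a)

# Same shuffle-and-split loop as for means; only the statistic changes
perm_stats = np.empty(10_000)
for i in range(perm_stats.size):
    shuffled = rng.permutation(combined)
    perm_stats[i] = median_diff(shuffled[:n_a], shuffled[n_a:])

p_median = np.mean(np.abs(perm_stats) >= abs(observed))
print(f"Observed median difference: {observed:.2f}")
print(f"Permutation p-value:        {p_median:.4f}")
```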

Key Takeaways

Summary: Permutation Tests

No assumptions about the shape of the distribution; only exchangeability under the null hypothesis is required
Works for any test statistic: median difference, correlation, R², etc.
P-value = fraction of permutations at least as extreme as the observed statistic
For small datasets, exact permutation (enumerating all arrangements) is possible
More computationally intensive than a parametric test, but for small datasets like the ones above, 10,000 permutations typically take about a second or less on a modern CPU
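
For a dataset small enough to enumerate, the exact test mentioned above can be sketched with `itertools.combinations`. The five-per-group subset used here is an assumption chosen so that the number of possible splits stays small (10 choose 5 = 252); the tolerance `tol` is a precaution against floating-point ties.

```python
from itertools import combinations

import numpy as np

group_a = np.array([23, 28, 31, 25, 27])
group_b = np.array([30, 35, 33, 28, 37])

combined = np.concatenate([group_a, group_b])
n_total = len(combined)
n_a = len(group_a)
observed = group_a.mean() - group_b.mean()

# Enumerate every way of assigning n_a of the pooled observations to group A
exact_diffs = []
for idx_a in combinations(range(n_total), n_a):
    mask = np.zeros(n_total, dtype=bool)
    mask[list(idx_a)] = True
    exact_diffs.append(combined[mask].mean() - combined[~mask].mean())
exact_diffs = np.array(exact_diffs)

# Exact two-sided p-value: share of all splits at least as extreme as observed
tol = 1e-12
n_extreme = int(np.sum(np.abs(exact_diffs) >= abs(observed) - tol))
p_exact = n_extreme / len(exact_diffs)

print(f"Number of splits: {len(exact_diffs)}")
print(f"Exact p-value:    {p_exact:.4f}")
```

Because the observed arrangement is itself one of the enumerated splits, the exact p-value can never be zero; its smallest possible value is one over the number of splits.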
