Friedman Test
Nonparametric Repeated Measures ANOVA
When the same subjects are measured under multiple conditions and normality cannot be assumed, the Friedman test ranks observations within each block to detect systematic differences across treatments.
- Psychology — Compare response times across multiple stimuli within the same participants
- Food Science — Rate product preferences when panelists taste several brands
- Ergonomics — Assess workstation designs with repeated measurements on the same workers
Ranking within blocks reveals treatment differences without assuming normality.
The Friedman test is the nonparametric alternative to repeated-measures ANOVA. It is used when the same subjects are measured under k conditions (k ≥ 3) and normality cannot be assumed.
Definition: Friedman Test
A nonparametric test for comparing related samples across multiple conditions, ranking observations within each subject before comparing across conditions.
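The rank-then-compare procedure in the definition can be sketched by hand. This is a minimal illustration, not a replacement for the library routine: the `scores` array is hypothetical data with no ties within a row, and the formula used is the standard untied Friedman statistic, which is then checked against `scipy.stats.friedmanchisquare`.

```python
import numpy as np
from scipy import stats

# Hypothetical scores: 5 subjects (rows) x 3 conditions (columns), no ties within a row
scores = np.array([
    [8.2, 7.8, 9.1],
    [7.5, 7.9, 8.4],
    [9.1, 8.6, 9.8],
    [6.8, 6.3, 6.5],
    [8.9, 8.5, 9.6],
])
n, k = scores.shape

# Step 1: rank within each subject (row); rank 1 = lowest score in that row
ranks = stats.rankdata(scores, axis=1)

# Step 2: sum the ranks of each condition (column)
rank_sums = ranks.sum(axis=0)

# Step 3: Friedman statistic without tie correction, referred to chi-square with k - 1 df
chi2_manual = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)
p_manual = stats.chi2.sf(chi2_manual, df=k - 1)

# Cross-check against SciPy (one argument per condition)
chi2_scipy, p_scipy = stats.friedmanchisquare(*scores.T)

print("rank sums:", rank_sums)
print(f"manual: chi2 = {chi2_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  chi2 = {chi2_scipy:.4f}, p = {p_scipy:.4f}")
```

For this toy data the rank sums are 10, 6, and 14, giving a statistic of 6.4 on 2 degrees of freedom (p of about 0.041). When ties occur within a subject, SciPy applies a tie correction that the hand formula above omits, so the two values can differ.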
import numpy as np
from scipy import stats
# Scenario: 15 athletes tested under 3 training conditions
n_subjects = 15
# Performance scores (hardcoded illustrative data; normality is not assumed)
condition_a = np.array([8.2, 7.5, 9.1, 6.8, 8.9, 7.2, 9.5, 6.9, 8.3, 7.8, 9.2, 7.1, 8.7, 7.4, 9.0])
condition_b = np.array([7.8, 7.1, 8.6, 6.3, 8.5, 6.9, 9.1, 6.5, 7.9, 7.4, 8.8, 6.7, 8.3, 7.0, 8.6])
condition_c = np.array([9.1, 8.4, 9.8, 7.7, 9.6, 8.2, 10.2, 7.8, 9.4, 8.7, 9.9, 8.0, 9.6, 8.3, 9.7])
# Friedman test
stat, p = stats.friedmanchisquare(condition_a, condition_b, condition_c)
k, n = 3, n_subjects
df = k - 1
print(f"Friedman χ²({df}) = {stat:.4f}, p = {p:.6f}")
print(f"Decision: {'Reject H₀ — conditions differ' if p < 0.05 else 'Fail to reject H₀'}")
# Kendall's W (effect size), derived from the Friedman statistic: W = chi2 / (n * (k - 1))
W = stat / (n * (k - 1))
# Common rule of thumb: W < 0.3 small, 0.3 to 0.5 medium, >= 0.5 large
print(f"Kendall's W = {W:.4f} ({'small' if W < 0.3 else 'medium' if W < 0.5 else 'large'} effect)")
# Post-hoc: pairwise Wilcoxon signed-rank tests (Bonferroni corrected)
print("\nPost-hoc: Pairwise Wilcoxon signed-rank (Bonferroni corrected):")
pairs = [('A vs B', condition_a, condition_b),
         ('A vs C', condition_a, condition_c),
         ('B vs C', condition_b, condition_c)]
alpha_bonf = 0.05 / len(pairs)  # per-comparison significance threshold
for name, g1, g2 in pairs:
    _, p_w = stats.wilcoxon(g1, g2, alternative='two-sided')
    print(f"  {name}: p = {p_w:.6f} -> {'*' if p_w < alpha_bonf else 'ns'}")
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(8, 5))
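ax.boxplot([condition_a, condition_b, condition_c], patch_artist=True)
# Tick labels are set separately because the boxplot `labels` keyword is deprecated in newer Matplotlib releases
ax.set_xticklabels(['Condition A', 'Condition B', 'Condition C'])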
ax.set_title(f'Performance by Condition (n={n} athletes)\nFriedman χ²={stat:.3f}, p={p:.4f}')
ax.set_ylabel('Performance Score')
plt.tight_layout()
plt.savefig('friedman_test.png', dpi=150)
plt.show()
Kendall's W
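Kendall's W (the coefficient of concordance) measures how consistently the subjects rank the conditions, and serves as the effect size for the Friedman test: W = χ² / (n(k − 1)), where n is the number of subjects and k the number of conditions. It ranges from 0 (no agreement among subjects) to 1 (every subject ranks the conditions identically).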
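To make the link between W and the Friedman statistic concrete, the sketch below computes W directly from the rank sums and compares it with χ² / (n(k − 1)). The helper `kendalls_w` and both data arrays are hypothetical, and the direct formula assumes no ties within a subject.

```python
import numpy as np
from scipy import stats

def kendalls_w(scores):
    """Kendall's W for an (n subjects x k conditions) array; assumes no ties within a row."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    rank_sums = stats.rankdata(scores, axis=1).sum(axis=0)
    # S = squared deviations of the condition rank sums from their mean
    s = np.sum((rank_sums - rank_sums.mean()) ** 2)
    return 12.0 * s / (n ** 2 * (k ** 3 - k))

# Hypothetical scores: 5 subjects x 3 conditions
scores = np.array([
    [8.2, 7.8, 9.1],
    [7.5, 7.9, 8.4],
    [9.1, 8.6, 9.8],
    [6.8, 6.3, 6.5],
    [8.9, 8.5, 9.6],
])
n, k = scores.shape

w_direct = kendalls_w(scores)
chi2, _ = stats.friedmanchisquare(*scores.T)
w_from_chi2 = chi2 / (n * (k - 1))

# Every subject orders the conditions the same way, so agreement is perfect
perfect = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
w_perfect = kendalls_w(perfect)

print(f"W from rank sums: {w_direct:.4f}")
print(f"W from Friedman statistic: {w_from_chi2:.4f}")
print(f"W with identical rankings: {w_perfect:.4f}")
```

Both routes give W = 0.64 for the toy data, and the identically ranked data gives W = 1.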
Key Takeaways
Summary: Friedman Test
- Nonparametric repeated-measures ANOVA — for within-subjects designs
- Ranks within each subject (row-wise), then compares across conditions
- Kendall's W measures how consistently subjects rank the conditions (0 = no agreement, 1 = identical rankings) and serves as the effect size
- Post-hoc: pairwise Wilcoxon signed-rank tests with Bonferroni correction, or the Nemenyi test
- Use instead of repeated-measures ANOVA when normality fails
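The Nemenyi option named in the summary can be sketched with SciPy alone. This is an illustrative implementation under stated assumptions, not a reference one: the helper `nemenyi_pvalues` is hypothetical, it assumes a SciPy version that provides `scipy.stats.studentized_range` (1.7 or later), and it uses the large-sample form in which the difference in mean ranks, scaled by sqrt(k(k + 1) / (6n)) and multiplied by sqrt(2), is referred to the studentized range distribution with infinite degrees of freedom. The data are the athlete scores from the example above.

```python
import numpy as np
from scipy import stats

def nemenyi_pvalues(scores, labels):
    """Pairwise Nemenyi p-values after a Friedman test (large-sample approximation)."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    # Mean within-subject rank of each condition
    mean_ranks = stats.rankdata(scores, axis=1).mean(axis=0)
    se = np.sqrt(k * (k + 1) / (6.0 * n))
    pvals = {}
    for i in range(k):
        for j in range(i + 1, k):
            # Studentized-range statistic for this pair of conditions
            q = abs(mean_ranks[i] - mean_ranks[j]) / se * np.sqrt(2.0)
            pvals[(labels[i], labels[j])] = float(stats.studentized_range.sf(q, k, np.inf))
    return mean_ranks, pvals

condition_a = [8.2, 7.5, 9.1, 6.8, 8.9, 7.2, 9.5, 6.9, 8.3, 7.8, 9.2, 7.1, 8.7, 7.4, 9.0]
condition_b = [7.8, 7.1, 8.6, 6.3, 8.5, 6.9, 9.1, 6.5, 7.9, 7.4, 8.8, 6.7, 8.3, 7.0, 8.6]
condition_c = [9.1, 8.4, 9.8, 7.7, 9.6, 8.2, 10.2, 7.8, 9.4, 8.7, 9.9, 8.0, 9.6, 8.3, 9.7]

scores = np.column_stack([condition_a, condition_b, condition_c])
mean_ranks, pvals = nemenyi_pvalues(scores, ["A", "B", "C"])

print("mean ranks:", dict(zip(["A", "B", "C"], mean_ranks)))
for pair, p in pvals.items():
    print(f"  {pair[0]} vs {pair[1]}: p = {p:.6f}")
```

Unlike Bonferroni-corrected Wilcoxon tests, the Nemenyi procedure works on the same within-subject ranks as the Friedman test and controls the family-wise error rate through the studentized range distribution, so no separate alpha adjustment is applied to its p-values.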