Mann-Whitney U Test
Nonparametric Tests
Comparing Two Groups Without Normality
The Mann-Whitney U test determines whether one group systematically produces larger values than another, without requiring normal distributions. It's the go-to nonparametric alternative when two-sample t-test assumptions fail.
- Medical Research: Compare pain levels between treatment and control groups
- Marketing Analytics: Test whether customer satisfaction differs between two product versions
- Environmental Science: Compare pollution levels across two geographic regions
Ranks replace raw values when distributions defy the bell curve.
The Mann-Whitney U test (also called the Wilcoxon rank-sum test) is the nonparametric alternative to the two-sample t-test. It tests whether one group tends to have larger values than the other.
Definition: Mann-Whitney U Test
A nonparametric test that determines whether values from one group tend to be larger than values from another group, without assuming normality.
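U Statistic
$$U = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} S(X_i, Y_j), \qquad S(X_i, Y_j) = \begin{cases} 1 & \text{if } X_i > Y_j \\ 1/2 & \text{if } X_i = Y_j \\ 0 & \text{if } X_i < Y_j \end{cases}$$
Here,
- $U$ = The Mann-Whitney U statistic
- $n_1, n_2$ = Sample sizes of the two groups
- $X_i, Y_j$ = Individual observations from each group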
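The U statistic can be obtained in two equivalent ways: by counting the pairs in which an observation from the first group exceeds one from the second (ties counting one half), or from the sum of the first group's ranks in the pooled sample. The sketch below is a minimal pure-Python illustration of that equivalence. The function names and the small `x`, `y` samples are made up for the example and are not part of any library.

```python
def u_by_counting(x, y):
    """Count pairs (a, b) with a > b; a tie contributes 1/2."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def u_by_ranks(x, y):
    """U for the first sample from its rank sum: U1 = R1 - n1 * (n1 + 1) / 2."""
    n1 = len(x)
    ranks = midranks(list(x) + list(y))
    r1 = sum(ranks[:n1])  # rank sum of the first sample in the pooled data
    return r1 - n1 * (n1 + 1) / 2


# Illustrative values only (hypothetical, chosen to include a tie).
x = [3, 5, 8, 8]
y = [1, 5, 6]
print(u_by_counting(x, y), u_by_ranks(x, y))  # both give 8.5
```

Because only the ordering of the pooled values enters the rank sum, replacing the largest value with an arbitrarily large one leaves U unchanged, which is where the test's robustness to outliers comes from.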
import numpy as np
from scipy import stats
# Non-normal data: response times (milliseconds); the values are fixed, so no random seed is needed
group_a = np.array([250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200]) # outlier!
group_b = np.array([200, 220, 195, 235, 215, 205, 240, 210, 225, 190])
print(f"Group A: median={np.median(group_a):.1f}, mean={np.mean(group_a):.1f}")
print(f"Group B: median={np.median(group_b):.1f}, mean={np.mean(group_b):.1f}")
# Mann-Whitney U test
U_stat, p_mw = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')
print(f"\nMann-Whitney: U={U_stat:.2f}, p={p_mw:.4f}")
# Compare: independent t-test (affected by outlier)
t_stat, p_t = stats.ttest_ind(group_a, group_b)
print(f"Independent t-test: t={t_stat:.4f}, p={p_t:.4f}")
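# Effect size: rank biserial correlation
# Depending on the SciPy version, U_stat is either U for group_a or the smaller
# of the two U values; taking the minimum makes r = 1 - 2U/(n1*n2) a magnitude
# between 0 and 1 in both cases.
n1, n2 = len(group_a), len(group_b)
U_min = min(U_stat, n1*n2 - U_stat)
r_rb = 1 - 2*U_min/(n1*n2)
print(f"Effect size (rank biserial r) = {r_rb:.4f}")
# Manual U computation: count pairs where a > b; a tie counts as 1/2
U_manual = sum((a > b) + 0.5*(a == b) for a in group_a for b in group_b)
print(f"Manual U = {U_manual:.1f} (fraction > : {U_manual/(n1*n2):.3f})")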
Common Misconception
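The Mann-Whitney U test does NOT directly test medians unless the two distributions are identical in shape and differ only by a shift. What it tests is stochastic dominance: whether a randomly drawn value from one group is more likely than not to exceed a randomly drawn value from the other.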
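A small hand-made example can make the distinction concrete. The two samples below are hypothetical numbers, not data from this article, chosen so that their medians are exactly equal while values from `x` still tend to exceed values from `y`. The sketch only computes the pairwise "fraction greater" that the test responds to and makes no claim about a p-value.

```python
from statistics import median

# Hypothetical samples with identical medians but different shapes.
x = [40, 41, 42, 43, 90, 91, 92, 93]
y = [0, 1, 2, 66, 67, 68, 69, 70]

# U for x: pairs where a value from x exceeds a value from y (ties count 1/2).
wins = sum(1 for a in x for b in y if a > b)
ties = sum(1 for a in x for b in y if a == b)
u_x = wins + 0.5 * ties

# Estimated probability that a value from x exceeds a value from y.
p_superiority = u_x / (len(x) * len(y))

print(median(x), median(y))   # 66.5 66.5
print(u_x, p_superiority)     # 44.0 0.6875
```

Both medians are 66.5, yet a value from `x` beats a value from `y` in about 69% of pairs, so a comparison of medians and a Mann-Whitney comparison are answering different questions here.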
Key Takeaways
Summary: Mann-Whitney U Test
- Tests whether values from one group tend to be larger (stochastic dominance)
- Robust to outliers: uses ranks, not raw values
- Does NOT test medians directly (common misconception) unless distributions are identical in shape
- U ranges from 0 to n1×n2; U = n1n2/2 is its expected value when neither group tends to be larger
- Effect size r = 1 - 2U/(n1n2), with U the smaller of the two groups' U values: |r| near 0.1 is small, near 0.3 medium, 0.5 or greater large
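The range of U and the effect-size formula in the summary can be checked numerically. The sketch below reuses the response-time values from the example above and two helper functions whose names are invented for this illustration. It assumes the convention that U1 counts pairs in which group A exceeds group B, so that U1 + U2 = n1·n2.

```python
def rank_biserial_signed(u1, n1, n2):
    """Signed effect size: positive when the first group tends to be larger."""
    return 2.0 * u1 / (n1 * n2) - 1.0


def rank_biserial_magnitude(u1, n1, n2):
    """Magnitude via r = 1 - 2U/(n1*n2), with U the smaller of U1 and U2."""
    u_min = min(u1, n1 * n2 - u1)
    return 1.0 - 2.0 * u_min / (n1 * n2)


# Response times copied from the example above.
group_a = [250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200]
group_b = [200, 220, 195, 235, 215, 205, 240, 210, 225, 190]
n1, n2 = len(group_a), len(group_b)

# U1: pairs where A exceeds B (no ties occur between these two groups).
u1 = sum(1 for a in group_a for b in group_b if a > b)
u2 = n1 * n2 - u1  # the two U values always sum to n1 * n2

print(u1, u2)
print(round(rank_biserial_signed(u1, n1, n2), 4))
print(round(rank_biserial_magnitude(u1, n1, n2), 4))

# Boundary cases: U = 0 and U = n1*n2 give r = -1 and +1; U = n1*n2/2 gives 0.
print(rank_biserial_signed(0, n1, n2),
      rank_biserial_signed(n1 * n2 / 2, n1, n2),
      rank_biserial_signed(n1 * n2, n1, n2))
```

The signed form keeps the direction of the effect, while the minimum-based form reports only its size; for these data both have the same absolute value.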