Mann-Whitney U Test

Nonparametric Tests

Comparing Two Groups Without Normality

The Mann-Whitney U test determines whether one group systematically produces larger values than another, without requiring normal distributions. It's the go-to nonparametric alternative when two-sample t-test assumptions fail.

Medical Research — Compare pain levels between treatment and control groups
Marketing Analytics — Test whether customer satisfaction differs between two product versions
Environmental Science — Compare pollution levels across two geographic regions

Ranks replace raw values when distributions defy the bell curve.

The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is the nonparametric alternative to the two-sample t-test. It tests whether one group tends to have larger values than the other.

Definition: Mann-Whitney U Test

A nonparametric test that determines whether values from one group tend to be larger than values from another group, without assuming normality.

U Statistic

$$U = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} \mathbb{1}(x_{1i} > x_{2j})$$

Here,

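$U$ = the Mann-Whitney U statistic
$n_1, n_2$ = sample sizes of the two groups
$x_{1i}, x_{2j}$ = individual observations from each group
$\mathbb{1}(\cdot)$ = indicator function: 1 if the condition holds, 0 otherwise (a tied pair is conventionally counted as 1/2)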
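To make the double sum concrete, here is a minimal pure-Python sketch on a toy data set. The samples `x` and `y` and the helper names `u_by_pairs` and `u_by_ranks` are illustrative assumptions, not part of the original material. It computes U by counting pairs and again from the rank sum of the first sample ($U = R_1 - n_1(n_1+1)/2$, with midranks for ties); the two should agree.

```python
def u_by_pairs(x, y):
    """Count pairs (xi, yj) with xi > yj; a tied pair contributes 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


def u_by_ranks(x, y):
    """U for sample x from its rank sum: U = R_x - n_x * (n_x + 1) / 2."""
    pooled = sorted(list(x) + list(y))

    def midrank(v):
        # Average of the 1-based positions that value v occupies in the pool.
        first = pooled.index(v) + 1
        count = pooled.count(v)
        return first + (count - 1) / 2

    rank_sum = sum(midrank(v) for v in x)
    n_x = len(x)
    return rank_sum - n_x * (n_x + 1) / 2


# Hypothetical toy samples (note the tie at 3).
x = [7, 3, 9]
y = [1, 5, 3, 2]

print(u_by_pairs(x, y), u_by_ranks(x, y))  # 10.5 10.5
```

The pair-counting version mirrors the formula directly but costs $n_1 n_2$ comparisons; the rank-based version is the form usually used in practice.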


import numpy as np
from scipy import stats

# Non-normal data: response times (milliseconds)
group_a = np.array([250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200])  # 1200 is an outlier
group_b = np.array([200, 220, 195, 235, 215, 205, 240, 210, 225, 190])

print(f"Group A: median={np.median(group_a):.1f}, mean={np.mean(group_a):.1f}")
print(f"Group B: median={np.median(group_b):.1f}, mean={np.mean(group_b):.1f}")

# Mann-Whitney U test
# Note: SciPy 1.7+ returns U for the first sample (group_a); older versions differ.
U_stat, p_mw = stats.mannwhitneyu(group_a, group_b, alternative='two-sided')
print(f"\nMann-Whitney: U={U_stat:.2f}, p={p_mw:.4f}")

# Compare: independent t-test (affected by the outlier)
t_stat, p_t = stats.ttest_ind(group_a, group_b)
print(f"Independent t-test: t={t_stat:.4f}, p={p_t:.4f}")

# Effect size: rank-biserial correlation.
# With U taken for group_a, r = 2U/(n1*n2) - 1 is positive when group_a tends to be larger.
n1, n2 = len(group_a), len(group_b)
r_rb = 2 * U_stat / (n1 * n2) - 1
print(f"Effect size (rank biserial r) = {r_rb:.4f}")

# Manual U computation: count pairs with a > b
# (this data has no ties; a tied pair would count 0.5)
U_manual = sum(1 for a in group_a for b in group_b if a > b)
print(f"Manual U = {U_manual} (fraction > : {U_manual/(n1*n2):.3f})")

Common Misconception

The Mann-Whitney U test does NOT directly test medians unless the two distributions have the same shape and differ only by a shift. It tests stochastic dominance: whether a value drawn from one group tends to be larger than a value drawn from the other.
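A small constructed example can illustrate this point. The two samples below are hypothetical, chosen so that both medians equal 0 while the shapes differ; the helper `u_statistic` is an illustrative name. Despite equal medians, U lands well away from $n_1 n_2 / 2$, so the statistic is responding to something other than a median difference.

```python
from statistics import median


def u_statistic(x, y):
    """Count pairs (xi, yj) with xi > yj; a tied pair contributes 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


# Hypothetical samples: same median (0), different shapes.
x = [-1, -1, -1, 0, 10, 10, 10]     # long right tail
y = [-10, -10, -10, 0, 1, 1, 1]     # long left tail

u = u_statistic(x, y)
n_pairs = len(x) * len(y)

print(f"median(x) = {median(x)}, median(y) = {median(y)}")
print(f"U = {u}, n1*n2/2 = {n_pairs / 2}")
print(f"Estimated P(x > y), ties counted as 1/2: {u / n_pairs:.3f}")
```

Here U = 33.5 against a midpoint of 24.5, so roughly 68% of pairs favor `x` even though the medians coincide. Whether such a difference is statistically significant depends on the sample sizes; this sketch only shows what the statistic measures.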

Key Takeaways

Summary: Mann-Whitney U Test

Tests whether values from one group tend to be larger — stochastic dominance
Robust to outliers — uses ranks, not raw values
Does NOT test medians directly (common misconception) unless distributions are identical in shape
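U ranges from 0 to n1×n2; U = n1n2/2 means neither group tends to produce larger values
Effect size (rank-biserial) r = 1 - 2U/(n1n2), with the sign depending on which group's U is used; as a rough guide, |r| near 0.1 is small, near 0.3 medium, and 0.5 or more large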
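The range of U and the effect-size formula can be checked on the response-time data used earlier. The following is a minimal pure-Python sketch; the helper `u_statistic` and the variable names are illustrative assumptions. It shows that the two groups' U values sum to n1×n2, and that the signed form 2U/(n1n2) - 1 (with U for the first group) has the same magnitude as 1 - 2U/(n1n2) computed from the smaller U.

```python
def u_statistic(x, y):
    """Count pairs (xi, yj) with xi > yj; a tied pair contributes 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


group_a = [250, 280, 230, 310, 270, 290, 260, 320, 245, 275, 1200]
group_b = [200, 220, 195, 235, 215, 205, 240, 210, 225, 190]
n1, n2 = len(group_a), len(group_b)

u_a = u_statistic(group_a, group_b)   # pairs where A > B
u_b = u_statistic(group_b, group_a)   # pairs where B > A

# Signed rank-biserial r: positive when group A tends to be larger.
r_signed = 2 * u_a / (n1 * n2) - 1
# Magnitude from the smaller of the two U values.
r_magnitude = 1 - 2 * min(u_a, u_b) / (n1 * n2)

print(f"U_A = {u_a}, U_B = {u_b}, U_A + U_B = {u_a + u_b}, n1*n2 = {n1 * n2}")
print(f"signed r = {r_signed:.4f}, |r| from min(U) = {r_magnitude:.4f}")
```

For this data U_A = 108 and U_B = 2, giving r of about 0.96. Replacing the 1200 outlier with any other value above 240 leaves U unchanged, which is the sense in which the test is robust to outliers.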
