Effect Size

Hypothesis Testing

How Big Is the Real Effect?

Effect size measures the magnitude of an effect independently of sample size, answering the practical-significance question that p-values cannot. It turns a statistically significant result into a statement about how much the outcome actually changed.

Meta-Analysis — Combining results across studies using standardized effect measures
Education Research — Evaluating whether interventions produce practically important gains
Business Decisions — Assessing whether improvements justify implementation costs

Statistical significance tells you whether an effect can be distinguished from chance; effect size tells you whether it is large enough to matter.
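To make the contrast concrete, here is a minimal simulation sketch. The helper name `d_and_p`, the true shift of 0.05 SD, and the sample sizes are assumptions chosen for illustration. It suggests that a negligible effect can still produce a very small p-value once the sample is large enough, while the estimated d stays near the true shift.

```python
import numpy as np
from scipy import stats

def d_and_p(n, true_shift=0.05, seed=0):
    """Return (Cohen's d, two-sided p-value) for two simulated groups of size n."""
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, 1.0, n)           # reference group
    b = rng.normal(true_shift, 1.0, n)    # group shifted by a tiny amount
    sp = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)  # pooled SD (equal n)
    d = (b.mean() - a.mean()) / sp
    p = stats.ttest_ind(b, a).pvalue
    return d, p

# Same underlying effect, increasingly large samples
for n in (50, 5_000, 500_000):
    d, p = d_and_p(n)
    print(f"n={n:>7}: d={d:+.3f}, p={p:.3g}")
```

With the largest sample the p-value should be far below any conventional threshold even though d remains in the "negligible" range.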

Definition: Effect Size

Effect size quantifies the magnitude of an effect, independent of sample size. A statistically significant result without effect size is incomplete.

Cohen's d (Mean Difference)

Cohen's d

$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \quad s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$

where:

$d$ = Cohen's d effect size
$\bar{x}_1, \bar{x}_2$ = sample means of each group
$s_1, s_2$ = sample standard deviations of each group
$s_p$ = pooled standard deviation
$n_1, n_2$ = sample sizes of each group
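As a worked illustration of the formula, the following uses made-up summary statistics. All numbers are hypothetical and not taken from any dataset in this tutorial.

```latex
% Hypothetical inputs: \bar{x}_1 = 80, \bar{x}_2 = 75, s_1 = 8, s_2 = 12, n_1 = 30, n_2 = 50
\begin{align*}
s_p &= \sqrt{\frac{(30-1)\cdot 8^2 + (50-1)\cdot 12^2}{30+50-2}}
     = \sqrt{\frac{1856 + 7056}{78}}
     \approx 10.69 \\
d   &= \frac{80 - 75}{10.69} \approx 0.47
\end{align*}
```

The larger group contributes more to the pooled standard deviation, so $s_p$ lands closer to 12 than to 8. By the conventional thresholds, $d \approx 0.47$ falls just below the "medium" cutoff of 0.5.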

Python Implementation

Computing Effect Sizes

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

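# Cohen's d with the pooled standard deviation, labelled by conventional thresholds
def cohens_d(group1, group2):
    # Accept lists as well as NumPy arrays
    group1, group2 = np.asarray(group1, dtype=float), np.asarray(group2, dtype=float)
    n1, n2 = len(group1), len(group2)
    s1, s2 = group1.std(ddof=1), group2.std(ddof=1)  # sample SDs (n-1 denominator)
    sp = np.sqrt(((n1-1)*s1**2 + (n2-1)*s2**2) / (n1+n2-2))  # pooled SD
    d = (group1.mean() - group2.mean()) / sp
    # Label by magnitude; the sign only reflects which group is listed first
    size = 'negligible' if abs(d)<0.2 else 'small' if abs(d)<0.5 else 'medium' if abs(d)<0.8 else 'large'
    print(f"Cohen's d = {d:.4f} ({size} effect)")
    return d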

np.random.seed(42)
group_a = np.random.normal(75, 10, 50)
group_b = np.random.normal(80, 10, 50)
d = cohens_d(group_a, group_b)

Overlap Visualization

# Overlap visualization
fig, ax = plt.subplots(figsize=(10, 5))
x = np.linspace(40, 120, 500)
# Fitted normal curves use the sample SD (ddof=1), matching cohens_d
pdf_a = stats.norm.pdf(x, group_a.mean(), group_a.std(ddof=1))
pdf_b = stats.norm.pdf(x, group_b.mean(), group_b.std(ddof=1))
ax.plot(x, pdf_a, 'b-', linewidth=2, label=f'Group A (mean={group_a.mean():.1f})')
ax.plot(x, pdf_b, 'r-', linewidth=2, label=f'Group B (mean={group_b.mean():.1f})')
ax.fill_between(x, pdf_a, alpha=0.3, color='blue')
ax.fill_between(x, pdf_b, alpha=0.3, color='red')
# Derive the label from d instead of hard-coding it in the title
size = 'negligible' if abs(d)<0.2 else 'small' if abs(d)<0.5 else 'medium' if abs(d)<0.8 else 'large'
ax.set_title(f"Cohen's d = {d:.3f} ({size} effect)\nOverlap illustrates practical difference")
ax.legend()
plt.tight_layout()
plt.savefig('effect_size_overlap.png', dpi=150)
plt.show()
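The visual overlap can also be expressed as a number. The sketch below assumes both groups are normal with equal standard deviations, in which case the shared area under the two curves is 2Φ(−|d|/2). The function name `overlap_coefficient` is introduced here for illustration only.

```python
from scipy import stats

def overlap_coefficient(d):
    """Shared area of two equal-variance normal curves whose means differ by d SDs."""
    return 2 * stats.norm.cdf(-abs(d) / 2)

# Overlap at the conventional small / medium / large thresholds
for d_val in (0.2, 0.5, 0.8):
    print(f"d = {d_val}: overlap = {overlap_coefficient(d_val):.1%}")
```

Under these assumptions, the overlap is roughly 92%, 80%, and 69% at d = 0.2, 0.5, and 0.8. Even a "large" effect leaves most of the two distributions in common.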

Other Effect Size Measures

Pearson r, η², ω²

# ==========================================
# Other effect sizes
# ==========================================

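# Pearson r (for correlation); 0.40 is an illustrative value, not computed from data
r_val = 0.40
r_size = 'negligible' if abs(r_val)<0.1 else 'small' if abs(r_val)<0.3 else 'medium' if abs(r_val)<0.5 else 'large'
print(f"\nPearson r = {r_val} -> {r_size} effect")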

# Eta-squared (ANOVA: proportion of variance explained)
from scipy.stats import f_oneway
g1, g2, g3 = np.random.normal(50,5,30), np.random.normal(55,5,30), np.random.normal(60,5,30)
F, p = f_oneway(g1, g2, g3)
groups = [g1, g2, g3]
all_data = np.concatenate(groups)
grand_mean = all_data.mean()
k, n_total = len(groups), len(all_data)
ss_between = sum(len(g)*(g.mean()-grand_mean)**2 for g in groups)
ss_total = ((all_data - grand_mean)**2).sum()
eta_sq = ss_between / ss_total
# Omega-squared uses the within-group mean square, not the variance of the pooled data
ms_within = (ss_total - ss_between) / (n_total - k)
omega_sq = (ss_between - (k-1)*ms_within) / (ss_total + ms_within)
eta_size = 'negligible' if eta_sq<0.01 else 'small' if eta_sq<0.06 else 'medium' if eta_sq<0.14 else 'large'
print("\nANOVA effect sizes:")
print(f"η² (eta-squared) = {eta_sq:.4f} -> {eta_size}")
print(f"ω² (omega-squared, less biased) = {max(omega_sq, 0):.4f}")
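For reuse, the one-way ANOVA effect sizes can be wrapped in a single helper. This is a sketch under the standard textbook definitions: η² = SS_between / SS_total, ω² = (SS_between − (k−1)·MS_within) / (SS_total + MS_within), and Cohen's f = √(η² / (1 − η²)). The function name and the returned dictionary keys are assumptions made for this example.

```python
import numpy as np

def anova_effect_sizes(*groups):
    """Return eta-squared, omega-squared and Cohen's f for a one-way design."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_data = np.concatenate(groups)
    k, n_total = len(groups), all_data.size
    grand_mean = all_data.mean()

    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ss_total = ss_between + ss_within
    ms_within = ss_within / (n_total - k)

    eta_sq = ss_between / ss_total
    omega_sq = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)
    cohens_f = np.sqrt(eta_sq / (1 - eta_sq))
    # Negative omega-squared estimates are conventionally reported as 0
    return {"eta_sq": eta_sq, "omega_sq": max(omega_sq, 0.0), "cohens_f": cohens_f}

# Tiny hand-checkable example: eta_sq = 0.5, omega_sq = 4/13, f = 1
print(anova_effect_sizes([1, 2, 3], [2, 3, 4], [3, 4, 5]))
```

Because ω² subtracts the share of between-group variation expected from noise alone, it is never larger than η² on the same data.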

Effect Size Reference Table

Measure	Small	Medium	Large	Used For
Cohen's d	0.2	0.5	0.8	Mean differences (t-test)
Pearson r	0.1	0.3	0.5	Correlations
R²	0.01	0.09	0.25	Regression
η²	0.01	0.06	0.14	ANOVA
Cramér's V	0.1	0.3	0.5	Chi-square
Cohen's f	0.1	0.25	0.40	ANOVA (alternative)
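Cramér's V is the only measure in the table with no code elsewhere in this section, so here is a minimal sketch. It uses `scipy.stats.chi2_contingency` without the Yates continuity correction and the usual definition V = √(χ² / (n · min(r−1, c−1))). The contingency counts are hypothetical.

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table):
    """Cramér's V for an r x c contingency table of counts."""
    table = np.asarray(table, dtype=float)
    chi2 = chi2_contingency(table, correction=False)[0]  # chi-square statistic
    n = table.sum()
    min_dim = min(table.shape) - 1
    return np.sqrt(chi2 / (n * min_dim))

# Hypothetical 2x2 table: rows = treatment/control, columns = outcome yes/no
observed = [[30, 20],
            [20, 30]]
print(f"Cramér's V = {cramers_v(observed):.2f}")
```

For this table χ² = 4 with n = 100, so V = 0.2, which sits between the "small" and "medium" thresholds listed above.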

Key Takeaways

Summary: Effect Size

Always report effect size alongside p-values — p alone is incomplete
Cohen's d is the go-to for t-test comparisons
Eta-squared (η²) is the proportion of total variance explained by group membership in ANOVA
Small effects can matter in large populations (e.g., 0.2 SD improvement in public health)
Context matters — what's "small" varies by field
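The point about small effects in large populations can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a normally distributed outcome. The cutoff of 2 SD, the population of one million, and the function name are hypothetical choices for illustration.

```python
from scipy import stats

def extra_cases_above_cutoff(shift_sd, cutoff_sd, population):
    """Expected change in the count above a cutoff after shifting the mean by shift_sd."""
    before = stats.norm.sf(cutoff_sd)             # fraction above cutoff originally
    after = stats.norm.sf(cutoff_sd - shift_sd)   # fraction above cutoff after the shift
    return (after - before) * population

extra = extra_cases_above_cutoff(shift_sd=0.2, cutoff_sd=2.0, population=1_000_000)
print(f"Additional people above the cutoff: {extra:,.0f}")
```

Under these assumptions a 0.2 SD shift moves roughly 13,000 additional people per million past the cutoff, an increase of more than half over the original count in that tail.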
