Kendall's Tau
Count Pairs, Not Pixels — A More Intuitive Correlation
Kendall's τ takes a different approach to correlation than Pearson's r or Spearman's ρ: instead of working with deviations or squared rank differences, it simply counts how many pairs of observations agree or disagree in order. This makes it one of the most intuitive measures of rank association.
Key things Kendall's tau helps you understand:
- Concordant vs discordant pairs: A pair agrees (concordant) if both variables rank the two observations in the same order; it disagrees (discordant) if the order flips.
- Small-sample behavior: Kendall's τ is often preferred over Spearman's ρ for small datasets, where its p-values tend to be more reliable and a single misranked point has less influence.
- Tied rank handling: The τ-b variant adjusts its denominator for ties, which is why it is the default in most software.
In small samples or datasets full of ties, Kendall's τ often tells the more honest story.
What is Kendall's Tau?
Definition
Kendall's τ (tau) measures the ordinal association between two variables by counting concordant and discordant pairs.
Kendall's Tau
Kendall's tau is a non-parametric measure of rank correlation that assesses the similarity of the orderings of data when ranked by each of the quantities. It is based on the number of concordant and discordant pairs.
Kendall's τ-b Formula
$$\tau_b = \frac{C - D}{\sqrt{(C + D + T_X)(C + D + T_Y)}}$$
Here,
- $C$ = Number of concordant pairs
- $D$ = Number of discordant pairs
- $T_X$ = Number of pairs tied only in X
- $T_Y$ = Number of pairs tied only in Y
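import numpy as np
from scipy import stats

# Eight observations; y swaps each neighbouring pair of x
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([2, 1, 4, 3, 6, 5, 8, 7])

# kendalltau returns τ-b by default, plus a p-value for the null hypothesis of no association
tau, p_value = stats.kendalltau(x, y)
print(f"Kendall's tau = {tau:.4f}")
print(f"p-value = {p_value:.4f}")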
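To check the value scipy reports, the same statistic can be worked out by hand for the eight points above. This is only a sketch of the arithmetic; the symbols $C$, $D$, $T_X$ and $T_Y$ stand for the concordant, discordant and tied pair counts, and both tie counts are zero here because no value repeats.

```latex
\begin{aligned}
\binom{8}{2} &= 28 \quad \text{pairs in total} \\
D &= 4 \quad \text{(the swapped neighbours } (2,1),\ (4,3),\ (6,5),\ (8,7) \text{ in } y\text{)} \\
C &= 28 - 4 = 24 \\
\tau_b &= \frac{C - D}{\sqrt{(C + D + T_X)(C + D + T_Y)}}
        = \frac{24 - 4}{\sqrt{28 \cdot 28}}
        = \frac{20}{28} \approx 0.7143
\end{aligned}
```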
Concordant vs Discordant Pairs
A pair (i, j) is concordant if x and y order the two observations the same way: $x_i < x_j$ and $y_i < y_j$ (or both >).
A pair is discordant if the orderings disagree: $x_i < x_j$ but $y_i > y_j$ (or the reverse).
| Pair Type | Condition | Effect on τ |
|---|---|---|
| Concordant | Same order in X and Y | Increases τ |
| Discordant | Opposite order in X and Y | Decreases τ |
| Tied | Same value in X or Y | Neither concordant nor discordant; τ-b adjusts the denominator for it |
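The classification in the table can be sketched as a brute-force loop over all pairs. The function name `count_pairs` and the demo data are illustrative assumptions, not part of any library. The sketch uses τ-b written as (C − D) / sqrt((C + D + T_X)(C + D + T_Y)), where T_X and T_Y count pairs tied in only one variable.

```python
from itertools import combinations
from math import sqrt


def count_pairs(x, y):
    """Return (concordant, discordant, tied_x_only, tied_y_only) over all pairs."""
    c = d = tx = ty = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xj - xi, yj - yi
        if dx == 0 and dy == 0:
            continue          # tied in both variables: counted in neither tie term
        elif dx == 0:
            tx += 1           # tied in x only
        elif dy == 0:
            ty += 1           # tied in y only
        elif dx * dy > 0:
            c += 1            # same order in x and y
        else:
            d += 1            # opposite order in x and y
    return c, d, tx, ty


# Illustrative data with one tie in each variable (hypothetical values)
x_demo = [1, 2, 2, 3, 4, 5]
y_demo = [1, 3, 2, 3, 5, 4]

c, d, tx, ty = count_pairs(x_demo, y_demo)
tau_b_manual = (c - d) / sqrt((c + d + tx) * (c + d + ty))

print(c, d, tx, ty)                   # 12 1 1 1
print(f"tau-b = {tau_b_manual:.4f}")  # tau-b = 0.7857
```

The loop is O(n²), which is fine for illustration; library implementations typically use faster sorting-based counting.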
Kendall's τ-a vs τ-b
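| Variant | Formula | When to Use |
|---|---|---|
| τ-a | $\frac{C - D}{n(n-1)/2}$ | No ties in data |
| τ-b | $\frac{C - D}{\sqrt{(C + D + T_X)(C + D + T_Y)}}$ | Handles ties (default in scipy) |
| τ-c | $\frac{2(C - D)}{n^2 (m-1)/m}$, with $m = \min(\text{rows}, \text{columns})$ | Rectangular tables |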
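# τ-a (no tie correction): (C - D) / (n(n-1)/2). scipy does not compute it directly.
n = len(x)
# Each pair contributes +1 if concordant, -1 if discordant, 0 if tied
s = sum(np.sign(x[j] - x[i]) * np.sign(y[j] - y[i]) for i in range(n) for j in range(i + 1, n))
tau_a = s / (n * (n - 1) / 2)
print(f"τ-a = {tau_a:.4f}")
# With no ties in the data, τ-a and τ-b coincide
print(f"τ-b = {tau:.4f}")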
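The variants only differ once ties appear, so a small tied dataset makes the distinction visible. The sketch below assumes a scipy version whose `kendalltau` accepts the `variant` argument (`"b"` or `"c"`); τ-a is computed by hand because scipy does not expose it. The data are illustrative.

```python
import numpy as np
from scipy import stats

# Illustrative data with one tie in each variable (hypothetical values)
x_t = np.array([1, 2, 2, 3, 4, 5])
y_t = np.array([1, 3, 2, 3, 5, 4])
n_t = len(x_t)

# C - D via the sign of each pairwise product; tied pairs contribute 0
s = sum(
    np.sign(x_t[j] - x_t[i]) * np.sign(y_t[j] - y_t[i])
    for i in range(n_t)
    for j in range(i + 1, n_t)
)

tau_a = s / (n_t * (n_t - 1) / 2)                    # no tie correction
tau_b, _ = stats.kendalltau(x_t, y_t, variant="b")   # scipy default
tau_c, _ = stats.kendalltau(x_t, y_t, variant="c")   # Stuart's tau-c

print(f"tau-a = {tau_a:.4f}")
print(f"tau-b = {tau_b:.4f}")
print(f"tau-c = {tau_c:.4f}")
```

Here C − D = 11 out of 15 pairs, so τ-a divides by 15 while τ-b divides by 14 after discounting the tied pairs, which is why τ-b comes out slightly larger.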
Comparison with Spearman
np.random.seed(42)
n = 20
x = np.random.normal(0, 1, n)
y = x + np.random.normal(0, 0.5, n)
rho, _ = stats.spearmanr(x, y)
tau, _ = stats.kendalltau(x, y)
print(f"Spearman ρ = {rho:.4f}")
print(f"Kendall τ = {tau:.4f}")
Kendall vs Spearman
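Kendall's τ is generally considered more reliable than Spearman's ρ in small samples: its sampling distribution is usually said to approach normality faster, so p-values are more trustworthy, and a single badly misranked observation moves it less. In large samples the two usually lead to the same conclusions, although |τ| is typically smaller than |ρ| on the same data. Kendall is also preferred when the dataset has many tied ranks.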
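One way to see the robustness point concretely is to take a ranking that agrees perfectly except for a single item moved from first place to last. The data below are hypothetical; the sketch only assumes `scipy.stats.kendalltau` and `scipy.stats.spearmanr`.

```python
import numpy as np
from scipy import stats

# Hypothetical ranking: nine items agree perfectly, one item is moved
# from first place to last place.
x_out = np.arange(1, 11)                           # 1, 2, ..., 10
y_out = np.array([10, 1, 2, 3, 4, 5, 6, 7, 8, 9])

tau_out, _ = stats.kendalltau(x_out, y_out)
rho_out, _ = stats.spearmanr(x_out, y_out)

# Kendall: 9 of the 45 pairs are discordant -> (36 - 9) / 45
# Spearman: sum of squared rank differences = 9**2 + 9 * 1**2 = 90
print(f"Kendall tau  = {tau_out:.4f}")   # 0.6000
print(f"Spearman rho = {rho_out:.4f}")   # 0.4545
```

Because Spearman squares the rank differences, the one large displacement costs it more than the nine inverted pairs cost Kendall. This is also a case where τ ends up larger than ρ, so "τ is smaller than ρ" is a tendency rather than a rule.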
Kendall Tau in Machine Learning
| ML Application | Kendall Tau Usage | Why |
|---|---|---|
| Learning to rank | Evaluate ranking agreement | Handles ties better than Spearman |
| Recommendation systems | User preference ordering | Rank-based evaluation |
| Feature selection | Ordinal feature relationships | Robust to outliers |
| Inter-rater reliability | Agreement between annotators | Label quality assessment |
import numpy as np
from scipy.stats import kendalltau, spearmanr
# Compare rankings
model_a_ranks = [1, 2, 3, 4, 5]
model_b_ranks = [1, 3, 2, 5, 4]
kendall_tau, _ = kendalltau(model_a_ranks, model_b_ranks)
spearman_rho, _ = spearmanr(model_a_ranks, model_b_ranks)
print(f"Ranking agreement: Kendall τ = {kendall_tau:.3f}, Spearman ρ = {spearman_rho:.3f}")
print("Kendall τ is usually smaller in magnitude than Spearman ρ on the same ranking")
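Ranking evaluation often involves ties, for example when several documents share the same graded relevance label. A minimal sketch under that assumption follows; the labels and model scores are made up for illustration, and scipy's default τ-b is used to account for the tied labels.

```python
from scipy.stats import kendalltau

# Hypothetical graded relevance labels (higher = more relevant); note the ties
relevance = [3, 2, 2, 1, 0, 0]
# Hypothetical model scores for the same six documents
scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]

# tau-b: pairs tied in the labels are discounted in the denominator
tau_rank, p_rank = kendalltau(relevance, scores)
print(f"tau-b between labels and scores = {tau_rank:.3f}")
```

Only one pair is ordered against the labels (the document scored 0.4 has a higher label than the one scored 0.5), and the two tied label pairs are neither rewarded nor penalised.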
Key Takeaways
- Kendall's τ counts concordant vs discordant pairs: an intuitive, pairwise interpretation of rank association.
- τ ranges from -1 to +1: the same range as Spearman's ρ, although the two values are not numerically interchangeable.
- τ-b is the default in scipy: it corrects for tied ranks, unlike the simpler τ-a variant.
- Often preferred over Spearman for small samples and many ties, but its absolute values are typically lower: that reflects how τ is defined, not a weaker association.
"When your sample is small or full of ties, let Kendall's τ do the counting — concordant pairs agree, discordant pairs disagree, and the truth falls somewhere in between."
Summary: Kendall's Tau
- Kendall's τ counts concordant vs discordant pairs — intuitive interpretation
- τ ranges from -1 to +1 — same scale as Spearman's ρ
- τ-b is the default — handles tied ranks gracefully
- More robust than Spearman for small samples and many ties
- Typically lower absolute values than Spearman on the same data: the two statistics sit on different scales, so a smaller τ does not mean a weaker association
- Use when: sample size is small, ties are frequent, or you need a more robust measure