Kendall's Tau


Count Pairs, Not Pixels — A More Intuitive Correlation

Kendall's τ takes a different approach to correlation: rather than summing squared differences between values or ranks, it counts how many pairs of observations agree or disagree in their ordering. This makes it one of the most intuitive measures of rank association.

Key things Kendall's tau helps you understand:

Concordant vs discordant pairs — A pair agrees (concordant) if both variables rank in the same order; it disagrees (discordant) if they flip.
Robust small-sample behavior — Kendall's τ has better statistical properties than Spearman for small datasets with many tied ranks.
Tied rank handling — The τ-b variant gracefully adjusts for ties, making it the default choice in most software.

In small samples or datasets full of ties, Kendall's τ often tells the more honest story.

What is Kendall's Tau?

Definition

Kendall's τ (tau) measures the ordinal association between two variables by counting concordant and discordant pairs.

Kendall's Tau

Kendall's tau is a non-parametric measure of rank correlation that assesses the similarity of the orderings of data when ranked by each of the quantities. It is based on the number of concordant and discordant pairs.

Kendall's τ-b Formula

$$\tau_b = \frac{C - D}{\sqrt{(C + D + T_x)(C + D + T_y)}}$$

Here,

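$C$ = number of concordant pairs
$D$ = number of discordant pairs
$T_x$ = number of pairs tied in X only (not tied in Y)
$T_y$ = number of pairs tied in Y only (not tied in X)

Pairs tied in both X and Y appear in neither factor of the denominator.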

import numpy as np
from scipy import stats

np.random.seed(42)

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([2, 1, 4, 3, 6, 5, 8, 7])

tau, p_value = stats.kendalltau(x, y)
print(f"Kendall's tau = {tau:.4f}")
print(f"p-value       = {p_value:.4f}")
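The value of τ for this example can be checked by hand. This is a worked sketch that applies the τ-b formula above to the eight observations; the counts come from inspecting the data, not from scipy. The only pairs that disagree are the four adjacent swaps (1,2), (3,4), (5,6), (7,8), and there are no ties.

```latex
\begin{aligned}
\binom{8}{2} &= 28, \qquad D = 4, \qquad C = 28 - 4 = 24, \qquad T_x = T_y = 0 \\
\tau_b &= \frac{C - D}{\sqrt{(C + D + T_x)(C + D + T_y)}}
        = \frac{24 - 4}{\sqrt{28 \cdot 28}}
        = \frac{20}{28} \approx 0.714
\end{aligned}
```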

Concordant vs Discordant Pairs

A pair (i, j) is concordant if x and y order the two observations the same way: $x_i < x_j$ and $y_i < y_j$, or $x_i > x_j$ and $y_i > y_j$.

A pair is discordant if the orderings disagree: $x_i < x_j$ but $y_i > y_j$, or $x_i > x_j$ but $y_i < y_j$.

Pair Type	Condition	Effect on τ
Concordant	Same order in X and Y	Increases τ
Discordant	Opposite order in X and Y	Decreases τ
Tied	Same value in X or Y	Neither concordant nor discordant; τ-b adjusts the denominator
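The classification in the table can be sketched as a brute-force pair count. The helper names `count_pairs` and `tau_b` and the sample data are illustrative assumptions, not part of scipy; the loop is O(n²) and is meant for understanding, not for large datasets.

```python
from itertools import combinations
from math import sqrt

from scipy.stats import kendalltau


def count_pairs(x, y):
    """Classify every pair (i, j) with i < j as concordant, discordant, or tied."""
    counts = {"C": 0, "D": 0, "T_x": 0, "T_y": 0, "T_xy": 0}
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            counts["T_xy"] += 1   # tied in both variables
        elif dx == 0:
            counts["T_x"] += 1    # tied in X only
        elif dy == 0:
            counts["T_y"] += 1    # tied in Y only
        elif dx * dy > 0:
            counts["C"] += 1      # same order in X and Y
        else:
            counts["D"] += 1      # opposite order in X and Y
    return counts


def tau_b(x, y):
    """Tau-b from the pair counts, using the formula given earlier."""
    c = count_pairs(x, y)
    denom = sqrt((c["C"] + c["D"] + c["T_x"]) * (c["C"] + c["D"] + c["T_y"]))
    return (c["C"] - c["D"]) / denom


# Hypothetical data with one tie in each variable
x = [1, 2, 2, 3, 4, 5]
y = [1, 3, 2, 3, 5, 4]

counts = count_pairs(x, y)
scipy_tau, _ = kendalltau(x, y)
print(counts)
print(f"hand-counted tau-b = {tau_b(x, y):.4f}")
print(f"scipy tau-b        = {scipy_tau:.4f}")
```

For this data the count gives 12 concordant pairs, 1 discordant pair, and one tie in each variable, so τ-b = 11/14 ≈ 0.786, which should match scipy's result.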

Kendall's τ-a, τ-b, and τ-c

Variant	Formula	When to Use
τ-a	$\frac{C - D}{\frac{1}{2}n(n-1)}$	No ties in data
τ-b	$\frac{C - D}{\sqrt{(C+D+T_x)(C+D+T_y)}}$	Handles ties (default in scipy)
τ-c	$\frac{2(C-D)}{n^2 \cdot \frac{m-1}{m}}$	Rectangular contingency tables ($m$ = smaller of the number of rows and columns)

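# τ-a (no tie correction): (C - D) divided by the total number of pairs
# scipy does not compute τ-a directly, so sum the pairwise sign products by hand
from itertools import combinations
n = len(x)
c_minus_d = sum(np.sign(x[i] - x[j]) * np.sign(y[i] - y[j]) for i, j in combinations(range(n), 2))
tau_a = c_minus_d / (n * (n - 1) / 2)
print(f"τ-a = {tau_a:.4f}")
print(f"τ-b = {tau:.4f}")  # equals τ-a here because x and y contain no ties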
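The three variants only differ once ties are present. The sketch below uses a small made-up dataset with ties, computes τ-a by hand, and asks scipy for τ-b and τ-c through the `variant` argument of `kendalltau` (available in recent scipy releases; treat the exact minimum version as an assumption to check).

```python
from itertools import combinations

import numpy as np
from scipy.stats import kendalltau

# Hypothetical ordinal data with ties in both variables
x = np.array([1, 1, 2, 2, 3, 3])
y = np.array([1, 2, 2, 3, 3, 3])
n = len(x)

# C - D: each pair contributes +1 (concordant), -1 (discordant), or 0 (tied)
c_minus_d = sum(
    np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    for i, j in combinations(range(n), 2)
)

tau_a = c_minus_d / (n * (n - 1) / 2)       # no tie correction
tau_b, _ = kendalltau(x, y, variant="b")    # tie-corrected denominator
tau_c, _ = kendalltau(x, y, variant="c")    # scaled by the number of distinct levels

print(f"tau-a = {tau_a:.4f}")
print(f"tau-b = {tau_b:.4f}")
print(f"tau-c = {tau_c:.4f}")
```

Here C − D = 9 out of 15 pairs, so τ-a = 0.6, while τ-b ≈ 0.783 and τ-c = 0.75. Because τ-a divides by all pairs, including tied ones, it cannot reach ±1 when ties exist; that is the gap τ-b is designed to close.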

Comparison with Spearman

np.random.seed(42)
n = 20
x = np.random.normal(0, 1, n)
y = x + np.random.normal(0, 0.5, n)

rho, _ = stats.spearmanr(x, y)
tau, _ = stats.kendalltau(x, y)

print(f"Spearman ρ = {rho:.4f}")
print(f"Kendall τ  = {tau:.4f}")

Kendall vs Spearman

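Kendall's τ is often preferred for small samples: its sampling distribution approaches normality faster than that of Spearman's ρ, and (when there are no ties) it has a direct interpretation as the probability that a random pair is concordant minus the probability that it is discordant. For large samples, τ and ρ usually lead to the same conclusions, although |τ| is typically smaller than |ρ| on the same data. Kendall's τ-b is also preferred when the dataset has many tied ranks.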
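To see how the two coefficients compare in practice, a small simulation can repeat the comparison above many times. This is a sketch under assumptions: the seed, the number of trials, and the noise level are arbitrary choices, and the result describes this one data-generating process only.

```python
import numpy as np
from scipy.stats import kendalltau, spearmanr

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n, n_trials = 20, 500

taus, rhos = [], []
for _ in range(n_trials):
    # Same setup as the comparison above: y is x plus Gaussian noise
    x = rng.normal(0, 1, n)
    y = x + rng.normal(0, 0.5, n)
    tau, _ = kendalltau(x, y)
    rho, _ = spearmanr(x, y)
    taus.append(tau)
    rhos.append(rho)

share_smaller = np.mean(np.abs(taus) < np.abs(rhos))
print(f"mean Kendall tau  = {np.mean(taus):.3f}")
print(f"mean Spearman rho = {np.mean(rhos):.3f}")
print(f"share of trials with |tau| < |rho| = {share_smaller:.2f}")
```

Both coefficients describe the same monotonic relationship; the systematic gap in magnitude reflects how each one is scaled, not a difference in the strength of the association.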

Kendall Tau in Machine Learning

ML Application	Kendall Tau Usage	Why
Learning to rank	Evaluate ranking agreement	Handles ties better than Spearman
Recommendation systems	User preference ordering	Rank-based evaluation
Feature selection	Ordinal feature relationships	Robust to outliers
Inter-rater reliability	Agreement between annotators	Label quality assessment

import numpy as np
from scipy.stats import kendalltau, spearmanr

# Compare rankings
model_a_ranks = [1, 2, 3, 4, 5]
model_b_ranks = [1, 3, 2, 5, 4]

kendall_tau, _ = kendalltau(model_a_ranks, model_b_ranks)
spearman_rho, _ = spearmanr(model_a_ranks, model_b_ranks)
print(f"Ranking agreement: Kendall τ = {kendall_tau:.3f}, Spearman ρ = {spearman_rho:.3f}")
print("Kendall tau is typically smaller in magnitude than Spearman rho on the same rankings")
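The inter-rater use case from the table can be sketched the same way. The ratings below are invented for illustration, and the annotator names are hypothetical; ordinal labels like these contain many ties, which is where τ-b (scipy's default) is useful.

```python
import numpy as np
from scipy.stats import kendalltau

# Hypothetical ordinal ratings (1-5) from three annotators on eight items
ratings = {
    "annotator_a": [1, 2, 2, 3, 4, 4, 5, 5],
    "annotator_b": [1, 1, 2, 3, 3, 4, 5, 5],
    "annotator_c": [2, 1, 3, 3, 5, 4, 4, 5],
}

names = list(ratings)
# Diagonal is 1: tau-b of a non-constant rating vector with itself
agreement = np.eye(len(names))
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        tau, _ = kendalltau(ratings[names[i]], ratings[names[j]])
        agreement[i, j] = agreement[j, i] = tau

# Print the pairwise agreement matrix, one annotator per row
for name, row in zip(names, agreement):
    print(name, np.round(row, 3))
```

A low off-diagonal value flags a pair of annotators whose orderings disagree, which may be worth reviewing before the labels are used for training.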

Key Takeaways

Kendall's τ counts concordant vs discordant pairs — an intuitive, pair-wise interpretation of rank association.

τ ranges from -1 to +1 — same scale as Spearman's ρ, making comparison straightforward.

τ-b is the default — handles tied ranks gracefully, unlike the simpler τ-a variant.

More robust than Spearman for small samples and many ties. Expect lower absolute values than ρ on the same data: the two coefficients sit on different scales, so this is not a flaw.

"When your sample is small or full of ties, let Kendall's τ do the counting — concordant pairs agree, discordant pairs disagree, and the truth falls somewhere in between."

Summary: Kendall's Tau

Kendall's τ counts concordant vs discordant pairs — intuitive interpretation
τ ranges from -1 to +1 — same scale as Spearman's ρ
τ-b is the default — handles tied ranks gracefully
More robust than Spearman for small samples and many ties
Lower absolute values than Spearman for the same data — this is expected and not a difference in strength
Use when: sample size is small, ties are frequent, or you need a more robust measure
