Introduction
Advanced Statistical Methods
Visualizing Similarity as Spatial Maps
Multidimensional scaling transforms proximity data into a geometric configuration where distances reflect similarities. Both metric and non-metric MDS preserve relational structure in low-dimensional maps.
- Marketing research β Visualize how consumers perceive brand similarities in perceptual maps
- Psychology β Map cognitive distances between stimuli in multidimensional space
- Bioinformatics β Visualize genetic distances between populations in two or three dimensions
MDS turns abstract similarity judgments into intuitive spatial representations.
Multidimensional Scaling (MDS) is a family of techniques that transform a matrix of pairwise distances or dissimilarities into a configuration of points in a low-dimensional space. The goal is to preserve the original distance structure as faithfully as possible, yielding a spatial representation that reveals hidden patterns in proximity data.
MDS is particularly valuable when the underlying variables are latent or when direct measurement is infeasible β for example, when subjects can judge the similarity between stimuli but cannot articulate the dimensions on which they differ.
Distance and Dissimilarity
Mathematical Preliminaries
Let be an matrix of observed dissimilarities between objects. MDS seeks points (with ) such that the Euclidean distances approximate .
DfDissimilarity Matrix Properties
A dissimilarity matrix is:
- Symmetric: for all
- Zero diagonal: for all
- Non-negative:
- Metric (if applicable): satisfies the triangle inequality
Classical MDS (Principal Coordinates Analysis)
Eigendecomposition Method
Classical MDS (also called Principal Coordinates Analysis, Torgerson, 1952) provides an exact solution when dissimilarities are Euclidean distances:
where , is the centering matrix, and is the doubly centered matrix of squared distances.
ThClassical MDS Solution
If contains Euclidean distances, then is positive semi-definite and equals where is the configuration matrix. The -dimensional MDS solution is:
where contains the leading eigenvectors of and the corresponding eigenvalues.
The quality of representation is assessed by the proportion of variance explained:
where are the positive eigenvalues of .
Torgerson's Double Centering
Classical MDS can be expressed as a Gram matrix decomposition:
The eigendecomposition of yields the configuration , where each row is the coordinate vector for object .
Classical MDS preserves metric distance relationships. When the original distances are not Euclidean (e.g., city-block, path distances), the centered matrix may have negative eigenvalues. In such cases, the negative eigenvalues are typically set to zero, and the remaining positive eigenvalues define the best Euclidean approximation.
Metric MDS
Stress Function
When the dissimilarities are not Euclidean, or when a general monotonic relationship is desired, metric MDS minimizes a stress function:
DfMetric Stress (Kruskal's Stress-1)
Kruskal's Stress-1 for -dimensional MDS is:
where are the configuration distances and are the disparities (transformed dissimilarities). In metric MDS, for a known function (typically the identity).
Common stress measures include:
Strain (Smallest Space Analysis)
An alternative to stress is the strain function (Lingoes, 1971; Guttman, 1968):
DfStrain
where are the elements of the centered matrix . Minimizing strain is equivalent to classical MDS and can be solved via eigendecomposition.
Non-Metric MDS
Ordinal Stress
When only the rank order of dissimilarities is meaningful (ordinal data), non-metric MDS (Kruskal, 1964) finds a monotonic transformation:
DfNon-Metric MDS
Non-metric MDS minimizes:
subject to the monotonicity constraint:
where are the disparities obtained by isotonic regression of on .
The Primary Approach places ties in the dissimilarities at the same distance level. The Secondary Approach places ties at different distances.
Algorithm: SMACOF
The SMACOF (Scaling by Majorizing a Complicated Function) algorithm (de Leeuw, 1977) provides guaranteed convergence to a local minimum:
where is the Moore-Penrose pseudoinverse of the configuration matrix, is the step size, and is the Guttman transform.
SMACOF guarantees monotone decrease of stress at each iteration. The convergence rate is linear, and the algorithm typically converges in 50β200 iterations. Multiple random starts are recommended to avoid local minima.
Shepard Diagram
The Shepard diagram is the primary diagnostic tool for MDS. It plots the original dissimilarities (x-axis) against the configuration distances (y-axis).
DfShepard Diagram
A Shepard diagram displays the relationship between original dissimilarities and MDS distances:
- Metric MDS: Points cluster around a linear (or known function) relationship
- Non-Metric MDS: Points cluster around a monotonically increasing function
- Stress: Proportional to the vertical scatter of points around the monotonic curve
- Degenerate solutions: All distances equal, appearing as a horizontal line
PROXSCAL
PROXSCAL (Commandeur & Heiser, 1993) is a general-purpose MDS algorithm that accommodates:
- Metric, non-metric, and polynomial transformations
- Individual differences (INDSCAL model)
- Weighted Euclidean models
- Multiple dissimilarity matrices
where is the number of sources (replications), are weights, and in the weighted Euclidean model.
Interpretation and Dimensional Interpretation
Procrustean Rotation
To facilitate interpretation, the MDS configuration can be rotated to match a target configuration:
where is an orthogonal rotation matrix that minimizes .
Property Fitting
External variables can be regressed onto MDS dimensions to label the axes:
where are regression coefficients indicating how strongly variable relates to MDS dimension .
Python Implementation
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances
import matplotlib.pyplot as plt
np.random.seed(42)
# Generate synthetic dissimilarity data
n = 30
X_true = np.random.randn(n, 3)
D_true = pairwise_distances(X_true, metric='euclidean')
# --- Classical MDS (sklearn) ---
mds_classical = MDS(n_components=2, dissimilarity='euclidean', random_state=42)
X_classical = mds_classical.fit_transform(D_true)
print("Classical MDS stress:", mds_classical.stress_)
# --- Metric MDS ---
mds_metric = MDS(
n_components=2,
metric=True,
dissimilarity='euclidean',
random_state=42,
n_init=10,
max_iter=300,
)
X_metric = mds_metric.fit_transform(D_true)
print("Metric MDS stress:", mds_metric.stress_)
# --- Non-metric MDS ---
mds_nonmetric = MDS(
n_components=2,
metric=False,
dissimilarity='euclidean',
random_state=42,
n_init=10,
max_iter=300,
)
X_nonmetric = mds_nonmetric.fit_transform(D_true)
print("Non-metric MDS stress:", mds_nonmetric.stress_)
# --- Shepard Diagram ---
def shepard_diagram(D_true, X_mds, title="Shepard Diagram"):
d_mds = pairwise_distances(X_mds)
triu_idx = np.triu_indices(n, k=1)
fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(D_true[triu_idx], d_mds[triu_idx], alpha=0.3, s=10)
# Monotonic line (isotonic regression)
from scipy.stats import rankdata
from sklearn.isotonic import IsotonicRegression
ir = IsotonicRegression()
d_flat = D_true[triu_idx].reshape(-1)
m_flat = d_mds[triu_idx].reshape(-1)
ir.fit(d_flat, m_flat)
x_mono = np.linspace(d_flat.min(), d_flat.max(), 200)
ax.plot(x_mono, ir.predict(x_mono), 'r-', linewidth=2, label='Monotonic fit')
ax.plot([0, d_flat.max()], [0, d_flat.max()], 'k--', alpha=0.5, label='Perfect')
ax.set_xlabel("Original dissimilarities")
ax.set_ylabel("MDS distances")
ax.set_title(title)
ax.legend()
plt.tight_layout()
return fig
# --- Classical MDS via eigendecomposition ---
def classical_mds_eigen(D, m=2):
n = D.shape[0]
D2 = D ** 2
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J
eigenvalues, eigenvectors = np.linalg.eigh(B)
idx = np.argsort(eigenvalues)[::-1][:m]
Lambda_m = np.diag(np.sqrt(np.maximum(eigenvalues[idx], 0)))
X = eigenvectors[:, idx] @ Lambda_m
return X, eigenvalues
X_manual, evals = classical_mds_eigen(D_true, m=2)
proportion = np.sum(evals[evals > 0][:2]) / np.sum(evals[evals > 0])
print(f"Classical MDS variance explained: {proportion:.4f}")
# --- Stress computation ---
def compute_stress(D, X, metric=True):
n = D.shape[0]
d_mds = pairwise_distances(X)
if metric:
return np.sqrt(np.sum((D - d_mds)**2) / np.sum(D**2))
else:
from scipy.stats import rankdata
D_rank = rankdata(D, method='average')
d_rank = rankdata(d_mds, method='average')
return np.sqrt(np.sum((D_rank - d_rank)**2) / np.sum(D_rank**2))
stress_manual = compute_stress(D_true, X_manual)
print(f"Manual classical MDS stress: {stress_manual:.4f}")
# --- Individual Differences (INDSCAL) ---
def indscal(K_matrices, m=2):
n = K_matrices[0].shape[0]
K = len(K_matrices)
B_avg = np.zeros((n, n))
weights = []
for Dk in K_matrices:
Dk2 = Dk ** 2
J = np.eye(n) - np.ones((n, n)) / n
B_k = -0.5 * J @ Dk2 @ J
B_avg += B_k
B_avg /= K
eigenvalues, eigenvectors = np.linalg.eigh(B_avg)
idx = np.argsort(eigenvalues)[::-1][:m]
X_group = eigenvectors[:, idx] @ np.diag(np.sqrt(np.maximum(eigenvalues[idx], 0)))
for Dk in K_matrices:
Dk2 = Dk ** 2
J = np.eye(n) - np.ones((n, n)) / n
B_k = -0.5 * J @ Dk2 @ J
w_k = np.diag(X_group.T @ B_k @ X_group) / np.diag(X_group.T @ X_group)
weights.append(w_k)
return X_group, np.array(weights)
# Simulate 3 judges
K_matrices = [D_true + 0.3 * np.random.randn(n, n) for _ in range(3)]
K_matrices = [(k + k.T) / 2 for k in K_matrices] # symmetrize
X_indscal, w_indscal = indscal(K_matrices, m=2)
print("INDSCAL weights:\n", np.round(w_indscal, 3))
Quality Assessment
Assessing MDS Solution Quality:
-
Stress value: Stress-1 < 0.025 (excellent), < 0.05 (good), < 0.10 (fair), > 0.15 (poor). Use scree plot of stress vs. dimensionality.
-
Shepard diagram: Examine the scatter of points around the monotonic curve. Tight clustering indicates good fit.
-
Variance accounted for (VAF): In classical MDS, the proportion of positive eigenvalues captured by the -dimensional solution.
-
R-squared: , the proportion of variance in disparities explained by distances.
-
Number of dimensions: Use scree plot (stress vs. ), elbow criterion, or interpretability criterion.
-
Local minima: Run MDS with multiple random starts (10β50) and verify convergence to the same stress level.
-
Stability: Perturb the dissimilarities slightly and check if the configuration remains similar (Procrustes correlation > 0.90).
-
Degenerate solutions: If all distances are equal or the configuration collapses to a single point, the MDS solution is degenerate β indicating severe misfit.
Applications
Psychology: Mapping perceived similarity between stimuli (e.g., colors, facial expressions, concepts). The spatial configuration reveals the psychological dimensions underlying perception.
Marketing: Product positioning β mapping consumer perceptions of brands in a space where distances reflect perceived dissimilarity. Competitive strategies are informed by the proximity structure.
Ecology: Ordination of species communities along environmental gradients. Non-metric MDS (NMDS) is the standard method for community ecology because it handles non-Euclidean ecological distances (Bray-Curtis, Jaccard).
Bioinformatics: Mapping protein structures, gene expression profiles, or drug response patterns where direct Euclidean distance is inappropriate.