πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Multidimensional Scaling (MDS)

Advanced Statistical MethodsDimensionality Reduction🟒 Free Lesson

Advertisement

Introduction

Advanced Statistical Methods

Visualizing Similarity as Spatial Maps

Multidimensional scaling transforms proximity data into a geometric configuration where distances reflect similarities. Both metric and non-metric MDS preserve relational structure in low-dimensional maps.

  • Marketing research β€” Visualize how consumers perceive brand similarities in perceptual maps
  • Psychology β€” Map cognitive distances between stimuli in multidimensional space
  • Bioinformatics β€” Visualize genetic distances between populations in two or three dimensions

MDS turns abstract similarity judgments into intuitive spatial representations.


Multidimensional Scaling (MDS) is a family of techniques that transform a matrix of pairwise distances or dissimilarities into a configuration of points in a low-dimensional space. The goal is to preserve the original distance structure as faithfully as possible, yielding a spatial representation that reveals hidden patterns in proximity data.

MDS is particularly valuable when the underlying variables are latent or when direct measurement is infeasible β€” for example, when subjects can judge the similarity between stimuli but cannot articulate the dimensions on which they differ.

Distance and Dissimilarity

Mathematical Preliminaries

Let Ξ”=[Ξ΄ij]\boldsymbol{\Delta} = [\delta_{ij}] be an nΓ—nn \times n matrix of observed dissimilarities between nn objects. MDS seeks points x1,…,xn∈Rm\mathbf{x}_1, \dots, \mathbf{x}_n \in \mathbb{R}^m (with mβ‰ͺnm \ll n) such that the Euclidean distances dij=βˆ₯xiβˆ’xjβˆ₯d_{ij} = \|\mathbf{x}_i - \mathbf{x}_j\| approximate Ξ΄ij\delta_{ij}.

DfDissimilarity Matrix Properties

A dissimilarity matrix Ξ”\boldsymbol{\Delta} is:

  • Symmetric: Ξ΄ij=Ξ΄ji\delta_{ij} = \delta_{ji} for all i,ji, j
  • Zero diagonal: Ξ΄ii=0\delta_{ii} = 0 for all ii
  • Non-negative: Ξ΄ijβ‰₯0\delta_{ij} \geq 0
  • Metric (if applicable): satisfies the triangle inequality Ξ΄ik≀δij+Ξ΄jk\delta_{ik} \leq \delta_{ij} + \delta_{jk}

Classical MDS (Principal Coordinates Analysis)

Eigendecomposition Method

Classical MDS (also called Principal Coordinates Analysis, Torgerson, 1952) provides an exact solution when dissimilarities are Euclidean distances:

B=βˆ’12JΞ”(2)J\mathbf{B} = -\frac{1}{2}\mathbf{J}\boldsymbol{\Delta}^{(2)}\mathbf{J}

where Ξ”(2)=[Ξ΄ij2]\boldsymbol{\Delta}^{(2)} = [\delta_{ij}^2], J=Inβˆ’1n11⊀\mathbf{J} = \mathbf{I}_n - \frac{1}{n}\mathbf{1}\mathbf{1}^{\top} is the centering matrix, and B\mathbf{B} is the doubly centered matrix of squared distances.

ThClassical MDS Solution

If Ξ”\boldsymbol{\Delta} contains Euclidean distances, then B\mathbf{B} is positive semi-definite and equals XX⊀\mathbf{X}\mathbf{X}^{\top} where X\mathbf{X} is the nΓ—mn \times m configuration matrix. The mm-dimensional MDS solution is:

Xm=VmΞ›m1/2\mathbf{X}_m = \mathbf{V}_m \boldsymbol{\Lambda}_m^{1/2}

where Vm\mathbf{V}_m contains the mm leading eigenvectors of B\mathbf{B} and Ξ›m=diag(Ξ»1,…,Ξ»m)\boldsymbol{\Lambda}_m = \text{diag}(\lambda_1, \dots, \lambda_m) the corresponding eigenvalues.

The quality of representation is assessed by the proportion of variance explained:

ProportionΒ ofΒ variance=βˆ‘k=1mΞ»kβˆ‘k=1nβˆ’1Ξ»k+\text{Proportion of variance} = \frac{\sum_{k=1}^{m} \lambda_k}{\sum_{k=1}^{n-1} \lambda_k^+}

where Ξ»k+\lambda_k^+ are the positive eigenvalues of B\mathbf{B}.

Torgerson's Double Centering

Classical MDS can be expressed as a Gram matrix decomposition:

G=XX⊀=βˆ’12JΞ”(2)J=VΞ›V⊀\mathbf{G} = \mathbf{X}\mathbf{X}^{\top} = -\frac{1}{2}\mathbf{J}\boldsymbol{\Delta}^{(2)}\mathbf{J} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{\top}

The eigendecomposition of G\mathbf{G} yields the configuration X=VmΞ›m1/2\mathbf{X} = \mathbf{V}_m\boldsymbol{\Lambda}_m^{1/2}, where each row xi\mathbf{x}_i is the coordinate vector for object ii.

Classical MDS preserves metric distance relationships. When the original distances are not Euclidean (e.g., city-block, path distances), the centered matrix B\mathbf{B} may have negative eigenvalues. In such cases, the negative eigenvalues are typically set to zero, and the remaining positive eigenvalues define the best Euclidean approximation.

Metric MDS

Stress Function

When the dissimilarities are not Euclidean, or when a general monotonic relationship is desired, metric MDS minimizes a stress function:

DfMetric Stress (Kruskal's Stress-1)

Kruskal's Stress-1 for mm-dimensional MDS is:

Stressm=βˆ‘i<j(dijβˆ’Ξ΄^ij)2βˆ‘i<jdij2\text{Stress}_m = \sqrt{\frac{\sum_{i<j} (d_{ij} - \hat{\delta}_{ij})^2}{\sum_{i<j} d_{ij}^2}}

where dij=βˆ₯xiβˆ’xjβˆ₯d_{ij} = \|\mathbf{x}_i - \mathbf{x}_j\| are the configuration distances and Ξ΄^ij\hat{\delta}_{ij} are the disparities (transformed dissimilarities). In metric MDS, Ξ΄^ij=f(Ξ΄ij)\hat{\delta}_{ij} = f(\delta_{ij}) for a known function ff (typically the identity).

Common stress measures include:

STRESS-1=βˆ‘i<j(Ξ΄ijβˆ’dij)2βˆ‘i<jΞ΄ij2\text{STRESS-1} = \sqrt{\frac{\sum_{i<j}(\delta_{ij} - d_{ij})^2}{\sum_{i<j}\delta_{ij}^2}}
STRESS-2=βˆ‘i<j(Ξ΄ijβˆ’dij)2βˆ‘i<j(dijβˆ’dΛ‰)2\text{STRESS-2} = \sqrt{\frac{\sum_{i<j}(\delta_{ij} - d_{ij})^2}{\sum_{i<j}(d_{ij} - \bar{d})^2}}
RSS=βˆ‘i<j(Ξ΄ijβˆ’dij)2(RawΒ Stress)\text{RSS} = \sum_{i<j} (\delta_{ij} - d_{ij})^2 \quad \text{(Raw Stress)}

Strain (Smallest Space Analysis)

An alternative to stress is the strain function (Lingoes, 1971; Guttman, 1968):

DfStrain

Strain=βˆ‘i,j(bijβˆ’xi⊀xj)2\text{Strain} = \sum_{i,j} \left(b_{ij} - \mathbf{x}_i^{\top}\mathbf{x}_j\right)^2

where bijb_{ij} are the elements of the centered matrix B\mathbf{B}. Minimizing strain is equivalent to classical MDS and can be solved via eigendecomposition.

Non-Metric MDS

Ordinal Stress

When only the rank order of dissimilarities is meaningful (ordinal data), non-metric MDS (Kruskal, 1964) finds a monotonic transformation:

DfNon-Metric MDS

Non-metric MDS minimizes:

Stressnm=βˆ‘i<j(dijβˆ’d^ij)2βˆ‘i<jdij2\text{Stress}_{nm} = \sqrt{\frac{\sum_{i<j}(d_{ij} - \hat{d}_{ij})^2}{\sum_{i<j} d_{ij}^2}}

subject to the monotonicity constraint:

Ξ΄ij<Ξ΄klβ€…β€ŠβŸΉβ€…β€Šd^ij≀d^kl\delta_{ij} < \delta_{kl} \implies \hat{d}_{ij} \leq \hat{d}_{kl}

where d^ij\hat{d}_{ij} are the disparities obtained by isotonic regression of dijd_{ij} on Ξ΄ij\delta_{ij}.

The Primary Approach places ties in the dissimilarities at the same distance level. The Secondary Approach places ties at different distances.

Algorithm: SMACOF

The SMACOF (Scaling by Majorizing a Complicated Function) algorithm (de Leeuw, 1977) provides guaranteed convergence to a local minimum:

X(t+1)=V†[X(t)βˆ’1Ο„G(X(t))X(t)]\mathbf{X}^{(t+1)} = \mathbf{V}^{\dagger}\left[\mathbf{X}^{(t)} - \frac{1}{\tau}\mathbf{G}(\mathbf{X}^{(t)})\mathbf{X}^{(t)}\right]

where V†\mathbf{V}^{\dagger} is the Moore-Penrose pseudoinverse of the configuration matrix, Ο„\tau is the step size, and G(X)\mathbf{G}(\mathbf{X}) is the Guttman transform.

SMACOF guarantees monotone decrease of stress at each iteration. The convergence rate is linear, and the algorithm typically converges in 50–200 iterations. Multiple random starts are recommended to avoid local minima.

Shepard Diagram

The Shepard diagram is the primary diagnostic tool for MDS. It plots the original dissimilarities Ξ΄ij\delta_{ij} (x-axis) against the configuration distances dijd_{ij} (y-axis).

DfShepard Diagram

A Shepard diagram displays the relationship between original dissimilarities and MDS distances:

  • Metric MDS: Points cluster around a linear (or known function) relationship
  • Non-Metric MDS: Points cluster around a monotonically increasing function
  • Stress: Proportional to the vertical scatter of points around the monotonic curve
  • Degenerate solutions: All distances equal, appearing as a horizontal line

PROXSCAL

PROXSCAL (Commandeur & Heiser, 1993) is a general-purpose MDS algorithm that accommodates:

  • Metric, non-metric, and polynomial transformations
  • Individual differences (INDSCAL model)
  • Weighted Euclidean models
  • Multiple dissimilarity matrices
Οƒ(X,D)=βˆ‘k=1Kβˆ‘i<jwijk(Ξ΄ijkβˆ’dijk)2\sigma(\mathbf{X}, \mathbf{D}) = \sum_{k=1}^{K} \sum_{i<j} w_{ijk} \left(\delta_{ijk} - d_{ijk}\right)^2

where KK is the number of sources (replications), wijkw_{ijk} are weights, and dijk=βˆ₯xikβˆ’xjkβˆ₯d_{ijk} = \|\mathbf{x}_{ik} - \mathbf{x}_{jk}\| in the weighted Euclidean model.

Interpretation and Dimensional Interpretation

Procrustean Rotation

To facilitate interpretation, the MDS configuration can be rotated to match a target configuration:

Xrotated=XT\mathbf{X}_{\text{rotated}} = \mathbf{X}\mathbf{T}

where T\mathbf{T} is an orthogonal rotation matrix that minimizes βˆ₯Xtargetβˆ’XTβˆ₯F\|\mathbf{X}_{\text{target}} - \mathbf{X}\mathbf{T}\|_F.

Property Fitting

External variables can be regressed onto MDS dimensions to label the axes:

Ξ΄ijβ‰ˆΞ²0+βˆ‘d=1mΞ²d∣xidβˆ’xjd∣\delta_{ij} \approx \beta_0 + \sum_{d=1}^{m} \beta_d |x_{id} - x_{jd}|

where Ξ²d\beta_d are regression coefficients indicating how strongly variable dd relates to MDS dimension dd.

Python Implementation

import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances
import matplotlib.pyplot as plt

np.random.seed(42)

# Generate synthetic dissimilarity data
n = 30
X_true = np.random.randn(n, 3)
D_true = pairwise_distances(X_true, metric='euclidean')

# --- Classical MDS (sklearn) ---
mds_classical = MDS(n_components=2, dissimilarity='euclidean', random_state=42)
X_classical = mds_classical.fit_transform(D_true)
print("Classical MDS stress:", mds_classical.stress_)

# --- Metric MDS ---
mds_metric = MDS(
    n_components=2,
    metric=True,
    dissimilarity='euclidean',
    random_state=42,
    n_init=10,
    max_iter=300,
)
X_metric = mds_metric.fit_transform(D_true)
print("Metric MDS stress:", mds_metric.stress_)

# --- Non-metric MDS ---
mds_nonmetric = MDS(
    n_components=2,
    metric=False,
    dissimilarity='euclidean',
    random_state=42,
    n_init=10,
    max_iter=300,
)
X_nonmetric = mds_nonmetric.fit_transform(D_true)
print("Non-metric MDS stress:", mds_nonmetric.stress_)

# --- Shepard Diagram ---
def shepard_diagram(D_true, X_mds, title="Shepard Diagram"):
    d_mds = pairwise_distances(X_mds)
    triu_idx = np.triu_indices(n, k=1)

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(D_true[triu_idx], d_mds[triu_idx], alpha=0.3, s=10)

    # Monotonic line (isotonic regression)
    from scipy.stats import rankdata
    from sklearn.isotonic import IsotonicRegression
    ir = IsotonicRegression()
    d_flat = D_true[triu_idx].reshape(-1)
    m_flat = d_mds[triu_idx].reshape(-1)
    ir.fit(d_flat, m_flat)
    x_mono = np.linspace(d_flat.min(), d_flat.max(), 200)
    ax.plot(x_mono, ir.predict(x_mono), 'r-', linewidth=2, label='Monotonic fit')
    ax.plot([0, d_flat.max()], [0, d_flat.max()], 'k--', alpha=0.5, label='Perfect')
    ax.set_xlabel("Original dissimilarities")
    ax.set_ylabel("MDS distances")
    ax.set_title(title)
    ax.legend()
    plt.tight_layout()
    return fig

# --- Classical MDS via eigendecomposition ---
def classical_mds_eigen(D, m=2):
    n = D.shape[0]
    D2 = D ** 2
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J

    eigenvalues, eigenvectors = np.linalg.eigh(B)
    idx = np.argsort(eigenvalues)[::-1][:m]
    Lambda_m = np.diag(np.sqrt(np.maximum(eigenvalues[idx], 0)))
    X = eigenvectors[:, idx] @ Lambda_m
    return X, eigenvalues

X_manual, evals = classical_mds_eigen(D_true, m=2)
proportion = np.sum(evals[evals > 0][:2]) / np.sum(evals[evals > 0])
print(f"Classical MDS variance explained: {proportion:.4f}")

# --- Stress computation ---
def compute_stress(D, X, metric=True):
    n = D.shape[0]
    d_mds = pairwise_distances(X)
    if metric:
        return np.sqrt(np.sum((D - d_mds)**2) / np.sum(D**2))
    else:
        from scipy.stats import rankdata
        D_rank = rankdata(D, method='average')
        d_rank = rankdata(d_mds, method='average')
        return np.sqrt(np.sum((D_rank - d_rank)**2) / np.sum(D_rank**2))

stress_manual = compute_stress(D_true, X_manual)
print(f"Manual classical MDS stress: {stress_manual:.4f}")

# --- Individual Differences (INDSCAL) ---
def indscal(K_matrices, m=2):
    n = K_matrices[0].shape[0]
    K = len(K_matrices)
    B_avg = np.zeros((n, n))

    weights = []
    for Dk in K_matrices:
        Dk2 = Dk ** 2
        J = np.eye(n) - np.ones((n, n)) / n
        B_k = -0.5 * J @ Dk2 @ J
        B_avg += B_k

    B_avg /= K
    eigenvalues, eigenvectors = np.linalg.eigh(B_avg)
    idx = np.argsort(eigenvalues)[::-1][:m]
    X_group = eigenvectors[:, idx] @ np.diag(np.sqrt(np.maximum(eigenvalues[idx], 0)))

    for Dk in K_matrices:
        Dk2 = Dk ** 2
        J = np.eye(n) - np.ones((n, n)) / n
        B_k = -0.5 * J @ Dk2 @ J
        w_k = np.diag(X_group.T @ B_k @ X_group) / np.diag(X_group.T @ X_group)
        weights.append(w_k)

    return X_group, np.array(weights)

# Simulate 3 judges
K_matrices = [D_true + 0.3 * np.random.randn(n, n) for _ in range(3)]
K_matrices = [(k + k.T) / 2 for k in K_matrices]  # symmetrize
X_indscal, w_indscal = indscal(K_matrices, m=2)
print("INDSCAL weights:\n", np.round(w_indscal, 3))

Quality Assessment

Assessing MDS Solution Quality:

  1. Stress value: Stress-1 < 0.025 (excellent), < 0.05 (good), < 0.10 (fair), > 0.15 (poor). Use scree plot of stress vs. dimensionality.

  2. Shepard diagram: Examine the scatter of points around the monotonic curve. Tight clustering indicates good fit.

  3. Variance accounted for (VAF): In classical MDS, the proportion of positive eigenvalues captured by the mm-dimensional solution.

  4. R-squared: RSQ=1βˆ’Stress2\text{RSQ} = 1 - \text{Stress}^2, the proportion of variance in disparities explained by distances.

  5. Number of dimensions: Use scree plot (stress vs. mm), elbow criterion, or interpretability criterion.

  6. Local minima: Run MDS with multiple random starts (10–50) and verify convergence to the same stress level.

  7. Stability: Perturb the dissimilarities slightly and check if the configuration remains similar (Procrustes correlation > 0.90).

  8. Degenerate solutions: If all distances are equal or the configuration collapses to a single point, the MDS solution is degenerate β€” indicating severe misfit.

Applications

Psychology: Mapping perceived similarity between stimuli (e.g., colors, facial expressions, concepts). The spatial configuration reveals the psychological dimensions underlying perception.

Marketing: Product positioning β€” mapping consumer perceptions of brands in a space where distances reflect perceived dissimilarity. Competitive strategies are informed by the proximity structure.

Ecology: Ordination of species communities along environmental gradients. Non-metric MDS (NMDS) is the standard method for community ecology because it handles non-Euclidean ecological distances (Bray-Curtis, Jaccard).

Bioinformatics: Mapping protein structures, gene expression profiles, or drug response patterns where direct Euclidean distance is inappropriate.

⭐

Premium Content

Multidimensional Scaling (MDS)

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert Statistics Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement