πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Multivariate Analysis of Variance (MANOVA)

Advanced Statistical MethodsMultivariate Methods🟒 Free Lesson

Advertisement

Multivariate Analysis of Variance (MANOVA)

Advanced Statistical Methods

Comparing Groups Across Multiple Outcomes Simultaneously

MANOVA extends ANOVA to multiple dependent variables, testing whether group means differ across a vector of outcomes while accounting for correlations among them. Wilks' Lambda and Pillai's Trace are key test statistics.

  • Psychology β€” Compare treatment groups on multiple behavioral outcomes simultaneously
  • Education β€” Assess whether teaching methods differ across several performance measures
  • Clinical research β€” Evaluate treatment effects on correlated biomarkers and symptoms

MANOVA tests the right question: do groups differ when all outcomes are considered together?


Multivariate Analysis of Variance (MANOVA) extends the univariate ANOVA framework to situations where multiple dependent variables are measured simultaneously. MANOVA tests whether group means differ across multiple correlated outcome variables, controlling the overall Type I error rate while exploiting the correlation structure among dependent variables. The method provides greater statistical power than conducting separate ANOVAs when dependent variables are correlated and the hypothesis involves simultaneous group differences.

Mathematical Foundation

DfMANOVA Model

The MANOVA model for gg groups and pp dependent variables is:

Yij=ΞΌ+Ο„j+Ξ΅ij\mathbf{Y}_{ij} = \boldsymbol{\mu} + \boldsymbol{\tau}_j + \boldsymbol{\varepsilon}_{ij}

where:

  • Yij\mathbf{Y}_{ij} is the pΓ—1p \times 1 vector of observations for subject ii in group jj
  • ΞΌ\boldsymbol{\mu} is the pΓ—1p \times 1 vector of overall means
  • Ο„j\boldsymbol{\tau}_j is the pΓ—1p \times 1 vector of treatment effects for group jj
  • Ξ΅ij∼Np(0,Ξ£)\boldsymbol{\varepsilon}_{ij} \sim N_p(\mathbf{0}, \boldsymbol{\Sigma}) are independent error vectors

Assumptions:

  1. Multivariate normality: Yij∼Np(ΞΌ+Ο„j,Ξ£)\mathbf{Y}_{ij} \sim N_p(\boldsymbol{\mu} + \boldsymbol{\tau}_j, \boldsymbol{\Sigma})
  2. Homogeneity of covariance matrices: Ξ£1=Ξ£2=β‹―=Ξ£g=Ξ£\boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2 = \cdots = \boldsymbol{\Sigma}_g = \boldsymbol{\Sigma}
  3. Independence of observations
  4. No perfect multicollinearity among dependent variables

Hypothesis in MANOVA

The null and alternative hypotheses in matrix form:

H0:Ο„1=Ο„2=β‹―=Ο„g=0H_0: \boldsymbol{\tau}_1 = \boldsymbol{\tau}_2 = \cdots = \boldsymbol{\tau}_g = \mathbf{0}
H1:At least one τj≠0H_1: \text{At least one } \boldsymbol{\tau}_j \neq \mathbf{0}

The test statistic is based on two matrices:

  • H (Hypothesis/Between-groups matrix): H=βˆ‘j=1gnj(YΛ‰jβˆ’YΛ‰)(YΛ‰jβˆ’YΛ‰)T\mathbf{H} = \sum_{j=1}^{g} n_j (\bar{\mathbf{Y}}_j - \bar{\mathbf{Y}})(\bar{\mathbf{Y}}_j - \bar{\mathbf{Y}})^T
  • E (Error/Within-groups matrix): E=βˆ‘j=1gβˆ‘i=1nj(Yijβˆ’YΛ‰j)(Yijβˆ’YΛ‰j)T\mathbf{E} = \sum_{j=1}^{g} \sum_{i=1}^{n_j} (\mathbf{Y}_{ij} - \bar{\mathbf{Y}}_j)(\mathbf{Y}_{ij} - \bar{\mathbf{Y}}_j)^T

Test Statistics

MANOVA test statistics are functions of the eigenvalues of Eβˆ’1H\mathbf{E}^{-1}\mathbf{H} or related matrices. Let s=min⁑(p,gβˆ’1)s = \min(p, g-1) and Ξ»1β‰₯Ξ»2β‰₯β‹―β‰₯Ξ»s\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_s be the eigenvalues of Eβˆ’1H\mathbf{E}^{-1}\mathbf{H}.

DfWilks' Lambda ($\Lambda$)

Wilks' Lambda is the most commonly used MANOVA statistic:

Ξ›=∣E∣∣H+E∣=∏i=1s11+Ξ»i\Lambda = \frac{|\mathbf{E}|}{|\mathbf{H} + \mathbf{E}|} = \prod_{i=1}^{s} \frac{1}{1 + \lambda_i}

where βˆ£β‹…βˆ£|\cdot| denotes the determinant. Ξ›βˆˆ[0,1]\Lambda \in [0, 1] with smaller values indicating greater group separation (stronger evidence against H0H_0).

Exact and Approximate Distributions of Wilks' Lambda

Exact distributions:

  • p=1p = 1: Ξ›=F\Lambda = F (standard F-test for one-way ANOVA)
  • p=2p = 2: Ξ›\Lambda transforms to an F-distribution with 2(gβˆ’1)2(g-1) and 2(nβˆ’g)2(n-g) df
  • s=1s = 1 (i.e., g=2g = 2): Ξ›\Lambda transforms to an F-distribution

Rao's approximation (general case):

Fapprox=1βˆ’Ξ›1/tΞ›1/tβ‹…df2df1F_{\text{approx}} = \frac{1 - \sqrt{\Lambda^{1/t}}}{\sqrt{\Lambda^{1/t}}} \cdot \frac{df_2}{df_1}

where t=p2(gβˆ’1)2βˆ’4p2+(gβˆ’1)2βˆ’5t = \sqrt{\frac{p^2(g-1)^2 - 4}{p^2 + (g-1)^2 - 5}}, df1=p(gβˆ’1)df_1 = p(g-1), and df2=4+(p(gβˆ’1)+2)tβˆ’p2(gβˆ’1)22tdf_2 = 4 + (p(g-1) + 2)t - \frac{p^2(g-1)^2}{2t}.

DfPillai's Trace ($V$)

Pillai's Trace is:

V=tr[H(H+E)βˆ’1]=βˆ‘i=1sΞ»i1+Ξ»iV = \text{tr}[\mathbf{H}(\mathbf{H} + \mathbf{E})^{-1}] = \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i}

Pillai's Trace is more robust than Wilks' Lambda when assumptions are violated, particularly with unequal group sizes or non-normality. V∈[0,p]V \in [0, p] with larger values indicating greater group differences.

DfHotelling-Lawley Trace ($T_0$)

The Hotelling-Lawley Trace is:

T0=tr(Eβˆ’1H)=βˆ‘i=1sΞ»iT_0 = \text{tr}(\mathbf{E}^{-1}\mathbf{H}) = \sum_{i=1}^{s} \lambda_i

This statistic equals the sum of eigenvalues of Eβˆ’1H\mathbf{E}^{-1}\mathbf{H}. It is most powerful when group differences are large and covariance matrices are equal. T0T_0 has no upper bound.

DfRoy's Largest Root ($\theta$)

Roy's Largest Root uses only the largest eigenvalue:

ΞΈ=Ξ»11+Ξ»1\theta = \frac{\lambda_1}{1 + \lambda_1}

This is the most powerful statistic when group differences occur primarily along a single discriminant dimension. However, it is the least robust to assumption violations and should be used cautiously.

Computing MANOVA Test Statistics

Problem: A study compares three treatments with two dependent variables (Y1Y_1: anxiety, Y2Y_2: depression). Given:

H=[24.58.28.215.3],E=[42.15.75.738.9]\mathbf{H} = \begin{bmatrix} 24.5 & 8.2 \\ 8.2 & 15.3 \end{bmatrix}, \quad \mathbf{E} = \begin{bmatrix} 42.1 & 5.7 \\ 5.7 & 38.9 \end{bmatrix}

Solution:

Step 1: Compute H+E\mathbf{H} + \mathbf{E}:

H+E=[66.613.913.954.2]\mathbf{H} + \mathbf{E} = \begin{bmatrix} 66.6 & 13.9 \\ 13.9 & 54.2 \end{bmatrix}

Step 2: Compute determinants:

∣E∣=(42.1)(38.9)βˆ’(5.7)2=1637.69βˆ’32.49=1605.20|\mathbf{E}| = (42.1)(38.9) - (5.7)^2 = 1637.69 - 32.49 = 1605.20
∣H+E∣=(66.6)(54.2)βˆ’(13.9)2=3609.72βˆ’193.21=3416.51|\mathbf{H} + \mathbf{E}| = (66.6)(54.2) - (13.9)^2 = 3609.72 - 193.21 = 3416.51

Step 3: Wilks' Lambda:

Ξ›=∣E∣∣H+E∣=1605.203416.51=0.470\Lambda = \frac{|\mathbf{E}|}{|\mathbf{H} + \mathbf{E}|} = \frac{1605.20}{3416.51} = 0.470

Step 4: Compute eigenvalues of Eβˆ’1H\mathbf{E}^{-1}\mathbf{H}:

Eβˆ’1=11605.20[38.9βˆ’5.7βˆ’5.742.1]=[0.02424βˆ’0.00355βˆ’0.003550.02623]\mathbf{E}^{-1} = \frac{1}{1605.20}\begin{bmatrix} 38.9 & -5.7 \\ -5.7 & 42.1 \end{bmatrix} = \begin{bmatrix} 0.02424 & -0.00355 \\ -0.00355 & 0.02623 \end{bmatrix}
Eβˆ’1H=[0.02424βˆ’0.00355βˆ’0.003550.02623][24.58.28.215.3]\mathbf{E}^{-1}\mathbf{H} = \begin{bmatrix} 0.02424 & -0.00355 \\ -0.00355 & 0.02623 \end{bmatrix}\begin{bmatrix} 24.5 & 8.2 \\ 8.2 & 15.3 \end{bmatrix}

Computing and solving the characteristic equation yields eigenvalues: Ξ»1=0.682\lambda_1 = 0.682, Ξ»2=0.198\lambda_2 = 0.198

Step 5: Compute all statistics:

Ξ›=1(1+0.682)(1+0.198)=12.015=0.496\Lambda = \frac{1}{(1+0.682)(1+0.198)} = \frac{1}{2.015} = 0.496

(rounded)

V=0.6821.682+0.1981.198=0.405+0.165=0.570V = \frac{0.682}{1.682} + \frac{0.198}{1.198} = 0.405 + 0.165 = 0.570
T0=0.682+0.198=0.880T_0 = 0.682 + 0.198 = 0.880
ΞΈ=0.6821.682=0.405\theta = \frac{0.682}{1.682} = 0.405

Assumptions and Diagnostics

ThMANOVA Assumption Testing

1. Multivariate Normality: Test using Mardia's multivariate skewness and kurtosis:

Skewness=1nβˆ‘i=1n[(yiβˆ’yΛ‰)TSβˆ’1(yiβˆ’yΛ‰)]2\text{Skewness} = \frac{1}{n} \sum_{i=1}^{n} [(\mathbf{y}_i - \bar{\mathbf{y}})^T \mathbf{S}^{-1} (\mathbf{y}_i - \bar{\mathbf{y}})]^2
Kurtosis=1nβˆ‘i=1n[(yiβˆ’yΛ‰)TSβˆ’1(yiβˆ’yΛ‰)]3\text{Kurtosis} = \frac{1}{n} \sum_{i=1}^{n} [(\mathbf{y}_i - \bar{\mathbf{y}})^T \mathbf{S}^{-1} (\mathbf{y}_i - \bar{\mathbf{y}})]^3

Under H0H_0: SkewnessβˆΌΟ‡p2\text{Skewness} \sim \chi^2_p and Kurtosis∼N(6p,24p)\text{Kurtosis} \sim N(6p, 24p) approximately.

2. Homogeneity of Covariance Matrices: Test using Box's M test:

M=(nβˆ’g)ln⁑∣Spβˆ£βˆ’βˆ‘j=1g(njβˆ’1)ln⁑∣Sj∣M = (n - g) \ln|\mathbf{S}_p| - \sum_{j=1}^{g} (n_j - 1) \ln|\mathbf{S}_j|

where Sp\mathbf{S}_p is the pooled covariance matrix. Under H0H_0:

M[1βˆ’βˆ‘j=1g(njβˆ’1)βˆ’1βˆ’(nβˆ’g)βˆ’12(p+1)]βˆΌΟ‡p(p+1)(gβˆ’1)/22M\left[1 - \frac{\sum_{j=1}^{g} (n_j - 1)^{-1} - (n-g)^{-1}}{2(p+1)}\right] \sim \chi^2_{p(p+1)(g-1)/2}

Box's M is very sensitive to non-normality; use with caution.

3. Linearity: Assess through scatter plots of dependent variable pairs within groups.

Post-Hoc Tests and Discriminant Analysis

DfDiscriminant Function Analysis

The discriminant functions are the eigenvectors of Eβˆ’1H\mathbf{E}^{-1}\mathbf{H}. The ss discriminant functions are:

Di=aiTY=ai1Y1+ai2Y2+β‹―+aipYpD_i = \mathbf{a}_i^T \mathbf{Y} = a_{i1}Y_1 + a_{i2}Y_2 + \cdots + a_{ip}Y_p

where ai\mathbf{a}_i is the eigenvector corresponding to eigenvalue Ξ»i\lambda_i. The functions maximize group separation in the direction of greatest between-group relative to within-group variance.

Structure Coefficients

The structure matrix S\mathbf{S} contains correlations between original variables and discriminant functions:

sjk=r(Yj,Dk)=βˆ‘i(Yijβˆ’YΛ‰j)(Dikβˆ’DΛ‰k)(nβˆ’1)sYjsDks_{jk} = r(Y_j, D_k) = \frac{\sum_i (Y_{ij} - \bar{Y}_j)(D_{ik} - \bar{D}_k)}{(n-1) s_{Y_j} s_{D_k}}

Structure coefficients greater than ∣0.30∣|0.30| or ∣0.40∣|0.40| (convention) indicate variables that meaningfully contribute to group discrimination.

Python Implementation

import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Generate MANOVA data
np.random.seed(42)

# Parameters
n_per_group = 30
n_groups = 3
n_vars = 2

# Group means
means = np.array([
    [10, 8],    # Group 1
    [12, 10],   # Group 2
    [14, 12]    # Group 3
])

# Covariance matrix
Sigma = np.array([
    [4.0, 2.5],
    [2.5, 3.5]
])

# Generate data
data_list = []
for g in range(n_groups):
    Y = np.random.multivariate_normal(means[g], Sigma, n_per_group)
    group_df = pd.DataFrame(Y, columns=['Anxiety', 'Depression'])
    group_df['Group'] = f'Treatment_{g+1}'
    data_list.append(group_df)

data = pd.concat(data_list, ignore_index=True)

# Compute H and E matrices
grand_mean = data[['Anxiety', 'Depression']].mean().values

H = np.zeros((n_vars, n_vars))
E = np.zeros((n_vars, n_vars))

for g in range(n_groups):
    group_data = data[data['Group'] == f'Treatment_{g+1}'][['Anxiety', 'Depression']].values
    n_g = len(group_data)
    group_mean = group_data.mean(axis=0)
    
    # Between-groups matrix
    H += n_g * np.outer(group_mean - grand_mean, group_mean - grand_mean)
    
    # Within-groups matrix
    for i in range(n_g):
        E += np.outer(group_data[i] - group_mean, group_data[i] - group_mean)

print("Hypothesis (Between-groups) matrix H:")
print(np.round(H, 3))
print("\nError (Within-groups) matrix E:")
print(np.round(E, 3))

# Compute eigenvalues of E^{-1}H
E_inv = np.linalg.inv(E)
E_inv_H = E_inv @ H
eigenvalues = np.linalg.eigvals(E_inv_H)
eigenvalues = np.sort(eigenvalues)[::-1]

print(f"\nEigenvalues of E^{{-1}}H: {np.round(eigenvalues, 4)}")

# MANOVA test statistics
s = min(n_vars, n_groups - 1)

# Wilks' Lambda
Lambda = np.prod(1 / (1 + eigenvalues[:s]))
print(f"\nWilks' Lambda: {Lambda:.4f}")

# Pillai's Trace
Pillai = np.sum(eigenvalues[:s] / (1 + eigenvalues[:s]))
print(f"Pillai's Trace: {Pillai:.4f}")

# Hotelling-Lawley Trace
HL_trace = np.sum(eigenvalues[:s])
print(f"Hotelling-Lawley Trace: {HL_trace:.4f}")

# Roy's Largest Root
Roy = eigenvalues[0] / (1 + eigenvalues[0])
print(f"Roy's Largest Root: {Roy:.4f}")

# Rao's approximation for Wilks' Lambda
p = n_vars
g = n_groups
n = len(data)

t = np.sqrt((p**2 * (g-1)**2 - 4) / (p**2 + (g-1)**2 - 5))
df1 = p * (g - 1)
df2 = 4 + (p * (g-1) + 2) * t - (p**2 * (g-1)**2) / (2*t)

F_approx = ((1 - Lambda**(1/t)) / Lambda**(1/t)) * (df2 / df1)
p_value = 1 - stats.f.cdf(F_approx, df1, df2)

print(f"\nRao's F approximation: F({df1:.1f}, {df2:.1f}) = {F_approx:.3f}, p = {p_value:.4f}")

# Box's M test for homogeneity of covariance matrices
def box_m_test(data, group_col, dep_vars):
    """Perform Box's M test for equality of covariance matrices."""
    groups = data[group_col].unique()
    g = len(groups)
    p = len(dep_vars)
    n = len(data)
    
    # Compute pooled and group covariance matrices
    S_pooled = np.zeros((p, p))
    df_total = 0
    
    log_dets = []
    df_groups = []
    
    for group in groups:
        group_data = data[data[group_col] == group][dep_vars].values
        n_g = len(group_data)
        S_g = np.cov(group_data, rowvar=False) * (n_g - 1)
        
        S_pooled += S_g
        df_total += n_g - 1
        
        log_dets.append(np.log(np.linalg.det(S_g / (n_g - 1))))
        df_groups.append(n_g - 1)
    
    S_pooled /= df_total
    log_det_pooled = np.log(np.linalg.det(S_pooled))
    
    # Box's M statistic
    M = (df_total) * log_det_pooled - sum([(df_groups[i]) * log_dets[i] for i in range(g)])
    
    # Correction factor
    sum_inv_df = sum([1/df_groups[i] for i in range(g)])
    C = (2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (g - 1)) * (sum_inv_df - 1/df_total)
    
    # Chi-square approximation
    chi2 = M * (1 - C)
    df_chi2 = p * (p + 1) * (g - 1) / 2
    p_value = 1 - stats.chi2.cdf(chi2, df_chi2)
    
    return M, chi2, df_chi2, p_value

M_stat, chi2_stat, df_m, p_m = box_m_test(data, 'Group', ['Anxiety', 'Depression'])
print(f"\nBox's M test: M = {M_stat:.3f}, Chi2({df_m}) = {chi2_stat:.3f}, p = {p_m:.4f}")

# Discriminant Analysis
lda = LinearDiscriminantAnalysis()
X = data[['Anxiety', 'Depression']].values
y = data['Group'].values
lda.fit(X, y)

# Structure coefficients (correlations with discriminant functions)
disc_scores = lda.transform(X)
structure_corr = np.corrcoef(X.T, disc_scores.T)[:p, p:]
print(f"\nStructure coefficients:\n{np.round(structure_corr, 3)}")

# Visualize discriminant space
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Scatter plot in original space
colors = ['blue', 'red', 'green']
for g, group in enumerate(data['Group'].unique()):
    mask = data['Group'] == group
    axes[0].scatter(data[mask]['Anxiety'], data[mask]['Depression'], 
                   c=colors[g], alpha=0.6, label=group, s=50)

axes[0].set_xlabel('Anxiety')
axes[0].set_ylabel('Depression')
axes[0].set_title('Original Variable Space')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Discriminant scores
for g, group in enumerate(data['Group'].unique()):
    mask = data['Group'] == group
    axes[1].hist(disc_scores[mask, 0], bins=10, alpha=0.5, 
                color=colors[g], label=group, density=True)

axes[1].set_xlabel('Discriminant Score (Function 1)')
axes[1].set_ylabel('Density')
axes[1].set_title('Distribution of Discriminant Scores')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('manova_analysis.png', dpi=150)
plt.show()

# Effect sizes
# Partial eta-squared from Wilks' Lambda
eta2_partial = 1 - Lambda**(1/s)
print(f"\nPartial eta-squared (from Wilks' Lambda): {eta2_partial:.4f}")

# Compare with separate ANOVAs (for illustration)
print("\nSeparate one-way ANOVAs (for comparison):")
for var in ['Anxiety', 'Depression']:
    groups_data = [data[data['Group'] == g][var].values 
                   for g in data['Group'].unique()]
    f_stat, p_val = stats.f_oneway(*groups_data)
    print(f"  {var}: F = {f_stat:.3f}, p = {p_val:.4f}")

Summary: Multivariate Analysis of Variance (MANOVA)

  1. MANOVA Advantage: Tests multiple dependent variables simultaneously, controlling overall Ξ±\alpha and leveraging inter-variable correlations for increased power
  2. Test Statistics: Wilks' Lambda (Ξ›\Lambda), Pillai's Trace (VV), Hotelling-Lawley Trace (T0T_0), Roy's Largest Root (ΞΈ\theta) β€” all functions of eigenvalues of Eβˆ’1H\mathbf{E}^{-1}\mathbf{H}
  3. Wilks' Lambda: Ξ›=∏(1+Ξ»i)βˆ’1\Lambda = \prod(1+\lambda_i)^{-1}; smaller values indicate greater group differences; exact F-test for p≀2p \leq 2 or s=1s = 1
  4. Pillai's Trace: Most robust to assumption violations; V=βˆ‘Ξ»i/(1+Ξ»i)V = \sum \lambda_i/(1+\lambda_i); recommended when assumptions are questionable
  5. Key Matrices: H\mathbf{H} (hypothesis) captures between-group variation; E\mathbf{E} (error) captures within-group variation; test statistics are ratios of these matrices
⭐

Premium Content

Multivariate Analysis of Variance (MANOVA)

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert Statistics Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement