🎉 75% of content is free forever — Unlock Premium from $10/mo →
CW
Search courses…
💼 Servicesℹ️ About✉️ ContactView Pricing Plansfrom $10

Functional Data Analysis (FDA)

Advanced Statistical MethodsFunctional Methods🟢 Free Lesson

Advertisement

Functional Data Analysis (FDA)

Advanced Statistical Methods

Treating Curves and Signals as Data Points

Functional data analysis treats entire curves, surfaces, or functions as individual observations. Basis expansions, functional PCA, and registration methods enable statistical analysis of inherently functional data.

  • Biomechanics — Analyze gait patterns by treating each walking cycle as a functional observation
  • Meteorology — Compare daily temperature profiles across weather stations as functional data
  • Finance — Treat intraday trading curves as functional observations for pattern recognition

FDA lets you do statistics on curves, not just numbers.


Functional data analysis treats observations as functions xi(t)x_i(t) defined on a continuous domain T\mathcal{T} (typically an interval), rather than as vectors of discrete measurements. The paradigm shift from multivariate to functional statistics arises naturally in longitudinal studies, spectrometry, kinematics, and any setting where the sampling frequency is dense relative to the underlying signal.


Functional Data Representation

DfFunctional Data

Functional data consists of nn curves {xi(t):tT}i=1n\{x_i(t): t \in \mathcal{T}\}_{i=1}^{n}, where each xix_i is a smooth function belonging to a Hilbert space H=L2(T)\mathcal{H} = L^2(\mathcal{T}) with inner product x,y=Tx(t)y(t)dt\langle x, y \rangle = \int_\mathcal{T} x(t)y(t)\,dt.

In practice, we observe discretized or noisy versions:

xi(tij)=fi(tij)+εij,j=1,,mix_i(t_{ij}) = f_i(t_{ij}) + \varepsilon_{ij}, \quad j = 1, \dots, m_i

where fif_i is the true curve and εijN(0,σ2)\varepsilon_{ij} \sim N(0, \sigma^2) are measurement errors. FDA begins by converting these discrete observations into smooth functional representations.


Basis Expansions

The most common approach represents each curve as a linear combination of known basis functions:

Basis Expansion

xi(t)=k=1Kcikϕk(t)=ϕ(t)cix_i(t) = \sum_{k=1}^{K} c_{ik}\,\phi_k(t) = \boldsymbol{\phi}(t)^\top \mathbf{c}_i

where {ϕk}\{\phi_k\} are basis functions and ci=(ci1,,ciK)\mathbf{c}_i = (c_{i1}, \dots, c_{iK})^\top are coefficient vectors.

B-Spline Basis

DfB-Spline Basis Functions

B-splines of order rr are piecewise polynomial functions of degree r1r - 1 defined recursively on a knot sequence {t1t2tK+r}\{t_1 \leq t_2 \leq \dots \leq t_{K+r}\}:

Bi,1(t)={1tit<ti+10otherwiseB_{i,1}(t) = \begin{cases} 1 & t_i \leq t < t_{i+1} \\ 0 & \text{otherwise} \end{cases}
Bi,r(t)=ttiti+r1tiBi,r1(t)+ti+rtti+rti+1Bi+1,r1(t)B_{i,r}(t) = \frac{t - t_i}{t_{i+r-1} - t_i}B_{i,r-1}(t) + \frac{t_{i+r} - t}{t_{i+r} - t_{i+1}}B_{i+1,r-1}(t)

B-splines offer local support: changing one coefficient affects only nearby curve segments. The roughness penalty on the second derivative enforces smoothness:

Roughness Penalty

Pen(c)=λT[x(t)]2dt=λcRc\text{Pen}(\mathbf{c}) = \lambda \int_\mathcal{T} \left[x''(t)\right]^2 dt = \lambda\,\mathbf{c}^\top \mathbf{R}\,\mathbf{c}

where Rjk=ϕj(t)ϕk(t)dtR_{jk} = \int \phi_j''(t)\phi_k''(t)\,dt. The smoothing parameter λ\lambda controls the bias–variance tradeoff.

Fourier Basis

DfFourier Basis

For periodic functions on [0,T][0, T]:

ϕ0(t)=1,ϕ2k1(t)=cos(2πktT),ϕ2k(t)=sin(2πktT)\phi_0(t) = 1, \quad \phi_{2k-1}(t) = \cos\left(\frac{2\pi k t}{T}\right), \quad \phi_{2k}(t) = \sin\left(\frac{2\pi k t}{T}\right)

The Fourier basis is optimal for capturing oscillatory behavior and provides natural frequency-domain interpretation. The roughness penalty is proportional to kk4(c2k12+c2k2)\sum_k k^4 (c_{2k-1}^2 + c_{2k}^2).

Basis Selection

  • B-splines: Non-periodic, irregularly sampled, local features (growth curves, gait data)
  • Fourier: Periodic signals ( circadian rhythms, cardiac cycles)
  • Wavelets: Non-stationary signals with abrupt changes (EEG, financial time series)
  • Wavelet packets: Adaptive selection of time-frequency resolution

The number of basis functions KK (or equivalently λ\lambda) is selected via GCV, AIC, or BIC.


Functional PCA

DfFunctional PCA

FPCA decomposes the variability of functional data into orthogonal modes:

xi(t)=xˉ(t)+k=1ξikψk(t)x_i(t) = \bar{x}(t) + \sum_{k=1}^{\infty} \xi_{ik}\,\psi_k(t)

where xˉ(t)=1nixi(t)\bar{x}(t) = \frac{1}{n}\sum_i x_i(t) is the mean function, {ψk}\{\psi_k\} are orthonormal eigenfunctions, and ξik=(xi(t)xˉ(t))ψk(t)dt\xi_{ik} = \int (x_i(t) - \bar{x}(t))\psi_k(t)\,dt are functional principal component scores.

The eigenfunctions satisfy the Fredholm integral equation:

Eigenvalue Problem

TC(s,t)ψk(t)dt=λkψk(s)\int_\mathcal{T} C(s, t)\,\psi_k(t)\,dt = \lambda_k\,\psi_k(s)

where C(s,t)=1n1i(xi(s)xˉ(s))(xi(t)xˉ(t))C(s,t) = \frac{1}{n-1}\sum_i (x_i(s) - \bar{x}(s))(x_i(t) - \bar{x}(t)) is the sample covariance function.

ThKarhunen-Loève Representation

For a second-order process X(t)X(t) with covariance C(s,t)=k=1λkψk(s)ψk(t)C(s,t) = \sum_{k=1}^{\infty} \lambda_k \psi_k(s)\psi_k(t):

X(t)=μ(t)+k=1λkξkψk(t)X(t) = \mu(t) + \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,\xi_k\,\psi_k(t)

where ξk\xi_k are uncorrelated with E[ξk]=0\mathbb{E}[\xi_k] = 0 and Var[ξk]=1\text{Var}[\xi_k] = 1. The truncation to KK terms captures proportion k=1Kλk/k=1λk\sum_{k=1}^K \lambda_k / \sum_{k=1}^\infty \lambda_k of total variance.


Functional Linear Models

Scalar-on-Function Regression

Scalar-on-Function Regression

Yi=α+Txi(t)β(t)dt+εi,εiN(0,σ2)Y_i = \alpha + \int_\mathcal{T} x_i(t)\,\beta(t)\,dt + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)

The coefficient function β(t)\beta(t) is estimated by penalized splines:

β^=(XX+λR)1XY\hat{\boldsymbol{\beta}} = (\mathbf{X}^\top\mathbf{X} + \lambda\mathbf{R})^{-1}\mathbf{X}^\top\mathbf{Y}

where Xik=xi(t)ϕk(t)dt\mathbf{X}_{ik} = \int x_i(t)\phi_k(t)\,dt is the design matrix.

Function-on-Function Regression

Function-on-Function Regression

Yi(t)=α(t)+Txi(s)β(s,t)ds+εi(t)Y_i(t) = \alpha(t) + \int_\mathcal{T} x_i(s)\,\beta(s, t)\,ds + \varepsilon_i(t)

where β(s,t)\beta(s,t) is a bivariate coefficient surface. Truncated basis expansion converts this to a multivariate regression problem.


Depth for Functional Data

DfFunctional Depth

The modified band depth (MBD) of a curve xix_i among {x1,,xn}\{x_1, \dots, x_n\} measures the proportion of intervals [xj(t),xk(t)][x_j(t), x_k(t)] that contain xi(t)x_i(t):

MBDi=(n2)1j<k1m=1m1[xi(t)[min(xj(t),xk(t)),max(xj(t),xk(t))]]\text{MBD}_i = \binom{n}{2}^{-1}\sum_{j < k} \frac{1}{m}\sum_{\ell=1}^{m} \mathbf{1}\left[x_i(t_\ell) \in [\min(x_j(t_\ell), x_k(t_\ell)), \max(x_j(t_\ell), x_k(t_\ell))]\right]

Functional depth generalizes the concept of median and quantiles to function spaces, enabling robust descriptive statistics, boxplots for functional data, and outlier detection.


Registration / Warping

DfCurve Registration

Registration (or alignment) removes phase variation by mapping each observed curve xi(t)x_i(t) to a standardized time domain via a warping function hih_i:

xi(t)=xi(hi(t))x_i^*(t) = x_i(h_i(t))

where hi:TTh_i: \mathcal{T} \to \mathcal{T} is a strictly increasing diffeomorphism.

The registration problem decomposes the observed variation into amplitude variation (vertical differences after alignment) and phase variation (horizontal shifts in features):

Registration Methods

  • Landmark registration: Aligns pre-specified features (peaks, valleys) using monotone interpolation.
  • Continuous registration (PROCRUSTES): Minimizes [xi(t)xˉ(hi(t))]2dt\int [x_i(t) - \bar{x}(h_i(t))]^2 dt jointly over hih_i and the target.
  • Elastic registration: Uses the square-root velocity function (SRVF) framework, where qi(t)=x˙i(t)/xi(t)q_i(t) = \dot{x}_i(t)/\sqrt{|x_i(t)|}, and alignment is performed in the Hilbert space of SRVFs via geodesic matching.

Python Implementation

import numpy as np
import pandas as pd
from scipy.interpolate import BSpline
from scipy.optimize import minimize_scalar
import matplotlib.pyplot as plt

np.random.seed(42)

# --- Generate functional data ---
t = np.linspace(0, 1, 200)
n = 30

# True signal + phase variation
def true_signal(t):
    return np.sin(2 * np.pi * t) + 0.5 * np.cos(4 * np.pi * t)

# Generate curves with random warping and amplitude perturbation
x = np.zeros((n, len(t)))
for i in range(n):
    phase_shift = np.random.normal(0, 0.05)
    amp = 1 + np.random.normal(0, 0.1)
    freq = 1 + np.random.normal(0, 0.05)
    x[i, :] = amp * np.sin(2 * np.pi * freq * (t + phase_shift)) + \
              0.5 * np.cos(4 * np.pi * freq * (t + phase_shift)) + \
              np.random.normal(0, 0.05, len(t))

# --- B-spline basis expansion ---
n_basis = 25
knots = np.linspace(0, 1, n_basis - 4 + 2)
knots = np.concatenate([[0, 0, 0], knots[1:-1], [1, 1, 1]])

from scipy.interpolate import BSpline
B = np.zeros((len(t), n_basis))
for k in range(n_basis):
    coeffs = np.zeros(n_basis)
    coeffs[k] = 1.0
    spl = BSpline(knots, coeffs, 3)
    B[:, k] = spl(t)

# Smooth each curve
lambdas = np.logspace(-4, 0, 20)
x_smooth = np.zeros_like(x)
for i in range(n):
    best_gcv = np.inf
    best_lam = lambdas[0]
    for lam in lambdas:
        R = np.zeros((n_basis, n_basis))
        for k in range(n_basis - 2):
            R[k:k+3, k:k+3] += lam * np.array([[1, -2, 1], [-2, 4, -2], [1, -2, 1]])
        c = np.linalg.solve(B.T @ B + R, B.T @ x[i, :])
        residual = x[i, :] - B @ c
        gcv = np.mean(residual**2) / (1 - n_basis / len(t))**2
        if gcv < best_gcv:
            best_gcv = gcv
            best_lam = lam
    R = np.zeros((n_basis, n_basis))
    for k in range(n_basis - 2):
        R[k:k+3, k:k+3] += best_lam * np.array([[1, -2, 1], [-2, 4, -2], [1, -2, 1]])
    c = np.linalg.solve(B.T @ B + R, B.T @ x[i, :])
    x_smooth[i, :] = B @ c

print(f"Smoothed {n} curves with {n_basis} B-splines")

# --- Functional PCA ---
mean_func = x_smooth.mean(axis=0)
centered = x_smooth - mean_func

# Covariance matrix in basis coordinates
c_basis = np.linalg.lstsq(B, centered.T, rcond=None)[0].T
cov_matrix = np.cov(c_basis.T)
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Reverse for descending order
idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]

# Eigenfunctions
psi = np.zeros((len(t), min(5, n_basis)))
for k in range(min(5, n_basis)):
    psi[:, k] = B @ eigenvectors[:, k]
    psi[:, k] /= np.sqrt(np.trapz(psi[:, k] ** 2, t))

var_explained = eigenvalues / eigenvalues.sum()
print(f"\nFPCA variance explained (first 5 PCs): {var_explained[:5].round(4)}")
print(f"Cumulative: {np.cumsum(var_explained)[:5].round(4)}")

# --- Functional Depth (simplified MBD) ---
def modified_band_depth(X, t_grid):
    n, m = X.shape
    mbd = np.zeros(n)
    for i in range(n):
        count = 0
        total = 0
        for j in range(n):
            if j == i:
                continue
            for k in range(j + 1, n):
                if k == i:
                    continue
                lower = np.minimum(X[j], X[k])
                upper = np.maximum(X[j], X[k])
                count += np.sum((X[i] >= lower) & (X[i] <= upper))
                total += m
                # Approximate proportion for speed
        mbd[i] = count / total if total > 0 else 0
    return mbd

# Use subset for speed
mbd = modified_band_depth(x_smooth[:10, ::5], t[::5])
print(f"\nFunctional depths (first 10): {mbd.round(4)}")

# --- Visualization ---
fig, axes = plt.subplots(2, 2, figsize=(12, 8))

# Raw curves
for i in range(min(10, n)):
    axes[0, 0].plot(t, x[i, :], alpha=0.3, lw=0.5)
axes[0, 0].set_title("Raw Curves")
axes[0, 0].set_xlabel("t")
axes[0, 0].set_ylabel("x(t)")

# Smoothed curves
for i in range(min(10, n)):
    axes[0, 1].plot(t, x_smooth[i, :], alpha=0.5, lw=0.8)
axes[0, 1].set_title("Smoothed (B-splines)")
axes[0, 1].set_xlabel("t")

# Eigenfunctions
colors = ['red', 'blue', 'green', 'orange', 'purple']
for k in range(min(3, psi.shape[1])):
    axes[1, 0].plot(t, psi[:, k], color=colors[k], lw=1.5,
                    label=f"PC{k+1} ({var_explained[k]*100:.1f}%)")
axes[1, 0].legend(fontsize=9)
axes[1, 0].set_title("FPCA Eigenfunctions")
axes[1, 0].axhline(y=0, color='gray', ls='--', lw=0.5)

# Scree plot
axes[1, 1].bar(range(1, 8), var_explained[:7] * 100, color='steelblue')
axes[1, 1].set_xlabel("Component")
axes[1, 1].set_ylabel("% Variance")
axes[1, 1].set_title("Scree Plot")

plt.tight_layout()
plt.savefig("fda_analysis.png", dpi=150, bbox_inches="tight")
plt.show()

Summary

Key Takeaways: FDA

  1. FDA treats observations as smooth functions in L2(T)L^2(\mathcal{T}); basis expansions (B-splines, Fourier) convert discretized data into functional representations with roughness penalties controlling smoothness.
  2. FPCA decomposes functional variability via the Fredholm integral eigenvalue problem; the Karhunen-Loève representation provides an optimal low-dimensional approximation.
  3. Functional linear models (scalar-on-function, function-on-function) extend classical regression; penalized spline estimation handles the ill-posed inverse problem.
  4. Functional depth (MBD) generalizes median and quantiles to function spaces, enabling robust inference and outlier detection.
  5. Registration removes phase variation (horizontal shifts) from amplitude variation (vertical differences), which is critical for meaningful functional comparisons.

Premium Content

Functional Data Analysis (FDA)

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
💼Interview Prep
📜Certificates
🤝Community Access

Already a member? Log in

Need Expert Statistics Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement