Granger Causality — Time Series Causality Testing
Testing Whether One Time Series Predicts Another
Granger causality tests whether past values of one series improve predictions of another. It's a statistical notion of predictive causality that reveals lead-lag relationships in temporal data.
- Economics: Test whether money supply growth Granger-causes inflation
- Finance: Detect lead-lag relationships between stock markets across time zones
- Neuroscience: Identify information flow directions between brain regions
If knowing X's past helps predict Y's future, X Granger-causes Y — a powerful test of temporal influence.
Because the test rests only on whether past values of one series help predict future values of another, it is a statistical notion of causality, not true causal inference.
Definition: Granger Causality
A time series $X$ Granger-causes $Y$ if past values of $X$ contain information that improves the prediction of $Y$ beyond what past values of $Y$ alone provide.
Formal Definition
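Consider forecasting $Y_t$ using its own past and the past of $X$:
Restricted Model
$$Y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i Y_{t-i} + \varepsilon_{1t}$$
Here,
- $\alpha_0$ = Intercept
- $\alpha_i$ = Coefficient on lagged $Y$
- $p$ = Number of lags
- $\varepsilon_{1t}$ = Error term with variance $\sigma_1^2$
Unrestricted Model
$$Y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i Y_{t-i} + \sum_{i=1}^{p} \beta_i X_{t-i} + \varepsilon_{2t}$$
Here,
- $\beta_i$ = Coefficient on lagged $X$
- $\varepsilon_{2t}$ = Error term with variance $\sigma_2^2$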
Testing Granger Causality
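If $X$ Granger-causes $Y$, then the coefficient $\beta_i$ on lagged $X$ is nonzero for at least one lag $i$, and the unrestricted model has the smaller error variance: $\sigma_2^2 < \sigma_1^2$.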
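To make the comparison of the two regressions concrete, here is a minimal NumPy sketch rather than a reference implementation. It assumes simulated data in which lagged $X$ really does enter the equation for $Y$, a lag length of 2, and an intercept in both models; the helper names `lagged_design` and `rss` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): lagged x feeds into y
n, p = 300, 2
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.3 * x[t - 1] + 0.4 * y[t - 1] + rng.standard_normal()


def lagged_design(series, p):
    """Columns [s[t-1], ..., s[t-p]] for t = p, ..., len(series) - 1."""
    n_obs = len(series)
    return np.column_stack([series[p - i : n_obs - i] for i in range(1, p + 1)])


def rss(design, target):
    """Residual sum of squares from an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)


target = y[p:]                       # Y_t for t = p, ..., n - 1
const = np.ones((n - p, 1))
Z_r = np.hstack([const, lagged_design(y, p)])                       # restricted
Z_u = np.hstack([const, lagged_design(y, p), lagged_design(x, p)])  # unrestricted

rss_r = rss(Z_r, target)
rss_u = rss(Z_u, target)
sigma2_r = rss_r / len(target)
sigma2_u = rss_u / len(target)
print(f"restricted residual variance:   {sigma2_r:.4f}")
print(f"unrestricted residual variance: {sigma2_u:.4f}")
```

Because the unrestricted model nests the restricted one, its in-sample residual variance can never be larger. The formal test asks whether the reduction is bigger than chance alone would produce.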
Hypothesis Test
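Granger Causality F-Test
$$F = \frac{(RSS_R - RSS_U)/p}{RSS_U/(n - 2p - 1)}$$
Here,
- $RSS_R$ = Residual sum of squares from restricted model
- $RSS_U$ = Residual sum of squares from unrestricted model
- $p$ = Number of lags tested
- $n$ = Sample size
- $2p + 1$ = Number of parameters in the unrestricted model (intercept plus $p$ lags each of $Y$ and $X$)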
| Hypothesis | Conclusion |
|-----------|-----------|
| $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_p = 0$ | $X$ does NOT Granger-cause $Y$ |
| $H_1$: At least one $\beta_i \neq 0$ | $X$ Granger-causes $Y$ |
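The arithmetic of the F-test can be sketched as follows. This assumes the common form $F = \frac{(RSS_R - RSS_U)/p}{RSS_U/(n - 2p - 1)}$ with an intercept in both regressions, and uses `scipy.stats.f.sf` for the upper-tail probability. The function name `granger_f_test` and the input numbers are hypothetical, chosen only so the calculation is easy to follow by hand.

```python
from scipy import stats


def granger_f_test(rss_r, rss_u, p, n):
    """F statistic and p-value for H0: all p coefficients on lagged X are zero.

    Assumes both regressions include an intercept, so the unrestricted model
    has 2p + 1 parameters, and n counts the observations actually used.
    """
    df_num = p
    df_den = n - 2 * p - 1
    f_stat = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    p_value = stats.f.sf(f_stat, df_num, df_den)  # upper-tail probability
    return f_stat, p_value


# Made-up inputs: (120 - 100) / 2 = 10 in the numerator, 100 / 100 = 1 below
f_stat, p_value = granger_f_test(rss_r=120.0, rss_u=100.0, p=2, n=105)
print(f"F = {f_stat:.2f}, p-value = {p_value:.5f}")
```

With these inputs the statistic is exactly 10, far into the upper tail of an F distribution with 2 and 100 degrees of freedom, so the null hypothesis of no Granger causality would be rejected at conventional significance levels.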
VAR Framework
Granger causality is naturally tested within a Vector Autoregression (VAR) model.
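VAR(p) Model
$$\begin{pmatrix} Y_t \\ X_t \end{pmatrix} = \mathbf{c} + \sum_{i=1}^{p} A_i \begin{pmatrix} Y_{t-i} \\ X_{t-i} \end{pmatrix} + \mathbf{u}_t$$
Here,
- $A_i$ = 2×2 coefficient matrix at lag $i$
- $\mathbf{c}$ = Constant vector
- $\mathbf{u}_t$ = Vector of error terms
In this form, $X$ does not Granger-cause $Y$ when the $(1,2)$ element of every $A_i$ is zero.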
Important Limitations
Granger Causality ≠ True Causality
Granger causality only tests predictive dependence, not true causal mechanisms. A significant result means:
- $X$ is useful for forecasting $Y$
- It does NOT mean $X$ causes $Y$
- A third variable $Z$ could cause both $X$ and $Y$
- Results are sensitive to the lag length chosen
| Limitation | Explanation |
|-----------|------------|
| Predictive only | Tests statistical predictability, not mechanisms |
| Sensitive to lags | Results can change with different lag lengths |
| Linear only | Standard test assumes linear relationships |
| Stationarity | Series should be stationary or cointegrated |
| Omitted variables | May detect spurious causality if a confounding variable $Z$ is omitted |
Python Implementation
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import grangercausalitytests

np.random.seed(42)

# Simulate two series in which lagged X feeds into Y (but not the reverse)
n = 300
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t-1] + np.random.randn()
    y[t] = 0.3 * x[t-1] + 0.4 * y[t-1] + np.random.randn()
data = pd.DataFrame({'Y': y, 'X': x})

# Granger causality test: X -> Y
# (the function tests whether the SECOND column helps predict the FIRST)
# Note: recent statsmodels releases deprecate `verbose`; drop it if it warns
print("X Granger-causes Y:")
gc_results = grangercausalitytests(data[['Y', 'X']], maxlag=5, verbose=True)

# VAR approach: report the AIC-selected lag order, then fit using that criterion
model = VAR(data)
lag_order = model.select_order(maxlags=5)
print(f"\nSelected lag order: {lag_order.selected_orders['aic']}")
results = model.fit(maxlags=5, ic='aic')
print(results.summary())
Worked Example
Example: GDP and Unemployment
Testing whether GDP growth Granger-causes changes in unemployment:
- ADF tests: Both series are non-stationary in levels, but their first differences are stationary
- VAR lag selection: AIC suggests 2 lags
- Granger test ($H_0$: GDP does not Granger-cause unemployment):
  - F-statistic = 8.32, p-value = 0.0003
  - Reject $H_0$: GDP growth helps predict unemployment changes
- Reverse test ($H_0$: Unemployment does not Granger-cause GDP):
  - F-statistic = 1.45, p-value = 0.236
  - Fail to reject $H_0$: no evidence that unemployment changes help predict GDP growth
- Conclusion: GDP Granger-causes unemployment, but not vice versa.
Key Takeaways
Summary: Granger Causality
- Granger causality tests whether $X$ improves predictions of $Y$ beyond $Y$'s own past
- It is a statistical test, not evidence of true causation
- Test within a VAR framework using F-tests or likelihood ratio tests
- Results depend on lag selection; always test multiple lag lengths
- Both series should be stationary (or cointegrated if non-stationary)
- Limitations: linear, predictive only, sensitive to omitted variables
Related Topics
- See ARIMA Models for univariate time series modeling
- See Causal Inference for methods that establish true causation
- See Vector Autoregression for multivariate time series models