5  Autoregressive Models


Autoregressive models provide a simple yet useful framework to predict future values based on past observations. The forecasting model uses past values (\(Y_{t-1}, Y_{t-2}, Y_{t-3}, \dots\)) to forecast \(Y_{t}\).

5.1 AR

To predict the future, a natural place to start is the immediate past. The AR(1) model frames this simplest case, where only the immediately previous observation influences the current one:

\[ Y_t = \phi_0 +\phi_1 Y_{t-1}+ \varepsilon_t \]

where:

\[\begin{align} Y_t: & \ \text{the value of $Y$ at time} \ t \\ \phi_0: & \ \text{constant term} \\ \phi_1: & \ \text{autoregressive coefficient} \\ \varepsilon_t: & \ \text{white noise error} \end{align}\]

The coefficients \(\phi_0\) and \(\phi_1\) can be estimated using OLS regression. No causal interpretation should be assumed.
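As a quick check, the following minimal sketch (with made-up coefficients \(\phi_0 = 0.5\), \(\phi_1 = 0.7\)) simulates an AR(1) process and recovers the coefficients by regressing \(Y_t\) on a constant and \(Y_{t-1}\):

import numpy as np
import statsmodels.api as sm

# Simulate an AR(1): Y_t = 0.5 + 0.7*Y_{t-1} + eps_t (hypothetical values)
np.random.seed(42)
T = 1000
eps = np.random.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 + 0.7 * y[t-1] + eps[t]

# OLS of Y_t on a constant and its first lag
X = sm.add_constant(y[:-1])
res = sm.OLS(y[1:], X).fit()
print(res.params)  # estimates should be close to (0.5, 0.7)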

Example

Fit autoregressive models to GDP data for Mexico. Stationarity is required: the idea is that a process whose behavior is stable over time is also predictable.

import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from datetime import datetime
import yfinance as yf
from pandas_datareader import data as pdr
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.ar_model import AutoReg
import plotly.graph_objects as go
from scipy.stats import norm
GDP = pdr.DataReader('NGDPRNSAXDCMXQ', 'fred', start='1993Q1', end = '2026Q1')
GDP = GDP.rename(columns={'NGDPRNSAXDCMXQ': 'Values'})
GDP = GDP.reset_index()
GDP = GDP.rename(columns={'DATE': 'Date'})
GDP['Date'] = pd.to_datetime(GDP['Date'], dayfirst=True) + pd.offsets.QuarterEnd(0)
GDP.set_index(pd.PeriodIndex(GDP['Date'], freq='Q'), inplace=True)

GDP['Values'] = GDP['Values']/1000 # Mil millones
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Values'])
plt.xlabel('Date')
plt.ylabel('Billions of Mexican Pesos')
plt.title('Mexico Quarterly GDP: 1993 to Date')
plt.tight_layout()
# plt.grid(True)
plt.show()

Clearly the series is not stationary. Formally, implement the tests:

Tests for stationarity

adf_p  = adfuller(GDP['Values'])[1]
kpss_p = kpss(GDP['Values'], regression='c', nlags='auto')[1]

print(f"ADF p-value: {adf_p:.4f}")
print(f"KPSS p-value: {kpss_p:.4f}")
ADF p-value: 0.7907
KPSS p-value: 0.0100
InterpolationWarning: The test statistic is outside of the range of p-values available in the look-up table. The actual p-value is smaller than the p-value returned.

Compute the quarterly growth rate of GDP:

GDP['y'] = np.log(GDP['Values']).diff() * 100
GDP = GDP.dropna()
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'])
plt.title('Quarterly Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()

Is it stationary?

adf_p  = adfuller(GDP['y'])[1]
kpss_p = kpss(GDP['y'], regression='c', nlags='auto')[1]

print(f"ADF p-value: {adf_p:.4f}")
print(f"KPSS p-value: {kpss_p:.4f}")
ADF p-value: 0.0000
KPSS p-value: 0.1000
InterpolationWarning: The test statistic is outside of the range of p-values available in the look-up table. The actual p-value is greater than the p-value returned.

Tests suggest the time series of GDP growth is stationary.

5.1.1 AR(1) model for the growth rate of GDP

The model of interest is:

\[ gGDP_t = \phi_0 +\phi_1 gGDP_{t-1}+ \varepsilon_t \]

df_ar1 = pd.DataFrame({
    'y': GDP['y'],
    'y_lag1': GDP['y'].shift(1)
}).dropna()

df_ar1 = sm.add_constant(df_ar1)

ols_model = sm.OLS(df_ar1['y'], df_ar1[['const', 'y_lag1']]).fit()
print(ols_model.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.130
Model:                            OLS   Adj. R-squared:                  0.124
Method:                 Least Squares   F-statistic:                     19.18
Date:                Thu, 23 Apr 2026   Prob (F-statistic):           2.45e-05
Time:                        15:25:29   Log-Likelihood:                -333.57
No. Observations:                 130   AIC:                             671.1
Df Residuals:                     128   BIC:                             676.9
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.6563      0.281      2.335      0.021       0.100       1.212
y_lag1        -0.3615      0.083     -4.380      0.000      -0.525      -0.198
==============================================================================
Omnibus:                      107.497   Durbin-Watson:                   2.037
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1746.789
Skew:                          -2.611   Prob(JB):                         0.00
Kurtosis:                      20.182   Cond. No.                         3.44
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

The lagged growth rate is a useful predictor of the current GDP growth rate:

GDP.tail(4)
              Date     Values         y
Date
2025Q1  2025-03-31  6220.5907 -3.039379
2025Q2  2025-06-30  6402.8110  2.887224
2025Q3  2025-09-30  6357.1770 -0.715270
2025Q4  2025-12-31  6527.1142  2.638051
last_y = GDP['y'].iloc[-1]

phi_0 = ols_model.params['const']
phi_1 = ols_model.params['y_lag1']

y_hat = phi_0 + phi_1 * last_y
print(f"2025 Q4 Forecast: {y_hat:.4f}%")
2025 Q4 Forecast: -0.2975%

\[ \widehat{gGDP}_{26:Q1|25:Q4} = 0.6563 - 0.3615\cdot(2.6381) \approx -0.2975\% \]

5.1.2 AR(2) model for the growth rate of GDP

\[ gGDP_t = \phi_0 +\phi_1 gGDP_{t-1} +\phi_2 gGDP_{t-2} + \varepsilon_t \]

df_ar2 = pd.DataFrame({
    'y': GDP['y'],
    'y_lag1': GDP['y'].shift(1),
    'y_lag2': GDP['y'].shift(2)
}).dropna()

df_ar2 = sm.add_constant(df_ar2)

ols_model2 = sm.OLS(df_ar2['y'], df_ar2[['const', 'y_lag1','y_lag2']]).fit()
print(ols_model2.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.133
Model:                            OLS   Adj. R-squared:                  0.119
Method:                 Least Squares   F-statistic:                     9.649
Date:                Thu, 23 Apr 2026   Prob (F-statistic):           0.000126
Time:                        15:25:29   Log-Likelihood:                -331.30
No. Observations:                 129   AIC:                             668.6
Df Residuals:                     126   BIC:                             677.2
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.6957      0.290      2.401      0.018       0.122       1.269
y_lag1        -0.3810      0.089     -4.279      0.000      -0.557      -0.205
y_lag2        -0.0546      0.089     -0.613      0.541      -0.231       0.122
==============================================================================
Omnibus:                      109.382   Durbin-Watson:                   2.029
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1722.163
Skew:                          -2.721   Prob(JB):                         0.00
Kurtosis:                      20.053   Cond. No.                         4.08
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
y_l1 = GDP['y'].iloc[-1]
y_l2 = GDP['y'].iloc[-2]

phi2_0 = ols_model2.params['const']
phi2_1 = ols_model2.params['y_lag1']
phi2_2 = ols_model2.params['y_lag2']

y_hat = phi2_0 + phi2_1 * y_l1 + phi2_2 * y_l2
print(f"2025 Q1 Forecast: {y_hat:.4f}%")
2025 Q1 Forecast: -0.2705%

The adjusted R-squared increases only slightly relative to the AR(1) model, and the second lag is not statistically significant. Most of the variation in next quarter’s GDP growth remains unexplained. Can we do better?


5.1.3 Forecast Error and Confidence intervals

5.1.3.1 Root Mean Squared Forecast Error (RMSFE)


Because the forecasting model is estimated with the historical data available at the time, a forecast is an out-of-sample prediction of the future. Forecasts can be either one step ahead or multiple steps ahead. For instance:

\(\hat{Y}_{T+1|T}\) represents the forecast of \(Y\) at date \(T+1\) based on data through date \(T\). This implies a forecast error:

\[ Y_{T+1}-\hat{Y}_{T+1|T} \]

The accuracy of a model’s predictions can be measured with the Mean Squared Forecast Error (MSFE). It quantifies the average difference between the actual (observed) values and the predicted values produced by the model. More formally, it is the expected value of the square of the forecast error, when the forecast is made for an observation not used to estimate the forecasting model (i.e., for an observation in the future).

\[ MSFE = E\big[(Y_{T+1}-\hat{Y}_{T+1|T})^2\big] \]

Using the square of the error penalizes large forecast errors much more heavily than small ones. A small error might not matter, but a very large error could call into question the entire forecasting exercise.

The Root Mean Squared Forecast Error (RMSFE) is the square root of the MSFE; it has the same units as the series being forecast, so its magnitude is more easily interpreted:

\[ RMSFE = \sqrt{E\big[(Y_{T+1}-\hat{Y}_{T+1|T})^2\big]} \]

The RMSFE is like a standard deviation, except that it explicitly focuses on the forecast error made with estimated coefficients. It can be understood as a measure of the magnitude of a typical forecasting “mistake”.


The method we employ to estimate the MSFE is based on simulating how the forecasting model would have done, had we been using it in real time.

– Choose a date \(P\) for the start of the pseudo out-of-sample (POOS) period, for example \(P = 0.8T\).

– Re-estimate the model every period, \(s = P-1,\dots,T-1\).

– Compute the forecast error: \(Y_{s+1}-\hat{Y}_{s+1|s}\).

– Average the squared forecast errors: \(MSFE_{POOS}=\frac{1}{T-P+1} \sum_{s=P-1}^{T-1}\big[Y_{s+1}-\hat{Y}_{s+1|s}\big]^2\)


Return to the example of predicting next-quarter GDP growth to approximate the forecast error. Start the pseudo out-of-sample (POOS) exercise by splitting the sample into ‘train’ and ‘test’ sub-samples:

y_roll = GDP['y'] 
T = len(y_roll)
P = int(0.8 * T) # 80% train

Fit the AR(1) model recursively on an expanding window and use it to forecast one step ahead:

forecasts = []
actuals   = []

# Rolling POOS forecast
for s in range(P, T - 1):
    # Fit model on data up to time s
    model = AutoReg(y_roll[:s+1], lags=1, old_names=False).fit()

    # Forecast y_{s+1}
    y_pred = model.predict(start=s+1, end=s+1).iloc[0]

    forecasts.append(y_pred)
    actuals.append(y_roll.iloc[s+1])

Compute forecast errors:

errors = np.array(actuals) - np.array(forecasts)

# Compute the RMSFE
rmsfe = np.sqrt(np.mean(errors**2))
print(f"POOS RMSFE AR(1): {rmsfe:.4f} pp")
POOS RMSFE AR(1): 5.3961 pp


5.1.3.2 Confidence intervals


The RMSFE estimates the standard deviation of the forecast error, which feeds directly into confidence intervals. A \((1-\alpha)\) forecast interval can be constructed as

\[ \hat{Y}_{T+1|T} \pm Z_{\alpha/2} \cdot RMSFE \]

The \(Z\)-value is obtained from the inverse CDF of the standard normal distribution. This assumes the innovation \(\varepsilon_t\) in the AR model is normally distributed.

Continuing with the example:

Recall the forecast of GDP growth using an AR(1) model is:

y_hat = phi_0 + phi_1 * last_y
print(f"2026 Q1 Forecast: {y_hat:.4f}%")
2026 Q1 Forecast: -0.2975%

which can also be obtained as:

final_model = AutoReg(GDP['y'], lags=1).fit()
forecast_T1 = final_model.params['const'] + final_model.params['y.L1'] * GDP['y'].iloc[-1]
print(f"2025 Q2 Forecast: {forecast_T1:.4f}%")
2025 Q2 Forecast: -0.2975%

Then, plug the one-period forecast in:

from scipy.stats import norm

# For a 80% confidence interval
alpha = 0.20
z = norm.ppf(1-alpha/2)

ci_lower = forecast_T1 - z * rmsfe
ci_upper = forecast_T1 + z * rmsfe

print(f"2025 Q4 Forecast: {forecast_T1:.4f}")
print(f"80% CI (based on RMSFE): [{ci_lower:.4f}, {ci_upper:.4f}]")
2025 Q4 Forecast: -0.2975
80% CI (based on RMSFE): [-7.2129, 6.6180]

Forecasting the next 4 quarters by iterating the AR(1) forecast (reusing the one-step RMSFE at every horizon, which understates the uncertainty of longer-horizon forecasts):

forecast_horizon = 4
f_values = []
current_y = last_y

for _ in range(forecast_horizon):
    next_y = final_model.params['const'] + final_model.params['y.L1'] * current_y
    f_values.append(next_y)
    current_y = next_y


ci_upper = [f + z * rmsfe for f in f_values]
ci_lower = [f - z * rmsfe for f in f_values]
GDP['Date'] = pd.to_datetime(GDP['Date'])
last_date   = GDP['Date'].iloc[-1]
forecast_dates = pd.date_range(start=last_date + pd.offsets.QuarterEnd(), periods=4, freq='QE')

extended_dates  = [GDP['Date'].iloc[-1]] + list(forecast_dates)
extended_values = [GDP['y'].iloc[-1]] + f_values

# Plot
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'], label='Observed')
plt.plot(extended_dates, extended_values, label='Forecast', color='orange')
plt.fill_between(forecast_dates, ci_lower, ci_upper, color='orange', alpha=0.3, label='80% CI')
plt.title('Forecast of GDP Growth for México with 80% Confidence Interval')
plt.xlabel('Date')
plt.ylabel('y')
plt.legend()
plt.tight_layout()
plt.show()

While we only employ AR(1) and AR(2) models in this example, the order of the autoregression can be extended. An autoregressive model of order p, AR(p), is written as:

\[ Y_{t} = \phi_0 + \phi_{1}Y_{t-1} + \phi_{2}Y_{t-2} + \dots + \phi_{p}Y_{t-p} + \varepsilon_{t} \]

where \(\varepsilon_t\) is white noise and restrictions on the values of \(\phi\) apply due to stationarity requirements.
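These restrictions can be checked from the roots of the AR lag polynomial \(1-\phi_1 z - \dots - \phi_p z^p\), which must lie outside the unit circle. A minimal sketch for a hypothetical AR(2) with \(\phi_1=0.5\) and \(\phi_2=0.4\):

import numpy as np

# Roots of 1 - 0.5 z - 0.4 z^2 (coefficients in increasing powers of z)
roots = np.polynomial.polynomial.polyroots([1, -0.5, -0.4])
print(np.abs(roots))  # both moduli exceed 1, so this AR(2) is stationary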

Practice

In this activity, you will estimate an AR(1) model to forecast U.S. inflation (FRED code: CPIAUCSL). Using monthly data from 2010 to 2026, forecast three months ahead, and construct 90% forecast confidence intervals using an out-of-sample measure of forecast accuracy.

Forecasting U.S. inflation is highly relevant for Mexico. The United States is Mexico’s largest trading partner, and inflation dynamics in the U.S. influence Mexico’s macroeconomic indicators including export demand, exchange rate movements, and monetary policy decisions by Banco de México.

Because inflation in more advanced economies tends to exhibit persistence, short-term dynamics are often reasonably approximated by an AR(1) process.
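A possible starting point for the data preparation (the log-difference in percent is one common convention for monthly inflation; adapt it as needed):

import numpy as np
from pandas_datareader import data as pdr

cpi = pdr.DataReader('CPIAUCSL', 'fred', start='2010-01-01', end='2026-01-01')
# Monthly inflation as the log-difference of the CPI level, in percent
infl = (np.log(cpi['CPIAUCSL']).diff() * 100).dropna()
print(infl.tail())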

5.2 MA

So far, we have assumed that persistence in the series comes from past values of the variable itself, but persistence might come from past shocks. This possibility leads us to Moving Average (MA) models.

A moving average model of order q, MA(q), is:

\[ Y_{t} = \mu + \varepsilon_t + \theta_{1}\varepsilon_{t-1} + \theta_{2}\varepsilon_{t-2} + \dots + \theta_{q}\varepsilon_{t-q} \]

where \(\varepsilon_t\) is white noise:

  • \(E(\varepsilon_t)=0\)
  • \(Var(\varepsilon_t)=\sigma^2\)
  • \(Cov(\varepsilon_t, \varepsilon_s)=0\) for \(t\ne s\)

The MA(q) can also be written as:

\[ Y_t = \mu + \sum_{j=0}^q \theta_j \varepsilon_{t-j} \] with \(\theta_0 := 1\).

Properties:

Mean: \[ E(Y_t) = \mu \]

Variance: \[ Var(Y_t) = \sigma^2 \sum_{j=0}^q\theta_j^2 \]

Covariance:

\[ Cov(Y_t,Y_{t-h})= \gamma_h = \begin{cases} \sigma^2 \sum_{j=0}^{q-h}\theta_j \theta_{j+h} & h=0,1,2,\dots,q \\ 0 & h>q. \end{cases} \]

Note that the MA(q) process is, by construction, weakly stationary.
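These moment formulas are easy to verify numerically. A minimal sketch for a hypothetical MA(2) with \(\theta_1=0.5\), \(\theta_2=0.4\), and \(\sigma^2=1\), comparing the theoretical variance and first autocovariance with their sample counterparts in a long simulation:

import numpy as np

np.random.seed(0)
theta = np.array([1.0, 0.5, 0.4])   # theta_0 = 1, theta_1, theta_2 (hypothetical)
sigma2 = 1.0
T = 200_000

eps = np.random.normal(scale=np.sqrt(sigma2), size=T + 2)
# MA(2): Y_t = eps_t + theta_1*eps_{t-1} + theta_2*eps_{t-2}
y = eps[2:] + theta[1] * eps[1:-1] + theta[2] * eps[:-2]

var_theory = sigma2 * np.sum(theta**2)                            # sigma^2 * sum_j theta_j^2 = 1.41
gamma1_theory = sigma2 * (theta[0]*theta[1] + theta[1]*theta[2])  # = 0.70
print(var_theory, np.var(y))                       # theoretical vs sample variance
print(gamma1_theory, np.cov(y[1:], y[:-1])[0, 1])  # theoretical vs sample autocovariance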



Now, consider the special case MA(1):

\[ Y_{t} = \mu + \varepsilon_t + \theta_{1}\varepsilon_{t-1} \]

– Mean: \(E(Y_t) = \mu\)

– Variance: \(Var(Y_t)=\sigma^2(1+\theta_1^2)\)

– Covariance: \(Cov(Y_t,Y_{t-1})=\sigma^2 \theta_1\)

The autocorrelation function is characterized by:

\[ \rho(0) = 1 \]

\[ \rho(1) = \frac{\sigma^2 \theta_1}{\sigma^2(1+\theta_1^2)}= \frac{\theta_1}{1+\theta_1^2} \]

that is, in a MA(1) the ACF cuts off at lag 1.



It is possible to write any stationary \(AR(p)\) model as an \(MA(\infty)\). For example, consider an AR(1) with \(|\phi| < 1\):

\[ \begin{align*} Y_t &= \phi Y_{t-1} + \varepsilon_t\\ &= \phi (\phi Y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t\\ &= \phi^2 Y_{t-2} + \phi \varepsilon_{t-1} + \varepsilon_t\\ &= \phi^3 Y_{t-3} + \phi^2 \varepsilon_{t-2} + \phi \varepsilon_{t-1} + \varepsilon_t\\ &\;\;\vdots \\ &= \varepsilon_t + \phi \varepsilon_{t-1} + \phi^2 \varepsilon_{t-2} + \phi^3 \varepsilon_{t-3} + \dots \end{align*} \]

which is an \(MA(\infty)\) process.

An invertible \(MA(q)\) process can be written as an \(AR(\infty)\). Invertibility is ensured if the roots of the MA polynomial \(1+\theta_1 z + \cdots + \theta_q z^q=0\) lie outside the unit circle.
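Both facts can be inspected numerically. In the minimal sketch below (coefficient values are illustrative), arma2ma from statsmodels recovers the \(MA(\infty)\) weights of an AR(1), which equal \(\phi^j\), and the invertibility of an MA(1) with \(\theta_1 = 0.8\) follows from the root of its polynomial:

import numpy as np
from statsmodels.tsa.arima_process import arma2ma

# MA(infinity) weights of an AR(1) with phi = 0.6 (lag-polynomial convention: [1, -phi])
psi = arma2ma(ar=[1, -0.6], ma=[1], lags=5)
print(psi)  # [1, 0.6, 0.36, 0.216, 0.1296] = phi^j

# Invertibility of an MA(1) with theta_1 = 0.8: root of 1 + 0.8 z = 0
root = np.polynomial.polynomial.polyroots([1, 0.8])
print(np.abs(root))  # 1.25 > 1, so the process is invertible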

Examine features of the following processes:

\[ Y_t = 1 + 0.8 \varepsilon_{t-1} + \varepsilon_t \]

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf 


T = 500
mu = 1
theta1 = 0.8

np.random.seed(1234)
eps = np.random.normal(loc=0.0, scale=1.0, size=T+1)

y_ma1 = np.zeros(T)

for t in range(1, T):
    y_ma1[t] = mu + eps[t] + theta1 * eps[t-1]

# Plots
fig, axes = plt.subplots(3, 1, figsize=(7, 10))
axes[0].plot(y_ma1)
axes[0].set_title("Simulated MA(1) process")

plot_acf(y_ma1, lags=20, ax=axes[1], zero=False)
axes[1].set_title("ACF of MA(1)")

plot_pacf(y_ma1, lags=20, ax=axes[2], zero=False)
axes[2].set_title("PACF of MA(1)")

plt.tight_layout()
plt.show()

ACF: big spike at lag 1, near zero after lag 1

PACF: decays gradually.

\[ Y_t = 1 + 0.5 \varepsilon_{t-1} + 0.4 \varepsilon_{t-2} + \varepsilon_t \]

theta1 = 0.5
theta2 = 0.4

np.random.seed(1234)
eps = np.random.normal(loc=0.0, scale=1.0, size=T+2)

y_ma2 = np.zeros(T)

# Start at t = 2 so t-1 and t-2 exist
for t in range(2, T):
    y_ma2[t] = mu + eps[t] + theta1 * eps[t-1] + theta2 * eps[t-2]

fig, axes = plt.subplots(3, 1, figsize=(7, 10))
axes[0].plot(y_ma2)
axes[0].set_title("Simulated MA(2) process)")

plot_acf(y_ma2, lags=10, ax=axes[1], zero=False)
axes[1].set_title("ACF of MA(2)")

plot_pacf(y_ma2, lags=10, ax=axes[2], zero=False)
axes[2].set_title("PACF of MA(2)")

plt.tight_layout()
plt.show()

ACF: significant at lags 1 and 2, near zero for lags > 2.

PACF: decays gradually.

\[ Y_t = 1 + 0.8 Y_{t-1} + \varepsilon_t \]

phi1 = 0.8

np.random.seed(124)
eps = np.random.normal(loc=0.0, scale=1.0, size=T)
y_ar1 = np.zeros(T)


for t in range(1, T):
    y_ar1[t] = mu + phi1 * y_ar1[t-1] + eps[t]


fig, axes = plt.subplots(3, 1, figsize=(7, 10))
axes[0].plot(y_ar1)
axes[0].set_title("Simulated AR(1) process")

plot_acf(y_ar1, lags=10, ax=axes[1], zero = False)
axes[1].set_title("ACF of AR(1)")

plot_pacf(y_ar1, lags=10, ax=axes[2], zero = False)
axes[2].set_title("PACF of AR(1)")

plt.tight_layout()
plt.show()

ACF: gradually decaying.

PACF: large spike at lag 1, near zero after lag 1.

\[ Y_t = 1 + 0.5 Y_{t-1} + 0.4 Y_{t-2} + \varepsilon_t \]

phi1 = 0.5
phi2 = 0.4

np.random.seed(1234)
eps = np.random.normal(loc=0.0, scale=1.0, size=T)
y_ar2 = np.zeros(T)

# Start at t = 2 so that y_{t-1} and y_{t-2} exist
for t in range(2, T):
    y_ar2[t] = mu + phi1 * y_ar2[t-1] + phi2 * y_ar2[t-2] + eps[t]


fig, axes = plt.subplots(3, 1, figsize=(7, 10))
axes[0].plot(y_ar2)
axes[0].set_title("Simulated AR(2) process ")

plot_acf(y_ar2, lags=20, ax=axes[1], zero = False)
axes[1].set_title("ACF of AR(2)")

plot_pacf(y_ar2, lags=20, ax=axes[2], zero = False)
axes[2].set_title("PACF of AR(2)")

plt.tight_layout()
plt.show()

ACF: decays gradually (oscillates if \(\phi_2\) is negative)

PACF: strong at lags 1 and 2, near zero after lag 2.


Diagnostics:

For MA(q):

– ACF: cuts off after lag q

– PACF: tails off gradually

For AR(p):

– PACF: cuts off after lag p

– ACF: tails off gradually


5.3 ARMA/ARIMA

The combination of an autoregression (AR) and a moving average (MA) model is an ARMA model. ARMA is an acronym for AutoRegressive Moving Average. An ARMA(p,q) process is written as:

\[ Y_{t} = \mu + \phi_{1}Y_{t-1} + \phi_{2}Y_{t-2} + \dots + \phi_{p}Y_{t-p} + \varepsilon_t + \theta_{1}\varepsilon_{t-1} + \theta_{2}\varepsilon_{t-2} + \dots + \theta_{q}\varepsilon_{t-q} \]


ARIMA is the acronym for AutoRegressive Integrated Moving Average. An ARIMA(p,d,q) process is written as:

\[ Y'_{t} = \mu + \phi_{1}Y'_{t-1} + \phi_{2}Y'_{t-2} + \dots + \phi_{p}Y'_{t-p} + \varepsilon_t + \theta_{1}\varepsilon_{t-1} + \theta_{2}\varepsilon_{t-2} + \dots + \theta_{q}\varepsilon_{t-q} \] where \(Y'\) is the differenced series.


where:

p: Order of the autoregressive component

d: Degree of differencing

q: Order of the moving average component



Information Criteria


The ACF and PACF alone will not distinguish among ARIMA models when \(p,q>0\). In practice, several alternative specifications of ARIMA models are usually possible. In these situations, we need a way to compare competing models and select the one that best balances good fit with parsimony.

For this purpose we use information criteria. Two commonly used metrics are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC):

\[ AIC = -2 \mathcal{L} + 2K \]

\[ BIC = -2 \mathcal{L} + \ln(T)K \]

where \(\mathcal{L}\) is the log-likelihood of the data (how well the model explains the data we observed), \(K\) is the number of parameters in the model, and \(T\) is the number of observations.

When comparing several candidate models, the model with the lowest AIC or BIC is preferred. The general idea is that a model should explain the data well \((\uparrow \mathcal{L})\) without becoming unnecessarily complex \((\downarrow K)\).
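Both criteria can be reproduced from a fitted model’s log-likelihood. A minimal sketch on a toy series (note that statsmodels counts \(\sigma^2\) among the \(K\) estimated parameters, consistent with the AIC and BIC reported in the summaries below):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(0)
y = np.random.normal(size=200)  # toy stationary series, purely for illustration

res = ARIMA(y, order=(1, 0, 1)).fit()
K = len(res.params)  # const, ar.L1, ma.L1, sigma2
T = res.nobs
print(res.aic, -2 * res.llf + 2 * K)          # the two numbers coincide
print(res.bic, -2 * res.llf + np.log(T) * K)  # likewise for the BIC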

5.3.1 Box-Jenkins methodology

The Box–Jenkins methodology is a systematic approach used to identify, estimate, and validate ARIMA models for time series forecasting. The methodology consists of four main stages, which are often repeated iteratively until a satisfactory model is obtained.

1. Identification

The first step is to determine the appropriate structure of the ARIMA model.

  • Check stationarity

  • Examine the ACF and PACF plots to identify possible values for p and q

2. Estimation

Once candidate models are identified, the next step is to estimate their parameters. Evaluate:

  • Statistical significance of the parameters

  • Overall model fit using AIC or BIC

3. Diagnostic Checking

A well-specified model should produce residuals that behave like white noise: no autocorrelation, constant variance, and zero mean.

  • Use Residual ACF plots

  • Implement Ljung–Box tests for autocorrelation

  • Examine the residual distributions

If the residuals still contain structure or autocorrelation, the model must be re-specified, and the process returns to the identification stage.

4. Forecasting

Once a model passes the diagnostic checks, it can be used to generate forecasts.


[Figure: The Box–Jenkins methodology]


Example

Forecast Mexico’s GDP growth using ARIMA models.

import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from datetime import datetime
import yfinance as yf
from pandas_datareader import data as pdr
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.graphics.tsaplots import plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error
import plotly.graph_objects as go


1. Identification

GDP = pdr.DataReader('NGDPRNSAXDCMXQ', 'fred', start='1993Q1', end = '2026Q1')
GDP = GDP.rename(columns={'NGDPRNSAXDCMXQ': 'Values'})
GDP = GDP.reset_index()
GDP = GDP.rename(columns={'DATE': 'Date'})

GDP['Date'] = pd.to_datetime(GDP['Date']) + pd.offsets.QuarterEnd(0)
GDP.set_index(pd.PeriodIndex(GDP['Date'], freq='Q'), inplace=True)

GDP['Values'] = GDP['Values']/1000 # Mil millones
GDP['y'] = np.log(GDP['Values']).diff() * 100
GDP = GDP.dropna()
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'])
plt.title('Quarterly Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()

Tests for stationarity

adf_p  = adfuller(GDP['y'])[1]
kpss_p = kpss(GDP['y'], regression='c', nlags='auto')[1]

print(f"ADF p-value: {adf_p:.4f}")
print(f"KPSS p-value: {kpss_p:.4f}")
ADF p-value: 0.0000
KPSS p-value: 0.1000
InterpolationWarning: The test statistic is outside of the range of p-values available in the look-up table. The actual p-value is greater than the p-value returned.

Tests suggest the time series is stationary.

Plot the ACF and PACF

plot_acf(GDP['y'], lags=36, zero=False)
plt.title('Autocorrelation Function (ACF) of Mexico GDP Growth')
plt.xlabel('Lag')
plt.ylabel('Autocorrelation')
plt.tight_layout()
plt.show()

plot_pacf(GDP['y'], lags=36, zero=False, method = 'ols')
plt.title('Partial Autocorrelation Function (PACF) of Mexico GDP Growth')
plt.xlabel('Lag')
plt.ylabel('Partial autocorrelation')
plt.tight_layout()
plt.show()

Both the ACF and PACF show significant spikes at lags 1 and 3, but not at lag 2. Begin by estimating an ARMA(1,1). Later, we’ll fit other models with nearby lag lengths to compare information criteria.

2. Estimation

model = ARIMA(GDP['y'], order=(1, 0, 1))
fitted_model = model.fit()

print(fitted_model.summary())
print("\nInformation Criteria:")
print(f"AIC: {fitted_model.aic:.2f}")
print(f"BIC: {fitted_model.bic:.2f}")
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  131
Model:                 ARIMA(1, 0, 1)   Log Likelihood                -334.149
Date:                Thu, 23 Apr 2026   AIC                            676.297
Time:                        15:25:32   BIC                            687.798
Sample:                    06-30-1993   HQIC                           680.970
                         - 12-31-2025                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.4812      0.203      2.366      0.018       0.083       0.880
ar.L1          0.1135      0.163      0.694      0.488      -0.207       0.434
ma.L1         -0.5304      0.133     -3.993      0.000      -0.791      -0.270
sigma2         9.6020      0.578     16.603      0.000       8.469      10.736
===================================================================================
Ljung-Box (L1) (Q):                   0.01   Jarque-Bera (JB):              2022.72
Prob(Q):                              0.92   Prob(JB):                         0.00
Heteroskedasticity (H):               2.87   Skew:                            -3.07
Prob(H) (two-sided):                  0.00   Kurtosis:                        21.24
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

Information Criteria:
AIC: 676.30
BIC: 687.80
def fit_arma(ts, p, q):
    model = ARIMA(ts, order=(p, 0, q))
    fitted_model = model.fit()
    print(f"\nInformation Criteria (p={p}, q={q}):")
    print(f"AIC: {fitted_model.aic:.2f}")
    print(f"BIC: {fitted_model.bic:.2f}")
    return fitted_model
m1 = fit_arma(GDP['y'],1,1)
m2 = fit_arma(GDP['y'],1,0)
m3 = fit_arma(GDP['y'],0,1)
m4 = fit_arma(GDP['y'],3,1)
m5 = fit_arma(GDP['y'],1,3)
m6 = fit_arma(GDP['y'],3,3)

Information Criteria (p=1, q=1):
AIC: 676.30
BIC: 687.80

Information Criteria (p=1, q=0):
AIC: 677.43
BIC: 686.05

Information Criteria (p=0, q=1):
AIC: 674.55
BIC: 683.18

Information Criteria (p=3, q=1):
AIC: 665.21
BIC: 682.47

Information Criteria (p=1, q=3):
AIC: 663.26
BIC: 680.51

Information Criteria (p=3, q=3):
AIC: 666.68
BIC: 689.68

Select the model with the lowest information criteria: here, the ARMA(1,3) specification (m5), which has both the lowest AIC and the lowest BIC.
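The pairwise comparison above can also be automated. A minimal sketch looping over small \((p, q)\) orders for the same series (some orders may trigger convergence warnings):

import itertools

results = {}
for p, q in itertools.product(range(4), range(4)):
    try:
        results[(p, q)] = ARIMA(GDP['y'], order=(p, 0, q)).fit().aic
    except Exception:
        continue  # skip orders that fail to estimate

best = min(results, key=results.get)
print(f"Lowest AIC: ARMA{best}, AIC = {results[best]:.2f}")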

3. Diagnostic Checking

residuals = m5.resid

fig = plt.figure(figsize=(7, 7))
gs = fig.add_gridspec(2, 2)

ax1 = fig.add_subplot(gs[0, :])
sns.lineplot(x=np.arange(len(residuals)), y=residuals, ax=ax1)
ax1.set_title("Residuals over Time")

ax2 = fig.add_subplot(gs[1, 0])
plot_acf(residuals, lags=30, zero=False, ax=ax2)
ax2.set_title("ACF of Residuals")

ax3 = fig.add_subplot(gs[1, 1])
sns.histplot(residuals, kde=True, ax=ax3)
ax3.set_title("Histogram of Residuals")

plt.tight_layout()
plt.show()

Confirm the residuals are white noise with a Ljung-Box test:

print("\nLjung-Box Test:")
print(acorr_ljungbox(residuals, lags=[8], return_df=True))

Ljung-Box Test:
    lb_stat  lb_pvalue
8  3.381331   0.908202

The null hypothesis of no serial correlation can’t be rejected.

4. Forecasting

periods = 4
cl      = 0.9   

forecast      = m5.get_forecast(steps=periods)
mean_forecast = forecast.predicted_mean
conf_int      = forecast.conf_int(alpha=1-cl)  # pass alpha to conf_int for a 90% interval

for date, value in mean_forecast.items():
  print(f"{date}: {value:.2f}")
2026Q1: -1.21
2026Q2: 1.87
2026Q3: -1.03
2026Q4: 1.98
GDP['Date'] = pd.to_datetime(GDP['Date'])
last_date   = GDP['Date'].iloc[-1]
forecast_dates = pd.date_range(start=last_date + pd.offsets.QuarterEnd(), periods=periods, freq='QE')

extended_dates  = [GDP['Date'].iloc[-1]] + list(forecast_dates)
extended_values = [GDP['y'].iloc[-1]] + list(mean_forecast.values)

# Plot
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'], label='Observed')
plt.plot(extended_dates, extended_values, label='Forecast', color='orange')
plt.fill_between(forecast_dates, conf_int.iloc[:, 0], conf_int.iloc[:, 1], color='orange', alpha=0.3, label=' Confidence Interval')
plt.title(f'ARIMA Forecast for GDP Growth  with  a {100*cl}% Confidence Interval')
plt.xlabel('Date')
plt.ylabel('y')
plt.legend()
plt.tight_layout()
plt.show()


5.4 SARIMA

A seasonal ARIMA (SARIMA) model is an ARIMA model augmented with seasonal terms. These models are designed to handle repeated seasonal patterns in data, such as quarterly demand cycles, monthly sales peaks, or yearly production slowdowns. A SARIMA model extends the ARIMA framework by including seasonal autoregressive, differencing, and moving average components. It can be expressed as:

\[ SARIMA(p,d,q)(P,D,Q)_s \]

where:

Non-seasonal components (standard ARIMA(p,d,q) model):

p: order of the autoregressive (AR) component

d: number of non-seasonal differences

q: order of the moving average (MA) component

Seasonal components

P: order of the seasonal autoregressive component

D: number of seasonal differences (\(Y_t - Y_{t-s}\)).

Q: order of the seasonal moving average component

s: length of the seasonal cycle (12 for monthly data, 4 for quarterly data, etc.)



Identifying SARIMA Models

As in the Box–Jenkins methodology, identification relies on:

  • Plotting the time series

  • Examining the ACF and PACF

  • Looking for spikes at seasonal lags (e.g., 12, 24, 36 for monthly data)

Seasonal patterns often appear as significant autocorrelations at multiples of the seasonal period. For example:

A SARIMA \((0,0,0)(0,0,2)_{12}:\)

  • will show significant spikes only at lags 12 and 24 in the ACF

  • The PACF tails off at the seasonal lags

A SARIMA \((0,0,0)(2,0,0)_{12}:\)

  • will show significant spikes only at lags 12 and 24 in the PACF

  • The ACF tails off at the seasonal lags
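A quick simulation of the first case, with hypothetical seasonal MA coefficients \(\Theta_1 = 0.6\) and \(\Theta_2 = 0.3\), makes the pattern visible:

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf

# SARIMA (0,0,0)(0,0,2)_12: MA terms only at lags 12 and 24
ma = np.zeros(25)
ma[0], ma[12], ma[24] = 1.0, 0.6, 0.3  # theta_0 = 1, seasonal Theta_1, Theta_2

np.random.seed(0)
y = ArmaProcess(ar=[1], ma=ma).generate_sample(nsample=2000)

rho = acf(y, nlags=36)
print(np.round(rho[[12, 24, 36]], 2))  # clear spikes at lags 12 and 24, roughly 0 at lag 36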

Example

Forecast Mexico’s GDP growth using SARIMA models.

import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from datetime import datetime
import yfinance as yf
from pandas_datareader import data as pdr
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.seasonal import seasonal_decompose, STL
from sklearn.metrics import mean_squared_error
import plotly.graph_objects as go


1. Identification

GDP = pdr.DataReader('NGDPRNSAXDCMXQ', 'fred', start='1993Q1', end = '2026Q1')
GDP = GDP.rename(columns={'NGDPRNSAXDCMXQ': 'Values'})
GDP = GDP.reset_index()
GDP = GDP.rename(columns={'DATE': 'Date'})

GDP['Date'] = pd.to_datetime(GDP['Date'], dayfirst=True) + pd.offsets.QuarterEnd(0)
GDP.set_index(pd.PeriodIndex(GDP['Date'], freq='Q'), inplace=True)

GDP['Values'] = GDP['Values']/1000 # Mil millones
GDP['y'] = np.log(GDP['Values']).diff() * 100
GDP = GDP.dropna()
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'])
plt.title('Quarterly Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()

stl = STL(GDP['y'], period=4, robust=True)
res = stl.fit()

fig, axes = plt.subplots(4, 1, figsize=(7, 10), sharex=True)

GDP['y'].plot(ax=axes[0], title='Mexico Quarterly GDP Growth', ylabel='Percent')
res.trend.plot(ax=axes[1], title='Trend/Cycle')
res.seasonal.plot(ax=axes[2], title='Seasonality')
res.resid.plot(ax=axes[3], title='Remainder')

for ax in axes:
    ax.grid(False)
    
plt.tight_layout()
plt.show()

The decomposition shows a clear seasonal component with a periodicity of 4 (quarterly), justifying the use of a SARIMA model.

Tests for stationarity

adf_p  = adfuller(GDP['y'])[1]
kpss_p = kpss(GDP['y'], regression='c', nlags='auto')[1]

print(f"ADF p-value: {adf_p:.4f}")
print(f"KPSS p-value: {kpss_p:.4f}")
ADF p-value: 0.0000
KPSS p-value: 0.1000
InterpolationWarning: The test statistic is outside of the range of p-values available in the look-up table. The actual p-value is greater than the p-value returned.

Tests suggest the time series is stationary.

Plot the ACF and PACF

plot_acf(GDP['y'], lags=36, zero=False)
plt.title('Autocorrelation Function (ACF) of Mexico GDP Growth')
plt.xlabel('Lag')
plt.ylabel('Autocorrelation')
plt.tight_layout()
plt.show()

plot_pacf(GDP['y'], lags=36, zero=False, method = 'ols')
plt.title('Partial Autocorrelation Function (PACF) of Mexico GDP Growth')
plt.xlabel('Lag')
plt.ylabel('Partial autocorrelation')
plt.tight_layout()
plt.show()

For the nonseasonal component we found earlier that a (1,0,3) model captures the process with some accuracy. The significant spikes at lags 4, 8, and 12 suggest a seasonal MA(3) component. An initial candidate is therefore a SARIMA \((1, 0, 3)(0, 0, 3)_4\) model. Proceed to estimate the model and compare the AIC, BIC, and residuals with other reasonable guesses.

2. Estimation

def fit_sarima(ts, p, d, q, P, D, Q, s):
    model = SARIMAX(ts, order=(p, d, q), seasonal_order=(P, D, Q, s))
    fitted_model = model.fit(disp=False)
    print(f"\nSARIMA({p},{d},{q})({P},{D},{Q})[{s}]")
    print(f"AIC: {fitted_model.aic:.2f}")
    print(f"BIC: {fitted_model.bic:.2f}")
    return fitted_model
m1 = fit_sarima(GDP['y'],1,0,3,0,0,3,4)

m2 = fit_sarima(GDP['y'],1,0,3,0,0,1,4)

m3 = fit_sarima(GDP['y'],1,0,3,0,0,2,4)

m4 = fit_sarima(GDP['y'],1,0,1,1,0,1,4)

m5 = fit_sarima(GDP['y'],1,0,1,0,0,1,4)
ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals

SARIMA(1,0,3)(0,0,3)[4]
AIC: 671.47
BIC: 694.48

SARIMA(1,0,3)(0,0,1)[4]
AIC: 669.83
BIC: 687.08

SARIMA(1,0,3)(0,0,2)[4]
AIC: 670.78
BIC: 690.90

SARIMA(1,0,1)(1,0,1)[4]
AIC: 661.88
BIC: 676.26

SARIMA(1,0,1)(0,0,1)[4]
AIC: 678.43
BIC: 689.93
ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals

Select the model with the lowest information criteria: here, the SARIMA(1,0,1)(1,0,1)[4] specification (m4).

3. Diagnostic Checking

residuals = m4.resid

fig = plt.figure(figsize=(7, 7))
gs = fig.add_gridspec(2, 2)

ax1 = fig.add_subplot(gs[0, :])
sns.lineplot(x=np.arange(len(residuals)), y=residuals, ax=ax1)
ax1.set_title("Residuals over Time")

ax2 = fig.add_subplot(gs[1, 0])
plot_acf(residuals, lags=30, zero=False, ax=ax2)
ax2.set_title("ACF of Residuals")

ax3 = fig.add_subplot(gs[1, 1])
sns.histplot(residuals, kde=True, ax=ax3)
ax3.set_title("Histogram of Residuals")

plt.tight_layout()
plt.show()

Confirm the residuals are white noise with a Ljung-Box test:

print("\nLjung-Box Test:")
print(acorr_ljungbox(residuals, lags=[8], model_df=4, return_df=True))

Ljung-Box Test:
    lb_stat  lb_pvalue
8  0.990826   0.911184

The null hypothesis of no serial correlation can’t be rejected.

rmse = np.sqrt(np.mean(residuals**2))  # in-sample RMSE of the fitted model
print(f"RMSE: {rmse:.4f}")
RMSE: 2.9042

4. Forecasting

periods = 4
cl      = 0.9   # confidence level

forecast = m4.get_forecast(steps=periods)
mean_forecast = forecast.predicted_mean
conf_int = forecast.conf_int(alpha=1-cl)  # pass alpha to conf_int for a 90% interval

for date, value in mean_forecast.items():
  print(f"{date}: {value:.2f}")
2026Q1: -2.52
2026Q2: 1.32
2026Q3: 0.30
2026Q4: 2.02
GDP['Date'] = pd.to_datetime(GDP['Date'])
last_date   = GDP['Date'].iloc[-1]
forecast_dates = pd.date_range(start=last_date + pd.offsets.QuarterEnd(), periods=periods, freq='QE')

extended_dates  = [GDP['Date'].iloc[-1]] + list(forecast_dates)
extended_values = [GDP['y'].iloc[-1]] + list(mean_forecast.values)


# Plot
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'], label='Observed')
plt.plot(extended_dates, extended_values, label='Forecast', color='orange')
plt.fill_between(forecast_dates, conf_int.iloc[:, 0], conf_int.iloc[:, 1], color='orange', alpha=0.3, label=' Confidence Interval')
plt.title(f'SARIMA Forecast for GDP Growth  with  a {100*cl}% Confidence Interval')
plt.xlabel('Date')
plt.ylabel('y')
plt.legend()
plt.tight_layout()
plt.show()


Practice

Apply the Box–Jenkins methodology to model and forecast the series Mexican Auto Imports (MAUINSA) obtained from the FRED database from 2010 to 2026. Automotive imports/exports are closely linked to economic activity, exchange rate movements, and the structure of North American supply chains. As a result, forecasting this series can provide valuable insights into future manufacturing demand, trade dynamics, and the broader business cycle in Mexico. Reliable forecasts of auto imports may therefore be useful for firms involved in logistics and manufacturing planning, as well as for policymakers monitoring external sector activity and industrial production.


5.5 SARIMAX

Up to this point, we have modeled time series using ARIMA/SARIMA models, which rely exclusively on the past behavior of the variable of interest. In many applications, however, the evolution of a time series is also driven by external factors. To incorporate this idea, we can now introduce the SARIMAX model, where the ‘X’ stands for exogenous variables.

An exogenous variable \(X_t\) affects \(Y_t\), but is not determined by the model itself. For example: policy interventions, economic indicators, or temporary shocks (e.g., a pandemic dummy). As in multiple regression models, to choose \(X\) we rely on economic theory, reasoning, literature review, institutional knowledge, and data availability.

A SARIMAX \((p,d,q)(P,D,Q)_s\) model is defined as:

\[ Y_t = SARIMA + \beta X_t + \varepsilon_t \]

\(\beta X_t\) captures the effect of external variables; the term shifts the level of the series depending on \(X_t\).

In short, while SARIMA assumes the series evolves only from its own past, SARIMAX allows external variables to influence the series directly.

A SARIMAX model can also be expressed as a regression model where the residuals follow a SARIMA process:

\[ Y_t = \beta X_t + u_t \]

where:

\[ u_t \sim SARIMA(p,d,q)(P,D,Q)_s \]

The regression component captures systematic external effects and the SARIMA component models remaining dependence and structure.

Important Limitation

Forecasts now depend on two elements:

\[ \hat{Y}_{t+h} = SARIMA + \hat{\beta} X_{t+h} \]

Including exogenous variables improves the model only if those variables are known or can be forecasted.

– If \(X_{t+h}\) is known, forecasting is straightforward.

– If \(X_{t+h}\) is unknown, it must be assumed or forecasted.


Consider as an example a simple AR(1) model for GDP growth with a temporary COVID shock, that is, a SARIMAX\((1,0,0)(0,0,0)_4\) model with a dummy variable for the quarters of the COVID pandemic:

\[ Y_t = \phi_1 Y_{t-1} + \beta D + \varepsilon_t \]

with \(D=1\) for COVID quarters and \(D=0\) otherwise; \(\phi_1\) captures the persistence in GDP growth and \(\beta\) is the effect of the shock. \(\phi_0\) was omitted to simplify the notation.

Interpretation

With \(D=1\):

\[ Y_t = \phi_1 Y_{t-1} + \beta + \varepsilon_t \] The growth series is shifted by \(\beta\).

With \(D=0\):

\[ Y_t = \phi_1 Y_{t-1} + \varepsilon_t \]

Standard AR(1) behavior.
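A short simulation with made-up values (\(\phi_1 = 0.5\), \(\beta = -8\), a four-period shock) illustrates the temporary level shift:

import numpy as np

np.random.seed(0)
T = 120
phi1, beta = 0.5, -8.0  # hypothetical persistence and shock effect
D = np.zeros(T)
D[50:54] = 1            # dummy equals 1 during the "crisis" periods

eps = np.random.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t-1] + beta * D[t] + eps[t]

print(y[48:58].round(2))  # growth drops sharply while D = 1, then recovers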

Example

Forecast Mexico’s GDP growth using a SARIMAX model.

import pandas as pd
import numpy  as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from datetime import datetime
import yfinance as yf
from pandas_datareader import data as pdr
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.seasonal import seasonal_decompose, STL
from sklearn.metrics import mean_squared_error
import plotly.graph_objects as go


1. Identification

GDP = pdr.DataReader('NGDPRNSAXDCMXQ', 'fred', start='1993Q1', end = '2026Q1')
GDP = GDP.rename(columns={'NGDPRNSAXDCMXQ': 'Values'})
GDP = GDP.reset_index()
GDP = GDP.rename(columns={'DATE': 'Date'})

GDP['Date'] = pd.to_datetime(GDP['Date'], dayfirst=True) + pd.offsets.QuarterEnd(0)
GDP.set_index(pd.PeriodIndex(GDP['Date'], freq='Q'), inplace=True)

GDP['Values'] = GDP['Values']/1000 # Mil millones
GDP['y'] = np.log(GDP['Values']).diff() * 100
GDP = GDP.dropna()
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'])
plt.title('Quarterly Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()
fig = go.Figure()

# Add observed data
fig.add_trace(go.Scatter(
    x=GDP['Date'],
    y=GDP['y'],
    mode='lines',
    name='Observed',
    hoverinfo='text',
    hovertemplate='<b>Date:</b> %{x|%Y-%m-%d}<br><b>Value:</b> %{y:.2f}<extra></extra>'
))


fig.update_layout(
    title=f'Mexican GDP Growth',
    xaxis_title='Date',
    yaxis_title='Percent',
    hovermode='x unified'
)

fig.show()
stl = STL(GDP['y'], period=4, robust=True)
res = stl.fit()

fig, axes = plt.subplots(4, 1, figsize=(7, 10), sharex=True)

GDP['y'].plot(ax=axes[0], title='Mexico Quarterly GDP Growth', ylabel='Percent')
res.trend.plot(ax=axes[1], title='Trend/Cycle')
res.seasonal.plot(ax=axes[2], title='Seasonality')
res.resid.plot(ax=axes[3], title='Remainder')

for ax in axes:
    ax.grid(False)
    
plt.tight_layout()
plt.show()

The decomposition shows a clear seasonal component with a periodicity of 4 (quarterly), justifying the use of a SARIMA model.

Tests for stationarity

adf_p  = adfuller(GDP['y'])[1]
kpss_p = kpss(GDP['y'], regression='c', nlags='auto')[1]

print(f"ADF p-value: {adf_p:.4f}")
print(f"KPSS p-value: {kpss_p:.4f}")
ADF p-value: 0.0000
KPSS p-value: 0.1000
InterpolationWarning: The test statistic is outside of the range of p-values available in the look-up table. The actual p-value is greater than the p-value returned.

Tests suggest the time series is stationary.

Plot the ACF and PACF

fig, ax = plt.subplots(figsize=(7, 5))
plot_acf(GDP['y'], lags=36, zero=False, ax=ax)  # draw on the created axes to avoid an empty extra figure
plt.title('Autocorrelation Function (ACF) of Mexico GDP Growth')
plt.xlabel('Lag')
plt.ylabel('Autocorrelation')
plt.tight_layout()
plt.show()

fig, ax = plt.subplots(figsize=(7, 5))
plot_pacf(GDP['y'], lags=36, zero=False, method='ols', ax=ax)  # draw on the created axes to avoid an empty extra figure
plt.title('Partial Autocorrelation Function (PACF) of Mexico GDP Growth')
plt.xlabel('Lag')
plt.ylabel('Partial autocorrelation')
plt.tight_layout()
plt.show()

2. Estimation

Define the exogenous factors. Here we will only include a dummy variable that accounts for periods of economic disruptions:

GDP['shock'] = 0

shock_period = [
    ('2008Q2', '2009Q2'),   # Global financial crisis
    ('2020Q1', '2021Q1')    # COVID shock 
]

for start, end in shock_period:
    GDP.loc[start:end, 'shock'] = 1
def fit_sarimax(ts, exog, p, d, q, P, D, Q, s):
    model = SARIMAX(
        ts,
        exog=exog,
        order=(p, d, q),
        seasonal_order=(P, D, Q, s),
        enforce_stationarity=False,
        enforce_invertibility=False
    )
    fitted_model = model.fit(disp=False)
    print(f"\nSARIMAX({p},{d},{q})({P},{D},{Q})[{s}]")
    print(f"AIC: {fitted_model.aic:.2f}")
    print(f"BIC: {fitted_model.bic:.2f}")
    return fitted_model

To maintain comparability, we will only estimate the \(SARIMA(1,0,1)(1,0,1)_4\) specification that offered the lowest information criteria earlier in Section 5.4.

X = GDP[['shock']]

model = fit_sarimax(GDP['y'],X,1,0,1,1,0,1,4)

SARIMAX(1,0,1)(1,0,1)[4]
AIC: 630.29
BIC: 647.26
print(model.summary())
                                     SARIMAX Results                                     
=========================================================================================
Dep. Variable:                                 y   No. Observations:                  131
Model:             SARIMAX(1, 0, 1)x(1, 0, 1, 4)   Log Likelihood                -309.143
Date:                           Thu, 23 Apr 2026   AIC                            630.287
Time:                                   15:25:36   BIC                            647.257
Sample:                               06-30-1993   HQIC                           637.181
                                    - 12-31-2025                                         
Covariance Type:                             opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
shock         -1.4374      0.605     -2.376      0.018      -2.623      -0.252
ar.L1          0.6339      0.105      6.025      0.000       0.428       0.840
ma.L1         -1.0000    479.239     -0.002      0.998    -940.292     938.291
ar.S.L4        0.9789      0.020     49.449      0.000       0.940       1.018
ma.S.L4       -0.8366      0.083    -10.076      0.000      -0.999      -0.674
sigma2         7.4285   3560.545      0.002      0.998   -6971.112    6985.969
===================================================================================
Ljung-Box (L1) (Q):                   0.39   Jarque-Bera (JB):              4920.26
Prob(Q):                              0.53   Prob(JB):                         0.00
Heteroskedasticity (H):               2.67   Skew:                            -3.92
Prob(H) (two-sided):                  0.00   Kurtosis:                        32.72
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

3. Diagnostic Checking

residuals = model.resid

fig = plt.figure(figsize=(7, 7))
gs = fig.add_gridspec(2, 2)

ax1 = fig.add_subplot(gs[0, :]) 
sns.lineplot(x=np.arange(len(residuals)), y=residuals, ax=ax1)
ax1.set_title("Residuals over Time")

ax2 = fig.add_subplot(gs[1, 0])
plot_acf(residuals, lags=30, zero=False, ax=ax2)
ax2.set_title("ACF of Residuals")

ax3 = fig.add_subplot(gs[1, 1])
sns.histplot(residuals, kde=True, ax=ax3)
ax3.set_title("Histogram of Residuals")

plt.tight_layout()
plt.show()

Confirm the residuals are white noise with a Ljung-Box test:

print("\nLjung-Box Test:")
print(acorr_ljungbox(residuals, lags=[8], model_df=5,return_df=True))

Ljung-Box Test:
    lb_stat  lb_pvalue
8  3.300246   0.347608

model_df is the number of estimated parameters: p+q+P+Q+k, where k is the number of exogenous coefficients. The Ljung–Box statistic at lag h is compared to a \(\chi^2\) distribution with \(df = h - model\_df\). For the degrees of freedom to be positive, we must set lags > model_df.
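The reported p-value can be reproduced directly from the \(\chi^2\) distribution:

from scipy.stats import chi2

lb_stat, h, model_df = 3.300246, 8, 5
p_value = chi2.sf(lb_stat, df=h - model_df)  # survival function = 1 - CDF, with df = 8 - 5 = 3
print(f"{p_value:.6f}")                      # 0.347608, matching the table above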

The null hypothesis of no serial correlation can’t be rejected.

4. Forecasting

Assume that no crises are expected for the forecast period, that is, \(D_{T+h}= 0\):

periods = 4
cl      = 0.9   # confidence level

future_X = pd.DataFrame({'shock': [0] * periods})
forecast = model.get_forecast(steps=periods, exog=future_X)

mean_forecast = forecast.predicted_mean
conf_int = forecast.conf_int(alpha=1-cl)  # pass alpha to conf_int for a 90% interval

for date, value in mean_forecast.items():
  print(f"{date}: {value:.2f}")
2026Q1: -2.74
2026Q2: 1.31
2026Q3: 0.32
2026Q4: 2.19
Year = mean_forecast.sum()  # annual growth approximated by the sum of quarterly log-growth rates
print(f"Forecast: {Year:.2f}%")
Forecast: 1.08%
GDP['Date'] = pd.to_datetime(GDP['Date'])
last_date   = GDP['Date'].iloc[-1]
forecast_dates = pd.date_range(start=last_date + pd.offsets.QuarterEnd(), periods=periods, freq='QE')

extended_dates  = [GDP['Date'].iloc[-1]] + list(forecast_dates)
extended_values = [GDP['y'].iloc[-1]] + list(mean_forecast.values)


# Plot
plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['y'], label='Observed')
plt.plot(extended_dates, extended_values, label='Forecast', color='orange')
plt.fill_between(forecast_dates, conf_int.iloc[:, 0], conf_int.iloc[:, 1], color='orange', alpha=0.3, label=' Confidence Interval')
plt.title(f'SARIMAX Forecast for GDP Growth  with  a {100*cl}% Confidence Interval')
plt.xlabel('Date')
plt.ylabel('y')
plt.legend()
plt.tight_layout()
plt.show()

fig = go.Figure()

# Add observed data
fig.add_trace(go.Scatter(
    x=GDP['Date'],
    y=GDP['y'],
    mode='lines',
    name='Observed',
    hoverinfo='text',
    hovertemplate='<b>Date:</b> %{x|%Y-%m-%d}<br><b>Value:</b> %{y:.2f}<extra></extra>'
))

# Add forecast
fig.add_trace(go.Scatter(
    x=extended_dates,
    y=extended_values,
    mode='lines',
    name='Forecast',
    line=dict(color='orange'),
    hoverinfo='text',
    hovertemplate='<b>Date:</b> %{x|%Y-%m-%d}<br><b>Forecast:</b> %{y:.2f}<extra></extra>'
))

# Add confidence interval
fig.add_trace(go.Scatter(
    x=forecast_dates,
    y=conf_int.iloc[:, 0],
    mode='lines',
    line=dict(width=0),
    showlegend=False,
    hoverinfo='none'
))

fig.add_trace(go.Scatter(
    x=forecast_dates,
    y=conf_int.iloc[:, 1],
    mode='lines',
    line=dict(width=0),
    fill='tonexty',
    fillcolor='rgba(255,165,0,0.3)',
    name=f'{100*cl}% CI',
    hoverinfo='none'
))

fig.update_layout(
    title=f'SARIMAX Forecast for GDP Growth.  {100*cl}% Confidence Interval',
    xaxis_title='Date',
    yaxis_title='y',
    hovermode='x unified'
)

fig.show()


Practice

Using monthly data from 2010 to 2025, estimate a SARIMAX model to forecast the growth rate of Mexican auto imports (MAUINSA). The dependent variable must be specified as the growth rate, computed as the first difference of the logarithm multiplied by 100. As an exogenous variable, include the growth rate of the US Industrial Production Index (INDPRO), constructed using the same transformation. Use the SARIMA specification previously identified for this series and extend it to a SARIMAX model by incorporating the exogenous regressor. Then, generate forecasts for the next 12 periods. Since SARIMAX requires future values of the exogenous variable, you must explicitly define the path of the Industrial Production Index during the forecast horizon (e.g., assume it remains constant at its most recent value or follows its recent average growth).