```python
import numpy as np
from arch import arch_model
```
In this post we discuss a fundamental model in financial time series modelling: the ARCH model.
The model was introduced by Engle in the 1980s and later earned him the Nobel Memorial Prize in Economic Sciences. Nowadays ARCH is still a go-to model, or at least a starting point, for time series with time-varying volatility, a characteristic shared by many financial return series.
ARCH model
Throughout, we denote a time series by \((Y_t)\). The ARCH model is a specification of the conditional variance of \(Y_t\) given its history \(\mathcal F_{t-1}\) by the following equation: \[\mathrm{Var}[Y_t \mid \mathcal F_{t-1}] = \omega + \alpha Y_{t-1}^2.\]
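Before fitting anything, it can help to see the model generatively. Here is a minimal simulation sketch of an ARCH(1) path under Gaussian noise, with purely illustrative parameter values (not part of the `arch` library):

```python
# simulate an ARCH(1) path: Var[Y_t | F_{t-1}] = omega + alpha * Y_{t-1}^2
# omega and alpha are illustrative, not fitted values
def simulate_arch1(omega=1.0, alpha=0.5, n=500, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)              # iid standard normal noise
    y = np.zeros(n)
    for t in range(1, n):
        var_t = omega + alpha * y[t-1]**2   # conditional variance given the past
        y[t] = np.sqrt(var_t) * w[t]
    return y

y_demo = simulate_arch1()
```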
In Python, it is straightforward to define an ARCH model with the `arch` library.
```python
rng = np.random.default_rng(41)
y = rng.standard_normal(size=(500,)) * np.arange(500) * 0.2
mod = arch_model(y, mean="zero", q=0)
res = mod.fit(disp="off")
fcst = res.forecast(horizon=50, method="bootstrap", simulations=3)
```
Above we met the main objects of the library:

- `arch_model` is a convenience model constructor function with which one can specify the mean process, the volatility process (aka conditional standard deviation), and the noise distribution (adhering to `scipy.stats`).
- `mod.fit()` returns a result container: it holds the residuals, standardised residuals, and conditional volatility of the fitted model.
- `res.forecast()` returns a forecast container: it holds the conditional variance given the present value, simulations given the current value, etc.
```python
type(mod), type(res), type(fcst)
```

```
(arch.univariate.mean.ZeroMean,
 arch.univariate.base.ARCHModelResult,
 arch.univariate.base.ARCHModelForecast)
```
```python
res.summary()
```

| | | | |
|---|---|---|---|
| Dep. Variable: | y | R-squared: | 0.000 |
| Mean Model: | Zero Mean | Adj. R-squared: | 0.002 |
| Vol Model: | ARCH | Log-Likelihood: | -2719.79 |
| Distribution: | Normal | AIC: | 5443.58 |
| Method: | Maximum Likelihood | BIC: | 5452.01 |
| No. Observations: | 500 | | |
| Date: | Sat, Oct 19 2024 | Df Residuals: | 500 |
| Time: | 18:28:40 | Df Model: | 0 |

| | coef | std err | t | P>\|t\| | 95.0% Conf. Int. |
|---|---|---|---|---|---|
| omega | 2082.8286 | 396.923 | 5.247 | 1.542e-07 | [1.305e+03, 2.861e+03] |
| alpha[1] | 0.4962 | 0.174 | 2.850 | 4.377e-03 | [0.155, 0.838] |

Covariance estimator: robust
```python
import polars as pl
import hvplot.polars

hvplot.extension('matplotlib')

# 3 forecasted paths with horizon 50
pl.DataFrame(fcst.simulations.values.squeeze().T).hvplot()
```
Mean model and noise distribution
By default the noise distribution is standard normal. Another popular choice is the Student's t distribution, where the degrees of freedom \(\nu\) is a model parameter.
One can also change the mean to be non-zero. The model spec is as follows \[
\begin{align*}
Y_t &= \mu + \sigma_t W_t \\
\sigma_t^2 &= \omega+ \alpha (Y_{t-1}-\mu)^2
\end{align*}
\] where \(W_t\) is iid, say with a t distribution.
In `arch`, it suffices to set `mean="constant"` and `dist="t"` in the constructor.
=y+30, mean="constant", q=0, dist='t').fit(disp="off") arch_model(y
```
                  Constant Mean - ARCH Model Results
====================================================================================
Dep. Variable:                        y   R-squared:                 0.000
Mean Model:               Constant Mean   Adj. R-squared:            0.000
Vol Model:                         ARCH   Log-Likelihood:         -2679.58
Distribution:  Standardized Student's t   AIC:                     5367.17
Method:              Maximum Likelihood   BIC:                     5384.03
                                          No. Observations:            500
Date:                  Sat, Oct 19 2024   Df Residuals:                499
Time:                          18:28:43   Df Model:                      1
                               Mean Model
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
mu            27.9482      1.487     18.797  7.917e-79  [ 25.034, 30.862]
                            Volatility Model
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega       1557.7431    401.680      3.878  1.053e-04  [7.705e+02,2.345e+03]
alpha[1]       1.0000      0.263      3.798  1.460e-04      [ 0.484,  1.516]
                              Distribution
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
nu             3.3891      0.357      9.480  2.536e-21  [ 2.688,  4.090]
========================================================================

Covariance estimator: robust

ARCHModelResult, id: 0x7978505fd3a0
```
Why stop here? We can also allow the conditional mean of \(Y_t\) given the history to vary with time; the simplest choice is to assume AR dynamics for the conditional mean: \[
\begin{align*}
Y_t &= \mu + \phi (Y_{t-1} - \mu) + \sigma_t W_t \\
\sigma_t^2 &= \omega + \alpha (Y_{t-1}-\mu)^2
\end{align*}
\] where, as before, \(W_t\) is iid normal by default.
In `arch`, just set `mean="AR"` and `lags=[1]`, where 1 is as in AR(1) for the conditional mean.
=y+30, mean="AR", q=0, lags=[1]).fit(disp="off") arch_model(y
```
                       AR - ARCH Model Results
==============================================================================
Dep. Variable:                      y   R-squared:                      -0.009
Mean Model:                        AR   Adj. R-squared:                 -0.011
Vol Model:                       ARCH   Log-Likelihood:               -2713.60
Distribution:                  Normal   AIC:                           5435.21
Method:            Maximum Likelihood   BIC:                           5452.06
                                        No. Observations:                  499
Date:                Sat, Oct 19 2024   Df Residuals:                      497
Time:                        18:28:43   Df Model:                            2
                               Mean Model
========================================================================
                 coef    std err          t      P>|t|  95.0% Conf. Int.
------------------------------------------------------------------------
Const         27.0714      2.670     10.138  3.746e-24  [ 21.838, 32.305]
y[1]          -0.0280  8.163e-02     -0.343      0.731  [ -0.188,  0.132]
                            Volatility Model
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega       1947.6921    445.220      4.375  1.216e-05  [1.075e+03,2.820e+03]
alpha[1]       0.5807      0.247      2.351  1.873e-02     [9.656e-02, 1.065]
============================================================================

Covariance estimator: robust

ARCHModelResult, id: 0x79785063a780
```
As before, we can obtain the result container and the forecast container as follows.
```python
res = arch_model(y+30, mean="AR", q=0, lags=[1]).fit(disp="off")
fcst = res.forecast(horizon=5)
```
Notice the difference between `resid` and `std_resid`, as well as the difference between `variance` and `residual_variance`.
```python
res.resid[1:].std(), res.std_resid[1:].std()
```

```
(np.float64(58.66084093396231), np.float64(0.9985090629605182))
```
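The standardised residuals are simply the residuals divided by the fitted conditional volatility, which explains the two scales above. A minimal check, assuming the result object exposes the `conditional_volatility` attribute (as `arch`'s `ARCHModelResult` does):

```python
# std_resid equals resid scaled by the fitted conditional volatility;
# equal_nan=True because the AR(1) mean model leaves the first entry as NaN
np.allclose(res.std_resid, res.resid / res.conditional_volatility, equal_nan=True)
```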
```python
fcst.variance
```

| | h.1 | h.2 | h.3 | h.4 | h.5 |
|---|---|---|---|---|---|
| 499 | 23353.942452 | 15527.507157 | 10965.954024 | 8317.077882 | 6778.893468 |
```python
fcst.residual_variance
```

| | h.1 | h.2 | h.3 | h.4 | h.5 |
|---|---|---|---|---|---|
| 499 | 23353.942452 | 15509.168033 | 10953.760756 | 8308.466659 | 6772.362326 |

The two coincide at horizon 1 but differ afterwards: `residual_variance` forecasts the conditional variance of the innovations \(\sigma_t W_t\), whereas `variance` also propagates the AR mean dynamics (here the gap is of order \(\phi^2\)).
Generalized ARCH (GARCH)
Another generalization of the ARCH model focuses on the conditional variance specification itself, rather than the conditional mean or the noise distribution. The most prominent example along this line of thinking is the GARCH family, which in its simplest form (zero mean model and lag 1) is specified as follows: \[
\begin{align*}
Y_t &= \sigma_t W_t \\
\sigma_{t}^2 &= \omega + \alpha Y_{t-1}^2 + \beta \sigma_{t-1}^2
\end{align*}
\] The difference lies in the additional \(\beta\) parameter, which makes the conditional variance depend on the conditional variance at the previous time step.
GARCH generalizes ARCH in a way similar to how ARMA generalizes AR.
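In `arch`, GARCH(1,1) is in fact the default volatility specification, so fitting one requires no `q=0` override; a minimal sketch (the `garch_res` name is ours):

```python
# GARCH(1,1) is arch_model's default volatility model (p=1, q=1)
garch_res = arch_model(y, mean="zero", p=1, q=1).fit(disp="off")
garch_res.params  # omega, alpha[1], beta[1]
```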
Introducing archette
`archette` is a simplified implementation of the zero mean GARCH model with iid Gaussian noise.
The existing `arch` library is amazing and covers many popular GARCH-type models. `archette` is not meant to replace `arch` in any way; after all, it supports only one very particular model (undoubtedly a very important one).
The goal of `archette` is to strip away the complexity of `arch` (e.g. all the class hierarchy) and highlight only the essence of the GARCH model. It is meant to be super hackable, so that one can build variants of GARCH models by modifying parts of the source code of `archette`.
There is only one class, `GARCHETTE`, in `archette`, which implements the methods

- `fit`
- `nll`
- `forecast_vs`
- `simulate`

and the properties

- `vs`
- `std_resids`
- `params`
The API is simple: state is maintained in instances of the main class rather than delegated to separate container classes, hopefully reducing some mental overhead for users.
Here is how one can fit a model and inspect the fitted parameters and the corresponding negative log likelihood (see how to install: https://xiaochuany.github.io/archette/ ).
```python
from archette import GARCHETTE

rng = np.random.default_rng()
y = rng.standard_normal(500) * np.arange(500)
mod = GARCHETTE()
mod.fit(y)
```

```
<archette.core.GARCHETTE at 0x79785c680590>
```

```python
mod.params
```

```
array([124.94299377, 0.16898145, 0.83101855])
```

```python
mod.nll(mod.params)
```

```
np.float64(5832.3224873676845)
```
Model fitting
Under the hood, `fit` calls the `minimize` function from the `scipy.optimize` module to minimize the negative log likelihood (`nll`).
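If you have not used `scipy.optimize.minimize` before, here is a toy sketch (unrelated to GARCH) of the calling pattern relied on below: pass an objective and an initial guess, then read the optimizer's `.x` attribute:

```python
from scipy.optimize import minimize

# toy problem: minimize (x - 2)^2 starting from x = 0; result is close to [2.]
minimize(lambda x: (x[0] - 2)**2, x0=[0.0]).x
```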
Let \(f\) denote the joint density of the time series \(y\). By successive conditioning, one can write the nll as \[ -\log f(y) = - \log f_\theta(y_0) - \sum_{t=1}^{T} \log f_{\theta}(y_t | y_{[0,t-1]}) \] where \(y_{[0,t-1]} := (y_0, ..., y_{t-1})\) and \(\theta\) is the parameter of the model.
Since the marginal distribution at time 0 is unknown, one typically removes the first term and minimizes \(-\log f(y|y_0)\) instead, where \[ -\log f(y|y_0) = - \sum_{t=1}^{T} \log f_{\theta}(y_t | y_{[0,t-1]}) \] By the Gaussian noise assumption and lag 1, we have, up to an unimportant additive constant, \[ - \log f_{\theta}(y_t | y_{[0,t-1]}) = - \log f_{\theta}(y_t | y_{t-1}) = \frac{1}{2}\Big(\log v_t + \frac{y_t^2}{v_t}\Big) \] where \(v_t\) is the conditional variance of \(y_t\) given \(y_{t-1}\) (denoted \(\sigma_t^2\) before).
Notice that \(v_0\) is not observable, hence one needs to set a sensible initial value, e.g. the sample variance of \(y\).
We define two helper functions: one computes the sequence \(v\), the other the nll.
```python
def _get_vs(y, params, sig2_init):
    """
    compute conditional variance by recursion of the GARCH specification
    """
    om, al, be = params
    vs = np.zeros(y.size)
    vs[0] = sig2_init
    for i in range(1, y.size):
        vs[i] = om + al * y[i-1]**2 + be * vs[i-1]
    return vs


def _nllgauss(y, params, sig2_init):
    """
    compute nll of GARCH(1,1) with zero conditional mean and gaussian noise
    """
    vs = _get_vs(y, params, sig2_init)
    nll = np.log(vs) + y**2 / vs
    return nll.sum()
```
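Before wiring these helpers into a class, a quick smoke test at arbitrary, purely illustrative parameter values (the `params_demo` name is ours):

```python
# evaluate the helpers at illustrative (om, al, be) = (1.0, 0.2, 0.7)
params_demo = (1.0, 0.2, 0.7)
_get_vs(y, params_demo, y.var())[:3], _nllgauss(y, params_demo, y.var())
```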
Now we are in a good place to start writing the main class.
```python
from scipy.optimize import minimize

class GARCHETTE:
    """simple garch model"""

    def __init__(self):
        self._y = None
        self.params = None
        self._is_fit = False
        self._v_init = None

    def fit(self, y: np.ndarray):
        """fit y to garch"""
        self._y = y
        self._v_init = y.var()
        self._is_fit = True
        func = self.nll
        self.params = minimize(
            func,
            x0=(self._v_init * 0.4, 0.3, 0.3),
            bounds=[(0, None), (0, None), (0, None)],
            constraints={
                "type": "ineq",
                "fun": lambda x: 1 - x[1] - x[2]
            }
        ).x
        return self

    def nll(self, params) -> float:
        """negative log likelihood of the series at the given params"""
        assert self._is_fit
        return _nllgauss(self._y, params, self._v_init)

    @property
    def vs(self):
        return _get_vs(self._y, self.params, self._v_init)
```
We see that the method `nll` and the property `vs` are just wrappers around the helper functions we just defined. The `fit` method, though, needs some attention. For the minimization of the nll, apart from the objective function `nll` and the initial value `x0`, it is also necessary to set bounds and constraints on the variables of `nll`. We require the parameters to be non-negative because the sequence `vs` must be non-negative (variances are squares). Also we need
\[
\alpha + \beta < 1
\] because it is the necessary and sufficient condition for the stationarity of the GARCH(1,1) model. These constraints are essential for the optimization to work. Finally, we make an educated guess for the initial value: we set \(\alpha=\beta=0.3\), away from the boundary of the constrained parameter space. The initial value for \(\omega\) is set based on an important fact about the long term conditional variance: for any \(t\), as \(T\to\infty\), \[
\mathbb E[Y_{t+T}^2 \mid \mathcal F_t] \to \frac{\omega}{1-\alpha-\beta}.
\] If we believe that the sample variance of `y` approximates the long term conditional variance, then initializing `om = (1-0.3-0.3) * y.var()` is indeed sensible.
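As a quick numerical sanity check of this limit (with illustrative, not fitted, parameters), iterate the map \(v \mapsto \omega + (\alpha+\beta)v\), which is exactly the forecast recursion derived below, and compare with \(\omega/(1-\alpha-\beta)\):

```python
# the iteration converges to the long-run variance omega / (1 - alpha - beta)
om, al, be = 1.0, 0.2, 0.7   # illustrative values satisfying al + be < 1
v = 123.0                    # arbitrary starting variance
for _ in range(500):
    v = om + (al + be) * v
v, om / (1 - al - be)        # both approximately 10.0
```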
Now it remains to forecast conditional variances (or equivalently conditional volatilities) and simulate paths. For the variance forecast, the recursion for \(h\ge 2\) is simpler than the GARCH specification itself: \[
\mathbb E[Y^2_{t+h} \mid \mathcal F_t] = \omega + \alpha\, \mathbb E[Y^2_{t+h-1} \mid \mathcal F_{t}] + \beta\, \mathbb E[\sigma^2_{t+h-1} \mid \mathcal F_{t}] = \omega + (\alpha +\beta)\, \mathbb E[Y^2_{t+h-1} \mid \mathcal F_{t}]
\] This leads to
```python
def _make_fcst_vs(params, last_v, last_y, horizon):
    om, al, be = params
    fcst_vs = np.empty(horizon)
    fcst_vs[0] = om + al * last_y**2 + be * last_v
    for i in range(1, horizon):
        fcst_vs[i] = om + (al + be) * fcst_vs[i-1]
    return fcst_vs
```
The simulation helper function is rather straightforward. We leave the possibility of specifying the noises `ws` to the user.
```python
def _simulate(params, last_y, last_v, horizon, n_rep, seed, ws=None):
    om, al, be = params
    if ws is None:
        np.random.seed(seed)
        ws = np.random.randn(n_rep, horizon)
    y = np.empty((n_rep, horizon+1))
    v = np.empty((n_rep, horizon+1))
    y[:, 0] = last_y
    v[:, 0] = last_v
    for i in range(1, horizon+1):
        v[:, i] = om + al * y[:, i-1]**2 + be * v[:, i-1]
        y[:, i] = np.sqrt(v[:, i]) * ws[:, i-1]
    return y[:, 1:]
```
Combining everything, here is all there is in the `archette` library.
```python
from typing import Literal

class GARCHETTE:
    """simple garch model"""

    def __init__(self):
        self._y = None
        self.params = None
        self._is_fit = False
        self._v_init = None

    def fit(self, y: np.ndarray):
        """fit y to garch"""
        self._y = y
        self._v_init = y.var()
        self._is_fit = True
        func = self.nll
        self.params = minimize(
            func,
            x0=(self._v_init * 0.4, 0.3, 0.3),
            bounds=[(0, None), (0, None), (0, None)],
            constraints={
                "type": "ineq",
                "fun": lambda x: 1 - x[1] - x[2]
            }
        ).x
        return self

    def nll(self, params) -> float:
        """negative log likelihood of the series at the given params"""
        assert self._is_fit
        return _nllgauss(self._y, params, self._v_init)

    @property
    def vs(self):
        assert self._is_fit
        return _get_vs(self._y, self.params, self._v_init)

    @property
    def std_resids(self):
        assert self._is_fit
        return self._y / np.sqrt(self.vs)

    def forecast_vs(self,
                    horizon: int
                    ) -> np.ndarray:
        """forecast conditional variance in the horizon"""
        assert self._is_fit
        return _make_fcst_vs(self.params, self.vs[-1], self._y[-1], horizon)

    def simulate(self,
                 horizon: int,  # path length
                 method: Literal["bootstrap", "simulate"] = "simulate",  # "bootstrap" resamples past std_resids; "simulate" draws gaussian noise
                 n_rep=1_000,   # number of repetitions
                 seed=42) -> np.ndarray:
        assert self._is_fit
        if method == "bootstrap":
            np.random.seed(seed)
            ws = np.random.choice(self.std_resids, size=(n_rep, horizon), replace=True)
        else:
            ws = None
        return _simulate(self.params, self._y[-1], self.vs[-1], horizon, n_rep, seed, ws)
```
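To round off, a small usage sketch of the remaining methods on the model `mod` fitted earlier (exact numbers vary since that generator was unseeded):

```python
# 5-step-ahead conditional variance forecast
mod.forecast_vs(5)

# 3 bootstrap paths of length 50, resampling past standardised residuals
paths = mod.simulate(horizon=50, method="bootstrap", n_rep=3)
paths.shape   # (3, 50)
```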
Sanity check
We define a GARCH model with both `arch` and `archette`. We see that when the model is mis-specified, the two maximum likelihood estimates come out different.
```python
mod_arch = arch_model(y, mean='zero', dist='normal', rescale=False).fit(disp='off')
mod_archette = GARCHETTE().fit(y)
```
```python
mod_arch.params
```

```
omega       30.588087
alpha[1]     0.073224
beta[1]      0.926776
Name: params, dtype: float64
```
```python
mod_archette.params
```

```
array([124.94299377, 0.16898145, 0.83101855])
```
What if the model is well specified?
```python
y_sim = mod_archette.simulate(horizon=500, n_rep=1, seed=89).squeeze()
arch_model(y_sim, mean='zero', dist='normal', rescale=False).fit(disp='off').params
```
```
omega       622.399179
alpha[1]      0.172894
beta[1]       0.795693
Name: params, dtype: float64
```
```python
GARCHETTE().fit(y_sim).params
```

```
array([5.98300620e+02, 1.66863034e-01, 8.02092039e-01])
```
The estimated parameters from `arch` and `archette` are quite close. Hooray!