\documentclass[a4paper,11pt]{article}
\usepackage{bm}
\usepackage{verbatim}

\newcommand{\obsvec}{\ensuremath{\mathbf{y}}}
\newcommand{\obsmat}{\ensuremath{\mathbf{H}}}
\newcommand{\obsdist}{\ensuremath{\mathbf{w}}}
\newcommand{\obsvar}{\ensuremath{\mathbf{R}}}
\newcommand{\statevec}{\ensuremath{\bm\xi}}
\newcommand{\strmat}{\ensuremath{\mathbf{F}}}
\newcommand{\strdist}{\ensuremath{\mathbf{v}}}
\newcommand{\strvar}{\ensuremath{\mathbf{Q}}}
\newcommand{\hatstatevar}{\ensuremath{\mathbf{P}}}
\newcommand{\prederr}{\ensuremath{\mathbf{e}}}
\newcommand{\predvar}{\ensuremath{\bm\Sigma}}
\newcommand{\exog}{\ensuremath{\mathbf{z}}}
\newcommand{\param}{\ensuremath{\bm\theta}}

\title{RFC: Implementation in \texttt{gretl} of the Kalman filter at the user level}
\author{Jack}

\begin{document}
\maketitle

The Kalman filter is one of those facilities that may come in handy in a number of cases: of course, we already use it internally for ARMA estimation, but it would also be a nice tool to have for people who work with DSGE models, or for people who would like to implement ARMA estimation for series with missing values, and so on.

\section*{Notation}

Notation \emph{is} a problem. It seems that in econometrics everyone is happy with $y = X \beta + u$, but we can't, as a community, make up our minds on a standard for state-space models. Harvey (1991), Hamilton (1994), Harvey \& Proietti (2005) and Pollock (1999) all use different conventions. The notation here will be based on Hamilton's, with slight variations.

A state-space model can be written down as
\begin{eqnarray}
  \label{eq:obseq}
  \obsvec_t & = & \obsmat_t \statevec_t + \mathbf{d}_t + \obsdist_t \\
  \label{eq:straneq}
  \statevec_{t} & = & \strmat_{t} \statevec_{t-1} + \mathbf{c}_t + \strdist_t
\end{eqnarray}
The symbol $T$ (not in bold) indicates the sample size. The observables $\obsvec_t$ form an $n$-vector (in many cases $n=1$, but not necessarily), while the unobservable vector of states $\statevec_t$ has $m$ elements.%; usually, if not always, $n < m$.
The covariance matrices of $\obsdist_t$ and $\strdist_t$ are $\obsvar_t$ and $\strvar_t$, respectively. In the special case when $\strmat_t = \strmat$, $\obsmat_t = \obsmat$ and so on, the model will be called \emph{time-invariant}.

We have two options: we may go forward in time and do the \emph{filtering} proper, or, after that, go back and do the \emph{smoothing}. To quote from Pollock (1999), p.~241 (with different notation),
\begin{quote}
  [\ldots] it is assumed that prior information on the previous state vector $\statevec_0$ is available in the form of an unbiased estimate $\hat{\statevec}_0$ which has been drawn from a distribution with a mean of $\statevec_0$ and a dispersion matrix of $\hatstatevar_0$.
\end{quote}
It may happen that $\hatstatevar_{0} \to \infty$, as with diffuse priors; in this case, we need to check Koopman (1997).
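Just to fix ideas on the notation and on the role of the prior (this is merely an illustration, not part of the proposal), take a zero-mean AR(2) process observed without error, $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t$, with $\varepsilon_t$ white noise with variance $\sigma^2_{\varepsilon}$: it fits into (\ref{eq:obseq})--(\ref{eq:straneq}) with $n=1$, $m=2$ and
\[
\statevec_t = \left[\begin{array}{c} y_t \\ y_{t-1} \end{array}\right], \qquad
\strmat = \left[\begin{array}{cc} \phi_1 & \phi_2 \\ 1 & 0 \end{array}\right], \qquad
\strdist_t = \left[\begin{array}{c} \varepsilon_t \\ 0 \end{array}\right], \qquad
\strvar = \left[\begin{array}{cc} \sigma^2_{\varepsilon} & 0 \\ 0 & 0 \end{array}\right],
\]
\[
\obsmat = \left[\begin{array}{cc} 1 & 0 \end{array}\right], \qquad
\obsvar = 0, \qquad
\mathbf{c}_t = \mathbf{d}_t = 0 .
\]
If the state transition equation is stationary, a natural (non-diffuse) choice for the prior is given by the unconditional moments of the state, that is $\hat{\statevec}_0 = 0$ and
\[
\mathrm{vec}(\hatstatevar_0) = \left(\mathbf{I} - \strmat \otimes \strmat\right)^{-1} \mathrm{vec}(\strvar) ;
\]
if it is not, we are back to the diffuse case.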
In the standard case, however, the prediction equations are:
\begin{eqnarray}
  \label{eq:filtstates}
  E_{t-1}(\statevec_t) = \hat{\statevec}_{t|t-1} & = & \strmat_t \hat{\statevec}_{t-1} + \mathbf{c}_t \label{eq:statepred}\\
  V_{t-1}(\statevec_t) = \hatstatevar_{t|t-1} & = & \strmat_t \hatstatevar_{t-1} \strmat'_t + \strvar_t \label{eq:Ppred}\\
  E_{t-1}(\obsvec_t) = \hat{\obsvec}_{t|t-1} & = & \obsmat_t \hat{\statevec}_{t|t-1} + \mathbf{d}_t \\
  \prederr_t & = & \obsvec_t - \hat{\obsvec}_{t|t-1} \\
  V_{t-1}(\prederr_t) = \predvar_t & = & \obsmat_t \hatstatevar_{t|t-1} \obsmat_t' + \obsvar_t
\end{eqnarray}
and the updating equations are:
\begin{eqnarray}
  \hat{\statevec}_t & = & \hat{\statevec}_{t|t-1} + \mathbf{K}_{t} \prederr_t \label{eq:stateupd}\\
  \hatstatevar_{t} & = & \hatstatevar_{t|t-1} - \hatstatevar_{t|t-1} \obsmat_t' \predvar_t^{-1} \obsmat_t \hatstatevar_{t|t-1} \label{eq:Pupd}
\end{eqnarray}
where $\mathbf{K}_{t} = \hatstatevar_{t|t-1} \obsmat_t' \predvar_t^{-1}$ is the Kalman gain. Equations (\ref{eq:statepred})--(\ref{eq:stateupd}) and (\ref{eq:Ppred})--(\ref{eq:Pupd}) can be written as a single set of recursions going directly from $\hat{\statevec}_{t-1}$ to $\hat{\statevec}_t$ and from $\hatstatevar_{t-1}$ to $\hatstatevar_{t}$. Note that in practice we may want to implement something computationally smarter, such as the information filter (some bibliographic search is needed here; I can't remember exactly what it looks like). I can't be bothered right now to write down the smoothing recursions, but that's not terribly important at this stage.

\section*{User syntax}

The idea is to define a command block, like we do with \texttt{mle} or \texttt{gmm}:
\begin{verbatim}
list y = x1 x2        # or possibly: matrix Y = { x1 x2 }

kalman [ --smooth ]
    obsy y
    obsmat H
    obsex d
    obsvar Omega
    strmat Phi
    strex c
    strvar Psi
    inistate x0
    iniprec P0
end kalman
\end{verbatim}
where \texttt{obsy} stands for ``observable $\obsvec$'', \texttt{obsmat} for ``matrix in the observation equation'' (\ref{eq:obseq}), \texttt{obsex} for ``exogenous term in the observation equation'' and \texttt{obsvar} for ``variance of the observation equation''. A similar convention holds for the state transition equation (\ref{eq:straneq}), with \texttt{str} replacing \texttt{obs}. The keywords \texttt{inistate} and \texttt{iniprec} should be self-explanatory.

The elements \texttt{obsex}, \texttt{strex} and \texttt{inistate} need not be compulsory: if absent, they are understood to be 0. More or less the same goes for \texttt{iniprec}, if the initialisation is done via a diffuse prior (the simplest thing would be to use something like $P_0 = 10^{30}\cdot I$, but smarter methods are available -- again, Koopman (1997) should come to the rescue).

For time-invariant models, the arguments to \texttt{obsmat}, \texttt{strmat} and so on must simply be matrices of the appropriate size.
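Just to make the above concrete, here is how a complete time-invariant block could look for the AR(2) example of the previous section. This is only a mock-up: the \texttt{kalman} block is the proposal itself, so nothing of this runs yet, and I'm assuming that hansl's Kronecker operator \texttt{**} and the \texttt{mshape} function can be used to build $\hatstatevar_0$ from the $\mathrm{vec}$ formula given earlier.
\begin{verbatim}
# y is a series holding the observed data
scalar phi1 = 1.2
scalar phi2 = -0.5
scalar s2   = 1.0

matrix H  = { 1, 0 }
matrix F  = { phi1, phi2; 1, 0 }
matrix Q  = { s2, 0; 0, 0 }
matrix R  = { 0 }
matrix x0 = { 0; 0 }
# unconditional state variance: vec(P0) = inv(I - F**F) * vec(Q)
matrix P0 = mshape(inv(I(4) - F**F) * vec(Q), 2, 2)

kalman
    obsy y
    obsmat H
    obsvar R
    strmat F
    strvar Q
    inistate x0
    iniprec P0
end kalman
\end{verbatim}
In practice, of course, $\phi_1$, $\phi_2$ and $\sigma^2_{\varepsilon}$ would not be fixed numbers but parameters handed over by some estimation machinery; more on this below.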
For time-varying models, the best idea I've been able to come up with so far\footnote{Another solution would be to pass matrices with $T$ rows and an appropriate number of columns, where each row is the vectorisation of the corresponding matrix at time $t$; for example, if $\strmat_t$ is time-varying, the argument to \texttt{strmat} would have to be a $T \times m^2$ matrix. It would be gretl's job to reshape the pertinent row of such a matrix, but the more I think about it, the uglier I find it.} is to pass matrices which result from user-defined functions. This approach is very general and elegant, and it opens several interesting possibilities, such as allowing the elements of $\obsmat_t$ to depend on $\prederr_{t-1}$, $\obsvec_{t-1}$, structural parameters (as can be the case with DSGE models) and so on. Hence, you would have something like
\[
\obsmat_t = \obsmat(\exog_t, \param),
\]
where $\obsmat$ corresponds to a user-written function of the kind
\begin{verbatim}
function MyObsMat(list X, matrix theta)
\end{verbatim}
whose content depends on the context (recursive OLS, ARMA models, you name it), to be called as
\begin{verbatim}
list Z = x1 x2 x3
matrix parm = { 0.5, -0.8 }

kalman
    obsmat MyObsMat(Z, parm)
    ...
end kalman
\end{verbatim}

Of course, one also needs to retrieve the interesting quantities. This could be done via the existing accessors: \texttt{\$uhat} for $\prederr_t$ and \texttt{\$sigma} for $\predvar_t$ (other candidates may be \texttt{\$vcv} and \texttt{\$h}, but I personally like \texttt{\$sigma} better). When $n = 1$, these would be series; otherwise \texttt{\$uhat} would be a $T \times n$ matrix (a bit like we do for single-equation versus multiple-equation models), while \texttt{\$sigma} would be a $T \times \frac{n (n+1)}{2}$ matrix, with the $t$-th row holding $\mathrm{vech}(\predvar_t)'$. New accessors should be devoted to returning the states and their variances: suitable candidates could be \texttt{\$states} for $\statevec_t$ and \texttt{\$stvars} for $\mathrm{vech}(\hatstatevar_t)'$.

Moreover, it is probably useful to also return the series
\[
\ell_t = -\frac{1}{2} \left[ n \ln(2 \pi) + \ln\left|\predvar_t\right| + \prederr_t'\predvar_t^{-1} \prederr_t \right] ,
\]
possibly via the accessor \texttt{\$lnl}: the log-likelihood is then simply $\sum_t \ell_t$, so it becomes trivial to insert the whole thing into an \texttt{mle} block and estimate all sorts of models. Note, however, that the \texttt{kalman} block itself would perform no optimisation whatsoever: its job would only be the filtering (or the smoothing, with the \verb|--smooth| option).

\section*{References}

Hamilton, James D. (1994), \emph{Time Series Analysis}, Princeton University Press.
\smallskip

\noindent Harvey, A.C. (1991), \emph{Forecasting, Structural Time Series Models and the Kalman Filter}, Cambridge University Press.
\smallskip

\noindent Harvey, A.C. and T. Proietti (2005), \emph{Readings in Unobserved Components Models}, Oxford University Press.
\smallskip

\noindent Koopman, S.J. (1997), ``Exact Initial Kalman Filtering and Smoothing for Nonstationary Time Series Models'', \emph{Journal of the American Statistical Association}, Vol.~92, No.~440, pp.~1630--1638.
\smallskip

\noindent Pollock, D.S.G. (1999), \emph{A Handbook of Time-Series Analysis, Signal Processing and Dynamics}, Academic Press.

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: