Hello all,
Ignacio, Jack and I have recently been discussing ARIMA
estimation in gretl. The outcome is that various problems have
been fixed, but also that I've introduced a backward-incompatible
change in CVS which I'd like to explain, and on which I'd
appreciate any comments. (And I'd particularly appreciate it if
Ignacio or Jack could correct anything erroneous below!)
The backward-incompatible change concerns ARIMAX models -- that
is, ARIMA models including exogenous regressors -- but we can get
at it by starting from the most basic ARMA specification:
\phi(L) y_t = \theta(L) \epsilon_t
where \phi(L) and \theta(L) are polynomials in the lag operator,
L, and \epsilon_t is a white-noise process.
The first question is: how do we handle the situation where the
level of y_t is a function of some exogenous variable(s), X_t?
There are two options:
Model A:
\phi(L) y_t = X_t\beta + \theta(L) \epsilon_t
Model B:
\phi(L) (y_t - X_t\beta) = \theta(L) \epsilon_t
Model A might be described as an ARMA model augmented with
exogenous variables, while B might be described as a regression
model with ARMA errors. This choice is discussed in the Gretl
User's Guide. (Short story: we do Model B if exact ML is chosen
and Model A if conditional ML is chosen, except that we always do
Model B if estimation via X-12-ARIMA is selected.)
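
To make the A/B distinction concrete, here is a minimal sketch in
Python (not gretl code, and not how gretl computes anything). It
assumes the simplest possible case, which is my choice for
illustration only: \phi(L) = 1 - phi*L, \theta(L) = 1, a single
regressor, and a zero pre-sample value. The function and variable
names are hypothetical. With X_t held constant at 1 and the noise
switched off, Model A settles at beta/(1 - phi), while Model B sits
at beta, so the meaning of \beta differs between the two.

```python
import numpy as np

def simulate_model_a(x, beta, phi, eps):
    """Model A with AR(1): y_t = phi*y_{t-1} + x_t*beta + eps_t."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        y_prev = y[t - 1] if t > 0 else 0.0  # assumed zero pre-sample value
        y[t] = phi * y_prev + beta * x[t] + eps[t]
    return y

def simulate_model_b(x, beta, phi, eps):
    """Model B with AR(1): u_t = phi*u_{t-1} + eps_t, y_t = x_t*beta + u_t."""
    u = np.zeros(len(x))
    for t in range(len(x)):
        u_prev = u[t - 1] if t > 0 else 0.0  # assumed zero pre-sample value
        u[t] = phi * u_prev + eps[t]
    return x * beta + u

n = 200
x = np.ones(n)        # regressor held constant at 1
eps = np.zeros(n)     # noise switched off to expose the deterministic part
beta, phi = 2.0, 0.5

y_a = simulate_model_a(x, beta, phi, eps)
y_b = simulate_model_b(x, beta, phi, eps)

# Model A: the effect of x is propagated through the AR dynamics.
print("Model A long-run level:", y_a[-1], "vs beta/(1-phi) =", beta / (1 - phi))
# Model B: x shifts the level of y directly; the AR part only shapes the error.
print("Model B level:", y_b[-1], "vs beta =", beta)
```

In other words, under A the coefficient is a short-run impact that
the AR part then amplifies, while under B it is a plain regression
coefficient.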
The new issue, however, is how we handle a non-zero integration
order (the 'I' in ARIMA). Take the simplest case: non-seasonal
order 1, seasonal order 0. If we're doing exact ML, and hence
have effectively chosen B above, the options are:
Model C:
\phi(L) ((1-L) y_t - X_t\beta) = \theta(L) \epsilon_t
Model D:
\phi(L) (1-L) (y_t - X_t\beta) = \theta(L) \epsilon_t
In practical terms the issue between C and D is, given that we're
differencing y_t, do we also difference X_t? (No = C, Yes = D.)
Up till now we have been doing C, and there's nothing inherently
wrong with that, but for the sake of compatibility with most other
ARIMA software we should be doing D. And in current CVS we do D,
unless the new option -y (or --y-diff-only) is given in
conjunction with the arima command, in which case C is restored.
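
The difference between C and D can be seen by computing the series
that gets handed to the ARMA part in each case. The following is a
small Python sketch under my own assumptions (integration order 1,
no seasonal differencing, \beta treated as known, hypothetical
function names); it is not gretl's implementation. Under D the
series equals (1-L) y_t minus the differenced regressors times
\beta, whereas under C the regressors enter in levels.

```python
import numpy as np

def arma_input_c(y, X, beta):
    """Model C: (1-L) y_t - X_t beta, for t = 2..T (X stays in levels)."""
    return np.diff(y) - X[1:] @ beta

def arma_input_d(y, X, beta):
    """Model D: (1-L)(y_t - X_t beta), for t = 2..T."""
    return np.diff(y - X @ beta)

rng = np.random.default_rng(0)
n = 50
X = np.arange(1.0, n + 1).reshape(-1, 1)  # one regressor: a linear trend
beta = np.array([0.5])
e = rng.normal(size=n)

# Illustrative data generated so that y_t - X_t beta is a random walk,
# i.e. the data follow Model D with phi(L) = theta(L) = 1.
y = X @ beta + np.cumsum(e)

u_c = arma_input_c(y, X, beta)
u_d = arma_input_d(y, X, beta)

# D is the same as differencing both y and X before forming the residual.
u_d_alt = np.diff(y) - np.diff(X, axis=0) @ beta

print("D matches 'difference y and X':", np.allclose(u_d, u_d_alt))
print("D recovers the innovations:    ", np.allclose(u_d, e[1:]))
print("C gives the same series as D:  ", np.allclose(u_c, u_d))
```

With a trending regressor, as here, the two series differ, so the
same data can give quite different estimates of \beta depending on
which convention the software follows.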
I don't know how many gretl users are estimating ARIMAX models,
but obviously if anyone is doing so, we'd like to hear what they
have to say.
Allin.