This is a follow-up to Artur's comment/question in
http://lists.wfu.edu/pipermail/gretl-users/2019-February/013703.html
Use of "fcast" with panel data is now enabled (to varying degrees)
for pooled OLS, fixed effects, random effects and dpanel. By "now" I
mean in git and the snapshots.
Preliminary note: if you want to produce out-of-sample predictions
using panel data you should ensure that any required transformations
(lags, differences, etc.) are generated for the entire dataset
before a model is estimated on a sub-sample.
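
To illustrate why the order matters, here is a minimal sketch in plain Python (not gretl script). It assumes, as the note above implies, that a transformation generated while a sub-sample is in force is only filled in for that sub-sample; the helper lag1 and the toy series are hypothetical.

```python
# Illustrative only: plain Python, not gretl code.
# One panel unit observed over periods 1..5 (0-based positions 0..4).
y = [10.0, 12.0, 11.0, 13.0, 14.0]
train_end = 3  # the estimation sample covers positions 0..2


def lag1(series, start, end):
    """First lag, filled in only for positions start..end-1; None elsewhere."""
    out = [None] * len(series)
    for t in range(max(start, 1), end):
        out[t] = series[t - 1]
    return out


# Lag generated after restricting the sample: nothing beyond position 2.
lag_after_subsample = lag1(y, 0, train_end)

# Lag generated over the entire dataset, before any sub-sampling.
lag_full = lag1(y, 0, len(y))

# The first out-of-sample period (position 3) needs its lagged value
# as a regressor; only the full-range lag can supply it.
missing = lag_after_subsample[train_end]
available = lag_full[train_end]
```

In the second case the regressor for the first forecast period is present; in the first it is missing, so no prediction could be formed for that period.
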
Within-sample forecasts
Producing within-sample "forecasts" is unproblematic, if in general
not very interesting since the $yhat accessor for fitted values is
already available. However, there's modest added value from "fcast"
in a couple of respects: you can get a printout of the forecast
evaluation statistics; and for random-effects estimation the values
produced by fcast include an estimate of the individual effect,
which is omitted from $yhat.
Out-of-sample forecasts
In the panel data context "out of sample" could mean either in the
time dimension or in the cross-sectional dimension. As of now,
gretl's automatic --out-of-sample option operates strictly in the
time dimension, but it's possible to forecast out of sample in the
cross-sectional dimension via the stobs/endobs arguments to "fcast"
or by setting the sample range appropriately before calling "fcast".
I'll deal with the --out-of-sample option (time dimension) first.
This is subject to one restriction and one caveat.
Restriction: the model on which forecasts are to be based must be
estimated using all individuals in the dataset (i.e. sub-sampled
only with respect to time). This is not a very severe limitation:
one can always save, then open, a reduced copy of the dataset
(sub-sampled in the cross-sectional dimension) first, if necessary.
Caveat: it's not clear what we should do if the model specification
includes time dummies. At present we impute an out-of-sample time
effect equal to the mean of the in-sample time effects. This is
obviously debatable; feel free to debate it. Perhaps we should ban
such prediction altogether.
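
For concreteness, the imputation rule can be sketched as follows in plain Python. This is not gretl's code: the coefficient values and function names are hypothetical, and whether the omitted baseline period (effect 0) counts toward the mean is an assumption of the sketch.

```python
from statistics import mean

# Hypothetical in-sample time effects (period -> estimated effect).
# Period 1 is taken as the omitted baseline, with effect 0.
time_effects = {1: 0.0, 2: 0.5, 3: -0.25, 4: 0.75}


def imputed_time_effect(effects):
    """Out-of-sample time effect: the mean of the in-sample effects."""
    return mean(effects.values())


def predict(xb, period, effects):
    """Add the estimated time effect if the period is in sample,
    otherwise the imputed (mean) effect."""
    if period in effects:
        return xb + effects[period]
    return xb + imputed_time_effect(effects)


in_sample = predict(2.0, 2, time_effects)      # uses the period-2 effect
out_of_sample = predict(2.0, 5, time_effects)  # uses the mean effect
```

Every out-of-sample period gets the same imputed effect under this rule, which is part of what makes it debatable.
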
In addition, a refinement: for models estimated via "dpanel", the
--out-of-sample option produces a dynamic forecast by default (as
with dynamic estimators on straight time-series data). This can be
overridden by use of the --static option to "fcast".
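
The dynamic/static distinction can be shown with a toy AR(1) recursion in plain Python. This is only a sketch of the general idea, not of what dpanel does internally; the function name, the coefficient and the data are hypothetical.

```python
def forecast_ar1(rho, last_in_sample, actuals, dynamic=True):
    """Forecast y[t] = rho * y[t-1] over the out-of-sample periods.

    dynamic: each forecast feeds into the next step.
    static:  the observed previous value is used at every step.
    """
    forecasts = []
    prev = last_in_sample
    for actual in actuals:
        fc = rho * prev
        forecasts.append(fc)
        # Choose what the next step sees as the lagged value.
        prev = fc if dynamic else actual
    return forecasts


# Hypothetical values: last in-sample observation 8.0, then three
# out-of-sample periods whose actual values happen to be known.
actuals = [6.0, 5.0, 4.0]
dyn = forecast_ar1(0.5, 8.0, actuals, dynamic=True)
sta = forecast_ar1(0.5, 8.0, actuals, dynamic=False)
```

The two agree at the first step and diverge afterwards. A static forecast is only possible where the lagged dependent variable is actually observed.
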
Now to out-of-sample in the cross-sectional dimension. What are we
going to do with individual effects, for the fixed- and
random-effects estimators? For now, we set the predicted individual
effect to the global constant (for fixed effects) or to its expected
value of zero (for random effects). This is obviously another
debatable point, analogous to the out-of-sample time effect case.
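
A sketch of this policy in plain Python, again not gretl's implementation. All names and numbers are hypothetical, and "xb" stands for the slope part of the prediction, excluding any constant.

```python
def predict_fe(xb, unit, unit_intercepts, global_const):
    """Fixed effects: use the unit's own estimated intercept if there is
    one, otherwise fall back on the global constant."""
    return unit_intercepts.get(unit, global_const) + xb


def predict_re(xb, unit, unit_effects, const):
    """Random effects: constant plus the unit's estimated effect, or its
    expected value of zero for a unit not used in estimation."""
    return const + unit_effects.get(unit, 0.0) + xb


# Hypothetical estimates from a model fitted on units 1 and 2 only.
fe_intercepts = {1: 1.5, 2: 2.5}
fe_global_const = 2.0
re_effects = {1: -0.25, 2: 0.25}
re_const = 2.0

fe_known = predict_fe(3.0, 1, fe_intercepts, fe_global_const)
fe_new = predict_fe(3.0, 9, fe_intercepts, fe_global_const)
re_known = predict_re(3.0, 1, re_effects, re_const)
re_new = predict_re(3.0, 9, re_effects, re_const)
```

Unit 9 stands for an individual outside the estimation sample: its prediction carries no unit-specific information under either estimator.
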
Besides debatable policies, there are no doubt some bugs lurking in
the new fcast facilities. Comments welcome.
Allin