Hi,
it seems to me that movavg() (and perhaps other "intertemporal"
functions as well?) could/should also handle missing values in the
middle of a time series, and not only at the start or end, as they
currently seem to do.
Consider this example, where a manual moving average calculation works
just fine (of course producing NAs where required data are missing, but
that's ok), but the built-in movavg() function refuses to do anything:
<hansl>
nulldata 10
setobs 1 1 --time-series
series in = NA # some missings, but...
smpl 6 10 # ... a sub-sample...
series in = index # ...with valid values
smpl --full
# compare built-in and manual mov. avg.
series outma = movavg(in, 4)
series outclone = (in + in(-1) + in(-2) + in(-3)) / 4
d = sum(abs(outma - outclone)) # ok, same results
# now set 1st obs as non-missing, too
in[1] = 0
# again (try to) compare
series outclone2 = (in + in(-1) + in(-2) + in(-3)) / 4
catch series outma2 = movavg(in, 4) # missing value error
if $error
print outclone2 # ok with some missings (of course)
endif
</hansl>
This is with yesterday's snapshot (on Win 8). If you're wondering why
a time series would have gaps in the middle: this happens with
real-world data, for example from Eurostat.
Next, a related but conceptually more difficult issue with movavg():
If I restrict the sample to non-missing values, I can apply movavg(), but
the results are misleading, because gretl "forgets" that this is not
really a contiguous time series anymore. (The fourth non-missing lagged
value for example can be much farther in the past than four periods.)
Intuitively I'd say that such "intertemporal" functions should require
time-series data and not work on "undated" sub-samples, but OTOH I know
that such "undated" sub-samples are often the outcome of sample
restrictions that you cannot avoid and where it would be annoying if
suddenly movavg() & co. didn't work anymore. So no concrete suggestion
here, just food for thought.
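To make this second point concrete, here is a minimal sketch. The
series names x and ma and the window length 2 are only chosen for
illustration, and I have not checked how every gretl version treats
the restricted sample, so take the comments as what I would expect
rather than as confirmed output:
<hansl>
nulldata 10
setobs 1 1 --time-series
series x = index
# create a gap in the middle of the series
x[4] = NA
x[5] = NA
# keep only the non-missing observations (1-3 and 6-10)
smpl ok(x) --restrict
catch series ma = movavg(x, 2)
if $error == 0
    # if obs 3 and 6 are treated as neighbours, ma at x = 6
    # would be (6 + 3) / 2 = 4.5, although they are 3 periods apart
    print x ma --byobs
endif
smpl --full
</hansl>
A true two-period moving average would have to be NA at x = 6, since
the required lag (period 5) is missing.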
thanks,
sven