On 04.07.2011 18:07, Allin Cottrell wrote:
> On Mon, 27 Jun 2011, Ignacio Diaz-Emparanza wrote:
>
>> On 26/06/11 19:07, Sven Schreiber wrote:
>>> Hi,
>>>
>>> so I'm trying to make my function package work both with sub-sampled and
>>> full-sampled data. This is not so easy because of gretl's behavior of
>>> using pre-sample data for VAR/VECM estimation, if such data is available.
>>> This means that the two cases (sub-sampled or not) have to be handled
>>> differently. (IMHO this is bad, BTW, so taking up the issue which was
>>> already discussed before, I still vote to always *prevent* gretl from
>>> reaching back to pre-sample values.)
>>
>> For some time I have also needed gretl not to 'remember' the part of
>> the sample that precedes the first subsampled observation.
>>
>> In general I think that this behaviour may be practical, but in some cases
>> we need gretl to forget the previous observations completely. Might it be
>> possible to add an option --forget to the smpl command so that gretl
>> works that way?
>
> Sorry it has taken me a while to get to this. Sven and Ignacio, could
> you give an account of exactly how the current smpl behavior is
> problematic? I'm not saying there's no problem, but I'd like to get
> clear on precisely what it is. A simple example function that exposes
> the problem would be nice.

Not sure if this convinces you, but here's an illustration:
<hansl>
open denmark
function series smplcheck(series y)
ols y 0 y(-1)
series user = $uhat
return user
end function
series u = smplcheck(LRM)
printf "missings: %d\n", sum(missing(u))
# (prints 1)
smpl 1975:1 ;
series u2 = smplcheck(LRM)
printf "missings: %d\n", sum(missing(u2))
# (prints 0)
</hansl>
The "problem" is that basic series properties such as the presence of
missing values currently depend on whether the data are subsampled or
not. This is due to the reach-back behavior of gretl for autoregressive
models. I have learned how to deal with that (see my recently posted
function package), but I still think it is unfortunate that function
authors have to deal with this.
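
For the record, a minimal sketch of one possible workaround inside a function, not necessarily what my package does: shrink the sample by the lag order before estimating, so that the lag is always taken from inside the sample the function was given. The name smplcheck_nopre is made up for this example, and the lag order of 1 is hard-coded as in the illustration above.
<hansl>
open denmark

# hypothetical variant of smplcheck: drop the first in-sample
# observation before estimating, so that y(-1) never has to be
# fetched from pre-sample data
function series smplcheck_nopre (series y)
    smpl +1 ;
    ols y 0 y(-1) --quiet
    series res = $uhat
    return res
end function

# full sample
series u = smplcheck_nopre(LRM)
printf "missings: %d\n", sum(missing(u))

# sub-sample
smpl 1975:1 ;
series u2 = smplcheck_nopre(LRM)
printf "missings: %d\n", sum(missing(u2))
</hansl>
The intent is that both calls report the same number of missing values, since the first observation of whatever sample is active is always given up to the lag. The price is that one usable observation is thrown away in the sub-sampled case.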

> Ignacio's suggestion of an option switch for smpl may be worth pursuing.
> I'm reluctant to switch the behavior unconditionally. The way it works
> at present is IMO easier for the fairly common task of replicating
> estimation of dynamic models. E.g., somebody has estimated a dynamic
> model starting in 1990:1 and you want to replicate it: you just do "smpl
> 1990:1 ;" and estimate. Otherwise you would have to calculate where the
> data range should start in order to get estimation starting in 1990:1.
> That's trivial in some cases, not so trivial in others.

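
To make the replication case concrete, here is a small sketch using the denmark data from the illustration above. The start date 1976:1 and the two lags are arbitrary choices for the example, not taken from any real study.
<hansl>
open denmark

# illustrative choices only: estimation should start in 1976:1,
# with two lags of the dependent variable
smpl 1976:1 ;
ols LRM 0 LRM(-1) LRM(-2) --quiet

# with reach-back, the lags for 1976:1 come from 1975:3 and 1975:4,
# so no in-sample observation should be lost to the lags
printf "in sample: %d, used in estimation: %d\n", $nobs, $T
</hansl>
Without reach-back, getting the same estimation range would require setting the sample start two observations earlier, i.e. at 1975:3.
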
I agree; there seems to be a good case for gretl's behavior for normal
quick-and-dirty usage. My suggestion should probably only refer to the
behavior within functions: I guess this would mean enforcing the
principle that functions only see the copy of the data that they are
given, restricted to the currently selected sample range.
thanks,
sven