On 01.08.2014 at 11:22, Allin Cottrell wrote:
> On Thu, 31 Jul 2014, Sven Schreiber wrote:
>>
>> The second file 'tobeimported' also has panel data identified by ix and
>> tx, but there is an additional ix=33 in there, so it's 6 units by 5 periods.
>>
>> When you append the second file to the first, gretl says 'appended ok',
>> but in fact it does not honor the ix/tx-structure, such that the values
>> of the existing variable 'somedata' are moved around to different
>> observations. This is pretty bad, especially if you do not notice it
>> right away!
>
> You need to use "join" to do this sort of thing; "append" is working as
> expected but it won't/can't do what you want here.

I'm ok with the "won't/can't do" part, but I have some issues with the
"as expected" portion...

> When you define a panel structure via the use of index variables this is
> just a matter of giving gretl construction guidelines. The series used as
> indices have no special status in the resulting panel; individuals are
> numbered from 1 to N and time-series observations from 1 to T, as shown
> when you print series with --byobs.

I can't speak for the average user perhaps, but even with some years of
gretl experience my personal expectation was that these construction
guidelines should be binding (for gretl) also in the future once the
dataset structure was set. And I think the time series analogy is
misleading, because of course gretl doesn't just treat the obs in a time
series workfile as 1 to T; it does pay attention to what the starting and
ending observations are when you do 'append'.

> So when you do the "append" that you mention, gretl adds one more
> individual (number 6) and updates (overwrites) the common series ix with
> the values from the second file. So far as gretl is concerned no data have
> been "moved": all the previously existing observations are still lined up
> with the same individuals, 1 to 5.

This is probably unrelated to the specific panel index variable issue
here, but the fact that 'append' overwrites stuff makes me uneasy in
general, I must say, because I tend to go by the literal meaning of
the word "append".
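
To make the positional behaviour described above concrete, here is a
minimal Python sketch (not gretl code, and not gretl's implementation). It
only mimics the two behaviours as described in this thread: an append that
lines rows up by position and overwrites ix/tx, versus a merge that matches
rows on the (ix, tx) key. The unit ids, the period count and the 'somedata'
values are all invented for illustration.

```python
# Hypothetical data: unit ids, T and the somedata values are assumptions.
T = 2                      # periods per unit (the thread has 5; 2 keeps it short)
old_ix = [11, 22, 44]      # units in the existing dataset
new_ix = [11, 22, 33, 44]  # units in the imported file (33 is the extra one)

# Existing dataset, stacked by unit and then by period.
old_rows = [{"ix": i, "tx": t, "somedata": i * 10 + t}
            for i in old_ix for t in range(1, T + 1)]

# Index pairs carried by the imported file, in its own row order.
new_keys = [(i, t) for i in new_ix for t in range(1, T + 1)]


def positional_append(rows, keys):
    """Overwrite ix/tx by row position; existing somedata stays where it was."""
    out = []
    for pos, (i, t) in enumerate(keys):
        value = rows[pos]["somedata"] if pos < len(rows) else None
        out.append({"ix": i, "tx": t, "somedata": value})
    return out


def keyed_merge(rows, keys):
    """Match existing somedata to the imported rows on the (ix, tx) key."""
    lookup = {(r["ix"], r["tx"]): r["somedata"] for r in rows}
    return [{"ix": i, "tx": t, "somedata": lookup.get((i, t))}
            for i, t in keys]


if __name__ == "__main__":
    for label, fn in (("positional", positional_append), ("keyed", keyed_merge)):
        print(label)
        for row in fn(old_rows, new_keys):
            print("  ", row)
```

In the positional version the rows now labelled ix=33 carry the values that
originally belonged to ix=44, which is the kind of silent misalignment I ran
into; in the keyed version ix=33 is simply missing for 'somedata' and ix=44
keeps its own values.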

> If we were to give the index variables used in constructing a panel a
> special status thereafter, we'd have to work out a policy regarding those
> variables: do they become immutable? if not, what happens if you redefine
> one or both of them?

Yes I understand those problems, but currently gretl kind of requires
the user to know its internal storage structure (stacked cross-section
vs. stacked time-series, treatment of unbalanced data) even though at
first glance it appears to offer the user a way around those internals
via panel index variables.
For me this resulted in quite a mess without any warning. Perhaps it
would be better if gretl disabled any 'append'-ing or similar things
whenever a dataset is defined via index variables. Instead it could say
"please use the 'join' command explained in the manual for these kinds
of operations", basically like you did in your message here. Otherwise,
how is the user reasonably supposed to know?
thanks,
sven