On Sat, 2 Aug 2014, Sven Schreiber wrote:
Am 01.08.2014 um 11:22 schrieb Allin Cottrell:
> On Thu, 31 Jul 2014, Sven Schreiber wrote:
>> The second file 'tobeimported' also has panel data identified by ix and
>> tx, but there is an additional ix=33 in there, so it's 6 units by 5 periods.
>> When you append the second file to the first, gretl says 'appended ok',
>> but in fact it does not honor the ix/tx-structure, such that the values
>> of the existing variable 'somedata' are moved around to different
>> observations. This is pretty bad, especially if you do not notice it
>> right away!
> You need to use "join" to do this sort of thing; "append" is working as
> expected but it won't/can't do what you want here.
I'm ok with the "won't/can't do" part, but I have some issues with
the "as expected" portion...
OK, general point here. Certainly what we do should be better documented,
and moreover what we do should perhaps be different in some ways. I think
that a systematic revamp of our handling of panel data would be a good
candidate for a gretl 2.0 target.
> When you define a panel structure via the use of index variables that's
> just a matter of giving gretl construction guidelines. The series used as
> indices have no special status in the resulting panel; individuals are
> numbered from 1 to N and time-series observations from 1 to T, as shown
> when you print series with --byobs.
I can't speak for the average user perhaps, but even with some years of
gretl experience my personal expectation was that these construction
guidelines should be binding (for gretl) also in the future once the
dataset structure was set. And I think the time series analogy is
misleading, because of course gretl doesn't just treat the obs in a time
series workfile as 1 to T, but it does watch out for what the starting and
ending observations are when you do 'append'.
There's a disanalogy between "panel time" which always just goes from 1 to
T, and "pure time" in a time-series dataset, yes. The "setobs" command
has an option (a newish one) for setting panel time (e.g. marking the time
dimension as quarterly with some definite starting period) but right now
that is only used in drawing panel plots.
> So when you do the "append" that you mention, gretl adds one more
> individual (number 6) and updates (overwrites) the common series ix with
> the values from the second file. So far as gretl is concerned no data have
> been "moved": all the previously existing observations are still lined up
> with the same individuals, 1 to 5.
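
To make the two viewpoints concrete, here is a minimal sketch in plain
Python (not gretl code) of positional alignment, as described in the
quoted paragraph, versus matching on the (ix, tx) keys. All data values
and helper names are hypothetical, and the example uses 3 units by 2
periods rather than the 5 by 5 case from the report, to keep it short.

```python
# Toy model only: hypothetical data, not taken from the files in the report.

# Dataset already open: 3 units x 2 periods, identified by ix and tx.
old_ix = [11, 11, 22, 22, 44, 44]
old_tx = [1, 2, 1, 2, 1, 2]
somedata = [1.1, 1.2, 2.1, 2.2, 4.1, 4.2]

# File to be imported: same layout plus an additional unit ix=33.
new_ix = [11, 11, 22, 22, 33, 33, 44, 44]
new_tx = [1, 2, 1, 2, 1, 2, 1, 2]


def positional_append(old_values, new_len):
    """Keep existing values in their rows and pad the added rows with None."""
    return old_values + [None] * (new_len - len(old_values))


def keyed_match(old_keys, old_values, new_keys):
    """Place each existing value in the row whose (ix, tx) key matches."""
    lookup = dict(zip(old_keys, old_values))
    return [lookup.get(key) for key in new_keys]


by_position = positional_append(somedata, len(new_ix))
by_key = keyed_match(
    list(zip(old_ix, old_tx)), somedata, list(zip(new_ix, new_tx))
)

# After the common series ix is overwritten with the imported values,
# compare what sits next to each (ix, tx) label under the two schemes.
for ix, tx, pos_val, key_val in zip(new_ix, new_tx, by_position, by_key):
    print(ix, tx, pos_val, key_val)
```

In the positional column the values never leave their rows, which is the
sense in which no data have been "moved"; but once ix is overwritten, the
rows labelled ix=33 carry the values that originally belonged to ix=44.
The keyed column is the kind of result one would expect from matching on
the index variables.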
This is probably unrelated to the specific panel index variable issue
here, but the fact that 'append' overwrites stuff makes me feel uneasy
in general I must say, because I tend to go by the literal meaning of
the word "append".
Fair enough. "Updating" overlapping observations on append should probably
be an option and not the default.
> If we were to give the index variables used in constructing a panel a
> special status thereafter, we'd have to work out a policy regarding those
> variables: do they become immutable? if not, what happens if you redefine
> one or both of them?
Yes I understand those problems [...]
This is one thing we should think through for 2.0.