On Wed, 9 Feb 2005, Martin Obermaier wrote that the following script

  open wagepan.gdt
  setobs 8 1:1 --stacked-cross-section
  model1 <- pooled 19 0 17 3 5 4 27 7 18
  model2 <- hausman

crashes gretl. I've now looked into this. One easy-to-fix thing is
that I was not dealing properly with the case where there are
insufficient degrees of freedom to calculate the group means
regression. That was the immediate cause of the crash, and it's now
fixed in CVS.
Actually, though, with the "wagepan" data there are plenty of
degrees of freedom (545 people observed in each of 8 years). But
your setobs line, with --stacked-cross-section, has confused gretl.
The data are actually organized as stacked time series, so the option
should be --stacked-time-series (the observations go from 1980 to
1987 for the first person, then from 1980 to 1987 for the second and
so on, making a little time series for each person). Nonetheless,
gretl shouldn't crash!
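
To make the difference between the two layouts concrete, here is a minimal Python sketch. It is not gretl's code: the two helper functions are hypothetical, and it assumes that the first setobs argument (8) is read as the size of each contiguous block of rows. Under that assumption, the stacked-cross-section reading sees only 8 units instead of 545, which would be consistent with the shortage of degrees of freedom in the group means regression.

```python
# Sketch only: hypothetical helpers, not gretl's implementation.

def stacked_time_series(row, n_periods):
    """Return (unit, period) when each unit's full time series is contiguous."""
    return row // n_periods, row % n_periods

def stacked_cross_section(row, n_units):
    """Return (unit, period) when each period's full cross section is contiguous."""
    return row % n_units, row // n_units

N_ROWS = 545 * 8   # 545 people, 8 years (figures from the message)
BLOCK = 8          # assumed meaning of the first setobs argument

# Intended reading: each block of 8 rows holds the 8 years for one person.
units_ts = {stacked_time_series(r, BLOCK)[0] for r in range(N_ROWS)}
# Mistaken reading: each block of 8 rows holds 8 "units" seen in one period.
units_cs = {stacked_cross_section(r, BLOCK)[0] for r in range(N_ROWS)}

print(len(units_ts))  # 545
print(len(units_cs))  # 8
```

With only 8 apparent units, a regression on group means would have just 8 observations to work with.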
I'm afraid, though, that this example has exposed some more
difficult issues, which I'm working on but have not fully resolved.
There are 545 cross-sectional units in this example, and gretl, up
to now, has calculated the within-groups regression by including a
dummy variable for each unit. With this number of units, that's
just silly. It's much more efficient to subtract the group means.
So I've coded that, but then another issue appeared: what if, when
you subtract the group means, some of the variables become all zero?
For instance there's a "black" dummy variable in your model. Since
nobody changed their color over the 8 years, the deviation from the
"group mean" is always zero for this variable. I need to devise a
way of handling such issues of perfect collinearity.
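
The effect described above can be reproduced on a toy panel. The following is a minimal Python sketch, not gretl's implementation: the data are made up, the variable name "exper" is hypothetical, and demean_by_group and is_all_zero are illustrative helpers. It subtracts the group means and then checks which transformed columns have become all zero.

```python
from collections import defaultdict

def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the "within" transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def is_all_zero(column, tol=1e-12):
    """True if every entry is zero up to a small tolerance."""
    return all(abs(x) <= tol for x in column)

# Made-up toy panel: 3 people, 2 years each, stacked time series.
person = [0, 0, 1, 1, 2, 2]
exper = [1.0, 2.0, 4.0, 5.0, 2.0, 3.0]   # varies over time
black = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]   # fixed for each person

exper_w = demean_by_group(exper, person)
black_w = demean_by_group(black, person)

print(exper_w)               # [-0.5, 0.5, -0.5, 0.5, -0.5, 0.5]
print(is_all_zero(black_w))  # True
```

One possible approach, offered only as a suggestion, would be to run a check like is_all_zero on each transformed regressor and drop the offenders (with a warning) before estimating.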
Of course, the same problem arises if you have a variable like
"black" and try to include a dummy variable specific to each person
-- it's just that the problem may not be so apparent.
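
The collinearity in the dummy-variable version can be shown directly: a time-invariant variable is an exact linear combination of the per-person dummies. The sketch below uses made-up data and plain Python lists, and is meant only as an illustration.

```python
# Made-up toy panel: 3 people, 2 years each, stacked time series.
person = [0, 0, 1, 1, 2, 2]
black = [0, 0, 1, 1, 0, 0]   # fixed for each person
n_units = 3
n_rows = len(person)

# One dummy column per person: dummies[u][i] is 1 if row i belongs to person u.
dummies = [[1 if p == u else 0 for p in person] for u in range(n_units)]

# Each person's (constant) value of "black" serves as the weight on their dummy.
black_of_unit = {p: b for p, b in zip(person, black)}

# Weighted sum of the dummy columns, row by row.
combo = [sum(black_of_unit[u] * dummies[u][i] for u in range(n_units))
         for i in range(n_rows)]

print(combo == black)  # True: "black" adds no information beyond the dummies
```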
If anyone on the list has thoughts on the Right Way to handle this
issue, I'd be very glad to hear them!
Allin Cottrell