On Tue, 15 Dec 2009, artur bala wrote:
Allin Cottrell wrote:
I have now (in CVS) changed the behavior of smpl --restrict so
that it doesn't strive to produce a balanced panel
It works fine! Thanks!
unless you give the new --balanced option.
What still intrigues me when using the --balanced flag is that the
number of observations reported (n = xxx) doesn't match the number of
0s (or 1s) in FRS. Here's the output:
? smpl FRS=0 --restrict --replace --balanced
Full data set: 686 observations
Current sample: 1:1 - 97:3 (n = 291)
# there are 269 observations with FRS=0
? smpl FRS=1 --restrict --replace --balanced
Full data set: 686 observations
Current sample: 1:1 - 98:7 (n = 686)
# there are 417 observations with FRS=1
Do you find it somewhat misleading?
1) I'm assuming that if someone has a panel dataset, then they
probably want to be able to apply panel-specific methods if at all
possible.
2) In gretl, it is possible to apply such methods *only* if the
dataset constitutes a balanced panel (with an equal number of time
series observations for each cross-sectional unit).
3) However, the requirement at point 2 is in a sense "formal":
some of the time-series observations may be composed of NA
(missing) values.
4) Therefore, when a user sub-samples in a way that would
"naturally" destroy the panel structure of the data, we offer the
"--balanced" option as a means of reconstituting the panel
structure. This necessarily involves "padding" the observations
for at least some of the cross-sectional units with missing values.
5) Note that such padding observations will not be used in
calculating statistics of any sort. They are there purely to
satisfy the formal requirement of balance (which we need for
"accounting" purposes).
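
To make points 4 and 5 concrete, here is a small Python sketch of the idea, not gretl's actual code. The toy data, the function name restrict_balanced and the rule used to choose the retained periods (every period in which some unit satisfies the restriction) are all assumptions made for illustration. Rows that fail the restriction but are needed for balance are kept as NA (None) and are skipped when computing the mean.

```python
from statistics import mean

def restrict_balanced(rows, keep):
    """Hypothetical helper: restrict a panel, then pad it back to balance.

    Assumed rule: keep every unit and every period that has at least one
    row satisfying `keep`, and fill the remaining cells of that
    units x periods grid with None (standing in for NA).
    """
    kept = {(r["unit"], r["period"]): r["y"] for r in rows if keep(r)}
    units = sorted({u for u, _ in kept})
    periods = sorted({t for _, t in kept})
    return [(u, t, kept.get((u, t))) for u in units for t in periods]

# Toy panel: 3 units x 4 periods, with a 0/1 flag playing the role of FRS.
flags_by_unit = {1: [0, 0, 1, 1], 2: [0, 1, 1, 1], 3: [1, 1, 1, 1]}
rows = [
    {"unit": u, "period": t, "frs": f, "y": 10 * u + t}
    for u, flags in flags_by_unit.items()
    for t, f in enumerate(flags, start=1)
]

panel = restrict_balanced(rows, lambda r: r["frs"] == 0)

n_reported = len(panel)                                 # size of the balanced sample
n_real = sum(1 for _, _, y in panel if y is not None)   # rows that really have frs == 0
n_padding = n_reported - n_real                         # NA rows added only for balance

# Statistics use the real rows only; padding rows are ignored.
y_mean = mean(y for _, _, y in panel if y is not None)
```

In this toy case the balanced sample is larger than the number of rows that satisfy the restriction, and the difference consists entirely of NA rows that never enter the mean.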
In your example:
? smpl FRS=0 --restrict --replace --balanced
Full data set: 686 observations
Current sample: 1:1 - 97:3 (n = 291)
# there are 269 observations with FRS=0
That's right: the extra 291 - 269 = 22 'observations' are padding
rows. But only the observations that actually have FRS=0 will be
used by gretl in calculating any statistical results.
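
The arithmetic can be checked with a few lines of Python. This is only a sketch: the helper name padding_rows is made up, and reading the sample ranges 1:1 - 97:3 and 1:1 - 98:7 as 97 units by 3 periods and 98 units by 7 periods is an inference from the reported values of n, not something the output states directly.

```python
def padding_rows(units, periods, matching):
    """Return (n, padding) for a balanced sample of units x periods
    in which only `matching` rows satisfy the restriction."""
    n = units * periods
    return n, n - matching

# FRS=0: sample 1:1 - 97:3, 269 matching observations
frs0 = padding_rows(97, 3, 269)

# FRS=1: sample 1:1 - 98:7, 417 matching observations
frs1 = padding_rows(98, 7, 417)
```

On that reading, frs0 reproduces n = 291 with 22 padding rows, and frs1 gives n = 686, of which 686 - 417 = 269 rows would be padding.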
If you don't want the padding, don't choose --balanced.
Allin Cottrell