Hi,
unfortunately the panel subsampling business doesn't seem to be fully
resolved. I'm struggling to understand where exactly the apparent bug is
located, or whether there is perhaps more than one, so let's go step by step.
1) First, consider the attached panel-structured dataset
"checkdata.gdt", and then the following small script:
<hansl>
function void samplecheck(series x)
    smpl # simply prints the currently active sample
end function
# assuming checkdata.gdt is loaded...
setobs 4 1991:1 --panel-time
smpl 1991:1 2017:4 --time # restricts from 1708 to 1512 obs
samplecheck(UKIB)
</hansl>
The point here is that the samplecheck function should only get the
restricted active sample from the outer scope (right?), so only 1512
obs, but inside the function 1708 is still reported as the relevant
"full" sample range. Is that expected? I suspect this is where the
trouble already begins.
2) OK, next it gets more "interesting" in terms of the printed sample
information. Again with the attached dataset, consider the following
script, where in the end things are totally messed up:
<hansl>
# assuming checkdata.gdt is loaded...
smpl full
setobs 4 1991:1 --panel-time
smpl 1991:1 2017:4 --time # n = 1512
function void check2 (series x)
    series dates = $obsdate
    smpl 5 5 --unit
    smpl dates < 20010101 --restrict
    smpl
end function
check2(UKIB) # prints: 15:001 - 1:122 (n = -1708) !!
</hansl>
Note that the dataset really only has 14 units, not 15.
And perhaps I should be explicit that this is not just a cosmetic issue
of what's printed out: it also affects the actual data and the number
crunching down the road.
2b) I replaced the home-cooked time restriction "smpl dates < 20010101
--restrict" with "smpl 1991:1 2001:1 --time" inside the function, but
then gretl crashed hard.
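For concreteness, the 2b variant would look roughly like the following. This is a sketch reconstructed from the script in 2), with only the restriction line swapped; everything else is assumed to be unchanged.
<hansl>
# assuming checkdata.gdt is loaded...
smpl full
setobs 4 1991:1 --panel-time
smpl 1991:1 2017:4 --time
function void check2 (series x)
    series dates = $obsdate          # kept so that only one line differs
    smpl 5 5 --unit
    smpl 1991:1 2001:1 --time        # replaces "smpl dates < 20010101 --restrict"
    smpl
end function
check2(UKIB) # this is the variant that crashed
</hansl>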
------
3) Maybe unrelated, maybe not, but at least even easier to replicate:
<hansl>
open abdata
setobs 4 2000:1 --panel-time # just for illustration
smpl 2000:3 2000:4 --time # gives 1:1 - 140:2 (n = 280)
</hansl>
Notice that the time dimension is shown as 1..2, whereas it's really the
(pseudo) period numbers 3..4 that are selected. This is in contrast to
the cross-section dimension using "--unit", where the respective unit
numbers are preserved and shown correctly. (Check with "smpl 10 11 --unit".)
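The "--unit" comparison can be run in the same setup as a minimal check. The expected output is deliberately not spelled out here; the only claim is the one made above, namely that the unit part of the printed range keeps the original unit numbers.
<hansl>
open abdata
setobs 4 2000:1 --panel-time # just for illustration, as above
smpl 10 11 --unit            # unit part of the printed range should read 10 and 11
</hansl>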
A slightly different example (again with the attached dataset
checkdata.gdt loaded):
<hansl>
smpl full
setobs 4 1991:1 --panel-time
smpl # gives 1:001 - 14:122 (n = 1708)
smpl 1991:1 2017:4 --time # see below
</hansl>
Here the full sample is reported as a panel structure before the
"--time" restriction (1:001 - 14:122 (n = 1708)), but the last command,
which restricts the panel in the time dimension, yields the following
output (German locale; the labels mean "Full data set: 1708
observations" and "Current sample"), where the first line suddenly shows
only the total number of obs and no longer the panel structure:
<output>
Voller Datensatz: 1708 Beobachtungen
Aktuelle Stichprobe: 1:001 - 14:108 (n = 1512)
</output>
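As a sanity check on the numbers quoted in this message, here is the arithmetic in a few lines of hansl. It assumes 14 units with 122 periods each (as printed by "smpl" for the full dataset) and quarterly data from 1991:1 to 2017:4 for the restricted sample; the variable names are just for illustration.
<hansl>
scalar N = 14                          # number of units in checkdata.gdt
scalar T_full = 122                    # periods per unit in the full panel
scalar T_sub = (2017 - 1991 + 1) * 4   # quarters in 1991:1 .. 2017:4, i.e. 108
printf "full: %d, restricted: %d\n", N * T_full, N * T_sub
</hansl>
This should reproduce the 1708 and 1512 observations mentioned above, so the reported counts themselves are consistent with 14 units; it is the ranges and signs printed inside the function that are off.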
All this is with the recent snapshot from Feb 26th.
thanks
sven