On Wed, 12 Feb 2025, Sven Schreiber wrote:
On 11 Nov 2024 at 03:04, Cottrell, Allin wrote:
> On Sun, Nov 10, 2024 at 1:27 PM Sven Schreiber
> <sven.schreiber(a)fu-berlin.de> wrote:
>> I'm currently looking at ch. 21 of the guide, "Cheat sheet". I'd propose
>> the following cleanups (which I could apply if people agree):
Following up on this earlier list, here's an update:
>> section 21.1:
>>
>> - Time averaging of panel datasets: I think it would be nice to
>> use a real-world dataset such as grunfeld.gdt instead of having
>> the slightly distracting code for creation of artificial data.
Here's a tested variant of what I meant:
<hansl>
open grunfeld
# how many periods (here: years) to average
newfreq = 4
# a dummy for endpoints
series endpoint = (time % newfreq == 0) # 'time' already in dataset
list X = invest value kstock # time-varying variables
# compute averages
loop foreach i X
    series $i = movavg($i, newfreq)
endloop
# drop extra observations
smpl endpoint --dummy --permanent
# restore panel structure
setobs firm year --panel-vars
print firm year X -o
</hansl>
OK with you guys to replace the old example with artificial data?
Sounds fine to me.
>> section 21.2:
>>
>> - Generating a “subset of values” dummy: Nowadays one could use
>> the contains() function I think, which would be more readable.
Here's an artificial but also tested example of what I mean:
<hansl>
nulldata 10
series src = {1,2,3,12,13,14,22,23,24,25}
matrix sel = {2,13,14,25}
series D1 = contains(src, sel)
</hansl>
So I think that the long-ish paragraph about the "clever solution"
could be deleted. Also, I'm not sure that what is then labeled as
the "proper solution" using the replace() function is actually
"more proper" than the one I gave using contains(). Opinions?
I like it.
>> section 21.3:
>>
>> - Interaction dummies (p. 194 of the A4 guide version from October):
>> remove the old string-substitution-based code that pre-dates the
>> interaction operator (^; which is also already mentioned there).
Again, is the old solution (starting with "But back in my day...") really
still needed?
Probably not, but if it's omitted it may be worth inserting the
example from chapter 15 rather than just giving a reference to it.
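Something along these lines, perhaps. This is just a rough sketch with
artificial data (not the chapter 15 example verbatim), and I'm assuming
the ^ operator takes a list of regressors on the left and a list of
discrete series on the right:
<hansl>
nulldata 12
genr index
series x = normal()
series d = (index > 6)    # a 0/1 indicator
setinfo d --discrete
list X = x
list D = d
list INTER = X ^ D        # one interaction term per value of d
print INTER -o
</hansl>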
>> - Realized volatility: Is this example even consistent? It
>> starts by talking about minutes and hours, but then switches
>> over to seconds and minutes. Maybe that's part of the clever
>> trick, I don't know... Apart from that, it seems that another
>> trick in the cheat sheet could be re-used here, namely "Moving
>> functions for time series".
OK, so here's something much more straightforward IMHO to
calculate a per-hour volatility, using the aggregate function:
<hansl>
nulldata 720
setobs 60 1:1 --time-series # 60 minutes per hour
series x = normal()
matrix v = aggregate(x, $obsmajor, var) # $obsmajor means hour here
print v
dataset compact 1 # yields error !
series rv = v[,end]
</hansl>
I agree, that's a nicely "natural" way to do it.
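For the record, the "Moving functions" variant mentioned above might look
roughly like this; just a sketch continuing the script, assuming $obsminor
runs from 1 to 60 under that setobs specification, and note it uses the
n-divisor variance rather than var()'s n-1:
<hansl>
series m1 = movavg(x, 60)      # trailing 60-minute mean
series m2 = movavg(x^2, 60)    # trailing 60-minute mean of squares
series rollvar = m2 - m1^2     # rolling variance over each 60-minute window
# keep only the end-of-hour values
series rv2 = ($obsminor == 60) ? rollvar : NA
</hansl>
But the aggregate() version still seems cleaner to me.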
HOWEVER, for the "dataset compact 1" line gretl tells me "not supported",
and I don't understand why. Shouldn't it be quite easy to compact from
any periodicity down to 1?
That's now fixed, but there's more to say about it when we get a
chance.
>> - Cross-validation: Could it be that using some feature of the
>> regls apparatus or a contributed package (by Artur?) would be
>> more practical nowadays?
It mentions the leverage command - could be that this was already
the answer to my previous remark, I'm not sure.
Use of "leverage" seems appropriate to me.
Allin