panelpd strangeness
by Sven Schreiber
Hi again,
OK let's see if this one qualifies as a bug:
<hansl>
clear
T = 200
N = 20
len = T*N
nulldata len --preserve
setobs T 1:1 --stacked-time-series
eval $panelpd # gives 0
</hansl>
The zero $panelpd value contradicts the documentation, which says:
"If the periodicity is not set in the active panel dataset, returns 1 in
analogy to $pd for cross-sectional or undated data."
In the meantime I have updated to the latest snapshot.
thanks
sven
list creation error
by Sven Schreiber
Hi,
I'm getting a weird parsing error with a basic list creation:
<hansl>
open inven.gdt
L = cinven r3 # error: symbol 'SERIES' invalid in this context
</hansl>
If I prepend the explicit "list" keyword, it works. It also works if L
already exists.
This is a three-week-old snapshot.
thanks
sven
Regression with many dummies (RFC)
by Riccardo (Jack) Lucchetti
Hi all,
I've begun to explore the issue of the numerical performance of OLS
regression, where you want to condition on a qualitative variable with
many different values, that is you want to run something like
ols y X dummify(fac)
where "fac" is a discrete variable with a high number of possible valid
values (call it h).
Normally, you don't really care about all the parameters; you just want
the OLS subvector for X (call it beta). Of course, a special case of the
above is fixed-effects estimation in panel data, but the problem is in
fact a little more general than that.
If nelem(X) = k, that would lead to regressing y on a list with k+h-1
elements. If the sample size n and h are both large, that takes a lot of
RAM, and it's very inefficient, since (as is well known) you can compute
beta in a much more clever way via the Frisch-Waugh theorem.
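The Frisch-Waugh argument can be checked numerically. Here is a minimal sketch in Python/NumPy (not the attached hansl script; all names and data below are hypothetical): regressing y on X plus the full set of h dummies yields the same beta on X as regressing within-group demeaned y on within-group demeaned X, so the h-column dummy block never needs to be materialised.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, h = 2000, 3, 100               # sample size, regressors, factor levels
fac = rng.permutation(np.arange(n) % h)   # factor with all h levels present
X = rng.standard_normal((n, k))
y = X @ np.array([1.0, -0.5, 2.0]) + 0.1 * fac + rng.standard_normal(n)

# Brute force: regress y on [X, full dummy set], keep the first k coefficients
D = np.eye(h)[fac]                   # n x h dummy matrix
beta_full = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)[0][:k]

# Frisch-Waugh: partialling out the dummies is just within-group demeaning
counts = np.bincount(fac, minlength=h)

def within(v):
    """Subtract each observation's group mean (the annihilator of D)."""
    means = np.bincount(fac, weights=v, minlength=h) / counts
    return v - means[fac]

y_w = within(y)
X_w = np.column_stack([within(X[:, j]) for j in range(k)])
beta_fwl = np.linalg.lstsq(X_w, y_w, rcond=None)[0]

# The two coefficient vectors agree, but the FWL route only ever
# factorises an n x k matrix instead of n x (k + h)
assert np.allclose(beta_full, beta_fwl)
```

The saving is exactly the one described above: the brute-force route factorises an n x (k+h) matrix, while the demeaning route costs one pass of group sums plus an n x k regression.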
The attached script does just that[*], and compares execution time for
both approaches, so you can play with it.
My question to the community is: would it be worthwhile to implement the
"specialised" algorithm natively? Something like
fols y X fac
where "fols" stands for "factorised OLS"? Or maybe as an option to the
ols command? Or maybe as a function? Having such a command (or function)
would of course pay off only in cases where both n and h are large.
Is this worth the effort?
[*] The attached function just computes beta, not all the auxiliary
quantities. But those are easy to add.
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------
"ar" error inside function (was Re: Re: how to run auxiliary estimation commands inside functions)
by Sven Schreiber
On 25.09.2025 at 18:28, Sven Schreiber wrote:
>
> <hansl>
>
> # test an aux dataset inside a function
>
> function void internalar (matrix x, int p, series anyoldseries)
> # the series argument is only added to trigger inheriting
> # the dataset of the caller
>
> x = vec(x)
> T = rows(x)
>
> # we know we are working in a panel context
> smpl 1 1 --unit
> setobs 1 1 --time-series
> print $nobs
>
> errorif(T>$nobs, "incoming vector too long")
> # smpl 1 T # redundant
> series s = x
>
> print
>
> ols s s(-1)
> arma p 0; s
> # ar p ; s # yields data error
> end function
>
> open grunfeld
> internalar(mnormal($pd), 2, invest)
>
> </hansl>
> but notice that "ar" yields a mysterious data error, which might be
> independent of the current issue; I don't know.
>
I'm pinging this thread, not about the original topic but about the
error on the "ar" line (if you uncomment the last line of the function
in the script above). I just checked on the latest snapshot, where the
error persists. Given that the arma command works, I'm not aware of
doing anything wrong there, but maybe it has to do with initial values
or so.
thanks
sven