spurious description string in pointerized series arg
by Sven Schreiber
Hi,
the attached script files demonstrate a bug. Please run the
bugtest_driver.inp script which includes the other script
port_bugtest.inp. The example uses the built-in example dataset abdata.
The point is that the top-level series "e" which is passed as a pointer
arg to the function (factcoint_test) gets a spurious and more or less
arbitrary description string, sometimes including non-printable
characters. Occasionally the correct description does appear; in that
case please just run the script again. An enclosing loop seems to be
necessary to trigger the bug.
In itself it's not a big deal, but since apparently chunks of memory end
up where they do not belong, maybe some related unknown bugs exist and
may be fixed by getting this straight.
This is with a fairly recent snapshot.
thanks
sven
2 weeks, 3 days
panelpd strangeness
by Sven Schreiber
Hi again,
OK let's see if this one qualifies as a bug:
<hansl>
clear
T = 200
N = 20
len = T*N
nulldata len --preserve
setobs T 1:1 --stacked-time-series
eval $panelpd # gives 0
</hansl>
The zero $panelpd value contradicts the documentation, which says:
"If the periodicity is not set in the active panel dataset, returns 1 in
analogy to $pd for cross-sectional or undated data."
In the meantime I have updated to the latest snapshot.
thanks
sven
4 months
list creation error
by Sven Schreiber
Hi,
I'm getting a weird parsing error with a basic list creation:
<hansl>
open inven.gdt
L = cinven r3 # error: symbol 'SERIES' invalid in this context
</hansl>
If I prepend the explicit "list" keyword, then it works. It also works
if L already exists.
This is with a three-week-old snapshot.
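For reference, this is the variant with the explicit keyword that works for me (same dataset and series as above):
<hansl>
open inven.gdt
# explicit "list" keyword: no parsing error
list L = cinven r3
</hansl>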
thanks
sven
4 months
Regression with many dummies (RFC)
by Riccardo (Jack) Lucchetti
Hi all,
I've begun to explore the issue of the numerical performance of OLS
regression, where you want to condition on a qualitative variable with
many different values; that is, you want to run something like
ols y X dummify(fac)
where "fac" is a discrete variable with a high number of possible valid
values (call it h).
Normally, you don't really care about all the parameters; you just want
the OLS subvector for X (call it beta). Of course, a special case of the
above is fixed-effect estimation in panel data, but the problem is in
fact a little bit more general than that.
If nelem(X) = k, that would lead to regressing y on a list with k+h-1
elements. If the sample size n and h are both large, that takes a lot of
RAM, and it's very inefficient, since (as is well known) you can compute
beta in a much more clever way via the Frisch-Waugh theorem.
The attached script does just that[*], and compares execution time for
both approaches, so you can play with it.
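For readers without the attachment, the Frisch-Waugh idea can be sketched roughly as follows. This is not the attached script, just a minimal illustration: the function name fw_beta is made up, and it assumes there are no missing values and that X does not include the constant. It demeans y and X within each group defined by fac and then runs OLS on the demeaned data, so the n x h dummy matrix is never built.
<hansl>
# Sketch only, not the attached script: beta via Frisch-Waugh.
# Assumes no missing values and that X does not contain the constant
# (demeaning within groups would wipe it out).
function matrix fw_beta (series y, list X, series fac)
    matrix Z = {y} ~ {X}        # n x (1+k): y first, then the X columns
    matrix f = {fac}
    matrix v = values(fac)      # the h distinct values of fac
    scalar h = rows(v)
    loop i = 1..h
        matrix d = (f .= v[i])      # n x 1 indicator for group i
        scalar nj = sumc(d)         # number of observations in group i
        matrix m = (d'Z) / nj       # 1 x (1+k) row of group means
        Z = Z - d*m                 # demean the rows belonging to group i
    endloop
    # OLS on the demeaned data gives the coefficients on X
    return mols(Z[,1], Z[,2:cols(Z)])
end function
</hansl>
A call would look like "matrix b = fw_beta(y, X, fac)". The loop over groups is the simplest thing to write, not necessarily the fastest.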
My question to the community is: would it be worthwhile to implement the
"specialised" algorithm natively? Something like
fols y X fac
where "fols" stands for "factorised OLS"? Or maybe as an option to the
ols command? Or maybe as a function? Having such a command (or function)
would of course pay off only in cases where both n and h are large.
Is this worth the effort?
[*] The attached function just computes beta, not all the auxiliary
quantities. But those are easy to add.
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------
4 months, 2 weeks
"ar" error inside function (was Re: Re: how to run auxiliary estimation commands inside functions)
by Sven Schreiber
Am 25.09.2025 um 18:28 schrieb Sven Schreiber:
>
> <hansl>
>
> # test an aux dataset inside a function
>
> function void internalar (matrix x, int p, series anyoldseries)
> # the series argument is only added to trigger inheriting
> # the dataset of the caller
>
> x = vec(x)
> T = rows(x)
>
> # we know we are working in a panel context
> smpl 1 1 --unit
> setobs 1 1 --time-series
> print $nobs
>
> errorif(T>$nobs, "incoming vector too long")
> # smpl 1 T # redundant
> series s = x
>
> print
>
> ols s s(-1)
> arma p 0; s
> # ar p ; s # yields data error
> end function
>
> open grunfeld
> internalar(mnormal($pd), 2, invest)
>
> </hansl>
> but notice that "ar" yields a mysterious data error which might be
> independent of the current issue, don't know.
>
I'm pinging this thread not about the original topic, but about the
error with the "ar" line (if you uncomment the last line of the function
in the script above). I just checked on the latest snapshot, where it
persists. Given that the arma command works, I'm not aware of doing
anything wrong there, but maybe it has to do with initial values or
something similar.
thanks
sven
4 months, 2 weeks