On Sun, 27 Jul 2025, Sven Schreiber wrote:
On 25.07.2025 at 21:30, Cottrell, Allin wrote:
> It's more specific than I indicated: things go wrong as soon as a lagged
> regressor /that is not the last regressor at the time/ gets omitted.
>
> The regressors will have been reordered, if necessary, at the start of the
> backward stepwise procedure, to increase the likelihood that terms to be
> omitted will come at the end, which simplifies things considerably. But
> that's not going to work 100%. When we have to eliminate an "internal"
> column of Q, and row/column of R, the recomputation that's needed involves
> reference to the dataset, but at what starting point? We were diving in at
> dset->t1, but in a specification with lags it should be at orig->t1,
> where @orig is the original model. That's a distinction without a
> difference in the case of cross-sectional data, but it's crucial for time
> series with lags. Now that's fixed in git.
Thanks, Allin! I tested with the latest snapshot and everything seems fine
now!

Ah, glad to hear it.
I've now committed a second round of corrections for add/omit with
the --auto flag, this time to implement proper handling of missing
values.
There's one case that provokes a "missing data" error. That's with
"add", when one or more of the candidate series for addition include
missing values within the original estimation range that are not
aligned with missing values in the dependent variable or the
original regressors. We refuse that case since it would force a
reduction of the sample relative to the original estimation. A
simple example follows.
<hansl>
open data4-10
ENROLL[30] = NA # this is alright now
MEMNEA[30] = NA # also OK now
# MEMNEA[40] = NA # provokes an error
list X = 2..9
ols ENROLL const
add X --auto=BIC
</hansl>
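To see in advance which candidates would fall into the refused case, a
rough check along the following lines may help. This is only a sketch of
the rule as described above, not what the add code does internally; the
names base_ok and n_bad are made up for illustration, and the baseline
here is just ENROLL on a constant. With original regressors in the model
one would presumably pass a list holding the dependent variable plus
those regressors to ok() instead.
<hansl>
open data4-10
ENROLL[30] = NA
MEMNEA[30] = NA # aligned with the NA in ENROLL
MEMNEA[40] = NA # not aligned: this is the problem case
list X = 2..9
# 1 at observations the baseline model can use, 0 otherwise
series base_ok = ok(ENROLL)
loop foreach i X
    # observations usable by the baseline but missing in the candidate
    scalar n_bad = sum(base_ok && missing($i))
    if n_bad > 0
        printf "%s: %d unaligned missing value(s)\n", "$i", n_bad
    endif
endloop
</hansl>
Any series reported by the loop is one that would force a smaller sample
than the original estimation, so it is a candidate for the "missing data"
error under "add".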
Allin