On Fri, Oct 17, 2025 at 10:27 AM Riccardo (Jack) Lucchetti <p002264@staff.univpm.it> wrote:
Hi all, I've begun to explore the issue of the numerical performance of OLS regression, where you want to condition on a qualitative variable with many different values [and have come up with an efficient solution].
This is nice. I'd say it's a bit too specialized to be an option to "ols", and a function package would be a good way to go. But since it seems that aggregate() does the heavy lifting, we could revisit that to see if there's a tweak that could speed things up in this case. However, I wouldn't be opposed to a new "fols" command (fols y xlist ; faclist) if that has an additional speed advantage.
Jack mentioned that the approach is a generalization of fixed-effects regression, and that it rests on the Frisch-Waugh-Lovell theorem. So I wonder whether one should go one step further and also attempt to cover non-qualitative control variables. Basically, I'd like to point out the apparent connection to double machine learning, where the controls are likewise partialled out of both the outcome and the regressor of interest before the final regression. I'm sure that in terms of the algorithm it's non-trivial to go from discrete to continuous variables. However, before a new estimation command is introduced, I think it would be good to think about what else it might cover in the future, if only to make the name of the command general enough.
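To make the Frisch-Waugh-Lovell connection concrete, here is a minimal sketch in plain Python (standard library only). It is not Jack's code and says nothing about how aggregate() works internally; the data and the helper names (solve, ols, demean_by_group) are made up for illustration. It only checks that, for one regressor and one factor, the coefficient from the full dummy-variable regression coincides with the one obtained after subtracting group means from y and x.

```python
# Sketch of the Frisch-Waugh-Lovell idea: one regressor x, one factor g.
# All names and data here are hypothetical, chosen only for illustration.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ols(X, y):
    """OLS coefficients via the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def demean_by_group(v, g):
    """Subtract from each observation the mean of its group."""
    sums, counts = {}, {}
    for vi, gi in zip(v, g):
        sums[gi] = sums.get(gi, 0.0) + vi
        counts[gi] = counts.get(gi, 0) + 1
    return [vi - sums[gi] / counts[gi] for vi, gi in zip(v, g)]

# Made-up data: three factor levels, three observations each
g = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
x = [1.0, 2.0, 4.0, 2.0, 3.0, 7.0, 0.0, 5.0, 6.0]
y = [2.1, 3.9, 8.2, 1.0, 2.2, 5.9, 4.0, 9.1, 9.8]

# Route 1: regress y on x plus one dummy per factor level (no constant)
levels = sorted(set(g))
X_full = [[xi] + [1.0 if gi == lv else 0.0 for lv in levels]
          for xi, gi in zip(x, g)]
beta_full = ols(X_full, y)[0]

# Route 2: partial the factor out first, then regress residual on residual
x_t = demean_by_group(x, g)
y_t = demean_by_group(y, g)
beta_fwl = sum(a * b for a, b in zip(x_t, y_t)) / sum(a * a for a in x_t)

print(beta_full, beta_fwl)  # the two estimates agree up to rounding error
```

With many factor levels, route 2 avoids building and inverting the large dummy matrix, which is where the speed gain would come from. For continuous controls the group-mean step would have to be replaced by some other first-stage fit, and that is the part I'd expect to be non-trivial.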
cheers
sven