This response to Fred is kinda long but hopefully some of it may be of
interest.
Reminder: Fred reported a crash when running a Tobit model with 5000+
panel-unit dummy variables, and also reported the estimation taking
ages -- even plain OLS with all the panel dummies was painfully slow.
Sven has addressed the econometrics of the Tobit estimation; in this
reply I'm just concerned with the mechanics.
Diagnosing the crash requires being able to run the thing on a
reasonable time scale, so my first task was speeding it up.
The first step in our Tobit is running OLS (to flush out any missing
values and set up a suitably sized model structure), and that step in
itself was taking far too long. Up till now our default OLS engine has
been a gretl-native Cholesky solver. It's fast and effective for
problems of a typical size in econometrics, but this case has exposed
the fact that it bogs down badly when there are thousands of
regressors. So I've now put a size-based switch in place: if the
problem exceeds a certain threshold we hand the job over to LAPACK,
which on most systems these days will be highly optimized (on Mac the
Accelerate framework is used).
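To make that concrete at the script level, here's a minimal
standalone sketch of what the Cholesky route amounts to, using
invpd() (which inverts via Cholesky); the size-based engine switch
itself lives in gretl's C code, so there's nothing to set from hansl.
<hansl>
# minimal sketch of the normal-equations/Cholesky route;
# the actual engine switch happens inside gretl's C code
matrix X = mnormal(200, 5)
matrix Y = mnormal(200, 1)
matrix b = invpd(X'X) * (X'Y)   # invpd() inverts via Cholesky
print b
</hansl>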
For reference, here's the test rig I'm running in emulation of Fred's
example (in the first instance, just the OLS part). With 5001
regressors (5000 unit dummies plus x), the X'X matrix has over 25
million elements.
<hansl>
set seed 1234
N = 5000   # number of panel units
T = 4      # time periods per unit
NT = N*T
nulldata NT --preserve
setobs 4 1:1 --stacked-time-series
series x = normal()
series y = normal()
y = y < 1 ? 0 : y   # pile up the low values at zero (left-censoring)
genr unitdum        # create per-unit dummies, du_1 ... du_N
ols y du* x
</hansl>
With our previous OLS routine, estimation took about 300 seconds on my
desktop; with the new LAPACK switch it takes about 5 seconds.
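(If you want to reproduce the timing, the OLS step can be wrapped in
the same stopwatch idiom as the tobit block below:)
<hansl>
# timing the OLS step
set stopwatch
ols y du* x
printf "elapsed: %gs\n", $stopwatch
</hansl>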
Alright, that's nice, but what about Tobit? Well, our default
estimation algorithm for Tobit is Newton-Raphson, using the analytical
Hessian. The Hessian here will have tens of millions of elements, so
Newton is bound to be slow. I therefore tried using BFGS instead, and
for good measure made the convergence tolerance a bit sloppier than
usual:
<hansl>
set optimizer BFGS
set bfgs_toler 1.0e-5
</hansl>
BFGS produced parameter estimates in a tolerable time, but then we hit
computation of the Hessian for the covariance matrix -- and after
about 5 minutes I couldn't be bothered waiting for that to finish!
Next step: enable the --opg option for tobit (and intreg), to get a
cheaper covariance matrix via the Outer Product of the Gradient. Plus,
I introduced a little parallelization into the tobit loglikelihood
code. Net result: I could now run the whole thing in under 100
seconds. The tobit portion looks like this:
<hansl>
set optimizer BFGS
set bfgs_toler 1.0e-5
set stopwatch
tobit y du* x --opg   # OPG covariance matrix instead of the Hessian
printf "elapsed: %gs\n", $stopwatch
</hansl>
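For the record, the OPG estimator is cheap because it needs only
first derivatives: if G is the n x k matrix of per-observation score
contributions, the covariance estimate is just (G'G)^-1.
Schematically, with a made-up G standing in for the real scores:
<hansl>
# schematic OPG covariance: G stands in for the n x k matrix
# of per-observation score contributions (made up here)
matrix G = mnormal(100, 3)
matrix V = invpd(G'G)   # OPG estimate of the covariance matrix
print V
</hansl>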
I then experimented a bit more with the Hessian and introduced some
parallelization there. 5000 dummies are still a problem, but with 500
dummies the full Newton/Hessian estimation goes through in about 5
seconds.
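If you'd like to try the full Newton run, it's just the rig above
with N cut to 500, run in a fresh session so the optimizer is back at
its default:
<hansl>
# variant of the rig above with 500 panel units, small enough
# for the full Newton/analytical-Hessian estimation
set seed 1234
N = 500
T = 4
NT = N*T
nulldata NT --preserve
setobs 4 1:1 --stacked-time-series
series x = normal()
series y = normal()
y = y < 1 ? 0 : y
genr unitdum
set stopwatch
tobit y du* x   # default Newton optimizer, Hessian-based covariance
printf "elapsed: %gs\n", $stopwatch
</hansl>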
As of now I haven't been able to replicate a crash; I'll dig at that
some more later.
Allin