Let me just add a footnote to this discussion. My claim, infra, is
more or less: "Don't try to second-guess gretl in respect of matrix
operations; just write the 'natural' code (e.g. X'X for X-transpose
times X) and chances are you'll get optimal performance."
I'm reasonably confident about this in general, but there may well
be some counter-examples. If you find one, please let us know and
we'll try to fix it.
As shown by Jack's timing comparison script (also infra), there's no
need to speculate about whether hansl code variant A is or is not
faster than hansl code variant B. Just wrap the calculation in a big
fat loop and apply $stopwatch to both versions -- the empirical
method.
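For what it's worth, here is a minimal sketch of that kind of comparison.
It is not Jack's script: the matrix dimensions, the iteration count, the
variable names and the choice of transp(X) * X as the "unnatural" variant
are all assumptions made purely for illustration.

```hansl
# Illustrative data only: a 2000 x 50 matrix of standard normals
matrix X = mnormal(2000, 50)
scalar reps = 1000    # assumed iteration count; adjust to taste

# Variant A: the "natural" code
set stopwatch         # start (or restart) the timer
loop reps --quiet
    matrix A = X'X
endloop
scalar tA = $stopwatch   # seconds elapsed since "set stopwatch"

# Variant B: explicit transpose followed by multiplication
set stopwatch
loop reps --quiet
    matrix B = transp(X) * X
endloop
scalar tB = $stopwatch

printf "variant A: %g seconds\n", tA
printf "variant B: %g seconds\n", tB
```

Timings will vary with the machine and the matrix size, so it is worth
running the script a few times before drawing conclusions.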
Allin