As you probably know, gretl supports matrix "left division" (A \ B)
and "right division" (A / B). I'll take left division as my example
here but everything applies to right division mutatis mutandis.
If A is square, A \ B gives the matrix X that solves AX = B, but if A
has more rows than columns it gives the least squares solution. So if
you just want the coefficients from an OLS regression you can do
betahat = X \ y
rather than using mols(), if you wish. The result has good accuracy;
it uses QR decomposition via lapack.
So far so good, but I recently noticed that it doesn't play very
nicely if A (m x n, m > n) happens to be of less than full rank: you
get an answer, but in general it's _not_ actually the least-squares
solution.
In git, therefore, I've now replaced the lapack function we were
calling before (dgels) with dgelsy, which uses column pivoting and
handles rank deficiency correctly.
Here's a little example that illustrates how things now work.
<hansl>
set verbose off
scalar T = 50
scalar k = 4
matrix X = ones(T,1) ~ mnormal(T,k)
matrix y = mnormal(T,1)
# regular matrix OLS
b = mols(y, X)
uh = y - X*b
SSR = uh'uh
s2 = SSR/(T-cols(X))
se = sqrt(diag(s2 * inv(X'X)))
printf "regular OLS (coeff, se):\n\n%12.6g\n", b~se
printf "SSR = %g, s2 = %g\n\n", SSR, s2
# add a perfectly collinear column and use
# left-division (QR with column pivoting)
X ~= ones(T,1)
b = X \ y
uh = y - X*b
SSR = uh'uh
s2 = SSR/(T-rank(X))
se = sqrt(diag(s2 * ginv(X'X)))
printf "Left-division, rank-deficient (coeff, se):\n\n%12.6g\n", b~se
printf "SSR = %g, s2 = %g\n\n", SSR, s2
</hansl>
The SSR will now be the same in the two cases: mols() with an X matrix
of full rank, and left division with a redundant column added to X. In
the latter case you get an extra coefficient, but X*b produces the
same result.
Allin
Show replies by date