On Sat, 12 Nov 2016, Sven Schreiber wrote:
On 11.11.2016 at 04:28, Allin Cottrell wrote:
> Announcement: Following some off-list discussion, there's now a new item
> in the right-click menu when you have two or more series selected in the
> main gretl window, namely a check on collinearity, in the form of the
> condition number of a matrix composed of those series. (You get the
> choice of including a constant in the matrix.)
Thanks for this feature.
Before it's released, let me just ask whether this output might instead be
added to the output of "correlation matrix" in the same context menu.
Correlation is also a measure of linear association, and so it would feel
like a natural place IMO.
This would save the extra entry in the context menu, and instead of asking
the user whether or not to include a constant term, both variants could just
be printed out simultaneously. (Currently there are only a few lines of output
in the window when one selects "correlation matrix".)
I'd prefer not to do that; more below.
While we're on the subject of the condition number and collinearity, I have a
question about the following example: Open the example data hall.gdt and
select the two variables "consrat" and "ewr"; the correlation output (again,
right-click and then select from the context menu) then shows a corr coeff of
just 0.16. The new collinearity analysis gives a whopping 634, an order of
magnitude greater than the rule-of-thumb value 50. The bkw.gfn package
confirms this value.
This strikes me as qualitatively very different, and spontaneously I'm not
sure why that is so. Any ideas?
The Pearson correlation coefficient is undefined if one or both of
the terms are constants. However, the Belsley condition number can
handle a constant, and presumably the big condition number in the
example you describe (but note, only when you include a constant) is
due to the fact that consrat itself is almost a constant. So these
are rather different calculations (although certainly related), and
I'd rather keep them distinct.
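
To make the difference concrete, here is a rough sketch in Python (not
gretl code, and not the hall.gdt data). It assumes the usual
Belsley-Kuh-Welsch convention of scaling each column to unit length
without centering; gretl's own implementation may differ in detail. The
series x and y and the function names are made up for illustration: x
is nearly constant and only weakly correlated with y. For two columns
the condition number has a closed form, so only the standard library is
needed.

```python
import math

def pearson(x, y):
    """Pearson correlation: uses deviations from the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cond_two_columns(u, v):
    """Condition number of the n x 2 matrix [u v] after scaling each
    column to unit length (no centering). The Gram matrix is then
    [[1, c], [c, 1]] with eigenvalues 1 + |c| and 1 - |c|, where c is
    the uncentered cosine between the columns."""
    c = sum(a * b for a, b in zip(u, v))
    c /= math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    c = abs(c)
    if c >= 1.0:
        return math.inf  # exactly collinear columns
    return math.sqrt((1.0 + c) / (1.0 - c))

# Hypothetical data: x hovers around 1.0, y varies freely.
x = [1.00, 1.01, 0.99, 1.02, 0.98, 1.00]
y = [0.5, -1.2, 0.3, 0.9, -0.4, 1.1]
const = [1.0] * len(x)

print("corr(x, y)       =", pearson(x, y))
print("cond([x, y])     =", cond_two_columns(x, y))
print("cond([const, x]) =", cond_two_columns(const, x))
```

With these numbers the correlation is small and the condition number of
[x, y] is close to 1, while that of [const, x] is far above the
rule-of-thumb 50. Adding y as a third column cannot lower that last
figure, since dropping a column from a matrix never raises its
condition number.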
Allin