On Sat, 7 Jan 2017, Artur Tarassow wrote:
> I've made a first stab at this (in git and snapshots). I'll briefly
> describe what's there, and pose a few questions for consideration.
Thank you for this, Allin. It's a nice feature I think.
> Command line: the "corr" command now has a --plot=whatever option (along
> the same lines as the "freq" command). You can use --plot=none to
> suppress a heatmap plot in interactive mode, --plot=display to show it
> on-screen, or --plot=somename.pdf, and so on.
That works quite nicely here on Ubuntu.
Good to hear.
However, it seems that using the "corr" command excludes the possibility
of plotting the correlation matrix when the data are held in a matrix
rather than in series, right? I think this is a bit unfortunate.
In this case, the user would still need to program it themselves or use an
additional external package, right?
In git I've added an option --matrix=matname for "corr" (as for
"freq" and "summary"), which should take care of this point.
Example:
<hansl>
open data4-10
matrix X = {dataset}
corr --matrix=X --plot=display
</hansl>
> For now I'm showing positive correlations in red, of varying intensity,
> and negative correlations in blue, with white in the middle (around
> zero). To cut down on visual clutter I've extended the white range to
> cover correlations that are not significantly different from zero at the
> 20% significance level.
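How wide that white band is depends on the sample size. The following is only a rough sketch, assuming the cutoff comes from the usual two-tailed t-test on a correlation coefficient with n - 2 degrees of freedom (the thread does not say which test is used); the sample size n = 50 is made up for illustration.

<hansl>
# Assumption: two-tailed t-test on r with n-2 degrees of freedom
scalar n = 50                         # hypothetical sample size
scalar tc = critical(t, n - 2, 0.10)  # right-tail 0.10 = two-tailed 20%
scalar rc = tc / sqrt(n - 2 + tc^2)   # smallest |r| that tests as significant
printf "n = %d: |r| below %.3f would be shown as white\n", n, rc
</hansl>

The expression for rc follows from solving t = r*sqrt(n-2)/sqrt(1-r^2) for r at the critical t value.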
Personally, I can live with the 20% sign. level ;-)
However, I would also favor a monochrome option for the heat plot as I had in
my little function. Especially when it comes to publication, one often needs
to provide black-white figures. So I would favor another option like
"--monochrome" or so.
OK, but do you have a suggestion for how to distinguish negative
from positive correlations in grayscale? (Simply running a scale
from white = -1 to black = +1 doesn't seem like it'd be very
intuitive.)
> By default, the whole matrix is shown but there's a --triangle option to
> show just the lower triangle.
Also a nice idea, and it works well here.
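For the record, a minimal sketch combining the two options described above, using the same sample data file as the earlier example. The output name corrmap.pdf is just a placeholder.

<hansl>
open data4-10
# lower triangle only, written to a PDF file instead of shown on screen
corr --triangle --plot=corrmap.pdf
</hansl>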
> Some questions:
>
> * Full matrix vs triangle: which should be the default? Is the choice
> worth having or should we just fix on one or the other?
I like to have both options. And if it doesn't impose any issues in gretl, I
don't see a reason not to consider these.
Alright, we'll leave that as it is.
> * Right now there's a minimum dimension of 3 for doing the heatmap:
> should the minimum be bigger than that? (3 x 3 looks kinda silly, but
> maybe it should be available anyway?)
Again, I think it's good if the user can decide on this.
> * The plot works reasonably well for up to about 30 series, but is going
> to get quite messy for more than that. Should we set a max? And/or,
> should we make a special effort to get the plot working acceptably for
> bigger dimensions? (Smaller font, not sure what else could be done.)
Mmmhh, 30 sounds quite large already. I can't imagine a case where you
would need an even larger dimension...
Fine, we can leave that be for now, and handle complaints if and
when they arise!
Allin