Hi Sven

Thanks for responding.

"new topic"

Oops, sorry. I'm now sending this as a new email. (Unless your previous email was also new).

Yes, I'm not exactly clear on what the results are showing.

1) I did a correlation matrix of the following variables: BirthRate, Agriculture, Service, GDPPerCap, Population, InfantMort. I did NOT click on "ensure uniform sample size".

The top of the results box says

Correlation Coefficients, using the observations 4 - 229
(missing values were skipped)
Two-tailed critical values for n = 221: 5% 0.1320, 1% 0.1729

A couple of results:

BirthRate and Agriculture = 0.7021

Population and GDPPerCap = -0.0687

2) Next, I did another correlation matrix of the same variables, but this time I DID click on "ensure uniform sample size".

The top of the results box says

     Correlation Coefficients, using the observations 4 - 229
     (missing values were skipped)
     Two-tailed critical values for n = 221: 5% 0.1320, 1% 0.1729

A couple of results

BirthRate and Agriculture = 0.7021.

Population and GDPPerCap = -0.0707

3) Next, I did a correlation matrix of JUST BirthRate and Agriculture. Here are the results:

corr(BirthRate, Agriculture) = 0.68261942
Under the null hypothesis of no correlation:
t(220) = 13.855, with two-tailed p-value 0.0000

This correlation differs from the 0.7021 reported in steps 1 and 2, but it agrees with the correlation between these two variables that I got using SAS.

The number of cases used to do this correlation (from an analysis with SAS) is 222.

4) Finally, a correlation of JUST Population and GDPPerCap. Here are the results:

corr(GDPPerCap, population) = -0.06858619
Under the null hypothesis of no correlation:
t(226) = -1.03351, with two-tailed p-value 0.3025

This result differs from step 2 (-0.0707) and very slightly from step 1 (-0.0687). However, the correlation of JUST Population and GDPPerCap matches the SAS result of -0.06859, and the p-value matches SAS as well.

Also, the number of cases used to do this correlation (from an analysis with SAS) is 228.
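
As a cross-check on the case counts in steps 3 and 4, the t statistic for a correlation r based on n paired observations is t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. Here is a minimal Python sketch of that formula (the function name t_stat is just my own label, not anything from gretl or SAS). Plugging in the SAS case counts of 222 and 228 should give values close to the t(220) = 13.855 and t(226) = -1.03351 that gretl printed, which suggests the two-variable runs use the same cases as SAS.

```python
import math

def t_stat(r, n):
    """t statistic for H0: no correlation, given sample correlation r
    computed from n paired observations (degrees of freedom = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# (r, n) pairs taken from steps 3 and 4: r from gretl, n from SAS
t_birth_agri = t_stat(0.68261942, 222)
t_pop_gdp = t_stat(-0.06858619, 228)

print(f"BirthRate / Agriculture: t(220) = {t_birth_agri:.3f}")
print(f"Population / GDPPerCap:  t(226) = {t_pop_gdp:.5f}")
```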

CONCLUSION: It looks like if I run a correlation on JUST the two variables of interest, I get the same results as from SAS.

Just to mention, my reference was a correlation matrix from SAS of all of the above variables in the same analysis, specifying pairwise correlation.
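
To make the distinction concrete, here is a small Python sketch of the two ways of handling missing values as I understand them. The data are made up for illustration (they are NOT my data set), and the helper names pearson and corr_with_deletion are my own; this is not how gretl or SAS implement it internally. With uniform=False, a pair of variables uses every row where both of them are present (pairwise). With uniform=True, a row is dropped if ANY variable in the data is missing (uniform sample).

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def corr_with_deletion(data, a, b, uniform):
    """Correlation of columns a and b; None marks a missing value.
    uniform=True  -> keep rows complete in ALL columns (uniform sample)
    uniform=False -> keep rows complete in a and b only (pairwise)"""
    check = list(data) if uniform else [a, b]
    keep = [i for i in range(len(data[a]))
            if all(data[v][i] is not None for v in check)]
    r = pearson([data[a][i] for i in keep], [data[b][i] for i in keep])
    return r, len(keep)

# Made-up toy data: z has two missing values, x and y have none
data = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "z": [1.0, None, 1.0, 1.0, None, 2.0],
}

r_pair, n_pair = corr_with_deletion(data, "x", "y", uniform=False)
r_unif, n_unif = corr_with_deletion(data, "x", "y", uniform=True)
print(f"pairwise: r = {r_pair:.4f} using {n_pair} cases")
print(f"uniform:  r = {r_unif:.4f} using {n_unif} cases")
```

In this toy example the x/y correlation changes when z is in the analysis under a uniform sample, even though z plays no part in the x/y correlation itself. That is the same kind of difference I would expect between a pairwise matrix and a uniform-sample matrix.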

Thanks

Gene

----- Forwarded Message -----

From: Sven Schreiber <sven.schreiber@fu-berlin.de>

To: "gretl-users@gretlml.univpm.it" <gretl-users@gretlml.univpm.it>

Sent: Saturday, June 1, 2024 at 10:29:36 PM EDT

Subject: [Gretl-users] Re: Gretl: correlations

On 30.05.2024 at 19:28, g s wrote:

> Hi all

This is a new topic. In the future, please don't just reply within an existing thread (even if you change the subject line), but send a fresh email message to the list.

> My data set has some missing values for some variables. When I do correlation, using the drop down menus (view and then correlation matrix), I notice that gretl reports a correlation matrix excluding all cases with any missing values for any of the variables.

I don't think that's correct. Notice that in the dialog window there's a tick box saying "ensure uniform sample". And the results differ depending on whether you tick the box or not.

However, at the top of the result printout there's the line "using obs x to y", which suggests that the same sample applies to all pairs, even if the tick box wasn't active. So that seems to be misleading, and I guess that printout could be improved (or perhaps omitted).

cheers

sven