On 26.05.2024 at 07:45, g s wrote:
Hi Artur
Thanks very much for pointing this out. I did the means by group and
it worked, and, good news, the results seem consistent with SAS and
with other stat software.
Question though.
The results come out like this:
AreaRegion = Australia - Oceania (n = 22):
Mean 1.5576e+006
Is it possible to get the values to come out as exact numbers rather
than something point something + e006?
No. As mentioned in the reference for the 'set' command with respect to
the setting 'display_digits', the number of significant digits in the
context of 'summary' is currently limited to five. I know that piece of
information is not straightforward to find, and I don't think that this
setting can be changed via the menus. (When you're just looking at the
raw values of the variable, then you can modify the number of
significant digits alright, though.)
I guess the background is that for formatting reasons we don't want to
exceed a field width of 11 characters -- 5 digits + 1 decimal separator
+ 2 characters "e+" + 3 digits afterwards.
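To illustrate the trade-off between significant digits and field width, here is a minimal hansl sketch using printf with C-style formats. The scalar name x is only a placeholder, and the value is the regional mean quoted further down in this message. The number of exponent digits printed by the first call can differ between platforms, so no output is shown.

```hansl
# Placeholder value: the regional mean discussed in this thread
scalar x = 1557590.272727

# Five significant digits: for a number this large, %g switches to exponent notation
printf "%.5g\n", x

# Twelve significant digits: enough here to avoid exponent notation
printf "%.12g\n", x

# Fixed-point notation with two decimals
printf "%.2f\n", x
```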
For example, in the region Australia - Oceania, the mean population is
1,557,590, but I can't tell if gretl is giving that exact result.
In the data set in question, the variable of interest is population,
so the number of digits can be quite large.
Exactly, there can be a lot of digits, so what would you do then? I
don't think there can be a general rule. For example, the mean
population for that region in your dataset is _not_ exactly 1,557,590,
but instead 1557590.272727 from what I'm seeing here. So the software
you were using for comparison apparently was set to report 7 significant
digits and/or to only display the integer part of the number. (It's
obviously quite rare to get an exact integer number when calculating an
average of many numbers.)
The general point is that with more or less continuous variables you
have to make a compromise most of the time.
Having said this, I would accept the point that a limitation to five
significant digits could be a problem sometimes.
Actually, another question, the summary statistics gives many
statistics. Is it possible to include sum as one of them? That is, in
addition to mean population by area/region, is it possible to get sum
of population by area/region?
In the console, it's easy, just type:
= aggregate(population, AreaRegion, sum)
I'm not aware of how to do that easily in the menus.
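If you want the group sums (or means) with more digits than 'summary' shows, one possible approach is to store the result of aggregate() in a matrix and print it yourself. This is only a sketch: it assumes your dataset is already open, that the series are named population and AreaRegion as in your mail, and the matrix names s and m are made up. As far as I know printf applies a numeric format to every element of a matrix argument, and the result of aggregate() holds the group value, the group count and the statistic in its columns.

```hansl
# Assumes the dataset containing population and AreaRegion is already open

# Sum of population by region, stored in a matrix instead of just displayed
matrix s = aggregate(population, AreaRegion, sum)

# Print every element in fixed-point notation without decimals
printf "%16.0f\n", s

# Same idea for the group means, printed with six decimals
matrix m = aggregate(population, AreaRegion, mean)
printf "%16.6f\n", m
```

The regions appear as numeric codes in the first column; the order should match the one you get from the means by group.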
cheers
sven