On Sat, 3 Jun 2017, Filipe Rodrigues da Costa wrote:
Dear All,
I'm trying to use the functions sd(x) and mean(x) on a list instead
of a series. From what I understood, these functions ignore NAs when
used on a series, but when used on a list they return NA if any
value is missing.
Yes, that's right as a description of what gretl does, and moreover I
think it's the right thing to do. It's very standard to want
statistics for a given series ignoring any NAs in the series. But if
you want a series containing the "cross-sectional" mean of a list of
series, then in general you don't want to skip NAs. I would argue that
in that case you would in fact be calculating a different statistic at
observations where there are any NAs.
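To make the distinction concrete, here is a minimal sketch using the same data file and series names as the example further down. The variable names m and mL are just illustrative.

<hansl>
open data4-10
INCOME[10] = NA
# series argument: a scalar statistic, computed over the non-missing values
scalar m = mean(INCOME)
printf "mean of INCOME, skipping the NA: %g\n", m
# list argument: a series of cross-sectional means, one per observation
list L = ENROLL CATHOL INCOME
series mL = mean(L)
# observation 10 of mL should be NA, since INCOME is missing there
print mL -o
</hansl>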
I need to compute the cross-sectional mean and standard deviation and I
want NAs to be ignored, as they are in a series. Is there a simple way
of doing this?
There isn't a "built-in" way, but one could write a hansl function to
do it. Here's a simple example.
<hansl>
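function series my_list_mean (list L)
    # start from the built-in mean, which is NA wherever any member is NA
    series ret = mean(L)
    series chk = missing(ret)
    if nelem(L) > 1 && sum(chk) > 0
        loop t=$t1..$t2 -q
            if chk[t]
                # recompute using only the non-missing members at obs t
                n_ok = 0
                tot = 0
                loop i=1..nelem(L) -q
                    if !missing(L[i][t])
                        tot += L[i][t]
                        n_ok++
                    endif
                endloop
                # if every member is missing, ret[t] stays NA
                if n_ok > 0
                    ret[t] = tot / n_ok
                endif
            endif
        endloop
    endif
    return ret
end function

open data4-10
list L = ENROLL CATHOL INCOME
# create a missing value to exercise the function
INCOME[10] = NA
series mL = my_list_mean(L)
print L mL -o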
</hansl>
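The standard deviation can be handled along the same lines. What follows is only a sketch: the name my_list_sd is hypothetical, and it assumes you want the sample standard deviation (divisor n - 1) of the values available at each observation, so the result is NA wherever fewer than two members of the list are non-missing.

<hansl>
function series my_list_sd (list L)
    # hypothetical companion to my_list_mean
    # assumption: sample sd, divisor (n_ok - 1)
    series ret = NA
    loop t=$t1..$t2 -q
        # first pass: count and sum the non-missing values at obs t
        n_ok = 0
        tot = 0
        loop i=1..nelem(L) -q
            if !missing(L[i][t])
                tot += L[i][t]
                n_ok++
            endif
        endloop
        if n_ok > 1
            # second pass: sum of squared deviations from the mean
            m = tot / n_ok
            ssq = 0
            loop i=1..nelem(L) -q
                if !missing(L[i][t])
                    ssq += (L[i][t] - m)^2
                endif
            endloop
            ret[t] = sqrt(ssq / (n_ok - 1))
        endif
    endloop
    return ret
end function

# usage, continuing from the script above
series sL = my_list_sd(L)
print L sL -o
</hansl>

The two-pass calculation is used here because it is a little safer numerically than working from the sum of squares. If you would rather divide by n, change the last line of the inner calculation accordingly.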
Allin Cottrell