Hi Giulia,
I'm not sure the following script is the shortest route, but I think it
works. It assumes that your dataset contains a variable identifying the
companies (named 'company' here), coded 1 to 1577, with 5 years per company.
Best,
Artur
# create a list with all variables, x1 to x20, to check for any missing values
list xlist = x1
loop foreach i x2..x20
list xlist = xlist $i
endloop
# create a series, as a dummy to mark 'valid' observations; it takes 1 by default
series validity=1
# check for NA: test_empty=0 if there's at least one missing value throughout 'xlist'
series test_empty = ok(xlist)
# loop through each company
loop i=1..1577
    smpl company==$i --restrict --replace
    # count the complete observations (years) for this company
    scalar test_sc = sum(test_empty)
    # fewer than 5 means at least one year has a missing value,
    # so set validity=0 for all the years of the respective company
    if test_sc < 5
        series validity = 0
    endif
endloop
# finally restrict sample to 'valid' observations only
smpl validity==1 --restrict --replace
# and compute summary statistics - mean & median
summary xlist
# compute the sum for each variable
matrix m_x = {xlist}
matrix m_sum = sumc(m_x)
smpl --full
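
A possibly shorter variant, offered only as a sketch: if the dataset is
already declared as a panel (an assumption, e.g. via setobs), the
per-company loop can probably be replaced by the panel function pmin(),
which returns the minimum of a series within each cross-sectional unit.
This relies on pmin() being available in your gretl version.

```hansl
# assumption: the dataset has a panel structure (units = companies)
list xlist = x1
loop foreach i x2..x20
    list xlist = xlist $i
endloop
# 1 where all variables in xlist are present, 0 otherwise
series test_empty = ok(xlist)
# per-company minimum: 0 if any year of that company has a missing value
series validity = pmin(test_empty)
# keep only companies that are complete in every year
smpl validity==1 --restrict --replace
# mean, median and other summary statistics
summary xlist
# column sums, one per variable
matrix m_sum = sumc({xlist})
smpl --full
```

This does not need the 'company' variable or the hard-coded 1577 and 5,
since the panel structure itself identifies the companies.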
2015-12-16 9:24 GMT+01:00 Giulia Taveggia <taveggia(a)csilmilano.com>:
Dear all,
I have a panel dataset consisting of 1577 companies, 5 years and 20
variables. I would like to know how to restrict the dataset to exclude
missing values (e.g. if a company has a missing value in one year, it should
be excluded for all other years too). After this operation, I would like to
compute the mean, the median and the sum of each variable. Could you
please write down the right script for these actions?
Thank you very much,
Best Regards
Giulia
*Giulia Taveggia*
Researcher, Country Analysis and Forecasts Unit
CSIL Centre for Industrial Studies
Corso Monforte 15 - 20122 Milano - Italy Tel +39 02780497- Fax +39 02
780703