Hi Giulia,

I'm not sure the following script is the best shortcut, but I think it works. It assumes that your dataset contains a variable identifying the companies, named 'company' here.

Best,
Artur

# create a list with all variables, x1 to x20, to check for any missing values
list xlist = x1
loop foreach i x2..x20
    list xlist = xlist $i
endloop

# create a series, as a dummy to mark 'valid' observations; it takes 1 by default
series validity = 1

# check for NA: test_empty=0 if there's at least one missing value throughout 'xlist'
series test_empty = ok(xlist)

# loop through each company
loop i=1..1577
    smpl company==$i --restrict --replace
    # count the complete years for this company (5 means no missing values)
    scalar test_sc = sum(test_empty)
    # if fewer than 5, set validity=0 for all the years of the respective company
    if test_sc < 5
        series validity = 0
    endif
endloop

# finally restrict sample to 'valid' observations only
smpl validity==1 --restrict --replace

# and compute summary statistics - mean & median
summary xlist

# compute the sum for each variable
matrix m_x = {xlist}
matrix m_sum = sumc(m_x)

smpl full

2015-12-16 9:24 GMT+01:00 Giulia Taveggia <taveggia@csilmilano.com>:

Dear all,
I am working with a panel dataset consisting of 1577 companies, 5 years and 20 variables. I would like to know how to restrict the dataset so as to exclude missing values (e.g. if a company has a missing value for one year, it should be excluded for all the other years as well). After doing this, I would like to compute the mean, the median and the sum for each variable. Could you please write down the right script for these actions?
Thank you very much,
Best Regards
Giulia
Giulia Taveggia
Researcher, Country Analysis and Forecasts Unit
CSIL Centre for Industrial Studies
Corso Monforte 15 - 20122 Milano - Italy Tel +39 02780497- Fax +39 02 780703
Gretl-users mailing list
Gretl-users@lists.wfu.edu
http://lists.wfu.edu/mailman/listinfo/gretl-users