Hi Giulia,

I'm not sure the following script is the shortest way to do it, but I think it works. It assumes that your dataset contains a variable identifying the companies, named 'company' here.

Best,

Artur

# create a list with all variables, x1 to x20, to check for any missing values

list xlist = x1

loop foreach i x2..x20

list xlist = xlist $i

endloop

# create a series, as a dummy to mark 'valid' observations; it takes 1 by default

series validity=1

# check for NA: test_empty is 0 at any observation where at least one series in 'xlist' is missing, 1 otherwise

series test_empty = ok(xlist)

# loop through each company

loop i=1..1577

smpl company==$i --restrict --replace

# count the years with complete data for this company

scalar test_sc = sum(test_empty)

# fewer than 5 complete years means at least one missing value: set validity=0 for all the years of that company

if test_sc < 5

series validity=0

endif

endloop

# finally restrict sample to 'valid' observations only

smpl validity==1 --restrict --replace

# and compute summary statistics - mean & median

summary xlist

# compute the sum for each variable

matrix m_x = {xlist}

matrix m_sum = sumc(m_x)

smpl full
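
A loop-free variant may also be possible. The sketch below is untested against your data and rests on several assumptions: the dataset is already declared as a panel, 'xlist' has been built as in the script above, and your gretl version provides the panel function pmin() and the matrix function quantile(). The series names 'allok' and 'complete' and the matrix name 'm_med' are made up for illustration.

```hansl
# assumes: panel structure is set, and xlist holds x1 to x20 as above

# 1 where every series in xlist is non-missing at that observation, else 0
series allok = ok(xlist)

# per-company minimum of allok over the years:
# it drops to 0 as soon as one year has a missing value
series complete = pmin(allok)

# keep only the companies with no missing values in any year
smpl complete==1 --restrict --replace

# mean and median (among other statistics) for each variable
summary xlist

# column sums and medians in matrix form, one column per variable
matrix m_x = {xlist}
matrix m_sum = sumc(m_x)
matrix m_med = quantile(m_x, 0.5)

smpl full
```

This avoids the hard-coded number of companies (1577) and years (5), since pmin() works unit by unit on whatever panel is loaded.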

2015-12-16 9:24 GMT+01:00 Giulia Taveggia <taveggia@csilmilano.com>:

Dear all,

I am working with a panel dataset consisting of 1577 companies, 5 years and 20 variables. I would like to know how to restrict the dataset so as to exclude missing values (e.g. if a company has a missing value for one year, it should be excluded for all the other years as well). After doing this, I would like to compute the mean, the median and the sum for each variable. Could you please write down the right script for these actions?

Thank you very much,

Best Regards

Giulia

Giulia Taveggia

Researcher, Country Analysis and Forecasts Unit

CSIL Centre for Industrial Studies

Corso Monforte 15 - 20122 Milano - Italy Tel +39 02780497- Fax +39 02 780703
