On Wed, 30 Jul 2008, Gordon Hughes wrote:
> I would like to pose some questions about the best way of dealing with
> missing values when using mle. I had assumed, wrongly, that mle simply
> applies case-wise deletion to cases with missing values. Instead it
> terminates with a more or less informative error message. On the other
> hand, built-in estimation commands seem to handle missing values
> internally.
Yes. The mle command deliberately flags an error when it encounters NAs,
because that makes it simple to do "the right thing" automatically when
parameters jump out of their boundaries (negative variances, etc.).
Currently, it's the user's responsibility to handle "proper" NAs in mle
blocks.
> A. Is this the right thing to do? Deleting cases with missing values in
> calculating the log-likelihood, etc. is not difficult, but there may be
> models where missing values cause problems if not dealt with properly, e.g.
> any lag structure. Would it be possible to provide an option (--delmiss)
> for case-wise deletion of missing values?
In many cases, a very handy technique is using the ok() function with a
list argument, as in
<script>
list X = y x1 x2
smpl ok(X) --restrict
</script>
which restricts the sample to the observations for which none of the
series in X is missing.
You also have another way to achieve the same result, which is arguably
more byzantine but perhaps more transparent in cases when what you really
need is a linear combination of some variables for the valid observations
only:
<script>
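nulldata 20
# x and y each have roughly 20% of their values missing
x = (uniform()>0.8) ? NA : normal()
list X = const x
b = {1;1}
y = (uniform()>0.8) ? NA : normal()
# e is NA wherever y or x is NA
e = y - lincomb(X, b)
print y x e --byobs
# keep only the observations for which e could be computed
smpl ok(e) --restrict
print y x e --byobs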
</script>
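Either way, once the sample has been restricted the mle block only sees
complete observations. Here is a minimal sketch of how this might look for
a plain linear model with normal errors. The data are artificial, and the
names b0, b1 and s are just ones I made up for the illustration:
<script>
nulldata 200
set seed 1234
series x = normal()
series y = 1 + 0.5*x + normal()
# knock out roughly 10% of the values of y and of x
y = (uniform() > 0.9) ? NA : y
x = (uniform() > 0.9) ? NA : x
# drop the incomplete observations before estimating
list X = y x
smpl ok(X) --restrict
# starting values (illustrative)
scalar b0 = 0
scalar b1 = 0
scalar s = 1
# any NA in logl now signals a bad parameter value (e.g. s <= 0),
# not a hole in the data
mle logl = -0.5*log(2*$pi) - log(s) - 0.5*(e/s)^2
    series e = y - b0 - b1*x
    params b0 b1 s
end mle
</script>
With a lag structure you would build the lagged series first and include
them in the list passed to ok(), so that the observations lost to the lags
are dropped too.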
> B. In writing a function it is not difficult to test for missing values,
> but then what should the function do? Here the question is what are the
> consequences of adjusting the sample to exclude cases with missing values
> within the function? Would this affect the sample used in the program
> that calls the function? If so, one would really need a "smpl --restore"
> command to set the sample back to its state on entry to the function.
No need to worry. The smpl command in functions is local: the caller's
sample is back in force as soon as the function returns. Example:
<script>
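function void pos_only (series x)
    # this restriction applies only inside the function
    smpl (x>0) --restrict
    summary x
end function

nulldata 100
y = normal()
pos_only(y)
# the full sample of 100 observations is in force again here
summary y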
</script>
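The same holds when the restriction inside the function is there to get
rid of missing values, which is your case. A small sketch, where
mean_valid is just a made-up name for the illustration; the second printf
should report the full 100 observations again:
<script>
function scalar mean_valid (series x)
    # local restriction: drop the missing observations of x
    smpl ok(x) --restrict
    printf "inside the function:  %d observations\n", $nobs
    return mean(x)
end function

nulldata 100
series y = (uniform() > 0.8) ? NA : normal()
scalar m = mean_valid(y)
# back in the caller, the sample is as it was before the call
printf "outside the function: %d observations\n", $nobs
</script>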
Riccardo (Jack) Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche
r.lucchetti(a)univpm.it
http://www.econ.univpm.it/lucchetti