On Tue, 13 Apr 2010, Allin Cottrell wrote:
> That behavior is by design and IMO it's not a bug, since zero
> times any value is zero. (This is the one exception we make to the
> rule that NAs always propagate to the result, and it's explained
> in section 5.7 of the User's Guide, "Handling missing values".)

FWIW, let me back Allin on this one. The idea that NA*0 should return 0
troubled me in the past, until I realised that setting something to NA is
often a shortcut we use when we really mean 'I don't want this observation
in my sample'. In fact, NA simply stands for 'I don't know what this cell
contains', and when you think about it this way, then NA*0==0 makes
perfect sense.
Example: suppose you have wage data for individuals who may be unemployed.
For those cases, you'd obviously have NA. Now suppose you want to
"interact" wage with a gender dummy (suppose 1==male, 0==female). Clearly,
if you multiply wage by gender (w_male = wage * gender) you'd get
w_male==NA for unemployed men and w_male==0 for unemployed women. This is
counter-intuitive only if you expect to be able to screen out unemployed
people by checking ok(w_male). The "proper" way to go in this case would
be to define w_male as

    w_male = ok(wage) ? wage * gender : NA

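To make the difference between the two definitions concrete, here is a small toy model of the rule in Python. It is only a sketch: it is not gretl code and not how gretl implements anything. The names na_mul and people, and the wage figures, are made up for illustration; None stands in for NA.

```python
# Toy model of the "NA * 0 == 0" rule discussed in this thread.
# NOT gretl's implementation; all names and data here are hypothetical.
NA = None  # stand-in for a missing value


def na_mul(a, b):
    """Multiply a and b; NA propagates unless the other factor is exactly 0."""
    if a == 0 or b == 0:
        return 0          # the single exception: zero times anything is zero
    if a is NA or b is NA:
        return NA         # otherwise a missing input gives a missing result
    return a * b


def ok(x):
    """True if x is not missing (mimics the role of ok() in the example)."""
    return x is not NA


# (wage, gender) pairs, with gender 1 = male, 0 = female.
# The last two people are unemployed, so their wage is NA.
people = [(1200, 1), (900, 0), (NA, 1), (NA, 0)]

# Plain interaction: w_male = wage * gender
naive = [na_mul(w, g) for w, g in people]

# Guarded version: w_male = ok(wage) ? wage * gender : NA
guarded = [na_mul(w, g) if ok(w) else NA for w, g in people]

print(naive)    # [1200, 0, None, 0]
print(guarded)  # [1200, 0, None, None]
```

The only row that differs is the unemployed woman: the plain product gives 0, while the guarded version keeps her as NA, so a check on ok(w_male) screens out all unemployed people.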
Riccardo (Jack) Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche
r.lucchetti(a)univpm.it
http://www.econ.univpm.it/lucchetti