I've looked further into how R handles NA/NaN/Inf (for
floating-point data). It has its own logic, but I'm not sure it's
very intuitive, or something we'd want to emulate.
Yes, R does have distinct internal representations for NA and NaN.
But in some ways NaNs are treated as a subset of NAs, while in
other ways NAs are treated as if they were NaNs. If you do
x <- 0/0 # or
x <- log(-1)
you get a value that gives TRUE for both is.nan(x) and is.na(x).
If you do
x <- NA
you get a value that answers TRUE to is.na(x) but FALSE to
is.nan(x).
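For reference, here is a minimal sketch of the two cases side by side. The variable names x_nan and x_na are just for illustration, and NA_real_ is used in place of plain NA so that both values are explicitly floating-point.

```r
# A NaN produced by an invalid arithmetic operation
x_nan <- 0/0
# A floating-point NA (plain NA is of logical type until coerced)
x_na <- NA_real_

# NaN counts as "missing" and as "not a number"
is.na(x_nan)   # TRUE
is.nan(x_nan)  # TRUE

# NA counts as "missing" but not as NaN
is.na(x_na)    # TRUE
is.nan(x_na)   # FALSE
```

So is.na() is the broader test, and is.nan() picks out the NaN subset.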
NaNs are treated like NAs in that they are automatically skipped
by default when running a linear regression via lm(). Infinities
are not treated the same way. That is, if you have data vectors x
and y and you set one of the x-values to log(-1), you get a
warning,
In log(-1) : NaNs produced
but the relevant observation is skipped by lm(), as in gretl. If
you define an x-value to log(0) (producing -Inf) you get no
warning but the regression fails with
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in foreign function call (arg 1)
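A small sketch that should reproduce both cases, assuming the default na.action (na.omit) is in effect. The data values and the names x_nan, x_inf, fit and res are made up for illustration, and the exact wording of the error message may differ between R versions, so the sketch only records that an error occurs.

```r
# Made-up data: six observations
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2)

# Case 1: one x-value is NaN; lm() drops that observation
x_nan <- x
x_nan[3] <- suppressWarnings(log(-1))  # NaN (warning silenced here)
fit <- lm(y ~ x_nan)
nobs(fit)  # number of observations actually used in the fit

# Case 2: one x-value is -Inf; lm() stops with an error
x_inf <- x
x_inf[3] <- log(0)  # -Inf, produced without a warning
res <- tryCatch(lm(y ~ x_inf),
                error = function(e) conditionMessage(e))
res  # the error message as a string
```

With the default settings, nobs(fit) reports one observation fewer than the length of y, while res holds the error text rather than a fitted model.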
And (as we already knew) NAs are treated as NaNs in that 0*NA =
NA.
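To illustrate that last point, a short sketch; the names z and w are only for illustration. Whether the result of such arithmetic is reported as NA or NaN is, as far as I can tell from the R documentation, not strictly guaranteed across platforms, so the check below uses is.na(), which is TRUE in either case.

```r
# Multiplying a missing value by zero does not give zero
z <- 0 * NA
is.na(z)  # TRUE: the result is still missing

# The same holds for a NaN operand
w <- 0 * (0/0)
is.na(w)  # TRUE
```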
Allin