RNA

Friday, 16 April 2010

I've looked further into how R handles NA/NaN/Inf (for
floating-point data). It has its own logic, but I'm not sure it's
very intuitive or something we'd want to emulate.

Yes, R does have distinct internal representations for NA and NaN.
But in some ways NaNs are treated as a subset of NAs, while in
other ways NAs are treated as if they were NaNs. If you do

x <- 0/0 # or
x <- log(-1)

you get a value that gives TRUE for both is.nan(x) and is.na(x).
If you do

x <- NA

you get a value that answers TRUE to is.na(x) but FALSE to
is.nan(x).
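
To make the asymmetry concrete, here is a minimal sketch that runs both checks side by side. It uses NA_real_ (the floating-point NA) rather than the plain logical NA, which is my choice for illustration, and the variable names are arbitrary.

```r
# NaN produced by an invalid floating-point operation
x_nan <- 0/0
# NA_real_ is R's floating-point missing value
x_na <- NA_real_

# Apply each predicate to both values
is_na  <- c(nan = is.na(x_nan),  na = is.na(x_na))
is_nan <- c(nan = is.nan(x_nan), na = is.nan(x_na))

print(is_na)   # TRUE for both values
print(is_nan)  # TRUE for the NaN, FALSE for the NA
```

So is.na() cannot tell the two apart; only is.nan() distinguishes them.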

NaNs are treated like NAs in that they are automatically skipped
by default when running a linear regression via lm(). Infinities
are not treated the same way. That is, if you have data vectors x
and y and you set one of the x-values to log(-1), you get a
warning,

In log(-1) : NaNs produced

but the relevant observation is skipped by lm(), as in gretl. If
you set an x-value to log(0) (producing -Inf), you get no
warning, but the regression fails with

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in foreign function call (arg 1)
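
The two cases can be reproduced with something like the following sketch. The data are made up, the variable names are hypothetical, and it assumes the default na.action (na.omit). The exact wording of the error message may differ between R versions, so the sketch only captures it rather than matching it.

```r
# Made-up data, purely for illustration
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2)

# Case 1: one x-value is NaN; lm() drops that observation
x_nan <- x
x_nan[3] <- NaN
fit <- lm(y ~ x_nan)
n_used <- nobs(fit)  # number of observations actually used
print(n_used)

# Case 2: one x-value is log(0) = -Inf; lm() stops with an error
x_inf <- x
x_inf[3] <- log(0)
msg <- tryCatch(
  {
    lm(y ~ x_inf)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

With six observations and one NaN, the first fit should use five observations, while the second call should end in the error handler.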

And (as we already knew) NAs are treated as NaNs in that 0*NA =
NA.
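
A quick way to check this last point, again using NA_real_ as the floating-point NA (my choice for the illustration):

```r
# Multiplying by zero does not "rescue" a missing value
prod_na  <- 0 * NA_real_
prod_nan <- 0 * NaN

print(is.na(prod_na))   # the product with NA is still missing
print(is.na(prod_nan))  # and so is the product with NaN
```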

Allin
