NA*0? What about dummy variables?

Sunday, 18 April 2010

Date: Fri, 16 Apr 2010 18:14:08 -0400 (EDT)
From: Allin Cottrell <cottrell(a)wfu.edu>
Subject: Re: [Gretl-devel] NA and nan: next steps
To: Gretl development <gretl-devel(a)lists.wfu.edu>
Message-ID: <Pine.A41.4.58.1004161812050.2326900(a)f1n11.sp2net.wfu.edu>
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Sat, 17 Apr 2010, Sven Schreiber wrote:

...
 Allin Cottrell wrote:
 > On Wed, 14 Apr 2010, Gordon Hughes wrote:
 >
 >> Can I raise a dissenting voice?  Do you REALLY want to expend the
 >> effort to distinguishing between NA and NaN in every single procedure
 >> and (presumably) every function, etc?  It would be even worse if you
 >> added +/-Inf.  My reaction is that there are better ways to spend
 >> time in developing the program.
 >
 > I have no desire to spend a lot of time in this area. I suspect
 > there's an intractable problem here, which has to be resolved by
 > fiat. In principle, NA and NaN are different things, which is
 > particularly apparent in the case of evaluating 0*NA versus 0*NaN.
 >
 > The statistical programs that we've had reports on to date on this
 > list resolve the issue by treating NAs as if they were NaNs;

 I don't mean to suggest any implication for gretl's development here,
 but it seems to me that this statement is not correct as regards Octave
 and R; at least from what quick googling revealed to me, since I'm not
 an expert in either of those packages. Both Octave (/Matlab) and R seem
 to distinguish NA and NaN (and I guess even +-Inf) AFAICS. 
->OK, they may make _some_ distinction, but if they evaluate 0*NA as
->NA (as we've heard) then they are not doing it right.

->Allin

Except if 0 is a dummy variable.  In that case 0*NA = NA: in ordinal or
categorical data 0 is not really zero; it is an arbitrary label that just
indicates a category.  Otherwise, recoding the dummy variable changes the
effective data set and leads to unexpected results.  If 0 is really zero in
the cardinal sense, then gretl handles this correctly and R/Octave do not.
If 0 is a category label, gretl gets it wrong and Octave/R get it right.
One solution would be to declare dummy variables as factors (like Jeff
Racine does in his semiparametric models) so that the handling of the
interactions could be right in both instances.
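
The recoding argument above can be illustrated with a small Python sketch.  It is not gretl, R or Octave code: the NA marker, the two multiplication rules and the helper usable_rows are all hypothetical names made up for this example.  It only shows that under a "0*NA = 0" rule, flipping which category is coded 0 changes which rows of an interaction term survive, while under an "NA propagates" rule the surviving rows stay the same.

```python
# Hypothetical missing-value marker; an assumption for this sketch only.
NA = None


def times_zero_wins(d, x):
    """Cardinal rule: 0 is a true zero, so 0 * NA = 0."""
    if d == 0:
        return 0.0
    if x is NA:
        return NA
    return d * x


def times_na_propagates(d, x):
    """Propagation rule: any product involving NA is NA."""
    if d is NA or x is NA:
        return NA
    return d * x


def usable_rows(rule, dummy, x):
    """Indices of rows where the interaction dummy*x is not missing."""
    return [i for i, (d, v) in enumerate(zip(dummy, x)) if rule(d, v) is not NA]


x = [1.5, NA, 2.0, NA]            # regressor with two missing values
dummy = [1, 0, 0, 1]              # one coding of a two-category variable
recoded = [1 - d for d in dummy]  # same categories, labels swapped

# Under "0*NA = 0" the effective sample depends on the arbitrary coding.
print(usable_rows(times_zero_wins, dummy, x))        # [0, 1, 2]
print(usable_rows(times_zero_wins, recoded, x))      # [0, 2, 3]

# Under "NA propagates" the effective sample is the same for both codings.
print(usable_rows(times_na_propagates, dummy, x))    # [0, 2]
print(usable_rows(times_na_propagates, recoded, x))  # [0, 2]
```

Declaring the variable as a factor would amount to choosing the propagation rule for category labels while keeping the "0*NA = 0" rule for genuinely numeric zeros.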

I like what gretl does for cardinal data.  But I'm not sure this is what I
would want for ordinal/categorical data.

Lee
