On Sun, 27 Sep 2015, Allin Cottrell wrote:
Note that whether [uninitialized] values are taken as "numeric" or
not will in general depend on the C library in use. But either way
they're wrong and have to be changed. _If_ you can get such values
into gretl as numeric, you could fix them via something like:

    foo = (abs(foo) > 0 && abs(foo) < 1.0e-100) ? NA : foo

where "foo" is the name of the series to be fixed and we're
assuming that non-zero observations with absolute value less than
10^{-100} are garbage. This is not very reliable, however, as it's
_possible_ that some uninitialized doubles happen to fall in the
"normal" range and so escape correction.
After a little testing, let me rephrase that: it's more than
"possible", it's highly probable.
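
To make the weakness concrete, here is the same heuristic as a small C predicate. This is only a sketch: the function name is made up for illustration, and the 1.0e-100 threshold is simply the one from the gretl line quoted above.

```c
/* Same test as the gretl one-liner: flag a value as garbage if it is
   non-zero with absolute value below 1.0e-100.
   The name looks_like_garbage is hypothetical. */
static int looks_like_garbage(double x)
{
    double a = (x < 0) ? -x : x;   /* absolute value, no libm needed */

    return a > 0 && a < 1.0e-100;
}
```

A tiny value such as 1.0e-200 is flagged, but a garbage value that happens to look like 3.7e45 is not, and nothing in the number itself distinguishes it from real data.
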
I wrote a little test C program which created an array of 2048
"doubles", uninitialized. For each such value I printed it into a
string variable using sprintf() with the "%g" conversion then tried
reading it back into a double using strtod(). I counted the cases
where strtod() set errno to ERANGE: 271 out of 2048. So in this
case, at least, the great majority of garbage values (1777, or about
87 percent) appeared to be "fine": properly numeric and not subnormal.
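
The round trip described above can be sketched roughly as follows. This is not the original test program: the function names are invented for illustration, snprintf() stands in for sprintf() to bound the buffer, and the caller supplies the array rather than reading uninitialized memory (which is not something a portable example should do).

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Print x with "%g", read the text back with strtod(), and report
   whether strtod() set errno to ERANGE. Hypothetical helper. */
static int roundtrip_raises_erange(double x)
{
    char buf[64];

    snprintf(buf, sizeof buf, "%g", x);
    errno = 0;                      /* strtod() never clears errno itself */
    (void) strtod(buf, NULL);
    return errno == ERANGE;
}

/* Count the values in vals[0..n-1] that trigger ERANGE on the round trip. */
static size_t count_erange(const double *vals, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (roundtrip_raises_erange(vals[i])) {
            count++;
        }
    }
    return count;
}
```

Whether strtod() sets ERANGE on underflow is implementation-defined in the C standard, so the count for tiny values may differ from one C library to another. Values in the normal range pass through without any error, which is exactly the problem.
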
So here's a big WARNING: on no account should one let uninitialized
values get printed into a file for use in econometric analysis.
There's no half-way reliable method for clearing them out.
[Just as a footnote: a "subnormal" number (also known as
"denormalized") is a non-zero value too close to zero to be
represented as a C "double" at the usual precision; with IEEE 754
doubles that means a magnitude below about 2.2e-308. And there's
absolutely no guarantee that the random bits in an uninitialized
double will correspond to a subnormal number.]
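
For the curious, the footnote can be illustrated at the bit level. The sketch below assumes that "double" is an 8-byte IEEE 754 binary64 value, which is the usual case but not guaranteed by the C standard; the function names are made up. In portable code, fpclassify() from <math.h> gives the same answer via FP_SUBNORMAL.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bytes of a double into a 64-bit integer.
   Assumes double is IEEE 754 binary64 (8 bytes). */
static uint64_t bits_of(double x)
{
    uint64_t bits;

    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* A binary64 value is subnormal when its 11-bit exponent field is
   all zeros and its 52-bit fraction field is non-zero. */
static int is_subnormal(double x)
{
    uint64_t bits = bits_of(x);
    uint64_t exponent = (bits >> 52) & 0x7FF;
    uint64_t fraction = bits & 0xFFFFFFFFFFFFFULL;

    return exponent == 0 && fraction != 0;
}
```

Only one of the 2048 possible exponent patterns is the all-zero one, so a double made of uniformly random bits would be subnormal only about once in 2048 tries. Uninitialized memory is not uniformly random, but the point stands: most bit patterns are perfectly ordinary-looking numbers.
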
Allin Cottrell