[Gretl-users] warning: gdt-reading bug

Friday, 16 October 2015

We've just noticed that a bug was introduced into our code for reading 
native gretl .gdt data files in August of this year. The bug should be 
triggered only rarely, but we thought it wise to issue a warning.

Description of bug: If a gdt file contains "subnormal" values (that 
is, floating point values that are too close to zero to be represented 
with the usual precision), then when such a file is read on Linux, the 
first subnormal value to be found on a given row (observation) will be 
incorrectly copied into the remaining columns (series) on that row.

Example: A gdt file containing 10 series has a subnormal for series 
number 5 on row 25. Then when the file is read on Linux, that 
subnormal will replace the correct values for series 6 to 10 for 
observation 25.
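
To make the symptom concrete, here is a small Python model of the effect described above. It is only an illustration and is not gretl's reader code: the helper names is_subnormal and simulate_bad_read are hypothetical, and the test assumes IEEE 754 double precision.

```python
import sys

def is_subnormal(x):
    """True if x is nonzero but smaller in magnitude than the
    smallest normal double (sys.float_info.min)."""
    return 0.0 < abs(x) < sys.float_info.min

def simulate_bad_read(row):
    """Model of the described symptom (hypothetical helper): the first
    subnormal found in a row overwrites every later value in that row."""
    out = list(row)
    for i, x in enumerate(out):
        if is_subnormal(x):
            out[i + 1:] = [x] * (len(out) - i - 1)
            break
    return out

# Ten series on one observation; series 5 holds a subnormal value.
row = [1.5, 2.5, 3.5, 4.5, 1e-310, 6.5, 7.5, 8.5, 9.5, 10.5]
bad = simulate_bad_read(row)
print(bad)
```

In this model series 1 to 4 are untouched, while series 6 to 10 all end up holding the subnormal from series 5.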

Comment: This won't affect the reading of "primary" data (actual 
micro- or macroeconomic measurements), which will never contain 
subnormal values (for double-precision data we're talking about 
nonzero absolute values below roughly 2.2e-308). And the bug is not 
triggered on MS Windows. However, subnormal values may be produced by 
some data transformations (such as squaring very small numbers, or 
computing the normal CDF of very big negative values).
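
As a rough illustration of how such transformations can land in the subnormal range, the following Python sketch squares a very small number and evaluates the normal CDF far in the left tail. The helper names are hypothetical, the inputs 1e-160 and -38 are arbitrary choices, and the results assume IEEE 754 doubles with gradual underflow; none of this is gretl code.

```python
import math
import sys

def is_subnormal(x):
    """True if x is nonzero but below the smallest normal double."""
    return 0.0 < abs(x) < sys.float_info.min

def normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

tiny = 1e-160             # an ordinary (normal) double
squared = tiny * tiny     # about 1e-320, below the normal range
tail = normal_cdf(-38.0)  # far left tail, expected to be extremely small

print(squared, is_subnormal(squared))
print(tail, is_subnormal(tail))
```

The input 1e-160 is itself a perfectly normal value; it is only the result of the transformation that falls below the normal range.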

Fix: This is now fixed in the git source for gretl and also the 
current snapshots. And we will put out a new release soon, gretl 
2015d.

Diagnostic: If you think a dataset may suffer from this problem,
you can run the script checkdata.inp, from

http://ricardo.ecn.wfu.edu/pub/gretl/checkdata.inp

First load the dataset in question. Then open checkdata.inp and run 
it. An affected dataset may produce something like this:

<script-output>
Total number of values examined: 164122

Check for subnormal floating-point values
-----------------------------------------
Total number found: 138
Longest (row) sequence: 138
  (occurs at obs 210, starting series ID 461)
Number of sequences (of length >= 2): 1
</script-output>

The symptom of a problem is that we find a consecutive sequence of 
subnormal values on one or more rows of the dataset. This could occur 
for "natural" reasons but it may indicate corruption. Isolated 
subnormals don't indicate the bug. And again, most datasets should 
contain no subnormal values.
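
For readers who want to see the idea behind this check, here is a minimal Python sketch that looks for consecutive runs of subnormals within each row. It is not the checkdata.inp script itself, whose contents are not reproduced here; the function scan_rows, its return keys, and the sample data are all hypothetical.

```python
import sys

def is_subnormal(x):
    """True if x is nonzero but below the smallest normal double."""
    return 0.0 < abs(x) < sys.float_info.min

def scan_rows(rows):
    """Count subnormals and find row-wise runs of consecutive subnormals.
    Indices in the result are 0-based (row, starting column)."""
    total = 0
    longest = 0
    longest_at = None
    n_runs = 0  # runs of length >= 2
    for r, row in enumerate(rows):
        run = 0
        # A trailing 0.0 sentinel closes any run that reaches the row end.
        for c, x in enumerate(list(row) + [0.0]):
            if is_subnormal(x):
                total += 1
                run += 1
            else:
                if run >= 2:
                    n_runs += 1
                if run > longest:
                    longest = run
                    longest_at = (r, c - run)
                run = 0
    return {"total": total, "longest": longest,
            "longest_at": longest_at, "runs": n_runs}

data = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 1e-310, 1e-310, 1e-310],  # suspicious: a run to the row end
    [5e-320, 2.0, 3.0, 4.0],        # isolated subnormal, not a symptom
]
report = scan_rows(data)
print(report)
```

A run of identical subnormals that extends to the end of a row, as in the second row here, is the pattern most consistent with the bug; the isolated value in the third row is not.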

Allin Cottrell
