I commented on Jack's Card and Krueger dataset for gretl:
And also a non-trivial piece of work, I see. With the original data
taking the form of an old-school fixed-format file plus codebook, and
an erroneous "unique ID" to be spotted and fixed. Kudos.
Follow-up: The original C&K numerical data are in a plain text file
called public.dat, with no variable names and with "." indicating NA.
It struck me that gretl ought to be able to read such a file, but it
couldn't, on two counts (see below). I've now fixed things in git so
that it works.
Problem #1: Given the ".dat" suffix, gretl was expecting a JMulTi data
file, and on that interpretation the read failed right away. But
".dat" is pretty generic, so now we don't jump to conclusions; we run
a quick test of the file's content and if it doesn't validate as
JMulTi we treat it as generic plain text ("CSV" in an extended sense).
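To illustrate the "check the content, don't trust the suffix" idea,
here is a minimal sketch in Python. It is not gretl's actual C code.
The names choose_reader and toy_strict_check are hypothetical, and the
"#TOYFORMAT" marker is an invented stand-in for whatever a real format
test would look at; it says nothing about the real JMulTi layout.

```python
from typing import Callable


def toy_strict_check(text: str) -> bool:
    """Hypothetical stand-in for a strict format validator.

    NOT the real JMulTi rules: it only asks whether the first
    non-blank line starts with an invented marker.
    """
    for line in text.splitlines():
        if line.strip():
            return line.lstrip().startswith("#TOYFORMAT")
    return False


def choose_reader(text: str, strict_check: Callable[[str], bool]) -> str:
    """Pick a reader from the file's content, not from its suffix.

    Use the specialized reader only if the content validates;
    otherwise fall back to the generic plain-text ("CSV") reader.
    """
    return "specialized" if strict_check(text) else "plain-text"
```

The point of the design is that a failed validation is not an error:
it just routes the file to the more permissive reader.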
Problem #2: Gretl scans for variable names on the first line of a
"CSV" file. It's not going to find any in public.dat, and in itself
that's OK, but the scanner was being freaked out by the "." cells,
which seemed to be neither numeric nor interpretable as variable
names. The update here is that on the varname scan we treat plain "."
as indicating NA, which can be taken as numeric. So now the scan
completes successfully (not finding any names), and gretl adds generic
variable names "v1", "v2" and so on. (On the actual data read, as
opposed to the varname scan, we were already treating "." as NA.)
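The logic of the varname scan can be sketched roughly as follows, again
in Python rather than gretl's own C. The function names are
hypothetical, whitespace-delimited cells are assumed, and the rule
"all cells numeric means no header" is a simplification for
illustration, not a description of gretl's exact test.

```python
def is_numeric_cell(cell: str) -> bool:
    """True if the cell is a number or a plain "." (taken as NA)."""
    if cell == ".":
        return True  # NA marker counts as numeric for the scan
    try:
        float(cell)  # note: Python also accepts "nan" and "inf" here
        return True
    except ValueError:
        return False


def scan_header(first_line: str):
    """Return (names, has_header) for the first line of a data file.

    If every cell looks numeric, the line is data, so generic names
    v1, v2, ... are generated. Otherwise the cells are taken as names.
    """
    cells = first_line.split()  # assumption: whitespace-delimited
    if all(is_numeric_cell(c) for c in cells):
        names = [f"v{i}" for i in range(1, len(cells) + 1)]
        return names, False
    return cells, True
```

With this rule a first line such as "46 1 0 . 3.5" is classified as
data and gets the names v1 to v5, while "id chain wage" is taken as a
header.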
Allin