There are a couple of modifications in CVS and the Windows
snapshot, based on recent discussions.
1) Non-numeric data. Up till now we've treated a given data
column as string-coded only if the first observation is
non-numeric. Now we're more generous, and treat any column that
contains non-numeric values as a coding, subject to the following
qualification (designed to catch genuine data errors):
* If there's only one non-numeric value in a given column, or if
the non-numeric values amount to less than 1 percent of the total
non-missing values, we give up on the coding and flag an error.
The user can override this qualification for specific named
columns using "set codevars <list of names>" or can override it
globally by adding the "--coded" flag to the "open" command.
Also, as suggested by Jack, we automatically flag variables
treated in this way as discrete.
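
To make the rule in (1) concrete, here is a rough sketch in Python. It is only an illustration of the logic described above, not gretl's actual import code. The function name, the return labels and the set of strings counted as missing are all made up for the example, and I'm assuming "only one non-numeric value" means a single non-numeric cell. The force_coded argument stands in for either of the two overrides ("set codevars" or "--coded").

```python
def classify_column(values, force_coded=False, missing=("", "NA")):
    """Classify one column of raw strings as 'numeric', 'coded' or 'error'.

    Hypothetical helper: the names and the missing-value markers are
    assumptions made for illustration, not part of gretl.
    """
    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    # Only non-missing cells count towards the 1 percent threshold
    present = [v.strip() for v in values if v.strip() not in missing]
    n_bad = sum(1 for v in present if not is_number(v))

    if n_bad == 0:
        return "numeric"
    if force_coded:
        # The user has said the column really is a coding
        return "coded"
    if n_bad == 1 or n_bad < 0.01 * len(present):
        # Too few non-numeric cells: more likely a data error
        return "error"
    return "coded"
```

So a column of 300 numbers plus two stray strings would be flagged as an error unless the user forces it to be treated as coded.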
2) BLS-type data files with "five quarters" or "13 months": gretl
will now read at least some such files correctly, disregarding the
extra lines. However, my feeling is that the BLS is playing silly
buggers with this sort of file and that somebody should file a bug
report with them.
If a file of this type is not recognized, then besides the nice
and easy "grep -v" method suggested by Jack, it's also easy to
clean up such files using gretl's data manipulation tools. For
example, for a "five-quarter" data file where the data start in
1950Q1:
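
  open nonsense.csv
  genr index
  # keep rows 1-4 of each group of five, dropping every fifth row
  smpl (index % 5 > 0) --restrict
  # declare what remains as quarterly data starting in 1950Q1
  setobs 4 1950:1
  store sensible.gdt
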
Allin