On Fri, 4 Dec 2015, Sven Schreiber wrote:
> Maybe it has something to do with a) string-valued series, b) having
> hyphens and Umlauts in the values.
> Further investigation: hyphens are not a problem, umlauts are.
Strictly speaking this is an invalid dta file. The specification for
dta 117 says clearly, "Strfs use ASCII encoding". (That's strfs =
fixed-length strings as values of a variable, as in the Bundesland
variable in this dataset.) However, Stata itself in practice handles
non-ASCII string values encoded in Windows codepage 1252. I've now
revised our dta importer to do that. That's in git and snapshots.
So now "Bundesland" should display correctly.
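
As a rough illustration of that fallback, here is a minimal Python
sketch. It is not the actual C code in the gretl importer; the function
name decode_strf is hypothetical, and it assumes that a fixed-length
string field is NUL-padded and that any non-ASCII bytes are meant as
Windows-1252.

```python
def decode_strf(raw: bytes) -> str:
    """Decode one fixed-length string field (strf) read from a dta file.

    Illustrative sketch only: the name and the padding assumption are
    not taken from gretl's source.
    """
    # Assumption: the field is NUL-padded to its fixed width, so keep
    # only the bytes before the first NUL.
    raw = raw.split(b"\x00", 1)[0]
    try:
        # What the dta 117 specification calls for.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # Fallback: treat the bytes as Windows codepage 1252. Bytes that
        # are undefined in cp1252 become U+FFFD instead of raising.
        return raw.decode("cp1252", errors="replace")


if __name__ == "__main__":
    # 0xFC is u-umlaut in cp1252.
    print(decode_strf(b"Baden-W\xfcrttemberg\x00\x00\x00"))
```

Trying ASCII first keeps spec-conforming files on the strict path; the
cp1252 fallback only applies to fields that would otherwise fail.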
Although it's clear now where the wonky formatting was coming from,
I'm still not sure what provoked the crash on (only 32-bit?)
Windows. This may be a word-length thing unrelated to the bad
strings; it would be nice to track it down!
Allin