On Tue, 2 Feb 2016, Alan G Isaac wrote:
Once again, a problem for data analysis arises when
someone insists that a **COMMA** separated values
file should be allowed to use commas for some other purpose.
Stick to the RFC and these problems go away:
https://tools.ietf.org/html/rfc4180
I understand why it was done, but I think the decision
to handle non-conforming files that claim to be CSV
was more politically correct than scientifically correct.
For CSV data exchange to be internationally feasible,
we need to stick to an international standard.
You have a strong figure in your corner -- at least against the
decimal comma for anything other than display purposes -- namely
Jack Lucchetti, but I'm afraid I'm a wretched compromiser.
The label "CSV" has become detached from its literal meaning, and
now designates pretty much any column-oriented "delimited text" data
file, where the column delimiter may be comma, tab, space or
semicolon and the decimal separator may be dot or comma.
As a coder, I find the problem of interpreting such files quite an
interesting one, up to a point. The cases of decimal-dot plus comma
delimiter and decimal-comma plus semicolon delimiter are quite
easily figured out. Things get a lot more complicated when
thousands-separators are introduced: it is totally crazy to put such
decorations in files that are supposed to be computer-readable. But
there's a lot of it out there, so we try our best to come up with an
algorithm that will extract the signal from the noise.
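
To illustrate the two "easy" cases mentioned above, here is a minimal sketch in Python. It is not the algorithm of any particular program: the function name guess_format and the two regular expressions are hypothetical, and it assumes a single header line followed by purely numeric data rows. It tries each (delimiter, decimal) pair in turn and accepts the first one under which every row has the same number of fields (more than one) and every field parses as a number.

```python
import csv
import re

# Hypothetical patterns for illustration: a plain number written with a
# decimal dot or a decimal comma, with no thousands separators allowed.
NUM_DOT = re.compile(r'^[+-]?(\d+\.?\d*|\.\d+)$')
NUM_COMMA = re.compile(r'^[+-]?(\d+,?\d*|,\d+)$')


def guess_format(text):
    """Return (delimiter, decimal) for the two easy cases, else None.

    Assumes one header line followed by all-numeric data rows.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    candidates = [(',', '.'), (';', ',')]
    for delim, dec in candidates:
        pattern = NUM_DOT if dec == '.' else NUM_COMMA
        # Skip the header line and split the data rows on this delimiter.
        rows = list(csv.reader(lines[1:], delimiter=delim))
        widths = {len(r) for r in rows}
        # Require a consistent column count greater than one.
        if len(widths) != 1 or widths == {1}:
            continue
        # Every cell must look like a number under this decimal convention.
        if all(pattern.match(cell.strip()) for r in rows for cell in r):
            return delim, dec
    return None
```

For example, guess_format("x;y\n1,5;2,25\n3,0;4,75\n") yields (';', ','), while a file containing a value such as 1.234,5 yields None, because the sketch deliberately refuses thousands separators instead of guessing. A file like "a,b\n1,5\n3,0\n" is genuinely ambiguous (two integer columns, or one decimal-comma column); the sketch simply resolves it in favour of the comma delimiter.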
Allin