Re: [Gretl-devel] gretl's CSV reader and categorical data

Tuesday, 25 July 2017

On 23.07.2017 at 21:44, Allin Cottrell wrote:

...
 The problem: some time ago we decided to ease the task of parsing "CSV"
 by deleting quotation marks from each line of input. (We can and do
 recognize string-valued input, but only by determining that it cannot be
 parsed as numeric.)  Quotation is sometimes used inconsistently and
 arbitrarily in "CSV" files
I am absolutely no CSV fundamentalist (like people who don't accept 
semicolons or tabs as column separators), but could you remind us why 
we have to cope with CSV files that use quotation inconsistently? 
My first reaction would be that such files are really the problem of their creators.

...
 So, I've been working on a revision of our CSV reader in which we
 "respect" quotation in this sense: we do not delete quotation marks in
 CSV input, and if it turns out that all the values in a given column are
 quoted integers, we take that column to be an encoding of a categorical
 variable.
Except if they're years, I hope... No, seriously, doesn't this mess with 
a lot of variables that take only integer values but that we usually 
treat as quasi-continuous?
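
To make the concern concrete, here is a rough Python sketch of the rule as I read it from the description above. It is not gretl's actual reader code: the function name `quoted_integer_columns`, the sample data, and the naive field splitting (no delimiters inside quoted fields) are all assumptions made for illustration only.

```python
import re

# Matches an optionally signed run of ASCII digits, nothing else.
_INT_RE = re.compile(r"[+-]?[0-9]+")


def quoted_integer_columns(text, delimiter=","):
    """Return the names of columns in which every data value is a quoted integer.

    Hypothetical illustration of the proposed rule, not gretl's implementation.
    Assumes a header row and no delimiter characters inside quoted fields.
    """
    lines = text.strip().splitlines()
    header = [h.strip().strip('"') for h in lines[0].split(delimiter)]
    rows = lines[1:]
    if not rows:
        return []  # no data, so nothing can be classified

    all_quoted_ints = [True] * len(header)
    for line in rows:
        fields = line.split(delimiter)
        for i, raw in enumerate(fields[:len(header)]):
            raw = raw.strip()
            quoted = len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"'
            # A single unquoted or non-integer value disqualifies the column.
            if not (quoted and _INT_RE.fullmatch(raw[1:-1])):
                all_quoted_ints[i] = False
    return [name for name, flag in zip(header, all_quoted_ints) if flag]


sample = (
    'region,year,income\n'
    '"1","2015",30500\n'
    '"2","2016",41250.5\n'
    '"1","2017",38000\n'
)
flagged = quoted_integer_columns(sample)
```

With this sample, both `region` and `year` end up flagged as categorical, because the exporting program happened to quote the years too. That is exactly the case I am worried about.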

thanks,
sven
