[Gretl-devel] Re: mread possible enhancement

Sunday, 19 May 2019

On Sun, 19 May 2019, Sven Schreiber wrote:

...
 On 19.05.2019 at 19:13, Riccardo (Jack) Lucchetti wrote:
>> 
>> Hmm, interesting idea. I think this could be made to work quite
>> nicely. Internally, nothing prevents us from creating a new, temporary
>> "hidden" dataset (then turning it into a matrix) without disturbing
>> the existing dataset or absence of dataset.
> 
> This would be very nice of course, but in that case I would imagine the
> job would be less straightforward than it seems, because of the
> intrinsic differences between the eventual aims.

 Given that we have a function for reading a matrix from a file (mread) I
 think the natural aim should be to extend that function eventually to
 read from csv. Either with a new option or perhaps simply by recognizing
 a ".csv" file extension.
 (I'm speaking purely from a user's point of view here.)
 But if that isn't feasible in the short term, maybe a transitory
 function in "extra" could indeed be the solution.

A few points on this.

1) Jack's csv2mat is an outstanding example of accomplishing a lot 
with just a few lines of hansl. Of course this is not in the least 
unusual from Jack, but for the rest of us it's noteworthy all the 
same!

2) I take Jack's point that the "no error" criterion for reading a 
dataset from CSV (which we already do) is more restrictive than that 
for reading a matrix from CSV -- where we don't have to care about 
valid variable names, nor about handling non-numeric values, which 
we can just map onto "NA" without further ado.

3) Nonetheless, I find that it's not too difficult to handle the 
issues under point 2 in the context of our current CSV importation 
code. In current git, you can try out reading CSV into a matrix via 
mread() when the filename (or URL) has a ".csv" extension. Two 
comments on that: (a) "CSV" really just means delimited text: the 
delimiter doesn't have to be a comma; and (b) if we want to pursue 
this option we could admit some other filename extensions.

4) One point supported by Jack's hansl code that is not supported by 
our built-in CSV importer is malformed CSV (e.g. some lines have 
more fields than others). I don't think we'd want to support this in 
our C code -- and actually I kinda wonder about the wisdom of 
supporting it at all.
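
To make points 2 to 4 concrete, here is a rough sketch of the 
"matrix" reading rules. It is written in Python purely for 
illustration and is not gretl's C importer or Jack's csv2mat; the 
names read_matrix and to_number are hypothetical. It assumes the 
header row is skipped without checking that the names are valid, 
that any non-numeric field maps to NaN (standing in for "NA"), that 
the delimiter is configurable, and that rows with differing field 
counts are rejected rather than repaired.

```python
import csv
import io
import math


def to_number(field):
    """Convert one field to float; anything non-numeric becomes NaN ("NA")."""
    try:
        return float(field.strip())
    except ValueError:
        return math.nan


def read_matrix(text, delimiter=",", has_header=True):
    """Parse delimited text into a list of rows of floats.

    Header names are discarded without validation, since a matrix
    does not need valid variable names. Rows whose field count
    differs from the first data row raise ValueError.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if has_header and rows:
        rows = rows[1:]
    matrix = []
    width = None
    for rownum, fields in enumerate(rows, start=1):
        if not fields:
            continue  # skip completely blank lines
        if width is None:
            width = len(fields)
        elif len(fields) != width:
            raise ValueError(
                f"data row {rownum}: expected {width} fields, got {len(fields)}"
            )
        matrix.append([to_number(f) for f in fields])
    return matrix


# "bad name" would not be a valid variable name, and "n/a" is not numeric
sample = "bad name,x\n1,2.5\nn/a,4\n"
print(read_matrix(sample))  # [[1.0, 2.5], [nan, 4.0]]
```

Raising an error on ragged rows follows the preference stated in 
point 4; a more forgiving reader could pad short rows with NaN 
instead.
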

I'm attaching a sample script that derives from Jack's original 
upthread. It requires, and compares results with, Jack's 
csv2mat.inp.

Allin
