Re: [Gretl-devel] grabbing Alfred

Sunday, 25 August 2013

On 24.08.2013 20:15, Sven Schreiber wrote:
...
 On 23.08.2013 11:21, Sven Schreiber wrote:

>
> Here's my take at doing it in hansl (untested), but let's not forget
> that the goal is (IMHO) to make the preprocessing unnecessary
> altogether, by enabling 'join' to do smaller/greater comparisons on ISO
> date strings!
>

 What follows is now an actually working version, tested with the real-world
 1MB file of INDPRO. However, it is very slow, much slower than my
 Python solution, it seems to me. I don't know whether there are some gretl
 string internals that could be sped up.

Specifically, the preprocessing, including all calling overhead, takes
less than 2 seconds with the Python solution and roughly 120 seconds with
native gretl.

I think the crucial lines are the following:

...
 loop repetitions	# loop over the lines in file
   sscanf(rest,"%s\t%s\t%s\t%s\n",col1,col2,col3,col4)
   string rest = strstr(rest,"\n") + 1 	# offset to drop the leading \n

That is, there are thousands of operations working on strings holding
(almost) the entire file content (in this case about 1MB as I said). I
have tried to consolidate this into a more clever sscanf line, but that
didn't really help. Glad to take more ideas.
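
To illustrate why this pattern may be costly, here is a small Python sketch
(not gretl's internals, and not my actual preprocessing script). It cuts the
first line off the remaining text on every iteration and counts how many
characters get copied along the way. The function names and the fixed-width
sample data are made up for illustration; the real INDPRO file looks different.

```python
def parse_by_chopping(text):
    """Parse tab-separated lines by repeatedly cutting the first line off `text`.

    Returns (rows, chars_copied), where chars_copied counts the characters
    copied each time the remainder string is rebuilt.
    """
    rows = []
    chars_copied = 0
    rest = text
    while rest:
        newline = rest.find("\n")
        if newline == -1:               # last line without a trailing newline
            line, rest = rest, ""
        else:
            line = rest[:newline]
            rest = rest[newline + 1:]   # copies everything that is left
            chars_copied += len(rest)
        rows.append(line.split("\t"))
    return rows, chars_copied


def make_sample(n_lines):
    # Hypothetical four-column layout with fixed-width lines.
    return "".join(
        f"2013-01-01\t{i:06d}\t2013-01-01\t2013-08-24\n" for i in range(n_lines)
    )


for n in (1000, 2000):
    rows, copied = parse_by_chopping(make_sample(n))
    print(n, len(rows), copied)
```

With n lines of equal length, the copied volume grows with n*(n-1)/2, so
doubling the file size roughly quadruples the copying work. Whether gretl's
string handling behaves the same way is only my guess.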

In contrast, in Python the file is read line by line. So I don't know
if it's worth it (bearing in mind that actually we don't want any
preprocessing...), but perhaps it would help if the readfile() function
could be extended to automatically (= at the C level) separate the lines
of the file; for example:

<future hansl>
bundle btemp = readfile(fname,1) # new optional 2nd arg to split lines
loop i=1..nelem(btemp) # extended nelem() for number of bundle items
  string line = btemp.line$i
  ... do stuff with line ...
endloop
</future hansl>

Again, not sure if it's worth it.
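
For comparison, line-by-line reading in Python looks roughly like the
following. This is a minimal sketch, not my actual script: the function name is
hypothetical, the four-column layout simply mirrors the sscanf call above, and
an in-memory stream stands in for the real file.

```python
import io


def parse_line_by_line(stream):
    """Yield the four tab-separated fields of each non-empty line."""
    for line in stream:                 # the stream hands out one line at a time
        line = line.rstrip("\n")
        if not line:
            continue                    # skip blank lines
        col1, col2, col3, col4 = line.split("\t")
        yield col1, col2, col3, col4


# Stand-in for open(fname); the content is made up for illustration.
sample = io.StringIO("a\tb\tc\td\n1\t2\t3\t4\n")
rows = list(parse_line_by_line(sample))
print(rows)
```

Each line is touched once here, and the rest of the file is never copied.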

thanks,
sven
