Re: Gretl-devel Digest, Vol 176, Issue 6
by Ignacio Diaz-Emparanza
Hi,
This file was created with LibreOffice 6.4.7.2 (on Ubuntu 20.04). I
now see that a file properly saved by Excel does not have this
problem. In any case, thank you for solving this error.
Ignacio
On 16/9/21 at 0:00, gretl-devel-request(a)gretlml.univpm.it wrote:
> Subject:
> [Gretl-devel] Re: bug in xls importer?
> From:
> Allin Cottrell <cottrell(a)wfu.edu>
> Date:
> 15/9/21 23:28
>
> To:
> Gretl development <gretl-devel(a)gretlml.univpm.it>
>
>
> On Tue, 14 Sep 2021, Allin Cottrell wrote:
>
>> One more observation: if I open Ignacio's Salarios.xls in gnumeric,
>> then save it as xls, gretl can read both the numeric and the
>> string-valued columns OK.
>>
>> I'm therefore reinforced in my belief that this is an OpenOffice bug,
>> or maybe some inscrutable xls quirk.
>
> Ah, but... we were led astray by one element of the xls file, but once
> we figured out that was a false trail we failed fully to recompute.
> Now we do so, meaning that in current git (and in the next release)
> gretl will manage to import the string-valued series in Ignacio's
> Salarios.xls successfully.
>
> So thanks, Ignacio, for posting what eventually turned out to be an
> instructive example.
>
> Allin
--
Ignacio Díaz-Emparanza
Departamento de Métodos Cuantitativos
Universidad del País Vasco - Euskalherriko Unibertsitatea, UPV/EHU
Tfno: (+34) 94 601 3732
bug in xls importer?
by Ignacio Diaz-Emparanza
Hi,
I am seeing a problem when importing an xls file with qualitative
variables. The problem does not occur when importing xlsx or ods files.
Please try the attached file.
(This is on Ubuntu Linux 20.04 with gretl built today. The xls file was
created by LibreOffice.)
--
Ignacio Díaz-Emparanza
Departamento de Métodos Cuantitativos
Universidad del País Vasco - Euskalherriko Unibertsitatea, UPV/EHU
Tfno: (+34) 94 601 3732
Inefficiency in join command?
by atecon
Hi all,
I currently have to work with a large panel dataset (left-hand side), to
which I would like to join a couple of series from a RHS dataset. The
correct mapping is done via two keys.
I ran some performance checks, and it seems that the current
implementation runs the sorting/mapping separately for each series
joined, even though a single sorting/mapping pass should be sufficient
(if I am not wrong).
In a first experiment I joined all series from the RHS dataset by means
of the wildcard operator:
<join "@NAME_RHS_DATA" * --ikey=datedim,unitdim>
which takes about 5 seconds here.
Then I re-ran the experiment, successively increasing the number of
series to join:
<hansl>
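loop i=1..nelem(RHS_SERIES_NAMES)
    printf "\nInfo: Start joining %d series.\n", $i
    flush
    # Join only the first $i series named in RHS_SERIES_NAMES
    strings tojoin = RHS_SERIES_NAMES[1:$i]
    set stopwatch
    join "@NAME_RHS_DATA" tojoin --ikey=datedim,unitdim
    printf "\nInfo: Joining took %.2f sec.\n", $stopwatch
    flush
    # Drop the series just joined (everything not in the list Base)
    # so that the next pass starts from the same dataset
    list New = dataset - Base
    delete New --force
endloop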
</hansl>
The output is as follows:
<output>
Info: Joining all series took 4.91 sec.
Info: Start joining 1 series.
Info: Joining took 1.91 sec.
Info: Start joining 2 series.
Info: Joining took 2.88 sec.
Info: Start joining 3 series.
Info: Joining took 3.88 sec.
Info: Start joining 4 series.
Info: Joining took 4.84 sec.
Script done
</output>
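As a rough reading of these timings, the four incremental runs can be fitted with a straight line, time = a + b*n, where n is the number of series joined. The sketch below is only illustrative: the matrix names t, n, X and b are introduced here, the numbers are copied from the output above, and four observations are too few to say anything firm about gretl's internals.
<hansl>
# Timings in seconds, copied from the output above
matrix t = {1.91; 2.88; 3.88; 4.84}
# Number of series joined in each run
matrix n = seq(1, 4)'
# Regress time on a constant and the number of series
matrix X = ones(4, 1) ~ n
matrix b = mols(t, X)
printf "fixed cost: %.2f sec, cost per series: %.2f sec\n", b[1], b[2]
</hansl>
With these numbers the fit gives roughly a = 0.93 and b = 0.98, that is, about one extra second per additional series. This would be consistent with the key matching being repeated for each series rather than done once.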
Do you agree that the sorting or mapping overhead can in principle be
reduced when joining multiple series at once?
Thanks,
Artur