Re: [Gretl-users] Speed of "join"

Saturday, 14 September 2013

On 13.09.2013 16:23, Riccardo (Jack) Lucchetti wrote:
...
 On Fri, 13 Sep 2013, Artur T. wrote:

> Hi all,
>
> I am currently working on a rather large cross-sectional data set.
> Overall I have data comprising around half a million individuals, about
> 225000 households and 26 countries.
>
> I tried to merge the different datasets after first dropping irrelevant
> variables. But merging households (identified by HID) and
> countries (country) with "join" really takes a lot of time. Actually,
> joining one variable takes 25 minutes on my Linux machine (2.6GHz). If I
> use Stata it may take a minute or so.
>
> Why does it take that long here? I am surprised because typically gretl
> operates pretty fast.

 I find this very surprising. "join" is, normally, quite fast. Which
 options are you using?

The command I use is quite standard, I guess:
<hansl>
join "@WD/hfile.csv" hhgr --ikey=country,hid
</hansl>
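
To see how long a single join really takes, it can help to time it in isolation. The lines below are only a minimal sketch: they assume the inner dataset is already open, that WD is a string variable holding the data directory (the path shown is a placeholder, not taken from the post), and that the key columns country and hid have the same names in both files.

<hansl>
# placeholder directory, an assumption for illustration only
string WD = "/path/to/data"
# reset gretl's timer just before the join
set stopwatch
join "@WD/hfile.csv" hhgr --ikey=country,hid
# $stopwatch gives the seconds elapsed since the timer was reset
printf "join took %g seconds\n", $stopwatch
</hansl>

Timing one variable this way makes it easier to compare runs with different key specifications or file sizes.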
...

> Also, is there a way to merge all cross-sectional variables from the
> "outside" dataset with the "inner" one by a single command? At the
> moment one has to specify a join command for each variable separately,
> right? I am just asking out of curiosity as I am fine with the way it is
> currently implemented.

 No, that's by design. However, you can use a foreach loop as in

 <hansl>
 loop foreach i foo bar baz
     join outer.csv $i <... your options...>
 end loop
 </hansl>

That's also the way I've used it.
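
For anyone who wants to try the loop approach on something small first, here is a self-contained sketch with toy data. All file names, series names and values (outer.csv, foo, bar and the numbers) are made up for illustration; the only part taken from the thread is the idea of looping "join" over the variable names with country and hid as keys.

<hansl>
# toy "outer" file: one row per household (values are invented)
nulldata 3
series country = 1
series hid = index
series foo = 10 * index
series bar = 100 * index
store outer.csv country hid foo bar

# toy "inner" dataset: two individuals per household
nulldata 6
series country = 1
series hid = ceil(index / 2)

# one join per variable, matching on both keys
loop foreach i foo bar
    join outer.csv $i --ikey=country,hid
endloop

print country hid foo bar --byobs
</hansl>

Each pass through the loop re-reads outer.csv, so with many variables and a big file the total time grows with the number of variables being joined.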
...

 -------------------------------------------------------
   Riccardo (Jack) Lucchetti
   Dipartimento di Scienze Economiche e Sociali (DiSES)

   Università Politecnica delle Marche
   (formerly known as Università di Ancona)

   r.lucchetti(a)univpm.it
   http://www2.econ.univpm.it/servizi/hpp/lucchetti
 -------------------------------------------------------

 _______________________________________________
 Gretl-users mailing list
 Gretl-users(a)lists.wfu.edu
 http://lists.wfu.edu/mailman/listinfo/gretl-users
  Artur
