Hi all,
I am currently working on a rather large cross-sectional data set.
Overall, the data comprise around half a million individuals, about
225,000 households and 26 countries.
I tried to merge the different datasets after first dropping the
irrelevant variables. But merging by household (identified by HID) and
country (country) using "join" takes a very long time: joining a single
variable takes 25 minutes on my Linux machine (2.6GHz). In Stata the
same operation takes a minute or so.
Why does it take that long here? I am surprised because typically gretl
operates pretty fast.
Also, is there a way to merge all cross-sectional variables from the
"outer" dataset into the "inner" one with a single command? At the
moment one has to specify a join command for each variable separately,
right? I am just asking out of curiosity as I am fine with the way it is
currently implemented.
Artur