On Tue, 19 Sep 2017, Schaff, Frederik wrote:
I am trying to load a bunch of data into gretl, with a structure
similar to the attached files (times 1600), i.e. I have horizontal
and vertical appending at the same time. I basically use a script
that opens the first file (like parameters_1.tsv) and then appends
all the other files in a loop. However, the horizontal appending
works only for the first set of files (those belonging to *_1.tsv),
and even there not completely. After that, each file just adds a
single entry with "missing" values. There is a common key in the
first column, so I thought the "update-overlap" option should
work [...]
Hold it there! The simple "append" mechanism has no way of defining
(or guessing at) a "key": for that you need the "join" command. As it
is, your "ABMAT_ConfigID" is treated as just a regular variable.
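As a rough illustration of the key-based route, importing a single
series with "join" might look like the sketch below. The series name
is borrowed from the "print" line in the script further down; which
of the stat files actually holds it is an assumption on my part.
<hansl>
open parameters_1.tsv --quiet
# match rows on the common key; the source file for this series is an assumption
join stat_0-499_1.tsv MA_Utility_MAE_uq_I9 --ikey=ABMAT_ConfigID
</hansl>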
However, I can see that since you want to import hundreds of series,
"join" is not going to be very convenient. In this case I think I'd
recommend using matrices to piece the dataset together. A sample
script follows. (I've fixed what I take to be a bug in your version,
namely setting "path_interval_f" and "path_interval_h" to the same
string and hence appending the same data twice.)
<hansl>
set verbose off
clear
# string path_base = "F:/LSDgit/Work/EFBP_original/Results/Original/Aggregate/"
string path_base = ""
string path_ts = path_base ~ "parameters_"
string ts_name = ".tsv"
scalar first = 1
scalar last = 2 # set to 1600 for the full run
string fname = ""
strings iname = array(9)
iname[1] = "stat_0-499_"
iname[2] = "stat_500-999_"
iname[3] = "stat_1000-1499_"
iname[4] = "stat_1500-1999_"
iname[5] = "stat_1500-1504_"
iname[6] = "stat_1500-1519_"
iname[7] = "stat_1500-1549_"
iname[8] = "stat_1500-1599_"
iname[9] = "stat_1500-1749_"
matrix X = {}
# Step 1: get all the columns in place, using the first set of files
fname = sprintf("%s%d%s", path_ts, 1, ts_name)
open @fname --preserve --quiet
printf " got %d vars, %d obs\n", $nvars, $nobs
X = {dataset}
strings vnames = varnames(dataset)
loop j=1..9 -q
    fname = sprintf("%s%s%d%s", path_base, iname[j], 1, ts_name)
    open @fname --preserve --quiet
    printf " got %d vars, %d obs\n", $nvars, $nobs
    vnames += varnames(dataset)
    # add this dataset horizontally
    X ~= {dataset}
endloop
# Step 2: append additional rows, one block per remaining set of files
loop i=2..last -q
    fname = sprintf("%s%d%s", path_ts, i, ts_name)
    open @fname --preserve --quiet
    matrix addX = {dataset}
    printf " got %d vars, %d obs\n", $nvars, $nobs
    loop j=1..9 -q
        fname = sprintf("%s%s%d%s", path_base, iname[j], i, ts_name)
        open @fname --preserve --quiet
        printf " got %d vars, %d obs\n", $nvars, $nobs
        addX ~= {dataset}
    endloop
    printf "addX: %d x %d\n", rows(addX), cols(addX)
    # add the whole addX vertically
    X |= addX
endloop
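# Step 3: turn the matrix into a dataset, one series per column
colnames(X, vnames)
scalar N = rows(X)
nulldata N --preserve
loop i=1..cols(X)
    genseries(colname(X, i), X[,i])
endloop
# drop the automatic "index" series created by nulldata
delete index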
varlist
print ABMAT_ConfigID se_Seed sh_Seed MA_Utility_MAE_uq_I9 -o
</hansl>
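To see what the two concatenation operators in the script do, here is
a toy sketch with made-up matrices (A, B and D are hypothetical, not
taken from your files). Note that "~" and "|" line things up purely by
position, so the approach above assumes that each set of files lists
its rows in the same order; the key column is not consulted.
<hansl>
matrix A = {1, 2; 3, 4}   # 2 x 2
matrix B = {5; 6}         # 2 x 1
matrix C = A ~ B          # columns side by side: 2 x 3
matrix D = {7, 8, 9}      # 1 x 3
C |= D                    # extra row underneath: 3 x 3
printf "C: %d x %d\n", rows(C), cols(C)
</hansl>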
Allin