On Mon, 19 Sep 2011, Pietro Battiston wrote:
I'd like to have some clarification on the line
"b) that the new data carries clear observation information so that
gretl can work out how to place the values"
in the description of "append" from the Gretl Command reference.
By "observation information" gretl means the observation
markers in the first column (e.g. dates for time-series,
strings for identifiable units such as countries).
Basically, this is my situation: I have two databases. In both, I have
two fields "id1" and "id2". The combined fields are unique in the first
database, while in the second I can have two different rows with
the same "id1" and "id2" values.
So, first question: when I append the second db to the first, are the
common field names "clear observation information" for Gretl to do some
sort of matching between the rows of the two databases?
No, gretl is not able to guess on that basis.
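Key-based matching has to be requested explicitly. As a minimal sketch, assuming a gretl version that provides the "join" command (check the command reference for your build), rows can be matched on the (id1, id2) pair, with an aggregation method stating how to combine multiple matching rows. The file names first.gdt and second.csv are hypothetical placeholders.
<hansl>
# hypothetical file names; id1 and id2 must be present in both files
open first.gdt
# import "days" from the second file, matching rows on (id1, id2);
# where several rows match, --aggr=sum adds their values together
join second.csv days --ikey=id1,id2 --aggr=sum
</hansl>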
Second question: the second database has a field "days", and my aim is
that in my merged database, the (unique) row with given "id1" and "id2"
has, in the field "days", the sum of the "days" variable in all rows
with that combination of ids. How can I do that?
You should be able to do that sort of thing by loading the
data series into a matrix. For example, consider the following
simplified version of your data:
id1 id2 days
1 2 3
1 1 4
2 1 4
3 1 5
1 3 5
2 1 2
3 1 4
The script below will do what you describe, if I've understood
you correctly.
<hansl>
scalar imin = min(id1)
scalar imax = max(id1)
scalar jmin = min(id2)
scalar jmax = max(id2)
# count the number of ordered id pairs
scalar np = 0
loop i=imin..imax -q
    loop j=jmin..jmax -q
        if sum(id1 == i && id2 == j) > 0
            np++
        endif
    endloop
endloop
printf "npairs = %d\n", np
matrix C = zeros(np, 3)
colnames(C, "id1 id2 days")
# cumulate the 'days' data for each ordered id pair
scalar r = 0
loop i=imin..imax -q
    loop j=jmin..jmax -q
        if sum(id1 == i && id2 == j) > 0
            r++
            C[r,1] = i
            C[r,2] = j
            loop k=1..$nobs -q
                if id1[k] == i && id2[k] == j
                    C[r,3] += days[k]
                endif
            endloop
        endif
    endloop
endloop
print C
# write out as gretl data file
nulldata np --preserve
series id1 = C[,1]
series id2 = C[,2]
series days = C[,3]
store compacted.gdt id1 id2 days
</hansl>
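The same cumulation can probably be written more compactly by reusing the 0/1 indicator series that the script already builds in its "if" test: multiplying that indicator by "days" and summing gives the per-pair total without the innermost loop over observations. The sketch below is an assumption-laden variant, not the script above: it enters the simplified data by hand, and it grows the matrix C by vertical concatenation starting from an empty matrix instead of counting the pairs first. The series name "match" and the scalar "total" are illustrative.
<hansl>
# the simplified data from the table above, entered by hand
nulldata 7
series id1 = {1, 1, 2, 3, 1, 2, 3}'
series id2 = {2, 1, 1, 1, 3, 1, 1}'
series days = {3, 4, 4, 5, 5, 2, 4}'

scalar imin = min(id1)
scalar imax = max(id1)
scalar jmin = min(id2)
scalar jmax = max(id2)

# start empty and append one row per id pair that occurs
matrix C = {}
loop i=imin..imax -q
    loop j=jmin..jmax -q
        # 1 where the observation belongs to pair (i, j), else 0
        series match = (id1 == i && id2 == j)
        if sum(match) > 0
            # summing days over matching rows only
            scalar total = sum(match * days)
            C = C | {i, j, total}
        endif
    endloop
endloop
print C
</hansl>
For the seven sample rows this should yield five pairs with "days" totals of 4 for (1,1), 3 for (1,2), 5 for (1,3), 6 for (2,1) and 9 for (3,1). Like the original script, it visits every (i, j) combination between the minimum and maximum ids, so it suits small, dense id ranges better than sparse ones.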
Allin Cottrell