On Sat, 3 Aug 2013, Andreea Bolos wrote:
> Thank you for your replies. It is very helpful for me. The data I am
> working with contains sensitive information and I do not have the
> permission to share it, otherwise I would have sent a sample. The command
> worked but now I encountered another problem. The message is "data types
> not conformable for operation". Gretl asked for the aggregation option.
> But I do not want to aggregate. The right-hand variables have more than
> one match with the left-hand variables.

That sounds like a blocker: if there's more than one match on
the right for a given row on the left, this _must_ be handled
somehow, either by one "aggregation" method or other
(including sequence number as an option), or by application of
a filter on the right-hand data to reduce the matches to
uniqueness. (Or possibly by sub-sampling on the left to skip
rows that have multiple matches on the right.) You just can't
stick more than one value from the right into a given
row/column slot on the left.
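
A rough sketch of the first option, not tested against your data. The
file names left.csv and right.csv and the series name nmatch are made
up for illustration; the assumption is that both files share a column
named key and that some key values occur more than once in right.csv.

```hansl
# left.csv and right.csv are hypothetical files sharing a column "key"
open left.csv --quiet

# record how many right-hand rows match each left-hand row,
# to see where the duplicates are
join right.csv nmatch --ikey=key --aggr=count

# take the first match (in file order) for each left-hand row
join right.csv TYPEOFCONTROL --ikey=key --aggr=seq:1

print --byobs
```

Using seq:2, seq:3 and so on in place of seq:1 should pick out the later
matches, if you need them as separate series under different names.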

> The files contain both numerical and string variables that
> need to be matched (the count of matches is not helping).
> For example, one column on the right-hand side that needs to be
> matched to the left-hand side through the variable NUMBER (a numerical
> variable) looks similar to this:
> TYPEOFCONTROL (name)
> type2
> type2
> type2
> type3
> type4
> ...
> Is it possible through the JOIN function to get it in numerical form after
> matching, or does it need to be numerical before matching?

You can match first. Here's a trivial example:
content of bolos1.csv:
x,key
1,a
2,b
3,c
4,d
5,e
content of bolos2.csv:
TYPEOFCONTROL,key
type2,e
type2,d
type2,c
type3,b
type4,a
hansl script:
open bolos1.csv
join bolos2.csv TYPEOFCONTROL --ikey=key
print -o
output from script:
? open bolos1.csv --quiet
Read datafile /home/cottrell/stats/esl/gretl/build/cli/bolos1.csv
? join bolos2.csv TYPEOFCONTROL --ikey=key
? print --byobs

       x    key    TYPEOFCONTROL

1      1      1                3
2      2      2                2
3      3      3                1
4      4      4                1
5      5      5                1

In this output the strings show up as numeric codes: type2 = 1,
type3 = 2, type4 = 3 (and the key values a to e as 1 to 5).

Allin Cottrell