On Thu, 14 Sep 2017, Riccardo (Jack) Lucchetti wrote:
 On Wed, 13 Sep 2017, Allin Cottrell wrote:
>> Not sure about this, but my initial reaction is that it may be assuming 
>> too much about our "discrete" series.
>> 
>> In R, isn't a "factor" a variable that (in gretl parlance) has to be 
>> "dummified" before use in regression?  That is, an arbitrary encoding of a 
>> qualitative characteristic?
 Yes, you're right.
>> If so, then I think the above is wrong, since a gretl-discrete series 
>> could be a perfectly valid (albeit quantized) quantitative variable; for 
>> example, years of education or number of bedrooms.
>> 
>> But if I'm wrong about what a "factor" is to R, my objection may fall.
> 
> Sorry, I should have added: we now have the facility, under the "setinfo" 
> command, of marking a series as "coded". And when we write a "coded" series 
> as CSV we quote the numerical values, in response to which R automatically 
> treats the series as a "factor". So I think we already have what you're
> aiming at here.
 I agree that the mapping to R's factors is much more accurate if we used the 
 "coded" bit. However, R doesn't seem to make this distinction automagically
 for integer-valued coded strings. Example:
 <hansl>
 nulldata 50
 cont1 = normal()
 disc1 = floor(uniform(1,5))
 disc2 = floor(uniform(4,18))
 stringify(disc1, defarray("a", "b", "c", "d")) # string-valued series
 list D = disc1 disc2
 loop foreach i D
    setinfo $i --coded
 endloop
 foreign language=R --send-data
    summary(gretldata);
    is.factor(gretldata$disc1);
    is.factor(gretldata$disc2);
 end foreign
 </hansl> 
Hmm, I see what you mean. Not sure where I got the idea that "quoted 
in CSV" means factor to R, but apparently it's not true in general.
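For what it's worth, the behaviour can be reproduced in R alone. The following is only a minimal sketch, not gretl's actual CSV output: the column names mirror the script above, the data values are made up, and stringsAsFactors is set explicitly because its default changed in R 4.0.

```r
# Hand-written CSV text standing in for what gretl might write;
# the values are invented for illustration.
csv_text <- 'cont1,disc1,disc2
0.5,"a","4"
-1.2,"b","17"
0.3,"a","9"'

# stringsAsFactors = TRUE reproduces the pre-4.0 default of read.csv
df <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Quoted non-numeric strings become a factor
print(is.factor(df$disc1))   # TRUE

# Quoted integers are still converted to a numeric type, not a factor
print(is.factor(df$disc2))   # FALSE
print(is.integer(df$disc2))  # TRUE
```

So the quotes alone do not carry the "coded" information across: R strips them and then converts the column type as usual.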
 Perhaps we could force R to treat variables as factors via an additional 
 option to foreign, something like
 foreign language=R --send-data --as-factors=X
 where X is a list. 
In git there's now something more automated than that: we have a 
variant of what you sketched in
http://lists.wfu.edu/pipermail/gretl-devel/2017-September/007916.html
whereby we send R a matrix that identifies any "coded" series in
gretl as factors for R. (We handle the case where the data passed to 
R are a subset of the full dataset, as in --send-data=Rlist.)
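
As a rough illustration of the idea (not the code that is in git), the conversion on the R side could look something like the following. The indicator vector "coded" and the toy data frame are assumptions made up for this sketch; only the name gretldata comes from the existing --send-data mechanism.

```r
# Toy stand-in for the data frame that --send-data creates in R
gretldata <- data.frame(cont1 = c(0.5, -1.2, 0.3),
                        disc2 = c(4L, 17L, 9L))

# Hypothetical indicator: 1 marks a series flagged as "coded" in gretl.
# The real layout of what gretl sends may well differ.
coded <- c(cont1 = 0, disc2 = 1)

# Convert each flagged column to a factor, leaving the others untouched
for (v in names(coded)[coded == 1]) {
    gretldata[[v]] <- factor(gretldata[[v]])
}

print(sapply(gretldata, is.factor))
```

Keying the indicator by series name rather than by position is one way to keep the lookup valid when only a subset of the dataset is sent.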
Allin