On Wed, 13 Sep 2017, Allin Cottrell wrote:
On Wed, 13 Sep 2017, Riccardo (Jack) Lucchetti wrote:
> I was thinking that we might want to map what we call "discrete" series
> into what R calls "factors". The idea is quite simple, and exemplified in
> the script below. My question to the list is: is this a good idea? Is it
> worth the coding effort (very small IMO)?
>
> Comments welcome.
>
> <hansl>
> nulldata 50
>
> cont1 = normal()
> cont2 = normal()
> disc1 = floor(uniform(1,5))
> disc2 = floor(uniform(4,18))
>
> discrete disc1 disc2
>
> list D = dataset
> loop foreach i D
>     if !isdiscrete($i)
>         D -= $i
>     endif
> endloop
>
> matrix mD = D
> mwrite(mD, "discrete.mat", 1)
>
> foreign language=R --send-data
> D <- gretl.loadmat("discrete.mat");
> for (i in D) {gretldata[,i] <- as.factor(gretldata[,i])};
> summary(gretldata);
> end foreign
> </hansl>

Not sure about this, but my initial reaction is that it may be assuming too
much about our "discrete" series.

In R, isn't a "factor" a variable that (in gretl parlance) has to be
"dummified" before use in regression? That is, an arbitrary encoding of a
qualitative characteristic?

If so, then I think the above is wrong, since a gretl-discrete series could
be a perfectly valid (albeit quantized) quantitative variable; for example,
years of education or number of bedrooms.

But if I'm wrong about what a "factor" is to R, my objection may fall.
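
To make the distinction concrete, here is a minimal sketch on simulated
data. The series names (educ, y) and the coefficients are made up for
illustration. The first regression uses the discrete series as an ordinary
quantitative regressor; the second "dummifies" it, which is roughly what
treating it as a factor would amount to.

<hansl>
nulldata 50

# hypothetical "years of education": integer-valued, but quantitative
series educ = floor(uniform(8,18))
series y = 1 + 0.5*educ + normal()
discrete educ

# educ as a quantitative regressor: a single slope coefficient
ols y const educ

# educ as an arbitrary encoding: one dummy per distinct value,
# with the lowest value omitted by default
list DE = dummify(educ)
ols y const DE
</hansl>
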
Sorry, I should have added: we now have the facility, under the
"setinfo" command, of marking a series as "coded". And when we write
a "coded" series as CSV we quote the numerical values, in response
to which R automatically treats the series as a "factor". So I think
we already have what you're aiming at here.
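
A minimal sketch of that route, assuming simulated data and a made-up
series name (disc1): mark the series as "coded" and let R report how it
received each column. No output is shown here, since what R does with the
column is exactly what the script is meant to check.

<hansl>
nulldata 50

series cont1 = normal()
series disc1 = floor(uniform(1,5))

# flag disc1 as an arbitrary encoding rather than a quantity
setinfo disc1 --coded

foreign language=R --send-data
    # show the class R assigned to each column of the gretl dataset
    str(gretldata)
end foreign
</hansl>

Series that are discrete but genuinely quantitative can simply be left
unmarked, which keeps the two notions separate.
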
Allin