On Mon, 13 Nov 2023, Artur T. wrote:
Hi all,
I just stumbled over the following challenge.
In the first example, I run a binary logit with a numeric
(discrete) encoded variable "dirnum".
In the second example, the dependent is string-valued series
"dirstr" with only two distinct values ("down", "up").
The logit command estimates a binary logit for the first example
(as expected) but an ordered logit for the second example.
The reason might be that for "stringified" series, the minimal
distinct value is 1 and not zero.
Even though the magnitudes of the coefficients are equal, the signs
may differ. Also, some statistics such as R^2 and the contingency
table are not printed for the ordered case.
Is this expected behavior?
Well, I think it might be expected, since the logit doc starts with
"If the dependent variable is a binary variable (all values are 0 or
1)...". But it's not intentional, nor (I think) desirable.
Now in git master, if a string-valued series has just two values,
and reduces to a straight dummy variable on subtracting 1 from its
numeric codes, we'll treat it as a binary case. Note that this is a
special dispensation for string-valued series, which must always
have a minimum numeric code of 1; we're not going to do this for
regular numeric series that have values 1 and 2 -- subtract 1,
please!
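For the numeric case, a minimal sketch of the manual recoding. The
series y is built from GRADE purely for illustration, and the name
y01 is just a placeholder:
<hansl>
open greene19_1.gdt --quiet
# illustration only: a numeric series coded 1/2
series y = GRADE + 1
# sorted distinct values of y
matrix v = values(y)
if rows(v) == 2 && v[1] == 1 && v[2] == 2
    # recode to 0/1 so that logit takes the binary path
    series y01 = y - 1
    logit y01 0 GPA TUCE PSI --p-values
endif
</hansl>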
On R-squared: there's no reason why we shouldn't show a
pseudo-R-squared for ordered logit as well as binary, and that too
is now in git.
On signs of coefficients: If you're using a string-valued dependent
variable with two values you need to pay attention to this point. In
the "stringifying" of a series the first string will get attached to
whichever numeric value occurs first, but for binary logit you
presumably want the "on"/"true"/"yes" value to attach to the higher
of the two numeric codes (which will turn into 1 on subtraction).
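One way to check which string sits on which code is to print the
string values in code order. A small sketch, assuming you stringify
the series yourself (the labels "down" and "up" are just examples):
<hansl>
open greene19_1.gdt --quiet
series y = GRADE + 1
# the i-th array element attaches to numeric code i
stringify(y, defarray("down", "up"))
# strvals() returns the strings ordered by numeric code, from 1
strings S = strvals(y)
printf "code 1 -> %s, code 2 -> %s\n", S[1], S[2]
</hansl>
The string shown for code 2 is the one that will be treated as the
"1" outcome in the binary case.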
The changes mentioned above will be in the snapshots before long.
An example follows:
<hansl>
open greene19_1.gdt
# regular binary logit
logit GRADE 0 GPA TUCE PSI --p-values
# shift the 0/1 codes to 1/2
series y = GRADE + 1
# invokes the "ordered" variant: now shows McFadden's pseudo-R-squared
logit y 0 GPA TUCE PSI --p-values
# note: the order matters in defarray()
stringify(y, defarray("down", "up"))
# invokes the binary variant
logit y 0 GPA TUCE PSI --p-values
</hansl>
Allin