On Mon, 14 Sep 2009, Hamaad Shah wrote:
> There seems to be a problem in the code for the ordered
> logit/probit model in GRETL. I'm getting different predictions
> in GRETL and R. It's possible that there is some bug in the
> code.
You're right, there was a problem here (now fixed in CVS and the
Windows snapshot).
The $yhat accessor produces correct values for z-hat_i, the
estimate of the latent variable at each observation (as described
in chapter 24 of the User's Guide). And from these values one can
compute predictions for the response variable that are in
agreement with R. BUT two other things were wrong:
(a) The count of "cases correctly predicted" (printed beneath the
model) was not done in the standard manner, where predicted y is
the response-value with the greatest estimated probability.
(b) The values you'd get from the "fcast" command for ordered
logit and probit were, unfortunately, rubbish. I'm afraid we
never noticed that we were generating "forecasts" as if the model
were a regular binary logit/probit.
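For reference, the standard rule mentioned in (a) can be written out
as below for the probit case. The symbols are notation introduced
here for illustration, not gretl names: \mu_1 < ... < \mu_J are the
estimated cut points, \hat z_i is the estimated latent value for
observation i, \Phi is the standard normal CDF, and the response is
assumed to be coded 0, 1, ..., J. For ordered logit the logistic CDF
takes the place of \Phi.

```latex
\begin{aligned}
P(y_i = 0) &= \Phi(\mu_1 - \hat z_i) \\
P(y_i = j) &= \Phi(\mu_{j+1} - \hat z_i) - \Phi(\mu_j - \hat z_i),
              \quad j = 1, \dots, J-1 \\
P(y_i = J) &= 1 - \Phi(\mu_J - \hat z_i) \\
\hat y_i   &= \arg\max_{j \in \{0, \dots, J\}} P(y_i = j)
\end{aligned}
```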
The results from gretl's "fcast" for an ordered model should now
agree with what R says [if you use predict() after polr()].
Note that correct predictions based on gretl's $yhat can be
constructed as in the following example (though now that fcast is
working correctly you don't have to do this):
<script>
open greene22_2
genr rat = Z8 - 1
probit rat 0 Z1 Z2
series zhat = $yhat
matrix cut = $coeff[3:]
scalar n = $nobs
scalar ncut = rows(cut)
series pred = NA
loop i=1..n --quiet
    # find the response with the greatest probability
    # for the probit case
    pmax = cdf(N, cut[1]-zhat[i])
    cpmax = 1
    loop j=2..ncut --quiet
        prob = cdf(N, cut[j]-zhat[i]) - cdf(N, cut[j-1]-zhat[i])
        if (prob > pmax)
            pmax = prob
            cpmax = j
        endif
    endloop
    prob = 1 - cdf(N, cut[ncut]-zhat[i])
    if (prob > pmax)
        cpmax = ncut+1
    endif
    pred[i] = cpmax-1
endloop
print rat pred --byobs
</script>
The series "pred" above is what "fcast" now produces.
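The script above covers the probit case only. For ordered logit the
same rule applies with the logistic CDF in place of the normal one.
What follows is a rough sketch rather than anything taken from the
gretl sources. It assumes that the logistic() function is available
in your gretl version, and that $coeff is laid out as in the probit
example, with the two slope coefficients first and the cut points
after them. The series names pmax, pj, plast and pred_l are just
illustrative. It works on whole series instead of looping over
observations.

```hansl
open greene22_2
genr rat = Z8 - 1
logit rat 0 Z1 Z2
series zhat = $yhat
matrix cut = $coeff[3:]
scalar ncut = rows(cut)
# start with the probability of the lowest response (coded 0)
series pmax = logistic(cut[1] - zhat)
series pred_l = 0
loop j=2..ncut --quiet
    # probability of the response coded j-1
    series pj = logistic(cut[j] - zhat) - logistic(cut[j-1] - zhat)
    # update the prediction first, then the running maximum
    pred_l = (pj > pmax) ? j-1 : pred_l
    pmax = (pj > pmax) ? pj : pmax
endloop
# probability of the highest response (coded ncut)
series plast = 1 - logistic(cut[ncut] - zhat)
pred_l = (plast > pmax) ? ncut : pred_l
print rat pred_l --byobs
```

If this is right, pred_l should match what "fcast" gives after the
ordered logit, in the same way that "pred" does for the probit.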
Allin Cottrell