Hi Jack, Allin, Sven and all others on the gretl team,
As you know, the "Number of cases 'correctly predicted'" in a logit/probit model can
be misleading, even in a 50/50 split case.
The benchmark we should compare against is not zero cases 'correctly predicted', but
rather random assignment based on the sample mean.
If a sample is split 50/50, random assignment would get 50% "correctly
predicted" in expectation. If our model's 'correctly predicted' share is 70%, we
are only 20 percentage points above random assignment. In other words, the model
closes only 40% of the 50-point gap between random assignment and perfect prediction.
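For illustration, here is a minimal Python sketch of the arithmetic above. The function names are my own (hypothetical, not anything in gretl), and it assumes random assignment means labelling each case 1 with probability equal to the sample mean, independently of the data.

```python
def random_baseline(p):
    """Expected share 'correctly predicted' under random assignment,
    where each case is labelled 1 with probability p (the sample mean)."""
    return p**2 + (1 - p)**2


def gap_closed(model_share, p):
    """Improvement over the random baseline, as a share of the largest
    possible improvement (from the baseline up to 100% correct)."""
    base = random_baseline(p)
    return (model_share - base) / (1 - base)


# 50/50 sample, model gets 70% 'correctly predicted'
print(f"baseline:   {random_baseline(0.5):.2f}")
print(f"gap closed: {gap_closed(0.7, 0.5):.2f}")
```

The same 70% would look much weaker in an unbalanced sample: with a sample mean of 0.9, the random baseline alone is already about 82%.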
Thus, I would like to propose an additional output from gretl, i.e. the "Extra number of
cases 'correctly predicted' over random assignment" (or something like that); call
this dot_R-square, perhaps.
Dot_R-square = (Y_hat_model - Y_hat_random) / (1 - Y_hat_random)
where
  Y_hat_model   = sum(Y_hat_model_i = Y_i) / N, i.e. the share of cases the model predicts correctly
  Y_hat_random  = Y_hat^2 + (1 - Y_hat)^2, i.e. the expected share predicted correctly by random assignment
  Y_hat         = the sample mean of Y
  Y_hat_model_i = 1 if Prob(Y_i = 1) > Y_hat, and 0 otherwise
Unlike the McFadden R-squared, the interpretation of this measure is fairly straightforward, i.e.
the share of the possible improvement over random assignment that our model achieves.
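To make the proposal concrete, the definitions above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name dot_r_square and its arguments are hypothetical, y is a list of 0/1 outcomes, and prob holds the fitted probabilities Prob(Y_i = 1) from the logit/probit model.

```python
def dot_r_square(y, prob):
    """Share of the possible improvement over random assignment.

    y    : observed 0/1 outcomes
    prob : fitted probabilities Prob(Y_i = 1), same length as y
    """
    n = len(y)
    if n == 0 or n != len(prob):
        raise ValueError("y and prob must be non-empty and of equal length")

    y_bar = sum(y) / n  # sample mean, used as the cutoff

    # Predict 1 when the fitted probability exceeds the sample mean
    pred = [1 if p > y_bar else 0 for p in prob]

    # Share of cases the model predicts correctly
    share_model = sum(1 for ph, yi in zip(pred, y) if ph == yi) / n

    # Expected share predicted correctly by random assignment
    share_random = y_bar**2 + (1 - y_bar)**2
    if share_random == 1:
        raise ValueError("y has no variation; the measure is undefined")

    return (share_model - share_random) / (1 - share_random)


# Made-up data: 50/50 sample in which 7 of 10 cases are predicted correctly
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
prob = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.4, 0.7, 0.6]
print(f"{dot_r_square(y, prob):.2f}")
```

The data here are invented purely to reproduce the 70% example. Note that the measure is negative whenever the model does worse than random assignment.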
Best,
Fred