On Sat, 17 Dec 2022, Fred Engst wrote:
Hi Jack, Allin, Sven and all others on the gretl team,
As you know, the "Number of cases 'correctly predicted'" in a logit/probit model
can be misleading, even in a 50/50 split case.
The benchmark we should be comparing against is not zero cases 'correctly predicted', but rather
random assignment based on the sample mean.
If a sample is split 50/50, random assignment would get 50% "correctly
predicted", in theory. If our model's 'correctly predicted' share is 70%, we
are only 20 percentage points above a model based on random assignment, i.e.
an improvement over the random-assignment model of only 40% (20 out of the remaining 50 points).
Thus, I would like to propose an alternative output from gretl, i.e. the "Extra number
of cases 'correctly predicted' over random assignment" (or something like that);
call this dot_R-square, perhaps:
  dot_R-square = (Y_hat_model - Y_hat_random) / (1 - Y_hat_random)
where
  Y_hat_model   = sum(Y_hat_model_i == Y_i) / N   (share of cases the model predicts correctly)
  Y_hat_random  = Y_hat^2 + (1 - Y_hat)^2         (expected share correct under random assignment)
  Y_hat         = the sample mean of Y
  Y_hat_model_i = 1 if Pro(Y_i = 1) > Y_hat, and 0 otherwise
Unlike the McFadden R-squared, the interpretation of this is fairly straightforward,
i.e. the percentage by which our model improves on a model based on random assignment.
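For concreteness, the proposed statistic can be sketched in a few lines of Python. This is only an illustration of the formula above, not gretl code and not what "scores2x2" computes; the function name dot_r_square and the toy data are made up for the example.

```python
def dot_r_square(y, p):
    """Extra share of cases 'correctly predicted' over random assignment.

    y: observed 0/1 outcomes
    p: fitted probabilities Pro(Y_i = 1) from a logit/probit model
    (hypothetical helper, named after the proposal in this thread)
    """
    n = len(y)
    y_bar = sum(y) / n  # sample mean, used as the classification cutoff
    # Predict 1 when the fitted probability exceeds the sample mean.
    pred = [1 if p_i > y_bar else 0 for p_i in p]
    hit_model = sum(1 for a, b in zip(pred, y) if a == b) / n
    # Expected hit rate when assigning 1 at random with probability y_bar.
    hit_random = y_bar ** 2 + (1 - y_bar) ** 2
    if hit_random == 1:
        raise ValueError("statistic is undefined when all outcomes are identical")
    return (hit_model - hit_random) / (1 - hit_random)


# Toy data: 50/50 split, model gets 7 of 10 cases right.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.4, 0.1, 0.2, 0.3, 0.6, 0.7]
print(round(dot_r_square(y, p), 2))  # 0.4
```

The toy data reproduce the 50/50 example: 70% correctly predicted against 50% under random assignment gives (0.70 - 0.50) / (1 - 0.50) = 0.40.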
There is a fair number of similar statistics available in the "extra"
package, under the name "scores2x2". Have you checked if your proposed
statistic is in there already?
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------