New subject: [EXTERNAL] Re: Some suggestions on the "Number of cases ’correctly predicted'" from a logit/probit outputs

Saturday, 17 December 2022

Hi Jack, Allin, Sven and all others on the gretl team,
As you know, the "Number of cases 'correctly predicted'" in a logit/probit model can
be misleading, even in a 50/50 split case.
The relevant baseline is not zero cases 'correctly predicted', but rather random
assignment based on the sample mean.
If a sample is split 50/50, random assignment would get 50% "correctly
predicted", in theory. If our model's 'correctly predicted' is 70%, we
are only 20 percentage points above random assignment, i.e. the model captures
only 40% of the 50 percentage points left to gain over random assignment.
Thus, I would like to propose an additional output from gretl, i.e. the "Extra number of
cases 'correctly predicted' over random assignment" (or something like that); call
this dot_R-square perhaps.

Dot_R-square = (Y_hat_model - Y_hat_random) / (1 - Y_hat_random)
where	Y_hat_model = sum(Y_hat_model_i = Y_i)/N, the share of cases the model predicts correctly
	Y_hat_random = Y_hat^2 + (1 - Y_hat)^2, the expected share random assignment predicts correctly
	Y_hat is the sample mean of Y
	Y_hat_model_i = 1 if Pr(Y_i = 1) > Y_hat, and 0 otherwise
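
To make the definition concrete, here is a minimal Python sketch of the calculation. It is not gretl code; the function name dot_r_square and the fitted probabilities are made up for illustration. The data are chosen so that the model gets 7 of 10 cases right in a 50/50 sample, which reproduces the 70% versus 50% example and gives about 0.4.

```python
def dot_r_square(y, p):
    """Share of the possible gain over random assignment achieved by the model.

    y: list of observed 0/1 outcomes
    p: list of fitted probabilities Pr(Y_i = 1) from a logit/probit model
    (Hypothetical helper for illustration; not part of gretl.)
    """
    n = len(y)
    y_hat = sum(y) / n  # sample mean of Y, used as the classification cutoff
    # Predict 1 when the fitted probability exceeds the sample mean, else 0
    pred = [1 if p_i > y_hat else 0 for p_i in p]
    # Share of cases the model predicts correctly
    hit_model = sum(1 for y_i, d_i in zip(y, pred) if y_i == d_i) / n
    # Expected share predicted correctly by random assignment with Pr(1) = y_hat
    hit_random = y_hat ** 2 + (1 - y_hat) ** 2
    # Undefined (division by zero) when all outcomes are identical
    return (hit_model - hit_random) / (1 - hit_random)


# Made-up example: 50/50 sample, model classifies 7 of 10 cases correctly
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.6, 0.7]
print(round(dot_r_square(y, p), 3))
```

Note that the cutoff here is the sample mean rather than 0.5, as in the definition above; with a 50/50 sample the two coincide.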

Unlike the McFadden R-squared, the interpretation of this is fairly straightforward, i.e.
the percentage by which our model improves on a model based on random assignment.

Best,
Fred
