On Thu, 9 Apr 2009, Allin Cottrell wrote:

> On Wed, 8 Apr 2009, Starr, Brian wrote:
>
> > Is there a straightforward way to display a goodness of fit
> > statistic for the Heckit two step estimator? I'm not seeing one
> > listed in the standard output.
>
> There's no "correct" R-squared value for this sort of regression,
> and it's not ML so the likelihood is not strictly relevant. I
> suppose you could calculate the square of the correlation between
> the actual and fitted values of the dependent variable.

Hmm. That would be an ingenious way to get "something like an R^2", but I
doubt it'd be of much use for models, like heckit, where you have some
form of censoring/truncation etc. On the other hand, a similar line of
criticism involving R^2 in the context of IV models is well known, so it
may be worthwhile to think this R2 thing out once and for all.
The way I personally see it is, people like R2 because it's easy to read,
although its "real" meaning is not particularly clear. Yes, for OLS models
it _happens_ to be equal to the squared correlation coefficient between y
and yhat, but for me it's just a number that tells me if a particular
model contains any explanatory variables worth keeping, apart from the
constant; the closer to 1, the better.
Maybe a nice way to generalise R2, that we could use for every model we
have, is to define
R2 = W/(W + $T)
where W is a Wald-type test for zeroing all explanatory variables
(constant excluded). That is: suppose we have a model with the intercept
as first explanatory variable,
<foo> y const x1 x2 x3 < possibly more stuff... >
where <foo> can be two-step heckit or whatever you like. Then
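# slope coefficients and their covariance block (constant excluded)
matrix b = $coeff[2:]
matrix V = $vcv[2:,2:]
# Wald-type statistic for zeroing all the slopes
scalar W = qform(b', invpd(V))
scalar R2 = W / (W + $T)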
This would have several advantages:
1) It'd be defined for every model for which you can reasonably speak
of "explanatory variables" and perform hypothesis tests on their
coefficients (so that includes tsls, probit, logit, heckit, arma,
etcetera)
2) The estimation method you use (ml, cml, gmm, least squares
etcetera) wouldn't matter, as long as you can make tests.
3) It lies between 0 and 1 by construction, since a Wald-type statistic is
a positive definite quadratic form and hence W >= 0.
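4) It would coincide with the usual R2 for OLS, at least approximately:
with the standard degrees-of-freedom-corrected covariance matrix,
W/(W + $T - k) equals R2 exactly (k being the number of coefficients),
and the difference from W/(W + $T) vanishes as $T grows.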
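
As a rough numerical check of point 4, here is a small sketch in plain Python (not gretl script) for the simplest OLS case: y regressed on a constant and one regressor. The function name wald_r2 and the data are made up purely for illustration. It assumes the usual OLS covariance estimate with the T - k correction, and compares the ordinary R2 with W/(W + T) and with W/(W + T - k).

```python
def wald_r2(x, y):
    """Compare the usual OLS R2 with two Wald-based measures for a
    regression of y on a constant and a single regressor x."""
    T = len(y)
    k = 2  # number of coefficients: constant plus one slope
    xbar = sum(x) / T
    ybar = sum(y) / T
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    # OLS estimates
    slope = sxy / sxx
    intercept = ybar - slope * xbar

    # residual and total sums of squares
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - ybar) ** 2 for yi in y)

    # usual variance estimate of the slope (assumed: T - k correction)
    s2 = ssr / (T - k)
    var_slope = s2 / sxx

    # Wald statistic for H0: slope = 0 (the constant is excluded)
    W = slope ** 2 / var_slope

    return {
        "r2": 1 - ssr / tss,        # ordinary R2
        "w_T": W / (W + T),         # denominator uses T
        "w_df": W / (W + T - k),    # denominator uses T - k
    }


# illustrative data only, not taken from any real dataset
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9, 8.4, 8.8]

res = wald_r2(x, y)
print(res)
```

In this setup the T - k version should reproduce the ordinary R2 up to rounding, while the version with T in the denominator comes out slightly smaller. With a robust covariance matrix the exact equality would not be expected to hold.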
Do you guys like the idea?
Riccardo (Jack) Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche
r.lucchetti(a)univpm.it
http://www.econ.univpm.it/lucchetti