On Sat, 11 Apr 2009, Allin Cottrell wrote:
> I do have one reservation. As you put it, one typically wants the
> R^2 as a quick check on whether "a particular model contains any
> explanatory variables worth keeping". Yes, there's that, but one
> also wants a simple measure of "goodness of fit", and the two can
> diverge.
True. You have a valid point here. However, let me state my point of view
more clearly (sorry, this WILL be slightly verbose).
OLS is quite exceptional among estimation methods, in that the OLS
statistic \hat{\beta} has a dual interpretation: it is at the same time a
nice descriptive statistic (the solution of a purely algebraic, or
geometric if you prefer, minimisation problem) and a smart choice as an
estimator, under certain circumstances.
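To fix ideas: the descriptive side is just

  \hat{\beta} = argmin_b (y - Xb)'(y - Xb) = (X'X)^{-1} X'y

and nothing probabilistic is involved up to this point; the inferential
reading only kicks in once you put assumptions on u.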
This lucky coincidence allows the R2 statistic to have a dual
interpretation too: "goodness of fit" and "overall validity of the
model".
The first interpretation comes quite naturally when the dependent variable
is continuous and you think of a statistical model as a machine that
yields the "best" approximation to it. It makes very good sense to judge
the approximation on the basis of correlation (and square it if you feel
like it). A notable advantage of this interpretation is that it involves
no probability/inference concepts. To take it to the extreme, it's just
the squared cosine of an angle. As such, it's a nice measure.
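In symbols: if \tilde{y} and \hat{\tilde{y}} are the dependent variable
and the fitted values, both in deviation from their means, then

  R^2 = cos^2(\theta)
      = (\tilde{y}'\hat{\tilde{y}})^2 / [(\tilde{y}'\tilde{y}) (\hat{\tilde{y}}'\hat{\tilde{y}})]

that is, the squared correlation between actual and fitted values.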
Unfortunately, this interpretation breaks down in several circumstances.
In a Tobit model, for example, you have a non-null share of the sample for
which the dependent variable is 0. Does it make sense to use those values
when computing the correlation? Or should you just ignore them? What
should you do when the dependent variable is discrete, as in a probit
model? Or, worse, in a _multinomial_ probit model, in which the numbers
often have only a conventional value? And the list goes on.
Hence, what I had in mind when I proposed the Wald-based R2 was to build
on the second interpretation instead (note that other versions of a
"generalised R2" that have been proposed in the literature, like
McFadden's, have a similar justification).
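McFadden's measure, for the record, is 1 - logL1/logL0, where logL1 and
logL0 are the log-likelihoods of the full and of the constant-only model.
Just to show what I mean, here is a minimal Python sketch (assuming
statsmodels is available; the data are made up, purely for illustration):

  import numpy as np
  import statsmodels.api as sm

  # made-up data for a probit model
  rng = np.random.default_rng(42)
  n = 500
  X = sm.add_constant(rng.normal(size=(n, 2)))
  y = (X @ np.array([0.5, 1.0, -0.8]) + rng.normal(size=n) > 0).astype(int)

  res = sm.Probit(y, X).fit(disp=0)
  mcfadden = 1.0 - res.llf / res.llnull  # llnull: constant-only log-likelihood
  print(mcfadden, res.prsquared)         # prsquared is the same quantity

Note the clean inferential flavour: it compares likelihoods, not actual
versus fitted values.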
In this sense, the TSLS example you give is very well chosen, as TSLS lies
very near the border. It makes sense to compute fitted values for TSLS
models, but in general you should not expect them to fit the data
particularly well. If you have a model
y = Xb + u
in which you have reasons to say that X and u are correlated, you're
saying (with a choice of words that has well-known historical reasons)
that the vector of coefficients you're interested in is not the parameter
of the conditional mean, but something else, with a richer behavioural
meaning. As a consequence, TSLS is not an "approximation machine", but
rather a very clever way to solve a problem of interpretation of the
estimates, since the estimates of "b" you end up with have no special
property in terms of "fit" (well, OK, they do, but only if you think in
terms of oblique rather than orthogonal projections).
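To make the point concrete, here is a minimal numpy sketch with a made-up
DGP (again, purely illustrative): the 2SLS coefficients come from an
oblique projection through the instruments, and the squared correlation
between y and the implied fitted values is a different animal from a
Wald-type significance measure on b.

  import numpy as np

  rng = np.random.default_rng(1)
  n = 1000
  z = rng.normal(size=n)                   # instrument
  v = rng.normal(size=n)
  u = 0.9 * v + 0.3 * rng.normal(size=n)   # structural error, correlated with x
  x = 0.5 * z + v                          # endogenous regressor
  y = 1.0 + 2.0 * x + u

  X = np.column_stack([np.ones(n), x])
  Z = np.column_stack([np.ones(n), z])

  # 2SLS: beta = (X'Pz X)^{-1} X'Pz y, with Pz the projection onto span(Z)
  Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)
  XtPzX = X.T @ Pz @ X
  beta = np.linalg.solve(XtPzX, X.T @ (Pz @ y))

  yhat = X @ beta                          # fitted values from structural coefs
  resid = y - yhat
  r2_fit = np.corrcoef(y, yhat)[0, 1]**2   # "goodness of fit" version

  # Wald-type statistic on the slope (homoskedasticity assumed)
  sigma2 = resid @ resid / n
  V = sigma2 * np.linalg.inv(XtPzX)
  wald = beta[1]**2 / V[1, 1]
  print(beta, r2_fit, wald)

The two numbers answer different questions, and nothing forces them to
agree.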
In this context, the "goodness of fit" measure may diverge from "overall
significance" to a large extent. The question is: which one should we
appoint to the R2 office? That question, of course, has no obvious answer.
So... I don't know! Maybe we just ought to leave this to somebody else and
confine R2 to least-squares-based models. Maybe not. I'm open!
PS By the way: happy Easter to everybody!
Riccardo (Jack) Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche
r.lucchetti(a)univpm.it
http://www.econ.univpm.it/lucchetti