Here is a follow-up that illustrates Jack's point about when a small number
is really zero. Being too strict about the size of the value may lead to
other unintended results. This example is based on the same dataset
(br.gdt) but uses restrictions that are nearly true.
? open br.gdt
? square sqft bedrooms
? logs price
? series price = price/100
? list xlist = const sqft sq_sqft sq_bedrooms bedrooms baths age
? matrix Rmat = zeros(3,4)~I(3)
? matrix r = { 700 ; 400; -10 }
? ols price xlist
Model 1: OLS, using observations 1-1080
Dependent variable: price
               coefficient      std. error     t-ratio     p-value
  -----------------------------------------------------------------
  const          168.782        216.484          0.7797    0.4358
  sqft            -0.758827       0.0741780    -10.23      1.68e-023  ***
  sq_sqft          0.000248214    1.03688e-05   23.94      8.12e-102  ***
  sq_bedrooms   -117.075         19.4308        -6.025     2.32e-09   ***
  bedrooms       694.058        138.416          5.014     6.22e-07   ***
  baths          379.550         46.2502         8.206     6.48e-016  ***
  age             -8.34062        1.14878       -7.260     7.40e-013  ***

  Mean dependent var   1548.632   S.D. dependent var   1229.128
  Sum squared resid    4.01e+08   S.E. of regression   611.6813
  R-squared            0.753717   Adjusted R-squared   0.752340
  F(6, 1073)           547.2962   P-value(F)           0.000000
  Log-likelihood      -8458.451   Akaike criterion     16930.90
  Schwarz criterion    16965.79   Hannan-Quinn         16944.11
? restrict --full
? R=Rmat
? q=r
? end restrict
Test statistic: F(3, 1073) = 0.955727, with p-value = 0.412961
Model 2: Restricted OLS, using observations 1-1080
Dependent variable: price
               coefficient      std. error      t-ratio      p-value
  --------------------------------------------------------------------
  const          201.080         88.6191          2.269      0.0235     **
  sqft            -0.783296       0.0645078      -12.14      6.88e-032  ***
  sq_sqft          0.000250439    9.22345e-06     27.15      4.42e-124  ***
  sq_bedrooms   -118.625          5.21332        -22.75      6.95e-094  ***
  bedrooms       700.000          0.000000           NA      NA
  baths          400.000          4.64027e-07    8.620e+08   0.0000     ***
  age            -10.0000         0.000000           NA      NA
Notice that the std. error on sq_sqft is very small (but legitimately
non-zero), while the one on baths, which should be exactly zero because
baths is restricted, is only about one order of magnitude smaller. If you
didn't know that the s.e. on a restricted coefficient is supposed to be
zero (like many of my students), you'd report something that is obviously
wrong. In the original example the problem was not so much the
restrictions as the conditioning of the data themselves, which remains
very bad even in this case of "good" restrictions. It's not clear to me
how sorting this out based on size alone is possible. Is there an
eigenvalue check on R*inv(X'X)*R' that might identify which standard
errors should be NA?
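To make the question concrete, here is (roughly) the sort of check I have
in mind. It's just a sketch, meant to be run after the script above so
that xlist and Rmat are still defined, and I'm not at all sure it's the
right diagnostic:

  matrix X  = {xlist}                  # regressor matrix (const included)
  matrix Q  = qform(Rmat, inv(X'X))    # R * inv(X'X) * R'
  matrix ev = eigensym(Q)              # eigenvalues (real, since Q is symmetric)
  print ev
  scalar cond_Q = maxc(ev) / minc(ev)
  scalar cond_X = maxc(eigensym(X'X)) / minc(eigensym(X'X))
  printf "condition of R*inv(X'X)*R': %g\ncondition of X'X: %g\n", cond_Q, cond_X

If the eigenvalues of Q are the thing to look at, a threshold on them
might be a more principled way to decide which standard errors get zeroed
than judging each diagonal entry of the covariance matrix by its size alone.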
Lee
On Sun, Nov 18, 2012 at 11:11 AM, Lee Adkins <lee.adkins(a)okstate.edu> wrote:
On Sun, Nov 18, 2012 at 8:51 AM, Allin Cottrell <cottrell(a)wfu.edu> wrote:
> On Sun, 18 Nov 2012, Riccardo (Jack) Lucchetti wrote:
>
> > On Sun, 18 Nov 2012, Allin Cottrell wrote:
> >
> >> If we were to do this, I'd favour restricting the "clean-up" to the
> >> standard errors (printing 0 rather than NA) and let the $vcv
> >> accessor show what was actually computed, warts and all.
> >
> > I think that the flaws from machine precision are of great didactical
> > value. IMHO, teaching students that 1.2345e-30 is in fact zero and they
> > should _distrust_ software that writes "0" instead of 1.2345e-30 is part
> > of teaching good econometrics. That said, in a case like the one Lee
> > brought up, better to have 0 than NA.
>
> Agreed.
>
> There are two aspects of our policy to date that are
> questionable. First, when computing standard errors from a
> variance matrix we've always set the s.e. to NA when we
> encounter a negative diagonal entry. This is reasonable in
> general, but is arguably too strict when we're producing
> restricted estimates.
>
> When we estimate subject to restriction, the "ideal" result is
> that (a) the restrictions are met exactly and (b) whenever a
> restriction stipulates a definite numerical value for a
> parameter, its variance is exactly zero. But -- in line with
> your point above -- in general that ain't gonna happen in
> digital arithmetic. In a case like Lee's example, with a bunch
> of zero restrictions, we expect to find the computed variances
> distributed "randomly" in the close neighbourhood of zero,
> with some of them likely negative. In that case I think it's
> reasonable to print the standard errors as zero, if they're
> close enough, but provide the "true" (numerical, ugly)
> variance matrix for those who want to see it. That's now in
> CVS.
>
> Second, when it comes to retrieving the coefficient or
> standard error vector from a model, we've checked for NAs and
> if any are found we refuse to supply the object (as Lee
> observed). OK, that seems too delicate, or paternalistic, or
> something. So now in CVS you can access $stderr even if it
> contains NAs.
>
> Allin
>
I like this solution since, at least for my purposes, it solves an
immediate problem. I'm trying to stuff the std errors after a _restrict
--full_ statement into a bundle. The function that initiates the bundle
fails because $stderr is not returned. I can _catch_ the error and put
something into the matrix, but that defeats the purpose of using the
_restrict --full_ in this case. I realize the example is extreme, but I'm
stress-testing the set of functions for an RLS Stein-rule package I'm
working on.
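For what it's worth, the relevant bit of what I'm doing looks roughly like
this (a stripped-down sketch, not the actual package code; the bundle and
key names are just placeholders):

  ols price xlist --quiet
  restrict --full
    R = Rmat
    q = r
  end restrict
  bundle b = null
  b["bhat"] = $coeff       # restricted coefficients
  b["rse"]  = $stderr      # this is the line that failed when $stderr held NAs
  # the fallback I'd rather not need:
  # catch matrix se = $stderr
  # if $error
  #     matrix se = zeros(nelem(xlist), 1)
  # endif

With the change in CVS, that last assignment should just work, NAs and all.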
I probably need to reassess how I'm computing the restricted estimates in
order to make the thing backward compatible with previous versions of
gretl. My first version used something similar to the Greene example Allin
gave, but the restrict function is so elegant I couldn't resist using it
instead. Still, being able to put matrices that contain NAs into a bundle
will pay dividends down the road, I think, especially because there are so
many accessors available for subsequent use. gretl already contains
several ways to dress these up once they are available (e.g., misszero()).
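For example, once $stderr comes through with NAs in it, something along
these lines tidies it up for printing (again just a sketch, and it assumes
misszero() will take a matrix argument):

  matrix se = misszero($stderr)    # NA -> 0 for the restricted coefficients
  matrix V  = $vcv                 # the covariance matrix, "warts and all"
  print se
  print V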
Thanks,
Lee
--
Lee Adkins
Professor of Economics
lee.adkins(a)okstate.edu
learneconometrics.com