Allin,
A free lecture in all senses of the word, and a very nice introduction to
conducting Monte Carlo tests within -hansl-. Unfortunately, I'm not an
econometrician or a methods innovator - I just use other people's work for
my benefit!!
C
On 10 October 2015 at 00:56, Allin Cottrell <cottrell(a)wfu.edu> wrote:
On Sat, 10 Oct 2015, Clive Nicholas wrote:
On 9 October 2015 at 18:24, Allin Cottrell <cottrell(a)wfu.edu> wrote:
>
> [...]
>
>> For the record, then, let's point out that the two basic approaches to
>> heteroskedasticity in gretl -- namely, switching to "robust" standard
>> errors, or switching from OLS to GLS via the "hsk" command -- do not
>> require taking logs of negative numbers. The following script
>> illustrates. The series y and x contain both positive and negative
>> values, and the data-generating process is heteroskedastic by
>> construction.
>>
>> <hansl>
>> nulldata 50
>> set seed 3711
>> series x = normal()
>> # generate heteroskedastic y (error s.d. proportional to |x|)
>> series y = -1 + 3*x + normal()*x
>> # verify we have negative values in both y and x
>> print y x --byobs
>> # run OLS
>> ols y 0 x
>> # try robust standard errors: no problem
>> ols y 0 x --robust
>> # try GLS: again, no problem
>> hsk y 0 x
>> </hansl>
>>
>> In this case the "hsk" command produces a closer approximation to the
>> true x-slope of 3.0 (2.997, versus 3.098 from OLS), although obviously
>> one would have to replicate the example a large number of times to
>> verify that (as theory says) the hsk estimates are more efficient,
>> given heteroskedasticity.
>
> [...]
>
> Very interesting things happen when you keep increasing the simulation
> sample size by factors of 10.
>
> At N=500, the OLS estimate (3.01) is closer to the true x-slope of 3.0
> than the GLS estimate (3.06). At N=5000, the two were pretty much
> identical, yet further from the true value than before (2.97). At
> N=50000, they both tended back to the true value, at 2.99.
>
> As Arthur Atkinson without his washboard would have said, "How queer!"
>
Actually (sorry, but) not really so interesting, or queer. The sample size
is basically irrelevant. What matters is how many replications you run at a
given sample size. Consider the following:
<hansl>
nulldata 50
set seed 3711
series x = normal()
# generate heteroskedastic y (error s.d. proportional to |x|)
series y = -1 + 3*x + normal()*x
# verify we have negative values in both y and x
print y x --byobs
# run OLS
ols y 0 x
# try robust standard errors: no problem
ols y 0 x --robust
# try GLS: again, no problem
hsk y 0 x
scalar N = 5000
matrix B1 = zeros(N, 2)
loop i=1..N -q
    # draw a fresh heteroskedastic sample each replication
    series y = -1 + 3*x + normal()*x
    # store the OLS slope estimate in column 1
    ols y 0 x --quiet
    B1[i,1] = $coeff[2]
    # store the GLS (hsk) slope estimate in column 2
    hsk y 0 x --quiet
    B1[i,2] = $coeff[2]
endloop
# mean and standard deviation of the slope estimates, by column
eval meanc(B1)
eval sdc(B1)
</hansl>
This produces (the tail of the output):
? eval meanc(B1)
3.0002 2.9993
? eval sdc(B1)
0.26747 0.19028
Since heteroskedasticity does not bias the coefficient estimates, it is
unsurprising that the means of both columns are very close to 3.0 (the
known "true" slope). But heteroskedasticity makes OLS inefficient compared
to GLS, and that is amply confirmed by the substantially larger standard
deviation of the estimated slopes under OLS (column 1 of matrix B1)
relative to GLS (column 2).
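To put a rough number on that efficiency gain, here is a minimal follow-up
sketch (not part of the original script; it assumes the matrix B1 built in
the loop above is still in memory):
<hansl>
# sketch: summarize the OLS-versus-GLS efficiency difference,
# using the B1 matrix built in the loop above
matrix sd = sdc(B1)
printf "std. dev. ratio (OLS/GLS): %.3f\n", sd[1] / sd[2]
printf "implied variance ratio:    %.3f\n", (sd[1] / sd[2])^2
</hansl>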
(Reckoned worth replying since this sort of thing shows off the ease of
doing Monte Carlo in hansl.)
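(A hypothetical extension, not in the message above: since the DGP is known
here, one could also benchmark against "exact" GLS, i.e. weighted least
squares with the true weights 1/x^2, the error standard deviation being |x|
by construction. A sketch using gretl's "wls" command, reusing the series x
and the scalar N from the script above:)
<hansl>
# sketch: exact GLS via WLS with the known weights 1/x^2
series w = 1/x^2
matrix B2 = zeros(N, 1)
loop i=1..N -q
    series y = -1 + 3*x + normal()*x
    wls w y 0 x --quiet
    B2[i] = $coeff[2]
endloop
# compare with sdc(B1): should be no larger than the hsk column
eval sdc(B2)
</hansl>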
Allin Cottrell
_______________________________________________
Gretl-users mailing list
Gretl-users(a)lists.wfu.edu
http://lists.wfu.edu/mailman/listinfo/gretl-users
--
Clive Nicholas
"My colleagues in the social sciences talk a great deal about methodology.
I prefer to call it style." -- Freeman J. Dyson