OK, this is more interesting than I thought (and I'd welcome
comments/criticism from others).
Salvatore's question is prompted by a difference in results between
gretl's Breusch-Pagan test for heteroskedasticity (via the command
modtest --breusch-pagan) and the test as implemented in Eviews.
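For reference, this is how the gretl test is invoked -- a minimal
sketch, where "y" and "x" are just placeholder names:
<hansl>
# estimate the original model, then test its residuals
ols y 0 x
modtest --breusch-pagan
</hansl>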
First point: on the face of it there's a difference between the test
actually proposed by Breusch and Pagan (Econometrica, 1979), which
I'll call Test1, and that performed in Eviews, which I'll call
Test2. (Eviews is by no means alone in using Test2; it's also the
one described in the Wikipedia entry.) Both tests involve auxiliary
regressions estimated via OLS, with the same set of independent
variables, but
* In Test1 the dependent variable is the squared residual from the
original regression divided by the ML estimator of the error
variance for that regression, and the LM test statistic is half of
the explained sum of squares from the auxiliary regression. (This is
very clearly set out on the first page of the B-P paper.)
* In Test2 the dependent variable is the (unscaled) squared residual
from the original regression, and the LM test statistic is sample
size times R^2 from the auxiliary regression. (Both statistics are
written out just below.)
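In symbols, with u_t the residual from the original regression, SSR
its sum of squared residuals and T the sample size:

  Test1: regress g_t = u_t^2 / (SSR/T) on the regressors;
         LM1 = (explained sum of squares from this regression)/2
  Test2: regress u_t^2 on the regressors; LM2 = T * R^2

Under the null, both statistics are referred to the chi-square
distribution with degrees of freedom equal to the number of
regressors in the auxiliary regression, excluding the constant.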
At first I thought this difference might be merely apparent (that
is, the results should be numerically identical if done right). But
I couldn't prove that, and (unless I've messed up) simulation shows
that they're not numerically identical at all. (Maybe they're
asymptotically equivalent? And if I'm not mistaken, Test2 is the
"studentized" variant of the test proposed by Koenker in 1981, which
drops the assumption of normally distributed errors.) In that case,
which one is "right" or "better"?
Monte Carlo to the rescue. I show my test script below. It can be
run in two modes, depending on the value of "H0_true" near the top
of the script.
* If H0_true = 1 the simulated error is homoskedastic and the
rejection rates for the two tests give a measure of their sizes. I'm
using the 5 percent significance level, so ideally the empirical
rejection rate should be close to 0.05.
* If H0_true = 0 the simulated error is by construction
heteroskedastic and the rejection rates give a measure of power.
On running the script in both modes several times, here's what I
see:
* Size: both tests seem about right. Sometimes one is closer to the
nominal size of 0.05, sometimes the other.
* Power: Test1 (the original) seems to be substantially better.
Here are rejection rates from a few runs:
H0_true = 1 (size)
Test1 0.049, Test2 0.053
Test1 0.043, Test2 0.046
Test1 0.048, Test2 0.047
H0_true = 0 (power)
Test1 0.548, Test2 0.311
Test1 0.546, Test2 0.316
Test1 0.698, Test2 0.520
If anyone sees an error in my simulation, please let me know!
<hansl>
set verbose off
# sample size for regressions
nulldata 100
# uncomment below for reproducible results
# set seed 76543
# error term is homoskedastic?
H0_true = 1
# independent variable
series x = normal()
# 5 percent critical value for chi-square(1)
scalar X1crit = critical(X, 1, 0.05)
# number of replications
K = 5000
# recorders
matrix reject = zeros(1,2)
matrix LMdiff = zeros(K,1)
loop i=1..K --quiet
e = normal()
if H0_true
# homoskedastic
y = 10 + x + e
else
# heteroskedastic
y = 10 + x + 0.2*x*e
endif
# estimate original regression
ols y 0 x --quiet
u2 = $uhat^2
# Test1: as Breusch and Pagan (1979)
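# ML estimate of the error variance: SSR/T, no df correction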
vml = $ess / $T
g = u2/vml
ols g 0 x --quiet
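# gretl's $ess is the residual sum of squares, so sst(g) - $ess
# gives the explained sum of squares of the auxiliary regression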
LM1 = (sst(g) - $ess)/2
reject[1] += LM1 > X1crit
# Test2: as Wikipedia, Eviews
ols u2 0 x --quiet
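# $trsq: sample size times R^2 from the last regression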
LM2 = $trsq
reject[2] += LM2 > X1crit
LMdiff[i] = LM1 - LM2
endloop
reject /= K
printf "Rejection rates at 5 percent level:\n"
printf " Test1 %.3f, Test2 %.3f\n", reject[1], reject[2]
colnames(LMdiff, "LMdiff")
summary --matrix=LMdiff --simple
</hansl>
Allin