On Wed, 22 Apr 2020, Allin Cottrell wrote:
On Wed, 22 Apr 2020, John C Frain wrote:
> On Wed, 22 Apr 2020 at 11:32, Sven Schreiber <svetosch(a)gmx.net> wrote:
>
>> Am 21.04.2020 um 21:54 schrieb robdans2(a)gmail.com:
>>> Thanks for the quick answer, what do you think would be a better fix?
>> Changing the variable (for example making it a log)?
>>
>> All I'm saying is that we're talking about an outlier that is going to
>> be removed. I know that this affects the test outcome of a diagnostic
>> test for heteroskedasticity, but it is still a different topic, most of
>> the time.
>
> Failure of your heteroskedasticity test can be regarded as indicating
> misspecification. [...]
True indeed. I have one more observation to throw in.
In his introductory econometrics textbook, Wooldridge has a house-price example
which he treats as a poster-case for heteroskedasticity, its detection and
treatment. I was messing around with the dataset in question (scatter plots
and so on) and noticed that it contained one serious outlier: a house with a
huge lotsize but very low price. In any regression of price on a list of
regressors containing lotsize, the standard tests for heteroskedasticity
would light up like Christmas trees. But remove that one observation and
everything was fine and dandy. This was actually a poster-case for outlier
detection.
While we're at it, let me chime in with a pet peeve of mine. There's no
such thing as heteroskedasticity "in the data"; heteroskedasticity is,
possibly, "in the model". Example: consider a variable y such that
y_i = b * x_i + e_i * z_i
where b is a constant parameter, e_i is white noise with variance s^2 and
x_i and z_i are observable and independent of each other. Of course the
regression of y on x (aka the conditional expectation of y given x) is (i)
linear and (ii) homoskedastic, since V(y | x) = V(e*z | x) = s^2 * E(z^2) (a
constant, given that e has mean zero and is independent of x and z). But if
you consider z as an explanatory variable, you move to a model of y | x,z;
the regression stays the same (E(y | x,z) = b*x), but the model becomes
heteroskedastic, since V(y | x,z) = s^2 * z_i^2.
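For anyone who wants to see this at work, here is a rough simulation sketch in Python with numpy. The parameter values, the sample size and the choice of normal distributions (with z centred on 1) are my own assumptions, picked purely for illustration. The idea is to regress the squared residuals from the regression of y on x first on x^2 and then on z^2: the first slope should be close to zero (no heteroskedasticity as far as x is concerned), the second close to s^2.

```python
import numpy as np

# Illustrative values only: none of these numbers come from real data.
rng = np.random.default_rng(0)
n = 200_000
b, s = 2.0, 1.5

x = rng.normal(size=n)            # observable regressor
z = rng.normal(loc=1.0, size=n)   # observable, independent of x; E(z^2) = 2
e = rng.normal(scale=s, size=n)   # white noise, mean 0, variance s^2
y = b * x + e * z

# OLS of y on x (no constant, as in the model above)
b_hat = (x @ y) / (x @ x)
u2 = (y - b_hat * x) ** 2         # squared residuals


def slope(a, c):
    """OLS slope of c on a, with a constant included."""
    a = a - a.mean()
    return (a @ (c - c.mean())) / (a @ a)


slope_x2 = slope(x**2, u2)   # should be near 0: homoskedastic given x only
slope_z2 = slope(z**2, u2)   # should be near s^2: heteroskedastic given x, z
mean_u2 = u2.mean()          # should be near s^2 * E(z^2)

print(f"b_hat          = {b_hat:.3f}")
print(f"slope on x^2   = {slope_x2:.3f}")
print(f"slope on z^2   = {slope_z2:.3f}  (s^2 = {s**2:.3f})")
print(f"mean of u^2    = {mean_u2:.3f}  (s^2 * E(z^2) = {2 * s**2:.3f})")
```

The same data, then, look homoskedastic or heteroskedastic depending on what you condition on, which is the whole point.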
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------