On Mon, 7 Sep 2015, Alecos Papadopoulos wrote:
I need to estimate a "Linear Probability" specification in a Binary Choice
model, using specifically maximum likelihood estimation.
The log-likelihood of such a specification is
loglik = y*log(x.*b) + (1-y)*(1-x.*b)
where y is the scalar DepVar, "x" is the vector of regressors and "b" is the
vector of unknown parameters.
I guess you mean
loglik = y*log(x.*b) + (1-y)*log(1-x.*b)
and besides, since y is binary, you might want to use
<hansl>
series ndx = lincomb(X, b)
series loglik = y ? ln(ndx) : ln(1-ndx)
</hansl>
instead, which saves a few floating-point operations.
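To spell out why the two forms agree, here is the per-observation contribution written out. The notation x_i'b for the index of observation i is my own shorthand for what is written x.*b above:

\[
\ell_i(b) = y_i \log(x_i'b) + (1-y_i)\log(1 - x_i'b)
= \begin{cases}
\log(x_i'b) & \text{if } y_i = 1 \\
\log(1 - x_i'b) & \text{if } y_i = 0
\end{cases}
\]

Since y_i is either 0 or 1, one of the two terms always drops out, so only one logarithm per observation is actually needed.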
The argument of the natural logarithm is not inherently constrained to range
in (0,1), so in the iterative maximization process, negative values of x.*b
or (1-x.*b) may be encountered at some of the observations, which is not
admissible for the logarithm.
The syntax I suggested above mitigates the problem to some extent, since it
ensures that you take only one log per observation (the one that matters for
the observed value of y) instead of both.
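As a rough illustration of the point, here is a minimal sketch that counts only the observations whose selected log term is undefined. The helper name n_undefined and the six data values are made up for the example; they are not part of the original problem.

<hansl>
# hypothetical helper: counts observations whose selected log term is undefined
function scalar n_undefined (series ndx, series y)
    # y = 1 uses ln(ndx), which is undefined for ndx <= 0
    scalar lo = sum((y == 1) && (ndx <= 0))
    # y = 0 uses ln(1-ndx), which is undefined for ndx >= 1
    scalar hi = sum((y == 0) && (ndx >= 1))
    return lo + hi
end function

# made-up illustrative values
nulldata 6
series y = {1; 1; 0; 0; 1; 0}
series ndx = {-0.1; 0.4; -0.3; 1.2; 1.5; 0.6}
printf "index outside (0,1): %d\n", sum((ndx <= 0) || (ndx >= 1))
printf "undefined log terms: %d\n", n_undefined(ndx, y)
</hansl>

With these values, four index values lie outside (0,1), but only two of them (the first and the fourth) should hit an undefined logarithm, because the other two fall on the branch that is not evaluated.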
Left as is, with a "catch" preceding the mle command, Gretl ignores the lot
and moves to the next round of randomly generated regressors (this is a Monte
Carlo study) etc. But there are cases where this happens with /all/ generated
samples. Note that the true specification here is indeed the Linear
Probability model (underlying error is Uniform).
Is there a way to tell Gretl to continue "insisting" on the sample where at
some step it encountered inadmissible values for the logarithm?
I guess you can introduce a few checks via an ad-hoc function (sample code
follows):
<hansl>
function scalar badobs(series m)
    scalar up = sum(m>1)
    scalar dn = sum(m<0)
    if xmax(up, dn) > 0
        printf "UP = %d, DOWN = %d\n", up, dn
    endif
    return up + dn
end function
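nulldata 1000
series x = uniform()
# the true probability ranges over (-0.5, 1.5), so part of it lies outside [0,1]
series p = -0.5 + 2 * x
series y = uniform() < p
list X = const x
# OLS estimates serve as starting values
ols y X
matrix b = $coeff
series loglik = NA
mle loglik = y ? ln(ndx) : ln(1-ndx)
    series ndx = lincomb(X, b)
    scalar bad = badobs(ndx)
    params b
end mle -v
# cross-tabulate y against observations with a missing log-likelihood
series B = missing(loglik)
xtab y B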
</hansl>
Note that with the solution above, having "bad" values for the index
function is not really a problem if they happen to occur for the
"appropriate" value of y (for example: ndx < 0 is not a problem if y=0,
since only ln(1-ndx) is evaluated for that observation).
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)
Università Politecnica delle Marche
(formerly known as Università di Ancona)
r.lucchetti(a)univpm.it
http://www2.econ.univpm.it/servizi/hpp/lucchetti
-------------------------------------------------------