Jack,
thanks for the suggestions, but I may have been wasting your time (I
apologize). Why? Because "suddenly", the following simple script (assume
variables have been generated previously, and ols estimation has been run)
<hansl>
matrix b = $coeff
list X = const Bet E B
function series unifll(series Ybin, list X, matrix b)
    # index function, interpreted as P(Ybin = 1)
    series ndx = lincomb(X, b)
    # one log per observation, chosen by the outcome
    series liky = Ybin ? ln(ndx) : ln(1-ndx)
    return liky
end function
catch mle logl = unifll(Ybin, X, b)
    params b
end mle
</hansl>
returns in the output window a series of
<<Warning: log: Domain error>>
messages
but below them the mle estimation results! (and very sensible estimates
too). So it appears the mle command does "insist" on samples with
inadmissible values without any help from "if" statements, at least when
the log-likelihood is embedded in a user-defined function.
I don't know what was happening before, evidently some silly mistake of
mine.
Thanks again.
Alecos Papadopoulos
PhD Candidate
Athens University of Economics and Business, Greece
Department of Economics
https://alecospapadopoulos.wordpress.com/
On 7/9/2015 12:15, gretl-users-request(a)lists.wfu.edu wrote:
Date: Mon, 7 Sep 2015 11:15:23 +0200 (CEST)
From: "Riccardo (Jack) Lucchetti"<r.lucchetti(a)univpm.it>
To: Gretl list<gretl-users(a)lists.wfu.edu>
Subject: Re: [Gretl-users] Maximum Likelihood estimation of Linear
Probability specification (Binary Choice model)
On Mon, 7 Sep 2015, Alecos Papadopoulos wrote:
> >I need to estimate a "Linear Probability" Specification in a Binary Choice
> >model, using specifically maximum likelihood estimation.
> >The log-likelihood of such a specification is
> >
> >loglik = y*log(x.*b) + (1-y)*(1-x.*b)
> >
> >where y is the scalar DepVar, "x" is the vector of regressors and "b" is the
> >vector of unknown parameters.
I guess you mean
loglik = y*log(x.*b) + (1-y)*log(1-x.*b)
and besides, since y is binary, you might want to use
<hansl>
series ndx = lincomb(X, b)
series loglik = y ? ln(ndx) : ln(1-ndx)
</hansl>
instead, which saves a few floating-point operations.
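For reference, the corrected log-likelihood can be written out per observation as below. This is just a restatement of the two expressions above; the notation x_i'b for the index computed by lincomb(X, b) is shorthand introduced here, not something from the original message.

```latex
\[
\ell_i(b) = y_i \ln(x_i' b) + (1 - y_i) \ln(1 - x_i' b)
          = \begin{cases}
              \ln(x_i' b)     & \text{if } y_i = 1, \\
              \ln(1 - x_i' b) & \text{if } y_i = 0,
            \end{cases}
\qquad
\ell(b) = \sum_i \ell_i(b).
\]
```

For binary y, each observation therefore needs only one of the two logs to be defined: x_i'b > 0 when y_i = 1, and x_i'b < 1 when y_i = 0.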
> >The argument of the natural logarithm is not inherently constrained to range
> >in (0,1), so in the iterative maximization process, negative values of x.*b
> >or (1-x.*b) may be encountered at some of the observations, which is not
> >admissible for the logarithm.
The syntax I suggested above circumvents the problem to some extent, since it
ensures that you take the log only once per observation, instead of twice.
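To get a feel for how much the one-log-per-observation form helps, one can count the index values that fall outside (0,1) and compare that with the number of observations where the branch actually being logged is out of range. What follows is only a sketch under made-up assumptions: the seed, the trial parameter vector b, and the series names "outside" and "harmful" are hypothetical and chosen purely for illustration.

<hansl>
nulldata 1000
set seed 1234
series x = uniform()
series p = -0.5 + 2 * x
series y = uniform() < p
list X = const x
# hypothetical trial value of the parameters (not an estimate)
matrix b = {-0.4; 1.9}
series ndx = lincomb(X, b)
# index outside the unit interval, regardless of the outcome
series outside = (ndx <= 0) || (ndx >= 1)
# the log is taken of ndx when y=1 and of 1-ndx when y=0
series harmful = y ? (ndx <= 0) : (ndx >= 1)
printf "outside (0,1): %d, inadmissible log argument: %d\n", sum(outside), sum(harmful)
</hansl>

By construction the second count can never exceed the first, since an observation is flagged as harmful only if its index is also outside (0,1).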
> >Left as is, with a "catch" preceding the mle command, Gretl ignores the lot
> >and moves to the next round of randomly generated regressors (this is a Monte
> >Carlo study) etc. But there are cases where this happens with /all/ generated
> >samples. Note that the true specification here is indeed the Linear
> >Probability model (underlying error is Uniform).
> >
> >Is there a way to tell Gretl to continue "insisting" on the sample where at
> >some step it encountered inadmissible values for the logarithm?
I guess you can introduce a few checks via an ad-hoc function (sample code
follows):
<hansl>
function scalar badobs(series m)
    # count index values above 1 and below 0
    scalar up = sum(m>1)
    scalar dn = sum(m<0)
    if xmax(up, dn) > 0
        printf "UP = %d, DOWN = %d\n", up, dn
    endif
    return up + dn
end function
nulldata 1000
series x = uniform()
series p = -0.5 + 2 * x
series y = uniform() < p
list X = const x
ols y X
matrix b = $coeff
series loglik = NA
mle loglik = y ? ln(ndx) : ln(1-ndx)
    series ndx = lincomb(X, b)
    scalar bad = badobs(ndx)
    params b
end mle -v
series B = missing(loglik)
xtab y B
</hansl>
Note that with the solution above, having "bad" values for the index
function is not really a problem if they happen to occur for the
"appropriate" value of y (for example: ndx < 0 is not a problem if y=0).
-------------------------------------------------------
Riccardo (Jack) Lucchetti
Dipartimento di Scienze Economiche e Sociali (DiSES)