I just read in the manual (and saw in the GUI dialog) that there are
robust standard errors for the panel fixed effects estimator.
Shouldn't that also be documented in the built-in command reference for
the panel command? I mean, it's OK to refer to the manual for details,
but '--robust' should at least be included in the options list, no?
* the time dummies in fixed effects ("dt_1") are labeled differently
from those in arbond ("T1")
* the tab-key or arrow-key cycling order is odd in the dialog asking how
to interpret a dataset (undated, time series, panel)
I discovered what's IMHO a relatively severe bug (severe because it can
go unnoticed and would lead to rubbish afterwards):
Missing values are not propagated when multiplying dummy variables. Example:
series check1 = (Q>100)
series check2 = zeromiss(LF < 0.5)
series checkmult = check1*check2
Here 'checkmult' should have missing values wherever check2 does, but
it doesn't.
I discovered it in a panel context but I haven't tested whether it's
more general than that.
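In the meantime, a possible workaround is to re-impose the missing
values by hand after the multiplication, using ok() to flag where
check2 is valid (a sketch, assuming ok() and the conditional
assignment behave as documented):

```
series check1 = (Q > 100)
series check2 = zeromiss(LF < 0.5)
series checkmult = check1 * check2
# re-impose NAs wherever check2 is missing:
checkmult = ok(check2) ? checkmult : NA
```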
If we want to distinguish between true NAs and nan/inf (as we
probably should), some other design questions come up, as a
consequence of the fact that we would be allowing non-finite
values in series and scalar variables. (Unless, that is, we make
it an error to put non-finite values into such variables.)
I presume that in simple, per-observation calculations such as
y = log(x) or y = x*z we'd want to let IEEE rules prevail, but what
about more complex calculations?
At present we automatically exclude observations with NAs from
regression calculations, means and variances and so on. Should we
do the same for nan/inf, or should we let IEEE rules prevail -- or
should we add a "set" switch to control this?
A practical use case is this:
series lx = log(x)
ols y 0 lx
where the series x contains non-positive values. Right now the bad
log x values are converted to NA and skipped. If we leave them as
nan or -inf then what should we do?
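To make the status quo concrete, here is a small sketch of the
current behaviour as I understand it, with artificial data (series
names are made up for illustration):

```
nulldata 50
series x = normal()    # roughly half the values will be non-positive
series y = normal()
series lx = log(x)     # log of non-positive x currently becomes NA
ols y const lx         # observations with NA in lx are dropped
```

The open question above is what the ols line should do if the bad
values were left as nan or -inf instead of NA.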
Although the logic of what Jack says is valid, I think the current
implementation is a bit confusing--basically because it's unexpected. I
can't think of another program that does this, though I'm sure there must be
some. The others that I use (R, Stata, Gauss) don't.
Since the coding of dummy variables is arbitrary (e.g., you could use
1/-1), why should zero be treated differently from all other scalars
and real numbers? I think it could lead to some unexpected results
(for those who don't dive into the manual). For the sake of
consistency I would prefer that all numbers be treated alike.
On the other hand, if zero*NA were treated as the exception and
handled with a special command, that would make more logical sense--at
least to me.
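To illustrate the consistency point: recoding the very same dummy as
1/-1 would, if I understand the rule correctly, change the outcome of
the interaction, since only a literal zero gets the special treatment.
A sketch with made-up series names:

```
series d01 = (x > 0)      # dummy coded 0/1
series dpm = 2*d01 - 1    # same information coded 1/-1
series a = d01 * z        # 0 * NA -> 0 under the current rule
series b = dpm * z        # (-1) * NA -> NA, so a and b disagree
```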
I find the current implementation kind of confusing, especially for
gretl's main audience (non-programmers like me) and students coming
from other programs that handle this interaction differently (e.g.,
Stata and R). Just my 2 cents, but I think I'm with Berend on this
one....
it just happened to me that after repeatedly saving residuals (from
different VARs) via the GUI under the name 'uhat5', I ended up with
several series named 'uhat5'. I thought they would be overwritten
instead. I didn't know that having several variables with the same
name (although different ID numbers) was possible. Is that behavior a
bug?
(cvs from last month running here)
just a thought:
There was the idea that maybe (just maybe) it could make sense to
enforce clear programming by requiring the use of 'matrix', 'series',
etc. instead of the generic 'genr', at least in scripts.
However, when doing that (as I do), the long keywords appearing almost
everywhere are a bit annoying. So what about introducing 'mat' as an
alias for 'matrix', 'ser' for 'series', 'sca' for 'scalar', and 'str'
for 'string'? 'list' could stay as it is, I guess, or for three-letter
conformity one could also introduce 'lis'.
As I said, just a thought...
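Just to visualize the suggestion (the short forms are of course
hypothetical at this point, the full keywords are current gretl):

```
# current, with full type keywords:
matrix A = I(3)
series e = normal()
scalar s = 1.5
string msg = "hello"

# with the proposed (hypothetical) aliases:
mat A = I(3)
ser e = normal()
sca s = 1.5
str msg = "hello"
```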
when I wanted to use lrvar() I discovered it's only for a single
series. So I ported some code to gretl to estimate the long-run
covariance matrix for multivariate series (along with a helper
function autocovar()). I haven't tested it yet; probably some bugs
remain. Actually, I wanted to ask whether it would make sense to allow
the multivariate case in the built-in lrvar()?
function matrix autocovar (matrix indatamat, int lag, bool demeaned)
    /* computes the autocovariance at lag "lag" for a multivariate
       time series; returns an NxN covariance matrix,
       N = cols(indatamat) */
    scalar T = rows(indatamat)
    matrix datademeaned
    if demeaned
        # caller says the data are already demeaned
        datademeaned = indatamat
    else
        datademeaned = cdemean(indatamat)
    endif
    matrix result = transp(datademeaned[lag+1:T,]) * datademeaned[1:T-lag,]
    result /= T
    return result
end function
function matrix longrunvar (matrix indatamat, bool demeaned, int lagtrunc)
    /* indatamat is a TxN time series matrix;
       returns an NxN matrix (spectral density at frequency zero),
       with Bartlett weights up to lag lagtrunc */
    scalar N = cols(indatamat)
    matrix result = zeros(N, N)
    loop for tau=1..lagtrunc
        matrix Gamma = autocovar(indatamat, tau, demeaned)
        # positive and negative tau range together:
        result += (1 - tau/(lagtrunc+1)) * (Gamma + Gamma')
    endloop
    # add the tau=0 part:
    result += autocovar(indatamat, 0, demeaned)
    return result
end function
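For what it's worth, a minimal usage sketch with simulated data (the
function names are the ones defined above; the truncation lag 4 is an
arbitrary choice for illustration):

```
# a 200x3 matrix of artificial, serially uncorrelated data
matrix X = mnormal(200, 3)
# long-run covariance with Bartlett truncation at lag 4,
# letting the function demean the data (demeaned = 0)
matrix Omega = longrunvar(X, 0, 4)
print Omega
```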