On Wed, 27 Apr 2016, Mikael Postila wrote:
> We've been experiencing a problem where exactly the same data yields
> different regression results depending on A) which computer is used
> and B) when the regression is run on the same computer. [...]
As Jack says, we'd really have to see what commands you're executing.
> In principle all these regressions ought to be solvable in closed
> form. Just wondering if one of the following could be the reason:
> - some algorithm is used in order to make calculations faster
Certainly, we use a highly optimized LAPACK library (OpenBLAS), but
this should not in itself produce results that vary between runs
(though there may be minor variation across different OpenBLAS
releases).
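To illustrate why a change of BLAS can matter at all: floating-point addition is not associative, so a library that reorders operations (different blocking, vectorization, or thread counts across releases) can legitimately produce tiny differences. A minimal Python demonstration of the underlying arithmetic fact:

```python
# Floating-point addition is not associative: regrouping the same
# three terms changes the result in the last bit or two.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)       # False
print(abs(a - b))   # a discrepancy on the order of 1e-16
```

Differences of this size are harmless in themselves, but they can be amplified by ill-conditioned problems or iterative optimizers.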
> - somewhere in the Gretl code a random number generator is used
Explicit randomization in the context of estimation occurs only if (a)
the user calls for it (Monte Carlo), (b) the --jitter option is used
when estimating a VECM (simulated annealing), or (c) the estimator
requires the GHK algorithm for evaluating a multivariate normal
distribution. Variation from these sources can be suppressed by using
"set seed <some integer>" at the start of a gretl script.
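The principle behind "set seed" is the same in any environment: fixing the RNG state makes a stochastic computation repeatable. A toy Python sketch (the estimator here is a made-up stand-in, not gretl code):

```python
import random

def noisy_estimate(seed=None):
    # Toy stand-in for an estimator with a stochastic component
    # (think simulated annealing, or GHK simulation draws).
    rng = random.Random(seed)
    return sum(rng.gauss(0, 1) for _ in range(100))

# Without a fixed seed, two runs generally differ; with a fixed
# seed -- the analogue of gretl's "set seed <some integer>" --
# the result is exactly reproducible.
print(noisy_estimate(seed=42) == noisy_estimate(seed=42))  # True
```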
> - some rounding rule applies computer internal clock (odd/even date)
Not that we're aware of, though it's conceivable the Microsoft C
library does some such thing.
We have a large collection of gretl scripts of all sorts that we run
before each release -- to check, among other things, for any
differences in gretl output from previous "known good" results. On
that basis I can say that the printed results always agree exactly
except for cases where we have deliberately changed the gretl code, or
-- where results depend on numerical optimization -- we've updated the
C compiler. Nonlinear results are quite likely to vary at the 5th or
6th digit across different compilers.
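Given that last point, an exact-match comparison is too strict for results that depend on numerical optimization. A small sketch of the kind of tolerance-based check one might use instead (this is an illustration, not gretl's actual test machinery; the function name is mine):

```python
import math

def agree_to_digits(x, y, digits=5):
    """Check that two results agree to roughly `digits` significant digits."""
    return math.isclose(x, y, rel_tol=10.0 ** (-digits))

# Two hypothetical nonlinear estimates differing in the 6th digit,
# as might happen across compilers:
print(agree_to_digits(1.234567, 1.234561, digits=5))  # True
print(agree_to_digits(1.234567, 1.234561, digits=7))  # False
```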
Allin Cottrell