The revised version of the BigCrush test suite has now completed on
my Linux system. It reports no failures and a CPU time of 32 hours
13.5 minutes. This is a little under 9% longer than the execution time
for the Voss 24-bit version of the ziggurat.
Gordon
> A word of warning about running the BigCrush test. Looking through
> the results of the first run I noticed that some of the tests
> generate test statistics that would reject the relevant hypothesis at
> the 1% significance level, even though all of the tests were reported
> as having passed (since the criterion is p in the range
> [0.001,0.999]). Since we are dealing with a random number generator
> it is possible that one run may lead to no failures but another may
> generate a number of failures.
If all is well, one expects to see p-values randomly distributed
on (0, 1): if you didn't get some values < .01 on a large set of
tests that would itself be suspicious.
It would require a modified test program, but in case one gets
p-values that look "too small" on some tests, it's possible to
rerun those tests in particular rather than redoing the whole
thing.
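
As a rough illustration of the point about uniformly distributed
p-values, the sketch below computes how many p-values below 0.01 one
would expect in a single run, and how likely it is to see none at all.
It assumes the p-values are independent and Uniform(0, 1) under the
null hypothesis; the count of 160 p-values per run is a placeholder
chosen for illustration, not a figure taken from the reports above.

```python
# Illustrative only: n_stats is an assumed number of p-values per run,
# and independence between test statistics is an assumption.

def expected_below(alpha, n_stats):
    """Expected number of p-values below alpha if all are Uniform(0, 1)."""
    return alpha * n_stats


def prob_none_below(alpha, n_stats):
    """Probability that no p-value falls below alpha, assuming independence."""
    return (1.0 - alpha) ** n_stats


if __name__ == "__main__":
    n_stats = 160  # hypothetical count of p-values in one run
    alpha = 0.01
    print(f"expected p-values below {alpha}: {expected_below(alpha, n_stats):.2f}")
    print(f"P(no p-value below {alpha}): {prob_none_below(alpha, n_stats):.3f}")
```

Under these assumptions a run with no p-value below 0.01 would occur
only about one time in five, so a few such values are the normal case
rather than a warning sign.
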
> Hence I ran the same test a second time. The execution times were
> very similar (29h 40m vs 29h 41m). The second run reported a single
> failure - Test 89 PeriodsInStrings, r = 20 with a p-value of
> 6.4e-4.
Personally I wouldn't be too worried about that -- it looks
within the bounds of what you might expect. P-values of less
than, say, 10^{-6} would be problematic. The failures that
Doornik talks about are values < 10^{-300} and I've seen nothing
like that with our code.
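
To put a number on "within the bounds of what you might expect", here
is a minimal sketch of the chance that a sound generator produces at
least one reported failure under the [0.001, 0.999] criterion. As
before, it assumes independent Uniform(0, 1) p-values, and the figure
of 160 p-values per run is a placeholder for illustration only.

```python
# Illustrative only: assumes independent Uniform(0, 1) p-values and a
# hypothetical number of p-values per run.

def prob_any_flag(lo, hi, n_stats, n_runs=1):
    """P(at least one p-value outside [lo, hi]) over n_runs independent runs."""
    p_inside = hi - lo  # chance that a single p-value passes
    return 1.0 - p_inside ** (n_stats * n_runs)


if __name__ == "__main__":
    n_stats = 160  # hypothetical count of p-values in one run
    one_run = prob_any_flag(0.001, 0.999, n_stats)
    two_runs = prob_any_flag(0.001, 0.999, n_stats, n_runs=2)
    print(f"P(at least one flagged p-value, one run):  {one_run:.3f}")
    print(f"P(at least one flagged p-value, two runs): {two_runs:.3f}")
```

With these placeholder numbers, a single flagged p-value somewhere in
two full runs is close to an even bet, which is why one value of
6.4e-4 carries little weight on its own.
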
> Sven reported that the updated version of the ziggurat executes
> substantially faster than the earlier version. The early tests
> in the BigCrush suite give a different picture - the execution
> times are all slightly longer using the new gretl_one_snormal
> than using the previous ran_normal_ziggurat.
As Sven said later, this is expected. The new code is
substantially faster than what we had before we started down the
Ziggurat road; but it's necessarily a little slower than the Voss
code, which uses 24-bit values.
Allin.