A word of warning about running the BigCrush test. Looking through
the results of the first run I noticed that some of the tests
generate test statistics that would reject the relevant hypothesis at
the 1% significance level, even though all of the tests were reported
as having passed (since the pass criterion is a p-value in the range
[0.001, 0.999]). Since we are dealing with a random number generator,
one run may produce no failures while another produces several.
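As a rough sanity check on how often "suspicious" p-values should turn up even when the generator is fine, here is a sketch under the idealised assumption that each test statistic yields an independent, uniform p-value. The count of statistics (n_stats = 160) is an illustrative guess, not the exact BigCrush figure.

```python
# Expected counts of suspicious p-values from a perfectly good generator,
# assuming n_stats independent, uniformly distributed p-values.
n_stats = 160  # illustrative guess, not the exact BigCrush count

# Probability of falling outside TestU01's pass band [0.001, 0.999]
p_fail_band = 0.001 + (1 - 0.999)   # = 0.002
# Probability of a two-sided rejection at the 1% level, i.e. outside [0.01, 0.99]
p_reject_1pct = 0.01 + (1 - 0.99)   # = 0.02

print(f"expected reported failures per run:   {n_stats * p_fail_band:.2f}")
print(f"expected 1%-level rejections per run: {n_stats * p_reject_1pct:.2f}")

# Chance of at least one reported failure in a single clean run
p_any_fail = 1 - (1 - p_fail_band) ** n_stats
print(f"P(at least one failure per run):      {p_any_fail:.3f}")
```

On these numbers a clean generator still reports a failure in roughly a quarter of runs, and a handful of 1%-level rejections per run is entirely unremarkable.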
Hence I ran the same test a second time. The execution times were
very similar (29h 40m vs 29h 41m). The second run reported a single
failure: Test 89 PeriodsInStrings, r = 20, with a p-value of
6.4e-4. Hence one should really run these tests several times to
get a proper assessment of the frequency of failures - tedious
given the amount of time required, but essential. Further,
comparisons across operating systems or hardware shouldn't be based
on a single run only.
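Pooling the results of repeated runs gives a crude estimate of the failure rate. The sketch below uses the two runs reported above (no failures, then one); the number of statistics per run is an illustrative guess, and the interval is a rough normal approximation.

```python
# Pool failures across repeated BigCrush runs to estimate the failure rate.
import math

stats_per_run = 160      # illustrative guess at statistics per run
failures = [0, 1]        # run 1: all passed; run 2: one (PeriodsInStrings)

n = len(failures) * stats_per_run
k = sum(failures)
rate = k / n
se = math.sqrt(rate * (1 - rate) / n)   # normal-approximation standard error

print(f"pooled failure rate: {rate:.5f} +/- {1.96 * se:.5f}")
```

With only two runs the interval is very wide, which is the point: many runs are needed before the observed failure frequency says anything sharp about the generator.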
I am now running the revised version of glibtest with the Dec 29th
version of glib.c and will report the results when they
finish. However, I have one initial observation. Sven reported that
the updated version of the ziggurat executes substantially faster
than the earlier version. The early tests in the BigCrush suite give
a different picture: the execution times are all slightly longer
with the new gretl_one_snormal than with the previous
ran_normal_ziggurat. Is this a consequence of the change needed to
use one and a quarter random ints per normal draw, since the previous
code in ran_normal_ziggurat seems to correspond to the Voss
procedure? As a rough guess the increase in execution time is of the
order of 5-10%.
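For what it's worth, the "one and a quarter random ints" figure is consistent with the following bit-budget arithmetic. The mechanism here is my assumption, not read from the code: the revised sampler draws a full 32-bit int for the candidate value plus 8 independent bits for the ziggurat layer index, rather than reusing low bits of the same int.

```python
# Bit-budget arithmetic behind "one and a quarter random ints per draw".
# Assumption (mine, not taken from gretl_one_snormal): the new code uses
# 8 separate bits for the layer index instead of low bits of the same int.
bits_per_draw_old = 32        # index reused from the low bits of one int
bits_per_draw_new = 32 + 8    # independent bits drawn for the index
ints_per_draw_new = bits_per_draw_new / 32
print(ints_per_draw_new)      # 1.25
```

If that reading is right, the extra random-bit consumption alone could plausibly account for a single-digit percentage slowdown, roughly in line with the 5-10% observed.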
Gordon