On 05.12.2016 at 23:15, Riccardo (Jack) Lucchetti wrote:
> On Mon, 5 Dec 2016, Riccardo (Jack) Lucchetti wrote:
>> The only problem is that the elements of "White" are not independent,
>> since their sum MUST equal 5 * 605, so that complicates things a little.
> And that's precisely why the hypergeometric distribution arises. However,
> with n as large as 69, the difference between a hypergeometric and a
> binomial (and hence a Gaussian, for large N) should be barely noticeable.
OK, fair enough. Which means you just want to test the data itself for
(near) normality, without any detour via a uniform distribution.
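
For what it's worth, here's a quick numerical check of that
hypergeometric-vs-binomial claim. The parameters (69 balls, 5 drawn, 5
"marked") are just illustrative, and lchoose() is a little helper of
mine, not a gretl built-in:
<hansl>
# helper: log of the binomial coefficient, via lngamma()
function scalar lchoose (scalar n, scalar k)
    return lngamma(n+1) - lngamma(k+1) - lngamma(n-k+1)
end function

scalar Npop = 69   # population size
scalar Kmark = 5   # "marked" balls in the population
scalar ndraw = 5   # balls drawn without replacement
loop x = 0..ndraw
    # exact hypergeometric pmf vs its binomial approximation
    scalar hg = exp(lchoose(Kmark, x) + lchoose(Npop - Kmark, ndraw - x) \
      - lchoose(Npop, ndraw))
    scalar bi = exp(lchoose(ndraw, x)) * (Kmark/Npop)^x * (1 - Kmark/Npop)^(ndraw - x)
    printf "x = %d: hypergeometric %.5f, binomial %.5f\n", x, hg, bi
endloop
</hansl>
The two columns should come out very close, which is Jack's point.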
I also don't understand Allin's implementation, however. If the null
hypothesis is the uniform, then you just apply invcdf() to the input
data, right? (Well, apart from squeezing it into the [0;1] interval
first.) And if you do that, the test will also have power if you feed
it normal data.
Something like this:
<hansl>
nulldata 1000
series no1 = normal()
# shift to non-negative realizations; subtracting the minimum
# also works when the input is all-positive (e.g. uniform)
series no = no1 - min(no1)
# squeeze into the open interval (0,1), keeping the extreme
# observations away from 0 and 1 where invcdf() blows up
series u = no / max(no)
series so = (u * ($nobs - 1) + 0.5) / $nobs
series nono = invcdf(N, so)
</hansl>
The resulting 'nono' is clearly non-normal: you won't get p-values near 1
by running a normality test on these transformed data. (Change 'normal()'
to 'uniform()' to work under the null.)
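
In case it's useful, the test step itself could be done with gretl's
normtest command; the --all flag runs the whole battery of tests:
<hansl>
# run all the normality tests on the transformed series
normtest nono --all
</hansl>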
But I'm tired and maybe this is beside the point. Good night...
sven