On Mon, 5 Dec 2016, Sven Schreiber wrote:
On 05.12.2016 at 22:31, Allin Cottrell wrote:
> > Yes, this seems to be an artifact coming from the fact that the vector
> > of cumulated sorted relative white-ball frequencies is, so to speak,
> > "super-uniform" -- much more uniform than a set of draws from U[0,1].
> I still don't think this invented terminology is helpful (too Gaussian to be
> Gaussian, super-uniform,... what's next :-) but anyway that's not the core
> of the issue...
> > So I guess my question at this point is how to (or whether it's even
> > possible to) map from a set of counts produced by equi-probable draws to
> > U[0,1]. Note that the counts themselves will surely not be uniform:
> > counts in the neighborhood of the expected value should be more numerous
> > than extreme values.
> This is one of the first hits that I've found:
> https://www.mnlottery.com/games/figuring_the_odds/hypergeometric_distribu...
> I haven't checked and cannot vouch that that is correct. But it would seem to
> confirm my initial feeling that the normal distribution has nothing to do
> with this setup.
The way the normal distribution got into this question was, on the
face of it, reasonable. If you have an r.v. that is supposed to be
U(0,1), one way of testing it for uniformity is to pass it through
the inverse normal cdf and test the result for normality (for which
we have more than one good routine "off the shelf").
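To make that procedure concrete, here is a minimal Python sketch (not gretl
code, and not anything that was actually run in this thread). It assumes a
Jarque-Bera statistic as the normality test, which is only one possible
choice, and the function names are made up for illustration. Note that
Jarque-Bera looks only at skewness and kurtosis, so it checks the shape of
the transformed data rather than N(0,1) specifically.

```python
import math
import random
from statistics import NormalDist


def jarque_bera(z):
    """Return (JB statistic, asymptotic p-value) for the sample z."""
    n = len(z)
    mean = sum(z) / n
    m2 = sum((v - mean) ** 2 for v in z) / n
    m3 = sum((v - mean) ** 3 for v in z) / n
    m4 = sum((v - mean) ** 4 for v in z) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 df,
    # whose survival function is exp(-x/2).
    return jb, math.exp(-jb / 2.0)


def uniformity_via_normality(u):
    """Pass u through the inverse normal cdf, then test for normality.

    Every element of u must lie strictly between 0 and 1.
    """
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(v) for v in u]
    return jarque_bera(z)


n = 2000
rng = random.Random(12345)  # arbitrary seed, for reproducibility only

# Case 1: independent pseudo-random draws from U(0,1).
u_iid = [rng.random() for _ in range(n)]
jb_iid, p_iid = uniformity_via_normality(u_iid)

# Case 2: an evenly spaced grid on (0,1), i.e. no randomness at all.
u_grid = [(i + 0.5) / n for i in range(n)]
jb_grid, p_grid = uniformity_via_normality(u_grid)

print("iid draws:   JB = %.3f, p-value = %.3f" % (jb_iid, p_iid))
print("even grid:   JB = %.3f, p-value = %.3f" % (jb_grid, p_grid))
```

The evenly spaced grid is included only as a caricature of input that is
more regular than independent draws would be: it tends to give a very small
test statistic, so a test of this kind will not flag it.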
However, I'm now coming to the conclusion that there's no obvious
way (and perhaps no way) to take the white-ball count data and turn
it into something that ought theoretically to be U(0,1) -- which
would be step zero towards applying the method just mentioned. I
think Jack's point about failure of independence is important:
cumulating sorted relative count-frequencies will give you a
quantity that's distributed on [0,1] OK, but it won't act like a set
of independent random draws.
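A small simulation can illustrate the point. This is a sketch under stated
assumptions, not the actual lottery data: the pool size, balls per drawing
and number of drawings are hypothetical, and comparing the coefficient of
variation of the spacings is just one simple way to show the contrast.

```python
import random
from statistics import mean, pstdev

N_BALLS = 69       # hypothetical pool size
PER_DRAW = 5       # hypothetical number of white balls per drawing
N_DRAWINGS = 1000  # hypothetical number of drawings

rng = random.Random(2016)  # arbitrary seed, for reproducibility only

# Count how often each ball comes up under equi-probable drawing
# without replacement within each drawing.
counts = [0] * N_BALLS
for _ in range(N_DRAWINGS):
    for ball in rng.sample(range(N_BALLS), PER_DRAW):
        counts[ball] += 1

# Sort the relative frequencies and cumulate them.
total = sum(counts)
rel_freq = sorted(c / total for c in counts)
cumulated = []
running = 0.0
for f in rel_freq:
    running += f
    cumulated.append(running)


def spacing_cv(points):
    """Coefficient of variation of the gaps between sorted points,
    with the first gap measured from 0."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip([0.0] + pts[:-1], pts)]
    return pstdev(gaps) / mean(gaps)


# Benchmark: the same number of independent U(0,1) draws.
iid_uniform = [rng.random() for _ in range(N_BALLS)]

cv_cum = spacing_cv(cumulated)
cv_iid = spacing_cv(iid_uniform)
print("spacing CV, cumulated sorted frequencies: %.3f" % cv_cum)
print("spacing CV, independent uniform draws:    %.3f" % cv_iid)
```

The gaps between successive cumulated values are just the sorted relative
frequencies, which all sit close to 1/N_BALLS, so the points are spread far
more evenly over [0,1] than independent uniform draws would be. The last
cumulated value also equals 1 by construction, so it could not be passed
through the inverse normal cdf as it stands.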
Allin