
If we define a frequency function f(⚀) to be the number of times ⚀ (the side with one dot) occurred in N trials, we would then have:

Pr(⚀) ≈ f(⚀) / N

This is our estimate of the probability of ⚀ based on N trials of the chance experiment. Note that some people use this equation, with N going to infinity, as the definition of the probability of an event:

Pr(⚀) = lim (N→∞) f(⚀) / N

For our die-throwing experiment, assume we want to estimate the probability of ⚀ (the side with one dot shows up). Say we throw the die 6 times, and out of those 6 times ⚀ occurred twice. We then calculate the probability of ⚀ as 2/6.
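This frequency estimate can be sketched in a short simulation. The seed and the number of trials below are arbitrary choices of mine, used only to make the sketch reproducible:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

N = 100_000  # number of trials (illustrative choice)
# f counts how many times the side with one dot comes up
f = sum(1 for _ in range(N) if random.randint(1, 6) == 1)

estimate = f / N  # the frequency estimate f/N; approaches 1/6 as N grows
print(estimate)
```

With a large N the estimate settles close to the true probability 1/6, illustrating why the limit form is sometimes taken as the definition.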

We now have the problem of deciding which value to accept as correct, and how much faith we can place in our choice. Let us define two possible hypotheses, which we will denote H0 and H1.

H0 : The probability of ⚀ is 1/6

H1 : The probability of ⚀ is 2/6

In choosing which hypothesis to believe, there are two possible errors we could make, known as a type I error and a type II error:

Type I error : Choose H1 when H0 is actually true.

Type II error : Choose H0 when H1 is actually true.

We can determine the probability of making each type of error by examining the theoretical probability distributions of f(⚀) under each hypothesis being tested.

We rolled the die six times, so the possible number of times ⚀ can occur is between 0 and 6. Let f(⚀) be the number of times ⚀ occurred in six rolls of the die. The following binomial formula calculates the probability of ⚀ occurring exactly k times in n trials:

Pr( f(⚀) = k ) = C(n,k) p^k q^(n-k)

Where:

Pr( f(⚀) = k ) = probability of exactly k occurrences of ⚀ in n trials

f(⚀) = number of occurrences of ⚀ in n trials

n = number of trials

p = Pr(⚀) (probability of ⚀)

q = 1 - p

C(n,k) = n! / ( k! (n-k)! ) (n things taken k at a time)
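A direct transcription of the binomial formula, using Python's `math.comb` for C(n,k), reproduces the tables that follow:

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr( f = k ): probability of exactly k occurrences in n trials."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

# Under H0 (p = 1/6) with n = 6 trials, print the whole distribution:
for k in range(7):
    print(k, round(binom_pmf(k, 6, 1/6), 6))
```

Running the loop with p = 2/6 instead reproduces the second table.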

With n = 6 trials, and assuming hypothesis H0 to be true (the probability of throwing ⚀ is 1/6), we obtain the following table:

f(⚀) | Pr( f(⚀) \| H0 ) | sideways bar graph
---|---|---
0 | 0.334897 | ================
1 | 0.401877 | ====================
2 | 0.200938 | ==========
3 | 0.053583 | ==
4 | 0.008037 |
5 | 0.000643 |
6 | 0.000021 |

As you can see from the numbers, the highest probability is for throwing one ⚀, although there is still a substantial probability of throwing zero ⚀'s or two ⚀'s. If we repeat the binomial calculations, this time assuming hypothesis H1 is true (the probability of ⚀ is 2/6), we arrive at the following slightly different table:

f(⚀) | Pr( f(⚀) \| H1 ) | sideways bar graph
---|---|---
0 | 0.087791 | ====
1 | 0.263374 | =============
2 | 0.329218 | ================
3 | 0.219478 | ==========
4 | 0.082304 | ====
5 | 0.016460 | =
6 | 0.001371 |

We can now calculate the probability of making a type I error and of making a type II error.

Define:

M : f(⚀) = 2 (the event that ⚀ occurs twice in six rolls)

H0 : The probability of ⚀ is 1/6

H1 : The probability of ⚀ is 2/6

Notation:

Pr(A|B) : The probability of A given B

From Bayes's Theorem we then have:

Pr(H0|M) = Pr(M|H0) Pr(H0) / Pr(M)

Applying the principle of total probability to Pr(M):

Pr(H0|M) = Pr(M|H0) Pr(H0) / [ Pr(M|H0) Pr(H0) + Pr(M|H1) Pr(H1) ]

and if we take both hypotheses H0 and H1 to be equally likely, so that Pr(H0) = Pr(H1) = 1/2,

Pr(H0|M) = Pr(M|H0) / [ Pr(M|H0) + Pr(M|H1) ]

Similarly, we can repeat the above steps to obtain

Pr(H1|M) = Pr(M|H1) / [ Pr(M|H0) + Pr(M|H1) ]

From these two formulas and the tables above we can now calculate the probability of making a type I error and a type II error.

First note that

Pr(M|H0) = probability of rolling ⚀ two out of 6 times when the probability of rolling ⚀ is 1/6
         = 0.200938 (from the table above)

Pr(M|H1) = probability of rolling ⚀ two out of 6 times when the probability of rolling ⚀ is 2/6
         = 0.329218 (from the table above)

Then from the above formulas and data we have:

Pr( type I error )  = Pr(H0|M) = Pr(M|H0) / ( Pr(M|H0) + Pr(M|H1) )
                    = 0.200938 / ( 0.200938 + 0.329218 )
                    = 0.379

Pr( type II error ) = Pr(H1|M) = Pr(M|H1) / ( Pr(M|H0) + Pr(M|H1) )
                    = 0.329218 / ( 0.200938 + 0.329218 )
                    = 0.621
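The arithmetic above can be checked directly; the two conditional probabilities are the table entries quoted in the text:

```python
pr_m_h0 = 0.200938  # Pr(M|H0): two ones in six rolls when p = 1/6
pr_m_h1 = 0.329218  # Pr(M|H1): two ones in six rolls when p = 2/6

# With Pr(H0) = Pr(H1) = 1/2, the priors cancel in Bayes's Theorem:
pr_type1 = pr_m_h0 / (pr_m_h0 + pr_m_h1)  # Pr(H0|M)
pr_type2 = pr_m_h1 / (pr_m_h0 + pr_m_h1)  # Pr(H1|M)
print(round(pr_type1, 3), round(pr_type2, 3))
```

The two error probabilities necessarily sum to 1, since one of the two hypotheses must be true under this setup.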

In this case we have a 38% chance of being wrong if we choose to believe Pr(⚀) = 2/6, and a 62% chance of being wrong if we choose to believe Pr(⚀) = 1/6. Either way we're likely to be wrong. The data does not allow us to differentiate between the two possibilities with much confidence.

Let us repeat the above calculations, this time assuming we rolled ⚀ five out of six times. This would lead us to the estimate that Pr(⚀) = 5/6. Let us create a new hypothesis:

H1 : The probability of ⚀ is 5/6

and repeat the binomial calculations assuming hypothesis H1 is true (the probability of ⚀ is 5/6). We arrive at the following table:

f(⚀) | Pr( f(⚀) \| H1 ) | sideways bar graph
---|---|---
0 | 0.000021 |
1 | 0.000643 |
2 | 0.008037 |
3 | 0.053583 | ==
4 | 0.200938 | ==========
5 | 0.401877 | ====================
6 | 0.334897 | ================

Following the same procedure as above:

Define:

M : f(⚀) = 5 (the event that ⚀ occurs five times in six rolls)

H0 : The probability of ⚀ is 1/6

H1 : The probability of ⚀ is 5/6

Pr( type I error )  = Pr(H0|M) = Pr(M|H0) / ( Pr(M|H0) + Pr(M|H1) )
                    = 0.000643 / ( 0.000643 + 0.401877 )
                    = 0.0016

Pr( type II error ) = Pr(H1|M) = Pr(M|H1) / ( Pr(M|H0) + Pr(M|H1) )
                    = 0.401877 / ( 0.000643 + 0.401877 )
                    = 0.9984

In this case, given a choice between believing the theoretician who claims Pr(⚀) = 1/6 and believing our own tests, which claim Pr(⚀) = 5/6, if we choose to believe the theoretician we have a 99.8% chance of being wrong, and if we choose to believe our own experimental results we have only about a 0.2% chance of being wrong. Clearly the odds are in favor of our being right. And we only threw the die six times!

This, then, is why a sequence consisting of all ones { 1, 1, ..., 1 } is unlikely to come from a compound experiment in which each outcome is equally likely. If we hypothesize that each outcome is equally likely (hypothesis H0), then all elements in our sample space Sn do have an equal probability. But if we conjure up an alternative hypothesis about the probability of outcomes (hypothesis H1), then the elements in our sample space Sn no longer have equal probability: the probability will be higher for some elements and lower for others. Our hypothesis H1 is specifically formulated to give whatever sequence we actually got the highest probability possible. If the probability of our sequence occurring under our original hypothesis H0 is very low, then it is almost certain to lose out to an alternative hypothesis H1. Thus if all we want to do is test whether or not to reject our original hypothesis H0, all we need to do is see what the probability of our sequence is under H0. If that probability is very low, we can reject H0 with the implied knowledge that a better hypothesis H1 probably exists. We don't actually have to formulate a better hypothesis H1; we just have to know that we could make one if we cared to.

Thus from here on out we won't go the whole distance of formulating an alternative hypothesis H1 and then calculating the probabilities of type I errors and type II errors. We'll just say if an actual outcome of a compound experiment has a very low probability of occurring under our H0 hypothesis (or null hypothesis), then that hypothesis is probably not true.

1. Formulate a null hypothesis H0 about the single chance experiment which was repeated N times to create a sequence of N values. To test a sequence of supposedly random numbers, our null hypothesis H0 is that each outcome of the chance experiment is equally likely, and that each trial of the chance experiment is independent of all previous trials.

2. Formulate a real-valued function g which somehow tests the null hypothesis H0. To test a sequence of supposedly random numbers, our function g might be one which counts the number of occurrences of a particular outcome.

3. Under our null hypothesis H0, mathematically define a sequence of N random variables:

   ( X1, X2, ..., XN )

   and apply our function g to the sequence of N random variables, creating a new random variable Z:

   Z = g( X1, X2, ..., XN )

   Then determine the probability density function of Z, either by mathematical calculation or by finding a table of the particular probability density function we're interested in.

4. Take the specific sequence of values supposedly obtained by performing a chance experiment N times:

   ( x1, x2, ..., xN )

   and apply the function g to obtain a specific value z:

   z = g( x1, x2, ..., xN )

5. Determine from the probability density function of Z how likely or unlikely we are to obtain our value of z, assuming our null hypothesis H0 is true. If the probability is very low, we may reject our hypothesis H0 as most likely incorrect.

Define:

N = number of trials

k = number of possible outcomes of the chance experiment

f(i) = number of occurrences of outcome i in N trials

E(i) = the expected number of occurrences of outcome i in N trials = N·Pr(i)

We now go through the steps as defined above:

1. Formulate the null hypothesis H0. Our null hypothesis H0 for the die-throwing experiment is that each of the six possible outcomes is equally likely, with a probability of 1/6.

2. Formulate a real-valued function g which somehow tests the null hypothesis. Our g function is the chi-square (χ²) test:

   V = Σ ( f(i) - E(i) )² / E(i), summed over the k possible outcomes i

   (This is not exactly the χ² function, but for N large enough it approaches the χ² distribution.) Note that if the result is zero, then each possible outcome occurred exactly the expected number of times. On the other hand, if the result is very large, it indicates that one or more of the possible outcomes of the chance experiment occurred far less or far more often than would be expected if our null hypothesis H0 were true.

3. Determine the probability density function of Z = g(X1, X2, ..., XN). As mentioned in step 2, our test function g is not exactly the χ² function, but for N large enough it approaches the χ² distribution, for which tables are available in any book on probability theory or any book of mathematical tables, such as the CRC Standard Mathematical Tables. The stated rule of thumb is that N should be large enough so that every possible outcome is expected to occur at least 5 times. Thus for our die-throwing experiment we should repeat it at least 30 times, since 30·(1/6) = 5.

Each row of the χ² distribution table is for a particular number of "degrees of freedom", which we will abbreviate df. The value to choose for df is one less than the number of possible outcomes of our chance experiment. Thus for our die-throwing experiment, where there are 6 possible outcomes, we have df = 6 - 1 = 5.

Below is the χ² distribution for 5 degrees of freedom:

p=1% | p=5% | p=25% | p=50% | p=75% | p=95% | p=99%
---|---|---|---|---|---|---
0.5543 | 1.1455 | 2.675 | 4.351 | 6.626 | 11.07 | 15.09

(Note: See NUMERICAL RECIPES: The Art of Scientific Computing, listed in the references, for computer code that will calculate χ² values so you don't have to look them up in a table.)

This is the cumulative χ² distribution. If we recall that the χ² distribution itself is typically bell-shaped, then the extremes of the bell-shaped curve, where the probability is very low, correspond to the extremes of the cumulative distribution. Thus if we are below the p=5% mark or above the p=95% mark, the corresponding probability is quite low. On the other hand, if we are between the p=25% mark and the p=75% mark, the corresponding probability is high.

(Note that some tables list the probability for (1 - p) instead of p, so that 0.5543 is listed as p=99% instead of p=1%, and 15.09 is listed as p=1% instead of p=99%. It should be easy to tell by inspection which probability the table is listing.)

4. We can now apply our χ² test to a specific outcome of rolling the die 30 times. Assume we threw the die 30 times and obtained the following results:

   f(⚀) = 7
   f(⚁) = 5
   f(⚂) = 4
   f(⚃) = 6
   f(⚄) = 6
   f(⚅) = 2

   With E(i) = 30·(1/6) = 5 for each face, our χ² test becomes:

   V = ( (7-5)² + (5-5)² + (4-5)² + (6-5)² + (6-5)² + (2-5)² ) / 5 = 16/5 = 3.2
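The χ² value for this step can be verified in a few lines, using the counts from the 30 rolls above:

```python
counts = [7, 5, 4, 6, 6, 2]           # f(i) for the six faces, 30 rolls total
expected = sum(counts) / len(counts)  # E(i) = 30 * (1/6) = 5 under H0

# V = sum of (f(i) - E(i))^2 / E(i) over all six faces
v = sum((f - expected) ** 2 / expected for f in counts)
print(v)
```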

5. We can now check our result of 3.2 against the table above in step 3 and note that it falls between the p=25% mark and the p=50% mark. This is a high-probability region, and so we are not led to believe our null hypothesis H0 is false.

Further notes on χ²:

Note that not only are overly large values of χ² considered highly improbable, but so are overly small values of χ². Whereas overly large values of χ² lead us to believe the probabilities of each possible outcome of our chance experiment are not what we thought them to be, overly small values of χ² lead us to believe the N trials of our chance experiment were not independent.

As an example, if we threw a single die 6 times, there would be 6^6 = 46656 possible outcomes. If we calculate the χ² value for each of those possible outcomes and tabulate them, we get the following table:

χ² value | number of occurrences
---|---
0 | 720
2 | 10800
4 | 16200
6 | 9000
8 | 7200
12 | 2100
14 | 450
20 | 180
30 | 6

Although a χ² value of 0 happens 720 times, a χ² value of 2 happens 10800 times, making it much more likely. We can compare the values here with the χ² distribution table above for 5 degrees of freedom. Our χ² values of 2 through 12, which appear to be the most likely ones, lie above the p=5% mark and below the p=95% mark on the table, where we would expect them to lie. Note that this comparison was for N = 6, although our rule of thumb says N should be at least 30 for this test.
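The table can be reproduced by brute force, enumerating all 6^6 = 46656 equally likely outcomes of six rolls and computing the χ² value of each:

```python
from itertools import product
from collections import Counter

dist = Counter()
for roll in product(range(6), repeat=6):
    counts = [roll.count(face) for face in range(6)]
    # With 6 rolls, E(i) = 6 * (1/6) = 1 for every face,
    # so V = sum of (f(i) - 1)^2
    v = sum((f - 1) ** 2 for f in counts)
    dist[v] += 1

for v in sorted(dist):
    print(v, dist[v])
```

A χ² value of 0 requires all six faces to appear exactly once, which happens in 6! = 720 ways, matching the first row of the table.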

Another note about the χ² distribution: most tables of the χ² distribution go up to 30 degrees of freedom. Above that, the χ² distribution approaches the normal distribution. If your chance experiment has more than 30 degrees of freedom (number of possible outcomes minus one), you can use the following formula to convert a χ² value V with df > 30 into a value from the normal distribution with mean = 0 and variance = 1:

Given: V (the χ² value) and df (the degrees of freedom),

Z = [ (V/df)^(1/3) - (1 - 2/(9·df)) ] / √( 2/(9·df) )

Here Z is a normally distributed variable with mean = 0 and variance = 1. As an example, let's assume we are simulating the random choosing of a single card from a deck of 52. There are 52 possible outcomes. We will need to repeat this experiment 52·5 = 260 times so that each possible outcome is expected to occur about 5 times. Our number of degrees of freedom is 52 - 1 = 51. We calculate χ² and get V = 30.4. We now calculate Z from the above equation, given V = 30.4 and df = 51:

Z = [ (30.4/51)^(1/3) - (1 - 2/459) ] / √(2/459)
  = -2.33

We now look up Z in a table of the cumulative normal distribution and find that this value of Z corresponds to the first 1% of the distribution. Thus our χ² value is much lower than expected. We could conclude that something is probably amiss.
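The df > 30 conversion can be sketched as follows. The formula coded here is the Wilson-Hilferty cube-root approximation, which reproduces the -2.33 above; that it is the exact formula the original text used is an assumption on my part:

```python
from math import sqrt

def chisq_to_normal(v, df):
    """Map a chi-square value v with df > 30 degrees of freedom to an
    approximately standard-normal Z (Wilson-Hilferty approximation)."""
    c = 2 / (9 * df)
    return ((v / df) ** (1 / 3) - (1 - c)) / sqrt(c)

# The card-deck example: V = 30.4 with df = 51
print(round(chisq_to_normal(30.4, 51), 2))  # -2.33
```

As a sanity check, a χ² value equal to df (the expected value for large df) maps to a Z near zero.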

One last note about χ². If you are simulating the flipping of a coin, then the number of degrees of freedom is 1 (two possible outcomes). Our rule of thumb of performing enough trials so that each possible outcome is expected to occur at least 5 times is not adequate in this case. Because the number of degrees of freedom is so low, we need to compensate by increasing the number of trials: enough trials so that each outcome is expected to occur at least 75 times.

This brings us to our next subject, which is testing a set of data points to see how closely it matches a given continuous distribution. The Kolmogorov-Smirnov (KS) test compares a set Snc of nc data points with a given cumulative distribution function P(x).

Given a set of nc data points Snc, we can create a cumulative distribution function Snc(x) from the data points. The function will be a step function, rising from 0 to 1, with each step size equal to 1/nc. Below is a generalized example which compares a cumulative distribution step function Snc(x) for 9 data values against a continuous cumulative distribution P(x) for a function which is uniform between 0 and 1:

[Figure: Kolmogorov-Smirnov statistic D. A measured distribution of values in x (shown as N dots on the lower abscissa) is to be compared with a theoretical distribution whose cumulative probability distribution is plotted as P(x). A step-function cumulative probability distribution Snc(x) is constructed, one which rises an equal amount at each measured point. D is the greatest distance between the two cumulative distributions. (Reprinted from NUMERICAL RECIPES: The Art of Scientific Computing)]

The KS test is now very simple. Define D as the maximum vertical difference between these two functions:

D = max |Snc(x) - P(x)|

And define DQ as:

DQ = √nc · D

where nc is the number of data points.

We can then go directly to a table for the KS test (similar to the χ² table) and read off where in the cumulative distribution our DQ value lies, remembering that values below the p=5% mark occur only 5% of the time, and values above the p=95% mark occur only 5% of the time. Tables for the KS test are not as easy to find as tables for the χ² test. Fortunately, as in the χ² case, if our number of data values nc in set Snc is greater than 30, the distribution approaches something which can be calculated easily, as follows:

Given: D = max |Snc(x) - P(x)| and nc > 30 (number of data points), the probability of observing a value of D this large or larger is approximately

Pr ≈ Q_KS( √nc · D ), where Q_KS(λ) = 2 Σ_{j=1}^{∞} (-1)^(j-1) e^(-2 j² λ²)
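The whole KS procedure can be sketched in Python. The asymptotic significance function Q_KS below is the series given in Numerical Recipes (which the text cites); evaluating it at √nc · D for nc > 30 is my assumption about how the original formula was applied:

```python
from math import sqrt, exp

def ks_statistic(data, cdf):
    """D = maximum vertical distance between the empirical step
    function Snc(x) and the theoretical cumulative distribution P(x)."""
    data = sorted(data)
    n = len(data)
    d = 0.0
    for i, x in enumerate(data):
        # the step function jumps from i/n to (i+1)/n at x
        d = max(d, abs((i + 1) / n - cdf(x)), abs(cdf(x) - i / n))
    return d

def q_ks(lam, terms=100):
    """Asymptotic KS significance: 2 * sum (-1)^(j-1) exp(-2 j^2 lam^2)."""
    return 2 * sum((-1) ** (j - 1) * exp(-2 * j * j * lam * lam)
                   for j in range(1, terms + 1))

# Illustrative data: 100 evenly spread points against the uniform cdf on [0,1]
data = [(i + 0.5) / 100 for i in range(100)]
d = ks_statistic(data, lambda x: x)
print(d, q_ks(sqrt(len(data)) * d))
```

A large Q_KS value (near 1) means D is unremarkably small; a very small value means the data are unlikely to have come from the hypothesized distribution.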
