Chapter 2 | Chapter 4 | |||||

## index | ||||||

## Back to Deley's Homepage |

In the following subsections we test the MTH$RANDOM and RANDU random number generators using tests and compare the results. We will soon find out why the infamous RANDU generator is considered a very poor random number generator. The details of what's inside these generators is deferred until section 5.0 . For now accept that we can set up a random number generator to generate random numbers between 1 and NBINS, and that we can use a SEED value of 1 to initialize these generators.

Arbitrarily choosing NBINS = 30 (number of bins), SEED = 1 (initial SEED value), and BALLS = 300, the test was performed 10 times on the MTH$RANDOM generator with the following results:

30 BINS, 300 BALLS, 10 TESTS, MTH$RANDOM | ||||
---|---|---|---|---|

BALLS= 300 | = 35.1999969 | PROB= 0.8019531 | ||

BALLS= 300 | = 22.8000031 | PROB= 0.2143845 | ||

BALLS= 300 | = 36.7999992 | PROB= 0.8485908 | ||

BALLS= 300 | = 19.8000011 | PROB= 0.1009650 | ||

BALLS= 300 | = 48.7999992 | PROB= 0.9878765 | ||

BALLS= 300 | = 29.3999996 | PROB= 0.5556139 | ||

BALLS= 300 | = 22.8000011 | PROB= 0.2143840 | ||

BALLS= 300 | = 36.5999985 | PROB= 0.8432655 | ||

BALLS= 300 | = 29.3999996 | PROB= 0.5556139 | ||

BALLS= 300 | = 18.6000004 | PROB= 0.0688844 |

Here is the chi-square value, and PROB is the percentage of random sequences (with 30 - 1 = 29 degrees of freedom) which would have a lower value. An extremely high value of PROB, say above 0.990, would indicate that some bins had a much higher than expected number of balls in them while other bins have a much lower than expected number of balls in them. In general, an extremely high value of PROB indicates that the experimental results do not reinforce the hypothesis that each bin had an equally likely chance of getting a ball. So far all of the PROB values appear reasonable.

We can further analyze the results of the 10 tests above by noting that the distribution of the 10 values should approximate the distribution (for 30 - 1 = 29 degrees of freedom). We can apply the KS test to the 10 values to see how well their distribution approximates the distribution for d.f. = 29.

KS D= 0.2019531 PROB= 0.8093885

Here D is the KS D value which is the maximum distance between the approximate and the real function, and PROB is the percentage of times we would have a larger D value, meaning the approximation could be worse. A very low value of PROB here means it couldn't get much worse. Anything above a very low value means our distribution bears at least some resemblance to the function we're comparing it to. Again this result appears reasonable.

The following values were obtained when the above test was applied 10 times to the RANDU generator:

30 BINS, 300 BALLS, 10 TESTS, RANDU | ||||
---|---|---|---|---|

BALLS= 300 | = 31.3999996 | PROB= 0.6531839 | ||

BALLS= 300 | = 60.7999992 | PROB= 0.9995093 | ||

BALLS= 300 | = 33.4000015 | PROB= 0.7381005 | ||

BALLS= 300 | = 24.3999977 | PROB= 0.2910264 | ||

BALLS= 300 | = 20.8000011 | PROB= 0.1336691 | ||

BALLS= 300 | = 16.6000023 | PROB= 0.0319657 | ||

BALLS= 300 | = 32.0000000 | PROB= 0.6801265 | ||

BALLS= 300 | = 30.2000027 | PROB= 0.5959312 | ||

BALLS= 300 | = 31.2000027 | PROB= 0.6439425 | ||

BALLS= 300 | = 45.5999985 | PROB= 0.9742958 |

And a KS test on the above 10 values:

KS D= 0.2959312 PROB= 0.3452127

Again there is nothing as yet unusual about either the MTH$RANDOM or RANDU generators.

We can also test the MTH$RANDOM and RANDU generators using the KS test directly to see how well the output fits the hypothesized uniform distribution.

KS TEST ON MTH$RANDOM (Start with SEED = 1) | |||||||
---|---|---|---|---|---|---|---|

KS | NPOINTS= | 10 | D= 0.2142200 | PROB= 0.7484154 | |||

KS | NPOINTS= | 100 | D= 0.0944867 | PROB= 0.3338285 | |||

KS | NPOINTS= | 1000 | D= 0.0314864 | PROB= 0.2746516 | |||

KS | NPOINTS= | 10000 | D= 0.0079555 | PROB= 0.5514056 | |||

KS | NPOINTS= | 100000 | D= 0.0017763 | PROB= 0.9105781 | |||

KS | NPOINTS= | 1000000 | D= 0.0009270 | PROB= 0.3565834 |

KS TEST ON MTH$RANDOM (Start with SEED = 1) | |||||||
---|---|---|---|---|---|---|---|

KS | NPOINTS= | 10 | D= 0.6555050 | PROB= 0.0003705 | |||

KS | NPOINTS= | 100 | D= 0.1326773 | PROB= 0.0591586 | |||

KS | NPOINTS= | 1000 | D= 0.0337385 | PROB= 0.2050490 | |||

KS | NPOINTS= | 10000 | D= 0.0063568 | PROB= 0.8138239 | |||

KS | NPOINTS= | 100000 | D= 0.0042999 | PROB= 0.0495556 | |||

KS | NPOINTS= | 1000000 | D= 0.0007990 | PROB= 0.5457213 |

Note the only thing questionable here is the extremely low value of PROB on RANDU for the first 10 points. This indicates that perhaps a seed value of 1 for the RANDU generator is not a good place to start it. Indeed, the first 10 numbers from RANDU starting with SEED = 1 are:

RANDU first 10 numbers starting with seed = 1 | ||
---|---|---|

1 | 0.0001831 | |

2 | 0.0032959 | |

3 | 0.0444950 | |

4 | 0.5339386 | |

5 | 0.0068024 | |

6 | 0.8734164 | |

7 | 0.1705012 | |

8 | 0.3222913 | |

9 | 0.9906484 | |

10 | 0.7260775 |

Notice that half of the first 10 values are either below 0.006 or above 0.99. This alone does not indicate a problem with the RANDU generator, it does indicate that a SEED value of 1 is a poor choice to start it with. Note that the MTH$RANDOM generator does not have this problem when started with SEED = 1.

If we throw a die and get a 5, what is the probability that the next throw of the die will be a 3? If the probability is still 1/6, we say the next outcome is independent of the previous outcome. i.e. knowledge of the previous outcome does not give us any help in predicting the next outcome.

The mathematical definition of statistical independence between two random variables can be written as:

Pr( X(i+1) | X(i) ) = Pr( X(i+1) )

The serial test in two dimensions checks to see if there is any correlation between two consecutive outcomes of the random number generator. This is a test performed on pairs of random numbers. For the die throwing experiment, we would construct a total of 36 bins numbered BIN(1,1) - BIN(6,6). One bin for each possible pair of numbers we may get. We then throw the die twice to get an ordered pair of numbers, and drop a ball in the corresponding bin. We throw the die enough times so that each bin is expected to get an average of at least 5 balls. Then we perform a test on the results to see if the balls appear to be more or less evenly distributed among all the bins.

This serial test for pairs of outcomes was done on both the MTH$RANDOM and RANDU generators with the following results:

SERIAL 2D BINS=30x30 BALLS = 9000 TESTS = 10 MTH$RANDOM | ||||
---|---|---|---|---|

BALLS= 900 | = 895.799865 | PROB= 0.4761399 | ||

BALLS= 900 | = 945.200134 | PROB= 0.8615244 | ||

BALLS= 900 | = 883.600036 | PROB= 0.3633031 | ||

BALLS= 900 | = 905.000000 | PROB= 0.5624363 | ||

BALLS= 900 | = 902.398986 | PROB= 0.5382197 | ||

BALLS= 900 | = 911.800170 | PROB= 0.6241364 | ||

BALLS= 900 | = 932.400573 | PROB= 0.7863315 | ||

BALLS= 900 | = 865.400085 | PROB= 0.2157318 | ||

BALLS= 900 | = 909.599670 | PROB= 0.6043593 | ||

BALLS= 900 | = 901.799438 | PROB= 0.5325246 |

KS test to see how well the 10 values fit the chi-square distribution:

SERIAL 2D BINS=30x30 BALLS = 9000 TESTS = 10 RANDU | ||||
---|---|---|---|---|

BALLS= 900 | = 848.799926 | PROB= 0.1168594 | ||

BALLS= 900 | = 891.799987 | PROB= 0.4386667 | ||

BALLS= 900 | = 874.199829 | PROB= 0.2827876 | ||

BALLS= 900 | = 884.199768 | PROB= 0.3687753 | ||

BALLS= 900 | = 799.599975 | PROB= 0.0077433 | ||

BALLS= 900 | = 841.800598 | PROB= 0.0864621 | ||

BALLS= 900 | = 867.800720 | PROB= 0.2330545 | ||

BALLS= 900 | = 892.601013 | PROB= 0.4461077 | ||

BALLS= 900 | = 952.400085 | PROB= 0.8945085 | ||

BALLS= 900 | = 921.599914 | PROB= 0.7069101 |

KS test to see how well the 10 values fit the chi-square distribution:

KS D= 0.3631590 PROB= 0.1430003

So far there is nothing blatantly wrong with either the MTH$RANDOM or RANDU generators. All the values are within reason, and the distribution of the values appears to fit the chi-square distribution.

This 3-dimensional test extends the test for statistical independence. If we assume the first two variables are independent, then this test checks to see if the third variable is also independent. The mathematical definition of statistical independence between two random variables can be written as:

Pr( X(i+2) | X(i+1),X(i) ) = Pr( X(i+2) )

This serial test for triples of outcomes was done on both the MTH$RANDOM and RANDU generators. The generators were set up to simulate the throwing of a 30 sided die. A total of 270000 balls were tossed so that an average of 10 balls were expected to fall in each bin. A test was then performed on the results. This was repeated a total of 10 times, and a KS test was done on the resulting values to see how well they fit the chi-square distribution. Below are the results:

SERIAL 3D BINS=30x30 BALLS = 270000 TESTS = 10 MTH$RANDOM | ||||
---|---|---|---|---|

BALLS= 270 | = 27233.4375 | PROB= 0.8438070 | ||

BALLS= 270 | = 26732.8027 | PROB= 0.1262939 | ||

BALLS= 270 | = 26866.4550 | PROB= 0.2845250 | ||

BALLS= 270 | = 26765.3710 | PROB= 0.1561499 | ||

BALLS= 270 | = 26650.6250 | PROB= 0.0659529 | ||

BALLS= 270 | = 26665.5117 | PROB= 0.0751096 | ||

BALLS= 270 | = 27165.1523 | PROB= 0.7621238 | ||

BALLS= 270 | = 26861.5625 | PROB= 0.2786521 | ||

BALLS= 270 | = 27002.1171 | PROB= 0.5027421 | ||

BALLS= 270 | = 27090.8613 | PROB= 0.6547577 |

KS test to see how well the 10 values fit the chi-square distribution:

KS D= 0.3162388 PROB= 0.2699622

SERIAL 3D BINS=30x30 BALLS = 270000 TESTS = 10 RANDU | ||||
---|---|---|---|---|

BALLS= 270 | = 454159.375 | PROB= 1.0000000 | ||

BALLS= 270 | = 452698.125 | PROB= 1.0000000 | ||

BALLS= 270 | = 455055.343 | PROB= 1.0000000 | ||

BALLS= 270 | = 455938.281 | PROB= 1.0000000 | ||

BALLS= 270 | = 454747.718 | PROB= 1.0000000 | ||

BALLS= 270 | = 453017.812 | PROB= 1.0000000 | ||

BALLS= 270 | = 453818.781 | PROB= 1.0000000 | ||

BALLS= 270 | = 454553.593 | PROB= 1.0000000 | ||

BALLS= 270 | = 453205.687 | PROB= 1.0000000 | ||

BALLS= 270 | = 454122.062 | PROB= 1.0000000 |

KS test to see how well the 10 values fit the chi-square distribution:

KS D= 1.0000000 PROB= 0.0000000

It is immediately obvious that the RANDU generator has failed miserably on the serial test for triples! The value is way too high on every one of the 10 trials.

Before we use these tests to analyze some random number generators we should first take a look at what's inside these generators and how to properly use them.

Chapter 2 | Chapter 4 | |||||

## index | ||||||

## Back to Deley's Homepage |