|-------------------------------------------------------------|
|             This is the BIRTHDAY SPACINGS TEST              |
|Choose m birthdays in a "year" of n days. List the spacings |
|between the birthdays. Let j be the number of values that |
|occur more than once in that list, then j is asymptotically |
|Poisson distributed with mean m^3/(4n). Experience shows n |
|must be quite large, say n>=2^18, for comparing the results |
|to the Poisson distribution with that mean. This test uses |
|n=2^24 and m=2^10, so that the underlying distribution for j |
|is taken to be Poisson with lambda=2^30/(2^26)=16. A sample |
|of 500 j's is taken, and a chi-square goodness of fit test |
|provides a p value. The first test uses bits 1-24 (counting |
|from the left) from integers in the specified file. Then the|
|file is closed and reopened, then bits 2-25 of the same inte-|
|gers are used to provide birthdays, and so on to bits 9-32. |
|Each set of bits provides a p-value, and the nine p-values |
|provide a sample for a KSTEST.                               |
|-------------------------------------------------------------|

        RESULTS OF BIRTHDAY SPACINGS TEST FOR tmp1
 (no_bdays=1024, no_days/yr=2^24, lambda=16.00, sample size=500)

     Bits used     mean     chisqr     p-value
      1 to 24     15.78    24.9023    0.096917
      2 to 25     15.55    22.7658    0.157031
      3 to 26     15.53    28.3873    0.040612
      4 to 27     15.56    18.1447    0.379768
      5 to 28     15.70    19.9799    0.275259
      6 to 29     15.85    10.6095    0.876130
      7 to 30     15.87    14.5555    0.627465
      8 to 31     15.62    17.0835    0.448722
      9 to 32     15.78    17.6947    0.408343

     degrees of freedom: 17
 ---------------------------------------------------------------
   p-value for KStest on those 9 p-values: 0.318211

|-------------------------------------------------------------|
|This is the BINARY RANK TEST for 31x31 matrices. The leftmost|
|31 bits of 31 random integers from the test sequence are used|
|to form a 31x31 binary matrix over the field {0,1}. The rank |
|is determined. That rank can be from 0 to 31, but ranks< 28 |
|are rare, and their counts are pooled with those for rank 28.|
|Ranks are found for 40,000 such random matrices and a chisqu-|
|are test is performed on counts for ranks 31,30,29 and <=28. |
|-------------------------------------------------------------|

    Rank test for binary matrices (31x31) from tmp1

 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=28         205      211.4       0.195   0.195
 r=29         5110     5134.0       0.112   0.307
 r=30        23075    23103.0       0.034   0.341
 r=31        11610    11551.5       0.296   0.637

 chi-square = 0.637 with df = 3; p-value = 0.888
 --------------------------------------------------------------

|-------------------------------------------------------------|
|This is the BINARY RANK TEST for 32x32 matrices. A random 32x|
|32 binary matrix is formed, each row a 32-bit random integer.|
|The rank is determined. That rank can be from 0 to 32, but |
|ranks less than 29 are rare, and their counts are pooled with|
|those for rank 29. Ranks are found for 40,000 such random |
|matrices and a chisquare test is performed on counts for |
|ranks 32,31,30 and <=29.                                     |
|-------------------------------------------------------------|

    Rank test for binary matrices (32x32) from tmp1

 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=29         199      211.4       0.729   0.729
 r=30         5164     5134.0       0.175   0.905
 r=31        23101    23103.0       0.000   0.905
 r=32        11536    11551.5       0.021   0.926

 chi-square = 0.926 with df = 3; p-value = 0.819
 --------------------------------------------------------------

|-------------------------------------------------------------|
|This is the BINARY RANK TEST for 6x8 matrices. From each of |
|six random 32-bit integers from the generator under test, a |
|specified byte is chosen, and the resulting six bytes form a |
|6x8 binary matrix whose rank is determined. That rank can be|
|from 0 to 6, but ranks 0,1,2,3 are rare; their counts are |
|pooled with those for rank 4. Ranks are found for 100,000 |
|random matrices, and a chi-square test is performed on |
|counts for ranks 6,5 and (0,...,4) (pooled together).        |
|-------------------------------------------------------------|

    Rank test for binary matrices (6x8) from tmp1

                    bits 1 to 8
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          946      944.3       0.003   0.003
 r=5         21614    21743.9       0.776   0.779
 r=6         77440    77311.8       0.213   0.992
 chi-square = 0.992 with df = 2; p-value = 0.609
 --------------------------------------------------------------
                    bits 2 to 9
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          962      944.3       0.332   0.332
 r=5         21620    21743.9       0.706   1.038
 r=6         77418    77311.8       0.146   1.184
 chi-square = 1.184 with df = 2; p-value = 0.553
 --------------------------------------------------------------
                    bits 3 to 10
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          931      944.3       0.187   0.187
 r=5         21547    21743.9       1.783   1.970
 r=6         77522    77311.8       0.572   2.542
 chi-square = 2.542 with df = 2; p-value = 0.281
 --------------------------------------------------------------
                    bits 4 to 11
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          930      944.3       0.217   0.217
 r=5         21531    21743.9       2.085   2.301
 r=6         77539    77311.8       0.668   2.969
 chi-square = 2.969 with df = 2; p-value = 0.227
 --------------------------------------------------------------
                    bits 5 to 12
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          937      944.3       0.056   0.056
 r=5         21725    21743.9       0.016   0.073
 r=6         77338    77311.8       0.009   0.082
 chi-square = 0.082 with df = 2; p-value = 0.960
 --------------------------------------------------------------
                    bits 6 to 13
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          894      944.3       2.679   2.679
 r=5         21757    21743.9       0.008   2.687
 r=6         77349    77311.8       0.018   2.705
 chi-square = 2.705 with df = 2; p-value = 0.259
 --------------------------------------------------------------
                    bits 7 to 14
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          943      944.3       0.002   0.002
 r=5         21518    21743.9       2.347   2.349
 r=6         77539    77311.8       0.668   3.016
 chi-square = 3.016 with df = 2; p-value = 0.221
 --------------------------------------------------------------
                    bits 8 to 15
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          916      944.3       0.848   0.848
 r=5         21669    21743.9       0.258   1.106
 r=6         77415    77311.8       0.138   1.244
 chi-square = 1.244 with df = 2; p-value = 0.537
 --------------------------------------------------------------
                    bits 9 to 16
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          912      944.3       1.105   1.105
 r=5         21608    21743.9       0.849   1.954
 r=6         77480    77311.8       0.366   2.320
 chi-square = 2.320 with df = 2; p-value = 0.313
 --------------------------------------------------------------
                    bits 10 to 17
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          920      944.3       0.625   0.625
 r=5         21792    21743.9       0.106   0.732
 r=6         77288    77311.8       0.007   0.739
 chi-square = 0.739 with df = 2; p-value = 0.691
 --------------------------------------------------------------
                    bits 11 to 18
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          947      944.3       0.008   0.008
 r=5         21809    21743.9       0.195   0.203
 r=6         77244    77311.8       0.059   0.262
 chi-square = 0.262 with df = 2; p-value = 0.877
 --------------------------------------------------------------
                    bits 12 to 19
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          922      944.3       0.527   0.527
 r=5         21758    21743.9       0.009   0.536
 r=6         77320    77311.8       0.001   0.537
 chi-square = 0.537 with df = 2; p-value = 0.765
 --------------------------------------------------------------
                    bits 13 to 20
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          940      944.3       0.020   0.020
 r=5         21867    21743.9       0.697   0.716
 r=6         77193    77311.8       0.183   0.899
 chi-square = 0.899 with df = 2; p-value = 0.638
 --------------------------------------------------------------
                    bits 14 to 21
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          979      944.3       1.275   1.275
 r=5         21745    21743.9       0.000   1.275
 r=6         77276    77311.8       0.017   1.292
 chi-square = 1.292 with df = 2; p-value = 0.524
 --------------------------------------------------------------
                    bits 15 to 22
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          976      944.3       1.064   1.064
 r=5         21707    21743.9       0.063   1.127
 r=6         77317    77311.8       0.000   1.127
 chi-square = 1.127 with df = 2; p-value = 0.569
 --------------------------------------------------------------
                    bits 16 to 23
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          961      944.3       0.295   0.295
 r=5         21809    21743.9       0.195   0.490
 r=6         77230    77311.8       0.087   0.577
 chi-square = 0.577 with df = 2; p-value = 0.749
 --------------------------------------------------------------
                    bits 17 to 24
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          970      944.3       0.699   0.699
 r=5         21739    21743.9       0.001   0.701
 r=6         77291    77311.8       0.006   0.706
 chi-square = 0.706 with df = 2; p-value = 0.703
 --------------------------------------------------------------
                    bits 18 to 25
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          918      944.3       0.732   0.732
 r=5         21670    21743.9       0.251   0.984
 r=6         77412    77311.8       0.130   1.114
 chi-square = 1.114 with df = 2; p-value = 0.573
 --------------------------------------------------------------
                    bits 19 to 26
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          906      944.3       1.553   1.553
 r=5         21711    21743.9       0.050   1.603
 r=6         77383    77311.8       0.066   1.669
 chi-square = 1.669 with df = 2; p-value = 0.434
 --------------------------------------------------------------
                    bits 20 to 27
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          997      944.3       2.941   2.941
 r=5         21631    21743.9       0.586   3.527
 r=6         77372    77311.8       0.047   3.574
 chi-square = 3.574 with df = 2; p-value = 0.167
 --------------------------------------------------------------
                    bits 21 to 28
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4         1014      944.3       5.145   5.145
 r=5         21845    21743.9       0.470   5.615
 r=6         77141    77311.8       0.377   5.992
 chi-square = 5.992 with df = 2; p-value = 0.050
 --------------------------------------------------------------
                    bits 22 to 29
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4         1037      944.3       9.100   9.100
 r=5         21693    21743.9       0.119   9.219
 r=6         77270    77311.8       0.023   9.242
 chi-square = 9.242 with df = 2; p-value = 0.010
 --------------------------------------------------------------
                    bits 23 to 30
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          979      944.3       1.275   1.275
 r=5         21560    21743.9       1.555   2.830
 r=6         77461    77311.8       0.288   3.118
 chi-square = 3.118 with df = 2; p-value = 0.210
 --------------------------------------------------------------
                    bits 24 to 31
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          936      944.3       0.073   0.073
 r=5         21862    21743.9       0.641   0.714
 r=6         77202    77311.8       0.156   0.870
 chi-square = 0.870 with df = 2; p-value = 0.647
 --------------------------------------------------------------
                    bits 25 to 32
 RANK     OBSERVED   EXPECTED   (O-E)^2/E     SUM
 r<=4          945      944.3       0.001   0.001
 r=5         21773    21743.9       0.039   0.039
 r=6         77282    77311.8       0.011   0.051
 chi-square = 0.051 with df = 2; p-value = 0.975
 --------------------------------------------------------------

 TEST SUMMARY, 25 tests on 100,000 random 6x8 matrices
 These should be 25 uniform [0,1] random variates:

   0.609060   0.553316   0.280573   0.226639   0.959954
   0.258577   0.221311   0.536899   0.313463   0.691063
   0.877181   0.764665   0.637932   0.524204   0.569175
   0.749464   0.702525   0.573064   0.434142   0.167446
   0.049985   0.009843   0.210305   0.647154   0.974847

 The KS test for those 25 supposed UNI's yields
 KS p-value = 0.863886

|-------------------------------------------------------------|
|                     THE BITSTREAM TEST                      |
|The file under test is viewed as a stream of bits. Call them |
|b1,b2,... . Consider an alphabet with two "letters", 0 and 1|
|and think of the stream of bits as a succession of 20-letter |
|"words", overlapping. Thus the first word is b1b2...b20, the|
|second is b2b3...b21, and so on. The bitstream test counts |
|the number of missing 20-letter (20-bit) words in a string of|
|2^21 overlapping 20-letter words. There are 2^20 possible 20|
|letter words. For a truly random string of 2^21+19 bits, the|
|number of missing words j should be (very close to) normally |
|distributed with mean 141,909 and sigma 428. Thus |
| (j-141909)/428 should be a standard normal variate (z score)|
|that leads to a uniform [0,1) p value. The test is repeated |
|twenty times.                                                |
|-------------------------------------------------------------|

 THE OVERLAPPING 20-TUPLES BITSTREAM TEST for tmp1
 (20 bits/word, 2097152 words 20 bitstreams. No. missing words
 should average 141909.33 with sigma=428.00.)
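
The z-score and p-value columns of the table that follows can be recomputed from the mean and sigma quoted in the header. This is a minimal sketch, not DIEHARD's own code: the function name `bitstream_pvalue` is hypothetical, and the upper-tail convention p = 1 - Phi(z) is an assumption inferred from the printed values (a negative z-score is reported with a p-value above 0.5).

```python
import math

# Constants quoted in the test header above.
MEAN_MISSING = 141909.33   # expected number of missing 20-bit words
SIGMA = 428.0              # standard deviation of that count


def bitstream_pvalue(missing_words):
    """Return (z, p) for one bitstream.

    Assumption: p is the upper-tail probability 1 - Phi(z), which is
    what the printed table appears to use.
    """
    z = (missing_words - MEAN_MISSING) / SIGMA
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # equals 1 - Phi(z)
    return z, p


if __name__ == "__main__":
    # First row of the table: 141493 missing words.
    z, p = bitstream_pvalue(141493)
    print(f"z = {z:.2f}, p = {p:.6f}")
```

The same transformation, with sigma replaced by 290, 295 or 339, appears to underlie the OPSO, OQSO and DNA tables.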
 ----------------------------------------------------------------
 BITSTREAM test results for tmp1.

 Bitstream   No. missing words   z-score    p-value
     1             141493         -0.97    0.834657
     2             142183          0.64    0.261276
     3             141951          0.10    0.461220
     4             141891         -0.04    0.517080
     5             141855         -0.13    0.550506
     6             141457         -1.06    0.854709
     7             142045          0.32    0.375627
     8             142173          0.62    0.268930
     9             141782         -0.30    0.616958
    10             141336         -1.34    0.909805
    11             141506         -0.94    0.826996
    12             141671         -0.56    0.711184
    13             141719         -0.44    0.671730
    14             141508         -0.94    0.825797
    15             142732          1.92    0.027295
    16             142397          1.14    0.127265
    17             142000          0.21    0.416114
    18             141841         -0.16    0.563421
    19             141980          0.17    0.434426
    20             141521         -0.91    0.817879
 ----------------------------------------------------------------

|-------------------------------------------------------------|
|        OPSO means Overlapping-Pairs-Sparse-Occupancy        |
|The OPSO test considers 2-letter words from an alphabet of |
|1024 letters. Each letter is determined by a specified ten |
|bits from a 32-bit integer in the sequence to be tested. OPSO|
|generates 2^21 (overlapping) 2-letter words (from 2^21+1 |
|"keystrokes") and counts the number of missing words---that |
|is 2-letter words which do not appear in the entire sequence.|
|That count should be very close to normally distributed with |
|mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should|
|be a standard normal variable. The OPSO test takes 32 bits at|
|a time from the test file and uses a designated set of ten |
|consecutive bits. It then restarts the file for the next de- |
|signated 10 bits, and so on.                                 |
|-------------------------------------------------------------|

 OPSO test for file tmp1

 Bits used   No. missing words   z-score    p-value
 23 to 32          142232         1.1127   0.132928
 22 to 31          142273         1.2540   0.104915
 21 to 30          141715        -0.6701   0.748604
 20 to 29          141135        -2.6701   0.996209
 19 to 28          141709        -0.6908   0.755152
 18 to 27          141938         0.0989   0.460624
 17 to 26          142180         0.9333   0.175321
 16 to 25          142023         0.3920   0.347542
 15 to 24          141417        -1.6977   0.955217
 14 to 23          141816        -0.3218   0.626208
 13 to 22          141888        -0.0736   0.529316
 12 to 21          142049         0.4816   0.315038
 11 to 20          141746        -0.5632   0.713353
 10 to 19          141960         0.1747   0.430648
  9 to 18          141830        -0.2736   0.607785
  8 to 17          141618        -1.0046   0.842452
  7 to 16          141910         0.0023   0.499078
  6 to 15          141452        -1.5770   0.942602
  5 to 14          141852        -0.1977   0.578356
  4 to 13          142006         0.3333   0.369437
  3 to 12          141542        -1.2667   0.897361
  2 to 11          142023         0.3920   0.347542
  1 to 10          142606         2.4023   0.008146
 -----------------------------------------------------------------

|-------------------------------------------------------------|
|     OQSO means Overlapping-Quadruples-Sparse-Occupancy      |
| The test OQSO is similar, except that it considers 4-letter|
|words from an alphabet of 32 letters, each letter determined |
|by a designated string of 5 consecutive bits from the test |
|file, elements of which are assumed 32-bit random integers. |
|The mean number of missing words in a sequence of 2^21 four- |
|letter words, (2^21+3 "keystrokes"), is again 141909, with |
|sigma = 295. The mean is based on theory; sigma comes from |
|extensive simulation.                                        |
|-------------------------------------------------------------|

 OQSO test for file tmp1

 Bits used   No. missing words   z-score    p-value
 28 to 32          141964         0.1853   0.426488
 27 to 31          142035         0.4260   0.335054
 26 to 30          141987         0.2633   0.396164
 25 to 29          142202         0.9921   0.160574
 24 to 28          142233         1.0972   0.136280
 23 to 27          141608        -1.0215   0.846481
 22 to 26          141887        -0.0757   0.530169
 21 to 25          141922         0.0429   0.482871
 20 to 24          141735        -0.5909   0.722723
 19 to 23          142160         0.8497   0.197738
 18 to 22          141830        -0.2689   0.606002
 17 to 21          141934         0.0836   0.466676
 16 to 20          141544        -1.2384   0.892217
 15 to 19          141960         0.1718   0.431812
 14 to 18          141543        -1.2418   0.892844
 13 to 17          141349        -1.8994   0.971246
 12 to 16          142015         0.3582   0.360095
 11 to 15          142507         2.0260   0.021382
 10 to 14          141848        -0.2079   0.582346
  9 to 13          141508        -1.3604   0.913155
  8 to 12          141678        -0.7842   0.783530
  7 to 11          142182         0.9243   0.177664
  6 to 10          142100         0.6463   0.259030
  5 to  9          141898        -0.0384   0.515318
  4 to  8          141633        -0.9367   0.825547
  3 to  7          141550        -1.2181   0.888401
  2 to  6          142224         1.0667   0.143059
  1 to  5          141914         0.0158   0.493685
 -----------------------------------------------------------------

|-------------------------------------------------------------|
| The DNA test considers an alphabet of 4 letters: C,G,A,T,|
|determined by two designated bits in the sequence of random |
|integers being tested. It considers 10-letter words, so that|
|as in OPSO and OQSO, there are 2^20 possible words, and the |
|mean number of missing words from a string of 2^21 (over- |
|lapping) 10-letter words (2^21+9 "keystrokes") is 141909. |
|The standard deviation sigma=339 was determined as for OQSO |
|by simulation. (Sigma for OPSO, 290, is the true value (to |
|three places), not determined by simulation.)                |
|-------------------------------------------------------------|

 DNA test for file tmp1

 Bits used   No. missing words   z-score    p-value
 31 to 32          141611        -0.8800   0.810578
 30 to 31          142286         1.1111   0.133258
 29 to 30          141922         0.0374   0.485093
 28 to 29          141572        -0.9951   0.840150
 27 to 28          141520        -1.1485   0.874612
 26 to 27          142296         1.1406   0.127014
 25 to 26          141898        -0.0334   0.513331
 24 to 25          141505        -1.1927   0.883509
 23 to 24          142120         0.6214   0.267153
 22 to 23          141803        -0.3137   0.623109
 21 to 22          142265         1.0492   0.147049
 20 to 21          141949         0.1170   0.453422
 19 to 20          141822        -0.2576   0.601646
 18 to 19          142427         1.5271   0.063374
 17 to 18          141843        -0.1957   0.577563
 16 to 17          142061         0.4474   0.327292
 15 to 16          141755        -0.4553   0.675536
 14 to 15          141694        -0.6352   0.737348
 13 to 14          141964         0.1613   0.435941
 12 to 13          141698        -0.6234   0.733487
 11 to 12          142219         0.9135   0.180495
 10 to 11          142148         0.7040   0.240703
  9 to 10          141741        -0.4965   0.690246
  8 to  9          141687        -0.6558   0.744037
  7 to  8          141751        -0.4671   0.679768
  6 to  7          142137         0.6716   0.250921
  5 to  6          141902        -0.0216   0.508625
  4 to  5          142183         0.8073   0.209751
  3 to  4          142101         0.5654   0.285901
  2 to  3          142843         2.7542   0.002942
  1 to  2          141353        -1.6411   0.949611
 -----------------------------------------------------------------

|-------------------------------------------------------------|
|    This is the COUNT-THE-1's TEST on a stream of bytes.     |
|Consider the file under test as a stream of bytes (four per |
|32 bit integer). Each byte can contain from 0 to 8 1's, |
|with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let |
|the stream of bytes provide a string of overlapping 5-letter|
|words, each "letter" taking values A,B,C,D,E. The letters are|
|determined by the number of 1's in a byte: 0,1,or 2 yield A,|
|3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus|
|we have a monkey at a typewriter hitting five keys with vari-|
|ous probabilities (37,56,70,56,37 over 256). There are 5^5 |
|possible 5-letter words, and from a string of 256,000 (over- |
|lapping) 5-letter words, counts are made on the frequencies |
|for each word. The quadratic form in the weak inverse of |
|the covariance matrix of the cell counts provides a chisquare|
|test: Q5-Q4, the difference of the naive Pearson sums of |
|(OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts. |
|-------------------------------------------------------------|

 Test result for the byte stream from tmp1
 (Degrees of freedom: 5^5-5^4=2500; sample size: 2560000)

   chisquare   z-score    p-value
     2458.65    -0.585   0.720643

|-------------------------------------------------------------|
|     This is the COUNT-THE-1's TEST for specific bytes.      |
|Consider the file under test as a stream of 32-bit integers. |
|From each integer, a specific byte is chosen, say the left- |
|most: bits 1 to 8. Each byte can contain from 0 to 8 1's, |
|with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let |
|the specified bytes from successive integers provide a string|
|of (overlapping) 5-letter words, each "letter" taking values |
|A,B,C,D,E. The letters are determined by the number of 1's,|
|in that byte: 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D, |
|and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter |
|hitting five keys with various probabilities: 37,56,70, |
|56,37 over 256. There are 5^5 possible 5-letter words, and |
|from a string of 256,000 (overlapping) 5-letter words, counts|
|are made on the frequencies for each word. The quadratic form|
|in the weak inverse of the covariance matrix of the cell |
|counts provides a chisquare test: Q5-Q4, the difference of |
|the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5- |
|and 4-letter cell counts.                                    |
|-------------------------------------------------------------|

 Test results for specific bytes from tmp1
 (Degrees of freedom: 5^5-5^4=2500; sample size: 256000)

 bits used   chisquare   z-score    p-value
  1 to  8      2467.39    -0.461   0.677682
  2 to  9      2481.47    -0.262   0.603339
  3 to 10      2509.88     0.140   0.444463
  4 to 11      2508.88     0.126   0.450052
  5 to 12      2517.91     0.253   0.400004
  6 to 13      2537.88     0.536   0.296072
  7 to 14      2529.17     0.413   0.339959
  8 to 15      2534.39     0.486   0.313359
  9 to 16      2562.48     0.884   0.188440
 10 to 17      2479.56    -0.289   0.613741
 11 to 18      2592.06     1.302   0.096480
 12 to 19      2561.72     0.873   0.191385
 13 to 20      2572.96     1.032   0.151090
 14 to 21      2424.06    -1.074   0.858584
 15 to 22      2572.50     1.025   0.152604
 16 to 23      2440.35    -0.844   0.800534
 17 to 24      2467.59    -0.458   0.676627
 18 to 25      2447.82    -0.738   0.769704
 19 to 26      2440.41    -0.843   0.800319
 20 to 27      2539.97     0.565   0.285962
 21 to 28      2553.33     0.754   0.225346
 22 to 29      2558.21     0.823   0.205186
 23 to 30      2464.33    -0.504   0.693038
 24 to 31      2511.79     0.167   0.433794
 25 to 32      2311.40    -2.667   0.996176

|-------------------------------------------------------------|
|                 THIS IS A PARKING LOT TEST                  |
|In a square of side 100, randomly "park" a car---a circle of |
|radius 1. Then try to park a 2nd, a 3rd, and so on, each |
|time parking "by ear". That is, if an attempt to park a car |
|causes a crash with one already parked, try again at a new |
|random location. (To avoid path problems, consider parking |
|helicopters rather than cars.) Each attempt leads to either|
|a crash or a success, the latter followed by an increment to |
|the list of cars already parked. If we plot n: the number of |
|attempts, versus k: the number successfully parked, we get a |
|curve that should be similar to those provided by a perfect |
|random number generator. Theory for the behavior of such a |
|random curve seems beyond reach, and as graphics displays are|
|not available for this battery of tests, a simple characteriz|
|ation of the random experiment is used: k, the number of cars|
|successfully parked after n=12,000 attempts. Simulation shows|
|that k should average 3523 with sigma 21.9 and is very close |
|to normally distributed. Thus (k-3523)/21.9 should be a st- |
|andard normal variable, which, converted to a uniform varia- |
|ble, provides input to a KSTEST based on a sample of 10.     |
|-------------------------------------------------------------|

 CDPARK: result of 10 tests on file tmp1
 (Of 12000 tries, the average no. of successes should be
 3523.0 with sigma=21.9)

 No. successes   z-score    p-value
      3488       -1.5982   0.944998
      3500       -1.0502   0.853193
      3521       -0.0913   0.536383
      3474       -2.2374   0.987371
      3530        0.3196   0.374623
      3520       -0.1370   0.554479
      3535        0.5479   0.291865
      3523        0.0000   0.500000
      3540        0.7763   0.218799
      3523        0.0000   0.500000

 Square side=100, avg. no. parked=3515.40 sample std.=20.18
 p-value of the KSTEST for those 10 p-values: 0.426954

|-------------------------------------------------------------|
|                  THE MINIMUM DISTANCE TEST                  |
|It does this 100 times: choose n=8000 random points in a |
|square of side 10000. Find d, the minimum distance between |
|the (n^2-n)/2 pairs of points. If the points are truly inde-|
|pendent uniform, then d^2, the square of the minimum distance|
|should be (very close to) exponentially distributed with mean|
|.995 . Thus 1-exp(-d^2/.995) should be uniform on [0,1) and |
|a KSTEST on the resulting 100 values serves as a test of uni-|
|formity for random points in the square. Test numbers=0 mod 5|
|are printed but the KSTEST is based on the full set of 100 |
|random choices of 8000 points in the 10000x10000 square.     |
|-------------------------------------------------------------|

 This is the MINIMUM DISTANCE test for file tmp1

 Sample no.     d^2      mean    equiv uni
      5       1.6646   0.7477    0.812318
     10       1.3650   0.8245    0.746367
     15       0.0199   0.8340    0.019814
     20       0.0321   1.0034    0.031724
     25       0.6377   1.0902    0.473164
     30       0.3433   1.0538    0.291807
     35       0.3486   1.0331    0.295533
     40       2.1529   1.0203    0.885109
     45       0.3442   1.0417    0.292458
     50       0.8189   1.0376    0.560904
     55       0.0686   1.0250    0.066592
     60       0.3737   1.0360    0.313141
     65       2.2789   1.0365    0.898765
     70       0.1812   1.0471    0.166488
     75       3.4134   1.1079    0.967631
     80       0.9577   1.0912    0.618078
     85       0.1611   1.1055    0.149504
     90       2.4516   1.1036    0.914896
     95       0.9292   1.0961    0.606987
    100       1.8324   1.0709    0.841433
 --------------------------------------------------------------
 Result of KS test on 100 transformed mindist^2's: p-value=0.235271

|-------------------------------------------------------------|
|                     THE 3DSPHERES TEST                      |
|Choose 4000 random points in a cube of edge 1000. At each |
|point, center a sphere large enough to reach the next closest|
|point. Then the volume of the smallest such sphere is (very |
|close to) exponentially distributed with mean 120pi/3. Thus |
|the radius cubed is exponential with mean 30. (The mean is |
|obtained by extensive simulation). The 3DSPHERES test gener-|
|ates 4000 such spheres 20 times. Each min radius cubed leads|
|to a uniform variable by means of 1-exp(-r^3/30.), then a |
| KSTEST is done on the 20 p-values.                          |
|-------------------------------------------------------------|

 The 3DSPHERES test for file tmp1

 sample no     r^3     equiv. uni.
      1      51.729     0.821705
      2      57.739     0.854070
      3      42.076     0.754024
      4      11.785     0.324868
      5      12.393     0.338397
      6      70.348     0.904145
      7      14.305     0.379249
      8      39.725     0.733971
      9       2.168     0.069712
     10      13.727     0.367181
     11      15.670     0.406866
     12      21.771     0.516009
     13      16.999     0.432566
     14       1.571     0.051024
     15       8.459     0.245697
     16      52.613     0.826879
     17      15.987     0.413092
     18       8.839     0.255207
     19      38.696     0.724696
     20       0.049     0.001643
 --------------------------------------------------------------
 p-value for KS test on those 20 p-values: 0.649487

|-------------------------------------------------------------|
|                  This is the SQUEEZE test                   |
| Random integers are floated to get uniforms on [0,1). Start-|
| ing with k=2^31-1=2147483647, the test finds j, the number |
| of iterations necessary to reduce k to 1, using the reduc- |
| tion k=ceiling(k*U), with U provided by floating integers |
| from the file being tested. Such j's are found 100,000 |
| times, then counts for the number of times j was <=6,7,..., |
| 47,>=48 are used to provide a chi-square test for cell |
| frequencies.                                                |
|-------------------------------------------------------------|

 RESULTS OF SQUEEZE TEST FOR tmp1
 Table of standardized frequency counts (obs-exp)/sqrt(exp)
 for j=(1,..,6), 7,...,47,(48,...)

     0.6  -2.4  -1.1  -0.4  -0.5  -2.1
     0.4  -0.7  -0.2  -2.6   1.3  -0.2
     1.5  -1.2  -0.2   0.1  -0.3   0.5
    -0.7   1.0   0.0   0.6   0.6   1.0
    -2.4   0.5   0.0   1.0   0.7  -0.7
     1.1  -0.5   0.3   0.3  -1.5  -0.5
     1.7   0.5   0.1  -0.1  -0.6  -1.0
    -0.1

 Chi-square with 42 degrees of freedom: 45.151402
 z-score=0.343846, p-value=0.341643
 _____________________________________________________________

|-------------------------------------------------------------|
|                  The OVERLAPPING SUMS test                  |
|Integers are floated to get a sequence U(1),U(2),... of uni- |
|form [0,1) variables. Then overlapping sums, |
| S(1)=U(1)+...+U(100), S(2)=U(2)+...+U(101),... are formed. |
|The S's are virtually normal with a certain covariance mat- |
|rix. A linear transformation of the S's converts them to a |
|sequence of independent standard normals, which are converted|
|to uniform variables for a KSTEST.                           |
|-------------------------------------------------------------|

 Results of the OSUM test for tmp1

 Test no    p-value
    1      0.645617
    2      0.564544
    3      0.270074
    4      0.060222
    5      0.187531
    6      0.582887
    7      0.588654
    8      0.194735
    9      0.788569
   10      0.820659
 _____________________________________________________________
 p-value for 10 kstests on 100 kstests: 0.812862

|-------------------------------------------------------------|
| This is the RUNS test. It counts runs up, and runs down,|
|in a sequence of uniform [0,1) variables, obtained by float- |
|ing the 32-bit integers in the specified file. This example |
|shows how runs are counted: .123,.357,.789,.425,.224,.416,.95|
|contains an up-run of length 3, a down-run of length 2 and an|
|up-run of (at least) 2, depending on the next values. The |
|covariance matrices for the runs-up and runs-down are well |
|known, leading to chisquare tests for quadratic forms in the |
|weak inverses of the covariance matrices. Runs are counted |
|for sequences of length 10,000. This is done ten times. Then|
|repeated.                                                    |
|-------------------------------------------------------------|

 The RUNS test for file tmp1
 (Up and down runs in a sequence of 10000 numbers)

 Set 1
   runs up;   ks test for 10 p's: 0.705192
   runs down; ks test for 10 p's: 0.242110
 Set 2
   runs up;   ks test for 10 p's: 0.766427
   runs down; ks test for 10 p's: 0.789183

|-------------------------------------------------------------|
|This is the CRAPS TEST. It plays 200,000 games of craps, |
|counts the number of wins and the number of throws necessary |
|to end each game. The number of wins should be (very close |
|to) a normal with mean 200000p and variance 200000p(1-p), |
|and p=244/495. Throws necessary to complete the game can |
|vary from 1 to infinity, but counts for all>21 are lumped |
|with 21. A chi-square test is made on the no.-of-throws cell |
|counts. Each 32-bit integer from the test file provides the |
|value for the throw of a die, by floating to [0,1), multi- |
|plying by 6 and taking 1 plus the integer part of the result.|
|-------------------------------------------------------------|

 RESULTS OF CRAPS TEST FOR tmp1

 No. of wins:  Observed    Expected
                 98363     98585.858586
 z-score=-0.997, pvalue=0.84056

 Analysis of Throws-per-Game:

 Throws   Observed   Expected    Chisq   Sum of (O-E)^2/E
    1       66367    66666.7     1.347         1.347
    2       37767    37654.3     0.337         1.684
    3       27017    26954.7     0.144         1.828
    4       19185    19313.5     0.854         2.682
    5       13960    13851.4     0.851         3.534
    6       10101     9943.5     2.493         6.027
    7        7174     7145.0     0.117         6.144
    8        5112     5139.1     0.143         6.287
    9        3747     3699.9     0.600         6.888
   10        2642     2666.3     0.221         7.109
   11        1899     1923.3     0.308         7.417
   12        1384     1388.7     0.016         7.433
   13         970     1003.7     1.132         8.565
   14         706      726.1     0.559         9.124
   15         523      525.8     0.015         9.139
   16         367      381.2     0.525         9.665
   17         335      276.5    12.359        22.023
   18         236      200.8     6.159        28.182
   19         131      146.0     1.538        29.720
   20          97      106.2     0.800        30.520
   21         280      287.1     0.176        30.696

 Chisq= 30.70 for 20 degrees of freedom, p= 0.05933

 SUMMARY of craptest on tmp1
 p-value for no. of wins: 0.840558
 p-value for throws/game: 0.059331
 _____________________________________________________________
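
The "No. of wins" figures in the craps results can be checked from the description alone: with p = 244/495, the win count over 200,000 games is approximately normal with mean 200000p and variance 200000p(1-p). This is a minimal sketch under that normal approximation, not DIEHARD's own code; the function name `craps_wins_summary` is hypothetical, and the upper-tail convention for the p-value is an assumption inferred from the printed output.

```python
import math

GAMES = 200_000      # number of games played, from the test description
P_WIN = 244 / 495    # probability of winning a game of craps


def craps_wins_summary(wins, games=GAMES, p=P_WIN):
    """Return (expected_wins, z, p_value) under the normal approximation.

    Assumption: the p-value is the upper tail 1 - Phi(z), matching the
    convention the printed results appear to follow.
    """
    expected = games * p
    sd = math.sqrt(games * p * (1.0 - p))
    z = (wins - expected) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # equals 1 - Phi(z)
    return expected, z, p_value


if __name__ == "__main__":
    expected, z, p_value = craps_wins_summary(98363)
    print(f"expected={expected:.6f} z={z:.3f} p={p_value:.5f}")
```

For the observed 98363 wins this gives an expected count near 98585.86 and a z-score near -0.997, in line with the values reported above.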