Random number in the range 1 to sys.maxsize is always 1 mod 2^10

后端 未结 3 1266
被撕碎了的回忆
被撕碎了的回忆 2021-01-01 08:55

I am trying to find the statistical properties of the PRNGs available in Python (2.7.10) by using the frequency test, runs test and the chi squared test.

For carryin

3条回答
  •  太阳男子
    2021-01-01 09:22

    @roeland hinted at the cause: in Python 2, sample() uses int(random.random() * n) repeatedly. Look at the source code (in your Python's Lib/random.py) for full details. In short, random.random() returns no more than 53 significant (non-zero) leading bits; then int() fills the rest of the low-order bits with zeroes (you're obviously on a machine where sys.maxsize == 2**63 - 1); then indexing your base (xrange(1, sys.maxsize)) by an even integer with "a lot" of of low-order 0 bits always returns an odd integer with the same number of low-order 0 bits (except for the last).

    In Python 3 none of that happens - random in Python 3 uses stronger algorithms, and only falls back to random.random() when necessary. For example, here under Python 3.4.3:

    >>> hex(random.randrange(10**70))
    '0x91fc11ed768be3a454bd66f593c218d8bbfa3b99f6285291e1d9f964a9'
    >>> hex(random.randrange(10**70))
    '0x7b07ff02b6676801e33094fca2fcca7f6e235481c479c521643b1acaf4'
    

    EDIT

    Here's a more directly relevant example, under 3.4.3 on a 64-bit box:

    >>> import random, sys
    >>> sys.maxsize == 2**63 - 1
    True
    >>> for i in random.sample(range(1, sys.maxsize), 6):
    ...    print(bin(i))
    0b10001100101001001111110110011111000100110100111001100000010110
    0b100111100110110100111101001100001100110001110010000101101000101
    0b1100000001110000110100111101101010110001100110101111011100111
    0b111110100001111100101001001001101101100100011001001010100001110
    0b1100110100000011100010000011010010100100110111001111100110100
    0b10011010000110101010101110001000101110111100100001111101110111
    

    Python 3 doesn't invoke random.random() at all in this case, but instead iteratively grabs chunks of 32 bits from the underlying Mersenne Twister (32-bit unsigned ints are "the natural" outputs from this implementation of MT) , pasting them together to build a suitable index. So, in Python 3, platform floats have nothing to do with it; in Python 2, quirks of float behavior have everything to do with it.

提交回复
热议问题