Create random list of integers in Python

温柔的废话 2020-11-29 20:17

I'd like to create a random list of integers for testing purposes. The distribution of the numbers is not important. The only thing that counts is time.

4 answers
  • 2020-11-29 20:40

    Firstly, you should use randrange(0,1000) or randint(0,999), not randint(0,1000). The upper limit of randint is inclusive.
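To see the inclusive versus exclusive upper bound for yourself, here is a small sketch. It is written for Python 3 (an assumption; the rest of this answer uses Python 2), and the bound of 3 and the draw count are arbitrary values picked for illustration.

```python
import random

random.seed(0)  # fixed seed only so the run is repeatable

# randint(a, b) can return b; randrange(a, b) never does.
randint_values = {random.randint(0, 3) for _ in range(1000)}
randrange_values = {random.randrange(0, 3) for _ in range(1000)}

print(sorted(randint_values))    # the upper bound 3 shows up here
print(sorted(randrange_values))  # the upper bound 3 is absent here
```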

    For efficiency, randint is simply a wrapper around randrange, which in turn calls random, so you should just use random directly. Also, pass xrange rather than range to sample, so that no intermediate list is built.

    You could use

    [a for a in sample(xrange(1000),1000) for _ in range(10000/1000)]
    

    to generate 10,000 numbers in the range. Note that this calls sample only once and repeats each drawn value 10 times in a row.
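For anyone on Python 3, a rough port of that list comprehension might look like the following. It assumes range in place of xrange and // for integer division; the helper name repeated_sample is made up for this sketch. Each value comes out several times in a row, so the result is nowhere near independent draws.

```python
from random import sample

def repeated_sample(total=10000, upper=1000):
    """Shuffle 0..upper-1 once, then emit each value total // upper times."""
    repeats = total // upper  # explicit integer division for Python 3
    return [a for a in sample(range(upper), upper) for _ in range(repeats)]

numbers = repeated_sample()
print(len(numbers))
print(numbers[:12])  # the first value fills the first ten slots
```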

    (Of course this won't beat NumPy.)

    $ python2.7 -m timeit -s 'from random import randrange' '[randrange(1000) for _ in xrange(10000)]'
    10 loops, best of 3: 26.1 msec per loop
    
    $ python2.7 -m timeit -s 'from random import sample' '[a%1000 for a in sample(xrange(10000),10000)]'
    100 loops, best of 3: 18.4 msec per loop
    
    $ python2.7 -m timeit -s 'from random import random' '[int(1000*random()) for _ in xrange(10000)]' 
    100 loops, best of 3: 9.24 msec per loop
    
    $ python2.7 -m timeit -s 'from random import sample' '[a for a in sample(xrange(1000),1000) for _ in range(10000/1000)]'
    100 loops, best of 3: 3.79 msec per loop
    
    $ python2.7 -m timeit -s 'from random import shuffle
    > def samplefull(x):
    >   a = range(x)
    >   shuffle(a)
    >   return a' '[a for a in samplefull(1000) for _ in xrange(10000/1000)]'
    100 loops, best of 3: 3.16 msec per loop
    
    $ python2.7 -m timeit -s 'from numpy.random import randint' 'randint(1000, size=10000)'
    1000 loops, best of 3: 363 usec per loop
    

    But since you don't care about the distribution of numbers, why not just use:

    range(1000)*(10000/1000)
    

    ?
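On Python 3 that one-liner needs two small changes, because a range object cannot be multiplied and / returns a float. A sketch, with the variable names total and upper chosen here for illustration:

```python
total, upper = 10000, 1000

# list(...) materialises the range so it can be repeated; // keeps the count an int.
numbers = list(range(upper)) * (total // upper)

print(len(numbers))
print(numbers[998:1002])  # wraps from 999 back to 0
```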

  • 2020-11-29 20:41

    Your question about performance is moot—both functions are very fast. The speed of your code will be determined by what you do with the random numbers.

    However, it's important that you understand the difference in behaviour between those two functions. Repeated calls to random.randint sample with replacement, whereas random.sample samples without replacement.
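The difference is easy to see on a small population. This is only an illustrative sketch for Python 3; the population size of 10 and the variable names are arbitrary choices.

```python
import random

random.seed(1)  # fixed seed only so the run is repeatable

# With replacement: each draw is independent, so duplicates are possible.
with_replacement = [random.randrange(10) for _ in range(10)]

# Without replacement: every value appears at most once.
without_replacement = random.sample(range(10), 10)

# Asking for more items than the population holds raises ValueError.
try:
    random.sample(range(10), 11)
    oversample_error = None
except ValueError as exc:
    oversample_error = exc

print(with_replacement)
print(without_replacement)
print(repr(oversample_error))
```

That last case is why random.sample cannot draw 10,000 values from a population of only 1,000.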

  • 2020-11-29 20:51

    It is not entirely clear what you want, but I would use numpy.random.randint:

    import numpy.random as nprnd
    import timeit
    
    t1 = timeit.Timer('[random.randint(0, 1000) for r in xrange(10000)]', 'import random') # v1
    
    ### Change v2 so that it picks numbers in (0, 10000) and thus runs...
    t2 = timeit.Timer('random.sample(range(10000), 10000)', 'import random') # v2
    t3 = timeit.Timer('nprnd.randint(1000, size=10000)', 'import numpy.random as nprnd') # v3
    
    print t1.timeit(1000)/1000
    print t2.timeit(1000)/1000
    print t3.timeit(1000)/1000
    

    which gives on my machine:

    0.0233682730198
    0.00781716918945
    0.000147947072983
    

    Note that randint is very different from random.sample. For sample to work in your case I had to change the 1,000 to 10,000, as one of the commenters pointed out; if you really want values from 0 to 1,000 you could divide by 10.
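If you want to repeat the comparison on Python 3 without installing NumPy, something along these lines should work. It is a sketch, not the script used for the numbers above: the run count is an arbitrary small value, the randint bounds are narrowed to 0..999, and a plain random() variant is added alongside the two standard-library candidates.

```python
import timeit

# Statements to compare; each builds 10,000 integers.
candidates = {
    "randint": "[random.randint(0, 999) for _ in range(10000)]",
    "sample": "random.sample(range(10000), 10000)",
    "random": "[int(1000 * random.random()) for _ in range(10000)]",
}

runs = 20  # arbitrary, kept small so the script finishes quickly
timings = {
    name: timeit.timeit(stmt, setup="import random", number=runs) / runs
    for name, stmt in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name:8s} {seconds * 1e3:.3f} ms per call")
```

Absolute numbers will differ from machine to machine, so only the relative ordering is worth reading.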

    And if you really don't care what distribution you are getting then it is possible that you either don't understand your problem very well, or random numbers -- with apologies if that sounds rude...

  • 2020-11-29 21:00

    All the random methods end up calling random.random() so the best way is to call it directly:

    [int(1000*random.random()) for i in xrange(10000)]
    

    For example,

    • random.randint calls random.randrange.
    • random.randrange has a bunch of overhead to check the range before returning istart + istep*int(self.random() * n).
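The arithmetic in that last expression can be sketched as below. This is a simplified illustration only, with a made-up helper name and none of the argument checking; it is not the standard library's code, and current CPython versions may implement randrange differently.

```python
import random

def scaled_randrange(start, stop, step=1):
    """Illustrative only: pick from range(start, stop, step) via one random() call."""
    # Number of values in the range, assuming step > 0 and stop > start.
    n = (stop - start + step - 1) // step
    # random() is in [0.0, 1.0), so int(random() * n) is in 0..n-1.
    return start + step * int(random.random() * n)

random.seed(0)  # fixed seed only so the run is repeatable
seen = {scaled_randrange(10, 20, 3) for _ in range(1000)}
print(sorted(seen))
```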

    NumPy is much faster still, of course.
