Why does my Sieve of Eratosthenes work faster with integers than with booleans?

♀尐吖头ヾ 提交于 2019-12-18 14:49:34

问题


I wrote a simple Sieve of Eratosthenes, which uses a list of ones and turns them into zeros if not prime, like so:

def eSieve(n): #Where m is fixed-length list of all integers up to n
    '''Creates a list of primes less than or equal to n'''
    m = [1]*(n+1)
    for i in xrange(2,int((n)**0.5)+1):
        if m[i]:
            for j in xrange(i*i,n+1,i):
                m[j]=0
    return [i for i in xrange(2,n) if m[i]]

I tested the speed it ran with %timeit and got:

#n: t
#10**1: 7 μs
#10**2: 26.6 μs
#10**3: 234 μs
#10**4: 2.46 ms
#10**5: 26.4 ms
#10**6: 292 ms
#10**7: 3.27 s

I assumed, if I changed [1] and 0 to booleans, it would run faster... but it does the opposite:

#n: t
#10**1: 7.31 μs
#10**2: 29.5 μs
#10**3: 297 μs
#10**4: 2.99 ms
#10**5: 29.9 ms
#10**6: 331 ms
#10**7: 3.7 s

Why are the booleans slower?


回答1:


This happens because True and False are looked up as globals in Python 2. The 0 and 1 literals are just constants, looked up by a quick array reference, while globals are dictionary lookups in the global namespace (falling through to the built-ins namespace):

>>> import dis
>>> def foo():
...     a = True
...     b = 1
... 
>>> dis.dis(foo)
  2           0 LOAD_GLOBAL              0 (True)
              3 STORE_FAST               0 (a)

  3           6 LOAD_CONST               1 (1)
              9 STORE_FAST               1 (b)
             12 LOAD_CONST               0 (None)
             15 RETURN_VALUE        

The True value is looked up with the LOAD_GLOBAL bytecode, while the 1 literal value is copied to the stack with LOAD_CONST.

If you make True and False locals you can make them just as fast again:

def eSieve(n, True=True, False=False):
    m = [True]*(n+1)
    for i in xrange(2,int((n)**0.5)+1):
        if m[i]:
            for j in xrange(i*i,n+1,i):
                m[j]=False
    return [i for i in xrange(2,n) if m[i]]

Assigning True and False as default values to for arguments gives the function those names as locals, with the exact same values; again using a simplified version:

>>> def bar(True=True, False=False):
...     True == False
... 
>>> dis.dis(bar)
  2           0 LOAD_FAST                0 (True)
              3 LOAD_FAST                1 (False)
              6 COMPARE_OP               2 (==)
              9 POP_TOP             
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        

Note the LOAD_FAST opcodes, now with indices just like the LOAD_CONST bytecodes; locals in a CPython function are stored in an array just like bytecode constants.

With that change, using booleans wins out, albeit by a small margin; my timings:

# n      integers  globals  locals
# 10**1  4.31 µs   4.2 µs   4.2 µs
# 10**2  17.1 µs   17.3 µs  16.5 µs
# 10**3  147 µs    158 µs   144 µs
# 10**4  1.5 ms    1.66 ms  1.48 ms
# 10**5  16.4 ms   18.2 ms  15.9 ms
# 10**6  190 ms    215 ms   189 ms   
# 10**7  2.21 s    2.47 s   2.18 s

The difference isn't really that much because Python booleans are just an int subclass.

Note that in Python 3, True and False have become keywords and can no longer be assigned to, making it possible to treat them just like integer literals.



来源:https://stackoverflow.com/questions/31459623/why-does-my-sieve-of-eratosthenes-work-faster-with-integers-than-with-booleans

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!