Sieve of Eratosthenes - Primes between X and N

再見小時候 2020-12-02 03:05

I found this highly optimised implementation of the Sieve of Eratosthenes for Python on Stack Overflow. I have a rough idea of what it's doing but I must admit the details

2 Answers
  • 2020-12-02 03:55

    The implementation you've borrowed can start at 3 because, instead of sieving out the multiples of 2, it simply skips all even numbers; that's what the 2*… that appear multiple times in the code are about. The fact that 3 is the next prime is also hardcoded all over the place, but let's ignore that for the moment: if you can't get past the special-casing of 2, the special-casing of 3 doesn't matter.

    Skipping even numbers is a special case of a "wheel". You can skip sieving multiples of 2 by always incrementing by 2; you can skip sieving multiples of 2 and 3 by alternately incrementing by 2 and 4; you can skip sieving multiples of 2, 3, 5, and 7 by cycling through the increments 2, 4, 2, 4, 6, 2, 6, … (there are 48 numbers in the cycle); and so on. So you could extend this code by first finding all the primes up to x, then building a wheel, then using that wheel to find all the primes between x and n.
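
    To make the wheel idea concrete, here is a minimal sketch that computes the repeating increments for a given set of small primes. The function name `wheel_increments` is made up for illustration and is not part of the borrowed implementation. It lists the gaps starting from 1, so the 2, 4, 2, 4, 6, 2, 6 run for the 2-3-5-7 wheel shows up from the second entry onward.

```python
from math import gcd

def wheel_increments(basis):
    """Return the repeating gaps between numbers coprime to every prime in basis."""
    modulus = 1
    for p in basis:
        modulus *= p
    # Residues in [1, modulus] that share no factor with the modulus.
    spokes = [r for r in range(1, modulus + 1) if gcd(r, modulus) == 1]
    # Gap from each spoke to the next one...
    gaps = [b - a for a, b in zip(spokes, spokes[1:])]
    # ...plus the gap that wraps around into the next turn of the wheel.
    gaps.append(modulus + spokes[0] - spokes[-1])
    return gaps

print(wheel_increments((2,)))             # one gap: always step by 2
print(wheel_increments((2, 3)))           # two gaps, starting from 1
print(len(wheel_increments((2, 3, 5, 7))))
```

    The gaps always sum to the product of the basis primes, which is why the storage cost grows so quickly once you go past 7.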

    But that's adding a lot of complexity. And once you get too far beyond 7, the cost (both in time, and in space for storing the wheel) swamps the savings. And if your whole goal is not to find the primes before x, finding the primes before x so you don't have to find them seems kind of silly. :)

    The simpler thing to do is just find all the primes up to n, and throw out the ones below x. Which you can do with a trivial change at the end:

    primes = numpy.r_[2,result]
    return primes[primes>=x]
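
    In the snippet above, `result` stands for whatever array the borrowed function builds just before it returns; that function isn't shown here, so treat the name as a placeholder. As a self-contained illustration of the same "sieve everything, then filter" idea, here is a sketch in plain Python without numpy. The names `primes_below` and `primes_between` are hypothetical, and the upper bound n is assumed to be exclusive.

```python
from math import isqrt

def primes_below(n):
    """Plain Sieve of Eratosthenes: all primes p with p < n."""
    if n < 3:
        return []
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    for i in range(2, isqrt(n - 1) + 1):
        if is_prime[i]:
            # Strike out i*i, i*i + i, ...; smaller multiples were already handled.
            is_prime[i * i :: i] = bytearray(len(range(i * i, n, i)))
    return [i for i, flag in enumerate(is_prime) if flag]

def primes_between(x, n):
    """Find every prime below n, then throw away the ones below x."""
    return [p for p in primes_below(n) if p >= x]

print(primes_between(10, 30))
```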
    

    Of course there are ways to do this without wasting storage for those initial primes you're going to throw away. They'd be a bit complicated to work into this algorithm (you'd probably want to build the array in sections, then drop each section that's entirely < x as you go, then stack all the remaining sections); it would be far easier to use a different implementation of the algorithm that isn't designed for speed and simplicity over space…
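
    As a rough sketch of what working in sections could look like, here is a segmented sieve in plain Python. It is a different implementation, not a modification of the numpy one, and instead of building and dropping the sections below x it simply never builds them. The function name and the `segment_size` parameter are assumptions made for this example.

```python
from math import isqrt

def primes_in_range_segmented(x, n, segment_size=1 << 16):
    """Yield primes p with x <= p < n, sieving one fixed-size section at a time."""
    if n < 3:
        return
    x = max(x, 2)
    # Only the primes up to sqrt(n - 1) are needed to sieve any section.
    limit = isqrt(n - 1)
    base_flags = bytearray([1]) * (limit + 1)
    base_primes = []
    for i in range(2, limit + 1):
        if base_flags[i]:
            base_primes.append(i)
            base_flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    for start in range(x, n, segment_size):
        stop = min(start + segment_size, n)
        flags = bytearray([1]) * (stop - start)
        for p in base_primes:
            if p * p >= stop:
                break
            # First multiple of p that is >= start and >= p*p.
            first = max(p * p, -(-start // p) * p)
            flags[first - start :: p] = bytearray(len(range(first, stop, p)))
        for offset, flag in enumerate(flags):
            if flag:
                yield start + offset

print(list(primes_in_range_segmented(10, 50, segment_size=7)))
```

    Only one section plus the base primes is in memory at any time, at the cost of more bookkeeping than the all-at-once version.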

    And of course there are different prime-finding algorithms that don't require enumerating all the primes up to x in the first place. But if you want to use this implementation of this algorithm, that doesn't matter.

  • 2020-12-02 03:59

    Since you're now interested in looking into other algorithms or other implementations, try this one. It doesn't use numpy, but it is rather fast. I've tried a few variations on this theme, including using sets, and pre-computing a table of low primes, but they were all slower than this one.

    #! /usr/bin/env python
    
    ''' Prime range sieve.
    
        Written by PM 2Ring 2014.10.15
    
        For range(0, 30000000) this is actually _faster_ than the 
        plain Eratosthenes sieve in sieve3.py !!!
    '''
    
    import sys
    
    def potential_primes():
        ''' Make a generator for 2, 3, 5, & thence all numbers coprime to 30 '''
        s = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
        for i in s:
            yield i
        s = (1,) + s[3:] 
        j = 30
        while True:
            for i in s:
                yield j + i
            j += 30
    
    
    def range_sieve(lo, hi):
        ''' Create a list of all primes in the range(lo, hi) '''
    
        #Mark all numbers as prime
        primes = [True] * (hi - lo)
    
        #Eliminate 0 and 1, if necessary
        for i in range(lo, min(2, hi)):
            primes[i - lo] = False
    
        ihi = int(hi ** 0.5)
        for i in potential_primes():
            if i > ihi: 
                break
    
            #Find first multiple of i that is >= i*i and >= lo
            ilo = max(i, 1 + (lo - 1) // i ) * i
    
            #Determine how many multiples of i >= ilo are in range
            n = 1 + (hi - ilo - 1) // i
    
            #Mark them as composite
            primes[ilo - lo : : i] = n * [False]
    
        return [i for i,v in enumerate(primes, lo) if v]
    
    
    def main():
        lo = int(sys.argv[1]) if len(sys.argv) > 1 else 0
        hi = int(sys.argv[2]) if len(sys.argv) > 2 else lo + 30
        #print(lo, hi)
    
        primes = range_sieve(lo, hi)
        #print(len(primes))
        print(primes)
        #print(primes[:10], primes[-10:])
    
    
    if __name__ == '__main__':
        main()
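
    The two index computations inside the loop are the least obvious part of range_sieve, so here is a small worked example that isolates them. The helper name `multiples_to_strike` is made up for illustration, and the max(0, ...) clamp is an addition for clarity: the program above gets the same effect because multiplying a list by a negative number gives an empty list.

```python
def multiples_to_strike(lo, hi, i):
    """Return (ilo, n): the first multiple of i to mark in range(lo, hi), and how many."""
    # 1 + (lo - 1) // i is ceil(lo / i), so multiplying by i gives the smallest
    # multiple of i that is >= lo. The max() with i raises that to at least i * i,
    # so i itself is never marked composite.
    ilo = max(i, 1 + (lo - 1) // i) * i
    # Number of values ilo, ilo + i, ilo + 2*i, ... that are still < hi.
    n = max(0, 1 + (hi - ilo - 1) // i)
    return ilo, n

lo, hi, i = 100, 130, 7
ilo, n = multiples_to_strike(lo, hi, i)
print(ilo, n, list(range(ilo, hi, i)))   # 105 4 [105, 112, 119, 126]
```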
    

    And here's a link to the plain Eratosthenes sieve that I mentioned in the docstring, in case you want to compare this program to that one.

    You could improve this slightly by getting rid of the loop under #Eliminate 0 and 1, if necessary. And I guess it might be slightly faster if you avoided looking at even numbers; it'd certainly use less memory. But then you'd have to handle the cases when 2 was inside the range, and I figure that the fewer tests you have, the faster this thing will run.


    Here's a minor improvement to that code: replace

        #Mark all numbers as prime
        primes = [True] * (hi - lo)
    
        #Eliminate 0 and 1, if necessary
        for i in range(lo, min(2, hi)):
            primes[i - lo] = False
    

    with

        #Eliminate 0 and 1, if necessary
        lo = max(2, lo)
    
        #Mark all numbers as prime
        primes = [True] * (hi - lo)
    

    However, the original form may be preferable if you want to return the plain bool list rather than performing the enumerate to build a list of integers, since when lo is below 2, clamping it changes which number index 0 of the list stands for. The bool list is more useful for testing whether a given number is prime; on the other hand, the enumerate could be used to build a set rather than a list.
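
    To illustrate the two ways of using the result, here is a small sketch. `prime_flags` is a hypothetical stand-in that uses trial division to produce the same kind of bool list a modified range_sieve would return; it is only there to keep the example self-contained.

```python
def prime_flags(lo, hi):
    """Stand-in for a sieve returning flags: flags[k] is True when lo + k is prime."""
    def is_prime(m):
        return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))
    return [is_prime(m) for m in range(lo, hi)]

lo, hi = 100, 130
flags = prime_flags(lo, hi)

def in_range_is_prime(m):
    # With the flag list, a primality test is an index lookup offset by lo.
    return lo <= m < hi and flags[m - lo]

# The same flags can feed a set instead of a list via enumerate(flags, lo).
prime_set = {m for m, flag in enumerate(flags, lo) if flag}

print(in_range_is_prime(101), in_range_is_prime(105))
print(sorted(prime_set))
```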
