Making Sieve of Eratosthenes more memory efficient in Python?

心在旅途 2021-01-20 00:41

Sieve of Eratosthenes memory constraint issue

I'm currently trying to implement a version of the Sieve of Eratosthenes for a Kattis problem; however, I am running in

5 answers
  •  甜味超标
    2021-01-20 00:59

    Here is an example of a segmented sieve approach that should not exceed 8MB of memory.

    def primeSieve(n,X,window=10**6): 
        primes     = []       # only store minimum number of primes to shift windows
        primeCount = 0        # count primes beyond the ones stored
        flags      = list(X)  # numbers will be replaced by 0 or 1 as we progress
        base       = 1        # number corresponding to 1st element of sieve
        isPrime    = [False]+[True]*(window-1) # starting sieve
        
        def flagPrimes(): # flag x values for current sieve window
            flags[:] = [isPrime[x-base]*1 if x in range(base,base+window) else x
                        for x in flags]
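        from itertools import chain         # stdlib; lets the loop below iterate lazily
        for p in chain((2,),range(3,n+1,2)): # potential primes: 2 and odd numbers (no n/2-element tuple is built)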
            if p >= base+window:            # shift sieve window as needed
                flagPrimes()                # set X flags before shifting window
                isPrime = [True]*window     # initialize next sieve window
                base    = p                 # 1st number in window
                for k in primes:            # update sieve using known primes 
                    if k>base+window:break
                    i = (k-base%k)%k + k*(k==p)  
                    isPrime[i::k] = (False for _ in range(i,window,k))
            if not isPrime[p-base]: continue
            primeCount += 1                 # count primes 
            if p*p<=n:primes.append(p)      # store shifting primes, update sieve
            isPrime[p*p-base::p] = (False for _ in range(p*p-base,window,p))
    
        flagPrimes() # update flags with last window (should cover the rest of them)
        return primeCount,flags     
            
    

    output:

    print(*primeSieve(9973,[1,2,3,4,9972,9973]))
    # 1229 [0, 1, 1, 0, 0, 1]
    
    print(*primeSieve(10**8,[1,2,3,4,9972,9973,1000331]))
    # 5761455 [0, 1, 1, 0, 0, 1, 0]
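The least obvious line in the function is the offset computation `(k-base%k)%k`, which finds where the multiples of a stored prime `k` start inside a window beginning at `base`. Here is a small standalone sketch of just that step. The helper name `first_multiple_offset` is made up for illustration and does not appear in the code above.

```python
def first_multiple_offset(base, k):
    """Return the smallest i >= 0 such that base + i is a multiple of k."""
    # base % k is how far base sits past the previous multiple of k;
    # the outer % k maps the "already a multiple" case from k back to 0.
    return (k - base % k) % k


# A window starting at 1003 with prime 7: the next multiple is 1008 = 7 * 144.
offset = first_multiple_offset(1003, 7)
print(offset, 1003 + offset)  # 5 1008

# Marking the multiples of 7 in a 20-element window that starts at 1003.
base, window, k = 1003, 20, 7
is_prime = [True] * window
i = first_multiple_offset(base, k)
is_prime[i::k] = [False] * len(range(i, window, k))
print([base + j for j, flag in enumerate(is_prime) if not flag])  # [1008, 1015, 1022]
```

The slice assignment needs a right-hand side of exactly the same length as the slice, which is why the length is taken from the matching `range`.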
    

    You can play with the window size to get the best trade-off between execution time and memory consumption. The execution time (on my laptop) is still rather long for large values of n, though:

    from timeit import timeit
    for w in range(3,9):
        t = timeit(lambda:primeSieve(10**8,[],10**w),number=1)
        print(f"10^{w} window:",t)
    
    10^3 window: 119.463959956
    10^4 window: 33.33273301199999
    10^5 window: 24.153761258999992
    10^6 window: 24.649398391000005
    10^7 window: 27.616014667
    10^8 window: 27.919413531000004
    

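    Strangely enough, window sizes beyond 10^6 give worse performance. The sweet spot seems to be somewhere between 10^5 and 10^6. A window of 10^7 would exceed your 50MB limit anyway: on 64-bit CPython a list of 10^7 entries needs roughly 80MB for its element references alone, whereas a window of 10^6 needs about 8MB.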
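If the window list is still too heavy, one option is to keep the flags in a `bytearray`, which uses one byte per entry instead of one list reference per entry. Below is a minimal sketch of that idea, not a drop-in replacement for `primeSieve`: it only counts primes, the function name `count_primes_segmented` is my own, and it assumes Python 3.8+ for `math.isqrt`.

```python
import sys
from math import isqrt


def count_primes_segmented(n, window=10**6):
    """Count primes <= n using a segmented sieve over bytearray windows."""
    if n < 2:
        return 0
    limit = isqrt(n)

    # Ordinary sieve for the base primes up to sqrt(n).
    small = bytearray([1]) * (limit + 1)
    small[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if small[p]:
            small[p * p::p] = bytes(len(range(p * p, limit + 1, p)))
    base_primes = [p for p in range(2, limit + 1) if small[p]]

    count = 0
    for low in range(2, n + 1, window):
        high = min(low + window, n + 1)        # this window covers [low, high)
        seg = bytearray([1]) * (high - low)    # 1 = still considered prime
        for p in base_primes:
            if p * p >= high:
                break                          # larger primes cannot strike here
            # First multiple of p inside the window, but never below p*p.
            start = max(p * p, (low + p - 1) // p * p)
            seg[start - low::p] = bytes(len(range(start, high, p)))
        count += sum(seg)
    return count


if __name__ == "__main__":
    print(count_primes_segmented(9973))        # 1229, same count as above
    # Rough size comparison of one 10^6-entry window in each representation.
    list_bytes = sys.getsizeof([True] * 10**6)
    bytearray_bytes = sys.getsizeof(bytearray(10**6))
    print(f"list: {list_bytes / 1e6:.1f} MB, bytearray: {bytearray_bytes / 1e6:.1f} MB")
```

I have not timed this against the version above, so treat any speed difference as something to measure yourself. The memory saving per window is the part I would expect to hold.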
