Most efficient code for the first 10000 prime numbers?

日久生厌 2020-11-29 19:09

I want to print the first 10000 prime numbers. Can anyone give me the most efficient code for this? Clarifications:

  1. It does not matter if your code is ineffici
30 Answers
  • 2020-11-29 19:31

    Here is a C++ solution, using a form of SoE:

    #include <iostream>
    #include <deque>
    
    typedef std::deque<int> mydeque;
    
    void my_insert( mydeque & factors, int factor ) {
        int where = factor, count = factors.size();
        while( where < count && factors[where] ) where += factor;
        if( where >= count ) factors.resize( where + 1 );
        factors[ where ] = factor;
    }
    
    int main() {
        mydeque primes;
        mydeque factors;
        int a_prime = 3, a_square_prime = 9, maybe_prime = 3;
        int cnt = 2;
        factors.resize(3);
        std::cout << "2 3 ";
    
        while( cnt < 10000 ) {
            int factor = factors.front();
            maybe_prime += 2;
            if( factor ) {
                my_insert( factors, factor );
            } else if( maybe_prime < a_square_prime ) {
                std::cout << maybe_prime << " ";
                primes.push_back( maybe_prime );
                ++cnt;
            } else {
                my_insert( factors, a_prime );
                a_prime = primes.front();
                primes.pop_front();
                a_square_prime = a_prime * a_prime;
            }
            factors.pop_front();
        }
    
        std::cout << std::endl;
        return 0;
    }
    

    Note that this version of the Sieve can compute primes indefinitely.

    Also note that the STL deque takes O(1) time to perform push_back, pop_front, and random access through subscripting.

    The resize operation takes O(n) time, where n is the number of elements being added. Because of how we use this function, we can treat this as a small constant.

    The body of the while loop in my_insert is executed O(log log n) times, where n equals the variable maybe_prime. This is because the condition expression of the while will evaluate to true once for each prime factor of maybe_prime. See "Divisor function" on Wikipedia.

    Multiplying by the number of times my_insert is called shows that it should take O(n log log n) time to list n primes, which is, unsurprisingly, the time complexity the Sieve of Eratosthenes is supposed to have.

    However, while this code is efficient, it is not the most efficient. I would strongly suggest using a specialized library for prime generation, such as primesieve. Any truly efficient, well-optimized solution will take more code than anyone wants to type into Stack Overflow.

  • 2020-11-29 19:32

    @Matt: log(log(10000)) is ~2

    From the wikipedia article (which you cited) Sieve of Atkin:

    This sieve computes primes up to N using O(N/log log N) operations with only N^(1/2+o(1)) bits of memory. That is a little better than the sieve of Eratosthenes which uses O(N) operations and O(N^(1/2) (log log N)/log N) bits of memory (A.O.L. Atkin, D.J. Bernstein, 2004). These asymptotic computational complexities include simple optimizations, such as wheel factorization, and splitting the computation to smaller blocks.

    Given asymptotic computational complexities along O(N) (for Eratosthenes) and O(N/log(log(N))) (for Atkin) we can't say (for small N=10_000) which algorithm if implemented will be faster.

    Achim Flammenkamp wrote in The Sieve of Eratosthenes (cited by @num1):

    For intervals larger about 10^9, surely for those > 10^10, the Sieve of Eratosthenes is outperformed by the Sieve of Atkins and Bernstein which uses irreducible binary quadratic forms. See their paper for background informations as well as paragraph 5 of W. Galway's Ph.D. thesis.

    Therefore, for 10_000 the Sieve of Eratosthenes can be faster than the Sieve of Atkin.

    To answer the OP: the code is prime_sieve.c (cited by num1).
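
    The linked prime_sieve.c is not reproduced in this answer, so here is a minimal sketch of a plain Sieve of Eratosthenes that returns the first n primes. It is not that file. The function name first_n_primes is made up for illustration, and the sieve limit relies on the known bound p_n < n (ln n + ln ln n) for n >= 6, with a small fixed limit below that.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from prime_sieve.c): first n primes via a plain sieve.
std::vector<int> first_n_primes(std::size_t n)
{
    assert(n >= 1);

    // Upper bound for the n-th prime: n (ln n + ln ln n) holds for n >= 6;
    // for smaller n a fixed limit of 13 is enough (the 5th prime is 11).
    double x = static_cast<double>(n);
    std::size_t limit = n < 6
        ? 13
        : static_cast<std::size_t>(x * (std::log(x) + std::log(std::log(x)))) + 1;

    std::vector<bool> composite(limit + 1, false);
    std::vector<int> primes;

    for (std::size_t i = 2; i <= limit && primes.size() < n; ++i) {
        if (composite[i])
            continue;
        primes.push_back(static_cast<int>(i));
        // cross off multiples, starting at i * i as in the classic sieve
        for (std::size_t j = i * i; j <= limit; j += i)
            composite[j] = true;
    }
    return primes;
}
```

    For n = 10_000 the bound works out to roughly 114,000 sieve cells, which is small enough that constant factors rather than asymptotics decide the running time.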

  • 2020-11-29 19:32

    The deque sieve algorithm mentioned by BenGoldberg deserves a closer look, not only because it is very elegant but also because it can occasionally be useful in practice (unlike the Sieve of Atkin, which is a purely academic exercise).

    The basic idea behind the deque sieve algorithm is to use a small, sliding sieve that is only large enough to contain at least one separate multiple for each of the currently 'active' prime factors - i.e. those primes whose square does not exceed the lowest number currently represented by the moving sieve. Another difference to the SoE is that the deque sieve stores the actual factors into the slots of composites, not booleans.

    The algorithm extends the size of the sieve window as needed, resulting in fairly even performance over a wide range until the sieve starts exceeding the capacity of the CPU's L1 cache appreciably. The last prime that fits fully is 25,237,523 (the 1,579,791st prime), which gives a rough ballpark figure for the reasonable operating range of the algorithm.

    The algorithm is fairly simple and robust, and it has even performance over a much wider range than an unsegmented Sieve of Eratosthenes. The latter is a lot faster as long as its sieve fits fully into the cache, i.e. up to 2^16 for an odds-only sieve with byte-sized bools. Then its performance drops more and more, although it always remains significantly faster than the deque despite the handicap (at least in compiled languages like C/C++, Pascal or Java/C#).

    Here is a rendering of the deque sieve algorithm in C#, because I find that language - despite its many flaws - much more practical for prototyping algorithms and experimentation than the supremely cumbersome and pedantic C++. (Sidenote: I'm using the free LINQPad which makes it possible to dive right in, without all the messiness with setting up projects, makefiles, directories or whatnot, and it gives me the same degree of interactivity as a python prompt).

    C# doesn't have an explicit deque type but the plain List<int> works well enough for demonstrating the algorithm.

    Note: this version does not use a deque for the primes, because it simply doesn't make sense to pop off sqrt(n) out of n primes. What good would it be to remove 100 primes and to leave 9900? At least this way all the primes are collected in a neat vector, ready for further processing.

    static List<int> deque_sieve (int n = 10000)
    {
        Trace.Assert(n >= 3);
    
        var primes = new List<int>()  {  2, 3  };
        var sieve = new List<int>()  {  0, 0, 0  };
    
        for (int sieve_base = 5, current_prime_index = 1, current_prime_squared = 9; ; )
        {
            int base_factor = sieve[0];
    
            if (base_factor != 0)
            {
                // the sieve base has a non-trivial factor - put that factor back into circulation
                mark_next_unmarked_multiple(sieve, base_factor);
            }
            else if (sieve_base < current_prime_squared)  // no non-trivial factor -> found a non-composite
            {
                primes.Add(sieve_base);
    
                if (primes.Count == n)
                    return primes;
            }
            else // sieve_base == current_prime_squared
            {
                // bring the current prime into circulation by injecting it into the sieve ...
                mark_next_unmarked_multiple(sieve, primes[current_prime_index]);
    
                // ... and elect a new current prime
                current_prime_squared = square(primes[++current_prime_index]);
            }
    
            // slide the sieve one step forward
            sieve.RemoveAt(0);  sieve_base += 2;
        }
    }
    

    Here are the two helper functions:

    static void mark_next_unmarked_multiple (List<int> sieve, int prime)
    {
        int i = prime, e = sieve.Count;
    
        while (i < e && sieve[i] != 0)
            i += prime;
    
        for ( ; e <= i; ++e)  // no List<>.Resize()...
            sieve.Add(0);
    
        sieve[i] = prime;
    }
    
    static int square (int n)
    {
        return n * n;
    }
    

    Probably the easiest way of understanding the algorithm is to imagine it as a special segmented Sieve of Eratosthenes with a segment size of 1, accompanied by an overflow area where the primes come to rest when they shoot over the end of the segment. Except that the single cell of the segment (a.k.a. sieve[0]) has already been sieved when we get to it, because it got run over while it was part of the overflow area.

    The number that is represented by sieve[0] is held in sieve_base, although sieve_front or window_base would also be good names that allow drawing parallels to Ben's code or to implementations of segmented/windowed sieves.

    If sieve[0] contains a non-zero value then that value is a factor of sieve_base, which can thus be recognised as composite. Since cell 0 is a multiple of that factor it is easy to compute its next hop, which is simply 0 plus that factor. Should that cell be occupied already by another factor then we simply add the factor again, and so on until we find a multiple of the factor where no other factor is currently parked (extending the sieve if needed). This also means that there is no need for storing the current working offsets of the various primes from one segment to the next, as in a normal segmented sieve. Whenever we find a factor in sieve[0], its current working offset is 0.

    The current prime comes into play in the following way. A prime can only become current after its own occurrence in the stream (i.e. when it has been detected as a prime, because not marked with a factor), and it will remain current until the exact moment that sieve[0] reaches its square. All lower multiples of this prime must have been struck off due to the activities of smaller primes, just like in a normal SoE. But none of the smaller primes can strike off the square, since the only factor of the square is the prime itself and it is not yet in circulation at this point. That explains the actions taken by the algorithm in the case sieve_base == current_prime_squared (which implies sieve[0] == 0, by the way).

    Now the case sieve[0] == 0 && sieve_base < current_prime_squared is easily explained: it means that sieve_base cannot be a multiple of any of the primes smaller than the current prime, or else it would have been marked as composite. It cannot be a higher multiple of the current prime either, since its value is less than the current prime's square. Hence it must be a new prime.
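
    For readers who want to compare against Ben's code, here is a sketch of how the C# function above might look in C++ with std::deque<unsigned>. This is a straight port written for illustration, not the code that was actually benchmarked; the names mirror the C# version.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Park `prime` on its next multiple that no other factor occupies,
// growing the window when the hop lands past its end.
static void mark_next_unmarked_multiple(std::deque<unsigned>& sieve, unsigned prime)
{
    std::size_t i = prime;

    while (i < sieve.size() && sieve[i] != 0)
        i += prime;

    if (i >= sieve.size())
        sieve.resize(i + 1, 0);

    sieve[i] = prime;
}

// Returns the first n primes (n >= 3, as in the C# version).
std::vector<unsigned> deque_sieve(std::size_t n)
{
    assert(n >= 3);

    std::vector<unsigned> primes{2, 3};
    std::deque<unsigned> sieve(3, 0);   // cells stand for 5, 7, 9 (odd numbers only)
    unsigned sieve_base = 5;
    std::size_t current_prime_index = 1;
    unsigned current_prime_squared = 9;

    for (;;) {
        unsigned base_factor = sieve.front();

        if (base_factor != 0) {
            // composite: put its factor back into circulation
            mark_next_unmarked_multiple(sieve, base_factor);
        } else if (sieve_base < current_prime_squared) {
            // no factor parked here and below the current square: a new prime
            primes.push_back(sieve_base);
            if (primes.size() == n)
                return primes;
        } else {
            // sieve_base == current_prime_squared: inject the current prime ...
            mark_next_unmarked_multiple(sieve, primes[current_prime_index]);
            // ... and elect a new current prime
            ++current_prime_index;
            current_prime_squared = primes[current_prime_index] * primes[current_prime_index];
        }

        // slide the window one odd number forward
        sieve.pop_front();
        sieve_base += 2;
    }
}
```

    Because the window holds odd numbers only, index i stands for sieve_base + 2 * i, so a hop of `prime` cells lands on the next odd multiple of that prime.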

    The algorithm is obviously inspired by the Sieve of Eratosthenes, but equally obviously it is very different. The Sieve of Eratosthenes derives its superior speed from the simplicity of its elementary operations: one single index addition and one store for each step of the operation is all that it does for long stretches of time.

    Here is a simple, unsegmented Sieve of Eratosthenes that I normally use for sieving factor primes in the ushort range, i.e. up to 2^16. For this post I've modified it to work beyond 2^16 by substituting int for ushort:

    static List<int> small_odd_primes_up_to (int n)
    {
        var result = new List<int>();
    
        if (n < 3)
            return result;
    
        int sqrt_n_halved = (int)(Math.Sqrt(n) - 1) >> 1, max_bit = (n - 1) >> 1;
        var odd_composite = new bool[max_bit + 1];
    
        for (int i = 3 >> 1; i <= sqrt_n_halved; ++i)
            if (!odd_composite[i])
                for (int p = (i << 1) + 1, j = p * p >> 1; j <= max_bit; j += p)
                    odd_composite[j] = true;
    
        result.Add(3);  // needs to be handled separately because of the mod 3 wheel
    
        // read out the sieved primes
        for (int i = 5 >> 1, d = 1; i <= max_bit; i += d, d ^= 3)
            if (!odd_composite[i])
                result.Add((i << 1) + 1);
    
        return result;
    }
    

    When sieving the first 10000 primes a typical L1 cache of 32 KiByte will be exceeded but the function is still very fast (fraction of a millisecond even in C#).
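
    The read-out loop in small_odd_primes_up_to uses the step pattern i += d, d ^= 3 to skip multiples of 3. The following sketch isolates just that stepping so it can be inspected on its own; the function name is hypothetical, and the index-to-number mapping (index i stands for 2 * i + 1) is taken from the code above.

```cpp
#include <vector>

// Hypothetical helper: lists the numbers the read-out loop visits up to n,
// i.e. the odd numbers from 5 upwards that are not multiples of 3.
std::vector<int> mod3_wheel_candidates(int n)
{
    std::vector<int> visited;
    int max_bit = (n - 1) >> 1;   // highest odds-only index, as in the sieve

    // d alternates 1, 2, 1, 2, ... so the numbers advance by 2, 4, 2, 4, ...
    for (int i = 5 >> 1, d = 1; i <= max_bit; i += d, d ^= 3)
        visited.push_back((i << 1) + 1);

    return visited;
}
```

    For n = 30 this visits 5, 7, 11, 13, 17, 19, 23, 25, 29; the composite 25 is still visited because the wheel only removes multiples of 3, and it is the sieve array that rejects it.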

    If you compare this code to the deque sieve then it is easy to see that the operations of the deque sieve are a lot more complicated, and it cannot effectively amortise its overhead because it always does the shortest possible stretch of crossings-off in a row (exactly one single crossing-off, after skipping all multiples that have been crossed off already).

    Note: the C# code uses int instead of uint because newer compilers have a habit of generating substandard code for uint, probably in order to push people towards signed integers... In the C++ version of the code above I used unsigned throughout, naturally; the benchmark had to be in C++ because I wanted it to be based on a supposedly adequate deque type (std::deque<unsigned>; there was no performance gain from using unsigned short). Here are the numbers for my Haswell laptop (VC++ 2015/x64):

    deque vs simple: 1.802 ms vs 0.182 ms
    deque vs simple: 1.836 ms vs 0.170 ms 
    deque vs simple: 1.729 ms vs 0.173 ms
    

    Note: the C# times are pretty much exactly double the C++ timings, which is pretty good for C# and it shows that List<int> is no slouch even if abused as a deque.

    The simple sieve code still blows the deque out of the water, even though it is already operating beyond its normal working range (L1 cache size exceeded by 50%, with attendant cache thrashing). The dominating part here is the reading out of the sieved primes, and this is not affected much by the cache problem. In any case the function was designed for sieving the factors of factors, i.e. level 0 in a 3-level sieve hierarchy, and typically it has to return only a few hundred factors or a low number of thousands. Hence its simplicity.

    Performance could be improved by more than an order of magnitude by using a segmented sieve and optimising the code for extracting the sieved primes (stepped mod 3 and unrolled twice, or mod 15 and unrolled once), and yet more performance could be squeezed out of the code by using a mod 16 or mod 30 wheel with all the trimmings (i.e. full unrolling for all residues). Something like that is explained in my answer to Find prime positioned prime number over on Code Review, where a similar problem was discussed. But it's hard to see the point in improving sub-millisecond times for a one-off task...
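
    To make the term concrete, here is a bare-bones sketch of a segmented sieve in C++. It has none of the wheel or unrolling optimisations mentioned above and is not the code from the Code Review answer; the function name and the default segment size of 2^15 cells are assumptions chosen so that one segment fits a typical L1 cache.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: all primes <= limit, sieved one segment at a time.
std::vector<std::uint32_t> segmented_sieve(std::uint32_t limit,
                                           std::uint32_t segment_size = 1u << 15)
{
    std::vector<std::uint32_t> primes;
    if (limit < 2)
        return primes;

    // Integer square root of limit, corrected for floating-point rounding.
    std::uint32_t root = static_cast<std::uint32_t>(std::sqrt(static_cast<double>(limit)));
    while (static_cast<std::uint64_t>(root) * root > limit) --root;
    while (static_cast<std::uint64_t>(root + 1) * (root + 1) <= limit) ++root;

    // Step 1: base primes up to sqrt(limit) with a plain sieve.
    std::vector<char> small_composite(root + 1, 0);
    std::vector<std::uint32_t> base_primes;
    for (std::uint32_t i = 2; i <= root; ++i) {
        if (small_composite[i])
            continue;
        base_primes.push_back(i);
        for (std::uint64_t j = static_cast<std::uint64_t>(i) * i; j <= root; j += i)
            small_composite[j] = 1;
    }

    // Step 2: sieve [low, high] segment by segment, reusing one small buffer.
    std::vector<char> composite(segment_size);
    for (std::uint64_t low = 2; low <= limit; low += segment_size) {
        std::uint64_t high = std::min<std::uint64_t>(low + segment_size - 1, limit);
        std::fill(composite.begin(), composite.end(), 0);

        for (std::uint32_t p : base_primes) {
            // first multiple of p inside the segment, but never below p * p
            std::uint64_t start = std::max<std::uint64_t>(
                static_cast<std::uint64_t>(p) * p, (low + p - 1) / p * p);
            for (std::uint64_t j = start; j <= high; j += p)
                composite[j - low] = 1;
        }

        for (std::uint64_t v = low; v <= high; ++v)
            if (!composite[v - low])
                primes.push_back(static_cast<std::uint32_t>(v));
    }
    return primes;
}
```

    The point of the design is that the working buffer stays the same size no matter how large the limit is, so the inner crossing-off loop keeps running out of cache.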

    To put things a bit into perspective, here are the C++ timings for sieving up to 100,000,000:

    deque vs simple: 1895.521 ms vs 432.763 ms
    deque vs simple: 1847.594 ms vs 429.766 ms
    deque vs simple: 1859.462 ms vs 430.625 ms
    

    By contrast, a segmented sieve in C# with a few bells and whistles does the same job in 95 ms (no C++ timings available, since I do code challenges only in C# at the moment).

    Things may look decidedly different in an interpreted language like Python where every operation has a heavy cost and the interpreter overhead dwarfs all differences due to predicted vs. mispredicted branches or sub-cycle ops (shift, addition) vs. multi-cycle ops (multiplication, and perhaps even division). That is bound to erode the simplicity advantage of the Sieve of Eratosthenes, and this could make the deque solution a bit more attractive.

    Also, many of the timings reported by other respondents in this topic are probably dominated by output time. That's an entirely different war, where my main weapon is a simple class like this:

    class CCWriter
    {
        const int SPACE_RESERVE = 11;  // UInt31 + '\n'
    
        public static System.IO.Stream BaseStream;
        static byte[] m_buffer = new byte[1 << 16];  // need 55k..60k for a maximum-size range
        static int m_write_pos = 0;
        public static long BytesWritten = 0;         // for statistics
    
        internal static ushort[] m_double_digit_lookup = create_double_digit_lookup();
    
        internal static ushort[] create_double_digit_lookup ()
        {
            var lookup = new ushort[100];
    
            for (int lo = 0; lo < 10; ++lo)
                for (int hi = 0; hi < 10; ++hi)
                    lookup[hi * 10 + lo] = (ushort)(0x3030 + (hi << 8) + lo);
    
            return lookup;
        }
    
        public static void Flush ()
        {
            if (BaseStream != null && m_write_pos > 0)
                BaseStream.Write(m_buffer, 0, m_write_pos);
    
            BytesWritten += m_write_pos;
            m_write_pos = 0;
        }
    
        public static void WriteLine ()
        {
            if (m_buffer.Length - m_write_pos < 1)
                Flush();
    
            m_buffer[m_write_pos++] = (byte)'\n';
        }
    
        public static void WriteLinesSorted (int[] values, int count)
        {
            int digits = 1, max_value = 9;
    
            for (int i = 0; i < count; ++i)
            {
                int x = values[i];
    
                if (m_buffer.Length - m_write_pos < SPACE_RESERVE)
                    Flush();
    
                while (x > max_value)
                    if (++digits < 10)
                        max_value = max_value * 10 + 9;
                    else
                        max_value = int.MaxValue;               
    
                int n = x, p = m_write_pos + digits, e = p + 1;
    
                m_buffer[p] = (byte)'\n';
    
                while (n >= 10)
                {
                    int q = n / 100, w = m_double_digit_lookup[n - q * 100];
                    n = q;
                    m_buffer[--p] = (byte)w;
                    m_buffer[--p] = (byte)(w >> 8);
                }
    
                if (n != 0 || x == 0)
                    m_buffer[--p] = (byte)((byte)'0' + n);
    
                m_write_pos = e;
            }
        }
    }
    

    That takes less than 1 ms for writing 10000 (sorted) numbers. It's a static class because it is intended for textual inclusion in coding challenge submissions, with a minimum of fuss and zero overhead.

    In general I found it to be much faster if focussed work is done on entire batches, meaning sieve a certain range, then extract all primes into a vector/array, then blast out the whole array, then sieve the next range and so on, instead of mingling everything together. Having separate functions focussed on specific tasks also makes it easier to mix and match, it enables reuse, and it eases development/testing.
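
    The same batching idea can be sketched in C++: format a whole array of numbers into one buffer using a two-digit lookup table, then write the buffer in a single call. This is an illustration in the spirit of CCWriter, not a port of it; digit_pairs, append_line and format_lines are made-up names.

```cpp
#include <string>
#include <vector>

// "00", "01", ..., "99" laid out back to back (200 characters), built once.
static const std::string& digit_pairs()
{
    static const std::string table = [] {
        std::string t;
        for (int i = 0; i < 100; ++i) {
            t += static_cast<char>('0' + i / 10);
            t += static_cast<char>('0' + i % 10);
        }
        return t;
    }();
    return table;
}

// Appends the decimal form of `value` plus '\n' to `out`, two digits per step.
void append_line(std::string& out, unsigned value)
{
    char tmp[11];            // up to 10 digits for a 32-bit value, plus '\n'
    int pos = 11;
    const std::string& pairs = digit_pairs();

    tmp[--pos] = '\n';
    while (value >= 100) {
        unsigned q = value / 100, r = value - q * 100;
        tmp[--pos] = pairs[2 * r + 1];
        tmp[--pos] = pairs[2 * r];
        value = q;
    }
    if (value >= 10) {
        tmp[--pos] = pairs[2 * value + 1];
        tmp[--pos] = pairs[2 * value];
    } else {
        tmp[--pos] = static_cast<char>('0' + value);
    }
    out.append(tmp + pos, 11 - pos);
}

// Formats a whole batch; the caller can then emit it with one write call.
std::string format_lines(const std::vector<unsigned>& values)
{
    std::string out;
    out.reserve(values.size() * 8);
    for (unsigned v : values)
        append_line(out, v);
    return out;
}
```

    A caller would typically sieve a range, pass the resulting vector to format_lines, and hand the string to something like std::fwrite in one go.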

  • 2020-11-29 19:32

    Here is my code, which finds the first 10,000 primes in 0.049655 sec on my laptop, the first 1,000,000 primes in under 6 seconds, and the first 2,000,000 in 15 seconds.
    A little explanation: this method uses two techniques to find prime numbers.

    1. First, any non-prime number is a product of prime numbers, so this code tests a candidate by dividing it only by smaller prime numbers instead of by every smaller number. This decreases the calculation by at least 10 times for a 4-digit number and even more for bigger numbers.
    2. Second, it divides only by primes that are smaller than or equal to the square root of the number being tested, which reduces the calculations greatly. This works because any divisor greater than the root has a counterpart divisor smaller than the root, and all of those have already been tested, so there is no need to bother with divisors greater than the root.
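
    Stripped of the input handling and timing, the two techniques above can be sketched in a few lines of C++. This is a condensed illustration rather than a translation of the C program in this answer, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: first `count` primes by trial division.
std::vector<int> first_primes_by_trial_division(std::size_t count)
{
    std::vector<int> primes;

    for (int n = 2; primes.size() < count; ++n) {
        bool is_prime = true;

        for (int p : primes) {                 // technique 1: divide by primes only
            if (static_cast<long long>(p) * p > n)
                break;                         // technique 2: stop at sqrt(n)
            if (n % p == 0) {
                is_prime = false;
                break;
            }
        }

        if (is_prime)
            primes.push_back(n);
    }
    return primes;
}
```

    Comparing p * p with n avoids calling sqrt and therefore the dependency on the math library.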

    Sample output for first 10,000 prime number
    https://drive.google.com/open?id=0B2QYXBiLI-lZMUpCNFhZeUphck0 https://drive.google.com/open?id=0B2QYXBiLI-lZbmRtTkZETnp6Ykk

    Here is the code in C. Enter 1 and then 10,000 to print out the first 10,000 primes.
    Edit: I forgot that this uses the math library. On Windows or Visual Studio that should be fine, but on Linux you must compile the code with the -lm argument or the code may not link.
    Example: gcc -Wall -o "%e" "%f" -lm

    #include <stdio.h>
    #include <math.h>
    #include <time.h>
    #include <limits.h>
    
    /* Finding prime numbers */
    int main()
    {   
        //pre-phase
        char d;
        int w; /* int rather than char, so that it can hold EOF from getchar() */
        int l,o;
        printf("  1. Find first n number of prime numbers or Find all prime numbers smaller than n ?\n"); // this question helps in setting the limits on m or n value i.e l or o 
        printf("     Enter 1 or 2 to get the answer to the first or second question\n");
        // decision making
        do
        {
            printf("  -->");
            scanf("%c",&d);
            while ((w=getchar()) != '\n' && w != EOF);
            if ( d == '1')
            {
                printf("\n  2. Enter the target no. of primes you will like to find from 3 to 2,000,000 range\n  -->");
                scanf("%10d",&l);
                o=INT_MAX;
                printf("  Here we go!\n\n");
                break;
            }
            else if ( d == '2' )
            {
                printf("\n  2.Enter the limit under which to find prime numbers from 5 to 2,000,000 range\n  -->");
                scanf("%10d",&o);
                l=o/log(o)*1.25;
                printf("  Here we go!\n\n");
                break;
            }
            else printf("\n  Try again\n");
        } while ( d != '1' && d != '2' ); /* repeat until a valid choice is entered */
    
        clock_t start, end;
        double cpu_time_used;
        start = clock(); /* starting the clock for time keeping */
    
        // main program starts here
        int i,j,c,m,n; /* i ,j , c and m are all prime array 'p' variables and n is the number that is being tested */
        int s,x;
    
        int p[ l + 1 ]; /* p is the array for storing prime numbers; indices 1..l are used, so it needs l + 1 slots. l was initialized in pre-phase */
        p[1]=2;
        p[2]=3;
        p[3]=5;
        printf("%10dst:%10d\n%10dnd:%10d\n%10drd:%10d\n",1,p[1],2,p[2],3,p[3]); // first three prime are set
        for ( i=4;i<=l;++i ) /* this loop sets all the prime numbers greater than 5 in the p array to 0 */
            p[i]=0;
    
        n=6; /* prime number testing begins with number 6; this can be lowered if you wish, but you must remember to update the other variables too */
        s=sqrt(n); /* 's' does two things: it stores the root value so that the program does not have to calculate it again and again, and it stores it in integer form instead of float */
        x=2; /* 'x' is the biggest prime number that is smaller or equal to root of the number 'n' being tested */
    
        /* j ,x and c are related in this way, p[j] <= prime number x <= p[c] */
    
        // the main loop begins here
        for ( m=4,j=1,c=2; m<=l && n <= o;)
        /* this condition checks if all the first 'l' numbers of primes are found or n does not exceed the set limit o */
        {
                // this will divide n by prime number in p[j] and tries to rule out non-primes
                if ( n%p[j]==0 )
                {
                    /* these steps execute if the number n is found to be non-prime */
    
                    ++n; /* this increases n by 1 and therefore sets the next number 'n' to be tested */
                    s=sqrt(n); /* this calculates and stores in 's' the new root of number 'n' */
                    if ( p[c] <= s && p[c] != x ) /* 'The Magic Setting' tests the next prime number candidate p[c] and if passed it updates the prime number x */
                    {
                        x=p[c];
                        ++c;
                    }
                    j=1;
                    /* these steps sets the next number n to be tested and finds the next prime number x if possible for the new number 'n' and also resets j to 1 for the new cycle */
                    continue; /* and this restarts the loop for the new cycle */
                }
                // confirmation test for the prime number candidate n
                else if ( n%p[j]!=0 && p[j]==x )
                {
                    /* these steps execute if the number is found to be prime */
                    p[m]=n;
                    printf("%10dth:%10d\n",m,p[m]);
                    ++n;
                    s = sqrt(n);
                    ++m;
                    j=1;
                    /* these steps stores and prints the new prime number and moves the 'm' counter up and also sets the next number n to be tested and also resets j to 1 for the new cycle */
                    continue; /* and this restarts the loop */
                    /* the next number, which will be even and non-prime, will trigger the magic setting in the next cycle and therefore we do not have to add another magic setting here */
                }
                ++j; /* increases p[j] to next prime number in the array for the next cycle testing of the number 'n' */
                // if the cycle reaches this point, the number 'n' was not divisible by p[j] but has not yet been confirmed as prime
                // and therefore the same number 'n' is tested again in the next cycle with a bigger prime number
        }
        // the loops ends
        printf("  All done !!\n");
        end = clock();
        cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
        printf("  Time taken : %lf sec\n",cpu_time_used);
    }
    
  • 2020-11-29 19:32
    using System;
    
    namespace ConsoleApplication2
    {
        class Program
        {
            static void Main(string[] args)
            {
                int n, i = 3, j, c;
                Console.WriteLine("Please enter your integer: ");
                n = Convert.ToInt32(Console.ReadLine());
                if (n >= 1)
                {
                    Console.WriteLine("First " + n + " Prime Numbers are");
                    Console.WriteLine("2");
                }
                for(j=2;j<=n;)
                {
                    for(c=2;c<=i-1;c++)
                    {
                        if(i%c==0)
                            break;
                    }
                    if(c==i)
                    {
                        Console.WriteLine(i);
                        j++;
                    }
                    i++;
                }
                Console.Read();
            }
        }
    }
    
  • 2020-11-29 19:33

    Not efficient at all, but you can use a regular expression to test for prime numbers.

    /^1?$|^(11+?)\1+$/
    

    This tests whether, for a string consisting of k “1”s, k is not prime (i.e. whether the string consists of zero or one “1”, or of any number of “1”s that can be expressed as an n-ary product).
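
    As a sketch of how the pattern could be driven from C++, the following uses std::regex with its default ECMAScript grammar, which supports the lazy quantifier and the backreference. The function name is hypothetical, and this is only sensible for small k because the matching backtracks heavily.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: k is prime exactly when a string of k '1's
// does NOT match the "not prime" pattern from this answer.
bool is_prime_by_regex(unsigned k)
{
    static const std::regex not_prime("^1?$|^(11+?)\\1+$");
    return !std::regex_match(std::string(k, '1'), not_prime);
}
```

    The first alternative rejects 0 and 1; the second captures a group of two or more “1”s and requires it to repeat at least once more, which succeeds exactly when k has a divisor between 2 and k - 1.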
