Question
I was trying to solve a problem on SPOJ. We are required to calculate the nth twin prime pair (primes differing by 2), where n can be as large as 10^5. I tried precalculating with a sieve; I had to sieve up to 10^8 to reach the largest required twin prime pair, but the time limit is strict (2 s) and it times out. I noticed that people have solved it in 0.00 seconds, so I looked around on Google for a formula and couldn't find anything helpful. Could someone please guide me?
Thanks in advance!!
Answer 1:
So basically, sieving up to 20,000,000 is enough, according to Wolfram Alpha. Use a plain sieve of Eratosthenes on the odds, with vector<bool> in C++ (what language were you using, BTW?).
Track the twin primes right inside the sieve loop. Store the lower prime of each pair in a separate vector as you find the twins, and if an out-of-order index (smaller than the previous one) is requested (and such indices are requested, contrary to the examples shown on the description page), just get the prime from this storage:
size_t n = 10000000, itop=2236;
vector<bool> s;
vector<int> twins;
s.resize(n, true);
int cnt, k1, k2, p1=3, p2, k=0;
cin >> cnt;
if( cnt-- > 0 )
{
    cin >> k1;
    for( size_t i=1; i < n; ++i ) // p=2i+1
    {
        if( s[i] )
        {
            p2 = 2*i+1;
            if( p2-p1 == 2 ) { ++k; twins.push_back(p1); }
            if( k==k1 )
            {
                cout << p1 << " " << p2 << endl;
                ......
etc. Got accepted with 1.05 sec (0.18 sec on Ideone). Or untangle the logic: just pre-calculate 100,000 twin prime pairs right away, and access them in a separate loop afterwards (0.94 sec).
Answer 2:
Out of curiosity, I solved the problem using two variants of a Sieve of Eratosthenes. The first variant completed on the testing machine in 0.93s and the second in 0.24s. For comparison, on my computer, the first finished in 0.08s and the second in 0.04s.
The first was a standard sieve on the odd numbers, the second a slightly more elaborate sieve omitting also the multiples of 3 in addition to the even numbers.
The testing machines of SPOJ are old and slow, so a programme runs much longer on them than on a typical recent box; and they have small caches, therefore it is important to keep the computation small.
Doing that, a Sieve of Eratosthenes is easily fast enough. However, it is really important to keep memory usage small. The first variant, using one byte per number, gave "Time limit exceeded" on SPOJ, but ran in 0.12s on my box. So, given the characteristics of the SPOJ testing machines, use a bit-sieve to solve it in the given time.
On the SPOJ machine, I got a significant speedup (running time 0.14s) by further reducing the space of the sieve by half. Since - except for the first pair (3,5) - all prime twins have the form (6*k-1, 6*k+1), and you need not know which of the two numbers is composite if k doesn't give rise to a twin prime pair, it is sufficient to sieve only the indices k.
(6*k + 1 is divisible by 5 if and only if k = 5*m + 4 for some m, and 6*k - 1 is divisible by 5 if and only if k = 5*m+1 for some m, so 5 would mark off 5*m ± 1, m >= 1 as not giving rise to twin primes. Similarly, 6*k+1 is divisible by 13 if and only if k = 13*m + 2 for some m and 6*k - 1 if and only if k = 13*m - 2 for some m, so 13 would mark off 13*m ± 2.)
This doesn't change the number of markings, so with a sufficiently large cache, the change in running time is small, but for small caches, it's a significant speedup.
One more thing, though. Your limit of 10^8 is way too high. I used a lower limit (20 million) that doesn't overestimate the 100,000th twin prime pair by so much. With a limit of 10^8, the first variant would certainly not have finished in time, the second probably not.
With the reduced limit, a Sieve of Atkin needs to be somewhat optimised to beat the Eratosthenes variant omitting even numbers and multiples of 3; a naive implementation will be significantly slower.
Some remarks concerning your (Wikipedia's pseudocode) Atkin sieve:
#define limit 100000000
int prime1[MAXN];
int prime2[MAXN];
You don't need the second array; the larger partner of a prime twin pair can easily be computed from the smaller. You're wasting space and destroying cache locality by reading from two arrays. (That's minor compared to the time needed for sieving, though.)
int root = ceil(sqrt(limit));
bool sieve[limit];
On many operating systems nowadays, that is an instant segfault, even with a reduced limit. The stack size is often limited to 8MB or less. Arrays of that size should be allocated on the heap.
As mentioned above, using one bool per number makes the programme run far slower than necessary. You should use a std::bitset or std::vector<bool> or twiddle the bits yourself. Also it is advisable to omit at least the even numbers.
for (int x = 1; x <= root; x++)
{
    for (int y = 1; y <= root; y++)
    {
        //Main part of Sieve of Atkin
        int n = (4*x*x)+(y*y);
        if (n <= limit && (n % 12 == 1 || n % 12 == 5)) sieve[n] ^= true;
        n = (3*x*x)+(y*y);
        if (n <= limit && n % 12 == 7) sieve[n] ^= true;
        n = (3*x*x)-(y*y);
        if (x > y && n <= limit && n % 12 == 11) sieve[n] ^= true;
    }
}
This is horribly inefficient. It tries far too many x-y combinations; for each combination it does three or four divisions to check the remainder modulo 12, and it hops back and forth in the array.
Separate the different quadratics.
For 4*x^2 + y^2, it is evident that you need only consider x < sqrt(limit)/2 and odd y. Then the remainder modulo 12 is 1, 5, or 9. If the remainder is 9, then 4*x^2 + y^2 is actually a multiple of 9, so such a number would be eliminated as not square-free. However, it is preferable to omit the multiples of 3 from the sieve altogether and treat the cases n % 12 == 1 and n % 12 == 5 separately.
For 3*x^2 + y^2, it is evident that you need only consider x < sqrt(limit/3) and a little bit of thought reveals that x must be odd and y even (and not divisible by 3).
For 3*x^2 - y^2 with y < x, it is evident that you need only consider y < sqrt(limit/2). Looking at the remainders modulo 12, you see that y mustn't be divisible by 3 and x and y must have different parity.
Answer 3:
I got AC in 0.66s. Since there are solutions with 0.0s, I assume better optimizations are possible; however, I describe my approach here.
I used one basic optimization in the Sieve of Eratosthenes. Since 2 is the only even prime, you can skip the even numbers, which halves the computation time and memory needed for calculating primes.
Secondly, apart from the pair 3 5, twin primes are not multiples of 2 or 3 (as they are primes!). So those numbers will be of the form 6N+1 and 6N+5 (the rest will not be primes for sure). 6N+5 = 6N+6-1 = 6(N+1)-1. So it can be seen that only 6N-1 and 6N+1 can possibly be twin primes for N >= 1. So, you precompute all these values using the primes that you have calculated before. (The trivial case is 3 5.)
Note: You don't need to calculate primes up to 10^8; the upper limit is much lower. [Edit: I can share my code if you want, but it would be better if you come up with a solution on your own. :)]
Answer 4:
A description of an efficient algorithm to solve this can be found in a Programming Praxis entry. Scheme and Perl sample code are also provided there.
Answer 5:
I precomputed a large list of primes using the Sieve of Eratosthenes, then iterated through the list counting items that were 2 less than their successor until finding n of them. Runs in 1.42 seconds at http://ideone.com/vYjuC. I too would like to know how to compute the answer in zero seconds.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Bit array helpers: bit i lives in byte i>>3. */
#define ISBITSET(x, i) (( x[i>>3] & (1<<(i&7)) ) != 0)
#define SETBIT(x, i) x[i>>3] |= (1<<(i&7));
#define CLEARBIT(x, i) x[i>>3] &= (1<<(i&7)) ^ 0xFF;

typedef struct list {
    int data;
    struct list *next;
} List;

/* Prepend data to the list. */
List *insert(int data, List *next)
{
    List *new;
    new = malloc(sizeof(List));
    new->data = data;
    new->next = next;
    return new;
}

/* Reverse the list in place. */
List *reverse(List *list) {
    List *new = NULL;
    List *next;
    while (list != NULL)
    {
        next = list->next;
        list->next = new;
        new = list;
        list = next;
    }
    return new;
}

int length(List *xs)
{
    int len = 0;
    while (xs != NULL)
    {
        len += 1;
        xs = xs->next;
    }
    return len;
}

/* Primes below n in ascending order; bit i of b stands for the odd number 2*i + 3. */
List *primes(int n)
{
    int m = (n-1) / 2;
    char b[m/8+1];
    int i = 0;
    int p = 3;
    List *ps = NULL;
    int j;
    ps = insert(2, ps);
    memset(b, 255, sizeof(b));
    while (p*p < n)
    {
        if (ISBITSET(b,i))
        {
            ps = insert(p, ps);
            j = (p*p - 3) / 2;
            while (j < m)
            {
                CLEARBIT(b, j);
                j += p;
            }
        }
        i += 1; p += 2;
    }
    /* No more sieving needed: collect the remaining primes. */
    while (i < m)
    {
        if (ISBITSET(b,i))
        {
            ps = insert(p, ps);
        }
        i += 1; p += 2;
    }
    return reverse(ps);
}

/* Returns the number between the two primes of the nth twin pair, or 0 if not found. */
int nth_twin(int n, List *ps)
{
    while (ps->next != NULL)
    {
        if (n == 0)
        {
            return ps->data - 1;
        }
        if (ps->next->data - ps->data == 2)
        {
            --n;
        }
        ps = ps->next;
    }
    return 0;
}

int main(int argc, char *argv[])
{
    List *ps = primes(100000000);
    printf("%d\n", nth_twin(100000, ps));
    return 0;
}
Answer 6:
This is what I have attempted. I got a string of TLEs.
#include <cstdio>
#include <cstring>
#include <vector>
using namespace std;

// N, the sieve limit, must be defined before this point (its value is not shown here).
bool mark [N];
vector <int> primeList;

void sieve ()
{
    memset (mark, true, sizeof (mark));
    mark [0] = mark [1] = false;
    for ( int i = 4; i < N; i += 2 )
        mark [i] = false;
    for ( int i = 3; i * i <= N; i++ )
    {
        if ( mark [i] )
        {
            for ( int j = i * i; j < N; j += 2 * i )
                mark [j] = false;
        }
    }
    primeList.clear ();
    primeList.push_back (2);
    for ( int i = 3; i < N; i += 2 )
    {
        if ( mark [i] )
            primeList.push_back (i);
    }
    //printf ("%d\n", primeList.size ());
}

int main ()
{
    sieve ();
    // Lower member of each twin pair, in ascending order.
    vector <int> twinPrime;
    for ( size_t i = 1; i < primeList.size (); i++ )
    {
        if ( primeList [i] - primeList [i - 1] == 2 )
            twinPrime.push_back (primeList [i - 1]);
    }
    int t;
    scanf("%d",&t);
    int s;
    while ( t-- )
    {
        scanf("%d",&s);
        printf ("%d %d\n", twinPrime [s - 1], twinPrime [s - 1] + 2);
    }
    return 0;
}
Answer 7:
Here is a procedure that could answer your question:
Prime numbers greater than 3 that, when divided by 3, have equal quotients after rounding to zero decimal places are twin primes.
This can be written as
For any pair of prime numbers Px, Py greater than 3, if [Px/3, 0] = [Py/3, 0] then Px and Py are twin primes.
The basis for this is that twin primes above 3 have the form 3m-1 and 3m+1, so both quotients round to m. Primes that are not separated by 2 will not have equal quotients when rounded to zero decimal places. The rule breaks down for the smallest primes: 2 and 3 both round to 1 although they are not twins, and the twin pair 3, 5 rounds to 1 and 2.
For example:
• 11 and 13, when divided by 3, both yield the quotient 4 when rounded to zero decimal places.
• 17 and 19, when divided by 3, both yield the quotient 6 when rounded to zero decimal places.
• 29 and 31, when divided by 3, both yield the quotient 10 when rounded to zero decimal places.
Etc.
Below is a simple procedure using Excel to:
• Find prime twins from any list of primes
• Find twin primes in any range of primes
• Find the largest twin prime
• Find gaps between twin primes
- Import Kutools into Excel
- List prime numbers of interest into column 1.
- Insert divisor 3 in column 2 - fill down to the level of the largest prime on the list in column 1.
- Divide the first row of column 1 by the first row of column 2 and place the quotient in column 3
- Fill down column 3 to the level of the largest prime number on the list in column 1.
- Round to zero decimal places. Keep the numbers in column 3 (the quotients) selected.
- From "Conditional formatting", select "Duplicate values" from the menu.
- Go to Kutools and select "To actual". This will highlight the cells of all the twin pairs scattered in quotient column 3.
- Select the quotients in column 3
- Select 'Sort and Filter' in Excel
- Select 'Custom Sort'
- Fill in the menu (for values, choose the highlighted color in the quotient column) and click "OK".
- The twin primes will be grouped together in the column. This list can then be used to find the gaps between twin primes.
To find the largest twin prime, use the above procedure with a range of the largest known primes in column 1 (e.g. the highest 10k primes).
If a prime twin is not found in this range, then go to the next lower range until a twin prime is found. This will be the largest twin prime among the primes you listed.
Hope this helps.
Source: https://stackoverflow.com/questions/10143431/finding-the-nth-twin-prime