Why do two algorithms for finding primes differ in speed so much even though they seem to do the same number of iterations?

老子叫甜甜 提交于 2019-11-29 14:22:40

The difference is an algorithmic one. In the first version, trial division, you test each candidate against all small primes - that you don't stop when the small prime exceeds candidate ** 0.5 doesn't matter for the range(10**9 - 10**5 ,10**9) if smallprimes has a good upper bound, but it would if the length of the range were much larger in relation to the upper bound. For composites, that doesn't incur much cost, since most of them have at least one small prime divisor. But for primes, you have to go all the way to N**0.5. There are roughly 10**5/log(10**9) primes in that interval, each of them is trial-divided by about 10**4.5/log(10**4.5) primes, so that makes about 1.47*10**7 trial divisions against primes.

On the other hand, with the sieve, you only cross off composites, each composite is crossed off as many times as it has prime divisors, while primes aren't crossed off at all (so primes are free). The number of prime divisors of n is bounded by (a multiple of) log(n) (that's a crude upper bound, usually greatly overestimating), so that gives an upper bound of 10**5*log(10**9) (times a small constant) crossings-off, about 2*10**6. In addition to that, the crossing off may be less work than a division (don't know about Python, it is for C arrays). So you're doing less work with the sieve.

Edit: collected the actual numbers for 10**9-10**5 to 10**9.

Ticks: 259987
4832
Divisions: 20353799
4832

The sieve does only 259987 crossings-off, you see that the crude upper bound above is overestimating by a large factor. The trial division needs more than 20 million divisions, 16433632 of those for primes (x/log x underestimates the number of primes, for x = 10**4.5 by about 10%), 3435183 are used for the 3297 numbers in that range whose smallest prime factor is larger than n**(1/3).

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!