I am looking for an implementation or clear algorithm for getting the prime factorization of N in either python, pseudocode or any
If you don't want to reinvent the wheel, use the library sympy
pip install sympy
Use the function sympy.ntheory.factorint
>>> from sympy.ntheory import factorint
>>> factorint(10**20+1)
{73: 1, 5964848081: 1, 1676321: 1, 137: 1}
You can factor some very large numbers:
>>> factorint(10**100+1)
{401: 1, 5964848081: 1, 1676321: 1, 1601: 1, 1201: 1, 137: 1, 73: 1, 129694419029057750551385771184564274499075700947656757821537291527196801: 1}
I just ran into a bug in this code when factoring the number 2**1427 * 31.
  File "buckets.py", line 48, in prettyprime
    factors = primefactors.primefactors(n, sort=True)
  File "/private/tmp/primefactors.py", line 83, in primefactors
    limit = int(n ** .5) + 1
OverflowError: long int too large to convert to float
This code snippet:
limit = int(n ** .5) + 1
for checker in smallprimes:
    if checker > limit: break
    while n % checker == 0:
        factors.append(checker)
        n //= checker
        limit = int(n ** .5) + 1
        if checker > limit: break
should be rewritten into
for checker in smallprimes:
    while n % checker == 0:
        factors.append(checker)
        n //= checker
    if checker > n: break
which will likely perform faster on realistic inputs anyway. Square root is slow (basically the equivalent of many multiplications), smallprimes only has a few dozen members, and this way we remove the computation of n ** .5 from the tight inner loop, which certainly helps when factoring numbers like 2**1427. There's simply no reason to compute sqrt(2**1427), sqrt(2**1426), sqrt(2**1425), and so on, when all we care about is "does the [square of the] checker exceed n".
The rewritten code doesn't throw exceptions when presented with big numbers, and is roughly twice as fast according to timeit (on sample inputs 2 and 2**718 * 31).
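To make the point above concrete, here is a minimal sketch of the float-free loop. The names SMALL_PRIMES and strip_small_factors are hypothetical stand-ins (the original module's smallprimes is longer), and math.isqrt assumes Python 3.8 or later; it is shown only as an exact alternative if a square root bound is really wanted.

```python
from math import isqrt

# Hypothetical stand-in for the module's smallprimes tuple.
SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def strip_small_factors(n):
    """Divide out the small primes; return (factors, remaining cofactor)."""
    factors = []
    for p in SMALL_PRIMES:
        while n % p == 0:
            factors.append(p)
            n //= p
        if p > n:  # pure integer comparison, no float conversion
            break
    return factors, n

n = 2**1427 * 31
try:
    int(n ** .5)  # converts n to float first, which overflows
except OverflowError as exc:
    print("float square root failed:", exc)

factors, rest = strip_small_factors(n)
print(len(factors), rest)   # 1428 1

# If a square-root bound is needed, isqrt works on arbitrarily large ints.
root = isqrt(n)
print(root * root <= n < (root + 1) * (root + 1))   # True
```

The loop never touches floating point, so the size of n only affects the cost of the integer divisions.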
Also notice that isprime(2) returns the wrong result, but this is okay as long as we don't rely on it. IMHO you should rewrite the intro of that function as
if n <= 3:
    return n >= 2
...
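To show how that guard handles the small cases, here is a sketch with a simple trial-division body filled in. The body is an assumption made for illustration (the isprime under discussion uses a different test), and the name isprime_sketch is hypothetical.

```python
def isprime_sketch(n):
    # Suggested intro: 2 and 3 are prime, anything below 2 is not.
    if n <= 3:
        return n >= 2
    # Illustrative body only: trial division by 2, 3, then 6k +/- 1.
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

print([n for n in range(-2, 12) if isprime_sketch(n)])   # [2, 3, 5, 7, 11]
```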
You should probably do some prime detection; for that, have a look at "Fast algorithm for finding prime numbers?".
You should read that entire blog though; the author lists a few algorithms for testing primality.
You could factorize by trial division up to a limit, then use brent to get the higher factors:
from math import gcd  # fractions.gcd was removed in Python 3.9
from random import randint
from functools import reduce

def brent(N):
    # Returns a factor of N; the result equals N itself when N is prime.
    if N%2==0: return 2
    y,c,m = randint(1, N-1),randint(1, N-1),randint(1, N-1)
    g,r,q = 1,1,1
    while g==1:
        x = y
        for i in range(r):
            y = ((y*y)%N+c)%N
        k = 0
        while (k<r and g==1):
            ys = y
            for i in range(min(m,r-k)):
                y = ((y*y)%N+c)%N
                q = q*(abs(x-y))%N
            g = gcd(q,N)
            k = k + m
        r = r*2
    if g==N:
        # The batched gcd overshot; retrace one step at a time from ys.
        while True:
            ys = ((ys*ys)%N+c)%N
            g = gcd(abs(x-ys),N)
            if g>1: break
    return g

def factorize(n1):
    if n1==0: return []
    if n1==1: return [1]
    n=n1
    b=[]
    p=0
    mx=1000000
    # Trial division by 2, 3 and then the numbers 6k +/- 1 up to mx.
    while n % 2 ==0 : b.append(2);n//=2
    while n % 3 ==0 : b.append(3);n//=3
    i=5
    inc=2
    while i <=mx:
        while n % i ==0 : b.append(i); n//=i
        i+=inc
        inc=6-inc
    # Whatever is left has only factors above mx; split it with brent.
    while n>mx:
        p1=n
        while p1!=p:
            p=p1
            p1=brent(p)
        b.append(p1);n//=p1
    if n!=1:b.append(n)
    return sorted(b)

#n= 2**1427 * 31 #
n= 67898771 * 492574361 * 10000223 *305175781* 722222227*880949 *908909
li=factorize(n)
print (li)
print (n - reduce(lambda x,y :x*y ,li))
There is no need to calculate smallprimes using primesbelow; use smallprimeset for that.
smallprimes = (2,) + tuple(n for n in xrange(3,1000,2) if n in smallprimeset)
Divide your primefactors into two functions, one handling smallprimes and the other handling pollard_brent; this can save a couple of iterations, as all the powers of smallprimes will already have been divided out of n.
def primefactors(n, sort=False):
    factors = []
    limit = int(n ** .5) + 1
    for checker in smallprimes:
        print smallprimes[-1]
        if checker > limit: break
        while n % checker == 0:
            factors.append(checker)
            n //= checker
    if n < 2: return factors
    else :
        factors.extend(bigfactors(n,sort))
        return factors

def bigfactors(n, sort = False):
    factors = []
    while n > 1:
        if isprime(n):
            factors.append(n)
            break
        factor = pollard_brent(n)
        factors.extend(bigfactors(factor,sort)) # recurse to factor the not necessarily prime factor returned by pollard-brent
        n //= factor
    if sort: factors.sort()
    return factors
By considering the verified results of Pomerance, Selfridge and Wagstaff, and of Jaeschke, you can reduce the number of repetitions in isprime, which uses the Miller-Rabin primality test (from Wikipedia).
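As a rough illustration of that idea, the sketch below picks a fixed set of witness bases from a small table when n is under a published bound, and falls back to random bases otherwise. The names isprime_mr, _DETERMINISTIC_BASES and rounds are hypothetical, and this is not the isprime from the code under discussion; the bounds are the commonly cited ones and are worth double-checking against the source before relying on them.

```python
import random

# (exclusive upper bound, bases that are sufficient below that bound)
_DETERMINISTIC_BASES = (
    (1373653, (2, 3)),
    (25326001, (2, 3, 5)),
    (3215031751, (2, 3, 5, 7)),
    (2152302898747, (2, 3, 5, 7, 11)),
    (3474749660383, (2, 3, 5, 7, 11, 13)),
    (341550071728321, (2, 3, 5, 7, 11, 13, 17)),
)

def _is_witness(a, d, s, n):
    """Return True if base a proves that odd n = d * 2**s + 1 is composite."""
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return False
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return False
    return True

def isprime_mr(n, rounds=20):
    if n <= 3:
        return n >= 2
    if n % 2 == 0:
        return False
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for bound, bases in _DETERMINISTIC_BASES:
        if n < bound:
            break  # these few bases are enough below this bound
    else:
        # Outside the table: probabilistic test with random bases.
        bases = [random.randrange(2, n - 1) for _ in range(rounds)]
    return not any(_is_witness(a, d, s, n) for a in bases)

print([n for n in range(30) if isprime_mr(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

For inputs below the largest bound in the table this runs at most seven rounds instead of a fixed larger number, which is where the saving comes from.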
Edit 1: Corrected the return call of the if-else to append bigfactors to factors in primefactors.
There's a Python library with a collection of primality tests (including incorrect ones, as examples of what not to do). It's called pyprimes. I figured it's worth mentioning for posterity. I don't think it includes the algorithms you mentioned.