Generating unique, ordered Pythagorean triplets

Submitted by 冷眼眸甩不掉的悲伤 on 2019-11-27 06:05:35

Pythagorean Triples make a good example for claiming "for loops considered harmful", because for loops seduce us into thinking about counting, often the most irrelevant part of a task.

(I'm going to stick with pseudo-code to avoid language biases, and to keep the pseudo-code streamlined, I'll not optimize away multiple calculations of e.g. x * x and y * y.)

Version 1:

for x in 1..N {
    for y in 1..N {
        for z in 1..N {
            if x * x + y * y == z * z then {
                // use x, y, z
            }
        }
    }
}

is the worst solution. It generates duplicates, and traverses parts of the space that aren't useful (e.g. whenever z < y). Its time complexity is cubic on N.

Version 2, the first improvement, comes from requiring x < y < z to hold, as in:

for x in 1..N {
    for y in x+1..N {
        for z in y+1..N {
            if x * x + y * y == z * z then {
                // use x, y, z
            }
        }
    }
}

which reduces run time and eliminates duplicated solutions. However, it is still cubic on N; the improvement is just a reduction of the co-efficient of N-cubed.

It is pointless to continue examining increasing values of z after z * z < x * x + y * y no longer holds. That fact motivates Version 3, the first step away from brute-force iteration over z:

for x in 1..N {
    for y in x+1..N {
        z = y + 1
        while z * z < x * x + y * y {
            z = z + 1
        }
        if z * z == x * x + y * y and z <= N then {
            // use x, y, z
        }
    }
}

For N of 1000, this is about 5 times faster than Version 2, but it is still cubic on N.

The next insight is that x and y are the only independent variables; z depends on their values, and the last z value considered for the previous value of y is a good starting search value for the next value of y. That leads to Version 4:

for x in 1..N {
    y = x+1
    z = y+1
    while z <= N {
        while z * z < x * x + y * y {
            z = z + 1
        }
        if z * z == x * x + y * y and z <= N then {
            // use x, y, z
        }
        y = y + 1
    }
}

which allows y and z to "sweep" the values above x only once. Not only is it over 100 times faster for N of 1000, it is quadratic on N, so the speedup increases as N grows.
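As a rough illustration, the Version 4 sweep might look like this in Python. The function name triples_v4 and the generator form are choices made for this sketch; they are not part of the pseudo-code above.

```python
def triples_v4(n):
    """Yield (x, y, z) with x < y < z <= n and x*x + y*y == z*z."""
    for x in range(1, n + 1):
        y = x + 1
        z = y + 1
        # y and z only ever move upwards, so each x costs O(n) steps in total
        while z <= n:
            target = x * x + y * y
            while z * z < target:
                z += 1
            if z * z == target and z <= n:
                yield (x, y, z)
            y += 1


print(list(triples_v4(20)))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```

Note that z is never reset when y advances, which is what turns the cubic search into a quadratic one.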

I've encountered this kind of improvement often enough to be mistrustful of "counting loops" for any but the most trivial uses (e.g. traversing an array).

Update: Apparently I should have pointed out a few things about V4 that are easy to overlook.

  1. Both of the while loops are controlled by the value of z (one directly, the other indirectly through the square of z). The inner while is actually speeding up the outer while, rather than being orthogonal to it. It's important to look at what the loops are doing, not merely to count how many loops there are.

  2. All of the calculations in V4 are strictly integer arithmetic. Conversion to/from floating-point, as well as floating-point calculations, are costly by comparison.

  3. V4 runs in constant memory, requiring only three integer variables. There are no arrays or hash tables to allocate and initialize (and, potentially, to cause an out-of-memory error).

  4. The original question allowed all of x, y, and z to vary over the same range. V1..V4 followed that pattern.

Below is a not-very-scientific set of timings (using Java under Eclipse on my older laptop with other stuff running...), where the "use x, y, z" was implemented by instantiating a Triple object with the three values and putting it in an ArrayList. (For these runs, N was set to 10,000, which produced 12,471 triples in each case.)

Version 4:           46 sec.
using square root:  134 sec.
array and map:      400 sec.

The "array and map" algorithm is essentially:

squares = array of i*i for i in 1 .. N
roots = map of i*i -> i for i in 1 .. N
for x in 1 .. N
    for y in x+1 .. N
        z = roots[squares[x] + squares[y]]
        if z exists use x, y, z

The "using square root" algorithm is essentially:

for x in 1 .. N
    for y in x+1 .. N
        z = (int) sqrt(x * x + y * y)
        if z * z == x * x + y * y then use x, y, z

The actual code for V4 is:

public Collection<Triple> byBetterWhileLoop() {
    Collection<Triple> result = new ArrayList<Triple>(limit);
    for (int x = 1; x < limit; ++x) {
        int xx = x * x;
        int y = x + 1;
        int z = y + 1;
        while (z <= limit) {
            int zz = xx + y * y;
            while (z * z < zz) {++z;}
            if (z * z == zz && z <= limit) {
                result.add(new Triple(x, y, z));
            }
            ++y;
        }
    }
    return result;
}

Note that x * x is calculated in the outer loop (although I didn't bother to cache z * z); similar optimizations are done in the other variations.

I'll be glad to provide the Java source code on request for the other variations I timed, in case I've mis-implemented anything.

Kyle Gullion

Substantially faster than any of the solutions so far. Finds triplets via a ternary tree.

Wolfram says:

Hall (1970) and Roberts (1977) prove that (a, b, c) is a primitive Pythagorean triple if and only if

(a,b,c)=(3,4,5)M

where M is a finite product of the matrices U,A,D.

And there we have a formula to generate every primitive triple.

In the above formula, the hypotenuse is ever growing so it's pretty easy to check for a max length.

In Python:

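import numpy as np

def gen_prim_pyth_trips(limit=None):
    # Each matrix maps a primitive triple (as a row vector) to one of its three children
    u = np.array([[ 1,  2,  2], [-2, -1, -2], [2, 2, 3]])
    a = np.array([[ 1,  2,  2], [ 2,  1,  2], [2, 2, 3]])
    d = np.array([[-1, -2, -2], [ 2,  1,  2], [2, 2, 3]])
    uad = np.array([u, a, d])
    m = np.array([3, 4, 5])
    while m.size:
        m = m.reshape(-1, 3)
        if limit:
            # Drop triples whose hypotenuse exceeds the limit
            m = m[m[:, 2] <= limit]
        yield from m
        # Expand every remaining triple into its three children at once
        m = np.dot(m, uad)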

If you'd like all triples and not just the primitives:

def gen_all_pyth_trips(limit):
    for prim in gen_prim_pyth_trips(limit):
        i = prim
        for _ in range(limit//prim[2]):
            yield i
            i = i + prim

list(gen_prim_pyth_trips(10**4)) took 2.81 milliseconds to come back with 1593 elements while list(gen_all_pyth_trips(10**4)) took 19.8 milliseconds to come back with 12471 elements

For reference, the accepted answer (in Python) took 38 seconds for 12471 elements.

Just for fun, setting the upper limit to one million, list(gen_all_pyth_trips(10**6)) returns in 2.66 seconds with 1980642 elements (almost 2 million triples in 3 seconds). list(gen_all_pyth_trips(10**7)) brings my computer to its knees, as the list gets so large that it consumes every last bit of RAM. Doing something like sum(1 for _ in gen_all_pyth_trips(10**7)) gets around that limitation and returns in 30 seconds with 23471475 elements.

For more information on the algorithm used, check out the articles on Wolfram and Wikipedia.

You should define x < y < z.

for x in range(1, 1000):
    for y in range(x + 1, 1000):
        for z in range(y + 1, 1000):
            ...  # test x * x + y * y == z * z here

Another good optimization would be to only use x and y and calculate zsqr = x * x + y * y. If zsqr is a square number (or z = sqrt(zsqr) is a whole number), it is a triplet, else not. That way, you need only two loops instead of three (for your example, that's about 1000 times faster).
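A minimal sketch of that two-loop idea, assuming Python 3.8+ for math.isqrt (used here instead of a floating-point sqrt so the perfect-square test stays exact). The name triples_by_sqrt and the parameter n are made up for this example.

```python
from math import isqrt


def triples_by_sqrt(n):
    """Yield (x, y, z) with x < y < z < n, mirroring range(1, n) above."""
    for x in range(1, n):
        for y in range(x + 1, n):
            zsqr = x * x + y * y
            z = isqrt(zsqr)  # exact integer square root, no floats involved
            if z * z == zsqr and z < n:
                yield (x, y, z)


print(list(triples_by_sqrt(21)))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```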

The previously listed algorithms for generating Pythagorean triplets are all modifications of the naive approach derived from the basic relationship a^2 + b^2 = c^2 where (a, b, c) is a triplet of positive integers. It turns out that Pythagorean triplets satisfy some fairly remarkable relationships that can be used to generate all Pythagorean triplets.

Euclid discovered the first such relationship. He determined that for every Pythagorean triple (a, b, c), possibly after a reordering of a and b, there are relatively prime positive integers m and n with m > n, exactly one of which is even, and a positive integer k such that

a = k (2mn)
b = k (m^2 - n^2)
c = k (m^2 + n^2)

Then to generate Pythagorean triplets, generate relatively prime positive integers m and n of differing parity, and a positive integer k and apply the above formula.

struct PythagoreanTriple {
    public int a { get; private set; }
    public int b { get; private set; }
    public int c { get; private set; }

    public PythagoreanTriple(int a, int b, int c) : this() {
        this.a = a < b ? a : b;
        this.b = b < a ? a : b;
        this.c = c;
    }

    public override string ToString() {
        return String.Format("a = {0}, b = {1}, c = {2}", a, b, c);
    }

    public static IEnumerable<PythagoreanTriple> GenerateTriples(int max) {
        var triples = new List<PythagoreanTriple>();
        for (int m = 1; m <= max / 2; m++) {
            for (int n = 1 + (m % 2); n < m; n += 2) {
                if (m.IsRelativelyPrimeTo(n)) {
                    for (int k = 1; k <= max / (m * m + n * n); k++) {
                        triples.Add(EuclidTriple(m, n, k));
                    }
                }
            }
        }

        return triples;
    }

    private static PythagoreanTriple EuclidTriple(int m, int n, int k) {
        int msquared = m * m;
        int nsquared = n * n;
        return new PythagoreanTriple(k * 2 * m * n, k * (msquared - nsquared), k * (msquared + nsquared));
    }
}

public static class IntegerExtensions {
    private static int GreatestCommonDivisor(int m, int n) {
        return (n == 0 ? m : GreatestCommonDivisor(n, m % n));
    }

    public static bool IsRelativelyPrimeTo(this int m, int n) {
        return GreatestCommonDivisor(m, n) == 1;
    }
}

class Program {
    static void Main(string[] args) {
        PythagoreanTriple.GenerateTriples(1000).ToList().ForEach(t => Console.WriteLine(t));            
    }
}
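
For comparison with the Python answers on this page, the same Euclid-based generation might be sketched in Python as follows. The name euclid_triples and the parameter max_c (the largest allowed hypotenuse) are assumptions of this sketch, and the early exit on c is an addition that the C# version does not have.

```python
from math import gcd


def euclid_triples(max_c):
    """Yield (a, b, c) with a < b < c <= max_c via Euclid's formula."""
    m = 2
    while m * m + 1 <= max_c:  # the smallest hypotenuse for this m uses n = 1
        for n in range(1 + m % 2, m, 2):  # n < m with opposite parity
            if gcd(m, n) != 1:
                continue
            a, b, c = m * m - n * n, 2 * m * n, m * m + n * n
            if c > max_c:
                break  # c grows with n, so larger n cannot fit either
            if a > b:
                a, b = b, a
            # every multiple k of a primitive triple is also a triple
            for k in range(1, max_c // c + 1):
                yield (k * a, k * b, k * c)
        m += 1


print(sorted(euclid_triples(20)))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```

The triples come out grouped by (m, n) rather than in sorted order, hence the sorted() call in the usage line.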

The Wikipedia article on Formulas for generating Pythagorean triples contains other such formulae.

Algorithms can be tuned for speed, memory usage, simplicity, and other things.

Here is a pythagore_triplets algorithm tuned for speed, at the cost of memory usage and simplicity. If all you want is speed, this could be the way to go.

Calculation of list(pythagore_triplets(10000)) takes 40 seconds on my computer, versus 63 seconds for ΤΖΩΤΖΙΟΥ's algorithm, and possibly days of calculation for Tafkas's algorithm (and all other algorithms which use 3 embedded loops instead of just 2).

def pythagore_triplets(n=1000):
    maxn = int(n * (2 ** 0.5)) + 1  # max int whose square may be the sum of two squares
    squares = [x * x for x in range(maxn + 1)]  # calculate all the squares once
    reverse_squares = {sq: i for i, sq in enumerate(squares)}  # x*x => x
    for x in range(1, n):
        x2 = squares[x]
        for y in range(x, n + 1):
            y2 = squares[y]
            z = reverse_squares.get(x2 + y2)
            if z is not None:
                yield x, y, z

>>> print(list(pythagore_triplets(20)))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20), (15, 20, 25)]

Note that if you are going to calculate the first billion triplets, then this algorithm will crash before it even starts, because of an out of memory error. So ΤΖΩΤΖΙΟΥ's algorithm is probably a safer choice for high values of n.

BTW, here is Tafkas's algorithm, translated into python for the purpose of my performance tests. Its flaw is to require 3 loops instead of 2.

def gcd(a, b):
  while b != 0:
    t = b
    b = a%b
    a = t
  return a

def find_triple(upper_boundary=1000):
  for c in range(5, upper_boundary + 1):
    for b in range(4, c):
      for a in range(3, b):
        # gcd(a, b) == 1 keeps only primitive triples
        if a*a + b*b == c*c and gcd(a, b) == 1:
          yield a, b, c

def pyth_triplets(n=1000):
    "Version 1"
    for x in range(1, n):
        x2= x*x # time saver
        for y in range(x+1, n): # y > x
            z2= x2 + y*y
            zs= int(z2**.5)
            if zs*zs == z2:
                yield x, y, zs

>>> print(list(pyth_triplets(20)))
[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]

V.1 algorithm has monotonically increasing x values.

EDIT

It seems this question is still alive :)
Since I came back and revisited the code, I tried a second approach which is almost 4 times as fast (about 26% of CPU time for N=10000) as my previous suggestion since it avoids lots of unnecessary calculations:

def pyth_triplets(n=1000):
    "Version 2"
    for z in range(5, n+1):
        z2= z*z # time saver
        x= x2= 1
        y= z - 1; y2= y*y
        while x < y:
            x2_y2= x2 + y2
            if x2_y2 == z2:
                yield x, y, z
                x+= 1; x2= x*x
                y-= 1; y2= y*y
            elif x2_y2 < z2:
                x+= 1; x2= x*x
            else:
                y-= 1; y2= y*y

>>> print(list(pyth_triplets(20)))
[(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17), (12, 16, 20)]

Note that this algorithm has increasing z values.

If the algorithm were converted to C (where, being closer to the metal, multiplications take more time than additions), one could minimise the necessary multiplications, given that the step between consecutive squares is:

(x+1)² - x² = (x+1)(x+1) - x² = x² + 2x + 1 - x² = 2x + 1

so all of the inner x2= x*x and y2= y*y would be converted to additions and subtractions like this:

def pyth_triplets(n=1000):
    "Version 3"
    for z in range(5, n+1):
        z2= z*z # time saver
        x= x2= 1; xstep= 3
        y= z - 1; y2= y*y; ystep= 2*y - 1
        while x < y:
            x2_y2= x2 + y2
            if x2_y2 == z2:
                yield x, y, z
                x+= 1; x2+= xstep; xstep+= 2
                y-= 1; y2-= ystep; ystep-= 2
            elif x2_y2 < z2:
                x+= 1; x2+= xstep; xstep+= 2
            else:
                y-= 1; y2-= ystep; ystep-= 2

Of course, in Python the extra bytecode produced actually slows down the algorithm compared to version 2, but I would bet (without checking :) that V.3 is faster in C.

Cheers everyone :)

I just extended Kyle Gullion's answer so that triples are sorted by hypotenuse, then by longest side.

It doesn't use numpy, but requires a SortedCollection (or SortedList) such as this one

def primitive_triples():
    """ generates primitive Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    through Berggren's matrices and breadth first traversal of ternary tree
    :see: https://en.wikipedia.org/wiki/Tree_of_primitive_Pythagorean_triples
    """
    key=lambda x:(x[2],x[1])
    triples=SortedCollection(key=key)
    triples.insert([3,4,5])
    A = [[ 1,-2, 2], [ 2,-1, 2], [ 2,-2, 3]]
    B = [[ 1, 2, 2], [ 2, 1, 2], [ 2, 2, 3]]
    C = [[-1, 2, 2], [-2, 1, 2], [-2, 2, 3]]

    while triples:
        (a,b,c) = triples.pop(0)
        yield (a,b,c)

        # expand this triple to 3 new triples using Berggren's matrices
        for X in [A,B,C]:
            triple=[sum(x*y for (x,y) in zip([a,b,c],X[i])) for i in range(3)]
            if triple[0]>triple[1]: # ensure x<y<z
                triple[0],triple[1]=triple[1],triple[0]
            triples.insert(triple)

def triples():
    """ generates all Pythagorean triplets x<y<z
    sorted by hypotenuse z, then longest side y
    """
    prim=[] #list of primitive triples up to now
    key=lambda x:(x[2],x[1])
    samez=SortedCollection(key=key) # temp triplets with same z
    buffer=SortedCollection(key=key) # temp for triplets with smaller z
    for pt in primitive_triples():
        z=pt[2]
        if samez and z!=samez[0][2]: #flush samez
            while samez:
                yield samez.pop(0)
        samez.insert(pt)
        #build buffer of smaller multiples of the primitives already found
        for i,pm in enumerate(prim):
            p,m=pm[0:2]
            while True:
                mz=m*p[2]
                if mz < z:
                    buffer.insert(tuple(m*x for x in p))
                elif mz == z:
                    # we need another buffer because next pt might have
                    # the same z as the previous one, but a smaller y than
                    # a multiple of a previous pt ...
                    samez.insert(tuple(m*x for x in p))
                else:
                    break
                m+=1
            prim[i][1]=m #update multiplier for next loops
        while buffer: #flush buffer
            yield buffer.pop(0)
        prim.append([pt,2]) #add primitive to the list

The code is available in the math2 module of my Python library. It is tested against some series of the OEIS (code here at the bottom), which just enabled me to find a mistake in A121727 :-)
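
If no SortedCollection is at hand, the ordering of the primitive triples can also be sketched with the standard library's heapq. This is an alternative illustration, not the code from the math2 module; the name primitive_triples_heap is hypothetical. It relies on the fact that each of Berggren's matrices produces a child with a larger hypotenuse than its parent, so popping the smallest entry from a min-heap yields triples in (z, y) order.

```python
import heapq
from itertools import islice

# Berggren's matrices; each row is dotted with the vector (x, y, z)
A = ((1, -2, 2), (2, -1, 2), (2, -2, 3))
B = ((1, 2, 2), (2, 1, 2), (2, 2, 3))
C = ((-1, 2, 2), (-2, 1, 2), (-2, 2, 3))


def primitive_triples_heap():
    """Yield primitive triples (x, y, z), x < y < z, ordered by (z, y)."""
    heap = [(5, 4, 3)]  # stored as (z, y, x) so the heap orders by z, then y
    while heap:
        z, y, x = heapq.heappop(heap)
        yield (x, y, z)
        for M in (A, B, C):
            a, b, c = (sum(m * v for m, v in zip(row, (x, y, z))) for row in M)
            if a > b:  # ensure x < y < z
                a, b = b, a
            heapq.heappush(heap, (c, b, a))


print(list(islice(primitive_triples_heap(), 5)))
# [(3, 4, 5), (5, 12, 13), (8, 15, 17), (7, 24, 25), (20, 21, 29)]
```

The generator is infinite and the heap grows by two entries per triple, so take a bounded slice or stop once z exceeds the limit you care about.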

I wrote that program in Ruby, and it is similar to the Python implementation. The important line is:

if x*x == y*y + z*z && gcd(y,z) == 1

Then you have to implement a method that returns the greatest common divisor (gcd) of two given numbers. A very simple example, in Ruby again:

def gcd(a, b)
    while b != 0
      t = b
      b = a%b
      a = t
    end
    return a
end

The full Ruby method to find the triplets would be:

def find_triple(upper_boundary)

  (5..upper_boundary).each {|c|
    (4..c-1).each {|b|
      (3..b-1).each {|a|
        if (a*a + b*b == c*c && gcd(a,b) == 1)
          puts "#{a} \t #{b} \t #{c}"
        end
      }
    }
  }
end

Pitarou

Yes, there is.

Okay, now you'll want to know why. Why not just constrain it so that z > y? Try

for z in range(y + 1, 1000):

Old question, but I'll still add my input. There are two general ways to generate unique Pythagorean triples: one is by scaling, and the other is by using this archaic formula.

Scaling takes a constant n and multiplies a base triple, say 3, 4, 5, by n. So taking n to be 2, we get 6, 8, 10 as our next triple.

Scaling

def pythagoreanScaled(n):
    triplelist = []
    for x in range(1, n + 1):  # start at 1 so that (0, 0, 0) is not included
        one = 3*x
        two = 4*x
        three = 5*x
        triple = (one,two,three)
        triplelist.append(triple)
    return triplelist

The formula method uses the fact that if we take a number x greater than 1, then 2x, x^2-1, and x^2+1 always form a Pythagorean triplet.

Formula

def pythagoreantriple(n):
    triplelist = []
    for x in range(2,n):
        double = x*2
        minus = x**2-1
        plus = x**2+1
        triple = (double,minus,plus)
        triplelist.append(triple)
    return triplelist

from math import sqrt
from itertools import combinations

# Pythagorean triplets: a^2 + b^2 = c^2 for a < b <= n, with n < 2000
def gen_pyth(n):
    if n >= 2000:
        return
    # store c itself (not c squared) as the third element of each triplet
    elem = [[i, j, int(sqrt(i*i + j*j))]
            for i, j in combinations(range(1, n + 1), 2)
            if sqrt(i*i + j*j).is_integer()]
    print(*elem, sep="\n")


gen_pyth(200)

Version 5, following on from Joel Neely's versions.

Since z is at most N and x < y < z, we have 2 * x * x < x * x + y * y = z * z <= N * N, so x can be at most N / Sqrt(2). The smallest leg of any triple is 3, so x can start from 3. The squares are also updated incrementally, using additions instead of multiplications.

MaxX = floor(N / (2 ** 0.5))

for x in 3..MaxX {
    y = x+1
    z = y+1
    m = x*x + y*y
    k = z * z
    while z <= N {
        while k < m {
            z = z + 1
            k = k + (2*z) - 1
        }
        if k == m and z <= N then {
            // use x, y, z
        }
        y = y + 1
        m = m + (2 * y) - 1
    }
}
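
A Python sketch of the same incremental-square idea is below. It assumes the bound x < N / sqrt(2), which follows from x < y < z <= N, and it uses math.isqrt (Python 3.8+) to compute that bound with integers only. The name triples_v5 is made up for the example.

```python
from math import isqrt


def triples_v5(n):
    """Yield (x, y, z), x < y < z <= n, updating squares by addition only."""
    max_x = isqrt(n * n // 2)  # x < y forces 2*x*x < z*z <= n*n
    for x in range(3, max_x + 1):
        y = x + 1
        z = y + 1
        m = x * x + y * y  # running value of x^2 + y^2
        k = z * z          # running value of z^2
        while z <= n:
            while k < m:
                z += 1
                k += 2 * z - 1  # z^2 = (z-1)^2 + 2z - 1
            if k == m and z <= n:
                yield (x, y, z)
            y += 1
            m += 2 * y - 1      # y^2 = (y-1)^2 + 2y - 1


print(list(triples_v5(20)))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```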

Just checking, but I've been using the following code to make Pythagorean triples. It's very fast (I've tried some of the examples here, although I wrote my own version first and only came back to compare about 2 years ago). I think this code correctly finds all Pythagorean triples up to (name your limit), and fairly quickly too. I wrote it in C++.

ullong is unsigned long long. I created two helper functions, _square and _root. _square squares its argument. _root takes the square root of the given number, truncates it to a whole number, squares it again, and returns -1 if the result does not equal the given number (that is, if the number is not a perfect square). I know of another way to optimize it, but I haven't implemented or tested that yet.

void generate(vector<Triple>& triplist, ullong limit) {
    cout << "Please wait as triples are being generated." << endl;
    ullong a, b, c;
    Triple trip;
    time_t timer = time(0);

    for (a = 1; a <= limit; ++a) {
        for (b = a + 1; b <= limit; ++b) {
            ullong sum = _square(a) + _square(b);
            if (sum > _square(limit))
                break;  // c would exceed limit for this and every larger b

            // _root returns -1 (converted to ullong) when sum is not a perfect square
            c = _root(sum);
            if (c != (ullong)-1) {
                trip.a = a; trip.b = b; trip.c = c;
                triplist.push_back(trip);
            }
        }
    }

    timer = time(0) - timer;
    cout << "Generated " << triplist.size() << " in " << timer << " seconds." << endl;
    cin.get();
    cin.get();
}

Let me know what you all think. It generates all primitive and non-primitive triples, according to the teacher I turned it in to (she tested it up to 100, if I remember correctly).

A previous answer here reported these not-very-scientific Java timings for its V4 and two other variations, with N set to 10,000 (12,471 triples in each case):

Version 4:           46 sec.
using square root:  134 sec.
array and map:      400 sec.

The output from my program for the same limit is:

How many triples to generate: 10000
Please wait as triples are being generated.
Generated 12471 in 2 seconds.

That is before I even start optimizing via the compiler. (I remember previously getting 10,000 down to 0 seconds with tons of special options and stuff.) My code also generates all the triples with 100,000 as the limit on how large the two sides and the hypotenuse can be in 3.2 minutes (I think the 1,000,000 limit takes an hour).

I modified the code a bit and got the 10,000 limit down to 1 second (no optimizations). On top of that, with careful thinking, the work could be split into chunks and run on separate threads over given ranges. For example, a limit of 100,000 could be divided into 4 equal chunks for 3 CPUs (1 extra chunk to hopefully soak up spare CPU time), with ranges 1 to 25,000, 25,000 to 50,000, 50,000 to 75,000, and 75,000 to the end. I may do that and see if it speeds things up. I would create the threads in advance and not count them in the time taken by the triple function, and I'd need a more precise timer and a way to concatenate the vectors. I think that if one 3.4 GHz CPU with 8 GB of RAM at its disposal can handle a limit of 10,000 in 1 second, then 3 CPUs should do it in about 1/3 of a second (at the moment I round up to the next whole second).

It should be noted that for a, b, and c you don't need to loop all the way to N.

For a, you only have to loop from 1 to int(sqrt(n**2/2))+1; for b, from a+1 to int(sqrt(n**2-a**2))+1; and for c, from int(sqrt(a**2+b**2)) to int(sqrt(a**2+b**2))+2.
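
A sketch of these tightened ranges, under two assumptions of this example: math.isqrt (Python 3.8+) replaces int(sqrt(...)) so the bounds are exact, and the short loop over c collapses to a single candidate, the integer square root of a**2 + b**2. The name bounded_triples is hypothetical.

```python
from math import isqrt


def bounded_triples(n):
    """Yield (a, b, c) with a < b < c <= n using the tightened loop ranges."""
    for a in range(1, isqrt(n * n // 2) + 1):
        # b stops where a*a + b*b would exceed n*n, so c <= n automatically
        for b in range(a + 1, isqrt(n * n - a * a) + 1):
            c = isqrt(a * a + b * b)  # the only possible value for c
            if c * c == a * a + b * b:
                yield (a, b, c)


print(list(bounded_triples(20)))
# [(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]
```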

Arun

# To find all Pythagorean triplets in a range
import math
n = int(input('Enter the upper limit of the range: '))
for i in range(n + 1):
    for j in range(1, i):
        k = math.sqrt(i*i + j*j)
        # k must be a whole number no larger than n
        if k.is_integer() and k <= n:
            print(i, j, int(k))

Geetanjali

You can try this

triplets = []
for a in range(1, 100):
    for b in range(1, 100):
        for c in range(1, 100):
            if a**2 + b**2 == c**2:
                i = sorted([a, b, c])
                # keep each triplet once, regardless of the order of a and b
                if i not in triplets:
                    triplets.append(i)
print(triplets)

Nayeem Joy

You have to use Euclid's formula for Pythagorean triplets. Follow the steps below.

Choose any two whole numbers m and n with m > n > 0.

According to Euclid, the triplet is a = m*m - n*n, b = 2*m*n, c = m*m + n*n.

Now apply this formula to find a triplet. Say one value of our triplet is 6; what are the other two? Let's solve it.

a = m*m - n*n, b = 2*m*n, c = m*m + n*n

b = 2*m*n is always even. So now

2*m*n = 6 => m*n = 3 => m*n = 3*1 => m = 3, n = 1

You can take any other pair of values instead of 3 and 1, as long as the product of the two numbers is 3 (m*n = 3).

Now, with m = 3 and n = 1,

a = m*m - n*n = 3*3 - 1*1 = 8, c = m*m + n*n = 3*3 + 1*1 = 10

So 6, 8, 10 is our triplet. This illustrates how the triplets are generated.

If the given number is odd, such as 7, the approach changes slightly, because b = 2*m*n

will never be odd. So here we have to take

a = m*m - n*n = 7, so (m+n)*(m-n) = 7*1, so m+n = 7 and m-n = 1

Now find m and n from here (m = 4, n = 3), then find the other two values (b = 24, c = 25).

If you don't understand it, read it again carefully.

Write your code according to this, and it will generate distinct triplets efficiently.
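
The two cases can be sketched in Python as below. The function name triples_with_leg is made up for this example, and it only finds triplets that come straight from the formula with some m > n > 0; scaled triplets such as 9, 12, 15 would need an extra multiplier and are not covered.

```python
def triples_with_leg(leg):
    """Return sorted triplets (a, b, c) from Euclid's formula that contain leg."""
    found = set()
    # Case 1: leg = 2*m*n, so m*n = leg // 2 with m > n > 0
    if leg % 2 == 0:
        half = leg // 2
        for n in range(1, half + 1):
            if half % n == 0 and half // n > n:
                m = half // n
                found.add(tuple(sorted((m * m - n * n, 2 * m * n, m * m + n * n))))
    # Case 2: leg = m*m - n*n = (m + n) * (m - n)
    for d in range(1, leg + 1):  # d plays the role of m - n
        if leg % d == 0:
            s = leg // d         # s plays the role of m + n
            if s > d and (s - d) % 2 == 0:
                m, n = (s + d) // 2, (s - d) // 2
                found.add(tuple(sorted((m * m - n * n, 2 * m * n, m * m + n * n))))
    return sorted(found)


print(triples_with_leg(6))  # [(6, 8, 10)]
print(triples_with_leg(7))  # [(7, 24, 25)]
```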
