Let N be a number (10 <= N <= 10^5).
I have to break it into 3 numbers (x, y, z) that satisfy the conditions checked in the code below: x <= y, x^2 + y^2 + 1 = z^2 and x + y + z <= N.
The bounds of x and y are an important part of the problem. I personally went with this Wolfram Alpha query and checked the exact forms of the variables.
Thanks to @Bleep-Bloop and comments, a very elegant bound optimization was found, which is x < n and x <= y < n - x. The results are the same and the times are nearly identical.
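For reference, the exact bounds used in the code below (maxX and maxY) can be recovered by hand from the sum constraint. This is only a sketch, assuming the conditions are the ones the code checks: x <= y, x^2 + y^2 + 1 = z^2 and x + y + z <= n.

```latex
\begin{align*}
x + y + \sqrt{x^2 + y^2 + 1} \le n
  &\iff x^2 + y^2 + 1 \le (n - x - y)^2 \quad (\text{with } x + y \le n) \\
  &\iff 1 \le n^2 - 2nx - 2ny + 2xy \\
  &\iff y \le \frac{n^2 - 2nx - 1}{2n - 2x} \quad (\text{with } x < n).
\end{align*}
Requiring $x \le y$ for at least one admissible $y$ gives
$2x^2 - 4nx + n^2 - 1 \ge 0$, and taking the smaller root:
\[
x \le \frac{2n - \sqrt{2}\,\sqrt{n^2 + 1}}{2}.
\]
```

The first line is the upper limit of the inner loop, the last one the upper limit of the outer loop.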
Also, since the only possible values for x and y are positive even integers, we can step both loops by 2, which halves the number of iterations of each loop.
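The evenness follows from working modulo 4: a square is 0 or 1 mod 4, so x^2 + y^2 + 1 can only be a square when both x^2 and y^2 are 0 mod 4. A minimal brute-force sketch to double-check this on small values; the class name, method name and the limit of 500 are made up for illustration:

```csharp
using System;

public static class ParityCheck
{
    // Returns true if every pair 1 <= x <= y <= limit for which
    // x^2 + y^2 + 1 is a perfect square has both x and y even.
    public static bool AllSolutionsEven(int limit)
    {
        for (long x = 1; x <= limit; x++)
        {
            for (long y = x; y <= limit; y++)
            {
                long z2 = x * x + y * y + 1;
                long z = (long)Math.Sqrt(z2);
                bool isSquare = z * z == z2;
                if (isSquare && (x % 2 != 0 || y % 2 != 0))
                    return false; // an odd x or y would contradict the claim
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(AllSolutionsEven(500));
    }
}
```

Unlike the real loops, this one deliberately visits odd values too, since the point is to confirm that none of them ever produces a solution.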
To optimize even further, since we compute the upper bound of x, we build a list of all possible values for x and make the computation parallel. That saves a massive amount of time on higher values of N but it's a bit slower for smaller values because of the overhead of the parallelization.
Here's the final code:
Non-parallel version, with int values:
List<string> res = new List<string>();
int n2 = n * n;
double maxX = 0.5 * (2.0 * n - Math.Sqrt(2) * Math.Sqrt(n2 + 1));
for (int x = 2; x < maxX; x += 2)
{
int maxY = (int)Math.Floor((n2 - 2.0 * n * x - 1.0) / (2.0 * n - 2.0 * x));
for (int y = x; y <= maxY; y += 2)
{
int z2 = x * x + y * y + 1;
int z = (int)Math.Sqrt(z2);
if (z * z == z2 && x + y + z <= n)
res.Add(x + "," + y + "," + z);
}
}
Parallel version, with long values:
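using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;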
...
// Use ConcurrentBag for thread safety
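ConcurrentBag<string> res = new ConcurrentBag<string>();
// Cast before multiplying so the square is computed as a long even if n is an int
long n2 = (long)n * n;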
double maxX = 0.5 * (2.0 * n - Math.Sqrt(2) * Math.Sqrt(n2 + 1L));
// Build list to parallelize
List<int> xList = new List<int>();
for (int x = 2; x < maxX; x += 2)
xList.Add(x);
Parallel.ForEach(xList, x =>
{
int maxY = (int)Math.Floor((n2 - 2.0 * n * x - 1.0) / (2.0 * n - 2.0 * x));
for (long y = x; y <= maxY; y += 2)
{
long z2 = (long)x * x + y * y + 1L;
long z = (long)Math.Sqrt(z2);
if (z * z == z2 && x + y + z <= n)
res.Add(x + "," + y + "," + z);
}
});
When run individually on an i5-8400 CPU, I get these results:
N: 10; Solutions: 1; Time elapsed: 0.03 ms (Not parallel, int)
N: 100; Solutions: 6; Time elapsed: 0.05 ms (Not parallel, int)
N: 1000; Solutions: 55; Time elapsed: 0.3 ms (Not parallel, int)
N: 10000; Solutions: 543; Time elapsed: 13.1 ms (Not parallel, int)
N: 100000; Solutions: 5512; Time elapsed: 849.4 ms (Parallel, long)
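The snippets above assume n and the timing code are defined elsewhere. Here is a minimal self-contained sketch of how such a measurement could be reproduced. The class and method names are hypothetical, and it uses Stopwatch and long values throughout, which may differ from how the numbers above were actually taken:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class TimingSketch
{
    // Non-parallel search; long is used everywhere so any N up to 10^5 is safe.
    public static List<string> FindSolutions(long n)
    {
        var res = new List<string>();
        long n2 = n * n;
        double maxX = 0.5 * (2.0 * n - Math.Sqrt(2) * Math.Sqrt(n2 + 1));
        for (long x = 2; x < maxX; x += 2)
        {
            long maxY = (long)Math.Floor((n2 - 2.0 * n * x - 1.0) / (2.0 * n - 2.0 * x));
            for (long y = x; y <= maxY; y += 2)
            {
                long z2 = x * x + y * y + 1;
                long z = (long)Math.Sqrt(z2);
                if (z * z == z2 && x + y + z <= n)
                    res.Add(x + "," + y + "," + z);
            }
        }
        return res;
    }

    public static void Main()
    {
        foreach (long n in new long[] { 10, 100, 1000, 10000 })
        {
            var sw = Stopwatch.StartNew();
            List<string> res = FindSolutions(n);
            sw.Stop();
            Console.WriteLine($"N: {n}; Solutions: {res.Count}; Time elapsed: {sw.Elapsed.TotalMilliseconds} ms");
        }
    }
}
```

The first call also pays for JIT compilation, so timings for small N are only meaningful after a warm-up run or when averaged over several runs.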
You must use long when N is greater than 46340, because its square then exceeds int.MaxValue (46341^2 = 2,147,488,281 > 2,147,483,647) and overflows. Finally, the parallel version starts to get better than the simple one when N is around 23000, with ints.
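The point where squaring stops fitting in an int can be checked directly. A minimal sketch, with a made-up class and helper name, that does the multiplication in long and compares against int.MaxValue:

```csharp
using System;

public static class OverflowCheck
{
    // Returns true if n * n still fits in a 32-bit signed int.
    public static bool SquareFitsInInt(int n)
    {
        return (long)n * n <= int.MaxValue;
    }

    public static void Main()
    {
        Console.WriteLine(SquareFitsInInt(46340)); // 46340^2 = 2,147,395,600
        Console.WriteLine(SquareFitsInInt(46341)); // 46341^2 = 2,147,488,281
    }
}
```

By default C# does not throw on int overflow (arithmetic is unchecked), so an overflowing n * n would silently wrap around and corrupt the bounds instead of failing loudly.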