I recently wrote a short algorithm to calculate happy numbers in python. The program allows you to pick an upper bound and it will determine all the happy numbers below it.
Just to get a little more closure on this issue by seeing how fast I could truely find these numbers, I wrote a multithreaded C++ implementation of Dr_Asik's algorithm. There are two things that are important to realize about the fact that this implementation is multithreaded.
More threads does not necessarily lead to better execution times, there is a happy medium for every situation depending on the volume of numbers you want to calculate.
If you compare the times between this version running with one thread and the original version, the only factors that could cause a difference in time are the overhead from starting the thread and variable system performance issues. Otherwise, the algorithm is the same.
The code for this implementation (all credit for the algorithm goes to Dr_Asik) is here. Also, I wrote some speed tests with a double check for each test to help back up those 3 points.
Calculation of the first 100,000,000 happy numbers:
Original - 39.061 / 39.000 (Dr_Asik's original implementation)
1 Thread - 39.000 / 39.079
2 Threads - 19.750 / 19.890
10 Threads - 11.872 / 11.888
30 Threads - 10.764 / 10.827
50 Threads - 10.624 / 10.561 <--
100 Threads - 11.060 / 11.216
500 Threads - 13.385 / 12.527
From these results it looks like our happy medium is about 50 threads, plus or minus ten or so.