问题
I want to multiply every digit in a number to each other.
For example
515 would become 25(i.e 5*1*5)
10 would become 0(i.e 1*0)
111111 would become 1(i.e 1*1*1*1*1*1)
I used this code to do it
public static int evalulate(int no)
{
if(no==0)return 0;
int temp=1;
do
{
temp=(no%10)*temp;
no=no/10;
}while(no>0);
return temp;
}
problem is I want to evaluate for about a billion numbers like this
for(int i=0;i<1000000000;i++)evaluate(i);
This takes about 146 seconds on my processor.I want to evaluate it within some seconds.
So,is it possible to optimize this code using some shift
,and
,or
operators so that I can reduce the time to evaluate without using multiple threads or parallelizing it
Thanks
回答1:
First, figure out how many numbers you can store in memory. For this example, let's say you can store 999 numbers.
Your first step will be to pre-calculate the products of digits for all numbers from 0-999, and store that in memory. So, you'd have an array along the lines of:
multLookup = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
0, 2, 4, 6, 8, 10, 12, 14, 16, 18,
0, 3, 6, 9, 12, 15, 18, 21, 24, 27,
0, 4, 8, 12, 16, 20, 24, 28, 32, 36,
...]
Now, you'd break your number up into a bunch of 3 digit numbers. For example, if your number is 1739203423
, you'd break it up into 1
, 739
, 203
, and 423
. You'd look each of these up in your multLookup
array, and multiply the results together, like so:
solution = multLookup[1] * multLookup[739] * multLookup[203] * multLookup[423];
With this approach, you will have sped up your calculations by a factor of 3 (since we picked 999 items to store in memory). To speed it up by 5, store 99999 numbers in memory and follow the same steps. In your case, speeding it up by 5 means you'll arrive at your solution in 29.2 seconds.
Note: the gain isn't exactly linear with respect to how many numbers you store in memory. See jogojapan's reasoning in the comments under this answer for why that is.
If you know more about the order in which your numbers show up, or the range of your numbers (say your input is only in the range of [0, 10000]), you can make this algorithm smarter.
In your example, you're using a for loop to iterate from 0 to 1000000000. In this case, this approach will be super efficient because the memory won't page-fault very frequently and there will be fewer cache-misses.
But wait! You can make this even faster (for your specific for-loop iteration example)!! How, you ask? Caching! Lets say you're going through 10 digit numbers.
Let's say you start off at 8934236000
. Based on the 999 digits in memory solution, you'd break this down into 8
, 934
, 236
, and 000
. Then you'd multiply:
solution = multLookup[8] * multLookup[934] * multLookup[236] * multLookup[0];
Next, you'd take 8934236001
, break it down to 8
, 934
, 236
, and 001
, and multiply:
solution = multLookup[8] * multLookup[934] * multLookup[236] * multLookup[1];
And so on... But we notice that the first three lookups are the same for the next 997 iterations! So, we cache that.
cache = multLookup[8] * multLookup[934] * multLookup[236];
And then we use the cache as such:
for (int i = 0; i < 1000; i++) {
solution = cache * i;
}
And just like that, we've almost reduced the time by a factor of 4. So you take the ~29.2 second solution you had, and divide that by 4 to go through all billion numbers in ~7.3 seconds
回答2:
If you can store the result of each operation for all your numbers.. Then you can use Memoization. That way you need to only calculate 1 digit.
int prodOf(int num){
// can be optimized to store 1/10 of the numbers, since the last digit will always be processed
static std::vector<int> memo(<max number of iterations>, -1);
if(num == 0) return 0;
if(memo[num] != -1 )return memo[num];
int prod = (num%10) * prodOf(num/10);
memo[num] = prod;
return prod;
}
回答3:
Some test i made, With simple C/C++ code on my PC (Xeon 3.2GHz),
last no = i = 999999999 ==> 387420489 nb sec 23
#include "stdafx.h"
#include <chrono>
#include <iostream>
#undef _TRACE_
inline int evaluate(int no)
{
#ifdef _TRACE_
std::cout << no;
#endif
if(no==0)return 0;
int temp=1;
do
{
temp=(no%10)*temp;
no=no/10;
}while(no>0);
#ifdef _TRACE_
std::cout << " => " << temp << std::endl;
#endif // _TRACE_
return temp;
}
int _tmain(int argc, _TCHAR* argv[])
{
std::chrono::time_point<std::chrono::system_clock> start(std::chrono::system_clock::now());
int last = 0;
int i = 0;
for(/*int i = 0*/;i<1000000000;++i) {
last = evaluate(i);
}
std::cout << "last no = i = " << (i-1) << " ==> " << last << std::endl;
std::chrono::time_point<std::chrono::system_clock> end(std::chrono::system_clock::now());
std::cout << "nb sec " << std::chrono::duration_cast<std::chrono::seconds>(end - start).count() << std::endl;
return 0;
}
I also tested the loop split over multiple thread with openMP and result is 0 second, So I would say that it would be useful if you consider performance problem of using a real efficient language.
pragma omp parallel for
for(int i = 0;i<1000000000;++i) {
/*last[threadID][i] = */evaluate(i);
}
来源:https://stackoverflow.com/questions/17357368/how-to-multiply-each-digit-in-a-number-efficiently