Bounding this program to determine the sum of reciprocal integers not containing zero

问题

Let A denote the set of positive integers whose decimal representation does not contain the digit 0. The sum of the reciprocals of the elements in A is known to be 23.10345.

Ex. 1,2,3,4,5,6,7,8,9,11-19,21-29,31-39,41-49,51-59,61-69,71-79,81-89,91-99,111-119, ...

Then take the reciprocal of each number, and sum the total.

How can this be verified numerically?

Write a computer program to verify this number.

Here is what I have written so far, I need help bounding this problem as this currently takes too long to complete:

Code in Java

import java.util.*; 

public class recip
{
    public static void main(String[] args)
    {
        int current = 0; double total = 0;

        while(total < 23.10245)
        {
            if(Integer.toString(current).contains("0"))
            {
                current++;
            }
            else
            {
                total = total + (1/(double)current);
                current++;
            }
            System.out.println("Total: " + total);
        }
    }
}

回答1:

This is not that hard when approached properly.

Assume for example that you want to find the sum of reciprocals of all integers starting (i.e. the left-most digits) with 123 and ending with k non-zero digits. Obviously there are 9^k such integers and the reciprocal of each of these integers is in the range 1/(124*10^k) .. 1/(123*10^k). Hence the sum of reciprocals of all these integers is bounded by (9/10)^k/124 and (9/10)^k/123.

To find bounds for sum of all reciprocals starting with 123 one has to add up the bounds above for every k>=0. This is a geometric serie, hence it can be derived that the sum of reciprocals of integers starting with 123 is bounded by 10*(9/10)^k/124 and 10*(9/10)^k/123.

The same method can of course be applied for any combination of left-most digits. The more digits we examine on the left, the more accurate the result becomes. Here is an implementation of this approach in python:

def approx(t,k):
    """Returns a lower bound and an upper bound on the sum of reciprocals of
       positive integers starting with t not containing 0 in its decimal
       representation.
       k is the recursion depth of the search, i.e. we append k more digits
       to t, before approximating the sum. A larger k gives more accurate
       results, but takes longer."""
    if k == 0:
      return 10.0/(t+1), 10.0/t
    else:
        if t > 0:
            low, up = 1.0/t, 1.0/t
        else:
            low, up = 0, 0
        for i in range(10*t+1, 10*t+10):
            l,u = approx(i, k-1)
            low += l
            up += u
    return low, up

Calling approx(0, 8) for example gives the lower and upper bound: 23.103447707... and 23.103448107.... which is close to the claim 23.10345 given by the OP.

There are methods that converge faster to the sum in question, but they require more math. A much better approximation of the sum can be found here. A generalization of the problem are the Kempner series.

回答2:

For all values of current greater than some threshold N, 1.0/(double)current will be sufficiently small that total does not increase as a result of adding 1.0/(double)current. Thus, the termination criterion should be something like

 while(total != total + (1.0/(double)current))

instead of testing against the limit that is known a priori. Your loop will stop when current reaches this special value of N.

回答3:

I suspect that casting to string and then checking for the character '0' is the step that takes too long. If you want to avoid all zeroes, might help to increase current thus:

(Edited -- thanks to Aaron McSmooth)

current++;  
for( int i = 10000000; i >= 10; i = i / 10 )  
{
    if ( current % i ) == 0
    {
         current = current + ( i / 10 );
    }
}

This is untested, but the concept should be clear: whenever you hit a multiple of a power of ten (e.g. 300 or 20000), you add the next lower power of 10 (in our examples 10 + 1 and 1000 + 100 + 10 + 1, respectively) until there are no more zeroes in your number.

Change your while loop accordingly and see if this doesn't help performance to the point were your problem becomes manageable.

Oh, and you might want to restrict the System.out output a bit as well. Would every tenth, one hundreth or 10000th iteration be enough?

Edit the second: After some sleep, I suspect my answer might be a little short-sighted (blame the late hour, if you will). I simply hoped that, oh, one million iterations of current would get you to the solution and left it at that, instead of calculating the correction cases using log( current ) etc.

On second thought, I see two problems with this whole problem. One is that your target number of 23.10345 is a leeeeettle to round for my tastes. After all, you are adding thousands of items like "1/17", "1/11111" and so on, with infinite decimal representations, and it is highly unlikely that they add up to exactly 23.10345. If some specialist for numerical mathematics says so, fine -- but then I'd like to see the algorithm by which they arrived at this conclusion.

The other problem is related to the first and concerns the limited in-memory binary representation of your rational numbers. You might get by using BigDecimals, but I have my doubts.

So, basically, I suggest you reprogram the numerical algorithm instead of going for the brute force solution. Sorry.

Edit the third: Out of curiosity, I wrote this in C++ to test my theories. It's run for 6 minutes now and is at about 14.5 (roughly 550 mio. iterations). We'll see.

Current version is

double total = 0;
long long current = 0, currPowerCeiling = 10, iteration = 0;
while( total < 23.01245 )
{
    current++;
    iteration++;
    if( current >= currPowerCeiling )
        currPowerCeiling *= 10;

    for( long long power = currPowerCeiling; power >= 10; power = power / 10 )  
    {
        if( ( current % power ) == 0 )
        {
            current = current + ( power / 10 );
        }
    }
    total += ( 1.0 / current );

    if( ! ( iteration % 1000000 ) )
        std::cout << iteration / 1000000 << " Mio iterations: " << current << "\t -> " << total << std::endl;
}
std::cout << current << "\t" << total << std::endl;

Calculating currPowerCeiling (or however one might call this) by hand saves some log10 and pow calculations each iteration. Every little bit helps -- but it still takes forever...

Edit the fourth: Status is around 66,000 mio iterations, total is up to 16.2583, runtime is at around 13 hours. Not looking good, Bobby S. -- I suggest a more mathematical approach.

回答4:

How about storing the current number as a byte array where each array element is a digit 0-9? That way, you can detect zeroes very quickly (comparing bytes using == instead of String.contains).

The downside would be that you'll need to implement the incrementing yourself instead of using ++. You'll also need to devise a way to mark "nonexistent" digits so that you don't detect them as zeroes. Storing -1 for nonexistent digits sounds like a reasonable solution.

回答5:

For a signed 32-bit integer, this program will never stop. It will actually converge towards -2097156. Since the maximum harmonic number (the sum of integral reciprocals from 1 to N) of a signed 32-bit integer is ~14.66, this loop will never terminate, even when current wraps around from 2^31 - 1 to -2^31. Since the reciprocal of the largest negative 32-bit integer is ~-4.6566e-10, every time current returns to 0, the sum will be negative. Given that the largest number representable by a double such that number + + 1/2^31 == number is 2^52/2^31, you get roughly -2097156 as the converging value.

Having said that, and assuming you don't have a direct way of calculating the harmonic number of an arbitrary integer, there are a few things you can do to speed up your inner loop. First, the most expensive operation is going to be System.out.println; that has to interact with the console in which case your program will eventually have to flush the buffer to the console (if any). There are cases where that may not actually happen, but since you are using that for debugging they are not relevant to this question.

However, you also spend a lot of time determining whether a number has a zero. You can flip that test around to generate ranges of integers such that within that range you are guaranteed not to have an integer with a zero digit. That is really simple to do incrementally (in C++, but trivial enough to convert to Java):

class c_advance_to_next_non_zero_decimal
{
public:
    c_advance_to_next_non_zero_decimal(): next(0), max_set_digit_index(0)
    {
        std::fill_n(digits, digit_count, 0);

        return;
    }

    int advance_to_next_non_zero_decimal()
    {
        assert((next % 10) == 0);

        int offset= 1;
        digits[0]+= 1;

        for (int digit_index= 1, digit_value= 10; digit_index<=max_set_digit_index; ++digit_index, digit_value*= 10)
        {
            if (digits[digit_index]==0)
            {
                digits[digit_index]= 1;
                offset+= digit_value;
            }
        }

        next+= offset;

        return next;
    }

    int advance_to_next_zero_decimal()
    {
        assert((next % 10)!=0);
        assert(digits[0]==(next % 10));

        int offset= 10 - digits[0];
        digits[0]+= offset;
        assert(digits[0]==10);

        // propagate carries forward
        for (int digit_index= 0; digits[digit_index]==10 && digit_index<digit_count; ++digit_index)
        {
            digits[digit_index]= 0;
            digits[digit_index + 1]+= 1;

            max_set_digit_index= max(digit_index + 1, max_set_digit_index);
        }

        next+= offset;
        return next;
    }

private:
    int next;

    static const size_t digit_count= 10; // log10(2**31)

    int max_set_digit_index;

    int digits[digit_count];
};

What the code above does is to iterate over every range of numbers such that the range only contains numbers without zeroes. It works by determining how to go from N000... to N111... and from N111... to (N+1)000..., carrying (N+1) into 1(0)000... if necessary.

On my laptop, I can generate the harmonic number of 2^31 - 1 in 8.73226 seconds.

回答6:

public class SumOfReciprocalWithoutZero {
public static void main(String[] args) {

    int maxSize=Integer.MAX_VALUE/10;
    long time=-System.currentTimeMillis();
    BitSet b=new BitSet(maxSize);
    setNumbersWithZeros(10,maxSize,b);

    double sum=0.0;
    for(int i=1;i<maxSize;i++)
    {
        if(!b.get(i))
        {
            sum+=1.0d/(double)i;
        }
    }
    time+=System.currentTimeMillis();
    System.out.println("Total: "+sum+"\nTimeTaken : "+time+" ms");


}

 static void setNumbersWithZeros(int srt,int end,BitSet b)
 {
        for(int j=srt;j<end;j*=10)
        {
            for(int i=1;i<=10;i++)
        {
            int num=j*i;
            b.set(num);
        }
            if(j>=100)
            setInbetween(j, b);
        }
 }

 static void setInbetween(int strt,BitSet b)
 {

     int bitToSet;
     bitToSet=strt;
     for(int i=1;i<=10;i++)
     {
      int nxtInt=-1;

     while((nxtInt=b.nextSetBit(nxtInt+1))!=strt)
     {
         b.set(bitToSet+nxtInt);
     }
     nxtInt=-1;
     int lim=strt/10;
     while((nxtInt=b.nextClearBit(nxtInt+1))<lim)
     {
         b.set(bitToSet+nxtInt);
     }

     bitToSet=strt*i;

     }
 }


}

This is an implementation using BitSet.I calculated the sum of reciprocal's for all integer's in range (1-Integer.MAX_VALUE/10).The sum comes upto 13.722766931560747.This is the maximum I could calculate using BitSet since the maximum range for BitSet is Integer.MAX_VALUE.I need to divide it by 10 and limit the range to avoid overflow.But there is significant improvement in speed.I'm just posting this code in-case it might give you some new idea to improve your code.(Increase your memory using the VM argument -Xmx[Size>350]m)

Output:

Total: 13.722766931560747
TimeTaken : 60382 ms

UPDATE:

Java Porting of a previous , deleted answer :

     public static void main(String[] args) {
        long current =11;
        double tot=1 + 1.0/2 + 1.0/3 + 1.0/4 + 1.0/5 + 1.0/6 + 1.0/7 + 1.0/8 + 1.0/9;
        long i=0;
        while(true)
        {
            current=next_current(current);
            if(i%10000!=0)
                System.out.println(i+" "+current+" "+tot);
            for(int j=0;j<9;j++)
            {
                tot+=(1.0/current + 1.0/(current + 1) + 1.0/(current + 2) + 1.0/(current + 3) + 1.0/(current + 4) +
                          1.0/(current + 5) + 1.0/(current + 6) + 1.0/(current + 7) + 1.0/(current + 8));

                current += 10;
            }
            i++;
        }

    }

    static long next_current(long n){

    long m=(long)Math.pow(10,(int)Math.log10(n));
    boolean found_zero=false;
    while(m>=1)
    {
        if(found_zero)
            n+=m;
        else if((n/m)%10==0)
        {
            n=n-(n%m)+m;
           found_zero=true;
        }

     m=m/10;
    }
    return n;
    }

来源：https://stackoverflow.com/questions/3847615/bounding-this-program-to-determine-the-sum-of-reciprocal-integers-not-containing

标签

java

math

numerical-methods

numerical-analysis

bounding