This article: http://googleresearch.blogspot.sg/2006/06/extra-extra-read-all-about-it-nearly.html mentions that most quick sort algorithms had a bug in computing the midpoint as (left+right)/2, and points out that the fix is to use left+(right-left)/2 instead.
The solution was also given in the question "Bug in quicksort example (K&R C book)?"
My question is: why does left+(right-left)/2 avoid overflow? How can this be proven? Thanks in advance.
You have left < right by definition, and both are non-negative because they are array indices.
As a consequence, right - left > 0, and furthermore left + (right - left) = right (follows from basic algebra).
And consequently left + (right - left) / 2 <= right. So no overflow can happen since every step of the operation is bounded by the value of right.
By contrast, consider the buggy expression, (left + right) / 2. left + right >= right, and since we don’t know the values of left and right, it’s entirely possible that that value overflows.
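To make the contrast concrete, here is a minimal Java sketch. The class and method names (MidpointDemo, naiveMid, safeMid) are made up for illustration, and it assumes 0 <= left <= right as in the argument above.

```java
class MidpointDemo {
    // Naive midpoint: the int addition can wrap around before the division happens.
    static int naiveMid(int left, int right) {
        return (left + right) / 2;
    }

    // Safe midpoint, assuming 0 <= left <= right: every intermediate value is <= right.
    static int safeMid(int left, int right) {
        return left + (right - left) / 2;
    }

    public static void main(String[] args) {
        int left = 1_500_000_000;
        int right = 2_000_000_000;
        // The true sum 3,500,000,000 does not fit in an int, so the naive result is negative.
        System.out.println(naiveMid(left, right));
        // right - left = 500,000,000, half of it is 250,000,000, so this prints 1750000000.
        System.out.println(safeMid(left, right));
    }
}
```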
Suppose (to make the example easier) the maximum integer is 100, left = 50, and right = 80. If you use the naive formula:
int mid = (left + right)/2;
the addition will result in 130, which overflows.
If you instead do:
int mid = left + (right - left)/2;
you can't overflow in (right - left) because you're subtracting a smaller number from a larger number. That always results in an even smaller number, so it can't possibly go over the maximum. E.g. 80 - 50 = 30.
And since the result is the average of left and right, it must be between them. Since these are both less than the maximum integer, anything between them is also less than the maximum, so there's no overflow.
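The same situation can be reproduced with real int limits. The sketch below is only an illustration: it mirrors the 50/80 example by placing left and right 50 and 20 below Integer.MAX_VALUE, and uses Math.addExact (which throws ArithmeticException on int overflow) to detect the overflowing sum. The class and helper names are hypothetical.

```java
class OverflowCheck {
    // Returns true if left + right does not fit in an int.
    static boolean sumOverflows(int left, int right) {
        try {
            Math.addExact(left, right); // throws ArithmeticException on overflow
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Safe midpoint, assuming 0 <= left <= right.
    static int safeMid(int left, int right) {
        return left + (right - left) / 2;
    }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;
        int left = max - 50;
        int right = max - 20;
        System.out.println(sumOverflows(left, right));        // true: the sum exceeds max
        System.out.println(right - left);                     // 30, far below max
        System.out.println(safeMid(left, right) == max - 35); // true: left + 15
    }
}
```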
Basic logic.
1. by definition, left <= MAX_INT
2. by definition, right <= MAX_INT
3. left+(right-left) is equal to right, which already is <= MAX_INT per #2
4. and so left+(right-left)/2 must also be <= MAX_INT, since x/2 is never larger than x for non-negative x.
Compare to the original
1. by definition, left <= MAX_INT
2. by definition, right <= MAX_INT
3. therefore left+right <= MAX_INT
4. and so (left+right)/2 <= MAX_INT
where statement 3 is clearly false, since left can be MAX_INT (statement 1) and so can right (statement 2).
(This is more an intuitive explanation than a proof.)
Assume your data is unsigned char, and left = 100 and right = 255 (so right is at the edge of the range).
If you do left + right, you'll get 355, which does not fit the unsigned char range, so it will overflow.
However, (right-left)/2 is a quantity X such that left + X <= right <= MAX, where MAX is 255 for unsigned char. This way, you can be sure that the sum can never overflow.
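Java has no unsigned char, so the sketch below only simulates arithmetic that keeps 8 bits by masking each sum with 0xFF. That masking, and the names UnsignedCharDemo, add8, naiveMid8 and safeMid8, are assumptions made for illustration rather than how any particular language behaves.

```java
class UnsignedCharDemo {
    // Simulated 8-bit unsigned addition: only the low 8 bits of the sum survive.
    static int add8(int a, int b) {
        return (a + b) & 0xFF;
    }

    // Naive midpoint in the simulated 8-bit arithmetic.
    static int naiveMid8(int left, int right) {
        return add8(left, right) / 2;
    }

    // Safe midpoint, assuming 0 <= left <= right <= 255.
    static int safeMid8(int left, int right) {
        return add8(left, (right - left) / 2);
    }

    public static void main(String[] args) {
        // 100 + 255 = 355 wraps to 99, so the naive midpoint is 49 (below left).
        System.out.println(naiveMid8(100, 255)); // 49
        // (255 - 100) / 2 = 77, and 100 + 77 = 177 fits in 8 bits.
        System.out.println(safeMid8(100, 255));  // 177
    }
}
```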
A simple worked example will show it. For simplicity, assume numbers overflow above 999. If we have:
left = 997
right = 999
then:
left + right = 1996
which has overflowed before we get to the /2. However:
right - left = 2
(right-left)/2 = 1
left + (right-left)/2 = 997 + 1 = 998
So we've avoided the overflow.
More generally (as others have said): if both left and right are within range (and assuming right > left), then (right-left)/2 will be within range, and so too must left + (right-left)/2, since this must be less than right (you've increased left by half the gap between it and right).
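The general claim can be checked by brute force for the pretend limit of 999 used above. This is a sketch, not a proof for real int ranges: it assumes 0 <= left <= right <= 999 and relies on real Java ints being wide enough to compute the reference values. All names are hypothetical.

```java
class ExhaustiveCheck {
    static final int MAX = 999; // pretend limit from the worked example

    static boolean inRange(int v) {
        return v >= 0 && v <= MAX;
    }

    // Checks every pair 0 <= left <= right <= MAX: each intermediate value of
    // left + (right - left) / 2 must stay within [0, MAX] and match the true midpoint.
    static boolean safeFormulaAlwaysInRange() {
        for (int left = 0; left <= MAX; left++) {
            for (int right = left; right <= MAX; right++) {
                int gap = right - left;
                int half = gap / 2;
                int mid = left + half;
                if (!inRange(gap) || !inRange(half) || !inRange(mid)) return false;
                if (mid < left || mid > right) return false;
                if (mid != (left + right) / 2) return false; // real ints cannot overflow here
            }
        }
        return true;
    }

    // True if the naive intermediate sum stays within the pretend limit.
    static boolean naiveSumFits(int left, int right) {
        return inRange(left + right);
    }

    public static void main(String[] args) {
        System.out.println(safeFormulaAlwaysInRange()); // true
        System.out.println(naiveSumFits(997, 999));     // false: 1996 > 999
    }
}
```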
Assuming the language is Java, where int is a 32-bit type, any value that exceeds the int range wraps around. In numerical terms, adding 1 to Integer.MAX_VALUE (2147483647) yields -2147483648.
Coming to the question above, let's assume the following:
int left = 1;
int right = Integer.MAX_VALUE;
int mid;
Case 1:
mid = (left + right)/2;
//Here left + right overflows and wraps around to -2147483648, so mid becomes negative.
Case 2:
mid = left + (right - left)/2;
//This does not have the same problem, as no intermediate value ever exceeds "right".
In theory:
Both expressions are mathematically equal, since left + (right - left)/2 = (2*left + right - left)/2 = (left + right)/2
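One way to check this identity without running into int overflow is to compute the reference midpoint in 64-bit long arithmetic and compare. The sketch below does that for randomly chosen pairs with 0 <= left <= right; the class name, method names and the fixed seed are arbitrary choices for illustration.

```java
import java.util.Random;

class IdentityCheck {
    // Reference midpoint computed in long arithmetic, where two ints cannot overflow.
    static int referenceMid(int left, int right) {
        return (int) (((long) left + (long) right) / 2);
    }

    // Safe int midpoint, assuming 0 <= left <= right.
    static int safeMid(int left, int right) {
        return left + (right - left) / 2;
    }

    // Compares both formulas on `trials` random pairs of non-negative ints.
    static boolean agreesOnRandomPairs(int trials, long seed) {
        Random rnd = new Random(seed);
        for (int i = 0; i < trials; i++) {
            int a = rnd.nextInt(Integer.MAX_VALUE); // in [0, Integer.MAX_VALUE)
            int b = rnd.nextInt(Integer.MAX_VALUE);
            int left = Math.min(a, b);
            int right = Math.max(a, b);
            if (safeMid(left, right) != referenceMid(left, right)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesOnRandomPairs(1_000_000, 42L));
        // The values from the answer: both print 1073741824.
        System.out.println(referenceMid(1, Integer.MAX_VALUE));
        System.out.println(safeMid(1, Integer.MAX_VALUE));
    }
}
```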
Hope this answers your question.
Source: https://stackoverflow.com/questions/27167943/why-leftright-left-2-will-not-overflow