I was reading an algorithms book which had the following algorithm for binary search:
public class BinSearch {
static int search ( int [ ] A, int K ) {
int l = 0 ;
int u = A. length −1;
int m;
while (l <= u ) {
m = (l+u) /2;
if (A[m] < K) {
l = m + 1 ;
} else if (A[m] == K) {
return m;
} else {
u = m−1;
}
}
return −1;
}
}
The author says "The error is in the assignment m = (l+u)/2;
it can lead to overflow and should be replaced by m = l + (u-l)/2
."
I can't see how that would cause an overflow. When I run the algorithm in my mind for a few different inputs, I don't see the mid's value going out of the array index.
So, in which cases would the overflow occur?
This post covers this famous bug in a lot of detail. As others have said it's an overflow issue. The fix recommended on the link is as follows:
int mid = low + ((high - low) / 2);
// Alternatively
int mid = (low + high) >>> 1;
It is also probably worth mentioning that in case negative indices are allowed, or perhaps it's not even an array that's being searched (for example, searching for a value in some integer range satisfying some condition), the code above may not be correct as well. In this case, something as ugly as
(low < 0 && high > 0) ? (low + high) / 2 : low + (high - low) / 2
may be necessary. One good example is searching for the median in an unsorted array without modifying it or using additional space by simply performing a binary search on the whole Integer.MIN_VALUE
–Integer.MAX_VALUE
range.
The problem is that (l+u)
is evaluated first, and could overflow int, so (l+u)/2
would return the wrong value.
Jeff suggested really good post to read about this bug, here is summary if you want quick overview.
In Programming Pearls Bentley says that the analogous line "sets m to the average of l and u, truncated down to the nearest integer." On the face of it, this assertion might appear correct, but it fails for large values of the int variables low and high. Specifically, it fails if the sum of low and high is greater than the maximum positive int value (2^31 - 1). The sum overflows to a negative value, and the value stays negative when divided by two. In C this causes an array index out of bounds with unpredictable results. In Java, it throws ArrayIndexOutOfBoundsException.
The potential overflow is in the l+u
addition itself.
This was actually a bug in early versions of binary search in the JDK.
来源:https://stackoverflow.com/questions/6735259/calculating-mid-in-binary-search