GCC optimizations based on integer overflow


Question


Recently I had a discussion about someone who wanted to check for signed int overflow like this: if (A + B < 2 * max(A, B)). Let's ignore for a second that the logic itself is wrong and discuss signed integer overflow in the context of C/C++ (which I believe fully inherits this part of the standard from C).

Which kinds of checks that rely on signed integer overflow will be optimized away by current-ish GCC, and which won't?

Since the original text wasn't all that well formulated and was apparently controversial, I decided to change the question somewhat, but leave the original text below.

All examples used below were tested with gcc version 4.7.2 (Debian 4.7.2-5) and compiled using -O3.

Namely, signed integer overflow is undefined behaviour, and GCC infamously uses this to perform some branch simplifications. The first example of this that comes to mind is

int i = 1;
while (i > 0){
    i *= 2;
}

which produces an infinite loop. Another case where this kind of optimization kicks in is

if (A + 2 < A){
    /* Handle potential overflow */
}

where, assuming A is a signed integral type, the overflow branch gets completely removed.
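For comparison, an equivalent check that never invokes undefined behaviour tests against INT_MAX before adding, so there is nothing for GCC to remove. A minimal sketch (the function name is only for illustration):

#include <limits.h>

/* Sketch: test whether A + 2 would overflow *before* performing the
   addition. No signed overflow ever occurs, so the branch is well
   defined and the compiler has no licence to delete it. */
int would_overflow_plus_2(int A)
{
    return A > INT_MAX - 2;
}
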

Even more interestingly, some cases of easily provable integer overflow are left untouched, such as

if (INT_MAX + 1 < 0){
    /* You wouldn't write this explicitly, but after static analysis the program
       could be shown to contain something like this. */
}

which triggers the branch that you would expect with two's complement representation. Similarly, the conditional branch in this code is left intact

int C = abs(A);
if (A + C < 0){
    /* For this to be hit, overflow or underflow had to happen. */
}

Now for the question: is there a pattern that looks roughly like if (A + B < C) or if (A + B < c) that will be optimized away? When I was googling around before writing this, it seemed like the last snippet should be optimized away, but I cannot reproduce this kind of error in an overflow check that doesn't operate with a constant explicitly.
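For reference, a UB-free formulation of the general "does A + B overflow?" test looks roughly like the sketch below; the helper name is made up, and GCC 5 and later also provide __builtin_add_overflow for the same purpose.

#include <limits.h>

/* Sketch: decide whether A + B would overflow without ever computing an
   overflowing sum, so the check cannot be discarded as dead code. */
int add_would_overflow(int A, int B)
{
    if (B > 0)
        return A > INT_MAX - B;   /* sum would exceed INT_MAX  */
    else
        return A < INT_MIN - B;   /* sum would fall below INT_MIN */
}
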


Answer 1:


Many compilers will replace expressions involving signed integers or pointers with "false", like

a + 1 < a // signed integer a
p + 1 < p // Pointer p

when the expression can only be true in the case of undefined behaviour. On the other hand, that allows

for (char* q = p; q < p + 2; ++q) ...

to be unrolled, substituting q = p and q = p + 1, without any check, so that's a good thing.
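Roughly, the transformation looks like the sketch below; use() is just a placeholder for the loop body, and the unrolled form is an illustration rather than actual GCC output.

void use(char *q);                    /* placeholder for the loop body */

void before(char *p)
{
    for (char *q = p; q < p + 2; ++q) /* p + 2 cannot legally wrap, so   */
        use(q);                       /* q takes exactly the values p, p + 1 */
}

void after(char *p)                   /* what the optimiser may effectively emit */
{
    use(p);
    use(p + 1);
}
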

if (A + abs (A) < 0)

is probably too complicated for many compilers. Note that for unsigned integers there is no undefined behaviour on overflow; the result simply wraps around. As a consequence, loops using unsigned 32-bit integers with 64-bit pointers tend to be slower than necessary, because the wraparound behaviour must be taken into account. With unsigned 32-bit integers and 64-bit pointers, it is possible that

&p [i] > &p [i+1]

without undefined behaviour (this is not possible with 64-bit integers or with 32-bit pointers).
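The sketch below illustrates the point; the function names are made up. With an unsigned 32-bit index, i + 1 may wrap to 0, so the compiler cannot assume &p[i + 1] lies above &p[i]; with a signed index the overflow would be undefined behaviour, so it can.

/* Unsigned 32-bit index: i + 1 can wrap from UINT_MAX to 0, so the
   address &p[i + 1] may in fact be lower than &p[i]; the compiler has
   to keep the comparison. */
int cmp_unsigned(const int *p, unsigned i)
{
    return &p[i] > &p[i + 1];
}

/* Signed index: i + 1 overflowing is undefined behaviour, so the
   compiler may assume it never happens and fold the comparison to 0. */
int cmp_signed(const int *p, int i)
{
    return &p[i] > &p[i + 1];
}
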




Answer 2:


If I may paraphrase your question, I believe that you are asking something like this.

Does there exist a compiler that optimises signed integer expressions so aggressively that it is prepared to undertake detailed analysis of certain categories of such expressions in order to determine that a dependent condition is true (or false) throughout the range of representable values for the type of the result of the expression, and by those means delete the conditional test?

The compiler you offer is a particular version of GCC, and the expressions you offer fall into a narrow range, but I assume that you would also be interested to learn of another compiler or closely-related expressions.

The answer is: right now I'm not aware of one, but it could be only a matter of time.

Existing compilers perform premature evaluation of expressions that contain constants or certain recognisable patterns, and if during this evaluation they encounter undefined behaviour, they will ordinarily avoid optimising the expression. They are not obliged to do so.

Data flow analysis is CPU and memory intensive and tends to be used where there are large benefits to be had. Eventually the C++ standard will stop changing (so much) and the compiler writers will have time on their hands. We're still a bit short of the day when a compiler reads a prime number sieve program and optimises it into a single print statement, but it will come.

The main point of my answer is to point out that this is actually a question about compiler technology and has very little to do with the C++ standard. Perhaps you should ask the GCC group directly.



Source: https://stackoverflow.com/questions/23889022/gcc-optimizations-based-on-integer-overflow
