Question
It's a variation of the code from this tweet, just shorter and harmless to newcomers. We have this code:
typedef int (*Function)();
static Function DoSmth;
static int Return7()
{
    return 7;
}
void NeverCalled()
{
    DoSmth = Return7;
}
int main()
{
    return DoSmth();
}
You see that NeverCalled() is never called in the code, don't you? Here's what Compiler Explorer shows when clang 3.8 is selected with
-Os -std=c++11 -Wall
Code emitted is:
NeverCalled():
retq
main:
movl $7, %eax
retq
as if NeverCalled() had actually been called before DoSmth() and had set the DoSmth function pointer to Return7().
If the function pointer assignment is removed from NeverCalled(), as here:
void NeverCalled() {}
then the emitted code is:
NeverCalled():
retq
main:
ud2
The latter is quite expected. The compiler knows that the function pointer is certainly null, and calling a function through a null function pointer is undefined behavior.
The former code is not really expected. Somehow the compiler decided to have Return7() called, although it is not called directly anywhere and the function pointer assignment sits inside a function that is never called.
Yes, I know that the C++ Standard allows a compiler facing code with undefined behavior to do this. Just how does it do this?
How does clang happen to emit this specific machine code?
Answer 1:
NeverCalled is a misnomer. Any global function is potentially called (by a constructor of a global object in a different translation unit, for example).
Incidentally, this is the only way this TU can possibly be incorporated in a program that doesn't have UB. In this case, main returns 7.
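One way to picture the scenario described above is a global object whose constructor calls NeverCalled() before main starts. The sketch below keeps everything in one file for brevity, whereas the answer has a second translation unit in mind. The names Initializer, g_initializer, and CallDoSmth are made up for illustration, and CallDoSmth stands in for the body of main.

```cpp
#include <cassert>

typedef int (*Function)();

static Function DoSmth;  // zero-initialised: starts out as a null pointer

static int Return7()
{
    return 7;
}

void NeverCalled()
{
    DoSmth = Return7;
}

// Hypothetical helper type: its constructor runs during dynamic
// initialisation, i.e. before main() is entered.
struct Initializer
{
    Initializer() { NeverCalled(); }
};

// In the scenario from the answer this object would live in another
// translation unit; it is defined here only to keep the sketch in one file.
static Initializer g_initializer;

// Stands in for the body of main() from the question.
int CallDoSmth()
{
    // Holds when called from main(), because g_initializer is constructed first.
    assert(DoSmth != nullptr);
    return DoSmth();
}
```

With such an object present, a main that returns CallDoSmth() has fully defined behaviour and returns 7, which matches the machine code clang emitted.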
Make NeverCalled static, and main will compile to empty code.
Answer 2:
The path by which clang does this is probably something along the lines of:
- DoSmth is a static, so is zero initialised. Since it is a pointer (to function), that has the effect of initialisation to the NULL pointer (or nullptr);
- main() does return DoSmth(), so clang then reasons that DoSmth cannot be NULL, since that would cause return DoSmth() to exhibit undefined behaviour;
- It then reasons about other code in the compilation unit, and finds that there is an assignment DoSmth = Return7 in NeverCalled();
- Since that is the only statement in the compilation unit which sets DoSmth to be non-NULL, and it has reasoned that DoSmth is not NULL, clang assumes NeverCalled() must have been called somehow;
- As a result of the above reasoning clang concludes that DoSmth must be equal to the address of Return7;
- Since it has now reasoned that DoSmth == Return7, clang converts the return DoSmth() into return Return7();
- Return7() is in the same compilation unit, so clang inlines it.
The specifics of how clang does this internally are anyone's guess. However, various steps of code optimisation probably result in a reasoning chain something like the above.
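The chain above can be pictured as a series of source-level rewrites of main. This is only an illustration: clang works on its intermediate representation rather than on C++ source, and the function names MainAfterPointerResolved, MainAfterInlining, and CheckRewritesAgree are invented for this sketch.

```cpp
#include <cassert>

static int Return7()
{
    return 7;
}

// Step 1 (as written in the question): return DoSmth();
// DoSmth is either null (its initial value) or Return7 (the only value
// ever assigned to it in this translation unit).

// Step 2: calling through a null pointer is undefined behaviour, so the
// optimiser may assume DoSmth == Return7 and call the target directly.
int MainAfterPointerResolved()
{
    return Return7();
}

// Step 3: Return7() is visible in the same translation unit, so the call
// can be replaced by its body.
int MainAfterInlining()
{
    return 7;
}

// Sanity check: both rewrites produce the same result.
void CheckRewritesAgree()
{
    assert(MainAfterPointerResolved() == MainAfterInlining());
}
```

The last form corresponds to the emitted movl $7, %eax followed by retq.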
The point is that your code - as it stands - has undefined behaviour. One cute feature of undefined behaviour is that a compiler is permitted (as distinct from required) to reason that your code actually has well-defined behaviour. In turn, that permits the compiler to reason that some code which ensures the behaviour to be well-defined has been magically executed.
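If the intent is for the program to be well-defined even when nothing has assigned the pointer, one option is to test it before the call. The following is a minimal sketch, not part of the original code: the helpers CallDoSmthOr, InstallReturn7, and Demo, as well as the fallback parameter, are assumptions made for illustration.

```cpp
#include <cassert>

typedef int (*Function)();

static Function DoSmth;  // zero-initialised: null until something assigns it

static int Return7()
{
    return 7;
}

// Hypothetical counterpart of NeverCalled() from the question.
void InstallReturn7()
{
    DoSmth = Return7;
}

// Calls through DoSmth only when it is non-null; otherwise returns the
// caller-supplied fallback. No path calls through a null pointer.
int CallDoSmthOr(int fallback)
{
    return DoSmth ? DoSmth() : fallback;
}

// Example use: the fallback is returned until a target is installed.
void Demo()
{
    assert(CallDoSmthOr(-1) == -1);
    InstallReturn7();
    assert(CallDoSmthOr(-1) == 7);
}
```

Because the null case is handled explicitly, the compiler no longer has an undefined path from which to infer that the assignment must have happened.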
Source: https://stackoverflow.com/questions/46272628/how-does-clang-manage-to-compile-this-code-with-undefined-behavior-into-this-mac