GCC, MSVC, LLVM, and probably other toolchains have support for link-time (whole program) optimization to allow optimization of calls among compilation units.
Is there a reason not to enable this option when compiling production software?
This recent question raises another possible (but rather specific) case in which LTO may have undesirable effects: if the code in question is instrumented for timing, and separate compilation units have been used to try to preserve the relative ordering of the instrumented and instrumenting statements, then LTO has a good chance of destroying the necessary ordering.
I did say it was specific.
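To make that concrete, here is a rough sketch (the function names are invented): the work being measured lives in a separate translation unit precisely so the compiler cannot move it across the timing calls; once LTO inlines it, that barrier is gone.

#define _POSIX_C_SOURCE 199309L   /* for clock_gettime / CLOCK_MONOTONIC */
#include <time.h>

/* work.c -- compiled separately so the call acts as an optimization barrier */
void do_work(void);               /* hypothetical function under measurement */

/* timing.c */
double time_do_work(void)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    do_work();                    /* with LTO this may be inlined, letting the
                                     compiler move parts of the work across the
                                     clock reads */
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
}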
Given that the code is implemented correctly, link-time optimization should not have any impact on the functionality. However, there are scenarios where code that is not 100% correct will typically still work without link-time optimization, but will stop working once link-time optimization is enabled. There are similar situations when switching to higher optimization levels, e.g. from -O2 to -O3 with gcc.
That is, depending on your specific context (e.g. age of the code base, size of the code base, depth of tests, whether you are just starting the project or are close to the final release, ...) you would have to judge the risk of such a change.
One scenario where link-time-optimization can lead to unexpected behavior for wrong code is the following:
Imagine you have two source files, read.c and client.c, which you compile into separate object files. In read.c there is a function read that does nothing other than read from a specific memory address. The content at this address should have been marked volatile, but unfortunately that was forgotten. From client.c the function read is called several times within the same function. Since read performs only a single read from the address and there is no optimization across the boundaries of the read function, read will access the respective memory location every time it is called. Consequently, each call from client.c gets a freshly read value from the address, just as if volatile had been used.
Now, with link-time optimization, the tiny function read from read.c is likely to be inlined wherever it is called from client.c. Due to the missing volatile, the compiler will then realize that the code reads several times from the same address, and may therefore optimize away the repeated memory accesses. Consequently, the code starts to behave differently.
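A minimal sketch of what such code could look like (the register address and everything except read.c, client.c and read are invented for illustration):

/* read.c -- the volatile qualifier was forgotten */
unsigned int read(void)
{
    unsigned int *addr = (unsigned int *)0x40000000u; /* hypothetical device register */
    return *addr;       /* should have been: volatile unsigned int *addr */
}

/* client.c */
unsigned int read(void);

void wait_for_ready(void)
{
    /* Without LTO, every call re-reads the register. With LTO, read() can be
       inlined and the repeated loads folded into one, so the loop may never exit. */
    while ((read() & 0x1u) == 0) {
        /* busy-wait */
    }
}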
I assume that by "production software" you mean software that you ship to the customers / goes into production. The answers at Why not always use compiler optimization? (kindly pointed out by Mankarse) mostly apply to situations in which you want to debug your code (so the software is still in the development phase -- not in production).
Six years have passed since I wrote this answer, and an update is necessary. Back in 2014 there were still issues with LTO; as of 2020, I would try to use LTO by default on any of my projects.
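For reference, enabling it with GCC is usually just a matter of passing -flto in addition to the optimization level, at both the compile and the link step (file names are placeholders):

gcc -O2 -flto -c file1.c
gcc -O2 -flto -c file2.c
gcc -O2 -flto file1.o file2.o -o program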
If you have well-written code, it should only be advantageous. You may hit a compiler/linker bug, but this goes for all types of optimisation and is rare.
The biggest downside is that it drastically increases link time.
Apart from this, consider a typical example from an embedded system:
void function1(void) { /*Do something*/} //located at address 0x1000
void function2(void) { /*Do something*/} //located at address 0x1100
void function3(void) { /*Do something*/} //located at address 0x1200
With predefined addresses, functions can be called through hard-coded addresses like below:
((void (*)(void))0x1000)(); //expected to call function1
((void (*)(void))0x1100)(); //expected to call function2
((void (*)(void))0x1200)(); //expected to call function3
LTO can lead to unexpected behavior here: after whole-program optimization the functions may be inlined, merged, or placed at different addresses, so calls through hard-coded addresses no longer reach the intended functions.
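If code like this has to survive LTO, one possible mitigation (only a sketch; the section name is made up and a matching linker-script entry is still needed to pin it to 0x1000) is to mark the functions so they are neither inlined nor removed and to place them in a dedicated section:

/* Keep the function out of LTO's inlining/removal decisions and place it in a
   section that the linker script locates at the required fixed address. */
__attribute__((used, noinline, section(".fixed_funcs")))
void function1(void) { /*Do something*/ }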