I would like to create a function that always returns zero, but this fact should not be obvious to the optimizer, so that subsequent calculations using the value won't constant-fold away due to the "known zero" status.
In the absence of link-time optimization, this is generally as simple as putting the function in its own compilation unit:
int zero() {
    return 0;
}
The optimizer can't see across units, so the always-zero nature of this function won't be discovered.
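For concreteness, a minimal two-file sketch of that setup (the file names are my own illustration):

// zero.cpp -- compiled separately, without LTO
int zero() {
    return 0;
}

// main.cpp -- sees only a declaration, so the constant cannot leak
int zero();

int main() {
    return 4 + zero();  // cannot be folded to 4 at compile time
}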
However, I need something that works with LTO and survives as many future clever optimizations as possible. I considered reading from a global:
int x;

int zero() {
    return x;
}
... but it seems to me that a sufficiently smart compiler could notice that x is never written to and still conclude that zero() always returns zero.
I considered using a volatile, like:
int zero() {
    volatile int x = 0;
    return x;
}
... but the actual semantics of the required side effects of volatile reads aren't exactly clear, and would not seem to exclude the possibility that the function still returns zero.
Such an always-zero-but-not-at-compile-time value is useful in several scenarios, such as forcing a no-op dependency between two values. Something like a += b & zero() causes a to depend on b in the final binary, but doesn't change the value of a.
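For instance, a minimal sketch of that usage (the wrapper function is my own illustration):

// a += b & zero() forces the binary to read b before computing a,
// yet leaves a unchanged, because (b & 0) == 0 at runtime.
int force_dependency(int a, int b) {
    a += b & zero();  // zero() is opaque to the optimizer
    return a;
}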
Don't answer this by telling me the "standard doesn't guarantee any way to do this" - I'm well aware and I'm looking for a practical answer and not language from the standard.
I would be amazed if a compiler can figure this out:
#include <fstream>
#include <iostream>

int not_a_zero_honest_guv()
{
    // static makes sure the initialization code only runs once,
    // on the first call
    static int const i = std::ifstream("") ? 1 : 0;
    return i;
}

int main()
{
    std::cout << not_a_zero_honest_guv();
}
This uses a complex (and, to the compiler, unpredictable) runtime initialization of a function-local static. If the naughty little compiler figures out that an empty filename will always fail to open, then put some illegal filename in there instead.
First an aside: I believe that the OP's third suggestion:
int zero() {
    volatile int x = 0;
    return x;
}
would in fact work (but this is not my answer; see below). Two weeks ago this exact function was the subject of Is it allowed for a compiler to optimize away a local volatile variable?, with much discussion and differing opinions, which I will not repeat here. But for a recent test of this, see https://godbolt.org/g/SA7k5P.
My answer is to add a static to the above, namely:
int zero() {
    static volatile int x;
    return x;
}
See some tests here: https://godbolt.org/g/qzWYJt.
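As a quick sanity check (my own sketch, not part of the linked tests), every call performs a real load from x, so even a loop of calls cannot be folded to a constant:

int sum_of_zeros() {
    int sum = 0;
    for (int i = 0; i < 10; ++i)
        sum += zero();  // each iteration keeps its volatile read under -O3
    return sum;         // computed at runtime, never folded to 0
}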
Now with the addition of static, the abstract concept of "observable behavior" becomes more believable. With a little bit of work, I could figure out the address of x, especially if I disabled address space layout randomization; it would probably be in the .bss segment. Then with a bit more work I could attach a debugger/hacking tool to the running process and change the value of x. And with volatile, I have told the compiler that I might do this, so it is not allowed to change this "observable behavior" by optimizing x away. (It could perhaps optimize the call to zero away by inlining, but I don't care.)
The title of Is it allowed for a compiler to optimize away a local volatile variable? is a bit misleading, as the discussion centred on x being on the stack rather than on it being a local variable, so it is not applicable here. But we could change x from local scope to file scope or even global scope, as in:
volatile int x;

int zero() {
    return x;
}
This would not change my argument.
Further discussion:
Yes, volatiles are sometimes problematic: for example, see the pointer-to-volatile issues shown here https://godbolt.org/g/s6JhpL and in Does accessing a declared non-volatile object through a volatile reference/pointer confer volatile rules upon said accesses?.
And yes, sometimes (always?) compilers have bugs.
But I would like to argue that this solution is not an edge case, and that there is a consensus among compiler writers, and I will do so by looking at existing analyses.
John Regehr's 2010 blogpost Volatile Structs Are Broken reports a bug where a volatile access was optimized away in both gcc and Clang. (It was fixed in three hours.) One commentator quoted the standard (emphasis added):
"6.7.3 ... What constitutes an access to an object that has volatile-qualified type is implementation-defined."
Regehr agreed, but added that there is consensus on how it should work in non-edge cases:
Yes, what constitutes an access to a volatile variable is implementation defined. But you have missed the fact that all reasonable C implementations consider a read from a volatile variable to be a read access and a write to a volatile variable to be a write access.
For further references, see:
Another Regehr 2010 blogpost, Nine ways to break your systems code using volatile.
Wintermute's answer to Volatile and its harmful implications.
These are reports about compiler bugs and programmers' errors. But they show how volatile should (and does) work, and that this answer meets those norms.
You'll find that each compiler has an extension for achieving this.
GCC:
__attribute__((noinline))
int zero()
{
    return 0;
}
MSVC:
__declspec(noinline)
int zero()
{
    return 0;
}
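If you need both toolchains, a hedged sketch of a portable wrapper (the macro name is my own invention):

#if defined(_MSC_VER)
#define ZERO_NOINLINE __declspec(noinline)
#else
#define ZERO_NOINLINE __attribute__((noinline))
#endif

ZERO_NOINLINE int zero()
{
    return 0;
}

One caveat worth hedging: GCC's documentation notes that noinline only prevents inlining, and suggests putting an empty asm(""); statement inside the function if calls to a side-effect-free function must not be optimized away entirely.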
On clang and gcc, clobbering a variable works, but imposes some overhead:
int zero()
{
    int i = 0;
    // Passing i's address as an input and clobbering memory forces the
    // compiler to assume the asm may have rewritten i through the pointer.
    asm volatile("" :: "g"(&i) : "memory");
    return i;
}
which under -O3 on gcc compiles to
mov DWORD PTR [rsp-4], 0
lea rax, [rsp-4]
mov eax, DWORD PTR [rsp-4]
ret
and on clang
mov dword ptr [rsp - 12], 0
lea rax, [rsp - 12]
mov qword ptr [rsp - 8], rax
mov eax, dword ptr [rsp - 12]
ret
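A lighter-weight variant (my own addition, in the spirit of the "do not optimize" helpers found in benchmarking libraries) keeps the value in a register instead of forcing a round-trip through the stack:

int zero()
{
    int i = 0;
    // "+r" makes i both an input and an output of the empty asm, so the
    // compiler must assume the asm may have changed it and can no longer
    // treat the return value as the constant 0.
    asm volatile("" : "+r"(i));
    return i;
}

On gcc and clang at -O3 this typically compiles to just xor eax, eax; ret, with no memory traffic.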
Source: https://stackoverflow.com/questions/51471889/create-a-function-that-always-returns-zero-but-the-optimizer-doesnt-know