RVO force compilation error on failure

Posted by China☆狼群 on 2019-12-29 06:52:26

Question


There are lots of discussions here about when RVO can be done but not much about when it is actually done. As stated many times, RVO cannot be guaranteed according to the Standard, but is there a way to guarantee that either RVO succeeds or the corresponding code fails to compile?

So far I have partially succeeded in making the code issue link errors when RVO fails. For this I declare the copy and move constructors without defining them. Obviously this is neither robust nor feasible in the not-so-rare cases where I need to implement one or both of them, i.e. x(x&&) and x(x const&).

This brings me to my second question: Why have the compiler writers chosen to enable RVO when user defined copy constructors are in place but not when only default copy constructors are present?

Third question: Is there some other way to enable RVO for plain data structures?

Last question (promise): Do you know any compiler that makes my test code behave other than I observed with gcc and clang?

Here is some example code for gcc 4.6, gcc 4.8 and clang 3.3 that shows the problem. The behavior does not depend on general optimization or debug settings. Of course, the option -fno-elide-constructors does what it says, i.e. turns RVO off.

#include <iostream>
using namespace std;

struct x
{
    x () { cout << "original x address" << this << endl; }
};
x make_x ()
{
    return x();
}

struct y
{
    y () { cout << "original y address" << this << endl; }
    // Either of the next two constructors will enable RVO even if only
    // declared but not defined. The implicitly generated ones will not do!
    y(y const & rhs);
    y(y && rhs);
};
y make_y ()
{
    return y();
}

int main ()
{
    auto x1 = make_x();
    cout << "copy of  x address" << &x1 << endl;
    auto y1 = make_y();
    cout << "copy of  y address" << &y1 << endl;
}

Output:

original x address0x7fff8ef01dff
copy of  x address0x7fff8ef01e2e
original y address0x7fff8ef01e2f
copy of  y address0x7fff8ef01e2f

RVO also does not seem to work with plain data structures:

#include <iostream>

using namespace std;

struct x
{
    int a;
};

x make_x ()
{
    x tmp;
    cout << "original x address" << &tmp << endl;
    return tmp;
}

int main ()
{
    auto x1 = make_x();
    cout << "copy of  x address" << &x1 << endl;
}

Output:

original x address0x7fffe7bb2320
copy of  x address0x7fffe7bb2350

UPDATE: Note that some optimizations are very easily confused with RVO. Constructor helpers like make_x are an example. See this example where the optimization is actually enforced by the standard.


Answer 1:


The problem is that the compiler is doing too many optimizations :)

First of all, I disabled the inlining of make_x(); otherwise we cannot distinguish between RVO and inlining. However, I did put the rest into an anonymous namespace so that external linkage does not interfere with any other compiler optimizations. (As the evidence shows, external linkage can prevent inlining, for example, and who knows what else...) I also rewrote the input-output to use printf(); otherwise the generated assembly code would be cluttered with all the iostream stuff. So the code:

#include <cstdio>
using namespace std;

namespace {

struct x {
    //int dummy[1024];
    x() { printf("original x address %p\n", this); }
};

__attribute__((noinline)) x make_x() {
    return x();
}

} // namespace

int main() {
    auto x1 = make_x();
    printf("copy  of x address %p\n", &x1);
}

I analyzed the generated assembly code with a colleague of mine as my understanding of the gcc generated assembly is very limited. Later today, I used clang with the -S -emit-llvm flags to generate LLVM assembly which I personally find much nicer and easier to read than the X86 Assembly/GAS Syntax. It didn't matter which compiler was used, the conclusions are the same.

I rewrote the generated assembly in C++; it roughly looks like this if x is empty:

#include <cstdio>
using namespace std;

struct x { };

void make_x() {
    x tmp;
    printf("original x address %p\n", &tmp);
}

int main() {
    x x1;
    make_x();
    printf("copy  of x address %p\n", &x1);
}

If x is big (the int dummy[1024]; member uncommented):

#include <cstdio>
using namespace std;

struct x { int dummy[1024]; };

void make_x(x* x1) {

    printf("original x address %p\n", x1);
}

int main() {
    x x1;
    make_x(&x1);
    printf("copy  of x address %p\n", &x1);
}

It turns out that if the object is empty, make_x() only has to print some valid, unique address, and it is at liberty to pick one pointing into its own stack. There is also nothing to be copied and nothing to return from make_x().

If you make the object bigger (add the int dummy[1024]; member for example), it gets constructed in place so RVO does kick in, and only the object's address is passed to make_x() to be printed. No object gets copied, nothing gets moved.

If the object is empty, the compiler can decide not to pass an address to make_x() (What a waste of resources would that be? :) ) but let make_x() make up a unique, valid address from its own stack. When this optimization happens is somewhat fuzzy and hard to reason about (that is what you see with y) but it really doesn't matter.

RVO appears to happen consistently in those cases where it matters. And, as my earlier confusion shows, even the whole make_x() function can get inlined, so there is no return value to be optimized away in the first place.
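One way to probe why x and y behave differently in the question is to ask the compiler whether each type is trivially copyable. The sketch below rests on an assumption that the answer above does not state: that the deciding factor is the calling convention, where a small class with only trivial copy/move operations and a trivial destructor may be handed back in registers (or, if empty, not at all) rather than through a hidden pointer. The struct names mirror the question; `may_return_in_registers` is a hypothetical helper and only a rough proxy for what an ABI actually does.

```cpp
#include <type_traits>

// Same shape as x in the question: user-provided default constructor,
// implicitly generated (trivial) copy/move constructors and destructor.
struct x { x() {} };

// Same shape as y: the copy and move constructors are user-declared, which
// makes them non-trivial. They are never called here, so no definition is needed.
struct y {
    y() {}
    y(y const&);
    y(y&&);
};

// Hypothetical helper, a rough proxy only: trivially copyable types are the
// ones a calling convention may return without a caller-provided address.
template <class T>
constexpr bool may_return_in_registers() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(may_return_in_registers<x>(),
              "x has only trivial copy/move operations");
static_assert(!may_return_in_registers<y>(),
              "y has user-declared copy/move constructors");
```

If that assumption holds, it would account for the addresses of x differing while those of y match, without any copy constructor ever running for x.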




Answer 2:


  1. I don't believe there's any way to make such a guarantee. RVO is an optimization and as such the compiler may determine in a particular case that using it is actually a de-optimization and elect to not do so.

  2. I'm assuming you're referring to your first code snippet. In 32-bit compilation I'm unable to reproduce your assertion on g++ 4.4, 4.5, or 4.8 (through ideone.com) even with no optimization enabled at all. In 64-bit compilation I can reproduce your no-RVO behavior. This smells like a 64-bit code generation bug in g++.

  3. If in fact what I observed in (2) is a bug then once the bug is fixed it will just work.

  4. I can confirm that Sun CC also does not RVO your specific examples even in 32-bit compilation.

I do wonder however if somehow your introspection code to print out the addresses is causing the compiler to inhibit the optimization (for example it may need to inhibit the optimization to prevent possible aliasing problems).
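If printing addresses is suspected of disturbing the optimizer, a different way to observe elision is to count copy and move constructor calls instead. This is only a minimal sketch and not something taken from the question; `counted`, `make_counted`, `copies_after_prvalue_return` and `check_no_copies` are hypothetical names. Note that giving the type user-defined copy/move constructors changes the very property the question is about, so this observes elision for such types only.

```cpp
#include <cassert>

struct counted {
    static int copies;                       // copy/move constructions so far
    counted() {}
    counted(counted const&) { ++copies; }
    counted(counted&&) { ++copies; }
};
int counted::copies = 0;

counted make_counted() {
    return counted();                        // returns a prvalue
}

// Returns how many copy/move constructions one call to make_counted() caused.
int copies_after_prvalue_return() {
    int const before = counted::copies;
    counted c = make_counted();
    (void)c;                                 // silence unused-variable warnings
    return counted::copies - before;
}

// Expected to hold whenever the copy is elided; with elision switched off
// under pre-C++17 rules, the count would be non-zero instead.
void check_no_copies() {
    assert(copies_after_prvalue_return() == 0);
}
```

Calling `check_no_copies()` from main() gives a run-time check rather than a compile-time one, so it detects a missed elision but does not turn it into a compilation error.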




Answer 3:


Why have the compiler writers chosen to enable RVO when user defined copy constructors are in place but not when only default copy constructors are present?

Because the standard says so:

C++14, 12.8/31:

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects.

C++14, 12.8/32:

When the criteria for elision of a copy operation are met or would be met save for the fact that the source object is a function parameter, and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If overload resolution fails, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [ Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. —end note ]

You must remember that the RVO (and other copy elisions) are optional.

Imagine code with deleted copy/move constructors/assignments that compiles on your compiler because the RVO kicks in. Then you move that perfectly compiling code to another compiler, where it legally fails to compile. This is not acceptable.

This means the code must always be valid even if the compiler, for some reason, decides to NOT do the RVO optimization.
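For what it is worth, C++17 (which postdates the compilers discussed here) changed the rules quoted above for one case: returning a prvalue such as `return T();` no longer requires an accessible copy or move constructor, while returning a named local variable still does. The sketch below assumes a C++17 compiler; `pinned`, `make_pinned`, `constructed_in_place` and `check_in_place` are hypothetical names. Under that assumption, deleting both constructors gives the "elide or fail to compile" behavior the question asks for.

```cpp
#include <cassert>

struct pinned {
    pinned() : self(this) {}                 // remember where we were constructed
    pinned(pinned const&) = delete;          // no copy ...
    pinned(pinned&&) = delete;               // ... and no move
    pinned const* self;
};

// OK in C++17: the prvalue initializes the caller's object directly.
// Before C++17 this is ill-formed because the move constructor is deleted.
pinned make_pinned() {
    return pinned();
}

// Ill-formed even in C++17, because eliding a named local stays optional:
// pinned make_named() { pinned tmp; return tmp; }

// True when the object was constructed directly in the caller's storage.
bool constructed_in_place() {
    pinned p = make_pinned();
    return p.self == &p;
}

void check_in_place() {
    assert(constructed_in_place());
}
```

The commented-out `make_named` shows the limit of this approach: for a named return value the code is rejected outright, whether or not the compiler would have elided the copy.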



Source: https://stackoverflow.com/questions/19262009/rvo-force-compilation-error-on-failure
