How can I disable c++ return value optimization for one type only?

问题

I have come across the situation where I really do need to execute non-trivial code in a copy-constructor/assignment-operator. The correctness of the algorithm depends on it.

While I could disable return value optimisation with a compiler switch, it seems a waste because it's only the one type I need it disabled for, so why should the performance of the whole application suffer? (Not to mention that my company would not allow me to add the switch, anyway).

struct A {
    explicit A(double val) : m_val(val) {}

    A(const A& other) : m_val(other.m_val) {
        // Do something really important here
    }
    A& operator=(const A& other) {
        if (&other != this) {
            m_val = other.m_val;
            // Do something really important here 
        }
        return *this;
    }
    double m_val;
};

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    return retVal;
}
// Implement other operators like *,+,-,/ etc.

This class would be used as such:

A a1(3), a2(4), a3(5);
A a4 = (a1 + a2) * a3 / a1;

Return value optimisation means that a4 will not be created with the copy constructor, and the "really important thing to do" does not happen!

I know I could hack in a solution where operator+ returns a different type (B, say) and have an A constructor that takes a B as input. But then the number of operators needed to be implemented explodes:

B operator+(const A& a1, const A& a2);
B operator+(const B& a1, const A& a2);
B operator+(const A& a1, const B& a2);
B operator+(const B& a1, const B& a2);

There must be a better solution. How can I hack it so that RVO does not happen for my type? I can only change the A class code and the operators. I can't change the calling site code; i.e. I can't do this:

A a1(3), a2(4), a3(5);
A a4;
a4 = (a1 + a2) * a3 / a1;

One thing I've considered trying is to try and experiment with C++11 move constructors, but I'm not sure this would work, and I don't like it not being valid in C++03.

Any ideas?

EDIT: Please just accept that this is the only way I can do what I need to do. I cannot just 'change the design'. The calling code is fixed, and I must implement my strategy inside the mathematical operators and copy constructor & assignment operator. The idea is that the intermediate values calculated inside the "a4 = (a1+a2)*a3/a1" equation cannot be referenced anywhere else in the program - but a4 can. I know this is vague but you'll just have to live with it.

回答1:

Answering my own question here: I'm going to bite the bullet and use an intermediate type:

struct B;

struct A
{
    A(int i) : m_i(i) {}
    A(const B& a);
    A(const A& a) : m_i(a.m_i)
    {
        std::cout << "A(const A&)" << std::endl;
    }
    int m_i;
};
struct B
{
    B(int i) : m_i(i) {}
    int m_i;
};

A::A(const B& a) : m_i(a.m_i)
{
    std::cout << "A(const B&)" << std::endl;
}

B operator+(const A& a0, const A& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "A+A" << std::endl;
    return b;
}
B operator+(const B& a0, const A& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "B+A" << std::endl;
    return b;
}
B operator+(const A& a0, const B& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "A+B" << std::endl;
    return b;
}
B operator+(const B& a0, const B& a1)
{
    B b(a0.m_i + a1.m_i);
    std::cout << "B+B" << std::endl;
    return b;
}

int main()
{
    A a(1);
    A b(2);
    A c(3);
    A d = (a+b) + (a + b + c);
}

Output on GCC 4.2.1:

A+A
B+A
A+A
B+B
A(const B&)

And I can do the "very important thing" in the A(const B&) constructor.

回答2:

As Angew pointed out, you can use an intermediate type. Here's an example with some optimizations using the move ctor.

#include <utility>
#include <iostream>

struct B;

struct A {
    explicit A(double val) : m_val(val)
    {
        std::cout << "A(double)" << std::endl;
    }
    A(A&& p) : m_val(p.m_val)
    { /* no output */ }

    A(const A& other) : m_val(other.m_val) {
        // Do something really important here
        std::cout << "A(A const&)" << std::endl;
    }
    A& operator=(const A& other) {
        if (&other != this) {
            m_val = other.m_val;
            // Do something really important here
            std::cout << "A::operator=(A const&)" << std::endl;
        }
        return *this;
    }
    double m_val;

    A(B&&);
};

struct B
{
    operator A const&() const
    {
        std::cout << "B::operator A const&()" << std::endl;
        return a;
    }

private:
    friend struct A;
    A a;

    // better: befriend a factory function
    friend B operator+(const A&, const A&);
    friend B operator*(const A&, const A&);
    friend B operator/(const A&, const A&);
    B(A&& p) : a( std::move(p) )
    { /* no output */ }
};

A::A(B&& p) : A( std::move(p.a) )
{
    std::cout << "A(B&&)" << std::endl;
}

B operator+(const A& a1, const A& a2) {
    std::cout << "A const& + A const&" << std::endl;
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    return std::move(retVal);
}

B operator*(const A& a1, const A& a2) {
    std::cout << "A const& * A const&" << std::endl;
    A retVal(a1.m_val * a2.m_val);
    // Do something else important
    return std::move(retVal);
}

B operator/(const A& a1, const A& a2) {
    std::cout << "A const& / A const&" << std::endl;
    A retVal(a1.m_val / a2.m_val);
    // Do something else important
    return std::move(retVal);
}

int main()
{
    A a1(3), a2(4), a3(5);
    A a4 = (a1 + a2) * a3 / a1;
}

IIRC, the temporary returned by, say a1 + a2 lasts for the whole copy-initialization (more precisely: for the whole full-expression, and that includes AFAIK the construction of a4). That's the reason why we can return an A const& from within B, even though the B objects are only created as temporaries. (If I'm wrong about that, see my previous edits for some other solutions.. :D )

The essence of this example is the combination of an intermediate type, move ctors and the said return of a reference.

Output of g++4.6.3 and clang++3.2:

A(double)             <---- A a1(3);
A(double)             <---- A a2(4);
A(double)             <---- A a3(5);
A const& + A const&   <---- a1 + a2;
A(double)               <-- A retVal(a1.m_val + a2.m_val);
B::operator A const&()<---- __temp__ conversion B --> const A&
A const& * A const&   <---- __temp__ * a3;
A(double)               <-- A retVal(a1.m_val * a2.m_val);
B::operator A const&()<---- __temp__ conversion B --> const A&
A const& / A const&   <---- __temp__ / a1;
A(double)               <-- A retVal(a1.m_val / a2.m_val);
A(B&&)                <---- A a4 = __temp__;

Now that the copy and move operations (which are not shown) are split up, I think you can implement your "something important" more precisely where it belongs to:

A(double) -- creation of a new A object from numerical values
A(A const&) -- actual copy of an A object; doesn't happen here
A(B&&) -- construction of an A object from an operator result
B(A&&) -- invoked for the return value of an operator
B::operator A const&() const -- invoked to use the return value of an operator

回答3:

RVO is allowed by the standard, in the following cases ([class.copy]§31, listing only applicable parts):

in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value

when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move

In your code:

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    return retVal;
}


A a4 = (a1 + a2) * a3 / a1;

there are two elidable copies involved: copying revVal into temporary object storing return value of operator+, and copying this temporary object into a4.

I can't see a way to prevent elision of the second copy (the one from return value to a4), but the "non-volatile" part of the standard makes me believe this should prevent elision of the first copy:

A operator+(const A& a1, const A& a2) {
    A retVal(a1.m_val + a2.m_val);
    // Do something else important
    volatile A volRetVal(retVal);
    return volRetVal;
}

Of course this means you'll have to define an additional copy constructor for A taking const volatile A&.

来源：https://stackoverflow.com/questions/16234323/how-can-i-disable-c-return-value-optimization-for-one-type-only

标签

c++

return-value-optimization

copy-elision