Efficient use of move semantics together with (N)RVO

问题

Let's say I want to implement a function that is supposed to process an object and return a new (possibly changed) object. I would like to do this as efficient as possible in C+11. The environment is as follows:

class Object {
    /* Implementation of Object */
    Object & makeChanges();
};

The alternatives that come to my mind are:

// First alternative:
Object process1(Object arg) { return arg.makeChanges(); }
// Second alternative:
Object process2(Object const & arg) { return Object(arg).makeChanges(); }
Object process2(Object && arg) { return std::move(arg.makeChanges()); }
// Third alternative:
Object process3(Object const & arg) { 
    Object retObj = arg; retObj.makeChanges(); return retObj; 
}
Object process3(Object && arg) { std::move(return arg.makeChanges()); }

Note: I would like to use a wrapping function like process() because it will do some other work and I would like to have as much code reuse as possible.

Updates:

I used the makeChanges() with the given signature because the objects I am dealing with provides methods with that type of signature. I guess they used that for method chaining. I also fixed the two syntax errors mentioned. Thanks for pointing those out. I also added a third alternative and I will repose the question below.

Trying these out with clang [i.e. Object obj2 = process(obj);] results in the following:

First option makes two calls to the copy constructor; one for passing the argument and one for returning. One could instead say return std::move(..) and have one call to the copy constructor and one call to the move constructor. I understand that RVO can not get rid of one of these calls because we are dealing with the function parameter.

In the second option, we still have two calls to the copy constructor. Here we make one explicit call and one is made while returning. I was expecting for RVO to kick in and get rid of the latter since the object we are returning is a different object than the argument. However, it did not happen.

In the third option we have only one call to the copy constructor and that is the explicit one. (N)RVO eliminates the copy constructor call we would do for returning.

My questions are the following:

(answered) Why does RVO kick in the last option and not the second?
Is there a better way to do this?
Had we passed in a temporary, 2nd and 3rd options would call a move constructor while returning. Is is possible to eliminate that using (N)RVO?

Thanks!

回答1:

I like to measure, so I set up this Object:

#include <iostream>

struct Object
{
    Object() {}
    Object(const Object&) {std::cout << "Object(const Object&)\n";}
    Object(Object&&) {std::cout << "Object(Object&&)\n";}

    Object& makeChanges() {return *this;}
};

And I theorized that some solutions may give different answers for xvalues and prvalues (both of which are rvalues). And so I decided to test both of them (in addition to lvalues):

Object source() {return Object();}

int main()
{
    std::cout << "process lvalue:\n\n";
    Object x;
    Object t = process(x);
    std::cout << "\nprocess xvalue:\n\n";
    Object u = process(std::move(x));
    std::cout << "\nprocess prvalue:\n\n";
    Object v = process(source());
}

Now it is a simple matter of trying all of your possibilities, those contributed by others, and I threw one in myself:

#if PROCESS == 1

Object
process(Object arg)
{
    return arg.makeChanges();
}

#elif PROCESS == 2

Object
process(const Object& arg)
{
    return Object(arg).makeChanges();
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 3

Object
process(const Object& arg)
{
    Object retObj = arg;
    retObj.makeChanges();
    return retObj; 
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 4

Object
process(Object arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 5

Object
process(Object arg)
{
    arg.makeChanges();
    return arg;
}

#endif

The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:

+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |    legend: copies/moves
+----+--------+--------+---------+
| p1 |  2/0   |  1/1   |   1/0   |
+----+--------+--------+---------+
| p2 |  2/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p3 |  1/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p4 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+
| p5 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+

process3 looks like the best solution to me. However it does require two overloads. One to process lvalues and one to process rvalues. If for some reason this is problematic, solutions 4 and 5 do the job with only one overload at the cost of 1 extra move construction for glvalues (lvalues and xvalues). It is a judgement call as to whether one wants to pay an extra move construction to save overloading (and there is no one right answer).

(answered) Why does RVO kick in the last option and not the second?

For RVO to kick in, the return statement needs to look like:

return arg;

If you complicate that with:

return std::move(arg);

or:

return arg.makeChanges();

then RVO gets inhibited.

Is there a better way to do this?

My favorites are p3 and p5. My preference of p5 over p4 is merely stylistic. I shy away from putting move on the return statement when I know it will be applied automatically for fear of accidentally inhibiting RVO. However in p5 RVO is not an option anyway, even though the return statement does get an implicit move. So p5 and p4 really are equivalent. Pick your style.

Had we passed in a temporary, 2nd and 3rd options would call a move constructor while returning. Is is possible to eliminate that using (N)RVO?

The "prvalue" column vs "xvalue" column addresses this question. Some solutions add an extra move construction for xvalues and some don't.

回答2:

None of the functions you show will have any significant return value optimizations on their return values.

makeChanges returns an Object&. Therefore, it must be copied into a value, since you're returning it. So the first two will always make a copy of the value to be returned. In terms of the number of copies, the first one makes two copies (one for the parameter, one for the return value). The second one makes two copies (one explicitly in the function, one for the return value.

The third one shouldn't even compile, since you can't implicitly convert an l-value reference into an r-value reference.

So really, don't do this. If you want to pass an object, and modify it in-situ, then just do this:

Object &process1(Object &arg) { return arg.makeChanges(); }

This modifies the provided object. No copying or anything. Granted, one might wonder why process1 isn't a member function or something, but that doesn't matter.

回答3:

The fastest way to do this is- if the argument is lvalue, then copy it and return that copy- if rvalue, then move it. The return can always be moved or have RVO/NRVO applied. This is easily accomplished.

Object process1(Object arg) {
    return std::move(arg.makeChanges());
}

This is very similar to the canonical C++11 forms of many kinds of operator overloads.

来源：https://stackoverflow.com/questions/9952622/efficient-use-of-move-semantics-together-with-nrvo

标签

c++

c++11

move-semantics

return-value-optimization