How to return large data efficiently in C++11

Question

I'm really confused about returning large data in C++11. What is the most efficient way? Here are the relevant functions:

void numericMethod1(vector<double>& solution,
                    const double input);

void numericMethod2(pair<vector<double>,vector<double>>& solution1,
                    vector<double>& solution2,
                    const double input1,
                    const double input2);

And here is how I use them:

int main()
{
    // apply numericMethod1
    double input = 0;
    vector<double> solution;
    numericMethod1(solution, input);

    // apply numericMethod2
    double input1 = 1;
    double input2 = 2;
    pair<vector<double>,vector<double>> solution1;
    vector<double> solution2;
    numericMethod2(solution1, solution2, input1, input2);

    return 0;
}

The question is: is std::move() useless in the following implementation?

Implementation:

void numericMethod1(vector<double>& solution,
                    const double input)
{
    vector<double> tmp_solution;

    for (...)
    {
    // some operations on tmp_solution;
    // afterwards this vector is very large
    }

    solution = std::move(tmp_solution);
}

void numericMethod2(pair<vector<double>,vector<double>>& solution1,
                    vector<double>& solution2,
                    const double input1,
                    const double input2)
{
    vector<double> tmp_solution1_1;
    vector<double> tmp_solution1_2;
    vector<double> tmp_solution2;

    for (...)
    {
    // some operations on tmp_solution1_1, tmp_solution1_2 and tmp_solution2;
    // afterwards the three vectors are very large
    }

    solution1.first = std::move(tmp_solution1_1);
    solution1.second = std::move(tmp_solution1_2);
    solution2 = std::move(tmp_solution2);
}

If they are useless, how can I handle these large return values without copying them multiple times? Feel free to change the API!

UPDATE

Thanks to Stack Overflow and these answers, and after digging into related questions, I understand the problem better. To take advantage of RVO I changed the API, and for clarity I no longer use std::pair. Here is my new code:

struct SolutionType
{
    vector<double> X;
    vector<double> Y;
};

SolutionType newNumericMethod(const double input1,
                              const double input2);

int main()
{
    // apply newNumericMethod
    double input1 = 1;
    double input2 = 2;
    SolutionType solution = newNumericMethod(input1, input2);

    return 0;
}

SolutionType newNumericMethod(const double input1,
                              const double input2)
{
    SolutionType tmp_solution; // this calls the default constructor, right?
    // the member names are long, so I use reference aliases
    vector<double> &x = tmp_solution.X;
    vector<double> &y = tmp_solution.Y;

    for (...)
    {
    // some operations on x and y;
    // afterwards these two vectors are very large
    }

    return tmp_solution;
}

How can I tell whether RVO happened? And how can I ensure that it happens?

Answer 1:

Return by value, rely on RVO (return value optimization).

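// The return type is spelled out because plain `auto` return type
// deduction requires C++14; this form is valid C++11.
vector<huge_thing> make_big_vector()
{
    vector<huge_thing> v1;
    // fill v1

    // explicit move is not necessary here
    return v1;
}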

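// Return type spelled out for C++11 (plain `auto` deduction needs C++14)
std::tuple<vector<double>, vector<huge_thing>> make_big_stuff_tuple()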
{
    vector<double> v0;
    // fill v0

    vector<huge_thing> v1;
    // fill v1

    // explicit move is necessary for make_tuple's arguments,
    // as make_tuple uses perfect-forwarding:
    // http://en.cppreference.com/w/cpp/utility/tuple/make_tuple

    return std::make_tuple(std::move(v0), std::move(v1));
}

auto r0 = make_big_vector();
auto r1 = make_big_stuff_tuple();

I would change the API of your functions to simply return by value.
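
One way to see what a return statement actually costs is to return a type that counts its own copies and moves. The following is only a sketch: `Tracer`, `make_tracer` and `copies_when_returning` are hypothetical names introduced for illustration, not part of the question's code. Whether the move count is zero depends on the compiler applying (N)RVO; the copy count should be zero either way, because a local object named in a return statement is moved when elision is not applied.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    std::vector<double> data;

    Tracer() = default;
    Tracer(const Tracer& other) : data(other.data) { ++copies; }
    Tracer(Tracer&& other) noexcept : data(std::move(other.data)) { ++moves; }
};

int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns a large local object by value, like make_big_vector() above.
Tracer make_tracer(std::size_t n) {
    Tracer t;
    t.data.assign(n, 1.0);  // placeholder data
    return t;               // NRVO candidate; otherwise moved, never copied
}

// Resets the counters, returns one object by value, reports the copy count.
int copies_when_returning(std::size_t n) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = make_tracer(n);
    assert(result.data.size() == n);
    return Tracer::copies;
}
```

After calling `copies_when_returning`, `Tracer::moves` shows whether elision took place: zero means the object was constructed directly in the caller. With GCC or Clang, compiling with `-fno-elide-constructors` turns elision off, which makes the difference visible.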

Answer 2:

You could use the std::vector::swap member function, which exchanges the contents of the container with those of another vector. It does not invoke any move, copy, or swap operations on individual elements.

solution1.first.swap(tmp_solution1_1);
solution1.second.swap(tmp_solution1_2);
solution2.swap(tmp_solution2);
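
As a rough illustration of why the swap is cheap, the sketch below records the address of the local vector's buffer before the swap and checks that the output parameter owns that same buffer afterwards. `fill_by_swap` is a hypothetical stand-in for numericMethod1, and the fill value is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for numericMethod1: builds the result in a local
// vector, then hands its buffer to the output parameter via member swap.
void fill_by_swap(std::vector<double>& solution, std::size_t n) {
    std::vector<double> tmp(n, 0.5);  // placeholder data
    const double* buffer = tmp.data();

    solution.swap(tmp);

    // swap does not touch the elements: pointers to them stay valid and
    // now refer to elements owned by `solution`.
    assert(solution.data() == buffer);
    // `tmp` now holds whatever `solution` held before; it is released
    // when `tmp` goes out of scope at the end of the function.
}
```

One consequence worth noting: any old contents of the output vector are not freed at the swap itself but when the local vector is destroyed.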

edit:

These statements are not useless,

solution1.first = std::move(tmp_solution1_1);
solution1.second = std::move(tmp_solution1_2);
solution2 = std::move(tmp_solution2);

They invoke std::vector's move assignment operator, operator=(vector&&), which moves the contents of the right-hand-side vector into the left-hand side.
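
The same kind of check can be sketched for move assignment into the pair. `SolutionPair` and `fill_by_move` are hypothetical names, and the fill values are placeholders. The function reports whether both buffers were adopted unchanged, which is what mainstream implementations do with the default allocator; treat that as an expectation rather than something this sketch proves in general.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<std::vector<double>, std::vector<double> > SolutionPair;

// Hypothetical stand-in for numericMethod2: builds two local vectors and
// move-assigns them into the caller's pair. Returns true when both heap
// buffers were taken over as-is instead of being copied.
bool fill_by_move(SolutionPair& solution, std::size_t n) {
    std::vector<double> tmp_first(n, 0.25);   // placeholder data
    std::vector<double> tmp_second(n, 0.75);  // placeholder data
    const double* buffer_first = tmp_first.data();
    const double* buffer_second = tmp_second.data();

    solution.first = std::move(tmp_first);
    solution.second = std::move(tmp_second);

    assert(solution.first.size() == n);
    assert(solution.second.size() == n);
    return solution.first.data() == buffer_first &&
           solution.second.data() == buffer_second;
}
```

After the move, the local vectors are left in a valid but unspecified state, so the function does not read from them again.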

Answer 3:

When you have large data such as a very big vector<double>, you can still return it by value: C++11's move semantics kick in for std::vector, so returning it from your function amounts to a few pointer assignments (vector<double>'s content is typically heap-allocated under the hood).

So I would just do:

// No worries in returning large vectors by value
std::vector<double> numericMethod1(const double input)
{
    std::vector<double> result;

    // Compute your vector<double>'s content
    ...

    // NOTE: Don't call std::move() here.
    // A simple return statement is just fine.
    return result;
}

(Note that other kinds of optimizations already available in C++98/03, such as RVO/NRVO, can be applied as well, depending on the particular C++ compiler.)

Instead, if you have a method that returns multiple output values, then I'd use non-const references, just like in C++98/03:

void numericMethod2(pair<vector<double>,vector<double>>& output1,
                    vector<double>& output2,
                    vector<double>& output3,
                    ...
                    const double input1,
                    const double input2);

Inside the implementation, you can still use a valid C++98/03 technique of "swap-timization", where you can just call std::swap() to swap local variables and output parameters:

#include <utility> // for std::swap

void numericMethod2(pair<vector<double>,vector<double>>& solution1,
                    vector<double>& solution2,
                    const double input1,
                    const double input2)

{
    vector<double> tmp_solution1_1;
    vector<double> tmp_solution1_2;
    vector<double> tmp_solution2;

    // Some processing to compute local solution vectors
    ...

    // Return output values to caller via swap-timization
    swap(solution1.first, tmp_solution1_1);
    swap(solution1.second, tmp_solution1_2);
    swap(solution2, tmp_solution2);
}

Swapping vectors typically just swaps the vectors' internal pointers to the heap-allocated memory they own, so you get pointer assignments rather than deep copies, memory reallocations, or similarly expensive operations.

Answer 4:

First of all, why don't you use solution1 directly in numericMethod2? That is more direct.

Unlike std::array or a built-in array (obj[]), a vector's elements are not stored on the stack but on the heap (look at the standard library source code: it uses operator new() a lot). So if a vector is only a temporary whose contents will be handed over to somewhere else, use std::swap or std::move. A function's return value is itself an rvalue, so it can be moved from as well.

The same holds for the other heap-backed standard containers (std::map, std::set, std::deque, std::list, etc.).
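
The first suggestion in this answer, working on the output parameter directly, could look roughly like the sketch below. `numericMethod1_in_place`, the extra `n` parameter, and the computation inside the loop are all hypothetical placeholders, since the question does not show the real algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical variant of numericMethod1 that fills the caller's vector
// in place, with no temporary and no final move or swap.
void numericMethod1_in_place(std::vector<double>& solution,
                             const double input,
                             const std::size_t n) {
    solution.clear();     // drop old contents but keep the capacity
    solution.reserve(n);  // at most one allocation up front
    for (std::size_t i = 0; i < n; ++i) {
        // placeholder computation standing in for the real numeric method
        solution.push_back(input + static_cast<double>(i));
    }
    assert(solution.size() == n);
}
```

The trade-off is exception safety: if the computation throws halfway through, the caller sees a partially filled vector, whereas building a local vector and swapping or moving at the end leaves the caller's vector untouched on failure.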

Source: https://stackoverflow.com/questions/37117815/how-to-return-large-data-efficiently-in-c11

Tags

c++

c++11

parameter-passing

return-value