Efficient accumulate

故事扮演 submitted on 2021-02-18 06:18:11

Question


Assume I have a vector of strings and I want to concatenate them via std::accumulate.

If I use the following code:

std::vector<std::string> foo{"foo","bar"};
std::string res;
res=std::accumulate(foo.begin(),foo.end(),res,
  [](const std::string &rs,const std::string &arg){ return rs+arg; });

I can be pretty sure there will be temporary object construction.

In this answer they say that the effect of std::accumulate is specified this way:

Computes its result by initializing the accumulator acc with the initial value init and then modifies it with acc = acc + *i or acc = binary_op(acc, *i) for every iterator i in the range [first,last) in order.
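
The quoted wording can be pictured as a plain loop. The following is a minimal sketch under stated assumptions, not any library's actual source: `accumulate_sketch` and `concat_by_value` are hypothetical names introduced here. The loop uses the C++20 form of the rule, in which the accumulator is handed to the operation as `std::move(acc)`; earlier standards pass `acc` itself.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for std::accumulate, written out to show the
// per-element assignment that the quoted wording describes.
template <class It, class T, class Op>
T accumulate_sketch(It first, It last, T init, Op op) {
    for (; first != last; ++first) {
        // C++98 to C++17 wording: acc = op(acc, *first);
        // C++20 wording:          acc = op(std::move(acc), *first);
        init = op(std::move(init), *first);
    }
    return init;
}

// With the C++20 form, taking the accumulator by value lets it be moved in,
// appended to in place, and moved back out by the assignment in the loop.
inline std::string concat_by_value(const std::vector<std::string>& parts) {
    return accumulate_sketch(parts.begin(), parts.end(), std::string{},
        [](std::string acc, const std::string& s) {
            acc += s;
            return acc;  // a by-value parameter is moved on return
        });
}
```

Called on a vector holding "foo" and "bar", `concat_by_value` returns "foobar". With the pre-C++20 wording the same lambda would copy the accumulator on every step, which is the cost the question is about.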

So I'm wondering what is the correct way to do this to avoid the unnecessary temporary object construction.

One idea was to change the lambda this way:

[](string &rs,string &arg){ rs+=arg; return rs; }

In this case, I thought I would force efficient concatenation of the strings and help the compiler (I know I shouldn't) omit the unnecessary copy, since this should be equivalent to (pseudocode):

accum = [](& accum,& arg){ ...; return accum; }

and thus

accum = & accum;

Another idea was to use

accum = [](& accum,& arg){ ...; return std::move(accum); }

But this would probably lead to something like:

accum = std::move(& accum);

Which looks very suspicious to me.

What is the correct way to write this to minimize the risk of unnecessary temporary objects being created? I'm not just interested in std::string; I'd be happy to have a solution that would work for any object that has copy and move constructors/assignments implemented.


Answer 1:


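Try the following:

res=std::accumulate(foo.begin(),foo.end(),res,
  // Before C++20 only: since C++20 the accumulator is passed as std::move(acc),
  // which cannot bind to a non-const lvalue reference.
  [](std::string &rs, const std::string &arg) -> std::string & { return rs+=arg; });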

Before this call it may make sense to compute the total length and reserve it:

std::string::size_type n = std::accumulate( foo.begin(), foo.end(), 
   std::string::size_type( 0 ),
   [] ( std::string::size_type n, const std::string &s ) { return ( n += s.size() ); } );

res.reserve( n );



Answer 2:


I would break this into two operations, first std::accumulate to obtain the total length of the string that needs to be created, then a std::for_each with a lambda that updates the local string:

std::string::size_type total = std::accumulate(foo.begin(), foo.end(),
                std::string::size_type(0), // not 0u, which would make the accumulator an unsigned int
                [](std::string::size_type c, std::string const& s) {
                    return c+s.size();
                });
std::string result;
result.reserve(total);
std::for_each(foo.begin(), foo.end(), 
              [&](std::string const& s) { result += s; });

The common alternative to this is using expression templates, but that does not fit in an answer. Basically you create a data structure that maps the operations, but does not execute them. When the expression is finally evaluated, it can gather the information it needs upfront and use that to reserve the space and do the copies. The code that uses the expression template is nicer, but more complicated.
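
To make the idea concrete, here is a deliberately tiny sketch of an expression template for string concatenation. Everything in it (the `lazy_concat` namespace, `Leaf`, `Concat`, `lazy`) is a hypothetical name made up for illustration, and it handles only left-to-right chains of `+` with `std::string` operands. A real implementation would need to cover far more cases.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace lazy_concat {  // hypothetical namespace, for illustration only

// Leaf node: refers to an existing string without copying it.
struct Leaf {
    const std::string& str;
    std::size_t size() const { return str.size(); }
    void append_to(std::string& out) const { out += str; }
};

// Inner node: records "lhs + rhs" but does not perform it.
template <class L, class R>
struct Concat {
    L lhs;
    R rhs;
    std::size_t size() const { return lhs.size() + rhs.size(); }
    void append_to(std::string& out) const {
        lhs.append_to(out);
        rhs.append_to(out);
    }
    // The expression is evaluated only here: one reserve, then the copies.
    operator std::string() const {
        std::string out;
        out.reserve(size());
        append_to(out);
        assert(out.size() == size());
        return out;
    }
};

// Entry point: wrap the first operand so that + builds a tree of nodes.
inline Leaf lazy(const std::string& s) { return Leaf{s}; }

inline Concat<Leaf, Leaf> operator+(Leaf lhs, const std::string& rhs) {
    return {lhs, Leaf{rhs}};
}

template <class L, class R>
Concat<Concat<L, R>, Leaf> operator+(Concat<L, R> lhs, const std::string& rhs) {
    return {lhs, Leaf{rhs}};
}

}  // namespace lazy_concat
```

With three strings `a`, `b` and `c`, writing `std::string r = lazy_concat::lazy(a) + b + c;` builds the node tree and then performs a single reserve followed by three appends. The nodes hold references, so the expression should be converted to `std::string` within the same statement rather than stored with `auto`.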




Answer 3:


Using std::accumulate efficiently, without any redundant copies, is not obvious.
In addition to being reassigned and passed into and out of the lambda, the accumulating value may get copied internally by the implementation.
Also note that std::accumulate() itself takes the initial value by value, which calls the copy constructor and so ignores any reserve() done on the source of the copy (as suggested in some of the other answers).

The most efficient way I found to concatenate the strings is as follows:

std::vector<std::string> str_vec{"foo","bar"};

// get reserve size:
auto sz = std::accumulate(str_vec.cbegin(), str_vec.cend(), std::string::size_type(0),
   [](std::string::size_type sz, auto const& str) { return sz + str.size() + 1; });

std::string res;
res.reserve(sz);
std::accumulate(str_vec.cbegin(), str_vec.cend(),
   std::ref(res), // use a ref wrapper to keep same object with capacity
   [](std::string& a, std::string const& b) -> std::string& // must specify return type because cannot return `std::reference_wrapper<std::string>`.
{                                                           // can't use `auto&` args for the same reason
   a += b;
   return a;
});

The result will be in res.
This implementation has no redundant copies, moves or reallocations.
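
One way to check the no-reallocation claim is to compare the string's capacity before and after the accumulation. The sketch below wraps the same std::ref technique in a hypothetical helper, `concat_with_ref`, and asserts that the capacity did not change. It assumes that an append which fits within the reserved capacity does not reallocate, which holds on mainstream standard libraries.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper wrapping the std::ref approach from this answer.
inline std::string concat_with_ref(const std::vector<std::string>& parts) {
    const auto total = std::accumulate(
        parts.cbegin(), parts.cend(), std::string::size_type(0),
        [](std::string::size_type n, const std::string& s) { return n + s.size(); });

    std::string res;
    res.reserve(total);
    const auto capacity_before = res.capacity();

    // The returned reference_wrapper still refers to res, so it is ignored.
    (void)std::accumulate(parts.cbegin(), parts.cend(), std::ref(res),
        [](std::string& a, const std::string& b) -> std::string& {
            a += b;
            return a;
        });

    // Assumption: appends that fit in the reserved capacity do not reallocate.
    assert(res.capacity() == capacity_before);
    assert(res.size() == total);
    return res;
}
```

The assignment inside std::accumulate only rebinds the `std::reference_wrapper` to the same string, so no string is copied or moved during the loop.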




Answer 4:


This is a bit tricky, since there are two operations involved, the addition and the assignment. In order to avoid the copies, you have to both modify the string in the addition, and ensure that the assignment is a no-op. It's the second part which is tricky.

What I've done on occasion is to create a custom "accumulator", along the lines of:

class Accu
{
    std::string myCollector;
    enum DummyToSuppressAsgn { dummy };
public:
    Accu( std::string const& startingValue = std::string() )
        : myCollector( startingValue )
    {
    }
    //  Default copy ctor and copy asgn are OK.
    //  On the other hand, we need the following special operators
    Accu& operator=( DummyToSuppressAsgn )
    {
        //  Don't do anything...
        return *this;
    }
    DummyToSuppressAsgn operator+( std::string const& other )
    {
        myCollector += other;
        return dummy;
    }
    //  And to get the final results...
    operator std::string() const
    {
        return myCollector;
    }
};

There'll be a few copies when calling accumulate, and of the return value, but during the actual accumulation, nothing. Just invoke:

std::string results = std::accumulate( foo.begin(), foo.end(), Accu() );

(If you're really concerned about performance, you can add a capacity argument to the constructor of Accu, so that it can do a reserve on the member string. If I did this, I'd probably hand write the copy constructor as well, to ensure that the string in the copied object had the required capacity.)
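
The capacity variant described in that last remark might look roughly like the following. This is a sketch under stated assumptions, not the answer author's code: `ReservingAccu`, its `myCapacity` member, and the `concat_with_accu` helper are hypothetical names, and the copy constructor reserves before appending so that each copy starts with the requested capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical variant of Accu that carries a capacity hint.
class ReservingAccu {
    std::string myCollector;
    std::size_t myCapacity;
    enum DummyToSuppressAsgn { dummy };
public:
    explicit ReservingAccu(std::size_t capacity) : myCapacity(capacity) {
        myCollector.reserve(myCapacity);
    }
    // Hand-written copy constructor: reserve first, then append, so the
    // copy gets the requested capacity as well.
    ReservingAccu(const ReservingAccu& other) : myCapacity(other.myCapacity) {
        myCollector.reserve(myCapacity);
        myCollector += other.myCollector;
    }
    ReservingAccu& operator=(const ReservingAccu&) = default;
    // Assignment of the dummy is a no-op, as in Accu.
    ReservingAccu& operator=(DummyToSuppressAsgn) { return *this; }
    DummyToSuppressAsgn operator+(const std::string& other) {
        myCollector += other;
        return dummy;
    }
    operator std::string() const { return myCollector; }
};

// Hypothetical helper: sum the lengths, then accumulate into ReservingAccu.
inline std::string concat_with_accu(const std::vector<std::string>& parts) {
    std::size_t total = 0;
    for (const std::string& s : parts) total += s.size();
    std::string result =
        std::accumulate(parts.begin(), parts.end(), ReservingAccu(total));
    assert(result.size() == total);
    return result;
}
```

As with Accu, the accumulator is still copied when std::accumulate returns it and again when it is converted to std::string; only the per-element steps are free of copies.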



Source: https://stackoverflow.com/questions/19664196/efficient-accumulate
