Can the compiler elide the following copy?

Posted by 允我心安 on 2019-12-22 01:53:35

Question


I'm still a rookie programmer. I know that premature optimization is bad, but I also know that copying huge objects around is bad as well.

I've read up on copy elision and its synonyms, but the examples (on Wikipedia, for instance) make it seem to me that copy elision can only take place if the object to be returned gets returned at the same time it gets completely constructed.

What about objects like vectors, which as a return value usually only make sense when filled with something? After all, an empty vector could just be instantiated manually.

So, does it also work in a case like this?

bad style for brevity:

vector<foo> bar(string baz)
{
    vector<foo> out;
    for (char letter : baz)   // visit each character of baz (C++11 range-based for)
        out.push_back(someTable[letter]);

    return out;
}

int main()
{
     vector<foo> oof = bar("Hello World");
}

I have no real trouble using bar(vector & out, string text), but the above way would look so much better, both aesthetically and in expressing intent.
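For comparison, the two call styles mentioned here can be sketched side by side. This is only an illustration under assumptions: the question never defines `foo` or `someTable`, so `foo` is taken to be an `int` alias and `someTable` a hypothetical lookup object that maps each character to its byte value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: stand-ins for the undefined names in the question.
using foo = int;

struct Table {
    // Hypothetical lookup: maps a character to its unsigned byte value.
    foo operator[](char letter) const
    {
        return static_cast<foo>(static_cast<unsigned char>(letter));
    }
};
const Table someTable{};

// Return-by-value style: the result is built locally and returned by name.
std::vector<foo> bar(const std::string& baz)
{
    std::vector<foo> out;
    out.reserve(baz.size());
    for (char letter : baz)
        out.push_back(someTable[letter]);
    return out;   // a candidate for named return value optimization
}

// Out-parameter style: the caller supplies the vector to fill.
void bar(std::vector<foo>& out, const std::string& text)
{
    out.clear();
    for (char letter : text)
        out.push_back(someTable[letter]);
}

// Both styles produce the same contents; only the call site differs.
void check_same_result(const std::string& text)
{
    std::vector<foo> by_value = bar(text);
    std::vector<foo> by_ref;
    bar(by_ref, text);
    assert(by_value == by_ref);
}
```

At the call site the first form reads as `std::vector<foo> oof = bar("Hello World");`, while the second needs a separately declared vector before the call.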


Answer 1:


the examples on wikipedia for example make it seem to me that copy elision can only take place if the object to be returned gets returned at the same time it gets completely constructed.

That is misleading (read: wrong). The requirement is rather that the same object is returned on all code paths, i.e. that only one object is ever constructed as the potential return value.

Your code is fine; any modern compiler can elide the copy.

On the other hand, the following code could potentially generate problems:

vector<int> foo() {
    vector<int> a;
    vector<int> b;
    // … fill both.
    bool c;
    std::cin >> c;
    if (c) return a; else return b;
}

Here, the compiler needs to fully construct two distinct objects and only later decides which of them is returned. It therefore has to copy once (or, since C++11, move), because it cannot construct the returned object directly in the target memory location.
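The difference between the two situations can be made observable with a small sketch. `Tracer`, `Counts` and the function names below are hypothetical, introduced only for illustration; the code assumes C++11 or later, where returning a local by name is treated as a move if it cannot be elided.

```cpp
#include <cassert>

// Hypothetical tracer type: counts copy and move constructions.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// One named object on every return path: it can be built in the return slot.
Tracer single_candidate()
{
    Tracer a;
    return a;
}

// Two named objects: which one is returned is only known at run time.
Tracer two_candidates(bool pick_first)
{
    Tracer a;
    Tracer b;
    if (pick_first) return a; else return b;
}

struct Counts { int copies; int moves; };

Counts measure_single()
{
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = single_candidate();
    (void)result;
    return Counts{Tracer::copies, Tracer::moves};
}

Counts measure_two(bool pick_first)
{
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = two_candidates(pick_first);
    (void)result;
    return Counts{Tracer::copies, Tracer::moves};
}

// Returning a local by name prefers the move constructor, so the copy
// constructor should not be needed in either function.
void check_no_copies()
{
    assert(measure_single().copies == 0);
    assert(measure_two(true).copies == 0);
}
```

On a compiler that applies the named return value optimization, `measure_single()` would typically report no moves at all, while `measure_two()` reports at least one. The exact numbers depend on the compiler, the language standard and flags such as -fno-elide-constructors, so they are not stated here as guaranteed output.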




Answer 2:


There is nothing preventing the compiler from eliding the copy. This is defined in 12.8.15 of the C++ standard:

[...] This elision of copy operations is permitted in the following circumstances (which may be combined to eliminate multiple copies):

[...]

  • when a temporary class object that has not been bound to a reference (12.2) would be copied to a class object with the same cv-unqualified type, the copy operation can be omitted by constructing the temporary object directly into the target of the omitted copy

Whether it actually does so depends on the compiler and the settings you use.




Answer 3:


Both implied copies of the vector can be, and often are, eliminated. The named return value optimization can eliminate the copy implied in the return statement return out;, and the temporary implied in the copy initialization of oof is allowed to be eliminated as well.

With both optimizations in play the object constructed in vector<foo> out; is the same object as oof.
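One way to observe this identity directly, without reading assembly, is to have the object remember where it was constructed. The sketch below uses a hypothetical `Probe` type and function names that are not part of the original answer.

```cpp
#include <cassert>

// Hypothetical probe type: remembers the address at which it was first built.
struct Probe {
    const void* born_at;
    int value;
    Probe() : born_at(this), value(0) {}
    // A copy keeps the ORIGINAL object's address, so a copy can be detected.
    Probe(const Probe& other) : born_at(other.born_at), value(other.value) {}
};

Probe make_probe()
{
    Probe out;        // plays the role of `vector<foo> out;`
    out.value = 42;   // modified after construction, like the filled vector
    return out;
}

// True when `out` inside make_probe() and the caller's variable share
// the same storage, i.e. no copy was made on the way out.
bool constructed_in_place()
{
    Probe oof = make_probe();
    assert(oof.value == 42);   // the contents arrive intact either way
    return oof.born_at == &oof;
}
```

With both optimizations applied, `constructed_in_place()` would be expected to return true. Because the named return value optimization is permitted rather than required, a false result is also conforming.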

It's easier to test which of these optimizations are being performed with an artificial test case such as this:

struct CopyMe
{
    CopyMe();
    CopyMe(const CopyMe& x);
    CopyMe& operator=(const CopyMe& x);

    char data[1024]; // give it some bulk
};

void Mutate(CopyMe&);

CopyMe fn()
{
    CopyMe x;
    Mutate(x);
    return x;
}

int main()
{
    CopyMe y = fn();
    return 0;
}

The copy constructor is declared but not defined so that calls to it can't be inlined and eliminated. Compiling with a now comparatively old gcc 4.4 gives the following assembly at -O3 -fno-inline (filtered to demangle C++ names and edited to remove non-code).

fn():
        pushq   %rbx
        movq    %rdi, %rbx
        call    CopyMe::CopyMe()
        movq    %rbx, %rdi
        call    Mutate(CopyMe&)
        movq    %rbx, %rax
        popq    %rbx
        ret

main:
        subq    $1032, %rsp
        movq    %rsp, %rdi
        call    fn()
        xorl    %eax, %eax
        addq    $1032, %rsp
        ret

As can be seen, there are no calls to the copy constructor. In fact, gcc performs these optimizations even at -O0. You have to pass the -fno-elide-constructors option to turn this behaviour off; if you do, gcc generates two calls to the copy constructor of CopyMe: one inside and one outside of the call to fn().

fn():
        movq    %rbx, -16(%rsp)
        movq    %rbp, -8(%rsp)
        subq    $1048, %rsp
        movq    %rdi, %rbx
        movq    %rsp, %rdi
        call    CopyMe::CopyMe()
        movq    %rsp, %rdi
        call    Mutate(CopyMe&)
        movq    %rsp, %rsi
        movq    %rbx, %rdi
        call    CopyMe::CopyMe(CopyMe const&)
        movq    %rbx, %rax
        movq    1040(%rsp), %rbp
        movq    1032(%rsp), %rbx
        addq    $1048, %rsp
        ret

main:
        pushq   %rbx
        subq    $2048, %rsp
        movq    %rsp, %rdi
        call    fn()
        leaq    1024(%rsp), %rdi
        movq    %rsp, %rsi
        call    CopyMe::CopyMe(CopyMe const&)
        xorl    %eax, %eax
        addq    $2048, %rsp
        popq    %rbx
        ret


Source: https://stackoverflow.com/questions/6139188/can-the-compiler-elide-the-following-copy
