Undefined Behavior with the C++0x Closure: II

Submitted by 雨燕双飞 on 2019-12-05 22:04:47

I would like to find out why the code has undefined behavior.

Whenever I deal with a complex, intricate lambda, I find it easier to first translate it into function-object form. Lambdas are just syntactic sugar for function objects: each lambda maps one-to-one to a corresponding function object. This article explains really well how to do the translation: http://blogs.msdn.com/b/vcblog/archive/2008/10/28/lambdas-auto-and-static-assert-c-0x-features-in-vc10-part-1.aspx

So, for example, your program no. 2:

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

would be translated by the compiler into approximately this:

#include <iostream>

struct InnerAccumulator
{
    int& x;
    InnerAccumulator(int& x):x(x)
    {
    }
    int operator()(int y) const
    {
        return x+=y;
    }
};

struct Accumulator
{
    InnerAccumulator operator()(int x) const
    {
        return InnerAccumulator(x); // constructor
    }
};


int main()
{
    Accumulator accumulator;
    InnerAccumulator ac = accumulator(1);
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
}

And now the problem becomes quite obvious:

InnerAccumulator operator()(int x) const
{
   return InnerAccumulator(x); // constructor
}

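Here the constructor of InnerAccumulator takes a reference to x, a by-value parameter that is local to operator() and is destroyed as soon as operator() returns. The returned InnerAccumulator therefore holds a dangling reference, and every later call to ac(1) reads and writes through it. So yes, you get plain old undefined behavior, as you suspected.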

Let's try something entirely innocent-looking:

#include <iostream>
int main(){
    auto accumulator = [](int x) {
        return [&](int y) -> int { 
            return x+=y;
        }; 
    };
    auto ac=accumulator(1);

    //// Surely this should be a no-op? 
    accumulator(666);
    //// There are no side effects and we throw the result away!

    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
    std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl; 
}

Tada:

669 668 667 
672 671 670 
675 674 673 

Of course, this is not guaranteed behaviour either. Indeed, with optimizations enabled, gcc will eliminate the accumulator(666) call figuring it's dead code, and we again get the original results. And it is entirely within its rights to do so; in a conforming program, removing the call would indeed not affect the semantics. But in the realm of undefined behaviour, anything may happen.


EDIT

auto ac=accumulator(1);

std::cout << pow(2,2) << std::endl;

std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl;
std::cout << ac(1) << " " << ac(1) << " " << ac(1) << " " << std::endl; 

Without optimizations enabled, I get the following:

4
1074790403 1074790402 1074790401 
1074790406 1074790405 1074790404 
1074790409 1074790408 1074790407 

With optimizations enabled:

4
4 3 2 
7 6 5 
10 9 8

Again, C++ does not and cannot provide true lexical closures where the lifetime of local variables would get extended beyond their original scope. That would entail bringing garbage collection and heap-based locals to the language.

This is all rather academic, though, as capturing x by copy makes the program well-defined and makes it work as expected:

auto accumulator = [](int x) {
    return [x](int y) mutable -> int { 
        return x += y;
    }; 
};
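
For comparison with the function-object translation earlier in the thread, here is roughly what the by-copy version corresponds to. This is only a sketch: the names ValueAccumulator and ValueAccumulatorFactory are made up for illustration, and a real compiler generates unnamed closure types. The point is that the member is an int rather than an int&, and that mutable on the lambda corresponds to a non-const operator().

```cpp
// Hand-written counterpart of [x](int y) mutable -> int { return x += y; }.
// ValueAccumulator is a hypothetical name; the real closure type is unnamed.
struct ValueAccumulator
{
    int x; // a copy owned by the object, not a reference to a dead parameter

    explicit ValueAccumulator(int x) : x(x) {}

    // "mutable" on the lambda corresponds to a non-const operator(),
    // which is what allows the stored copy to be modified.
    int operator()(int y)
    {
        return x += y;
    }
};

// Counterpart of the outer lambda (hypothetical name as well).
struct ValueAccumulatorFactory
{
    ValueAccumulator operator()(int x) const
    {
        return ValueAccumulator(x); // copies x into the returned object
    }
};
```

With ac = accumulator(1) built this way, three calls to ac(1) made in separate statements give 2, 3 and 4, and an intervening accumulator(666) has no effect on ac, because each object carries its own x.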

Well, references become dangling when the referent goes away. You have a very fragile design if object A has a reference to some part of object B, unless object A in some way can guarantee the lifetime of object B (for instance, when A holds a shared_ptr to B anyway, or both are in the same scope).

References in lambdas are no magical exception. If you return a lambda that still refers to x in x+=y, you'd better make sure that x lives long enough. Here x is the parameter int x, initialized as part of the call accumulator(1). The lifetime of a function parameter ends when the function returns.
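
If the accumulated state really has to be shared rather than copied, one way to guarantee the lifetime, along the lines of the shared_ptr remark above, is to let the closure own the state through a std::shared_ptr. This is a sketch under my own assumptions, not something from the question: make_shared_accumulator is a hypothetical helper name, and std::function is used only to give the return type a name.

```cpp
#include <functional>
#include <memory>

// Hypothetical helper: returns an accumulator whose state lives on the heap
// and stays alive for as long as any copy of the closure exists.
std::function<int(int)> make_shared_accumulator(int start)
{
    auto total = std::make_shared<int>(start);

    // The shared_ptr is captured by copy, so the closure co-owns the int.
    // No "mutable" is needed: the pointer itself is never modified,
    // only the int it points to.
    return [total](int y) -> int {
        return *total += y;
    };
}
```

Unlike the by-copy capture of x, copies of this closure all refer to the same counter: after auto ac = make_shared_accumulator(1); and auto copy = ac;, the calls ac(1), copy(1), ac(1) made in separate statements return 2, 3 and 4.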
