Why isn't g++ tail call optimizing while gcc is?

忘掉有多难 2020-12-31 15:36

I wanted to check whether g++ supports tail calling so I wrote this simple program to check it: http://ideone.com/hnXHv

using namespace std;

size_t st;

v         


        
3 Answers
  • 2020-12-31 16:03

    I don't find the other answer satisfying because a local object has no effect on the stack once it's gone.

    Here is a good article which mentions that the lifetime of local objects extends into the tail-called function. Tail call optimization requires destroying locals before relinquishing control, so GCC will not apply it unless it is sure that no local object will be accessed by the tail call.

    Lifetime analysis is hard though, and it looks like it's being done too conservatively. Setting a global pointer to reference a local disables TCO even if the local's lifetime (scope) ends before the tail call.

    {
        int x;
        static int * p;
        p = & x;
    } // x is dead here, but the enclosing function still has TCO disabled.
    

    This still doesn't seem to model what's happening, so I kept looking and found another bug. Passing a local to a parameter whose type has a user-defined or non-trivial destructor also disables TCO. (Defining the destructor as = delete allows TCO.)

    std::string has a nontrivial destructor, so that's causing the issue here.

    The workaround is to do these things in a nested function call, because lifetime analysis will then be able to tell that the object is dead by the tail call. But there's no need to forgo all C++ objects.

  • 2020-12-31 16:16

    The original code with the temporary std::string object is still tail recursive: the destructor for that object runs immediately after PrintStackTop(""); returns, so nothing needs to be executed after the recursive return statement.

    However, two things confuse tail call optimization (TCO) here:

    • the argument is passed by reference to the PrintStackTop function
    • non-trivial destructor of std::string

    Using a custom class, it can be verified that each of these two issues breaks TCO on its own. As noted in the previous answer by @Potatoswatter, there is a workaround for both: wrapping the call to PrintStackTop in another function is enough to let the compiler perform TCO, even with the temporary std::string:

    void PrintStackTopTail()
    {
        PrintStackTop("tail");
    }
    int TailCallFactorial(int n, int a = 1)
    {
        PrintStackTopTail();
    //...
    }
    

    Note that it is not enough to limit the scope by enclosing { PrintStackTop("tail"); } in curly braces. The call must be wrapped in a separate function.

    Now it can be verified with g++ version 4.7.2 (compilation options -O2) that tail recursion is replaced by a loop.

    A similar issue is discussed in the question "Pass-by-reference hinders gcc from tail call elimination".

    Note that printing (st - (size_t) &stack_top) is not enough to be sure that TCO was performed. For example, with the optimization option -O3 the function TailCallFactorial is inlined into itself five times, so TailCallFactorial(5) executes as a single function call; the problem only shows up for larger argument values (for example TailCallFactorial(15);). TCO is therefore best verified by reviewing the assembly output generated with the -S flag.

  • 2020-12-31 16:18

    Because you're passing a temporary std::string object to the PrintStackTop(std::string) function. This object is allocated on the stack and thus prevents the tail call optimization.

    I modified your code:

    void PrintStackTopStr(char const*const type)
    {
        int stack_top;
        if(st == 0) st = (size_t) &stack_top;
        cout << "In " << type << " call version, the stack top is: " << (st - (size_t) &stack_top) << endl;
    }
    
    int RealTailCallFactorial(int n, int a = 1)
    {
        PrintStackTopStr("tail");
        if(n < 2)
            return a;
        return RealTailCallFactorial(n - 1, n * a);
    }
    

    Compile with: g++ -O2 -fno-exceptions -o tailcall tailcall.cpp

    And it now uses the tail call optimisation. You can see it in action if you use the -S flag to produce the assembly:

    L39:
            imull   %ebx, %esi
            subl    $1, %ebx
    L38:
            movl    $LC2, (%esp)
            call    __Z16PrintStackTopStrPKc
            cmpl    $1, %ebx
            jg      L39
    

    You can see that the recursive call has been turned into a loop (jg L39).
