I wanted to check whether g++ supports tail calling so I wrote this simple program to check it: http://ideone.com/hnXHv
using namespace std;
size_t st;
v
I don't find the other answer satisfying because a local object has no effect on the stack once it's gone.
Here is a good article which mentions that the lifetime of local objects extends into the tail-called function. Tail call optimization requires destroying locals before relinquishing control, GCC will not apply it unless it is sure that no local object will be accessed by the tail call.
Lifetime analysis is hard though, and it looks like it's being done too conservatively. Setting a global pointer to reference a local disables TCO even if the local's lifetime (scope) ends before the tail call.
{
int x;
static int * p;
p = & x;
} // x is dead here, but the enclosing function still has TCO disabled.
This still doesn't seem to model what's happening, so I found another bug. Passing local to a parameter with a user-defined or non-trivial destructor also disables TCO. (Defining the destructor = delete
allows TCO.)
std::string
has a nontrivial destructor, so that's causing the issue here.
The workaround is to do these things in a nested function call, because lifetime analysis will then be able to tell that the object is dead by the tail call. But there's no need to forgo all C++ objects.
The original code with temporary std::string
object is still tail recursive, since the destructor for that object is executed immediately after exit from PrintStackTop("");
, so nothing should be executed after the recursive return
statement.
However, there are two issues that lead to confusion of tail call optimization (TCO):
PrintStackTop
functionIt can be verified by custom class that each of those two issues is able to break TCO.
As it is noted in the previous answer by @Potatoswatter there is a workaround for both of those issues. It is enough to wrap call of PrintStackTop
by another function to help the compiler to perform TCO even with temporary std::string
:
void PrintStackTopTail()
{
PrintStackTop("tail");
}
int TailCallFactorial(int n, int a = 1)
{
PrintStackTopTail();
//...
}
Note that is not enough to limit the scope by enclosing { PrintStackTop("tail"); }
in curly braces. It must be enclosed as a separate function.
Now it can be verified with g++ version 4.7.2 (compilation options -O2) that tail recursion is replaced by a loop.
The similar issue is observed in Pass-by-reference hinders gcc from tail call elimination
Note that printing (st - (size_t) &stack_top)
is not enough to be sure that TCO is performed, for example with the optimization option -O3 the function TailCallFactorial
is self inlined five times, so TailCallFactorial(5)
is executed as a single function call, but the issue is revealed for larger argument values (for example for TailCallFactorial(15);
). So, the TCO may be verified by reviewing assembly output generated with the -S flag.
Because you're passing a temporary std::string object to the PrintStackTop(std::string) function. This object is allocated on the stack and thus prevent the tail call optimization.
I modified your code:
void PrintStackTopStr(char const*const type)
{
int stack_top;
if(st == 0) st = (size_t) &stack_top;
cout << "In " << type << " call version, the stack top is: " << (st - (size_t) &stack_top) << endl;
}
int RealTailCallFactorial(int n, int a = 1)
{
PrintStackTopStr("tail");
if(n < 2)
return a;
return RealTailCallFactorial(n - 1, n * a);
}
Compile with: g++ -O2 -fno-exceptions -o tailcall tailcall.cpp
And it now uses the tail call optimisation. You can see it in action if you use the -S flag to produce the assembly:
L39:
imull %ebx, %esi
subl $1, %ebx
L38:
movl $LC2, (%esp)
call __Z16PrintStackTopStrPKc
cmpl $1, %ebx
jg L39
You see the recursive call inlined as a loop (jg L39).