We use stack traces in proprietary assert
like macro to catch developer mistakes - when error is caught, stack trace is printed.
I find gcc\'s pair
So you want a stand-alone function that prints a stack trace with all of the features that gdb stack traces have and that doesn't terminate your application. The answer is to automate the launch of gdb in a non-interactive mode to perform just the tasks that you want.
This is done by executing gdb in a child process, using fork(), and scripting it to display a stack-trace while your application waits for it to complete. This can be performed without the use of a core-dump and without aborting the application. I learned how to do this from looking at this question: How it's better to invoke gdb from program to print it's stacktrace?
The example posted with that question didn't work for me exactly as written, so here's my "fixed" version (I ran this on Ubuntu 9.04).
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/prctl.h>
void print_trace() {
char pid_buf[30];
sprintf(pid_buf, "%d", getpid());
char name_buf[512];
name_buf[readlink("/proc/self/exe", name_buf, 511)]=0;
prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0);
int child_pid = fork();
if (!child_pid) {
dup2(2,1); // redirect output to stderr - edit: unnecessary?
execl("/usr/bin/gdb", "gdb", "--batch", "-n", "-ex", "thread", "-ex", "bt", name_buf, pid_buf, NULL);
abort(); /* If gdb failed to start */
} else {
waitpid(child_pid,NULL,0);
}
}
As shown in the referenced question, gdb provides additional options that you could use. For example, using "bt full" instead of "bt" produces an even more detailed report (local variables are included in the output). The manpages for gdb are kind of light, but complete documentation is available here.
Since this is based on gdb, the output includes demangled names, line-numbers, function arguments, and optionally even local variables. Also, gdb is thread-aware, so you should be able to extract some thread-specific metadata.
Here's an example of the kind of stack traces that I see with this method.
0x00007f97e1fc2925 in waitpid () from /lib/libc.so.6
[Current thread is 0 (process 15573)]
#0 0x00007f97e1fc2925 in waitpid () from /lib/libc.so.6
#1 0x0000000000400bd5 in print_trace () at ./demo3b.cpp:496
2 0x0000000000400c09 in recursive (i=2) at ./demo3b.cpp:636
3 0x0000000000400c1a in recursive (i=1) at ./demo3b.cpp:646
4 0x0000000000400c1a in recursive (i=0) at ./demo3b.cpp:646
5 0x0000000000400c46 in main (argc=1, argv=0x7fffe3b2b5b8) at ./demo3b.cpp:70
Note: I found this to be incompatible with the use of valgrind (probably due to Valgrind's use of a virtual machine). It also doesn't work when you are running the program inside of a gdb session (can't apply a second instance of "ptrace" to a process).
You can use DeathHandler - small C++ class which does everything for you, reliable.
Since the GPL licensed code is intended to help you during development, you could simply not include it in the final product. The GPL restricts you from distributing GPL licenses code linked with non-GPL compatible code. As long as you only use the GPL code inhouse, you should be fine.
I suppose line numbers are related to current eip value, right?
SOLUTION 1:
Then you can use something like GetThreadContext(), except that you're working on linux. I googled around a bit and found something similar, ptrace():
The ptrace() system call provides a means by which a parent process may observe and control the execution of another process, and examine and change its core image and registers. [...] The parent can initiate a trace by calling fork(2) and having the resulting child do a PTRACE_TRACEME, followed (typically) by an exec(3). Alternatively, the parent may commence trace of an existing process using PTRACE_ATTACH.
Now I was thinking, you can do a 'main' program which checks for signals that are sent to its child, the real program you're working on. after fork()
it call waitid():
All of these system calls are used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed.
and if a SIGSEGV (or something similar) is caught call ptrace()
to obtain eip
's value.
PS: I've never used these system calls (well, actually, I've never seen them before ;) so I don't know if it's possible neither can help you. At least I hope these links are useful. ;)
SOLUTION 2:
The first solution is quite complicated, right? I came up with a much simpler one: using signal() catch the signals you are interested in and call a simple function that reads the eip
value stored in the stack:
...
signal(SIGSEGV, sig_handler);
...
void sig_handler(int signum)
{
int eip_value;
asm {
push eax;
mov eax, [ebp - 4]
mov eip_value, eax
pop eax
}
// now you have the address of the
// **next** instruction after the
// SIGSEGV was received
}
That asm syntax is Borland's one, just adapt it to GAS
. ;)
Not too long ago I answered a similar question. You should take a look at the source code available on method #4, which also prints line numbers and filenames.
A small improvement I've done on method #3 to print line numbers. This could be copied to work on method #2 also.
Basically, it uses addr2line to convert addresses into file names and line numbers.
The source code below prints line numbers for all local functions. If a function from another library is called, you might see a couple of ??:0
instead of file names.
#include <stdio.h>
#include <signal.h>
#include <stdio.h>
#include <signal.h>
#include <execinfo.h>
void bt_sighandler(int sig, struct sigcontext ctx) {
void *trace[16];
char **messages = (char **)NULL;
int i, trace_size = 0;
if (sig == SIGSEGV)
printf("Got signal %d, faulty address is %p, "
"from %p\n", sig, ctx.cr2, ctx.eip);
else
printf("Got signal %d\n", sig);
trace_size = backtrace(trace, 16);
/* overwrite sigaction with caller's address */
trace[1] = (void *)ctx.eip;
messages = backtrace_symbols(trace, trace_size);
/* skip first stack frame (points here) */
printf("[bt] Execution path:\n");
for (i=1; i<trace_size; ++i)
{
printf("[bt] #%d %s\n", i, messages[i]);
/* find first occurence of '(' or ' ' in message[i] and assume
* everything before that is the file name. (Don't go beyond 0 though
* (string terminator)*/
size_t p = 0;
while(messages[i][p] != '(' && messages[i][p] != ' '
&& messages[i][p] != 0)
++p;
char syscom[256];
sprintf(syscom,"addr2line %p -e %.*s", trace[i], p, messages[i]);
//last parameter is the file name of the symbol
system(syscom);
}
exit(0);
}
int func_a(int a, char b) {
char *p = (char *)0xdeadbeef;
a = a + b;
*p = 10; /* CRASH here!! */
return 2*a;
}
int func_b() {
int res, a = 5;
res = 5 + func_a(a, 't');
return res;
}
int main() {
/* Install our signal handler */
struct sigaction sa;
sa.sa_handler = (void *)bt_sighandler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
sigaction(SIGSEGV, &sa, NULL);
sigaction(SIGUSR1, &sa, NULL);
/* ... add any other signal here */
/* Do something */
printf("%d\n", func_b());
}
This code should be compiled as: gcc sighandler.c -o sighandler -rdynamic
The program outputs:
Got signal 11, faulty address is 0xdeadbeef, from 0x8048975
[bt] Execution path:
[bt] #1 ./sighandler(func_a+0x1d) [0x8048975]
/home/karl/workspace/stacktrace/sighandler.c:44
[bt] #2 ./sighandler(func_b+0x20) [0x804899f]
/home/karl/workspace/stacktrace/sighandler.c:54
[bt] #3 ./sighandler(main+0x6c) [0x8048a16]
/home/karl/workspace/stacktrace/sighandler.c:74
[bt] #4 /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x3fdbd6]
??:0
[bt] #5 ./sighandler() [0x8048781]
??:0
Here's an alternative approach. A debug_assert() macro programmatically sets a conditional breakpoint. If you are running in a debugger, you will hit a breakpoint when the assert expression is false -- and you can analyze the live stack (the program doesn't terminate). If you are not running in a debugger, a failed debug_assert() causes the program to abort and you get a core dump from which you can analyze the stack (see my earlier answer).
The advantage of this approach, compared to normal asserts, is that you can continue running the program after the debug_assert is triggered (when running in a debugger). In other words, debug_assert() is slightly more flexible than assert().
#include <iostream>
#include <cassert>
#include <sys/resource.h>
// note: The assert expression should show up in
// stack trace as parameter to this function
void debug_breakpoint( char const * expression )
{
asm("int3"); // x86 specific
}
#ifdef NDEBUG
#define debug_assert( expression )
#else
// creates a conditional breakpoint
#define debug_assert( expression ) \
do { if ( !(expression) ) debug_breakpoint( #expression ); } while (0)
#endif
void recursive( int i=0 )
{
debug_assert( i < 5 );
if ( i < 10 ) recursive(i+1);
}
int main( int argc, char * argv[] )
{
rlimit core_limit = { RLIM_INFINITY, RLIM_INFINITY };
setrlimit( RLIMIT_CORE, &core_limit ); // enable core dumps
recursive();
}
Note: Sometimes "conditional breakpoints" setup within debuggers can be slow. By establishing the breakpoint programmatically, the performance of this method should be equivalent to that of a normal assert().
Note: As written, this is specific to the Intel x86 architecture -- other processors may have different instructions for generating a breakpoint.