How to automatically generate a stacktrace when my gcc C++ program crashes

When my C++ program crashes I would like it to automatically generate a stacktrace.

My program is being run by many different users and it also runs on Linux, Windows and Macintosh (all versions are compiled using gcc ).

I would like my program to be able to generate a stack trace when it crashes and the next time the user runs it, it will ask them if it is ok to send the stack trace to me so I can track down the problem. I can handle the sending the info to me but I don't know how to generate the trace string. Any ideas?


For Linux and I believe Mac OS X, if you're using gcc, or any compiler that uses glibc, you can use the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault. Documentation can be found in the libc manual.

Here's an example program that installs a SIGSEGV handler and prints a stacktrace to stderr when it segfaults. The baz() function here causes the segfault that triggers the handler:

#include <stdio.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>


void handler(int sig) {
  void *array[10];
  size_t size;

  // get void*'s for all entries on the stack
  size = backtrace(array, 10);

  // print out all the frames to stderr
  fprintf(stderr, "Error: signal %d:n", sig);
  backtrace_symbols_fd(array, size, STDERR_FILENO);
  exit(1);
}

void baz() {
 int *foo = (int*)-1; // make a bad pointer
  printf("%dn", *foo);       // causes segfault
}

void bar() { baz(); }
void foo() { bar(); }


int main(int argc, char **argv) {
  signal(SIGSEGV, handler);   // install our handler
  foo(); // this will call foo, bar, and baz.  baz segfaults.
}

Compiling with -g -rdynamic gets you symbol info in your output, which glibc can use to make a nice stacktrace:

$ gcc -g -rdynamic ./test.c -o test

Executing this gets you this output:

$ ./test
Error: signal 11:
./test(handler+0x19)[0x400911]
/lib64/tls/libc.so.6[0x3a9b92e380]
./test(baz+0x14)[0x400962]
./test(bar+0xe)[0x400983]
./test(foo+0xe)[0x400993]
./test(main+0x28)[0x4009bd]
/lib64/tls/libc.so.6(__libc_start_main+0xdb)[0x3a9b91c4bb]
./test[0x40086a]

This shows the load module, offset, and function that each frame in the stack came from. Here you can see the signal handler on top of the stack, and the libc functions before main in addition to main , foo , bar , and baz .


Linux

While the use of the backtrace() functions in execinfo.h to print a stacktrace and exit gracefully when you get a segmentation fault has already been suggested, I see no mention of the intricacies necessary to ensure the resulting backtrace points to the actual location of the fault (at least for some architectures - x86 & ARM).

The first two entries in the stack frame chain when you get into the signal handler contain a return address inside the signal handler and one inside sigaction() in libc. The stack frame of the last function called before the signal (which is the location of the fault) is lost.

Code

#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#ifndef __USE_GNU
#define __USE_GNU
#endif

#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

/* This structure mirrors the one found in /usr/include/asm/ucontext.h */
typedef struct _sig_ucontext {
 unsigned long     uc_flags;
 struct ucontext   *uc_link;
 stack_t           uc_stack;
 struct sigcontext uc_mcontext;
 sigset_t          uc_sigmask;
} sig_ucontext_t;

void crit_err_hdlr(int sig_num, siginfo_t * info, void * ucontext)
{
 void *             array[50];
 void *             caller_address;
 char **            messages;
 int                size, i;
 sig_ucontext_t *   uc;

 uc = (sig_ucontext_t *)ucontext;

 /* Get the address at the time the signal was raised */
#if defined(__i386__) // gcc specific
 caller_address = (void *) uc->uc_mcontext.eip; // EIP: x86 specific
#elif defined(__x86_64__) // gcc specific
 caller_address = (void *) uc->uc_mcontext.rip; // RIP: x86_64 specific
#else
#error Unsupported architecture. // TODO: Add support for other arch.
#endif

 fprintf(stderr, "signal %d (%s), address is %p from %pn", 
  sig_num, strsignal(sig_num), info->si_addr, 
  (void *)caller_address);

 size = backtrace(array, 50);

 /* overwrite sigaction with caller's address */
 array[1] = caller_address;

 messages = backtrace_symbols(array, size);

 /* skip first stack frame (points here) */
 for (i = 1; i < size && messages != NULL; ++i)
 {
  fprintf(stderr, "[bt]: (%d) %sn", i, messages[i]);
 }

 free(messages);

 exit(EXIT_FAILURE);
}

int crash()
{
 char * p = NULL;
 *p = 0;
 return 0;
}

int foo4()
{
 crash();
 return 0;
}

int foo3()
{
 foo4();
 return 0;
}

int foo2()
{
 foo3();
 return 0;
}

int foo1()
{
 foo2();
 return 0;
}

int main(int argc, char ** argv)
{
 struct sigaction sigact;

 sigact.sa_sigaction = crit_err_hdlr;
 sigact.sa_flags = SA_RESTART | SA_SIGINFO;

 if (sigaction(SIGSEGV, &sigact, (struct sigaction *)NULL) != 0)
 {
  fprintf(stderr, "error setting signal handler for %d (%s)n",
    SIGSEGV, strsignal(SIGSEGV));

  exit(EXIT_FAILURE);
 }

 foo1();

 exit(EXIT_SUCCESS);
}

Output

signal 11 (Segmentation fault), address is (nil) from 0x8c50
[bt]: (1) ./test(crash+0x24) [0x8c50]
[bt]: (2) ./test(foo4+0x10) [0x8c70]
[bt]: (3) ./test(foo3+0x10) [0x8c8c]
[bt]: (4) ./test(foo2+0x10) [0x8ca8]
[bt]: (5) ./test(foo1+0x10) [0x8cc4]
[bt]: (6) ./test(main+0x74) [0x8d44]
[bt]: (7) /lib/libc.so.6(__libc_start_main+0xa8) [0x40032e44]

All the hazards of calling the backtrace() functions in a signal handler still exist and should not be overlooked, but I find the functionality I described here quite helpful in debugging crashes.

It is important to note that the example I provided is developed/tested on Linux for x86. I have also successfully implemented this on ARM using uc_mcontext.arm_pc instead of uc_mcontext.eip .

Here's a link to the article where I learned the details for this implementation: http://www.linuxjournal.com/article/6391


It's even easier than "man backtrace", there's a little-documented library (GNU specific) distributed with glibc as libSegFault.so, which was I believe was written by Ulrich Drepper to support the program catchsegv (see "man catchsegv").

This gives us 3 possibilities. Instead of running "program -o hai":

  • Run within catchsegv:

    $ catchsegv program -o hai
    
  • Link with libSegFault at runtime:

    $ LD_PRELOAD=/lib/libSegFault.so program -o hai
    
  • Link with libSegFault at compile time:

    $ gcc -g1 -lSegFault -o program program.cc
    $ program -o hai
    
  • In all 3 cases, you will get clearer backtraces with less optimization (gcc -O0 or -O1) and debugging symbols (gcc -g). Otherwise, you may just end up with a pile of memory addresses.

    You can also catch more signals for stack traces with something like:

    $ export SEGFAULT_SIGNALS="all"       # "all" signals
    $ export SEGFAULT_SIGNALS="bus abrt"  # SIGBUS and SIGABRT
    

    The output will look something like this (notice the backtrace at the bottom):

    *** Segmentation fault Register dump:
    
     EAX: 0000000c   EBX: 00000080   ECX:
    00000000   EDX: 0000000c  ESI:
    bfdbf080   EDI: 080497e0   EBP:
    bfdbee38   ESP: bfdbee20
    
     EIP: 0805640f   EFLAGS: 00010282
    
     CS: 0073   DS: 007b   ES: 007b   FS:
    0000   GS: 0033   SS: 007b
    
     Trap: 0000000e   Error: 00000004  
    OldMask: 00000000  ESP/signal:
    bfdbee20   CR2: 00000024
    
     FPUCW: ffff037f   FPUSW: ffff0000  
    TAG: ffffffff  IPOFF: 00000000  
    CSSEL: 0000   DATAOFF: 00000000  
    DATASEL: 0000
    
     ST(0) 0000 0000000000000000   ST(1)
    0000 0000000000000000  ST(2) 0000
    0000000000000000   ST(3) 0000
    0000000000000000  ST(4) 0000
    0000000000000000   ST(5) 0000
    0000000000000000  ST(6) 0000
    0000000000000000   ST(7) 0000
    0000000000000000
    
    Backtrace:
    /lib/libSegFault.so[0xb7f9e100]
    ??:0(??)[0xb7fa3400]
    /usr/include/c++/4.3/bits/stl_queue.h:226(_ZNSt5queueISsSt5dequeISsSaISsEEE4pushERKSs)[0x805647a]
    /home/dbingham/src/middle-earth-mud/alpha6/src/engine/player.cpp:73(_ZN6Player5inputESs)[0x805377c]
    /home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:159(_ZN6Socket4ReadEv)[0x8050698]
    /home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:413(_ZN12ServerSocket4ReadEv)[0x80507ad]
    /home/dbingham/src/middle-earth-mud/alpha6/src/engine/socket.cpp:300(_ZN12ServerSocket4pollEv)[0x8050b44]
    /home/dbingham/src/middle-earth-mud/alpha6/src/engine/main.cpp:34(main)[0x8049a72]
    /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7d1b775]
    /build/buildd/glibc-2.9/csu/../sysdeps/i386/elf/start.S:122(_start)[0x8049801]
    

    If you want to know the gory details, the best source is unfortunately the source: See http://sourceware.org/git/?p=glibc.git;a=blob;f=debug/segfault.c and its parent directory http://sourceware.org/git/?p=glibc.git;a=tree;f=debug

    链接地址: http://www.djcxy.com/p/2468.html

    上一篇: 什么时候应该使用交叉套用内部联结?

    下一篇: 如何在我的gcc C ++程序崩溃时自动生成堆栈跟踪