Friday, May 15, 2009

Pre-mortem Backtracing

A backtrace is often the first step in debugging a problem. Generating a backtrace is generally thought of as a function of the debugger, on a core file after a process has died. However it is sometimes quite useful to generate a live backtrace while a process runs. For example, crashing the process in the field may not be acceptable if a problem is survivable. Logging a backtrace and other information can provide enough to locate the root cause, without having to trigger any customer downtime.


 
gcc backtrace support

The simplest way to get a crude backtrace is the __builtin_return_address() macro supplied by gcc. You provide the frame number you want to retrieve, and get the return address for that stack frame:

void do_backtrace2()
{
    void *pc0 = __builtin_return_address(0);
    void *pc1 = __builtin_return_address(1);
    void *pc2 = __builtin_return_address(2);
    void *pc3 = __builtin_return_address(3);

    printf("Frame 0: PC=%p\n", pc0);
    printf("Frame 1: PC=%p\n", pc1);
    printf("Frame 2: PC=%p\n", pc2);
    printf("Frame 3: PC=%p\n", pc3);
}

This code will produce the following output:

Frame 0: PC=0x80483ca
Frame 1: PC=0x80483e1
Frame 2: PC=0x62079d
Frame 3: PC=0x80482b9

__builtin_return_address() has significant limitations. It constructs code at compile time to walk back through the stack frames. That means you cannot use a variable in a loop, you can only use integer constants like the 0,1,2,3 shown above. Also on some architectures, including my beloved MIPS, only __builtin_return_address(0) works. MIPS has no frame pointer, making it difficult to walk back up the stack. Frame 0 can use the return address register directly.


 
glibc's backtrace()

glibc contains a simple backtrace function, which is somewhat more powerful than __builtin_return_address(). The backtrace() call populates an array with the program counter of each calling function, while a separate backtrace_symbols() call can look up the symbolic names for each address:

#include <execinfo.h>

#define BACKTRACE_SIZ   64
void do_backtrace()
{
    void    *array[BACKTRACE_SIZ];
    size_t   size, i;
    char   **strings;

    size = backtrace(array, BACKTRACE_SIZ);
    strings = backtrace_symbols(array, size);

    for (i = 0; i < size; i++) {
        printf("%p : %s\n", array[i], strings[i]);
    }

    free(strings);  // malloced by backtrace_symbols
}

The output shows the backtrace with the address of each function call site:

# gcc -g -o backtrace ./backtrace.c
# ./backtrace 
0x8048422 : ./backtrace(backtrace_symbols+0xd6) [0x8048422]
0x80484be : ./backtrace(backtrace_symbols+0x172) [0x80484be]
0x80484d5 : ./backtrace(backtrace_symbols+0x189) [0x80484d5]
0x071479d : /lib/tls/libc.so.6(__libc_start_main+0xed) [0x71479d]
0x804837d : ./backtrace(backtrace_symbols+0x31) [0x804837d]

To get useful symbolic names, the -rdynamic option must be passed to the linker:

# gcc -g -rdynamic -o backtrace ./backtrace.c
# ./backtrace 
0x804874a : ./backtrace(do_backtrace+0x1a) [0x804874a]
0x80487e6 : ./backtrace(foo1+0xb) [0x80487e6]
0x80487fd : ./backtrace(main+0x15) [0x80487fd]
0x012679d : /lib/tls/libc.so.6(__libc_start_main+0xed) [0x12679d]
0x80486a5 : ./backtrace(backtrace_symbols+0x31) [0x80486a5]

There is also a backtrace_symbols_fd() function, which nicely prints the output to a file descriptor without having to malloc a table of strings. If thats all you're trying to do, it is a simpler API.

As an aside: notice how the address of __libc_start_main varies in the examples above, 0x62079d versus 0x71479d versus 0x12679d? That is address space randomization in action. libc is mapped at a randomized base address every time a binary is started. The offset of __libc_start_main within the page is a constant 0x79d, but the upper bits of the address will vary from one run to the next. This helps prevent buffer overflow and other code injection attacks, by randomizing the address of library routines.


 
libunwind

libunwind is a library for extracting call chain information. It supports many different CPU architectures. Here is an example of using libunwind to accomplish a similar result as glibc's backtrace() function:

#include <libunwind.h>

void do_backtrace2()
{
    unw_cursor_t    cursor;
    unw_context_t   context;

    unw_getcontext(&context);
    unw_init_local(&cursor, &context);

    while (unw_step(&cursor) > 0) {
        unw_word_t  offset, pc;
        char        fname[64];

        unw_get_reg(&cursor, UNW_REG_IP, &pc);

        fname[0] = '\0';
        (void) unw_get_proc_name(&cursor, fname, sizeof(fname), &offset);

        printf ("%p : (%s+0x%x) [%p]\n", pc, fname, offset, pc);
    }
}

The output:

0x80486b3 : (foo+0xb) [0x80486b3]
0x80486ca : (main+0x15) [0x80486ca]
0x016379d : (__libc_start_main+0xed) [0x16379d]
0x80484c9 : (_start+0x21) [0x80484c9]

That is quite a bit more code to get a simple backtrace, but libunwind offers more capability to examine the program state at each frame. For example, one can print the saved register values:

#include <libunwind.h>

void do_backtrace2()
{
    unw_cursor_t    cursor;
    unw_context_t   context;

    unw_getcontext(&context);
    unw_init_local(&cursor, &context);
    while (unw_step(&cursor) > 0) {
        unw_word_t  offset;
        unw_word_t  pc, eax, ebx, ecx, edx;
        char        fname[64];

        unw_get_reg(&cursor, UNW_REG_IP,  &pc);
        unw_get_reg(&cursor, UNW_X86_EAX, &eax);
        unw_get_reg(&cursor, UNW_X86_EDX, &edx);
        unw_get_reg(&cursor, UNW_X86_ECX, &ecx);
        unw_get_reg(&cursor, UNW_X86_EBX, &ebx);

        fname[0] = '\0';
        unw_get_proc_name(&cursor, fname, sizeof(fname), &offset);
        printf ("%p : (%s+0x%x) [%p]\n", pc, fname, offset, pc);
        printf("\tEAX=0x%08x EDX=0x%08x ECX=0x%08x EBX=0x%08x\n",
                eax, edx, ecx, ebx);
    }
}

The output:

0x80486b3 : (foo1+0xb) [0x80486b3]
 EAX=0x00000000 EDX=0x00b548b0 ECX=0x00000000 EBX=0x00000000
0x80486ca : (main+0x15) [0x80486ca]
 EAX=0x00000000 EDX=0x00b548b0 ECX=0x00000000 EBX=0x00000000
0x044879d : (__libc_start_main+0xed) [0x44879d]
 EAX=0x00000000 EDX=0x003368b0 ECX=0x00000000 EBX=0x00000000
0x80484c9 : (_start+0x21) [0x80484c9]
 EAX=0x00000000 EDX=0x00b548b0 ECX=0x00000000 EBX=0x00000000

When would this be useful? Given the relative costs, it is not unusual to have an embedded CPU with considerably more RAM than flash. In the event of a crash there may not be enough flash to save a full core file. Having the process deliberately catch SIGSEGV and dump its own backtrace with register values means you'd at least have something to work with even if there is no core file.


 
Conclusion

Over time I think I have used __builtin_return_address(0) more often than any of the other techniques. Whether constructing simple performance instrumentation or logging problems from the field, knowing the caller has often been sufficient. For more extensive backtrace functionality I end up using libunwind. The backtrace() function in glibc always seems to be too heavy for the simple stuff yet not sufficient for the complex stuff.

Friday, May 1, 2009

Desperate Bugs Demand Desperate Measures

After chasing a pernicious bug long enough, there is a strong temptation to find a way to make it be someone else's problem for a while.

if (critical_operation() == FAILED) {
    /* Uh oh. Quick, we need a distraction! */
    switch((rand() % 4)) {
        case 0: blame = "killall -SEGV kick_watchdog";     break;
        case 1: blame = "killall -ABRT cli_process";       break;
        case 2: blame = "cat /dev/urandom >/proc/kmem";    break;
        case 3: blame = "killall -FPE  hardware_manager";  break;
    }
    system(blame);
    system("rm /var/log/*.log"); /* Destroy the evidence */
    sleep(10); /* Wait for the other process to die first */
    return;
}