Tuesday, June 30, 2009

Post-mortem debugging: core files

When one has a core file, one runs gdb. Its simply the way things were Meant To Be, right? Yet gdb isn't the right tool for the job in all cases. If you're dealing with a corrupted heap, gdb is not very helpful. You can see the portion of the heap which caused the process to fault (most likely in malloc or free), but identifying the junk in memory is an exercise in puzzling it out and looking for patterns. It is often useful in such cases to search the rest of the process address space for pointers into the corrupted area of the heap. Though gdb can be used to search for patterns in memory, it isn't very good at it. For example consider the following macro:

define searchmem
    set $start = (char *) $arg0
    set $end = (char *) $arg1
    set $pattern = (unsigned int) $arg2
    set $p = $start
    while $p < $end
        if (*(unsigned int *) $p) == $pattern
            printf "pattern 0x%x found at 0x%x\n", $pattern, $p
        end
        set $p++
    end
end

document searchmem
    search between $argv0 and $argv1 for pattern $argv2
end

This macro can look only for 32 bit numbers, not any sort of regular expression, and it is very, very slow. We'd really like to run grep, but if gdb provides a way to run grep over the core contents I haven't found it. Instead, Gentle Reader, we'll write a utility to output the core file to text so that grep or any other Unix tool can be used. This would be a job for od or hexdump except for two things:

  1. We'd like to see the addresses of the data being dumped.
  2. The core might be in the wrong endianness, such as a MIPS-BE core file on an x86 host.

Instead of using a generic binary dump like od we'll construct a tool specifically for core files, but first we need to understand their contents.


 
Process Address Space Linux process address space

Linux and modern Unix-ish operating systems dynamically map shared libraries into the process address space. These libraries are not packed tightly up against one another. For alignment and page protection reasons, there are gaps between them.

Each library generally consists of multiple segments:

  • instructions, called the TEXT segment.
  • uninitialized data to be zero filled, called BSS
  • initialized data (i.e. variables initialized to a non-zero value), called DATA

You can see the memory segments for a running process in /proc/<pid>/maps. Here is an example:

# cat /proc/1261/maps
0fc73000-0fc80000 r-xp 00000000 01:00 713        /lib/libA.so.1
0fc80000-0fc83000 ---p 0000d000 01:00 713        /lib/libA.so.1
0fc83000-0fc92000 rwxp 00000000 01:00 713        /lib/libA.so.1
...deleted...
10000000-1000c000 r-xp 00000000 01:00 1190       /bin/myapp
1001b000-1001c000 rwxp 0000b000 01:00 1190       /bin/myapp
...deleted...

libA's callable functions are in the TEXT segment, mapped at addresses 0x0fc73000 through 0x0fc80000. Note the permission bits on these pages: read-only plus executable. Write permission is denied, making it safe to share the same physical pages of RAM amongst multiple processes.

The libA BSS segment extends from 0x0fc80000 through 0x0fc83000. These pages are mapped with no permissions at all, even a read access will trigger a page fault. The kernel will supply a zero-filled page on the first fault, and mark its permissions as read+write.

The libA DATA segment is at 0x0fc83000 though 0x0fc92000. This region is populated with the initialized data, so it has read+write permissions already. No page fault will be triggered on access to these pages.

When the process dies, the core file needs to preserve these scattered memory areas. Core files from a Linux process are written in ELF format, which I will stubbornly continue to refer to as the Extensible Linking Format. ELF defines data sections with associated virtual addresses, allowing it to describe data scattered across a process address space. Let's examine the output of "readelf -l" on the core from this process:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz   Flg Align
  NOTE           0x0003f4 0x00000000 0x00000000 0x006f4 0x00000      0
  LOAD           0x001000 0x0fc73000 0x00000000 0x00000 0x0d000  R E 0x1000
  LOAD           0x001000 0x0fc80000 0x00000000 0x00000 0x03000      0x1000
  LOAD           0x001000 0x0fc83000 0x00000000 0x0f000 0x0f000  RWE 0x1000
  ..etc...

The NOTE section contains various global information about the dead process, including its name and the contents of the CPU registers at the time it died. If you use "objdump -h" to list the core sections you'll see two additional sections: reg0/2 and reg0. These sections don't actually exist in the file, objdump breaks out the register contents from the NOTE section into their own pseudo-sections.

The LOAD sections contain the data from the process address space. The first one starts at a virtual address of 0xfc73000. That is the TEXT segment for libA, which we looked at earlier. Note the FileSiz is 0: to save space, the kernel skips dumping the contents of non-writable pages. gdb would fetch these sections from the executable file instead. The second LOAD section contains the libA BSS, and similarly its FileSiz is zero: this process died before it ever referenced anything in the BSS of libA. None of the pages had been faulted in and therefore none were writable.

The third LOAD section is the interesting one. This is the initialized DATA section. Because these pages were writable, they were saved to the core file and so the FileSiz is 0x0f000 (which matches the size of the segment from the /proc/<pid>maps file, above).


 
BFD

libbfd is one of several libraries available for working with ELF files. We'll use libbfd to process the core file, dumping it as hex words to a text file.

According to the history I can find, libbfd was created at Cygnus Support to deal with the myriad binary formats that sprang up in the 1980s and 90s. In response to how difficult it would be to encapsulate the various formats in a single library David Henkel-Wallace reportedly responded "big f---ing deal," and thus libbfd was christened. The name has since been clarified as "binary file descriptor."


    #include <bfd.h>

    bfd        *abfd;
    asection   *sect;
    char       *corefilename;
    enum bfd_endian endian;

    if ((abfd = bfd_openr(corefilename, NULL)) == NULL) {
        /* ... error handling ... */
    }

First we open the file using bfd_openr(). *abfd is the handle used to access the file, all other libbfd APIs take it as an argument.


    if (!bfd_check_format (abfd, bfd_core)) {
        printf("%s does not appear to be a core file.\n", corefilename);
        /* ... error handling ... */
    }

We check that the file is actually a core and not some other type of file. bfd_core is part of an enumerated type; other types which libbfd can check for include bfd_object for ELF programs or .o files, and bfd_archive for ar-style arcives.


    endian = abfd->xvec->byteorder;

libbfd handles the endianness of fields in the program and section headers, returning the result in the CPU's native byte order regardless of the endianness of the ELF file. It doesn't do anything about the data within those sections. Since we're going to dump the data in the core file, we need to know the endianness.


    for (sect = abfd->sections; sect != NULL; sect = sect->next) {
        bfd_vma        vma  = bfd_section_vma (abfd, sect);
        bfd_size_type  size = bfd_section_size(abfd, sect);
        const char    *name = bfd_section_name(abfd, sect);

We loop over each ELF section, pulling out the virtual address and size of the data it holds. The size will be zero in many cases, as shown above for the libA TEXT and BSS sections.


        if (!bfd_get_section_contents(abfd, sect, buf, (file_ptr)0, size)) {
            fprintf(stderr, "Could not read section %s\n", name);
            exit(5);
        }

We fetch the contents of each section, ready to print them to hex.


With a modicum of string manipulation we have a hex dump of the core contents:

0fc84710:  615f6669  6e616c69  7a65005f  4a765f52    a_finalize._Jv_R
0fc84720:  65676973  74657243  6c617373  6573006c    egisterClasses.l
0fc84730:  6962632e  736f2e36  0061646c  65723332    ibc.so.6.adler32
0fc84740:  00636f6d  70726573  73320064  65666c61    .compress2.defla

 
libbfd licensing

libbfd is GPL. It is not LGPL, which allows dynamic linking to proprietary code, but the full GPL. Any use of libbfd encumbers the rest of the software linked to it with the GPL and obligates you to provide the source code to anyone to whom you provide a binary. Its important to consider what this means: if you're developing tools for internal use by developers, the GPL is not onerous. Indeed, the source code of the utility will likely be checked into the version control system that all developers can access anyway.


 
More later...

If you for need to incorporate ELF file support into a proprietary software tool, then libbfd is not useable. I intended to provide the same core2hex example using FreeBSD's libelf, but this article is already quite long so I'm going to defer it until next time.