Wednesday, September 22, 2010

GCC Function Instrumentation

One of gcc's more obscure features is -finstrument-functions. It was implemented by Cygnus Solutions, presumably as part of a contract for sombody-or-other to deliver something-or-other. When enabled, the compiler will emit calls to __cyg_profile_func_enter() and __cyg_profile_func_exit() at the top and bottom of every function.

Let's examine a simple example which prints the function addresses at entry and exit.

#include <stdio.h>

void __cyg_profile_func_enter(void *this_fn, void *call_site)
                              __attribute__((no_instrument_function));
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
  printf("ENTER: %p, from %p\n", this_fn, call_site);
} /* __cyg_profile_func_enter */

void __cyg_profile_func_exit(void *this_fn, void *call_site)
                             __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
  printf("EXIT:  %p, from %p\n", this_fn, call_site);
} /* __cyg_profile_func_enter */

int foo() {
  return 2;
}

int bar() {
  return 1;
}

int main(int argc, char** argv) {
  printf("foo=%d bar=%d\n", foo(), bar());
}

The __cyg_profile_func_enter and exit functions are passed two parameters: the address of the function being entered/exited, and the address from which it was called. Note the use of the no_instrument_function attribute. If not present, then __cyg_profile_func_enter would be instrumented like any other function. Every call would result in calling the instrumentation again, which results in another call, etc etc until it blows the stack. Previously I've used -finstrument-functions to construct a profiler for a CPU whose interrupt structure was not suitable for a sample-based profiler. All of the routines implementing the profiler were labelled no_instrument_function.

Next we'll examine the output, with just enough of the disassembled binary to make sense of it.

$ cc t.c -finstrument-functions
$ ./a.out
ENTER: 0x4005d0 @ 0x2b59e0d471c4 (calling main)
ENTER: 0x40059d @ 0x40060c       (calling foo)
EXIT:  0x40059d @ 0x40060c       (returning from foo)
ENTER: 0x40056a @ 0x400618       (calling bar)
EXIT:  0x40056a @ 0x400618       (returning from bar)
foo=2 bar=1
EXIT:  0x4005d0 @ 0x2b59e0d471c4 (returning from main)

000000000040056a <foo>:
  40056a: push   %rbp
  ...

000000000040059d <bar>:
  40059d: push   %rbp
  ...

00000000004005d0 <main>:
  4005d0: push   %rbp
  ...
  400607: callq  40059d <bar>
  40060c: mov    %eax,%ebx
  ...
  400613: callq  40056a <foo>
  400618: mov    %eax,%esi
  ...

There are a few interesting things to note in the output.

  1. Though main calls printf, we don't see a call to printf in the output. Function instrumentation is implemented during compilation, and we didn't compile printf we linked to an existing library. We'll only see the instrumentation for functions compiled with -finstrument-functions.
  2. The call_site is the instruction after the one which vectors over to run the function.
  3. The call_site which called main() looks strange. It is not in the TEXT segment, it is way up at some weird address. This is address space layout randomization in action, every run of this binary has a different address calling main. I don't know exactly what that routine is, but presumably it is part of the trampoline when the kernel begins executing a new process.

This instrumentation facility is not often used. The aforementioned call graph profiler is the only time I've used it. Nonetheless I hope you find it interesting.


Update 7/2011: In the comments Frank Denis notes that on OSX the functions are profile_func_enter() and profile_func_exit().