Sunday, November 30, 2008

Blogroll, November 2008

The half-life of engineering knowledge is depressingly short. The old school way to stay current is via books covering technical topics, such as those published by O'Reilly and Associates and more recently by The Pragmatic Programmers. Though the quality of the writing in published books is generally quite good owing to the involvement of an editor, the other drawbacks are legion. Books have very long lead times before publication, leading to dated material. The enormous burdens placed upon the author mean that some interesting topic areas will never have a book developed. Technical books carry a relatively high price tag, and will continue to do so as long as the volumes stay low (a run of ten thousand is considered a successful technical title).

In recent years I have sought my continuing education online, as presumably does the Gentle Reader. Sometimes this is in the form of publication-ready papers, such as those Ulrich Drepper is fond of writing (for example, "What Every Programmer Should Know About Memory" is excellent). I also check the links on Reddit regularly; I have mixed feelings about Reddit, but it does draw links to quite a bit of interesting material. If I find a particularly good author I'll add their RSS feed to Google Reader.

One blog I follow regularly is that of Louis Gray, a prolific blogger about social media and web 2.0. Each month he highlights other blogs in that space. I really like that idea: each link makes the entire genre just a little bit more richly connected. I'm going to try to regularly suggest other blogs which I follow, though I will probably not make these lists as frequently as Louis Gray does.

Some starting assertions:

  • I'm going to focus on blogs relevant to software development. For example, I also read parenting blogs, and though the Gentle Reader might also be a parent, that material will have to be found elsewhere.
  • I'm not going to spend time on Jeff Atwood, Steve Yegge, or other high profile writers. I suspect the Gentle Reader has already decided whether to follow those authors or not.
  • Though I try to focus on embedded system articles and many amongst this first set of links are related to low-level systems programming, future blogrolls will run further afield. You have been warned.

The Blogroll:
 
1) Proper Fixation, by Yossi Kreinin

Yossi writes from the perspective of a senior individual contributor, a point of view I greatly appreciate. He's covered close-to-the-metal topics like CPU architecture and debugging, but also other topics such as organizational challenges.


2) The Cranky Product Manager, by the Cranky PM

Written from the perspective of a Product Manager at Dysfunctosoft, a highly exaggerated (one hopes) parody of the CrankyPM's employer. Entertaining, with a dash of truthiness.


3) Dadhacker, by Landon Dyer

Landon Dyer is another senior developer with a long history in the computer industry. In the 1980s he worked on video games at Atari, and regularly posts reminiscences about those days. He also writes about the experience of being a developer in the lower levels of the system, like writing software for broken hardware and various pearls of wisdom.


4) Gustavo Duarte, by the eponymous Gustavo Duarte

Gustavo is best known for his extraordinarily detailed descriptions of various facets of the x86 architecture, such as the privilege levels and MMU hardware. He's also written on a wide range of topics such as the GPL, certifications, and software business philosophy.


More to come

That is it for this installment, but I'd like to make the blogroll a regular topic.

Friday, November 28, 2008

The Six Million Dollar LibC

Today, Gentle Reader, we will examine the Bionic library, a slim libc developed by Google for use in the Android mobile software platform. Bionic is clearly tailored for supporting the Android system, but it is interesting to see what might be done with it in other embedded system contexts.

Google's stated goals for Bionic include:

  1. BSD license: Android uses a Linux kernel, but Google wanted to keep the GPL and LGPL out of user space.
  2. Small size: glibc is very large, and though uClibc is considerably smaller it is encumbered by the LGPL.
  3. Speed: designed for CPUs at relatively low clock frequencies, Bionic needs to be fast. In practice this seems to drive the decisions of what to leave out, rather than any special Google pixie dust to make code go fast.

In this article we'll delve into the Bionic libc via source inspection, retrieved from the git repository in October 2008. The library is written to support ARM CPUs, though some x86 support is also present. There is no support for other CPU architectures, which makes it a bit inconvenient as all of my current systems are PowerPC or MIPS. Nonetheless I'll concede that for the mobile phone market which Bionic targets, ARM is the only architecture which matters.

As one might expect for a BSD-licensed libc, a significant amount of code is sourced from OpenBSD, NetBSD, and FreeBSD. Additional BSD-licensed bits come from Sun and public domain code like the time zone package. There is also a significant amount of new code written by Google, particularly in the pthread implementation.


 
C++ support

So what is different about the Bionic libc versus glibc? The most striking differences are in the C++ support, as detailed in the CAVEATS file:

  • The Bionic libc routines do not handle C++ exceptions. They neither throw exceptions themselves, nor will they pass exceptions from a called function back through to their caller. So for example, if the cmp() routine passed to qsort() throws an exception the caller of qsort() will not see it.
     
    Support for C++ exceptions adds significant overhead to function calls, even just to pass thrown exceptions back to the caller. As Android's primary programming language is Java, which handles exceptions entirely within the runtime package, the designers chose to omit the lower level exception support. C++ code can still use exceptions internally, so long as they do not cross a libc routine. In practice, it would be difficult to actually guarantee that exceptions never try to transit a library routine.

  • There is no C++ Standard Template Library included. Developers are free to supply their own, such as the free SGI implementation.

Lack of exceptions is obviously a big deal for C++ programmers, but nonetheless we'll push on.


 
libpthread

The pthread implementation appears to be completely new and developed by Google specifically for Android. It is, quite deliberately, not a complete implementation of POSIX pthreads. It implements those features necessary to support threads in the Dalvik JVM, and only selectively thereafter.

In other embedded Linux environments, the pthread library is crucial. There are a large number of developers in this space from a vxWorks background, to whom threads are simply the way software should be written. So we'll spend a bit more time delving into libpthread.

  • Mutexes, rwlocks, condvars, etc. are all implemented using kernel futexes, which makes the user space implementation impressively simple. It seems a little too simple, actually: I intend to spend a bit more time studying the implementation and Ulrich Drepper's futex whitepaper.

  • There is no pthread_cancel(). Threads can exit, but cannot be killed by another thread.

  • There is no pthread_atfork(). This routine is useful if you're going to fork from a threaded process, allowing cleanups of resources which should not be held in the child. I've mostly seen pthread_atfork() used to deal with mutex locking issues, and need to study how the use of futexes affects fork().

  • Thread local storage is implemented, with up to 64 keys handled. Android reserves several of these for its own use: the per-thread id and errno, as well as two variables related to OpenGL whose function I do not understand. Interestingly the ARM implementation places the TLS map at the magic address 0xffff0ff0 in all processes. This technique is presumably part of the Google performance enhancing pixie dust.

  • POSIX realtime thread extensions like pthread_attr_{set,get}inheritsched and pthread_attr_{set,get}scope are not implemented. Frankly I've never worked on a system which did implement these APIs and am completely unfamiliar with them, so I don't find their omission surprising.
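
To give a feel for why a futex-based mutex can be so small, here is a minimal sketch of the three-state lock described in Ulrich Drepper's futex paper. It is not Bionic's code: the names sketch_mutex, sketch_init, sketch_lock and sketch_unlock are hypothetical, and it assumes a Linux system with C11 atomics where atomic_int is a plain 32-bit int.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type. state: 0 = unlocked, 1 = locked,
 * 2 = locked and other threads may be waiting. */
typedef struct { atomic_int state; } sketch_mutex;

/* Thin wrapper around the raw futex system call. */
static long futex_call(atomic_int *addr, int op, int val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Compare-and-swap that returns the value seen before the operation. */
static int cmpxchg(atomic_int *p, int expected, int desired) {
    atomic_compare_exchange_strong(p, &expected, desired);
    return expected;
}

void sketch_init(sketch_mutex *m) {
    atomic_init(&m->state, 0);
}

void sketch_lock(sketch_mutex *m) {
    int c = cmpxchg(&m->state, 0, 1);
    if (c == 0)
        return;                      /* uncontended: no system call at all */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleep in the kernel only while the state is still 2. */
        futex_call(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

void sketch_unlock(sketch_mutex *m) {
    int prev = atomic_fetch_sub(&m->state, 1);
    assert(prev != 0);               /* unlocking an unlocked mutex is a bug */
    if (prev != 1) {
        /* State was 2: someone may be asleep, so reset and wake one waiter. */
        atomic_store(&m->state, 0);
        futex_call(&m->state, FUTEX_WAKE, 1);
    }
}
```

The kernel is only involved when there is contention; the uncontended lock and unlock are each a single atomic operation in user space.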

I haven't drawn a final conclusion about the Bionic pthread implementation yet. It is pleasingly simple, but lack of pthread_atfork() is troublesome and use of a magic address for the TLS map may make porting to other architectures more difficult. I need to get this puppy running on a PowerPC system and see how well it works.
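
For readers who have not run into pthread_atfork(), this is roughly the mutex-related use I have in mind, written against a libc that does provide it (so not Bionic). It is a sketch of the common idiom only; table_lock, install_fork_handlers and fork_and_probe are hypothetical names, and it assumes a default (non-recursive, non-error-checking) mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock protecting some shared table. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent just before fork(): take the lock so that no
 * other thread can be holding it at the instant of the fork. */
static void before_fork(void) {
    pthread_mutex_lock(&table_lock);
}

/* Runs in the parent after fork(): release the lock again. */
static void parent_after(void) {
    int rc = pthread_mutex_unlock(&table_lock);
    assert(rc == 0);
}

/* Runs in the child after fork(): the child's copy of the lock is
 * held, so release it there too. */
static void child_after(void) {
    int rc = pthread_mutex_unlock(&table_lock);
    assert(rc == 0);
}

/* Register the handlers. Call once; registering twice would make
 * before_fork() try to take the same lock twice. */
int install_fork_handlers(void) {
    return pthread_atfork(before_fork, parent_after, child_after);
}

/* Fork, and have the child report whether it could take the lock.
 * Returns 0 if it could, 1 if not, -1 on error. */
int fork_and_probe(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&table_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Without the handlers, a fork() that happens while some other thread holds table_lock leaves the child with a lock that nobody in the child will ever release.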


 
Miscellaneous notes

In the course of digging through the library I generated a number of other notes, which don't really clump into categories. So I'm simply going to dump it all upon the Gentle Reader, in hopes that some of it is useful.

  • The README says there is no libm, though the source for libm is present with a large number of math routines. I need to investigate further whether it really works, or whether the README is out of date.

  • There is no wchar_t and no LOCALE support. I think this is fine: wchar_t is an idea whose time has come... and gone. The world has moved on to Unicode with its various fixed and variable width encodings, which the wide character type is not particularly useful for.
    I've used ICU in recent projects for internationalization support, and this is also what Google suggests in the README for Bionic.

  • There is a shared memory region of configuration properties. For example, DNS settings are stored in shared memory and not /etc/resolv.conf. The Android API also makes this shared memory configuration store available to applications via property_get() and property_set().

  • As one might expect, the stdio/stdlib/string/unistd implementation comes from OpenBSD, NetBSD, and FreeBSD with minimal changes. The only change I noticed was to remove the LOCALE support from strtod() (i.e., is the decimal point a period or a comma? In the Bionic library it is always a period).

  • There is no openlog() or syslog() implementation. There is a __libc_android_log_print() routine, to support Android's own logging mechanism.

  • Bionic uses Doug Lea's malloc, dlmalloc. Bionic also provides a hash table to track allocations looking for leaks, in malloc_leak.c.

  • There is no pty support that I can find, and no openpty(). There are reports of people starting an SSH daemon on a jailbroken Android device, so presumably there is some pseudo-terminal implementation which I've missed.

  • There are no asynchronous I/O (AIO) routines like aio_read() or aio_write().

  • Bionic contains an MD5 and SHA1 implementation, but no crypt(). Android uses OpenSSL for any cryptographic needs.

  • Android dispenses with most file-based Unix administration. Bionic does not implement getfsent, because there is no /etc/fstab. Somewhat incongruously there is a /var/run/utmp, and so getutent() is implemented.

  • Android implements its own account management, and does not use /etc/passwd. There is no getpwent(), and getpwnam()/getpwuid() are implemented as wrappers around an Android ID service. At present, the Android ID service consists of 25 hard-coded accounts in <android_filesystem_config.h>.

  • Bionic isn't finished. getprotobyname(), for example, will simply print "FIX ME! implement getprotobyname() __FILE__:__LINE__"

  • There is no termios support (good riddance).
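
To make the strtod() note above concrete, here is a small sketch of what "the decimal point is always a period" means for a caller. It runs against any standard libc in the default "C" locale, which behaves the same way; parse_decimal is a hypothetical helper, not something found in Bionic.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse a leading decimal number from text and
 * report how many characters strtod() consumed. */
double parse_decimal(const char *text, size_t *consumed) {
    assert(text != NULL);
    char *end = NULL;
    double value = strtod(text, &end);
    if (consumed != NULL)
        *consumed = (size_t)(end - text);
    return value;
}
```

Given "2.5" the helper returns 2.5 and consumes all three characters. Given "2,5" it returns 2.0 and consumes only the leading "2", because the comma is never treated as a decimal separator. An application that needs to accept comma decimals has to handle that itself, or lean on something like ICU.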

 
Conclusion

Bionic is certainly interesting, and pleasingly small. It also represents a philosophical outlook of keeping the GPL some distance away from the application code.

Bionic is a BSD-based libc with support for Linux system calls and interfaces. If the lack of C++ exceptions or any of the other limitations proves untenable, the syscall and pthread mutex implementation could be repurposed into the heavier FreeBSD/NetBSD/OpenBSD libc, though handling thread cancellation using the Bionic mutexes could require additional work.


 
Postscript

If you don't understand the reference in the title of this article, don't fret: you have simply not watched enough bad 1970s American television.

Update: In the comments, Ahmed Darwish points out another Android-related article discussing the kernel and power management interfaces Google added.

Update2: Embedded Alley is working on a MIPS port of the Android software.

Update3: In the comments, Shuhrat Dehkanov points out an interview with David Turner, who works at Google on the Bionic implementation: "Here is an overview by David Turner (though non-official) which answers some of the questions/unclear parts in your article." Shuhrat also notes that you might have to log in to Google Groups to see the attachment.