Monday, September 28, 2009

49.710269618056 Days

Western Digital recently corrected a firmware issue in certain models of VelociRaptor where the drive would erroneously report an error to the host after 49 days of operation. Somewhat inconveniently for RAID arrays, if all drives powered on at the same time they would all report an error at the same time.

Informed speculation: the drive reports an error after exactly 49 days, 17 hours, 2 minutes, 47 seconds, and 295 milliseconds of operation. That is the moment when a millisecond timer reaches 2^32 - 1, the largest value an unsigned 32 bit integer can hold; one tick later it overflows back to zero.
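
The arithmetic behind that speculation can be checked with a short C sketch. Nothing here comes from Western Digital's firmware: split_millis(), next_tick() and elapsed_ms() are hypothetical names, and the assumption is simply a free-running unsigned 32 bit counter incremented once per millisecond.

```c
#include <stdint.h>

/* Uptime broken into human-readable fields. */
struct uptime {
    uint32_t days, hours, minutes, seconds, millis;
};

/* Split a millisecond count into days/hours/minutes/seconds/ms. */
struct uptime split_millis(uint64_t ms)
{
    struct uptime u;
    u.millis  = (uint32_t)(ms % 1000);  ms /= 1000;
    u.seconds = (uint32_t)(ms % 60);    ms /= 60;
    u.minutes = (uint32_t)(ms % 60);    ms /= 60;
    u.hours   = (uint32_t)(ms % 24);    ms /= 24;
    u.days    = (uint32_t)ms;
    return u;
}

/* Hypothetical tick handler. Unsigned arithmetic wraps modulo 2^32,
 * so one tick after UINT32_MAX the counter reads 0 again. */
uint32_t next_tick(uint32_t now_ms)
{
    return (uint32_t)(now_ms + 1u);
}

/* Wrap-safe elapsed time: the modular subtraction stays correct
 * across a single wrap of the counter. */
uint32_t elapsed_ms(uint32_t start_ms, uint32_t now_ms)
{
    return (uint32_t)(now_ms - start_ms);
}
```

Calling split_millis(UINT32_MAX) yields the 49 days, 17 hours, 2 minutes, 47 seconds and 295 milliseconds quoted above. Code which compares raw timestamps rather than using a modular difference like elapsed_ms() is the sort of thing that misbehaves at the wrap, though whether that is what happened inside the drive is a guess.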

WD VelociRaptor

Tuesday, September 22, 2009

A Pudgier Tux

At LinuxCon 2009 a discussion arose about the Linux kernel becoming gradually slower with each new release. "Yes, it's a problem," said Linus Torvalds. "The kernel is huge and bloated, and our icache footprint is scary. I mean, there is no question about that. And whenever we add a new feature, it only gets worse."

I think the addition of new features is a red herring, and the real problem is in letting Tux eat the herring. Just hide the jars, maybe get a treadmill, and everything will go back to the way it was.

Pickled Herring

Story originally noted in The Register.

Thursday, September 17, 2009

Jasper Forest x86

Intel has a long but uneven history in the embedded market. In the early days of the personal computer Intel released the 80286 as a follow-on to the original 8086. There actually was an 80186: it was a more integrated version of the 8086 aimed at embedded applications. Intel's interest in embedded markets has waxed and waned over the years, but it is an area where Intel still has room for significant growth.

I wrote about x86 for embedded use about a year and a half ago, with four main points:

  • Volume Discounts
    PC pricing thresholds at 50,000 units have to be rethought for a less homogeneous market
  • System on Chip (SoC)
    Board space is at a premium; we need fewer components in the system
  • Production lifetime
    These systems are not redesigned every few months; chips have to remain in production longer
  • Power and heat
    Airflow is more constrained, and the system has other heat generating components besides the CPU complex
Nehalem vs Jasper Forest

At the Intel Developer Forum next week Intel is expected to focus on embedded applications for its products. In advance of IDF Intel announced the Jasper Forest CPU, a System on Chip version of Nehalem. It is based on a 1, 2, or 4 core CPU plus an integrated PCI-e controller, so it does not need a separate northbridge chip. Intel also committed to a 7 year production lifetime, allowing the part to be designed into products which will remain on the market for a while. I'd speculate that Intel will offer industrial temperature grade parts as well, perhaps at lower frequencies.

Jasper Forest is particularly suited for and aimed at storage applications. It has additional hardware for RAID support (presumably XOR & ECC generation), and a feature to use main memory as a nonvolatile buffer cache. When loss of power is detected the chip will flush any pending writes out to RAM and then set the DRAM to self-refresh before shutting down. By including a battery sufficient to power the DRAM, the system can avoid the need for a separate nonvolatile data buffer like SRAM.

This is a good approach for Intel: target silicon at specific high margin, growing application areas. Go for markets with moderate power consumption requirements, as x86 is clearly not ready for small battery powered applications like phones. Ars Technica discusses Intel's upcoming weapon for getting into mobile and other battery powered markets, a version of their 32nm process which reduces leakage current to almost nothing. An idle x86 would consume essentially no power, which would be huge.

Tuesday, September 15, 2009

Soft Errors Are Hard Problems

"Soft Error" is a euphemism in the semiconductor industry for "the silicon did the wrong thing." Soft errors can occur when a circuit is infused with a sudden burst of energy from an external source, for example when it is hit by a high energy subatomic particle or by radiation.

Alpha particle strike - two protons plus two neutrons, emitted when a heavy radioactive element decays into a lighter element. Alpha particles are so large that the chip packaging will normally block them, so they are only a problem when something inside the package undergoes radioactive decay.

Cosmic ray strike - a high energy neutron (or other particle) emitted by the Sun. These particles are gradually absorbed by the Earth's atmosphere, so they are more of a problem in orbit and at high altitude. Cosmic rays can directly impact the silicon, or can hit a nearby atom and throw off neutrons which in turn cause a soft error.

Beta particle strike - an electron, emitted when a neutron is converted into a proton + electron + antineutrino. Beta particles rarely hold enough energy to affect current silicon technology; alpha and cosmic ray strikes are more of a problem.

Soft errors are usually discussed in the context of DRAM, where the problem was initially noticed. DRAM consists of a capacitor to store the bit, with a transistor to keep the capacitor charge stable. A capacitor is an energy storage circuit: it stores charge. A particle strike on the capacitor will impart a large amount of energy, which can spontaneously change it to a 1 or, more rarely, overload its capacity such that the energy quickly escapes into the substrate and leaves the bit as a 0.

Some quick searching will turn up a few facts about soft errors:

  • Soft errors were first noted in the 1970s.
  • The primary cause was the use of slightly radioactive isotopes in the chip packaging, such as lead (Pb-212).
  • Materials in chip packaging are now carefully screened to substantially eliminate radioactivity.
  • Soft errors are now very rare and mostly caused by cosmic rays.

We'll come back to these later.


 
Beyond DRAM

Soft errors are not confined to DRAM alone. Any circuit will glitch if hit by a sufficiently energetic particle - whether DRAM, SRAM, or a logic element. DRAM began to be affected by soft errors when the energy stored in the capacitor shrank to be on the order of the energy induced by an alpha particle. SRAM was not initially affected because its cells are actively driven via 6 transistors, whose energy level is considerably higher. Nonetheless, as silicon feature sizes have shrunk it is now quite possible for SRAM to suffer a soft error. Soft errors in logic elements are somewhat less noticeable in that they will correct themselves on the next clock cycle, while an error in a storage element will persist until it is rewritten.

Modern CPUs include a great deal of SRAM on the die, comprising the caches, TLBs, reorder buffers, and numerous other uses. The image shown here is Intel's quad core Nehalem die, with the SRAM areas highlighted and logic deemphasized (both based on my best guesses). SRAM is a significant fraction of the die. Many, many other ASIC and CPU designs contain similar or higher fractions of SRAM.

What impact can a bitflip in the SRAM have? Consider that there is just one bit difference between the following two instruction opcodes. A bitflip can make the software come up with results which should be impossible.

ADD R1,1
ADD R1,32769
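
To check the claim about the two opcodes, here is a minimal C sketch. It assumes the immediate operand sits in a 16 bit field, which is an illustration rather than any particular instruction set; bit_difference() and flip_bit() are hypothetical helper names.

```c
#include <stdint.h>

/* Count the bits that differ between two 16-bit immediates. */
unsigned bit_difference(uint16_t a, uint16_t b)
{
    unsigned diff = (unsigned)(a ^ b);
    unsigned count = 0;

    while (diff != 0) {
        diff &= diff - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Flip one bit (0 = least significant) in a 16-bit value,
 * the way a particle strike on an SRAM cell would. */
uint16_t flip_bit(uint16_t value, unsigned bit)
{
    return (uint16_t)(value ^ (1u << bit));
}
```

1 is 0x0001 and 32769 is 0x8001, so bit_difference(1, 32769) is 1 and flip_bit(1, 15) turns the first immediate into the second.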

Even more bizarrely a bitflip could change some random instruction into a memory reference, such as a load or store. As the register being dereferenced would likely not contain a valid pointer, the process would segfault for inexplicable reasons.

To prevent this problem Intel and all major CPU vendors protect their caches and other on-chip memories using Error Correcting Codes, but many ASIC designs do not. They might implement parity, but on-chip ASIC memories commonly have no error checking at all. A soft error will simply corrupt whatever was in the SRAM, which will only be noticed if it causes the ASIC to misbehave in some perceptible way.
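
The gap between parity and full error correction can be sketched in a few lines of C. This is an illustration only: struct cell, cell_write() and cell_check() are hypothetical names, and a real memory computes parity in hardware rather than in software.

```c
#include <stdint.h>

/* Parity of a 32-bit word: 1 if the number of set bits is odd. */
unsigned parity32(uint32_t w)
{
    w ^= w >> 16;
    w ^= w >> 8;
    w ^= w >> 4;
    w ^= w >> 2;
    w ^= w >> 1;
    return (unsigned)(w & 1u);
}

/* Hypothetical memory cell: the data plus the parity bit
 * computed when the data was written. */
struct cell {
    uint32_t data;
    unsigned parity;
};

struct cell cell_write(uint32_t data)
{
    struct cell c;
    c.data = data;
    c.parity = parity32(data);
    return c;
}

/* Returns 1 if the data still matches its stored parity bit,
 * 0 if an error is detected. Parity cannot say which bit flipped,
 * so the error can be detected but not corrected. */
int cell_check(const struct cell *c)
{
    return parity32(c->data) == c->parity;
}
```

A single flipped bit makes cell_check() return 0. Two flipped bits cancel out and go unnoticed, which is one reason CPU vendors pay for ECC instead.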

What happens if a particle strike causes the hardware to misbehave, or to get the wrong answer? Usually, we blame the software. It must be some weird bug.


 
Back To The Future

Returning to the earlier list of common facts about soft errors, let's focus on the last two.

  • Materials in chip packaging are now carefully screened to substantially eliminate radioactivity.
  • Soft errors are now very rare and mostly caused by cosmic rays.

There are many different materials used in a finished chip, beyond the silicon die and the gold wires connecting to its pins. The chip package is plastic or ceramic, which is composed of a host of different elements including boron. Solder bonds the wires to the pins. Solder used to be mostly lead, later tin, and might now be a polymer. There are heat spreading compounds and shock absorption goo, which are often organic polymers. Some of these materials are naturally slightly radioactive. For example, in nature lead (Pb-208) contains traces of polonium (Po-212) which will emit alpha particles and decay into Pb-208. Similarly boron-10 is more prone to fission than boron-11 - or so they tell me, I've no idea why.

Modern chip packaging uses refined versions of these materials, to reduce the level of undesirable isotopes and leave purified inert material behind. This is an expensive process. Each new generation of silicon imposes more stringent requirements, making the materials even more costly. It is crucial that the correct materials be used... and this is where human error can creep in.

Using the wrong packaging materials increases alpha emissions to the point where the silicon will experience an unacceptable soft error rate. Yet to control costs it is also important to not overshoot the alpha emission requirements by using a more expensive material than necessary. So each chip design may use a different mix of materials depending on its process technology, die size, and the amount of SRAM it contains. The manufacturer will have checks in place to ensure the correct materials are used, but mistakes can occur. Sometimes a batch of chips is produced which suffers an unusually high soft error rate. When this happens the manufacturer is generally loath to admit it, and will quietly replace the chips with a corrected batch. One recent case where the problem was too widespread to cover up was with the cache of the UltraSPARC II CPU from Sun Microsystems; see point #5 of an Actel FAQ on the topic for more details.


 
The Moral of the Story

If you are working on low level software for a chip and run into bizarre errors, you should suspect a software problem first. Soft errors really are rare, and the manufacturing screwups described above are very uncommon. If the problem is repeatable, even if it has only happened twice, it is not a soft error. Particle strikes are too random for that. However, if you keep running into different symptoms where you think "but that is impossible..." you should consider the possibility that the problem really is impossible and was introduced by a bitflip. You should start checking whether the problems are confined to a particular batch of parts, or only produced in a certain range of dates. If so, it could be that batch of parts has a problem with the packaging materials.


 
Other resources

While researching this post I came across a few additional sources of information which I found fascinating, but did not have a good place to link to them. They are presented here for your edification and bemusement.

  • An analysis of the high soft error rate of the ASC Q supercomputer at LANL. These errors were cosmic ray induced, not due to a badly packaged batch of chips. The part of the system in question did not implement ECC to correct errors, only parity to detect them and crash.
  • Cypress Semiconductor published a book to explain soft errors to their customers.
  • Fujitsu has a simulator to predict soft error rates, called NISES. The linked PDF is mostly in Japanese, but the images are fascinating and very illustrative.

This post was many years in the making.

Friday, September 11, 2009

Motorola Mobile Meanderings

On September 10th Motorola announced the CLIQ, its first Android phone and a product which a good friend of mine has labored over for quite some time.



For many years I have used my trusty Sony-Ericsson T616, but I admit it might be getting a bit dated. I decided to compare the two devices to see whether it's time for me to take up a new phone.

                                     T616   CLIQ
Cracked LCD screen?                   Y      N
Broken down arrow key?                Y      N
Wallpaper picture of my Daughter?     Y      N

As you can see, clearly I cannot replace my T616 until Motorola rectifies these glaring omissions in the CLIQ. The cracked screen and non-functional arrow key could both be achieved via sufficiently rough handling. Frankly I don't know how Motorola would obtain pictures of my daughter to use as wallpaper, though I might be willing to accept this little flaw and add my own wallpaper later.

This chart shows that my T616 is the perfect phone, but I'm in a generous mood. The first person to offer me enough money for me to buy a CLIQ, takes it.

Tuesday, September 8, 2009

Infinite Arrays of Tweeples

Twitter is somewhat out of the range of topics I normally cover here, but I promise we'll come around to a software development angle by the end of this post.

When you follow someone on twitter, you appear in their followers list and they appear in your following list. New entries appear at the top, so the newest follower will be the first entry in your list. Recently I noticed an exception to this sort ordering, when someone who had been following a long time ago and later unfollowed decided to follow again. I received the notification email from twitter, yet he wasn't at the top of the list of followers. Instead he appeared much further down in the listing, off the first page. That is where he was when he followed me the first time, many months ago. This entry disappeared when he unfollowed, and when he re-followed he ended up back in that same place. Why would that be?


 
Possibility #1: Timestamp

The first possibility, and more likely the correct one, is that twitter tracks the timestamp of every new follow and chooses not to update it on a subsequent refollow. No matter how many times you have followed/unfollowed, you retain the timestamp of the very first time and will show up in the followers list at that position.
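
A minimal C sketch of this first possibility follows. Every name in it (struct follow, record_follow(), sort_followers()) is hypothetical; Twitter's actual schema is unknown to me. The assumption is that each follow relationship keeps the timestamp of the very first follow, and that the followers list is displayed newest first by that timestamp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct follow {
    unsigned follower_id;
    long first_followed;   /* seconds since epoch; 0 = never followed */
    bool active;
};

/* Record a follow. The timestamp is only set the first time. */
void record_follow(struct follow *f, long now)
{
    if (f->first_followed == 0)
        f->first_followed = now;
    f->active = true;
}

/* An unfollow hides the entry but keeps its original timestamp. */
void record_unfollow(struct follow *f)
{
    f->active = false;
}

/* qsort comparator: most recent first_followed sorts first. */
static int newest_first(const void *a, const void *b)
{
    const struct follow *fa = a;
    const struct follow *fb = b;

    if (fa->first_followed > fb->first_followed) return -1;
    if (fa->first_followed < fb->first_followed) return 1;
    return 0;
}

void sort_followers(struct follow *list, size_t n)
{
    qsort(list, n, sizeof(list[0]), newest_first);
}
```

With this scheme a follower who first followed at time 100, unfollowed, and followed again at time 300 still sorts below someone who followed at time 200.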

If an unfollow+refollow were sufficient to move you back up to the top of the list of followers, the bots would do it all the time to make it more likely you'd follow them back. Yet this is a boring root cause. So let's consider a second possibility which is more illustrative of software development.


 
Possibility #2: Array Deletion

Twitter operates at a scale where performance optimization is essential. If they are not cognizant of performance the wheels fly off and users start to see the Fail Whale. An area of particular importance is the list of followers, as the backend infrastructure has to traverse it for every tweet. It is possible that twitter implemented the followers list as an array in memory instead of a linked list, presumably to get better locality. The classic drawback of an array is deletion: you cannot delete an element from the middle without moving all subsequent elements into the hole thus created. To avoid this compaction a "deleted" or "active" bit is commonly kept for each element, allowing deleted entries to be left in place but skipped without processing.

When Scobleizer unfollowed everyone it would have resulted in holes in the followers list of 106,000 different accounts, entries with the deleted bit set.

Array with deleted bits

I suspect that Twitter does not immediately compact these arrays, so long as the ratio of holes/filled entries is tolerable. When Scobleizer decided to re-follow me the twitter backend located the earlier, deleted entry and flipped the bit back to active.

Array after refollowing

Thus the newly restored entry will re-appear in the followers list, but not as the top most entry. It will re-appear at its existing position within the array. This, or a similar implementation choice of retaining deleted entries in some way, could be why re-follows do not appear at the top of the list.
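
The array scheme described above can be sketched in C as follows. This is my guess at the general technique, not Twitter's code: struct follower_list and the list_* functions are hypothetical, and the array is kept tiny for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FOLLOWERS 8   /* tiny, for illustration only */

struct entry {
    unsigned id;
    bool active;          /* false = deleted, but the slot is kept */
};

struct follower_list {
    struct entry slots[MAX_FOLLOWERS];
    size_t used;          /* slots in use, including deleted ones */
};

/* Follow: revive an existing slot if there is one, else append.
 * Returns the slot index, or -1 if the array is full. */
int list_follow(struct follower_list *l, unsigned id)
{
    for (size_t i = 0; i < l->used; i++) {
        if (l->slots[i].id == id) {
            l->slots[i].active = true;   /* flip the bit back */
            return (int)i;
        }
    }
    if (l->used == MAX_FOLLOWERS)
        return -1;
    l->slots[l->used].id = id;
    l->slots[l->used].active = true;
    l->used++;
    return (int)(l->used - 1);
}

/* Unfollow: mark the slot deleted, with no compaction. */
void list_unfollow(struct follower_list *l, unsigned id)
{
    for (size_t i = 0; i < l->used; i++) {
        if (l->slots[i].id == id) {
            l->slots[i].active = false;
            return;
        }
    }
}

/* Copy active ids into out[], newest slot first. Returns the count. */
size_t list_display(const struct follower_list *l, unsigned *out, size_t max)
{
    size_t n = 0;

    for (size_t i = l->used; i > 0 && n < max; i--) {
        if (l->slots[i - 1].active)
            out[n++] = l->slots[i - 1].id;
    }
    return n;
}
```

If followers 10, 20 and 30 arrive in that order and 10 then unfollows and re-follows, list_follow() returns slot 0 again and list_display() shows 30, 20, 10. The re-follower reappears at the bottom rather than the top, which matches the behavior I observed.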


 
The Moral of the Story

Optimization is fine, and absolutely crucial to function at Twitter scale, but one must be careful when an optimization changes user-visible behavior. This is particularly true for social media, where we're explicitly conversing with other humans and ascribe human motivations to their actions. Twitter's handling of deleted and re-added follows can cause considerable consternation, because to the casual observer it appears the person followed but then immediately unfollowed. It can seem judgmental.

Of course, I am most likely completely wrong about Twitter's implementation using arrays. It wouldn't be the first time I've made a complete fool out of myself in a blog post. It's cathartic, in a way. Perhaps I'll do it more often.

Monday, August 24, 2009

Plummeting Down the Chasm

Crossing the Chasm is a seminal book in technology marketing, whose ideas quickly spread through the industry. Originally written in 1991 by Geoffrey Moore, it showed a new take on the technology adoption lifecycle. The lifecycle starts with tech enthusiasts willing to buy an immature product and runs through the majority buyers who make up the bulk of the market, finally trailing off when market saturation is reached. It had been commonly depicted as a bell curve:


Technology Adoption Life Cycle

Moore's key observation is that while the bell curve implies there is a smooth transition from early market to majority, in reality the buyers in the early market are fundamentally different from the majority that comes later. Technology companies who don't appreciate this gap will stumble and often fail once they saturate the small pool of early buyers. Moore referred to this gap as "the chasm."

Technology Adoption Life Cycle

Early adopters are visionaries. They are willing to look at an immature product and figure out how to use it in their operations to get a competitive advantage. They will sponsor integration work within their IT departments, and generate long lists of product feedback to better fit their needs. They are fundamentally different from the majority buyers in that they will look at an interesting product and figure out what problem it can solve. The majority market comes from the opposite direction with a problem to solve, looking for a solution. Product planning and marketing which works in the early part of a product lifecycle will fail utterly later in the game.


 
Yon Chasm Approacheth

It is quite possible to get stuck in the early market, continuously trying to meet the needs of early adopters and never enjoying the big sales of the majority market. Having spent the last several years in this predicament, I'll offer my version of the lifecycle chart. What it lacks in precision, it makes up for in snark.


Rope bridge with gaps

As a development engineer it can be difficult to tell how well the sales cycle is working as one rarely gets direct visibility, but the indirect evidence is plentiful.

  • If nearly every deal comes with a list of new product features to be implemented, you have not crossed the chasm.
  • If in every medium to large deal it is not clear whether the cost of getting the business is greater than the revenue it would bring, you have not crossed the chasm.
  • If every deal is "high touch," requiring multiple visits by a salesperson and sales engineer and possibly a consultation with the development team, you have not crossed the chasm.
  • If your product cannot be sold via a web site but instead always requires an evaluation period and report, you have not crossed the chasm.
  • If every customer is using a different subset of the product functionality, you have not crossed the chasm.

This is important: if the company does not realize that the real problem is in the approach to the market, all of these things will be blamed on the product. The reasoning will be that "if we just pound out a few more of these deals, we'll have finally implemented everything that everybody wants and sales will take off." There may even be an element of truth in this sentiment, if the product shipped early before its natural feature set was complete. However if the company has not crossed the chasm, the fundamental problem is elsewhere.

You are not getting a list of feature requirements because the product is incomplete, but because you are selling to the type of customer who generates lists of requirements.

You are getting that list of requirements because you are still selling into the visionary and early adopter segments of the market, the people who are willing to think about how best to integrate the product into their operations. If you were selling into the mainstream market there would be no list of requirements because the mainstream won't do an extensive integration on their own. The product either meets their needs or it doesn't, and you'll either get the sale or you won't. In the mainstream there will be no back and forth of what the product could do to win the business. At most, the mainstream buyer might tell you why you lost the business.

Don't get stuck wandering around in the chasm. Trust me, it sucks.

Friday, August 21, 2009

Smartbooks and Handheld PCs

Intel markets the Atom processor for netbooks. It trades lower processing power for very low power consumption, and is quite inexpensive in large quantities. These are product features where ARM/PowerPC/MIPS have long focussed, though they have aimed at non-PC form factor devices. Now, a new round of small laptops is hitting the market using non-x86 processors with either Windows CE or a Linux software stack like Google's Android or, eventually, Chrome OS. Most notably, Dell appears to be on the verge of introducing such a device - the true measure of whether a product category has entered the mainstream. These devices are generally called Smartbooks, owing to Qualcomm's extensive marketing push for its ARM Snapdragon chips in such a role.

What strikes me about these devices is that we've been down a similar road before. In the late 1990s there was a flurry of activity around HandHeld PCs, which were relatively small and inexpensive compared to the laptops of the day. They typically used MIPS processors, ran Windows CE, and supplied basic word processing and communications software. Handheld PCs didn't last long on the market, and Microsoft ceased work on the software after just two years. The devices simply were not useful enough when compared to a full laptop.

This is another example of how infrastructure and market can make more of a contribution to the success of a new product than its design. If a product is too early, it won't get enough traction to drag the rest of the environment along. In 2009 these devices can depend on a fast wireless infrastructure and plentiful cloud-hosted applications, which did not exist in 1999. The world has changed.

Tuesday, August 18, 2009

Virtual Machines And Manual Transmissions

Stick Shift Knob

I've chosen a manual transmission for every vehicle I've purchased. It is a personal preference: I like the feeling of control over the engine and the ability to trade speed for torque. Driving a stick shift was also an advantage in school: practically nobody knew how to drive it, so nobody could borrow my car.

For software development I code mostly in C, which is a rather thin layer on top of the machine. Even C++, while still considered a low-level language, nonetheless implements significantly more abstraction. Consider the following simple example of incrementing a variable in C and in C++:

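C:
int val = 0;

/* Increment the global. Nothing is returned, so the function is void. */
void incr(void) {
    val++;
}
C++:
class exi {
    int val;

    public:
        exi() { val = 0; }

        // Increment the member. Nothing is returned, so the method is void.
        void incr() {
            val++;
        }
};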
Note that in neither case is incr() declared to take an argument. We'll use one of my favorite techniques, disassembling the binary to see how it works. This time we're looking at PowerPC opcodes.
C:
_main:
 bl _incr              ; branch to incr()

_incr:
 mfspr r9,lr           ; address of val
 lwz r2,0xbc(r9)       ; fetch val
 addi r2,r2,0x1        ; increment
 stw r2,0xbc(r9)       ; store new val
 blr                   ; return
C++:
_main:
 ... much C++ object init removed...
 addi r3,r1,0x38       ; object addr in arg0
 bl __ZN3exi4incrEv    ; branch to exi::incr()

__ZN3exi4incrEv:
 lwz r2,0x0(r3)        ; fetch val from *arg0
 addi r2,r2,0x1        ; increment
 stw r2,0x0(r3)        ; store new val
 blr                   ; return

Though the C++ source code does not show an argument to exi::incr(), at the machine level there nonetheless is one. The object address is passed as the first argument. Passing the object is necessary for C++ to handle "this" object - it has to have a pointer to operate on.
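
Written out by hand in C, the hidden argument looks roughly like this. It is a sketch of what the compiler arranges on your behalf, not its actual output; exi_init() and exi_incr() are hypothetical names standing in for the constructor and the mangled exi::incr().

```c
/* The object is just a struct holding the member variable. */
struct exi {
    int val;
};

/* Stands in for the constructor exi::exi(). */
void exi_init(struct exi *self)
{
    self->val = 0;
}

/* Stands in for exi::incr(). The "this" pointer which C++ passes
 * implicitly is an ordinary first argument here, and would arrive
 * in r3 in the PowerPC calling convention shown above. */
void exi_incr(struct exi *self)
{
    self->val++;
}
```

A caller writes exi_incr(&obj) where the C++ source would say obj.incr(), and the machine code for the two is essentially the same.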

In low level languages you can generally see the relationship from source to the resulting machine code, even when significant compiler optimization is done. As we move to higher level languages, the abstractions between the source and machine code grow ever larger. C++ is somewhat higher level than C, and at the machine level the mapping from instructions back to source code is less clear. More abstract languages like Java, Python, and C# compile to bytecode for a virtual machine running on top of the real system. If one gathered an instruction trace of CPU execution, one would be hard-pressed to correlate these instructions back to the source code they implement.


 
Foreshadowing

One can see the day coming when manual transmissions will be unavailable in most car models. Continuously variable transmissions were an early indication of this trend, with an essentially infinite number of gearing ratios that could only be effectively controlled by an engine computer. Hybrid vehicles have a complex transmission which meshes the output of two motors, and again can only be controlled by computer. Future vehicles will likely have an entirely electric drivetrain, with no need for a conventional transmission at all. The simple fact is that the engine computer can do a far better job of optimizing the behavior of the drivetrain than I can.

I'm currently digging in to the low level aspects of virtual machines. Running compilation just-in-time as part of a virtual machine has several notable advantages over static compilation with gcc:

  • gcc's optimization improves if you compile with profiling, run the program, and then compile again. This is so annoying that it is hardly ever done. A virtual machine always has profiling data available, as it interprets the bytecodes for a while before running the JIT.
  • gcc's profile-guided optimization is done in advance, on a representative corpus of input data which the programmer supplies. If the program operates on inputs which differ substantially from this, its performance will not be optimal. The JIT optimization is always done with the real data as profiling input.
  • gcc can optimize for a specific CPU pipeline, such as Core2 vs NetBurst vs i486. One is trading off performance improvement on the favored CPU versus degradation on other CPUs. The JIT can know the specific type of CPU being used and can optimize accordingly.
  • gcc can do static constant propagation across subroutines. That is, if a constant NULL is passed to a function gcc can create a version of that function which will eliminate any unreachable code. The JIT can create optimal versions of functions tuned for specific arguments dynamically, whether they are constant or variable. It just has to validate that the arguments still match the expected values, and it is free to jump back to the interpreted bytecode on a mismatch.
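
The last point, specializing for observed arguments behind a guard, can be illustrated statically in C. This is only a sketch of the idea under simple assumptions: a real JIT emits the specialized machine code at run time, and scale_generic(), scale_assuming_2() and guard_failures are hypothetical names.

```c
/* Counts how often the specialized version had to bail out. */
int guard_failures = 0;

/* Generic path: stands in for the interpreter, handles any factor. */
int scale_generic(int x, int factor)
{
    return x * factor;
}

/* Specialized path, generated on the assumption that profiling
 * always saw factor == 2. The guard validates that assumption and
 * falls back to the generic path on a mismatch. */
int scale_assuming_2(int x, int factor)
{
    if (factor != 2) {
        guard_failures++;
        return scale_generic(x, factor);
    }
    return x + x;   /* the multiply has been optimized away */
}
```

scale_assuming_2(21, 2) takes the fast path, while scale_assuming_2(7, 3) fails the guard and still returns the right answer through the generic path. A JIT watching guard_failures climb could decide to throw the specialized version away and generate a new one.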

This should be fun. Some initial thoughts:

  • Modern CPUs have extensive branch prediction and speculative execution features, to keep them from spending all their time stalled waiting for the outcome of a branch decision. What happens when we have a lot more big loops with straight line code, where the JIT has optimized all the conditionals up to sanity checks at the entry to the function?
  • Does widespread use of JIT mean that VLIW architectures become more viable? VLIW is particularly dependent on the compiler to match the code to available hardware resources, which a JIT is better positioned to tackle.

Tuesday, August 11, 2009

blog.8.11.2009 < /dev/random


A while ago I wrote a FeedFlare for FriendFeed, my second Google App Engine project. Monday morning, friendfeed.com announced that it has been acquired by Facebook. Anybody want to buy a slightly used FeedFlare?




A year ago I wrote about the importance of publishing the GPLd source used in commercial products, to avoid the public relations nightmare that comes of being accused of a GPL violation. One of the examples in that article was the SFLC suing Extreme Networks. I'm happy to report that the suit was settled in October 2008. I'm happy to report this because I imported busybox into the source tree at Extreme, and I really didn't want to get dragged in for depositions. Extreme's engineering management was convinced that waiting until somebody asked for the source code would be fine.


Thursday, August 6, 2009

Toward a More Robust Twitter Infrastructure

This morning's twitter outage due to DDoS attack reminds me: I wrote a guest post for the most excellent Cranky Product Manager, concerning how to make twitter more robust. It ran on April 1, 2009.

The Cranky Engineer Responds to a PRD

Wednesday, July 29, 2009

Embedded Linux Market Share

Last week Wind River, now a subsidiary of Intel, announced that it had taken the lead in the share of embedded Linux revenue.

ALAMEDA, CA - July 22, 2009 - Wind River, a wholly owned subsidiary of Intel Corporation, today announced it has been named the embedded Linux market leader by VDC Research Group. Released today in VDC's 2009 Linux in the Embedded Systems Market report, Wind River achieved the market share lead in 2008 with greater than 30 percent of total market revenue, more than seven percentage points over the next closest competitor. Wind River entered the Linux business in 2004 to complement its market-leading, proprietary operating system, VxWorks.

The unnamed "next closest competitor" is MontaVista Software, but the real competition is not between the different embedded software vendors. The real competition is between any commercial Linux vendor versus rolling your own distribution from source. The various kernel distributions for PowerPC and MIPS are easy to download and cross-compile. Assembling a filesystem is not difficult, as discussed in an earlier article on this site. Add busybox and either glibc or uClibc, and you are most of the way to a bootable system.

There are a few areas where embedded Linux vendors provide value, and why I generally advocate purchasing support from them for a Linux development project:

  • The compiler toolchain: Maintaining a cross-compiling toolchain is quite a bit of work. Most importantly, one has to stay on top of CPU bugs. All CPUs ship with bugs, even x86, but a dirty little secret of the RISC SoC business is that they can go to market with more significant problems than Intel or AMD could get away with. So long as the issue can be worked around in the compiler or assembler - by not emitting the problematic sequence of instructions - the chip will ship anyway, with the problems rectified in later spins. The bugs will be documented in the Errata, but the descriptions are made to sound quite innocuous. CPU developers make sure that the major commercial embedded Linux vendors have the needed workarounds in their toolchain.
  • The kernel development tax: if you look at the version control logs for Linux/MIPS or Linux/PowerPC, the people doing the heavy lifting are often employed at one of the Linux vendors. Those companies have an economic reason to pay for that development. Unfortunately this leads to a variation of the Prisoner's Dilemma: somebody has to fund it. One can either pay for a support contract in order to fund Linux development, or not pay but hope that enough other people do.
  • Proprietary tools: The package management tools and customized Eclipse IDE supplied by these vendors are generally not useful to me, but some of their supplementary tools for profiling or shared library size reduction are quite interesting.

It is sometimes galling to be paying for support for embedded Linux, particularly because the technical support for specific problems has never actually resolved anything for me. Nonetheless I do advocate having at least a minimal contract, using the supplied compiler toolchain, and investigating the other tools they provide. It is worth spending some resources on.

(Original press release via Linux For Devices)

Monday, July 20, 2009

Microsoft Releases Linux Paravirtualization Driver Source

From the Microsoft press release:

REDMOND, Wash., July 20, 2009 - Today, in a break from the ordinary, Microsoft released 20,000 lines of device driver code to the Linux community. The code, which includes three Linux device drivers, has been submitted to the Linux kernel community for inclusion in the Linux tree. The drivers will be available to the Linux community and customers alike, and will enhance the performance of the Linux operating system when virtualized on Windows Server 2008 Hyper-V or Windows Server 2008 R2 Hyper-V.

Microsoft wants Hyper-V to compete with VMware in all markets, and to do this it needs to have good support for virtualizing Linux. Microsoft very pragmatically decided that closed source paravirtualization drivers for Linux had no chance of success. They'd get a press release out of such a move, but no significant adoption without Red Hat/Canonical/etc pulling the drivers in. Opening the source is in their best interests.

The last two posts have been an experiment of sorts. Prior posts had been written entirely from scratch on a technical topic. They take a long, long time to write. I wanted to try posting more frequently by adding a few thoughts to a relevant news item, but thus far I haven't been happy with the results. In this article I said opening the source would allow a Linux platform vendor to include the Microsoft paravirtualization drivers, but on further reflection it seems unlikely that they would actually do so. Red Hat has their own virtualization strategy which doesn't include Microsoft, and there is no reason to believe Canonical would be interested in being an enabler for sales of Microsoft Hyper-V.

Thursday, July 16, 2009

Courgette binary patch compression

Recently on the Chromium blog Google announced an improved binary patch compression algorithm called Courgette. In the example cited, Courgette produced a patch that was only 11% of the size of that produced by bsdiff. The design overview has more details on its operation:

Courgette uses a primitive disassembler to find the internal pointers. The disassembler splits the program into three parts: a list of the internal pointer's target addresses, all the other bytes, and an 'instruction' sequence that determines how the plain bytes and the pointers need to be interleaved and adjusted to get back the original input. We call this an 'assembly language' because we can run an 'assembler' to process the instructions and emit a sequence of bytes to recover the original file.

The non-pointer part is about 80% of the size of the original program, and because it does not have any pointers mixed in, it tends to be well behaved, having a diff size that is in line with the changes in the source code. Simply converting the program into the assembly language form makes the diff produced by bsdiff about 30% smaller.
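To see why separating out the pointers helps, here is a toy sketch in Python. It is not Courgette's actual algorithm or file format: the one-slot "instructions", the label names, and every function name below are made up for illustration. It only shows the effect the overview describes: inserting one instruction shifts every absolute jump target, but once targets are replaced by indices into a separate list, the remaining body changes by a single line.

```python
from collections import Counter

def assemble(instructions):
    """Lay out a toy program, one slot per instruction, resolving
    ("label", name) markers into absolute addresses for jumps."""
    addresses, addr = {}, 0
    for op, arg in instructions:
        if op == "label":
            addresses[arg] = addr      # a label names the next slot
        else:
            addr += 1
    binary = []
    for op, arg in instructions:
        if op == "label":
            continue
        binary.append(f"jmp {addresses[arg]}" if op == "jmp" else f"{op} {arg}")
    return binary

def split_pointers(binary):
    """Toy 'disassembler': pull jump targets out into their own list and
    leave index references (#0, #1, ...) behind in the body."""
    targets = sorted({int(l.split()[1]) for l in binary if l.startswith("jmp ")})
    index = {t: i for i, t in enumerate(targets)}
    body = [f"jmp #{index[int(l.split()[1])]}" if l.startswith("jmp ") else l
            for l in binary]
    return targets, body

def new_lines(old, new):
    """Crude diff size: lines present in `new` that `old` cannot supply."""
    return sum((Counter(new) - Counter(old)).values())

# Hypothetical program; the second version adds one instruction at the top.
OLD = [("mov", "r0"), ("jmp", "a"), ("jmp", "b"), ("add", "r1"),
       ("label", "a"), ("sub", "r2"), ("mul", "r3"), ("cmp", "r4"),
       ("label", "b"), ("ret", "0")]
NEW = [("nop", "0")] + OLD

old_bin, new_bin = assemble(OLD), assemble(NEW)
old_targets, old_body = split_pointers(old_bin)
new_targets, new_body = split_pointers(new_bin)

raw_changes = new_lines(old_bin, new_bin)
abstracted_changes = new_lines(old_body, new_body)
print("raw binary diff:  ", raw_changes)
print("pointer-free diff:", abstracted_changes)
print("targets:", old_targets, "->", new_targets)
```

With this toy input the raw binaries differ in three lines (the new instruction plus both shifted jumps), while the pointer-free bodies differ in one. The target list still changes, but every entry moves by the same amount, which is a far more regular thing to encode than changes scattered through the code.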

I haven't checked, but I suspect its disassembler supports x86 only. Chromium runs on Windows, MacOS X, and Linux, which all run primarily on x86 systems.

Courgette is of course aimed at updates to Google's Chrome browser, which is installed in very large numbers and frequently updated. Reducing the size of the updates results in a better user experience. Nifty.


 
Incremental Patching and Embedded Systems

When first posted, this article launched directly from Courgette into a discussion of incremental patching in embedded systems. In the comments Wayne Scott pointed out that this really wasn't fair: Courgette is purely a way to make binary patches smaller. In fact because Courgette requires the complete original binary in order to generate its diffs, it cannot be used to generate independent incremental patches at all. After clarifying my thinking, I've updated the post and added a bit of segue text.

Let us now turn to the more general subject of patching embedded systems. Whenever there is a problem in the field, there is a strong temptation to push out a fix as rapidly as possible. Whether called a point patch or a hotfix, the basic idea is to patch just the portion of the software causing issues for that customer. Larger, periodic maintenance releases collect all existing hotfixes (plus additional ongoing maintenance work) into a single release suitable for all deployments. For embedded systems, on general principles I don't favor the use of hotfixes. Though hotfixes reduce the bandwidth required for updates, I feel the disadvantages outweigh the advantages:

  1. Perhaps obviously, you need management software to apply, remove, and list installed patches.
  2. Tech support has a larger number of variations to deal with.
  3. Test load increases rather dramatically. If you have 5 independent patches you may need to test the combinations: up to 2^5=32 variations, not just 5.
  4. Frequent updates are not a good thing for most embedded systems. Customers want the gear to fade into the background and just work; making them update and reboot too often becomes a distinct negative.
  5. As described in an earlier article, I favor storing the boot images in a raw flash partition, not any sort of filesystem, which would make installation of an incremental patch more complex.

I recommend not trying to maintain the most recent maintenance release plus an ever-growing collection of hotfixes. I suggest instead revising the maintenance release whenever there is a customer problem. If other customers are not experiencing the problem then they need not deploy the new release right away. The main benefit is to avoid having 2^N possible combinations of patches in the field, instead having only N minor maintenance releases. A revised maintenance release also tends to be treated with more care than a simple hotfix is; rushing the process is rarely beneficial.
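The arithmetic is easy to check with a short Python sketch. The patch names are hypothetical, and the sketch assumes the worst case where every hotfix can be installed independently of the others:

```python
from itertools import combinations

def hotfix_configurations(patches):
    """Every subset of independent hotfixes a unit in the field might carry,
    including the empty set (the unpatched base release)."""
    return [set(combo)
            for size in range(len(patches) + 1)
            for combo in combinations(patches, size)]

def cumulative_releases(patches):
    """Each revised maintenance release carries every fix made so far,
    so N fixes yield N releases (the base release is not counted)."""
    return [set(patches[:count]) for count in range(1, len(patches) + 1)]

# Hypothetical fix names, in the order the problems were reported.
patches = ["fix1", "fix2", "fix3", "fix4", "fix5"]

print(len(hotfix_configurations(patches)))  # 32 = 2^5
print(len(cumulative_releases(patches)))    # 5
```

Every one of the five cumulative releases is also one of the 32 hotfix configurations; the point is that the other 27 never exist in the field and never need testing or support.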

Tuesday, July 14, 2009

DRY and the DMV

The Pragmatic Programmer is one of the best books available concerning the development of quality software. It is structured as a series of tips, with illustrative examples and the occasional horror story. One of the first tips is the DRY principle:

DRY - Don't Repeat Yourself
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

DRY is often misinterpreted to mean simply that code should not be duplicated, but it is somewhat more subtle: don't duplicate state. If you have multiple different places in the code which keep state about an aspect of the system, and all places have to have the same content at all times for the system to work properly, then you have made maintenance of the system harder than it needs to be. You'll have to debug cases where the representations fall out of sync, and all such places must be updated at the same time when code changes are made. The Pragmatic Programmers extend the DRY principle outside of the code itself to include database schemas, documentation, and build systems. Everything should have one authoritative source.

This brings us to the Department of Motor Vehicles, though the Gentle Reader might at first not see the connection. I received a form in the mail to renew my driver's license, which I promptly signed and sent back. The new license arrived in due course, and things were fine until a few weeks later when I noticed the address was incorrect. The old license is correct, the new one is wrong.

sample CA drivers license

I've no idea whether the address was correct on the renewal form, as I did not check it. Apparently I should have, but I didn't bother - it hadn't changed. At some stage of the renewal process, a single digit was altered in a subtle way.

Why was it even possible for the address to be changed in the renewal process? Here we can only speculate. The DMV does need a procedure to update an address as part of a license renewal, because sometimes people supply a new one on the form. I'll speculate that the DMV, either via OCR or manual typing, re-enters the address in all cases and not just if the form supplied a change. This procedure depends on the original address being faithfully reproduced in cases where it wasn't supposed to change. In my case, either due to an OCR glitch or a typing error, a digit changed, resulting in the new license being printed with an incorrect address.

I believe this is an example of the consequences of a violation of the DRY principle. The same state - my address - exists in two places: in the DMV database and on the form. Those two pieces of state are supposed to be the same, indeed must be the same for the process to work correctly, but errors can easily occur which allow their contents to get out of sync.

A corollary lesson in this situation: if state isn't supposed to change, don't change it. If the form does not indicate a change of address, the authoritative state is in the database and the form contents should be ignored.
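A minimal Python sketch of that corollary follows. Everything in it is hypothetical: the `RenewalForm` fields, the `renew` function, and the dictionary standing in for the DMV database are invented for illustration, and nothing here reflects how the real system works. The address echoed back on the form is treated as a non-authoritative copy and is never written to the database.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RenewalForm:
    license_id: str
    printed_address: str                # copy of the address echoed on the form
    new_address: Optional[str] = None   # set only when the driver reports a move

def renew(database: dict, form: RenewalForm) -> str:
    """Return the address to print on the renewed license.

    The database is the single authoritative source. It changes only when
    the form explicitly supplies a new address; printed_address is ignored,
    so an OCR or typing error in that copy cannot leak into the record.
    """
    if form.new_address is not None:
        database[form.license_id] = form.new_address
    return database[form.license_id]

# Hypothetical record, and a form whose echoed address was misread (3 -> 8).
db = {"D1234567": "1234 Main St"}
garbled = RenewalForm("D1234567", printed_address="1284 Main St")
print(renew(db, garbled))   # 1234 Main St

# A deliberate change of address still goes through.
moved = RenewalForm("D1234567", printed_address="1234 Main St",
                    new_address="99 Oak Ave")
print(renew(db, moved))     # 99 Oak Ave
```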


 
Aftereffects

I've already received a jury summons at the incorrect address, which the post office helpfully delivered to me anyway. Even after correcting the address I suspect I will receive a summons twice as often from now on. That will form the basis of a future blog post to illustrate the importance of duplicate suppression in databases, I suppose.

I can change my address back by submitting a form to the DMV, but issuing a new license with the correct address will be at my own cost. This is part of the price of modern life, I suppose.