Tuesday, February 23, 2010

Thatsa Lottabytes

There is a Facebook fan page to establish hella- as the official SI prefix for 1024. This is good, but I don't think it goes far enough. I propose that lotta be designated to mean "several orders of magnitude larger than currently achievable."

In 2010 lotta would be defined as 1018.3, as having exabytes of storage seems comfortably out of reach. Each year it would be redefined, marching slowly to infinity (and beyond).

Graphic depiction of kilo, mega, giga, tera, peta, exa, zeta, yotta, hella, and lotta.
kilo 103
mega 106
giga 109
tera 1012
peta 1015
exa 1018
zetta 1021
yotta 1024
hella 1027 (proposed)
lotta > 1018 (my proposal)

Thursday, February 18, 2010

Email Contact Virii

The Matrix imageWe're all familiar with computer viruses: bits of code which attempt to transmit themselves to another system, from which they look for yet more systems to infect. The key point is that the virus must transmit code to infect another system.

Very occasionally there are more obscure examples of viral behavior, of data alone unaccompanied by code. Certainly, there is code involved, but it is the normal code of the system which performs a useful function and is not intended to spread data virally.

The contacts viral spreadFor example: people usually refer to me by the nickname Denny. My personal email address is denny@geekhold.com (1) but my work address is not "denny". One person had my personal email address. Shortly after I started he sent an email to a few team members. He typed "Denny," and GMail helpfully auto-completed my personal account from his contacts. When another member of the team responded to that email, the address was automatically added to his contacts. When he later sent an email to another set of coworkers... the Contact virus spread further.

The trouble with data-driven viral behavior is that there are few tools to stamp it out. With code viruses, a huge ecosystem of tools and malware signatures has been created. There are few tools to deal with an annoying bit of data spreading through the system, just manual exhortations to not respond to the email containing the external address.

Do people know of other instances of a viral spread of data, unaccompanied by code? There are certainly instances of poor auto-complete behavior, such as a mistyped URL leading to the browser forevermore suggesting the bogus site, but none I can think of which spread from one person to another. I suspect the root cause is a poor model for how auto-complete is supposed to work: the developer wants it to be completely automatic and not something the user should have to adjust.

(1): I long ago gave up keeping my email address off of spam lists. As I've used this address since 1996, I am wholly reliant on good filtering.

Tuesday, February 16, 2010

Resource Leak

Copper sewage pipe with corroded hole

Resource leak located after extensive debugging. This portion of the system required significant refactoring to resolve the leak.

Friday, February 5, 2010

Apple == A, Plus Four More Letters

What the web needs right now is another blog post about the iPad.

Apple A4 chipNo, don't run away! This will be different, I promise. We'll focus on Apple's A4, a custom CPU first used in the iPad. It has been widely assumed that A4 uses a licensed ARM Cortex A8 core. I have no reason to dispute this assertion, it seems like a fine choice. It has also been asserted that because the same core is used in parts from Samsung, TI, and Qualcomm, Apple should not have bothered making its own chip. Today, Gentle Reader, we'll explore that notion a bit.

Apple ASIC Expertise

Apple has a long history of ASIC design. Apple produced custom silicon for various Macintosh models since at least the late 1980s, when they designed the audio chips used in the Quadra 700 and 900 (a chip called "Batman"). Later, Apple designed entire chipsets to interface with the PowerPC 60x bus. Apple licensed a gigabit Ethernet MAC design from Sun, and used it plus IDE controller and other peripherals in chipsets for several Powermac models. With the switch to x86, Apple's efforts became much more constrained. The x86 bus interface is difficult to license, and Intel's own chipsets are quite reasonable. So far as I know, x86 Macintoshes no longer use custom Apple ASICs.

Custom chip design isn't a radical departure for Apple.

With that as background, what might Apple have done in the A4 chip? I have absolutely no inside information about the iPad or A4 processor, I'm going to make stuff up because its fun to speculate.

Graphics & OpenCL

PowerVR SGX CPU OverviewApple holds a nearly 10% stake in Imagination Technologies Group, which designs the PowerVR graphics accelerator and other IP relating to massively threaded processing. Apple uses their PowerVR SGX 535 in iPhone 3GS, and used various PowerVR graphics in earlier iPhone models. The A4 chip will certainly integrate a graphics core from PowerVR. As with essentially all GPU designs today, the PowerVR makes use of multiple, specialized CPU cores. There is relatively little information about its instruction set on the web, only that it is called META MTX and uses 16 bit RISC-ish instruction words. Update: PowerVR SGX does not use the META architecture, it has a distinct architecture of its own. Additional information can be downloaded after registration.

Apple has also invested heavily in two relevant technologies: OpenCL and LLVM.

  • OpenCL allows processing to be distributed across multiple CPUs in the system, even if they have different instruction sets. OpenCL algorithms are written in a language with syntax very similar to C99, and the framework handles the rest.
  • LLVM is a compiler toolkit, one aspect of which is a machine independent instruction set. Source code can be compiled to the LLVM virtual machine, and from there be translated into the equivalent opcodes for the target CPU. The compilation can be done statically before running it, or by a Just-In-Time compiler while interpreting the LLVM bytecodes.

iPhone applications are compiled to ARM instructions, but it is not much of a stretch to imagine support for sections of LLVM bytecodes as well. If the hardware has sufficient GPU power, the bytecode could be translated to the GPU instruction set and offloaded. Devices with less sophisticated GPUs would use the ARM instead. Apple does not allow iPhone apps to include their own virtual machine in this way, but would be free to provide the VM function as part of the OS.

I suspect this is the most compelling reason for Apple to build its own chip as opposed to buying off the shelf. The rest of the mobile industry is satisfied to offload 3D graphics and video decoding to the GPU. Apple has greater ambitions, and could make use of significantly more GPU pipelines. By controlling the complete platform from CPU to software, Apple can make tradeoffs which are not practical for the rest of the market. For example: a very large GPU plus very fast ARM would generate more heat than can be dissipated in a small form factor like a phone. Apple has the option to dynamically throttle the ARM clock speed in order to open up more thermal envelope for the GPUs, if sufficient OpenCL workload is ready to run. When the GPUs are less busy, the ARM clock speed can be brought back up.

Multi Package Modules

Multi Package ModuleThe CPU in the iPhone 3GS is a Samsung S5PC100. This is a multi package module with CPU, I/O chip, and SDRAM sandwiched tightly together. Multi chip modules have been around for a long time, where multiple dies wired together in one big package. The amount of testing which can be done on a raw die is rather limited, so MCM yields suffer as one bad die ruins the whole assembly.

Multi package modules are relatively new: each chip is in its own package, but use very tight pin spacings and do not have a heat spreader. They are soldered together on a small PCB, which in turn has a Ball Grid Array on the bottom with normal pin spacings. Because each chip is packaged separately a full suite of test vectors can be run before the final assembly is put together, improving the yield considerably and lowering the cost of the final product.

If we examine the main board of an iPhone 3GS, the largest component is not the Samsung processor - it is the Flash memory, an MPM containing a number of flash chips. The Samsung CPU in the iPhone 3GS comes in a close second in size, and is also a multi package module with CPU, I/O chip, and SDRAM.

With the A4, Apple will probably have one die containing both CPU and I/O. Samsung uses different I/O chips to tailor their offering to many market segments, which is not a goal for Apple. By arranging the pinout carefully, Apple might be able to make an MPM containing CPU, SDRAM, and Flash, reducing the total board area. Different Flash capacities could be offered by not stuffing portions of the MPM. The iPad itself might not need such an MPM as it is a much larger device, but future iPhones would benefit more.

To be clear: assembling an MPM is not something you can easily do when buying merchant silicon. The pins on one package have to be arranged so as to be easy to route to the pins on the other packages within the MPM.

Wrapping up

I'll say it again: I made this all up. I have no information on the specifics of the A4, just speculation. In the unlikely event that anyone reads this (instead of running away from yet another iPad blog post), don't copy it into Wikipedia as though it were verified information.

What about future iterations? Its tempting to consider a single chip containing the entire iPhone feature set, including radio and wireless networking. The A4 itself clearly doesn't do this, as GSM support is optional in the iPad. I suspect that even in future chips, Apple won't pull in the baseband radio. The front end portions of that chip are rather sensitive to noise, and generally don't work well when integrated in the corner of a gigantic ASIC. Also integrating the radio functionality would make it that much harder to keep up with advancements in wireless networks.

Another future possibility is to use this chip in Apple's other small form factor products, like the Airport Base Stations, Time Capsule, and AppleTV. This is certainly possible, but aside from obvious additional peripherals like SATA I'm not sure it adds many requirements to the chip.

Other articles about A4 you might find interesting:

  • The New York Times writes about the history of the A4 design team.
  • Louis Gray writes about Apple's heavy recruiting push to staff up their ASIC team.
  • Wikipedia already has an article, which will improve over time as more details emerge.

P.S.: While we're at it, the title of this post is a guess about the origins of the "A4" nomenclature: "Apple" is a capital A followed by four more letters.

Tuesday, February 2, 2010

Monday, February 1, 2010

Fuzzy Dice

Electric vehicle with fuzzy dice

Even electric vehicles can be made cooler with a simple application of fuzzy dice.