Coding Relic: mobile

Showing posts with label mobile. Show all posts

Thursday, May 5, 2011

Intel 22nm and Mobile Computing

Yesterday Intel announced details of their 22nm silicon process. There were a number of fascinating details on how the transistors are made, but one graph from the presentation presages what will happen in the mobile market over the next several years.

As the voltage goes to zero, the consumed current goes to zero. It sounds obvious, but really isn't. Even when nominally "off," transistors have always leaked current into the substrate. As silicon features have gotten smaller the power they consume while active has declined rapidly, but the leakage current less rapidly.

Other graphs in the presentation show a tradeoff between operating voltage and leakage current, which means power consumed while active versus power consumed at all times. Intel's production chips will likely tolerate a little leakage to get lower voltage, but still very low.

In 32nm silicon processes leakage current may already be the primary factor in power consumption. It is difficult to estimate how serious the effect is, but this article from March 2008 shows leakage current as relatively insignificant in 180 nm silicon but growing to nearly 40% of total power consumption in a 50 nm process. We're at 32nm now.

Except Intel just changed the game.

ARM has several advantages in the mobile space. Their products are available from many manufacturers and their support in software toolchains is nearly universal, but their biggest advantage has been low power consumption compared to x86 or other architectures. ARM did a great job designing chips which are very sparing of the power they consume while operating.

Except Intel just changed the game.

Intel now has a silicon process with radically lower leakage current. x86 consumes more power while actively operating, but leakage current is more significant. ARM's competitive advantage has shrunk substantially. Expect to see a lot more x86 CPUs in mobile devices, starting in late 2011.

Wednesday, August 18, 2010

x86 vs ARM Mobile CPUs

The ARM architecture dominates mobile computing. It is used in all popular mobile phones and in a huge percentage of battery powered devices generally. This is due partly to its good overall performance, but especially due to its performance per watt expended. ARM chips consume very little power when compared to x86, and ARM's power consumption still excels even when compared to other RISC chips. At one time even Intel manufactured ARM chips, the result of its purchase of the DEC semiconductor business and its excellent StrongARM design. In 2006 Intel sold its ARM products to Marvell Semiconductor, committing to x86 for every segment of the computing market.

Its easy to assume that this state of affairs will continue, and that Intel will never successfully compete in the mobile market. I suspect that is too simplistic an assumption. There are two main sources of power dissipation in modern microprocessors: the power consumed by transistors actively switching, and the power lost to leakage current.

active current, leakage current into substrate

x86 vs ARM: Active Power

It requires power to switch a CMOS transistor 0->1 or 1->0, so one way to reduce power consumption is to have fewer transistors and to switch them at a lower frequency. x86 is at a disadvantage here compared to ARM, which Intel and AMD's design teams have to cover with extra work and cleverness. The vagaries of the x86 instruction set burdens it with hardware logic which ARM does not require.

Since the Pentium Pro, Intel has decoded complex x86 instructions down to simpler micro-ops for execution. AMD uses a similar technique. This instruction decode logic is active whenever new opcodes are fetched from RAM. ARM has no need for this logic, as even its alternate Thumb encoding is a relatively straightforward mapping to regular ARM instructions.
x86_32 exposes only a few registers to the compiler. To achieve good performance, x86 CPUs implement a much larger number of hardware registers which are dynamically renamed as needed. ARM does not require such extensive register renaming logic.
Every ARM instruction is conditional, and simple if-then-else constructs can be handled without branches. x86 relies much more heavily on branches, but frequent branches can stall the pipeline on a processor. Good performance in x86 requires extensive branch prediction hardware, where ARM is served with a far simpler implementation.

x86 vs ARM: Leakage Current

Intel Nehalem processor die Leakage current became a significant contributor to power consumption in 2003 with the move from 0.18 to 0.13 micron feature sizes, and has become more significant in each subsequent generation. The industry is now moving into 0.032 micron technologies.

A capacitor is formed when two conductive materials are separated by an insulator, called the dielectric. The capacitance is determined by the quality of the insulating material, quantified by the dielectric constant k. Higher k means more capacitance. "Leakage" is current which is able to flow out of the ASIC transistors and into the silicon substrate. To reduce the current leaking out, one needs to make a better dielectric between the transistor and the bulk of the silicon. This is generically referred to as high-k silicon technology.

As we're now talking about silicon fabrication techniques, we have to start talking about Intel specifically rather than the x86 architecture in general. Intel began using a high-k dielectric in production in 2007, during the 45 nm generation of parts. The rest of the industry has been experimenting with such materials, but is only now rolling it into the 32 nm generation. Intel hasn't stopped working on the technique, their 32 nm process benefits from the last several years of experience.

x86 vs ARM: Predicting The Future

Leakage current becomes more significant with each generation of process technology. The power consumed by actively switching transistors has been radically reduced over the last few years, leaving leakage as the more significant source of current consumption. It is difficult to estimate how serious the effect is, but this article from March 2008 shows leakage current starting out relatively insignificant in 180 nm silicon but growing to nearly 40% of total power consumption in a 50 nm process.

So far as I can see, this trend will continue. Leakage current will soon become the dominant factor in CPU power consumption. In fact, in 32 nm processes it might already be the primary factor. This is where the game changes: the advantage for total power consumption shifts away from the efficiency of the CPU architecture and design, and to the process technology of the fab. Presumably, this trend informed Intel's decision to sell their ARM assets to Marvell: there is little reason to enrich a competitor if the advantages of doing so will diminish over time.

There is still room for clever design, of course. To reduce active power consumption, processor designs have long stopped the clock to unused portion of the CPU. To reduce leakage current, AMD is taking the next step to actually remove the power supply to those portions of the CPU. For ARM, that design choice makes even more sense. ARM has no control over the fab, their designs have to minimize assumptions about the underlying silicon technology.

Right now ARM reigns supreme in the mobile space, but the strengths which gave it an advantage over x86 are rapidly becoming less compelling. Having to compete directly on silicon process sophistication moves the game onto Intel's turf, which Intel is happy to capitalize on with its Medfield platform. Its a great time to be in the mobile space.

Tuesday, August 10, 2010

A Barrrgain Indeed

Barrr screenshot If you have an Android phone, Barrr is a wonderful game by FireDroid. It is available in the Android Market or direct from their site via a QR barcode. It was developed by two Dutch students as part of their degree program in Multimedia: Mariecke Kouwenberg and Roy van der Veen.

Barrr is a lighthearted simulation of a pirate bar complete with tattoo station, video games, karaoke, and a dart board. Each scenario takes a couple minutes to complete, making it a great diversion for short interludes.

Price: free.

Friday, July 16, 2010

Alternate Phone B

Image courtesy of Apple

When Gizmodo acquired an iPhone 4 prototype, I recall a mention that Apple execs were especially annoyed because there had been two versions of the iPhone 4. The one Gizmodo recovered was the more aggressive of the two designs, with the other version prepared as a backup. When Gizmodo showed the prototype to the world, it became impossible to go with the alternate design.

When developing a backup release for any product, software or hardware, you want to minimize the resources spent on it. You spend most of the effort on the primary design, the more aggressive version. The primary will ship, its just a question of when, so the resources spent on it will not go to waste. The backup, in contrast, will ideally never see the light of day. You'd prefer to only spend resources on it which can also benefit the primary effort, and minimize the amount of throw-away work.

So what might the "Alternate Plan B" version of the iPhone 4 have been? Using the A4 CPU seems assured, as not using it would have led to uncomfortable questions. It presumably would also have incorporated the current versions of the baseband radio circuitry, and possibly the new gyroscope as well. If the gyro didn't work out, it could be marked no-stuff in assembly and the software support disabled. I'd speculate that the Alternate Plan B for iPhone 4 was a new board with A4 CPU and gyro sized to fit in a minimally modified 3GS housing.

I wonder which version would Apple have gone with, had they a choice?

Thursday, April 29, 2010

Deep Pockets

In the tech industry, how often do we hear this?

"Great technology, they just didn't market it well."
"Its a shame to see that product die.
"They need somebody with deep pockets to see it through."

Of course this brings us to Palm, acquired by HP for $1.2 billion. Brian Humphries, an HP executive in business development, reportedly said: "Our intent is to double down on webOS." Palm managed to find their deep pocketed benefactor. Now we get to watch what happens.

This is the second time a savior has swooped in for Palm. In the early 1990s before the PDA had really established itself as a category, Palm nearly ran out of money. Its VCs were unwilling to put in more, but Palm was not generating enough revenue to operate. The company was purchased by US Robotics, which was later purchased by 3Com. Palm operated successfully for many years after that first brush with death.

We'll see what happens from here. HP likely believes that by owning the complete system, from hardware to OS to applications, they will be able to deliver compelling products and compete successfully with the iPhone. Time will tell.

Saturday, April 10, 2010

Bending the Rules

As of yesterday the iPhone Developer Program License reads:

3.3.1 - Applications may only use Documented APIs in the manner prescribed by Apple and must not use or call any private APIs. Applications must be originally written in Objective-C, C, C++, or JavaScript as executed by the iPhone OS WebKit engine, and only code written in C, C++, and Objective-C may compile and directly link against the Documented APIs (e.g., Applications that link to Documented APIs through an intermediary translation or compatibility layer or tool are prohibited).

The way the new requirement is phrased is interesting. Adobe's Packager for iPhone compiles Flash code to native ARM instructions. Assuming the Adobe tool can produce position independent code, one might consider something sneaky.

void run_disassembled_flash_binary()
{
  asm volatile("ldr r3, [pc, #192]\n\t"
               "mov r1, #3\n\t"
               "str r3, [sp, #32]\n\t"
               "mov r3, #0\n\t"
               "str r3, [sp, #4]\n\t"
               "str r3, [sp, #8]\n\t"
               "mov r2, #24\n\t"
               "mov r3, #48\n\t"
               "add r5, sp, #12\n\t"
               "str r0, [sp, #24]\n\t"
               "str r0, [sp, #0]\n\t"
               "add r0, sp, #12\n\t"
               ...

I make this suggestion in jest. Apple does allow embedded ARM assembly in iPhone apps, but an application consisting of nothing but opcodes produced by a foreign programming environment will not resemble an app compiled from more conventional source code. I very much doubt it would pass the App Store approval process. Though phrased as a technical requirement, section 3.3.1 of the iPhone Developer License is really a business imperative.

Friday, February 5, 2010

Apple == A, Plus Four More Letters

What the web needs right now is another blog post about the iPad.

Apple A4 chip No, don't run away! This will be different, I promise. We'll focus on Apple's A4, a custom CPU first used in the iPad. It has been widely assumed that A4 uses a licensed ARM Cortex A8 core. I have no reason to dispute this assertion, it seems like a fine choice. It has also been asserted that because the same core is used in parts from Samsung, TI, and Qualcomm, Apple should not have bothered making its own chip. Today, Gentle Reader, we'll explore that notion a bit.

Apple ASIC Expertise

Apple has a long history of ASIC design. Apple produced custom silicon for various Macintosh models since at least the late 1980s, when they designed the audio chips used in the Quadra 700 and 900 (a chip called "Batman"). Later, Apple designed entire chipsets to interface with the PowerPC 60x bus. Apple licensed a gigabit Ethernet MAC design from Sun, and used it plus IDE controller and other peripherals in chipsets for several Powermac models. With the switch to x86, Apple's efforts became much more constrained. The x86 bus interface is difficult to license, and Intel's own chipsets are quite reasonable. So far as I know, x86 Macintoshes no longer use custom Apple ASICs.

Custom chip design isn't a radical departure for Apple.

With that as background, what might Apple have done in the A4 chip? I have absolutely no inside information about the iPad or A4 processor, I'm going to make stuff up because its fun to speculate.

Graphics & OpenCL

PowerVR SGX CPU Overview Apple holds a nearly 10% stake in Imagination Technologies Group, which designs the PowerVR graphics accelerator and other IP relating to massively threaded processing. Apple uses their PowerVR SGX 535 in iPhone 3GS, and used various PowerVR graphics in earlier iPhone models. The A4 chip will certainly integrate a graphics core from PowerVR. As with essentially all GPU designs today, the PowerVR makes use of multiple, specialized CPU cores. There is relatively little information about its instruction set on the web, ~~only that it is called META MTX and uses 16 bit RISC-ish instruction words~~. Update: PowerVR SGX does not use the META architecture, it has a distinct architecture of its own. Additional information can be downloaded after registration.

Apple has also invested heavily in two relevant technologies: OpenCL and LLVM.

OpenCL allows processing to be distributed across multiple CPUs in the system, even if they have different instruction sets. OpenCL algorithms are written in a language with syntax very similar to C99, and the framework handles the rest.
LLVM is a compiler toolkit, one aspect of which is a machine independent instruction set. Source code can be compiled to the LLVM virtual machine, and from there be translated into the equivalent opcodes for the target CPU. The compilation can be done statically before running it, or by a Just-In-Time compiler while interpreting the LLVM bytecodes.

iPhone applications are compiled to ARM instructions, but it is not much of a stretch to imagine support for sections of LLVM bytecodes as well. If the hardware has sufficient GPU power, the bytecode could be translated to the GPU instruction set and offloaded. Devices with less sophisticated GPUs would use the ARM instead. Apple does not allow iPhone apps to include their own virtual machine in this way, but would be free to provide the VM function as part of the OS.

I suspect this is the most compelling reason for Apple to build its own chip as opposed to buying off the shelf. The rest of the mobile industry is satisfied to offload 3D graphics and video decoding to the GPU. Apple has greater ambitions, and could make use of significantly more GPU pipelines. By controlling the complete platform from CPU to software, Apple can make tradeoffs which are not practical for the rest of the market. For example: a very large GPU plus very fast ARM would generate more heat than can be dissipated in a small form factor like a phone. Apple has the option to dynamically throttle the ARM clock speed in order to open up more thermal envelope for the GPUs, if sufficient OpenCL workload is ready to run. When the GPUs are less busy, the ARM clock speed can be brought back up.

Multi Package Modules

The CPU in the iPhone 3GS is a Samsung S5PC100. This is a multi package module with CPU, I/O chip, and SDRAM sandwiched tightly together. Multi chip modules have been around for a long time, where multiple dies wired together in one big package. The amount of testing which can be done on a raw die is rather limited, so MCM yields suffer as one bad die ruins the whole assembly.

Multi package modules are relatively new: each chip is in its own package, but use very tight pin spacings and do not have a heat spreader. They are soldered together on a small PCB, which in turn has a Ball Grid Array on the bottom with normal pin spacings. Because each chip is packaged separately a full suite of test vectors can be run before the final assembly is put together, improving the yield considerably and lowering the cost of the final product.

If we examine the main board of an iPhone 3GS, the largest component is not the Samsung processor - it is the Flash memory, an MPM containing a number of flash chips. The Samsung CPU in the iPhone 3GS comes in a close second in size, and is also a multi package module with CPU, I/O chip, and SDRAM.

With the A4, Apple will probably have one die containing both CPU and I/O. Samsung uses different I/O chips to tailor their offering to many market segments, which is not a goal for Apple. By arranging the pinout carefully, Apple might be able to make an MPM containing CPU, SDRAM, and Flash, reducing the total board area. Different Flash capacities could be offered by not stuffing portions of the MPM. The iPad itself might not need such an MPM as it is a much larger device, but future iPhones would benefit more.

To be clear: assembling an MPM is not something you can easily do when buying merchant silicon. The pins on one package have to be arranged so as to be easy to route to the pins on the other packages within the MPM.

Wrapping up

I'll say it again: I made this all up. I have no information on the specifics of the A4, just speculation. In the unlikely event that anyone reads this (instead of running away from yet another iPad blog post), don't copy it into Wikipedia as though it were verified information.

What about future iterations? Its tempting to consider a single chip containing the entire iPhone feature set, including radio and wireless networking. The A4 itself clearly doesn't do this, as GSM support is optional in the iPad. I suspect that even in future chips, Apple won't pull in the baseband radio. The front end portions of that chip are rather sensitive to noise, and generally don't work well when integrated in the corner of a gigantic ASIC. Also integrating the radio functionality would make it that much harder to keep up with advancements in wireless networks.

Another future possibility is to use this chip in Apple's other small form factor products, like the Airport Base Stations, Time Capsule, and AppleTV. This is certainly possible, but aside from obvious additional peripherals like SATA I'm not sure it adds many requirements to the chip.

Other articles about A4 you might find interesting:

The New York Times writes about the history of the A4 design team.
Louis Gray writes about Apple's heavy recruiting push to staff up their ASIC team.
Wikipedia already has an article, which will improve over time as more details emerge.

P.S.: While we're at it, the title of this post is a guess about the origins of the "A4" nomenclature: "Apple" is a capital A followed by four more letters.

Wednesday, January 27, 2010

On the Ancestry of Consumer Electronics

Writing about the Newtpocalypse the other day inspired me to get my MessagePad 120 working again. In the comments of that post, Grant Hutchinson pointed out that my MP120 would not be affected and should still work. Indeed, it does! The backup lithium battery still had a little charge left, so my files are still present. The Newton pre-dates Flash memory, and stores all files in battery backed SRAM. The battery faithfully kept the SRAM powered for the last 13 years sitting in my garage, patiently waiting for me to use it again.

This morning Kara Swisher republished a Steve Jobs interview from 2004, talking about the products which Apple chooses to pursue and not to pursue. Its an interesting perspective on the eve of Apple's media event today. As she notes, in some respects Apple's new device will be "the Newton’s great-grandson."

Tuesday, January 5, 2010

A Big Day for Mobile Technology

Today, January 5, 2010, will be a big day in the history of mobile technology. A huge day. Any tech news site worthy of your attention should devote many resources to the events of today.

I am, of course, referring to the impending Newtpocalypse.

The Apple Newton was a groundbreaking device. The picture accompanying this post is my much-beloved Newton MP120.

NewtonScript tracks time intervals using a 30 bit signed integer, with an epoch of 1993. 2²⁹ bits from 1993 is January 5, 2010, at 6:48:31 pm. Tonight. When the NetwonScript time overflows into bit 30 it becomes a negative number, and Bad Things Happen. Fixes have been developed, but only for the later model Newtons. My beloved MP120 will stop working tonight. Sniff. Of course, it hasn't been charged since 1999 so its effective behavior will be unchanged.

PS: in addition to being the Newtpocalypse, I think Google has some kind of announcement today relating to mobile tech.

Friday, September 11, 2009

Motorola Mobile Meanderings

On September 10th Motorola announced the CLIQ, its first Android phone and a product which a good friend of mine has labored over for quite some time.

For many years I have used my trusty Sony-Ericsson T616, but I admit it might be getting a bit dated. I decided to compare the two devices to see whether its time for me to take up a new phone.

	T616	CLIQ
Cracked LCD screen?	Y	N
Broken down arrow key?	Y	N
Wallpaper picture of my Daughter?	Y	N

As you can see, clearly I cannot replace my T616 until Motorola rectifies these glaring omissions in the CLIQ. The cracked screen and non-functional arrow key could both be achieved via sufficiently rough handling. Frankly I don't know how Motorola would obtain pictures of my daughter to use as wallpaper, though I might be willing to accept this little flaw and add my own wallpaper later.

This chart shows that my T616 is the perfect phone, but I'm in a generous mood. The first person to offer me enough money for me to buy a CLIQ, takes it.