Thursday, August 26, 2010

Code Test Haiku

Function works first time
Briefly feel elation...
Did it really work?

Having code work the first time is more annoying than having it fail.

If it doesn't work I go poke around to fix the failing tests, and when it finally passes I feel confident that it really works.

When code works the first time I still go poke around in the unit tests, only this time I have no idea what I'm looking for. Did it really get tested? Does it really test what I think it's testing? Surely there is something wrong somewhere.

Wednesday, August 18, 2010

x86 vs ARM Mobile CPUs

The ARM architecture dominates mobile computing. It is used in all popular mobile phones and in a huge percentage of battery-powered devices generally. This is due partly to its good overall performance, but especially due to its performance per watt expended. ARM chips consume very little power when compared to x86, and ARM's power consumption still excels even when compared to other RISC chips. At one time even Intel manufactured ARM chips, the result of its purchase of the DEC semiconductor business and its excellent StrongARM design. In 2006 Intel sold its ARM products to Marvell Semiconductor, committing to x86 for every segment of the computing market.

It's easy to assume that this state of affairs will continue, and that Intel will never successfully compete in the mobile market. I suspect that is too simplistic an assumption. There are two main sources of power dissipation in modern microprocessors: the power consumed by transistors actively switching, and the power lost to leakage current.

[Figure: active current, leakage current into substrate]

x86 vs ARM: Active Power

It requires power to switch a CMOS transistor 0->1 or 1->0, so one way to reduce power consumption is to have fewer transistors and to switch them at a lower frequency. x86 is at a disadvantage here compared to ARM, which Intel and AMD's design teams have to cover with extra work and cleverness. The vagaries of the x86 instruction set burden it with hardware logic which ARM does not require.

  • Since the Pentium Pro, Intel has decoded complex x86 instructions down to simpler micro-ops for execution. AMD uses a similar technique. This instruction decode logic is active whenever new opcodes are fetched from RAM. ARM has no need for this logic, as even its alternate Thumb encoding is a relatively straightforward mapping to regular ARM instructions.
  • x86_32 exposes only a few registers to the compiler. To achieve good performance, x86 CPUs implement a much larger number of hardware registers which are dynamically renamed as needed. ARM does not require such extensive register renaming logic.
  • Every ARM instruction is conditional, and simple if-then-else constructs can be handled without branches. x86 relies much more heavily on branches, but frequent branches can stall the pipeline on a processor. Good performance in x86 requires extensive branch prediction hardware, where ARM is served with a far simpler implementation.
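
As a rough guide to the points above, active power is commonly approximated with the standard first-order CMOS model. The symbols are assumptions introduced for illustration, not taken from any vendor data: α is the fraction of the logic switching per cycle, C the switched capacitance, V the supply voltage, and f the clock frequency.

```latex
P_{\text{active}} \approx \alpha \, C \, V^{2} f
```

Extra decode, register renaming, and branch prediction logic means more transistors toggling, which raises the switched capacitance and so costs power even at the same clock frequency.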

x86 vs ARM: Leakage Current

Leakage current became a significant contributor to power consumption in 2003 with the move from 0.18 to 0.13 micron feature sizes, and has become more significant in each subsequent generation. The industry is now moving into 0.032 micron technologies.

A capacitor is formed when two conductive materials are separated by an insulator, called the dielectric. The capacitance is determined by the quality of the insulating material, quantified by the dielectric constant k. Higher k means more capacitance. "Leakage" is current which is able to flow out of the ASIC transistors and into the silicon substrate. To reduce the current leaking out, one needs to make a better dielectric between the transistor and the bulk of the silicon. This is generically referred to as high-k dielectric technology.
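
For reference, the textbook parallel-plate formula makes the role of k explicit. This is an idealized sketch: A (plate area) and d (insulator thickness) are symbols introduced here, and a real transistor is not a perfect parallel-plate capacitor.

```latex
C = \frac{k \, \varepsilon_0 \, A}{d}
```

At a fixed capacitance, a higher k allows a larger d, that is, a physically thicker insulator, which is harder for current to leak through.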

As we're now talking about silicon fabrication techniques, we have to start talking about Intel specifically rather than the x86 architecture in general. Intel began using a high-k dielectric in production in 2007, during the 45 nm generation of parts. The rest of the industry has been experimenting with such materials, but is only now rolling them into the 32 nm generation. Intel hasn't stopped working on the technique; their 32 nm process benefits from the last several years of experience.


x86 vs ARM: Predicting The Future

Leakage current becomes more significant with each generation of process technology. The power consumed by actively switching transistors has been radically reduced over the last few years, leaving leakage as the more significant source of current consumption. It is difficult to estimate how serious the effect is, but this article from March 2008 shows leakage current starting out relatively insignificant in 180 nm silicon but growing to nearly 40% of total power consumption in a 50 nm process.

So far as I can see, this trend will continue. Leakage current will soon become the dominant factor in CPU power consumption. In fact, in 32 nm processes it might already be the primary factor. This is where the game changes: the advantage for total power consumption shifts away from the efficiency of the CPU architecture and design, and to the process technology of the fab. Presumably, this trend informed Intel's decision to sell their ARM assets to Marvell: there is little reason to enrich a competitor if the advantages of doing so will diminish over time.
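
To make the shift concrete, here is a back-of-the-envelope calculation. It assumes the roughly 40% leakage share cited above and a hypothetical design improvement that halves active power; both numbers are illustrative only.

```latex
P_{\text{total}} = P_{\text{active}} + P_{\text{leak}}, \qquad
\frac{P_{\text{leak}}}{P_{\text{total}}} = 0.4
\;\Rightarrow\; P_{\text{active}} = 0.6 \, P_{\text{total}}

P'_{\text{total}} = \tfrac{1}{2} \left( 0.6 \, P_{\text{total}} \right) + 0.4 \, P_{\text{total}} = 0.7 \, P_{\text{total}}
```

Halving active power buys only a 30% reduction overall, and the payoff shrinks further as the leakage share grows.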


There is still room for clever design, of course. To reduce active power consumption, processor designs have long stopped the clock to unused portions of the CPU. To reduce leakage current, AMD is taking the next step to actually remove the power supply to those portions of the CPU. For ARM, that design choice makes even more sense. ARM has no control over the fab, so their designs have to minimize assumptions about the underlying silicon technology.

Right now ARM reigns supreme in the mobile space, but the strengths which gave it an advantage over x86 are rapidly becoming less compelling. Having to compete directly on silicon process sophistication moves the game onto Intel's turf, which Intel is happy to capitalize on with its Medfield platform. It's a great time to be in the mobile space.

Monday, August 16, 2010

A Scene from the Near Future


FURNITURE LICENSE AGREEMENT

THIS FURNITURE IS LICENSED, NOT SOLD. By using the furniture you signify that you have read and agree to all the terms of this license agreement.

THIS IS A LEGAL AND BINDING AGREEMENT BETWEEN YOU, HEREINAFTER ALSO REFERRED TO AS "USER", AND THE FURNITURE RETAILER, HEREINAFTER ALSO REFERRED TO AS THE "STORE". BY SITTING ON, LYING ON, OR OTHERWISE MAKING USE OF THE FURNITURE, HEREINAFTER REFERRED TO AS THE "PRODUCT", (OR AUTHORIZING ANY OTHER PERSON TO DO SO), YOU INDICATE YOUR COMPLETE AND UNCONDITIONAL ACCEPTANCE OF ALL THE TERMS AND CONDITIONS OF THIS LICENSE AGREEMENT. THIS LICENSE AGREEMENT CONSTITUTES THE COMPLETE AGREEMENT BETWEEN YOU AND THE STORE. IF YOU DO NOT AGREE TO THE TERMS OF THIS LICENSE AGREEMENT, YOU MUST DESTROY THE ITEM OF FURNITURE (WITH ALL ACCOMPANYING MATERIALS).

IMPORTANT: CAREFULLY READ THIS LICENSE BEFORE USING THIS PRODUCT. SITTING ON, LYING ON, OR OTHERWISE USING THIS PRODUCT INDICATES YOUR ACKNOWLEDGMENT THAT YOU HAVE READ THIS LICENSE AND AGREE TO BE BOUND BY AND COMPLY WITH ITS TERMS. IF YOU DO NOT AGREE, RETURN THE COMPLETE PRODUCT TO THE STORE WITHIN 30 DAYS OF THE DATE YOU PURCHASED IT FOR A FULL REFUND. THIS LICENSE AGREEMENT IS YOUR PROOF OF LICENSE. PLEASE TREAT IT AS VALUABLE PROPERTY.


A. LICENSE:

The STORE provides the user with the furniture and separate cushions (together called the "PRODUCT") and we grant the user a license to use the PRODUCT in accordance with the terms of this License. Any supplemental materials provided to the user as part of support services provided by the STORE for the PRODUCT shall be considered part of the PRODUCT and subject to the terms and conditions of this License. The copyright and all other rights to the PRODUCT shall remain with the STORE or its licensors.


B. THE USER MAY:

  1. transfer the PRODUCT to the USER's primary residence, and place it within a single room therein.
  2. move the PRODUCT to a different room within the same structure. The user must acquire and dedicate a supplementary license for each separate room in which the PRODUCT may be used. A license for the PRODUCT may not be shared or used concurrently in different rooms.

C. THE USER MAY NOT:

  1. use the PRODUCT or make copies of it except as permitted in this License.
  2. use the PRODUCT within a structure which is not the USER's primary residence. A separate commercial license applies to these cases.
  3. use the PRODUCT outside a structure, i.e. outdoors. A separate public performance license applies to these cases.
  4. reupholster, except within the initial TEN (10) days of a license term. Outside of the initial ten days, reupholstery will require an early renewal.
  5. deconstruct, reverse engineer, or disassemble the PRODUCT except to the extent the foregoing restriction is expressly prohibited by applicable law. The PRODUCT may not be used as a template from which to build additional furniture.
  6. rent, lease, assign, sell, or transfer the PRODUCT.
  7. modify the PRODUCT or merge all or any part of the PRODUCT with another item of furniture.
  8. separate the component parts of the PRODUCT for use in more than one room.

D. TERM:

This license shall remain in effect only for so long as the user is in compliance with the terms and conditions of this agreement. This license will terminate if the user fails to comply with any of its terms or conditions. The user agrees, upon termination, to destroy the PRODUCT.


E. U.S. GOVERNMENT RIGHTS:

With respect to any acquisition of the PRODUCT by or for any unit or agency of the United States Government (the "Government"), the Product shall be classified as "commercial furniture", as that term is defined in the applicable provisions of the Federal Acquisition Regulation (the "FAR") and supplements thereto, including the Department of Defense (DoD) FAR Supplement (the "DFARS"). The Product was developed entirely at private expense, and no part of the Product was first produced in the performance of a Government contract. If the Product is supplied for use by DoD, the Product is delivered subject to the terms of this Agreement and either (i) in accordance with DFARS 3.1415-9 (a) and 2.71(a), or (ii) with restricted rights in accordance with DFARS 012-345-6789 (c)(1)(ii)(JUN 1970), as applicable. If the Product is supplied for use by a Federal agency other than DoD, the Product is commercial furniture delivered subject to the terms of this Agreement and (i) FAR 6.66(a); (ii) FAR 57.57-57; or (iii) FAR 12.345-67(ALT VIII), as applicable.


F. GENERAL:

This License is the entire agreement between the STORE and the USER, superseding any other agreement or discussions, oral or written, and may not be changed except by a signed agreement. This License shall be governed by and construed in accordance with the laws of the state of Delaware, USA, excluding that body of law applicable to choice of law and excluding the United Nations Convention on Contracts for the International Sale of Goods and any legislation implementing such Convention, if otherwise applicable. If any provision of this License is declared by a Court of competent jurisdiction to be invalid, illegal, or unenforceable, such a provision shall be severed from the License and the other provisions shall remain in full force and effect.

Friday, August 13, 2010

Bury Brigades as the Future of Media?

I'm currently reading Cognitive Surplus, by Clay Shirky. It builds upon his earlier Here Comes Everybody, detailing how the Internet fundamentally changes the media landscape to an extent not seen since Gutenberg. Before the Internet, when the cost of distribution was non-trivial, you ended up with publishers, producers, TV networks, and a whole host of powerful institutions built upon managing the production. When the cost of distributing media drops to essentially nothing, when everybody who wants to can become a publisher without having to ask permission or convince anybody of the value of their work, it completely disrupts the models which evolved in the prior era. A lot more material will be produced. Much of it will be trash, as we've moved the filtering function away from an editor before publication and onto the audience after publication.

Something will evolve to fill an institutional role in the New Media. The current period of creative chaos is unlikely to continue forever. A portion of the population is willing to wade through the trash in order to surface the truly great, but only a small portion. The rest of us need some filtering, or curation as the cool kids seem to call it.


Warning: Speculation Ahead

Are Digg Bury Brigades early precursors to a form of New Media institution? Organized groups, loosely connected by shared interests but not centrally funded or managed, they influence the spread of material online and therefore gain some control over media distribution. Bury Brigades are negative filters, suppressing material they don't agree with rather than surfacing material they want to promote. There will equally be a role for positive filters, entities which seek out and promote material. Motivation for groups to organize as positive filters is less clear, as simple altruism and a desire for recognition only go so far.

Tuesday, August 10, 2010

A Barrrgain Indeed

If you have an Android phone, Barrr is a wonderful game by FireDroid. It is available in the Android Market or direct from their site via a QR barcode. It was developed by two Dutch students as part of their degree program in Multimedia: Mariecke Kouwenberg and Roy van der Veen.

Barrr is a lighthearted simulation of a pirate bar complete with tattoo station, video games, karaoke, and a dart board. Each scenario takes a couple minutes to complete, making it a great diversion for short interludes.

Price: free.

Monday, August 9, 2010

Mispsleled Words in Code

I ran into an interesting variation on the brain being able to recnogzie mispsleled wrods so long as the first and last letters are correct.

char filename[] = "XXXXXX";  /* writable array: mkstemp() rewrites the template in place */
mkstemp(filename);

One of the resulting runs produced:

-rwx------ 2 dgentry eng 4096 Jul 8 10:41 ufseuL

Why yes, it was useful. Thank you for your efforts, computer.

Wednesday, August 4, 2010

node.js from 30,000 feet

I attended a tech talk last week on node.js by Ryan Dahl. The video of the talk is up on YouTube.


 
Node.js Overall Structure

[Figure: V8+libev+libeio+libcares at the bottom, node bindings in the middle, top layer node.js standard library in JavaScript]

The JavaScript implementation in node.js is Google's V8. As mentioned in an earlier article, V8 compiles the source JavaScript directly to machine code the first time it is executed. There is no intermediate bytecode format and no JS interpreter. In addition to V8, node.js relies on libev for its event loop, libeio for asynchronous I/O, and c-ares for asynchronous DNS support. Like everything else in the known universe, it relies on OpenSSL for cryptography and SSL/TLS support.

A standard library in JavaScript is supplied. This provides access to the underlying C++ implementation, and also has helpful bits like a URL parser and a REPL shell for easy experimentation. One thing it does not provide is the DOM. Node.js is not a browser; there is no HTML document to interact with.


 
Node.js Implementation

Entry points to the C++ code appear as a JavaScript variable named process. For example, here is an excerpt from dns.js:

var dns = process.binding('cares');

'cares' refers to the c-ares DNS support library. The dns variable allows JavaScript code to make calls to c-ares.

// Easy DNS A/AAAA look up
exports.lookup = function (domain, callback) {

Notice the signature of the function: input arguments and a callback when finished. There are never blocking operations in node; everything which might not complete immediately is a callback.

  var addressType = dns.isIP(domain);
  if (addressType) {
    process.nextTick(function () {
      callback(null, domain, addressType);
    });

dns.isIP() calls into C++ code, which makes a series of inet_pton(AF_INET*) calls to figure out if the argument is a valid numeric IP address. I've omitted the C++ code here; we dive into a more interesting example below.

  } else {
    if (/\w\.local\.?$/.test(domain) ) {
      // ANNOYING: In the case of mDNS domains use NSS in the thread pool.
      // I wish c-ares had better support.
      process.binding('net').getaddrinfo(domain, 4, function (err, domains4) {
        callback(err, domains4[0], 4);
      });

Node.js has two ways to implement support routines in C++. If the C++ code is structured to be asynchronous with a callback, it can be launched from the main thread using libev. Node.js makes heavy use of async I/O for this reason. Blocking C++ calls are handled by a pool of worker threads, which send an event to the main thread when their operation completes. In this code snippet the 'local' domain is handled by the thread pool as a special case, because c-ares doesn't handle mDNS.

We'll come back to the thread pool code path later, after examining the common case.

    } else {
      channel.getHostByName(domain, dns.AF_INET, function (err, domains4) {
        if (domains4 && domains4.length) {
          callback(null, domains4[0], 4);
        } else {
          channel.getHostByName(domain, dns.AF_INET6, function (err, domains6) {
            if (domains6 && domains6.length) {
              callback(null, domains6[0], 6);
            } else {
              callback(err, []);
            }
          });
        }
      });
      ... etc ...

"channel" is a JavaScript variable which links to a context in the c-ares library. The JS code to create channel is omitted for brevity. Now we'll peel back one layer to look at the C++ implementation.

Handle<Value> Channel::GetHostByName(const Arguments& args) {
  HandleScope scope;
  Channel *c = ObjectWrap::Unwrap<Channel>(args.Holder());
  assert(c);

  if (!args[0]->IsString()) {
    return ThrowException(Exception::Error(
          String::New("First argument must be a name")));
  }

  if (!args[1]->IsInt32()) {
    return ThrowException(Exception::Error(
          String::New("Second argument must be a family")));
  }

  if (!args[2]->IsFunction()) {
    return ThrowException(Exception::Error(
          String::New("Third argument must be a callback")));
  }

  int family = args[1]->Int32Value();
  if (family != AF_INET6 && family != AF_INET) {
    return ThrowException(Exception::Error(
          String::New("Unsupported address family")));
  }

Argument unwrapping and validity checks when traversing the interface from one programming language to another are always tedious. You can never predict when someone will copy the channel.getHostByName invocation out of the standard library and mess with it, and you'd like the framework to do something sane no matter what they do.

  String::Utf8Value name(args[0]->ToString());

  ares_gethostbyname(c->channel, *name, family, HostByNameCb, cb_persist(args[2]));

  return Undefined();
}

That's it. ares_gethostbyname() is in the c-ares library, which we won't delve into here. HostByNameCb is the C++ callback function invoked when resolution is done. HostByNameCb injects an event to the JavaScript code, to call the callback function passed in to the original call.

The JavaScript has an alternate code path for mDNS requests, using the getaddrinfo() method on process.binding('net'). Most of that code path consists of the same sort of argument unwrapping and checking as GetHostByName, which we will omit. The mDNS code path uses a blocking DNS request, serviced by the thread pool. The code to send work to the pool and arrange a callback later is pleasingly simple:

  eio_custom(Resolve, EIO_PRI_DEFAULT, AfterResolve, rreq);

Resolve is the function the worker thread is supposed to call. AfterResolve is the callback function in the main loop which the worker thread should trigger when done.


 
Final Thoughts

Node.js makes it easy to develop high performance applications by not offering APIs which would drastically lower performance. Everything is a callback; there are no blocking calls in the API (except for initialization calls such as module loading). Where the underlying C++ implementation is also based on callbacks, this is straightforward. Where the underlying C++ code would block, the implementation becomes a somewhat more difficult exercise in thread management.
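
The callback-only style can be seen in miniature in the sketch below. It assumes a hypothetical helper, lookupNumeric, which is not part of node.js; it leans on net.isIP and process.nextTick, which are real node.js APIs. Even when the answer is known immediately, the result is still delivered through a deferred callback, much like the numeric-address branch of the DNS lookup code.

```javascript
const net = require('net');

// Hypothetical helper (not part of node.js): accepts only numeric IP
// addresses, and always reports the result via a deferred callback.
function lookupNumeric(domain, callback) {
  const family = net.isIP(domain); // 0 if not an IP literal, else 4 or 6
  if (family) {
    // The answer is already known, but the caller still hears about it
    // on a later tick rather than before lookupNumeric() returns.
    process.nextTick(function () {
      callback(null, domain, family);
    });
  } else {
    process.nextTick(function () {
      callback(new Error('not a numeric address: ' + domain));
    });
  }
}

// Record the order in which things happen.
const events = [];
events.push('before');
lookupNumeric('127.0.0.1', function (err, address, family) {
  if (err) {
    events.push('error');
    return;
  }
  events.push('callback ' + address + ' IPv' + family);
});
events.push('after');
```

The callback only runs once the current code has finished, so 'before' and 'after' are both recorded ahead of the callback entry. Callers get the same ordering whether the work was trivial or slow.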

The JavaScript API in node.js "feels" very much like JavaScript. I believe a main factor making this possible is the relatively small number of entry points required from the JavaScript down into the C++ code: sockets, DNS resolution, the http parsing library, etc. It was feasible for each interface to be lovingly crafted by hand, baking JavaScriptiness into the API.

Attempting this technique for software like GUIs, where the number of C/C++ APIs to bind to is enormous, would likely require a more automated linkage between JavaScript and C++. This is the world of things like SWIG to generate interfaces or libffi to make direct calls. SWIG and libffi are extremely useful in their niches, but definitely have the feel of a foreign intruder in the host language. I don't know that a node.js for GUIs would be as pleasant a thing to look upon, but we need a way to do so. Software needs to advance without having to continually reinvent and reimplement what has come before, and without requiring drastic amounts of manual effort.

Monday, August 2, 2010

Zeus SCM

The last few years have seen much innovation in source code management systems. Personally I find these systems to be needlessly complex, and have long wished for a simpler option. Therefore it is with great pride that I announce a new source code management system: Zeus. Its name comes from a particular event in Greek mythology where Pallas Athena bursts forth, fully grown, from the forehead of Zeus. The guiding principle of Zeus is that source code should appear in its final form, and thus it dispenses with such outmoded concepts as file versions and history. Source files simply spring forth from the repository, in all their glory.

Zeus relies on a single environment variable for all of its settings: ${REPO}. This is the central source repository where all files will reside. Zeus supports both local and remote filesystems such as NFS. The Zeus command structure is designed to leverage a developer's familiarity with existing shell commands. For example, checking code out from the repository is deliberately very similar to a copy command:

zeus cp ${REPO}/file.cc client/

Other available Zeus commands include, but are not limited to:

  • zeus ls ${REPO} : list files in the repository
  • zeus ls -l ${REPO} : detailed information about files in the repository
  • zeus echo "note" >> ${REPO}/notes.txt : maintain a feature request list in the repository
  • zeus cat ${REPO}/file.cc : print file.cc to the console, without checking it out

Another great thing about Zeus is its simple, straightforward implementation. Any developer can understand its workings, and implement changes to meet their needs.

#!/bin/sh

# Zeus source code management system
# Copyright 8/2010, Denton Gentry

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

exec "$@"

I hope you enjoy using Zeus as much as I do. Many thanks to Mark Essel for his contribution of ideas.