Monday, August 24, 2009

Plummeting Down the Chasm

Crossing the Chasm book cover

Crossing the Chasm is a seminal book in technology marketing, whose ideas quickly spread through the industry. Originally written in 1991 by Geoffrey Moore, it presented a new take on the technology adoption lifecycle. The lifecycle starts with tech enthusiasts willing to buy an immature product, runs through the majority buyers who make up the bulk of the market, and finally trails off when market saturation is reached. It had been commonly depicted as a bell curve:


Technology Adoption Life Cycle

Moore's key observation is that while the bell curve implies there is a smooth transition from early market to majority, in reality the buyers in the early market are fundamentally different from the majority that comes later. Technology companies who don't appreciate this gap will stumble and often fail once they saturate the small pool of early buyers. Moore referred to this gap as "the chasm."

Technology Adoption Life Cycle

Early adopters are visionaries. They are willing to look at an immature product and figure out how to use it in their operations to get a competitive advantage. They will sponsor integration work within their IT departments, and generate long lists of product feedback to better fit their needs. They are fundamentally different from the majority buyers in that they will look at an interesting product and figure out what problem it can solve. The majority market comes from the opposite direction with a problem to solve, looking for a solution. Product planning and marketing which works in the early part of a product lifecycle will fail utterly later in the game.


 
Yon Chasm Approacheth

It is quite possible to get stuck in the early market, continuously trying to meet the needs of early adopters and never enjoying the big sales of the majority market. Having spent the last several years in this predicament, I'll offer my version of the lifecycle chart. What it lacks in precision, it makes up for in snark.


Rope bridge with gaps

As a development engineer it can be difficult to tell how well the sales cycle is working as one rarely gets direct visibility, but the indirect evidence is plentiful.

  • If nearly every deal comes with a list of new product features to be implemented, you have not crossed the chasm.
  • If, for every medium or large deal, it is unclear whether the cost of winning the business exceeds the revenue it would bring, you have not crossed the chasm.
  • If every deal is "high touch," requiring multiple visits by a salesperson and sales engineer and possibly a consultation with the development team, you have not crossed the chasm.
  • If your product cannot be sold via a web site but instead always requires an evaluation period and report, you have not crossed the chasm.
  • If every customer is using a different subset of the product functionality, you have not crossed the chasm.

This is important: if the company does not realize that the real problem is in the approach to the market, all of these things will be blamed on the product. The reasoning will be that "if we just pound out a few more of these deals, we'll have finally implemented everything that everybody wants and sales will take off." There may even be an element of truth in this sentiment, if the product shipped early before its natural feature set was complete. However if the company has not crossed the chasm, the fundamental problem is elsewhere.

You are not getting a list of feature requirements because the product is incomplete, but because you are selling to the type of customer who generates lists of requirements.

You are getting that list of requirements because you are still selling into the visionary and early adopter segments of the market, the people who are willing to think about how best to integrate the product into their operations. If you were selling into the mainstream market there would be no list of requirements because the mainstream won't do an extensive integration on their own. The product either meets their needs or it doesn't, and you'll either get the sale or you won't. In the mainstream there will be no back and forth of what the product could do to win the business. At most, the mainstream buyer might tell you why you lost the business.

Don't get stuck wandering around in the chasm. Trust me, it sucks.

Friday, August 21, 2009

Smartbooks and Handheld PCs

Intel markets the Atom processor for netbooks. It trades lower processing power for very low power consumption, and is quite inexpensive in large quantities. These are areas where ARM, PowerPC, and MIPS vendors have long focused, though they have aimed at non-PC form factors. Now a new round of small laptops is hitting the market, using non-x86 processors with either Windows CE or a Linux software stack such as Google's Android or, eventually, Chrome OS. Most notably, Dell appears to be on the verge of introducing such a device - the true measure of whether a product category has entered the mainstream. These devices are generally called Smartbooks, owing to Qualcomm's extensive marketing push for its ARM Snapdragon chips in such a role.

Vadem Clio CL-1050

What strikes me about these devices is that we've been down a similar road before. In the late 1990s there was a flurry of activity around Handheld PCs, which were relatively small and inexpensive compared to the laptops of the day. They typically used MIPS processors, ran Windows CE, and supplied basic word processing and communications software. Handheld PCs didn't last long on the market, and Microsoft ceased work on the software after just two years. The devices simply were not useful enough when compared to a full laptop.

This is another example of how infrastructure and market timing can contribute more to the success of a new product than its design. If a product arrives too early, it won't get enough traction to drag the rest of the environment along. In 2009 these devices can depend on a fast wireless infrastructure and plentiful cloud-hosted applications, neither of which existed in 1999. The world has changed.

Tuesday, August 18, 2009

Virtual Machines And Manual Transmissions

Stick Shift Knob

I've chosen a manual transmission for every vehicle I've purchased. It is a personal preference: I like the feeling of control over the engine and the ability to trade engine speed for torque. Driving a stick shift was also an advantage in school: practically nobody else knew how to drive one, so nobody could borrow my car.

For software development I code mostly in C, which is a rather thin layer on top of the machine. Even C++, while still considered a low-level language, nonetheless implements significantly more abstraction. Consider the following simple example of incrementing a variable in C and in C++:

C:
int val = 0;

void incr() {
    val++;
}
C++:
class exi {
    int val;

    public:
        exi() { val = 0; };

        void incr() {
            val++;
        }
};
Note that in neither case is incr() declared to take an argument. We'll use one of my favorite techniques, disassembling the binary to see how it works. This time we're looking at PowerPC opcodes.
C disassembly:
_main:
 bl _incr            # branch to incr()

_incr:
 mfspr r9,lr         # return address, used as a base to locate val
 lwz r2,0xbc(r9)     # fetch val
 addi r2,r2,0x1      # increment
 stw r2,0xbc(r9)     # store new val
 blr                 # return

C++ disassembly:
_main:
 ... much C++ object init removed ...
 addi r3,r1,0x38     # object address in arg0 (r3)
 bl __ZN6exi4incrEv  # branch to exi::incr()

__ZN6exi4incrEv:
 lwz r2,0x0(r3)      # fetch val from *arg0
 addi r2,r2,0x1      # increment
 stw r2,0x0(r3)      # store new val
 blr                 # return

Though the C++ source code shows no argument to exi::incr(), at the machine level there is one: the object address is passed as the first argument, in r3. That pointer becomes "this" inside the method; the generated code must have an object address to operate on.
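
A rough way to picture this is that the compiler lowers the method into a plain function which takes the object address explicitly. The sketch below is illustrative only; the struct layout and function name are mine, not actual compiler output:

struct exi { int val; };

/* What exi::incr() effectively becomes: a free function whose
   first argument is the object address, i.e. the "this" pointer. */
void exi_incr(struct exi *this_ptr) {
    this_ptr->val++;
}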

In low level languages you can generally see the relationship from source to the resulting machine code, even when significant compiler optimization is done. As we move to higher level languages, the abstractions between the source and the machine code grow ever larger. C++ is somewhat higher level than C, and at the machine level the mapping from instructions back to source code is less clear. More abstract languages like Java, Python, and C# compile to bytecode for a virtual machine running on top of the real hardware. If one gathered an instruction trace of the CPU's execution, one would be hard-pressed to correlate those instructions back to the source code they implement.


 
Foreshadowing

One can see the day coming when manual transmissions will be unavailable in most car models. Continuously variable transmissions were an early indication of this trend, with an essentially infinite number of gearing ratios that could only be effectively controlled by an engine computer. Hybrid vehicles have a complex transmission which meshes the output of two motors, and again can only be controlled by computer. Future vehicles will likely have an entirely electric drivetrain, with no need for a conventional transmission at all. The simple fact is that the engine computer can do a far better job of optimizing the behavior of the drivetrain than I can.

I'm currently digging into the low-level aspects of virtual machines. Running compilation just-in-time as part of a virtual machine has several notable advantages over static compilation with gcc:

  • gcc's optimization improves if you compile with profiling, run the program, and then compile again. This is so annoying that it is hardly ever done. A virtual machine always has profiling data available, as it interprets the bytecodes for a while before running the JIT.
  • gcc's profile-guided optimization is done in advance, on a representative corpus of input data which the programmer supplies. If the program operates on inputs which differ substantially from this, its performance will not be optimal. The JIT optimization is always done with the real data as profiling input.
  • gcc can optimize for a specific CPU pipeline, such as Core2 vs NetBurst vs i486. One is trading off performance improvement on the favored CPU versus degradation on other CPUs. The JIT can know the specific type of CPU being used and can optimize accordingly.
  • gcc can do static constant propagation across subroutines. That is, if a constant NULL is passed to a function, gcc can create a version of that function with the unreachable code eliminated. The JIT can create optimized versions of functions tuned for specific argument values dynamically, whether those arguments are constants or variables; it just has to check that the arguments still match what it specialized for, and it is free to fall back to the interpreted bytecode on a mismatch (see the sketch after this list).
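
To make that last point concrete, here is a small C sketch of what such specialization looks like. The function names and the NULL-argument scenario are mine, chosen only to illustrate the idea; a JIT would generate the specialized clone at run time and guard it with an argument check, while gcc emits it at build time from what it can prove statically:

#include <stddef.h>
#include <string.h>

/* General version: either copies src into dst, or just measures it. */
size_t copy_or_count(char *dst, const char *src) {
    if (dst == NULL)
        return strlen(src);      /* measure-only path */
    strcpy(dst, src);
    return strlen(dst);
}

/* Specialized clone for the dst == NULL case: the copy path is
   unreachable and has been removed entirely. A JIT would guard the
   call site with a dst == NULL check and fall back to the general
   version (or the interpreter) when the guard fails. */
size_t copy_or_count_null_dst(const char *src) {
    return strlen(src);
}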

This should be fun. Some initial thoughts:

  • Modern CPUs have extensive branch prediction and speculative execution features to keep them from spending all their time stalled waiting on the outcome of a branch. What happens when we have a lot more big loops of straight-line code, where the JIT has optimized all the conditionals up to sanity checks at the entry to the function?
  • Does widespread use of JIT mean that VLIW architectures become more viable? VLIW is particularly dependent on the compiler to match the code to available hardware resources, which a JIT is better positioned to tackle.

Tuesday, August 11, 2009

blog.8.11.2009 < /dev/random


A while ago I wrote a FeedFlare for FriendFeed, my second Google App Engine project. Monday morning, friendfeed.com announced that it had been acquired by Facebook. Anybody want to buy a slightly used FeedFlare?




A year ago I wrote about the importance of publishing the GPLd source used in commercial products, to avoid the public relations nightmare that comes from being accused of a GPL violation. One of the examples in that article was the SFLC suing Extreme Networks. I'm happy to report that the suit was settled in October 2008. I'm happy to report this because I imported busybox into the source tree at Extreme, and I really didn't want to get dragged in for depositions. Extreme's engineering management was convinced that waiting until somebody asked for the source code would be fine.


Thursday, August 6, 2009

Toward a More Robust Twitter Infrastructure

This morning's Twitter outage, caused by a DDoS attack, reminds me: I wrote a guest post for the most excellent Cranky Product Manager about how to make Twitter more robust. It ran on April 1, 2009.

The Cranky Engineer Responds to a PRD