Coding Relic: Grand Unified Theories of Switching

This is the fourth and final article in a series discussing hash engines versus CAMs for network switch chips. The first, second, and third installments were published earlier this week.

Having spilled so much digital ink discussing hash engines in switch hardware, we finally come to the point of the exercise.

Elegance, Meet Implementation

We're currently witnessing a huge shift in networking, with technologies that let external entities take direct control of forwarding decisions in the switch. Current work in this area focuses on flexible extraction of fields from the packet, and on defining flow rules that take action based on a match on any field.

The overwhelming temptation is to define a superset key containing every interesting field from the packet, and use masks to implement specific functions. For example, consider a 9-tuple:

We then populate the table with entries, and use masking to implement the specific functions desired.

Lookup:	ingress port	VLAN	Ethertype	L3 Src	L3 Dst	IP Proto	DSCP	L4 Src	L4 Dst
Nexthop/24	0 | 0x0	0 | 0x0	0 | 0x0	0 | 0x0	10.1.1.0 | 0xffffff00	0 | 0x0	0 | 0x0	0 | 0x0	0 | 0x0
L3 ACL	0 | 0x0	0 | 0x0	0 | 0x0	0 | 0x0	199.59.148.82 | 0xffffffff	6 | 0xff	0 | 0x0	0 | 0x0	80 | 0xffff
QoS Assign	0 | 0x0	0 | 0x0	0 | 0x0	0 | 0x0	0 | 0x0	6 | 0xff	1 | 0xff	0 | 0x0	22 | 0xffff
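
To make the value | mask notation concrete, here is a minimal Python sketch of how a masked lookup over this table behaves. It models the semantics only, not how a CAM is built: a CAM compares all entries in parallel, while this loops over them. The names (FIELDS, entry, make_packet, lookup) are invented for illustration and do not come from any protocol or product.

```python
import ipaddress

# Field order follows the lookup table above.
FIELDS = ("ingress_port", "vlan", "ethertype", "l3_src", "l3_dst",
          "ip_proto", "dscp", "l4_src", "l4_dst")

def ip(addr):
    """Convert a dotted-quad IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(addr))

def entry(name, **fields):
    """Build a rule of (value, mask) pairs; unspecified fields are fully wildcarded."""
    return name, {f: fields.get(f, (0, 0x0)) for f in FIELDS}

# The three rows of the table, expressed as value | mask per field.
TABLE = [
    entry("Nexthop/24", l3_dst=(ip("10.1.1.0"), 0xffffff00)),
    entry("L3 ACL", l3_dst=(ip("199.59.148.82"), 0xffffffff),
          ip_proto=(6, 0xff), l4_dst=(80, 0xffff)),
    entry("QoS Assign", ip_proto=(6, 0xff), dscp=(1, 0xff), l4_dst=(22, 0xffff)),
]

def make_packet(**fields):
    """Build a packet key; fields not given default to zero."""
    return {f: fields.get(f, 0) for f in FIELDS}

def matches(rule, packet):
    """A rule matches when every field agrees on the bits its mask selects."""
    return all((packet[f] & mask) == (value & mask)
               for f, (value, mask) in rule.items())

def lookup(packet):
    """Return the names of all matching rules, in table order."""
    return [name for name, rule in TABLE if matches(rule, packet)]

ssh = make_packet(l3_dst=ip("10.1.1.7"), ip_proto=6, dscp=1, l4_dst=22)
print(lookup(ssh))  # ['Nexthop/24', 'QoS Assign']
```

Note that a single packet can hit more than one rule, which is where the overlap between rules comes from.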

As an abstraction this seems wonderful: a Grand Unified Theory of Switching. Any packet handling can be expressed using the same mechanism and notation, and we can even use set theory where rules overlap.

As a basis for designing switch hardware, this has some drawbacks. Not only does the reliance on masking mandate a CAM, it places quite a burden on that CAM. The key pictured here is enormously wide: the IPv4 version would be 152 bits, the IPv6 version 344 bits, and there are still more fields we might consider adding. The key format is also variable, so the hardware cannot know in advance how big it will be. Common key widths for CAM components are 144 to 147 bits. Matching wider keys requires chaining, where the CAM burns another lookup cycle to handle more key bits. A protocol encouraging very wide keys will seriously tax CAM capacity and performance.
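
The arithmetic can be checked with a few lines of Python. The per-field widths here are an assumption: the article gives only the totals, and the 8/16/8 split for ingress port, VLAN, and DSCP is one guess that reproduces them. Treating each extra slice of key as exactly one more lookup cycle is also a simplification of how chaining works in a real part.

```python
import math

# Assumed widths in bits for the non-address fields. Only the totals
# (152 and 344) come from the article; this particular split is a guess.
COMMON = {"ingress_port": 8, "vlan": 16, "ethertype": 16,
          "ip_proto": 8, "dscp": 8, "l4_src": 16, "l4_dst": 16}

def key_width(addr_bits):
    """Total key width: the common fields plus L3 source and destination."""
    return sum(COMMON.values()) + 2 * addr_bits

def cam_cycles(key_bits, cam_width=144):
    """Lookup cycles needed if each cycle matches cam_width bits of key."""
    return math.ceil(key_bits / cam_width)

for name, addr_bits in (("IPv4", 32), ("IPv6", 128)):
    width = key_width(addr_bits)
    print(name, width, cam_cycles(width))
# IPv4 152 2
# IPv6 344 3
```

Under these assumptions even the IPv4 key spills into a second cycle, and the IPv6 key needs a third.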

Whither Hash Engines?

Hash engines store their patterns in relatively inexpensive memory like SRAM. At any given price point you can store a much larger number of entries in a hash table than in a CAM. Consider some functions in network gear today for which hash engines are well suited, as they require either no masking or a small set of fixed masks.

In provider networks we rely on the ingress router to handle all classification of a packet, determining the path it will follow and the QoS it will carry. It prepends MPLS labels for the core to rely on.
In corporate networks nowadays we use an extensive set of ACLs to protect the network from unauthorized access, as the industry has completely failed to deliver identity-based networks and NAC.
In very large L2 domains (using TRILL or similar solutions) we track a large number of MAC addresses. Masking of MAC addresses makes little sense; they are always straightforward lookups.

A Grand Unified Theory of Networking can handle all of these functions, but it does so by turning them into big keys with masks per entry. They become unsuitable for hash engine implementations.
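
A rough Python sketch shows why the number of masks matters so much to a hash engine. A hash lookup is an exact match, so the only way to honor a mask is to apply it to the key first and probe a table built for that mask. The HashEngine class is hypothetical, a Python dict stands in for a hash table in SRAM, and probing masks in the order they were configured is a simplification.

```python
import ipaddress

def ip(addr):
    """Convert a dotted-quad IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(addr))

class HashEngine:
    """Hypothetical model: one exact-match table per globally defined mask."""

    def __init__(self, masks):
        # The caller lists masks in the order they should be probed.
        self.tables = {mask: {} for mask in masks}

    def add(self, value, mask, action):
        if mask not in self.tables:
            raise ValueError(f"mask {mask:#x} is not one of the fixed masks")
        self.tables[mask][value & mask] = action

    def lookup(self, key):
        """Return (action, probes); every mask tried costs one probe."""
        probes = 0
        for mask, table in self.tables.items():
            probes += 1
            action = table.get(key & mask)
            if action is not None:
                return action, probes
        return None, probes

# Two fixed masks on the L3 destination: host routes, then /24 prefixes.
engine = HashEngine([0xffffffff, 0xffffff00])
engine.add(ip("10.1.1.0"), 0xffffff00, "nexthop-A")
print(engine.lookup(ip("10.1.1.7")))  # ('nexthop-A', 2)
```

With two masks the worst case is two probes. If every entry can bring its own mask, the worst case grows with the number of distinct masks in the table, and the cost advantage over a CAM disappears.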

Some Modest Suggestions

The crux of the disconnect between protocol specification and hardware implementation is that each entry carries a mask. At first glance this looks like it would map very well to CAMs. However, it leads to very wide keys (bad for CAMs), because it's so easy to define one key and mask off the fields which aren't needed. Additionally, the ability to add numerous mask patterns makes hash engine use almost impossible.

Suggestions:

Require that masks be defined separately from individual entries, and encourage them to be limited in number. CAM vendors are moving to an architecture with a pool of global masks, and hash engines require an even more limited number. The protocol should accommodate this.
Giving the switch the ability to reject new mask definitions would let hash engines optimize by expanding entries, but this makes applications hard to write: what do they do if the switch rejects something they require? Letting the switch advertise the number of masks it can handle would be another solution, and makes the applications easier to write.
Make it easy to define multiple key formats. It is straightforward for hardware to extract different keys from the packet for different lookups. Much of what a protocol might want to do with masking can be accomplished by defining its own key format instead. The switch would need to advertise the number of formats it can support (and most existing gear would support only 1).
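
As a sketch of what the first two suggestions could look like from the application's side, here is a toy model in Python. Everything in it is hypothetical (the Switch class, capabilities, define_mask, add_entry, and MaskPoolFull are names invented for this example, not part of any existing protocol): masks live in a global pool, entries refer to a mask by index, and the switch advertises its limits up front.

```python
class MaskPoolFull(Exception):
    """Raised when the switch has no room for another global mask."""

class Switch:
    """Hypothetical switch model with a limited pool of global masks."""

    def __init__(self, max_masks, max_key_formats=1):
        self.max_masks = max_masks
        self.max_key_formats = max_key_formats
        self.masks = []    # global mask pool; entries refer to it by index
        self.entries = {}  # (mask_id, masked value) -> action

    def capabilities(self):
        """Limits advertised to applications before they install anything."""
        return {"max_masks": self.max_masks,
                "max_key_formats": self.max_key_formats}

    def define_mask(self, mask):
        """Register a mask, reusing an identical one; return its index."""
        if mask in self.masks:
            return self.masks.index(mask)
        if len(self.masks) >= self.max_masks:
            raise MaskPoolFull(f"switch supports only {self.max_masks} masks")
        self.masks.append(mask)
        return len(self.masks) - 1

    def add_entry(self, mask_id, value, action):
        """Install an entry that points at a pooled mask instead of carrying one."""
        self.entries[(mask_id, value & self.masks[mask_id])] = action

switch = Switch(max_masks=2)
wanted = {0xffffffff, 0xffffff00}
# The application checks the advertised limit up front instead of
# discovering a rejection halfway through installing its rules.
assert len(wanted) <= switch.capabilities()["max_masks"]
prefix24 = switch.define_mask(0xffffff00)
switch.add_entry(prefix24, 0x0a010100, "nexthop-A")  # 10.1.1.0/24
```

The rejection path still exists in this model, but an application that reads the advertised limits first should never hit it.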

This concludes this series of articles about hash engines in switch hardware. Summary: TL;DR, blah-blah-hash-blah.

footnote: this blog contains articles on a range of topics. If you want more posts like this, I suggest the Ethernet label.


Thursday, June 9, 2011
