% fortune -ae paul murphy

Buying for tomorrow: Cryptography and SPARC/CMT

Earlier this week I got a note from someone I don't know asking for advice about buying Sun gear for a company I know nothing about - not exactly a broad informational basis for advice making, if you know what I mean.

The one thing I was able to point out to him, and that I now want to spread a bit more widely, is the perception that cryptology is becoming significantly more important - and that Oracle's Sun CMT gear has some serious advantages in this area over its competition, including the more traditional SPARC and Power gear.

Sun blogger Joerg Moellenkamp offers a clear summary on how Intel's approach to hardware cryptographic support differs from Sun's:

Intel

The AES-NI stuff is an extension to the x86 instruction set, that implements some of the important steps in hardware needed by AES ... and it just accelerates AES. Those instructions are an extension to the instructions not unlike SSE for example. So they are part of the normal flow of instructions in the pipeline. It's not a cryptographic coprocessor, it's the usual approach in x86 to extend the instruction set as needed and to use the normal cores for this tasks. It doesn't accelerate hashing and it doesn't accelerate public key cryptography. It's really just for the symmetric ciphers in the AES realm.

SPARC T3

The T3 implements cryptography in the form of cryptographic coprocessors. Each core provides one coprocessor. So a T3 has 16 cryptographic units. This accelerator is controlled by control words written into memory, it takes data from memory and writes it to another location - encrypted or decrypted depending on the direction you took. From the main processors perspective controlling the crypto unit is nothing more than some store instructions. From that point the crypto accelerators work in parallel at the clock speed of the processor. The core can work on something else while the crypto coprocessor is doing its work. That's why marketing has called this zero-cost cryptography - not only because the crypto accelerators are part of the CPU, but because they are integrated in a lightweight way working without using resources of the residual core.
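A rough software analogy may help make the offload model Moellenkamp describes concrete. The sketch below is only an illustration in Python, not how the T3 hardware or Solaris is programmed: a single worker thread stands in for the per-core crypto unit, SHA-256 stands in for whatever operation is being offloaded, and the names crypto_unit, offload_digest and other_work are invented for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_hex(data: bytes) -> str:
    """The 'crypto job': hash a buffer and return the hex digest."""
    return hashlib.sha256(data).hexdigest()


def offload_digest(unit: ThreadPoolExecutor, data: bytes):
    """Hand the job to the stand-in crypto unit and return immediately.

    This plays the role of the 'store instructions' in the description:
    the caller only queues the request, it does not do the hashing itself.
    """
    return unit.submit(sha256_hex, data)


payload = b"x" * 1_000_000  # arbitrary test data (assumption)

# One worker thread stands in for the one coprocessor per core.
with ThreadPoolExecutor(max_workers=1) as crypto_unit:
    pending = offload_digest(crypto_unit, payload)

    # The 'core' carries on with unrelated work while the job runs.
    other_work = sum(range(1000))

    # Collect the result only when it is actually needed.
    digest = pending.result()
```

The point of the analogy is the shape of the interaction: queue the request, keep working, pick up the result later. With AES-NI, by contrast, the cipher instructions run in the same pipeline as everything else the core is doing.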

In a footnote he also provides a particularly valuable reference and link to a 2009 Hot Chips presentation by Lawrence Spracklen on the "Rainbow Falls" cryptology co-processors that's well worth reading - and, if you want to see how Spracklen's predictions worked out in the hardware, the Sun performance reporting blog compares cryptology throughput between the T2+ and T3 lines.

Unfortunately, what that particular report illustrates, rather better than the effects of the hardware change, is the lag between a hardware change and the matching change in software and skills - and this is something you have to watch out for too. Hardware isn't magic: it can enable, but without the skills and software needed, your expensive new tools won't be magical, they'll just be expensive.

Oddly, there's another very recent report on the performance blog where this is a bit clearer: one trumpeting the joys of using SPARC to replace Lintel. Here's their summary:

One of Oracle's SPARC T3-2 servers was able to consolidate the database workloads off of thirty older x86 servers in a secure virtualized environment.

Thus their goal seems to have been to show people looking at racks of two- and three-year-old servers that alternatives exist to just ordering replacement PCs. Unfortunately, they had to run the lintels at 10% utilization to get the result they wanted, and that makes the story rather less than compelling for people who can divide 30 by 10.

I think the reason this happened to them, and thus the reason they couldn't max out those 30 lintels on I/O to make the case for real, was that they wanted to show a 1:1 mapping from lintel servers to Solaris containers. That falls afoul of the rule that the stronger you make the virtualization boundaries, the less efficient the machine gets, so they initially maxed out at somewhere around 5% utilization and then doubled that by using limited processor pools to cheat a little bit.
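For anyone who wants the division spelled out, here is a minimal sketch of the arithmetic. The function name equivalent_busy_servers is mine, and it makes the simplifying assumption that work scales linearly with utilization; the 30 servers and the 10% and roughly 5% figures are the ones discussed above.

```python
def equivalent_busy_servers(server_count: int, utilization_pct: float) -> float:
    """How many fully busy servers would carry the same total load.

    Assumes load scales linearly with utilization, which is a
    simplification for illustration only.
    """
    return server_count * utilization_pct / 100


# The published run: 30 x86 servers at 10% utilization.
at_ten_pct = equivalent_busy_servers(30, 10)   # 3.0

# My guess at the starting point: about 5% before the processor-pool tweak.
at_five_pct = equivalent_busy_servers(30, 5)   # 1.5
```

In other words, the T3-2 replaced roughly three busy lintels' worth of work, not thirty.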

In effect they crippled their own demonstration by assuming that the customer would rather do something stupid than change - and, really, that's what Intel is doing with their 7 AES instructions too: assuming that customer intellectual inertia will force sales despite the disadvantages of doing things the 80s way.

But as we move into a world in which storage cryptology becomes an audit checkmark and HTTPS becomes "de rigueur" for just about everything web, more and more large IT customers are going to have internal voices whispering about the advantages of ZFS, on-board cryptology co-processors, and the use of Solaris to avoid PC-style virtualization.

Basically, the cost of refusing to adopt better technologies is about to go up again - so my bottom-line advice to the guy I don't know from the company I don't know was simple: bet on the world getting smarter, and consider the role cryptology seems likely to play in your life before today's new boxes hit retirement age.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.