Invisible computing.
The easiest way to make a large system implementation fail is to underpower the hardware - particularly during user training and pre-implementation pilot operations.
The reason this works is that users want systems to be non-frustrating - and the negative impact of one experience with something that's frustratingly slow or just stupidly click-intensive will outweigh all the preaching, promising, and lying companies do to gain user buy-in.
It's a madness-of-crowds phenomenon - bad news is more credible, and repeated more often, than good news because its transmission reinforces emotional barriers to change already in place.
There's a hardware corollary to that - and it's a much bigger issue than you might think because user perception is often the dominant determinant of the success or failure of an attempt to change the way an organization does business.
Making a project succeed is, of course, a lot harder than making it fail, but one of the best things you can do to improve your project's chances is to make it be, or at least seem, fast. I like, for example, to take the pilot machine out of the data center and put it in the user offices - because that bypasses most of the network delay and isolates my application from DHCP and user boot or other start-up script delays.
In general, however, the closer your system is to being invisible to users, the better off you'll be - and the best way to approximate invisibility is to keep it from intruding on their consciousness: make nothing about it frustrating and, in particular, make it fast enough that they're never aware of waiting for it.
With that in mind, let's look at a Notesbench.org comparison between Domino on a Sun 1.4GHz UltraSPARC T1 and the same software on an IBM dual processor (4 core) Xeon 5160 at 3.0GHz - and then follow that with the headlines from a Sun presentation about the second generation T1 processor due out this summer.
Metric | Sun T2000 (eight core, 1.4GHz; 64GB RAM) | IBM System x3650 (Dual Intel Xeon 5160 3.0GHz; 24GB RAM) |
Avg Response Time (seconds) | 0.692 | 3.056 |
NotesMark (tpm) | 19,518 | 17,777 |
$/NotesMark | 5.33 | 4.30 |
IBM's total system cost, including SuSE Linux 9.0, came to $76,466.71 and included two Domino licenses.
Sun, which used the old Sun volume manager under Solaris 10 instead of ZFS - I'm guessing because the benchmarking rules fail to contemplate anything beyond traditional RAID - required three Domino licenses and came in with a total system cost at list price of $104,051.75.
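The $/NotesMark row in the table is just total system cost divided by NotesMark throughput, and it's easy to check. Here's a minimal sketch using only the figures quoted above; the dictionary layout and the function name are mine, purely for illustration, and are not part of any Notesbench tooling.

```python
# Figures copied from the table and the quoted total system costs.
systems = {
    "Sun T2000": {"cost_usd": 104_051.75, "notesmark_tpm": 19_518},
    "IBM x3650": {"cost_usd": 76_466.71, "notesmark_tpm": 17_777},
}


def dollars_per_notesmark(cost_usd, notesmark_tpm):
    """Total system cost divided by NotesMark throughput (illustrative helper)."""
    return cost_usd / notesmark_tpm


# Round to cents so the result can be compared with the published row.
price_perf = {
    name: round(dollars_per_notesmark(s["cost_usd"], s["notesmark_tpm"]), 2)
    for name, s in systems.items()
}

for name, value in price_perf.items():
    print(f"{name}: ${value:.2f} per NotesMark")
```

Both results round to the published 5.33 and 4.30, so the table and the quoted system costs agree with each other.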
Note, however, that this pricing is not realistic. The 1.4GHz/64GB T2000 has an introductory list price of $84,995, which is $20,200 more than the combined cost of two 1.2GHz, 32GB units. The 1.2GHz T2000, in an earlier test, produced average response times of 0.4 seconds with 16,061 completions per minute - about 90% of IBM's completions at one third the cost. What's going on with that price is market skimming combined with order backlogs and the early production nature of the 4GB memory modules. What's going on with performance is that Sun's inability to use ZFS led to write completion blockages on the RAID hardware - the Sun CPUs were underutilized in both tests, whereas IBM's Xeons were maxed out throughout.
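For anyone who wants the arithmetic behind that pricing comment spelled out, here's a rough sketch. It uses only the list price, the premium, and the two throughput numbers quoted above; the variable names are mine, and the per-unit figure is simply implied by those two prices rather than taken from any Sun price list.

```python
# Quoted introductory list price of the 1.4GHz/64GB T2000.
T2000_14GHZ_64GB_LIST = 84_995
# Quoted premium over the combined cost of two 1.2GHz/32GB units.
PREMIUM_OVER_TWO_SMALLER = 20_200

# Implied list cost of two 1.2GHz/32GB units, and of one such unit.
two_units = T2000_14GHZ_64GB_LIST - PREMIUM_OVER_TWO_SMALLER
one_unit = two_units / 2

# Throughput of the earlier 1.2GHz T2000 test relative to the IBM result.
SUN_12GHZ_TPM = 16_061
IBM_TPM = 17_777
throughput_ratio = SUN_12GHZ_TPM / IBM_TPM

print(f"Two 1.2GHz/32GB units: ${two_units:,}")
print(f"One 1.2GHz/32GB unit:  ${one_unit:,.2f}")
print(f"1.2GHz T2000 vs IBM throughput: {throughput_ratio:.1%}")
```

Note that this only derives hardware list prices; the total system cost for the earlier 1.2GHz test isn't given here, so the sketch doesn't try to reproduce it.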
On the surface, however, this looks like a clear win for the IBM Xeon: its four core, cumulative 12GHz machine cost about 25% less than Sun's, and produced about 90% of the transactions per minute.
When you think about user perception, however, the picture changes: the Sun's 0.7 second response time would make it seem fast to users, while IBM's 3.056 seconds would make them wait long enough to become conscious that they were waiting - and that perceptual difference doesn't count for the Sun so much as it counts against the IBM, because its apparent slowness will frustrate users, thereby dooming your project.
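To put numbers on that trade-off, here's a small sketch that computes the three ratios in play: how much cheaper the IBM box was, how much of the Sun's throughput it delivered, and how much longer its users waited. It assumes nothing beyond the benchmark figures quoted above; the names are illustrative.

```python
# Benchmark figures quoted in the comparison above.
sun = {"cost_usd": 104_051.75, "tpm": 19_518, "resp_s": 0.692}
ibm = {"cost_usd": 76_466.71, "tpm": 17_777, "resp_s": 3.056}

# Fraction by which the IBM system is cheaper than the Sun system.
cost_saving = 1 - ibm["cost_usd"] / sun["cost_usd"]
# IBM throughput as a fraction of Sun throughput.
throughput_share = ibm["tpm"] / sun["tpm"]
# How many times longer the average IBM response takes.
wait_ratio = ibm["resp_s"] / sun["resp_s"]

print(f"IBM costs {cost_saving:.1%} less")
print(f"IBM delivers {throughput_share:.1%} of the Sun's NotesMark")
print(f"IBM users wait {wait_ratio:.1f}x as long per response")
```

The cost and throughput ratios are close to each other; the response time ratio, at more than four to one, is the number users will actually notice.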
That difference in both real and perceived performance will shift further toward the CMT/SMP approach, and thus in favor of users, when the second "Niagara" generation hits the market this summer. Here are the headlines from the opening slide of a presentation by some Sun people at the most recent ISSCC conference.
- 2nd generation CMT (Chip Multi-threading) processor, optimized for Space, Power, and Performance (SWaP)
- 8 SPARC cores, 4MB shared L2 cache; Supports concurrent execution of 64 threads.
- 2x UltraSPARC T1's throughput performance and performance/Watt.
- 10x improvement in Floating Point throughput performance
- Integrates important SOC components on chip:
  - Two 10G Ethernet (XAUI) ports on chip
  - Advanced Cryptographic support at wire speed
- On-chip PCI-Express, Ethernet, and FBDIMM memory Interfaces are SerDes based.
So what's the bottom line? Users care about performance, not capacity. If you're looking at your "platform decision" for the next few years you need to remember the elephant in the room - and it's not power use that counts: it's the difference in usability perception between that 0.692 second average response time for today's UltraSPARC T1 and the IBM Xeon's 3.056 seconds.