by Paul Murphy
Two weeks ago I looked at IBM's forthcoming cell processor architecture and last week speculated about the impact this might have on the x86 desktop. This week I want to go beyond that and look at the impact cell will have on the battle for server dominance over the next five years.
IBM isn't the only company coming out with a new CPU technology. Sun's throughput computing is equally revolutionary - and just as little understood, despite being closer to realization.
Look at both products from a distance and what you see is two companies using broadly similar technologies to implement radically different ideas about how server computing should be done. Both rely on Unix to make things work, and both are building multi-CPU assemblies on single pieces of silicon. But where Sun is pursuing the Unix ideal of large resources equally available to all via networked SMP, IBM is doing the opposite, using a GRID on a chip to provide easier and more secure partitioning, process isolation, resource management, and user activity tracking. Both companies are selling server-to-desktop strategies to data center management, but IBM's technology strategy fits perfectly with customer beliefs about how computing should be managed, while Sun's salespeople have to fudge and shuffle because the right way to run Solaris is pretty much the opposite of what the traditional data center manager knows and understands.
These differences in vision, technology, and marketing have deep historical roots. In 1964 and 65, when MIT was developing its vision of future computing, two broad camps emerged. One group sought ways to evolve the academic traditions of openness, peer review, and community into the promised digital era by treating the computer as a communications device extending the individual user's reach across both time and space. In contrast, the other group saw the computer mainly as a machine for replacing clerks, offering the ability to get increasingly complex work done quickly and accurately.
In the end the academic side won the funding battle and the Multiplexed Information and Computing Service (Multics) development project was born with an open source agenda, as described by designers Corbató and Vyssotsky in 1965 when they wrote:
It is expected that the Multics system will be published when it is operating substantially. ... Such publication is desirable for two reasons: First, the system should withstand public scrutiny and criticism volunteered by interested readers; second, in an age of increasing complexity, it is an obligation to present and future system designers to make the inner operating system as lucid as possible so as to reveal the basic system issues.
Unfortunately the academics who had won the funding battle lost the war as the various teams contracted to deliver on their ideas veered toward a more traditional, production-oriented understanding of what systems do and how they work. As a result Multics eventually became a nice interactive operating system that served several generations of Honeywell (and later Bull) users very well, but it never delivered on the original agenda of extending the power of the individual through communications.
Nine years later Dennis Ritchie found a positive way, in The Evolution of the Unix Time-sharing System, to express the fundamental differences between the Uniplexed Information and Computing Service (Unics) designers and Multics management as a statement of the Unix design motivation:
What we wanted to preserve was not just a good environment in which to do programming, but a system around which a fellowship could form. We knew from experience that the essence of communal computing, as supplied by remote-access, time-shared machines, is not just to type programs into a terminal instead of a keypunch, but to encourage close communication.
As a result Unix became a bottom-up movement, a sort of guerrilla OS reluctantly sanctioned by management because users found it too useful to let go. From Bell Labs it escaped into academia and ultimately became what we have now: three major research directions expressed as Solaris, Linux, and BSD, all carrying forward the original commitment to openness and the use of the technology to weld together communities of users by improving communications across both time and distance.
Thus Sun's technology strategy reflects key Unix ideas and relies on its long lead in Solaris SMP to deliver on them without breaking support for existing SPARC applications. That's why throughput computing provides for parallel lightweight processes in a "system on a chip" SMP environment with tightly integrated on-board memory and I/O controllers, while Solaris 10 already sports improved lock management and support for a brand new I/O infrastructure built around uniform access to very large file systems.
More speculatively, Sun is rumoured to be working on a pair of software products which, if successful, could make a low-megahertz, throughput-oriented CPU like the Niagara a power on the workstation computing front. The first of these seems to be a preprocessor whose operation is akin to having a very good engineer insert the OpenMP pragmas, allowing applications to take maximum advantage of available CPU parallelism. The second is thought to use markup inserted by the first to do the opposite: allow the executables produced to run effectively on more traditional SPARC gear.
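To make the idea concrete, here's a minimal sketch of the kind of annotation such a preprocessor might emit - the loop, function name, and pragma below are purely illustrative, not anything Sun has shown - taking an ordinary C loop and marking it the way a parallelism-savvy engineer would, so the compiler can spread iterations across a chip full of hardware threads:

    /* Illustrative only: a simple, independent loop annotated with the
       OpenMP directive a skilled engineer (or a hypothetical preprocessor)
       might add so that iterations run in parallel across the CPU's
       hardware threads. */
    void scale(double *v, double k, long n)
    {
        long i;
        #pragma omp parallel for    /* the inserted annotation */
        for (i = 0; i < n; i++)
            v[i] *= k;
    }

Strip the pragma, or simply compile without OpenMP support so it is ignored, and exactly the same source runs unchanged on traditional single-threaded SPARC gear - which is roughly the guarantee the second, rumoured product would have to provide.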
At the sales level, furthermore, Sun has a desktop-to-server story no other company can match. The Java Enterprise System comprises everything needed from data center to desktop, with services deliverable on the secure, low-cost Sun Ray or, for the more traditionally minded, on new or recycled x86 desktops running either Linux or Solaris.
Go back to the mid-sixties to review IBM's responses to the MIT bid opportunity and what you see is very different. IBM management backed the more traditional approach, but the people who lost went on to invent VM and heavily influence the "future systems" design later released as the technically phenomenal System/38 (now iSeries).
VM was as much of a technical success as Unix and shared its heritage as a technology relying on user support against management deprecation for its existence, but was completely opposed to it in philosophy. Where Unix united users, VM separated them; where Unix freed users from boundaries, VM imposed resource limits and management controls; where Unix migrated to SMP and greater resource availability, VM evolved to work with finer grained hardware partitioning, systems virtualization, and ever tighter controls over resource use.
Within the IBM operating systems community, however, VM offered major advantages to users simply because the CMS shell did offer interactive service, and it didn't take long for users to subvert part of the design by finding ways to share disks and other communication channels with each other. As a result it quickly became one of IBM's most important offerings, and its later evolution both influenced, and was influenced by, other development work within IBM.
GRID computing is much less resource-efficient than SMP computing, but it is cheaper to build, ridiculously easy to partition, and highly scalable in terms of the number of tasks supported - making it the natural hardware expression for key VM ideas on process separation, resource management, and usage tracking. The cell technology lets IBM move this from theory to practice, implementing VM control ideas across large-scale GRIDs.
Such machines have not, in the past, achieved very good processor utilisation: even the best of today's GRID-style supercomputers typically fail to sustain even 30% of their theoretical LINPACK capacity on real workloads. IBM's machine, however, is expected to produce much higher ratios for two reasons. The most obvious is that the built-in high-speed communication combines with the GRID on a chip approach to allow very dense processor packaging in which propagation delay is minimised. Less obviously, performance losses usually increase non-linearly with the number of CPUs in a grid, and partitioning a heavily loaded grid therefore improves performance by decreasing the number of "engines" allocated to each process.
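A back-of-the-envelope calculation shows the shape of the effect. Standing in for the non-linear loss with Amdahl's law, and assuming a 10% serial fraction and a 64-engine grid purely for illustration (none of these numbers come from IBM), a heavily loaded grid delivers far more aggregate work split into four 16-engine partitions than it does handing all 64 engines to each job in turn:

    /* Toy model, not a benchmark: per-job speedup under Amdahl's law with
       an assumed 10% serial fraction. Splitting 64 engines into four
       16-engine partitions running independent jobs yields more aggregate
       throughput than giving every job all 64 engines. */
    #include <stdio.h>

    static double speedup(double serial_fraction, int cpus)
    {
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus);
    }

    int main(void)
    {
        const double s = 0.10;  /* assumed serial fraction */
        printf("one job on 64 engines:      %4.1fx\n", speedup(s, 64));
        printf("4 jobs on 16 engines each:  %4.1fx aggregate\n",
               4.0 * speedup(s, 16));
        return 0;
    }

On this toy model the partitioned grid does roughly three times as much total work (about 25.6x against 8.8x), which is the sense in which cutting the number of engines per process improves performance on a busy machine.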
That's counter-intuitive, and the exact opposite of what happens in an SMP environment where partitioning wastes resources, but it will have the effect of allowing IBM to scale servers to benchmarks and consequently turn in some awesome numbers while ratifying its community's most deeply held perceptions about the one right way to manage servers.
The Achilles' heel in IBM's strategy may turn out to be power use. This won't help Wintel, which suffers from the same problem, but it may offer Sun a significant advantage because SMP on a chip naturally reduces power consumption by eliminating redundancies and reducing transmission distances. Actual system power usage is usually determined largely by memory, and therefore by workload, but, all other things being equal, Sun's processor advantage coupled with the use of higher-density (lower-power) memory such as that produced by Micron's new "6F" process could mean that a typical 16-way "Rock" based system running a mixed workload in 64GB of memory will draw about the same power as a four-way IBM Cell system with 16GB of Toshiba's custom DDR product.
More speculatively, if work on the preprocessors succeeds Sun could opt for a memory embedded floating point array processor along the lines pioneered by Micron's "Yukon" technology without giving up either backwards software compatibility with previous SPARC products or its advantages in SMP throughput. That, of course, would give it the ability to at least match IBM on floating point performance without incurring the power costs that go with high megahertz rates.
Right now IBM seems to be several years behind Sun on both hardware and software, but GRIDs are simpler than SMP and cellular computing therefore offers IBM both a performance advantage and a way to catch up. As a result the coming battle for data center supremacy seems to be shaping up as a clear competition between the hardware expressions of key ideas on both sides.
In the IBM corner, traditional mainframe ideas about control and management animate GRID-on-a-Chip computing with ultra secure partitioning and per process resource allocation backed by unrivalled floating point performance, easy scalability, and low cost.
In the Unix corner, the movement toward information and community integration will continue as the network disappears into the computer with both SMP and I/O moving into hardware to provide low cost, high bandwidth, software services to entire communities of users.
That's two unopposable forces heading for collision - and grinding up Wintel between them.