Vol 19, Issue 9
February 28, 2005
By Kevin Krewell
One challenge of being an analyst is separating the important from the superficial, the reality from the hyperbole. The recent unveiling of details of the Cell processor at ISSCC produced coverage that was, to say the least, extravagant; it was all over the web and print media. The reason for the press frenzy was clearly related to Cell's use in Sony's next-generation game/media console, replacing the PlayStation 2. It was not a good visual news story, with just a wafer, a nonworking packaged part, and no "PlayStation 3" box to show, so television coverage was modest. Another part of the media excitement may have to do with a perception that PC processors have become boring and repetitive. (The clock-frequency wars are over, and the race to dual-core processors isn't so exciting.) But in all the noise, the Cell processor, which is part of a family of processors IBM calls the Broadband Processor Architecture (BPA), was somewhat misrepresented, and coverage of Cell swamped all other news at ISSCC.
Some called Cell an Intel killer, which is completely ridiculous. Sony's next-generation game console (let's call it SNGGC, just to have some fun with acronyms) will have features that make it a reasonable media center (missing only a sizable hard drive to store lots of large music and video files) in addition to being an exceptional game machine. The only place where the Cell processor can be considered competition for Intel will be where the SNGGC competes for the consumer's money with the Media Center PC (which still has many other competitors, including cable/satellite set-top boxes and TiVo). You won't see Cell in mainstream servers (where the SIMD units are less useful) or in mainstream PCs (except for that rumor about Apple I'll cover later).
IBM touts the Cell chip as a supercomputer on a chip, and we admit that its single-precision floating-point math capability is quite impressive. Even the less optimized double-precision math function looks quite good at around 25GFLOPS peak. The measure we don't have accurate data on yet is GFLOPS/watt, because IBM has not released official power figures for the Cell processor. The IBM representatives at ISSCC would say only that the processor is designed to be air cooled. (We believe, with high confidence, that it will not be a passive heatsink solution.) We estimate that a 4GHz Cell processor will dissipate about 80W, not an unreasonable number considering the clock frequency and design complexity. We also can't say what sustained performance Cell will achieve running Linpack (the measurement used by the supercomputer rating site Top500.org).
A reasonable comparison of Cell's supercomputer potential is another supercomputer processor, the BlueGene/L chip, used in IBM's top supercomputer. BlueGene/L has two PowerPC 440 cores, each with dual double-precision (DP) floating-point units. BlueGene/L runs at only 700MHz and consumes only about 13W while producing 5.6GFLOPS (DP). If we judge each processor's DP floating-point performance on GFLOPS/W, then the Cell processor produces less than 0.3GFLOPS/W (if our power estimate proves true and assuming some derating for sustained peak DP floating-point operation), and the BlueGene/L processor is capable of 0.43GFLOPS/W (DP) peak. The BlueGene/L chip appears to be the more power-efficient solution, especially if we count the complexity of programming the Cell processor and the little we know about Cell sustained performance, but a system built with Cell processors will also require fewer chips to reach similar performance levels. If the supercomputing application could use single-precision floating-point math, then the Cell processor would have far superior potential.
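The arithmetic behind that comparison can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the peak figures quoted above; the 80W Cell figure is our estimate, not an IBM number, and the function and variable names are ours. No derating is applied, which is why Cell's peak ratio comes out slightly above the "less than 0.3GFLOPS/W" sustained figure.

```python
def gflops_per_watt(peak_gflops: float, watts: float) -> float:
    """Peak double-precision GFLOPS divided by power in watts."""
    return peak_gflops / watts


# Figures quoted in the text above.
CELL_PEAK_DP_GFLOPS = 25.0  # "around 25GFLOPS peak" (double precision)
CELL_WATTS_ESTIMATE = 80.0  # our estimate for a 4GHz part, not official
BGL_PEAK_DP_GFLOPS = 5.6    # BlueGene/L chip at 700MHz
BGL_WATTS = 13.0            # "about 13W"

cell_eff = gflops_per_watt(CELL_PEAK_DP_GFLOPS, CELL_WATTS_ESTIMATE)
bgl_eff = gflops_per_watt(BGL_PEAK_DP_GFLOPS, BGL_WATTS)

# How many BlueGene/L chips match one Cell chip on peak DP throughput.
bgl_chips_per_cell = CELL_PEAK_DP_GFLOPS / BGL_PEAK_DP_GFLOPS

print(f"Cell (peak, no derating): {cell_eff:.2f} GFLOPS/W")
print(f"BlueGene/L (peak):        {bgl_eff:.2f} GFLOPS/W")
print(f"BlueGene/L chips per Cell on peak DP: {bgl_chips_per_cell:.1f}")
```

On these numbers, BlueGene/L leads on efficiency, while a Cell-based system would need roughly a quarter as many chips for the same peak DP rate.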
The other market for Cell that IBM has mentioned is handheld devices. IBM is quick to agree that the current instantiation of BPA has not been tuned for handheld operation. Dropping the core voltage to 1.0-0.9V and reducing the clock to 3.0GHz could lower power enough for a notebook form-factor design. Certainly, it could fit into a set-top box or high-end entertainment system, but it is probably overkill for those applications. IBM could design a smaller version of Cell, reducing the number of SIMD processors, called Synergistic Processing Elements (SPEs). (Don't get me started on IBM's choice of names and acronyms!) The reduced version of the chip could also be made from manufacturing fallouts, where one or more SPEs have a defect and can be disabled. A true handheld version of Cell, however, will require a more significant reduction in power. A redesign would likely remove the high-bandwidth Rambus interfaces, reduce the number of SPEs, and use a different Power core, optimized for lower power and low clock frequencies.
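To see why a lower voltage and clock might bring Cell into notebook territory, here is a rough sketch using the standard first-order rule that dynamic power scales with V² × f. Several things here are assumptions: the 80W starting point is our own estimate for a 4GHz part, the nominal core voltage has not been disclosed (the values swept below are purely hypothetical), and leakage power is ignored entirely. Treat the output as an illustration of the scaling, not as a power prediction.

```python
def scaled_dynamic_power(base_watts: float, base_volts: float, base_ghz: float,
                         new_volts: float, new_ghz: float) -> float:
    """First-order dynamic power scaling: P is proportional to V^2 * f.

    Leakage is ignored, so real savings may be smaller or larger.
    """
    return base_watts * (new_volts / base_volts) ** 2 * (new_ghz / base_ghz)


BASE_WATTS = 80.0  # our estimate for a 4GHz Cell, not an IBM figure
BASE_GHZ = 4.0
NEW_GHZ = 3.0

# Hypothetical nominal voltages: the real value is not public.
for base_volts in (1.1, 1.2, 1.3):
    for new_volts in (1.0, 0.9):
        watts = scaled_dynamic_power(BASE_WATTS, base_volts, BASE_GHZ,
                                     new_volts, NEW_GHZ)
        print(f"nominal {base_volts:.1f}V -> {new_volts:.1f}V at "
              f"{NEW_GHZ:.1f}GHz: about {watts:.0f}W")
```

The frequency cut alone removes a quarter of the dynamic power; the voltage term does the rest, and it depends heavily on the unknown starting voltage.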
Another rumor, to which I must admit I contributed, was that Cell could be used by Apple Computer. The advantages to Apple would be higher clock frequencies (marketed against Intel's Pentium 4); the SPEs, which can be used for powerful media processing; and relatively easy ports of SNGGC games to the Mac platform. The disadvantages include competing with Sony for chip supply and the fact that the far simpler, but faster, in-order dual-issue Power core in Cell would have a very different performance profile from that of the present PowerPC 970FX used in the Power Mac G5. Apple software has been tuned for the wide-issue PowerPC and would require retuning for the Cell processor. In addition, Apple would have to change programming tools to utilize the capabilities of the SPEs. It's one thing for game programmers to jump through hoops to fine-tune code for a major gaming console; it's another for applications programmers to go through a major rewrite of programs for the modest-volume Apple platform.
I've been writing about what the Cell processor is not. What the Cell processor is good at will be what it was primarily designed for: media and graphics. There will be other applications where Cell can make an impact, maybe in scientific fields such as geophysics research.
There are still many reasons to be excited about Cell. In fact, there is so much to the Cell processor that a bevy of books will probably be written about it. There will certainly be books about Cell processor programming, as that will be a significant challenge. IBM has some programming tools today, and we expect more from Sony soon. The use of multiple, uniform SIMD units offers opportunities and challenges for programmers. We plan to keep a careful eye on the development of programming tools and software for Cell.
We're so intrigued by the chip because it represents an alternative approach to building a powerful processor, yet it still has its roots in known high-performance system-design concepts. As a vector machine, it is not alone: MIT has a vector-threaded architecture called Scale, and the NEC Earth Simulator supercomputer is a vector machine. And then there are the historic examples of the CDC 6600, Cray-1, the U.S. Air Force's JSTAR's PSP, and MIT's Project MAC. That's pretty good company.
We will explore different aspects of the chip and BPA in another story, diving more deeply into the chip design and tools. We will also have IBM's Jim Kahle giving a special presentation at Spring Processor Forum. We expect Kahle to talk more about how the Cell architecture can be used for the workloads of interest. What would you like to hear more about? Let me know, and you can help to define the content of future stories and of Kahle's presentation.