There is some more detail available on the Common Platform alliance's 32nm process in the advance material put together by the organisers of the VLSI Technology Symposium. It illustrates the kind of bets that the fabs and foundries are making on the next generation of silicon, although the details are still sketchy.
With the 32nm generation, IBM and its partners expect to be able to deliver a process that exceeds the industry consensus on what is needed at that point. The consensus is summed up in the pile of documents that go by the name of the International Technology Roadmap for Semiconductors (ITRS). Taken together, the documents are effectively the guidelines for what the industry needs to stay on Moore's Law.
Scattered throughout the PDFs are tables of specifications that semiconductors should get close to if they are to be useful at a given process geometry. The numbers are all colour coded: yellow means tricky but possible; red means nobody has an answer yet, or at least one they've shared in public.
If we ignore what high-end microprocessors need - let's face it, Intel has that market more or less cornered - the main battleground is in what the ITRS terms the low operating power (LOP) and low standby power (LSTP) processes. There is a question mark over how long LSTP can stay in business. Today, transistors are leaky things, and current just dribbles out of them even when they are supposed to be off. The ITRS has upped what it considers acceptable leakage to 30pA/µm. No-one is quoting anywhere near that number.
Both IBM and TSMC's processes seem to be aimed squarely at the LOP zone. Based on what TSMC’s deputy director of low-power technology Shien-Yang Wu said at IEDM, the Taiwanese foundry looks more or less on track in ITRS terms with a current drive of 700µA/µm against a leakage current of 1nA/µm for an NMOS transistor. Current drive (Ion) is one of the figures of merit that process engineers use to work out whether a transistor is going to be good enough. The ITRS quotes 760µA/µm for the 2009 timeframe.
IBM's alliance looks in better shape with a figure of 1000µA/µm against 1nA/µm. With metal gates, the process should yield better results and that seems to be the case, although all we have right now is a few figures. Typically, if you increase the threshold voltage, you can cut leakage by sacrificing some current drive. The metal-gate option has given IBM, Chartered Semiconductor Manufacturing and friends some wiggle room to cut leakage and maybe get the process into the LSTP box as well. That does not seem to be something that TSMC will be able to do with a 32nm process that uses more conventional polysilicon gates.
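To get a feel for how much wiggle room that threshold-voltage trade needs, here is a back-of-envelope sketch in Python. It uses the textbook rule that subthreshold leakage falls by one decade for every S millivolts of extra threshold voltage, where S is the subthreshold swing. The 100mV/decade swing is my assumption, not a figure quoted by IBM or TSMC, and the function name is made up for illustration. The sketch also says nothing about how much current drive you give up along the way.

```python
import math


def vth_shift_for_leakage_mv(i_off_now, i_off_target, swing_mv_per_decade=100.0):
    """Estimate the threshold-voltage increase (in mV) needed to cut
    subthreshold leakage from i_off_now to i_off_target.

    Both currents must be in the same units (for example A/um).
    swing_mv_per_decade is the subthreshold swing; 100 mV/decade is an
    assumed, illustrative value rather than a published process figure.
    """
    if i_off_now <= 0 or i_off_target <= 0:
        raise ValueError("leakage currents must be positive")
    decades = math.log10(i_off_now / i_off_target)
    return decades * swing_mv_per_decade


# Leakage quoted for the 32nm processes (1 nA/um) versus the ITRS LSTP
# figure mentioned earlier (30 pA/um).
shift_mv = vth_shift_for_leakage_mv(1e-9, 30e-12)
print(f"Threshold shift needed: about {shift_mv:.0f} mV")
```

Under that assumed swing, getting from 1nA/µm to 30pA/µm is a cut of about one and a half decades, or roughly 150mV of extra threshold voltage. The more current drive a process has in hand to begin with, the easier that is to afford.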
However, what TSMC seems to have done is introduce a second, thicker gate oxide for use on memory cells – chips today are full of these, so you don't want them to leak like crazy. It means an extra process step but the foundry seems to be making the tradeoff that a couple of mask steps here or there will still be cheaper than going to a metal gate.
It is possible that IBM has come up with a metal-gate process that is comparable in cost - but it seems unlikely. A single-metal process is way cheaper than having to use two but the treatment you have to do to the gate stacks probably means that the cost adder is higher than that of just using polysilicon. But, you should get higher performance for a given amount of leakage current than if you stick with a polysilicon gate-based process.
This is where the design bit comes in. What is happening now is that, if you do nothing else, doubling the number of transistors on a chip as you move down the geometry curve more than doubles the leakage power. A process tweak here or there is nice, but still doesn't do all that much for battery life.
However, if you ruthlessly shut entire blocks down when they are not doing anything, the leakage current does not matter anywhere near as much. What happens is that you run each block flat out for a few microseconds then pull the plug - saving your work each time, of course. Leakage does matter in the memory cells. But, even there, you can put them into so-called drowsy mode - they just about retain their state, perhaps with a little error-correction assistance - which can also cut power consumption. In this environment, the leakage current through a given transistor becomes less important, thanks to better design.
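A quick sketch of why shutting blocks down takes the sting out of leakage. The average leakage of a block is just a time-weighted mix of what it leaks when powered and what it leaks when gated off. Every number below is hypothetical, picked only to show the shape of the arithmetic, and the function name is my own.

```python
def average_leakage_uw(powered_leak_uw, gated_leak_uw, duty_cycle):
    """Time-averaged leakage power (in uW) of a block that is powered for
    a fraction duty_cycle of the time and power-gated for the rest."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * powered_leak_uw + (1.0 - duty_cycle) * gated_leak_uw


# Hypothetical block: leaks 100 uW when powered, 1 uW when gated off,
# and only needs to be awake 1% of the time.
always_on = average_leakage_uw(100.0, 1.0, duty_cycle=1.0)
gated = average_leakage_uw(100.0, 1.0, duty_cycle=0.01)

# The same block on a hypothetical process that leaks twice as much.
leaky_always_on = average_leakage_uw(200.0, 2.0, duty_cycle=1.0)
leaky_gated = average_leakage_uw(200.0, 2.0, duty_cycle=0.01)

print(f"Always on: {always_on:.2f} uW vs {leaky_always_on:.2f} uW")
print(f"Gated:     {gated:.2f} uW vs {leaky_gated:.2f} uW")
```

With these made-up numbers, the leakier process costs an extra 100µW if the block is left on, but only about 2µW once it is gated 99 per cent of the time. The ratio between the two processes is the same, but the absolute penalty shrinks to the point where it stops deciding battery life.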
Looking at 32nm as a foundry process, the Common Platform process may be forced into the upper end of the LOP niche, while TSMC, which already holds the lion's share of the foundry business, holds on with what is likely to be a cheaper process. Now, there are risks on TSMC's side. It has had to use copious amounts of strain engineering to get its transistors into the LOP zone. The benefits of strain, particularly with the already slower PMOS transistors, are tailing off as transistors get smaller. To get performance, users may have to sacrifice density, which makes the 32nm process less attractive. However, as I wrote yesterday, raw clock speed is not as important as raw density in a lot of low-power designs, which tends to favour TSMC's approach.
More of the tradeoffs will become apparent as June rolls around.