Design: October 2009 Archives

Mike Muller apparently delivered a bit of Halloween scare material at his keynote at ARM Techcon in Santa Clara yesterday. Rick Merritt of EETimes summed up the warning:

In a decade, 11nm process technology could deliver devices with 16 times more transistors running 2.4 times as fast as today's parts, said Mike Muller in a keynote address. But those devices will only use a third as much energy as today's parts, leaving engineers with a power budget so pinched they may be able to activate only nine percent of those transistors, Muller said.
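The arithmetic behind that warning can be roughed out in a few lines. This is only a back-of-envelope sketch, not Muller's own calculation. It assumes that "a third as much energy" means energy per transistor switch and that the chip's total power budget stays where it is today. The function name and its parameters are made up for illustration.

```python
def active_fraction(transistor_ratio, frequency_ratio, energy_ratio,
                    power_budget_ratio=1.0):
    """Fraction of transistors that can switch at once within the budget.

    All arguments are ratios relative to today's parts. Assumption: power
    with everything switching scales as count x clock x energy per switch.
    """
    full_power = transistor_ratio * frequency_ratio * energy_ratio
    return min(1.0, power_budget_ratio / full_power)


# Figures from the keynote report: 16x transistors, 2.4x clock, 1/3 energy.
fraction = active_fraction(16, 2.4, 1 / 3)
print(f"Usable at any one time: {fraction:.1%}")
```

With these rounded inputs the answer lands a little under eight per cent, in the same region as the nine per cent quoted. The gap presumably comes from less rounded figures in the original slides.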

Muller's claim sounds scary but, in reality, dark silicon is already with us today, not ten years from now.

If you put a temperature sensor over a dense chip designed for a portable device today, only a fraction of it would seem to be in use at any one time. At the conference, ARM launched its Cortex-A5, and one thing that vice president of marketing Eric Schorn stressed was the way you could deploy multiple cores and run them slowly, or not at all, to save power.

It's a technique that Apple seems likely to use in its own-design ARM processor. As Francis Sideco of iSuppli pointed out last year using the example of a seemingly over-powered V8 engine: "The car companies have done the same thing. If you are cruising down the highway, they shut down four out of the eight cylinders to save fuel."

Schorn's example was a browser running four threads that might take up 80 per cent of the cycles on a single core. Pass those out to four cores and you could wind down the clock frequency. Not only that, you can reduce the voltage that feeds each core, which is where you get the real energy saving. On a 40nm process, this technique should yield a 50 per cent energy reduction. There will be losses through leakage so, as voltage scaling gets harder - at this level, you are talking about changes of 0.2V or so - you may take the alternative route of running threads as fast as they will go and then simply shutting down until there is something else to do.
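The trade-off between spreading work over slow, low-voltage cores and racing to idle on one fast core can be sketched with the usual first-order model: switching energy goes as C x V squared per cycle, and leakage as current x voltage x time for every core that is powered. Everything numeric below is an illustrative assumption (capacitance, leakage current, the 1.1V and 0.9V supply points, the 1GHz clock), chosen only to show a 0.2V step. It is not ARM's data and does not try to reproduce the 50 per cent figure.

```python
def task_energy(cycles, volts, freq_hz, cores=1,
                cap_farads=1e-9, leak_amps=0.01):
    """Energy in joules to retire `cycles` cycles split evenly over `cores`.

    Hypothetical first-order model: cores are power-gated once the work is
    done, and leakage current is treated as independent of voltage.
    """
    run_time = cycles / (cores * freq_hz)            # seconds until all cores finish
    dynamic = cap_farads * volts ** 2 * cycles       # switching energy, clock-independent
    leakage = cores * leak_amps * volts * run_time   # every powered core leaks
    return dynamic + leakage


# 8e8 cycles: 80 per cent of one second on an assumed 1GHz core.
single = task_energy(8e8, volts=1.1, freq_hz=1e9)
quad = task_energy(8e8, volts=0.9, freq_hz=250e6, cores=4)
print(f"one fast core:   {single:.4f} J")
print(f"four slow cores: {quad:.4f} J")
print(f"saving:          {1 - quad / single:.0%}")
```

Raise `leak_amps` far enough and the four-core case loses, because four cores leak for the whole run. That is the point at which running flat out and then shutting down starts to look like the better option.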

The upshot of all this is that, even now, designers assume that large chunks of the die will be doing sod all at any given point in time. The idea of designing chips that are only 10 per cent active at any one time does not seem that unusual or even undesirable. Dedicated hardware is, in general, way more efficient than software running on a general-purpose processor, although it is wasteful of die area. But if you know that transistors are cheap and getting cheaper, and you can never run everything at once without melting the chip, why not throw hardware at the problem?

In practice, you will see a compromise between dedicated hardware and general-purpose processing. But the principle will be that, most of the time, that silicon will indeed be dark. On top of that, there will be circuit-level tricks that will eke more useful work out of the watts available. ARM is working on its own implementations of the Razor system, and it's possible that, with lots of transistors available, sub-threshold switching, used today in some medical devices, could be practical for a larger variety of systems.