Chris Edwards: October 2009 Archives

Intel and Numonyx have taken the first step to building a stackable memory based on phase-change materials.

The team from the two companies has, so far, implemented only a single-layer 64Mb test chip but claims the structure it has developed could lead to devices with many stacked layers, potentially satisfying demand for non-volatile memories with much higher densities than are available today. Greg Atwood, senior fellow at Numonyx, said the phase-change technology could potentially scale down to 5nm, whereas flash memory tends to run into problems at around 20nm.

A move into the third dimension could provide much higher capacities simply by layering slices of memory array on top of each other. “We can stack as high as we choose,” said Atwood. “But each layer will need more processing and the more processing you do the more risk you have of defects. So there will be a practical limit to the number of layers you can stack.”

Al Fazio, Intel fellow and technology and manufacturing group director of memory technology development, said the main requirement for a stackable memory is a crosspoint structure with a selection switch to ensure only the bit where two read lines cross is read or written. This can be achieved using silicon transistors or diodes but “it’s hard to stack a silicon diode”.

The team decided to use a material from the same family used in the phase-change memory element itself to act as a switch. When a voltage and a certain amount of current are applied, the material temporarily changes state and allows current through into the memory element itself. The resistance of the memory element determines whether the bit read is a one or zero.
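As a rough illustration of why a crosspoint array needs that switch, here is a toy model in which each cell is a resistance in series with a threshold selector. It is only a sketch: the half-voltage biasing scheme, the voltages, the resistances and the choice of low resistance as a one are all assumptions made for the example, not figures from Intel or Numonyx.

```python
# Toy crosspoint array: each cell is a resistance in series with a
# threshold selector. Every number below is an illustrative assumption.

V_THRESHOLD = 1.0   # assumed selector threshold, volts
V_READ = 1.5        # assumed voltage across the selected cell, volts
R_SET = 10e3        # assumed low-resistance state, ohms (read as 1 here)
R_RESET = 1e6       # assumed high-resistance state, ohms (read as 0 here)
R_DECIDE = 100e3    # assumed decision point between the two states, ohms


def cell_current(v_across, resistance):
    """The selector blocks current below its threshold; above it,
    the memory element's resistance sets the current."""
    if abs(v_across) < V_THRESHOLD:
        return 0.0
    return v_across / resistance


def read_bit(array, row, col):
    """Bias the selected row at V_READ and the selected column at 0 V.
    Unselected rows sit at V_READ / 2, so their cells on the selected
    column see only half the read voltage and stay switched off."""
    row_v = [V_READ if r == row else V_READ / 2 for r in range(len(array))]
    # Current arriving on the selected column from every row.
    total = sum(cell_current(row_v[r], array[r][col])
                for r in range(len(array)))
    return 1 if total > V_READ / R_DECIDE else 0


# One high-resistance cell sharing a column with two low-resistance cells.
column = [[R_RESET], [R_SET], [R_SET]]
bit = read_bit(column, 0, 0)
```

In this model the two low-resistance neighbours contribute nothing, because they see only 0.75V and their selectors stay off, so the read depends on the selected cell alone. Take the threshold away and their current would swamp it.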

The team has based its switch on a concept devised by the licensor of the original technology, Energy Conversion Devices, called the Ovonic threshold switch (OTS); the term is derived from the name of inventor Stanford Ovshinsky. The threshold switch uses a different mechanism to pass current to the memory devices, and is based on a different mix of the elements used in phase-change memories, allowing it to act as a temporary switch rather than a memory. However, a spokeswoman for Intel said the formulation of the material in the OTS for the stackable memory is Intel and Numonyx intellectual property.

For the technology to move to commercialisation, Fazio said more work needs to be done to explore how the layers will stack. “We see this work as a milestone towards realising the low-cost potential of this memory,” he said.

The structure will be described in a paper to be presented at the IEDM conference in Baltimore in December.

A few weeks ago, Sramana Mitra asked whether Intel might just buy ARM and have done with it. It's a supremely scary thought but one that's not beyond possibility. The purchase might raise antitrust issues and that's what a lot of ARM's customers would hope for. But the IP space is still reasonably fragmented.

Even though ARM's competitors in processor IP trail way behind, Intel could easily argue that their presence means that a purchase of ARM should not be blocked. As long as MIPS Technologies stays afloat, Intel has a good argument. Plus you have Tensilica and Virage Logic, courtesy of the ARC buy, in a position to have a good poke at ARM's core market in handsets, particularly as ARM is pulling away from traditional DSP now that it's closing its Leuven centre - where the OptimoDE DSP technology was designed - and moving more aggressively into general-purpose and graphics processing.

ARM's current market cap is slightly south of £2bn ($3bn). The slump in the pound means $1bn has been wiped off the company's price to begin with. It's not cheap. Intel would have to pay a premium to fend off another chipmaker that wanted to keep a key supplier out of the x86 maker's hands. But could the ARM board afford to recommend a lowball white-knight deal, almost certainly a stock swap, when Intel could afford to offer a significant premium in crispy folding stuff?

Yes, shipments of ARM processors are on the up. Each time an ARM core is stamped onto silicon, the company earns somewhere between 5 and 6 cents. You'd have to promise a future where it will rain ARM-core chips every day to win the certain cash versus potential future earnings argument. ARM's numbers held up reasonably well in a crushing recession but this is all relative: the difference between a good and a bad quarter is something like $50m. For Intel, it's more like a few billion.
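To put the royalty figure in perspective, here is a back-of-envelope sketch of how many cores would have to ship for per-core royalties alone to add up to a given sum. It uses only the 5 to 6 cents quoted above and the roughly $3bn market cap mentioned earlier; it ignores licence fees and everything else in ARM's accounts, so treat it as arithmetic rather than valuation.

```python
# Back-of-envelope: cores needed for royalties alone to reach a dollar sum.
# The per-core range and the $3bn figure come from the text above;
# ignoring licence fees and other income is a simplifying assumption.

ROYALTY_LOW = 0.05    # dollars per core
ROYALTY_HIGH = 0.06   # dollars per core
MARKET_CAP = 3e9      # dollars, approximate


def cores_needed(dollars, royalty_per_core):
    """Number of cores whose royalties add up to the given sum."""
    return dollars / royalty_per_core


most = cores_needed(MARKET_CAP, ROYALTY_LOW)    # at 5 cents a core
fewest = cores_needed(MARKET_CAP, ROYALTY_HIGH)  # at 6 cents a core
print(f"{fewest / 1e9:.0f}bn to {most / 1e9:.0f}bn cores")
```

On those assumptions, it takes something like 50 to 60 billion cores for royalties to match today's market cap, before any premium.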

An Intel offer would have to be a cash deal. Intel would almost certainly destroy ARM as a coherent business unit: it just does not have the corporate mentality to run an IP company the way ARM has done. But that destruction could just as easily send more billions Intel's way while it capitalises on a much stronger position for the x86 and the other IP suppliers scrabble for the bits Intel does not care for. Even if Intel chose to be careful about integrating ARM, it could learn a lot as a company about working with other companies in the fabless chipmaking environment - something it dealt with uneasily in the past decade.

The problem for ARM is that, right now, Intel does not have a great deal to win by mounting a bid other than a cheaper price. But if ARM does make inroads into the netbook business - if OEMs decide Chrome OS on ARM is worth supporting against simply taking the market-development money from Intel for an x86-only portfolio - then it becomes more of a target for a hostile takeover attempt by Intel. And if ARM is not successful at displacing Intel from its home territory, then its threat is gone and the reasons for doing a deal ebb away.

Dealing with an 800lb gorilla at any time is hard. Dealing with one that has more cash on hand than you can earn in 20 years of trading is another matter.

Mike Muller apparently delivered a bit of Halloween scare material at his keynote at ARM Techcon in Santa Clara yesterday. Rick Merritt of EETimes summed up the warning:

In a decade, 11nm process technology could deliver devices with 16 times more transistors running 2.4 times as fast as today's parts, said Mike Muller in a keynote address. But those devices will only use a third as much energy as today's parts, leaving engineers with a power budget so pinched they may be able to activate only nine percent of those transistors, Muller said.

Muller's claim sounds scary but, in reality, the idea of dark silicon is happening today, not ten years from now.
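The figure in Merritt's summary can be roughed out with a simple power model. This is only a sketch: it assumes that "a third as much energy" means energy per transistor switching event, that power scales as transistor count times clock speed times energy per switch, and that the total power budget stays flat. None of that is confirmed as the model used in the talk.

```python
# Rough dark-silicon arithmetic using the scaling factors quoted above.
# The power model (count x speed x energy per switch) and the flat
# power budget are assumptions, not details taken from the keynote.

def active_fraction(transistor_gain, speed_gain, energy_ratio):
    """Fraction of transistors that can run at full speed without
    exceeding today's power budget."""
    power_if_all_active = transistor_gain * speed_gain * energy_ratio
    return min(1.0, 1.0 / power_if_all_active)


frac = active_fraction(transistor_gain=16, speed_gain=2.4, energy_ratio=1 / 3)
print(f"{frac:.1%} of the transistors can be active")
```

That works out at roughly 8 per cent, a little under the nine per cent quoted but close enough to suggest the same sort of reasoning lies behind it.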

If you put a temperature sensor over a dense chip designed for a portable device today, only a fraction of it would seem to be in use at any one time. At the conference, ARM launched its Cortex-A5. And one thing that vice president of marketing Eric Schorn stressed was the way you could deploy multiple cores and run them slowly, or not at all, to save power.

It's a technique that Apple seems likely to use in its own-design ARM processor. As Francis Sideco of iSuppli pointed out last year using the example of a seemingly over-powered V8 engine: "The car companies have done the same thing. If you are cruising down the highway, they shut down four out of the eight cylinders to save fuel."

Schorn's example was a browser running four threads that might take up 80 per cent of the cycles on a single core. Pass those out to four cores and you could wind down the clock frequency. Not only that, you can reduce the voltage that feeds each core, which is where you get the real energy saving. On a 40nm process, this technique should yield a 50 per cent energy reduction. There will be losses through leakage so, as voltage scaling gets harder - at this level, you are talking about changes of 0.2V or so - you may take the alternative route of running threads as fast as they will go and then simply shutting down until there is something else to do.
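The reason voltage matters more than clock speed can be sketched with the usual first-order model, in which switching energy per cycle is proportional to capacitance times voltage squared. The model ignores leakage and the example voltages are assumptions picked to match the 0.2V change mentioned above; they are not ARM's numbers.

```python
import math

# First-order dynamic energy model: E = cycles * C * V^2.
# Leakage is ignored and the voltages below are illustrative assumptions.


def dynamic_energy(cycles, volts, capacitance=1.0):
    """Switching energy for a fixed amount of work."""
    return cycles * capacitance * volts ** 2


def saving_from_voltage_drop(v_high, v_low):
    """Fractional energy saving when the same cycles run at a lower
    voltage. Clock frequency drops out: it changes how long the work
    takes, not the switching energy needed to do it."""
    return 1.0 - (v_low / v_high) ** 2


def voltage_ratio_for_saving(saving):
    """Supply voltage ratio needed to reach a target fractional saving."""
    return math.sqrt(1.0 - saving)


saving = saving_from_voltage_drop(v_high=1.0, v_low=0.8)
ratio = voltage_ratio_for_saving(0.5)
print(f"saving {saving:.0%}, ratio needed for 50 per cent: {ratio:.2f}")
```

In this simplified model, dropping from an assumed 1.0V to 0.8V saves about 36 per cent, and a 50 per cent saving needs the supply to fall to roughly 0.71 of its original value. Slowing the clock on its own saves nothing per unit of work; it only makes the lower voltage possible.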

The upshot of all this is that, even now, designers assume that large chunks of the die will be doing sod all at any given point in time. The idea of designing chips that are only 10 per cent active at any one time does not seem that unusual or even undesirable. Dedicated hardware is, in general, way more efficient than software running on a general-purpose processor, although it's wasteful of die area. But, if you know that transistors are cheap and getting cheaper and you can never run everything at once without melting the chip, why not throw hardware at the problem?

In practice, you will see a compromise between dedicated hardware and general-purpose processing. But the principle will be that, most of the time, that silicon will indeed be dark. On top of that, there will be circuit-level tricks that will eke more useful work out of the watts available. ARM is working on its own implementations of the Razor system and it's possible that, with lots of transistors available, sub-threshold switching, used today in some medical devices, could be practical for a larger variety of systems.