Forget petascale, here comes the exascale supercomputer


Belgium-based research institute IMEC has teamed up with Intel and a group of local universities on a programme that is intended to pave the way for exascale computers – supercomputers that are close to a thousand times more powerful than those being commissioned today.

“In 1997, we saw the first terascale machines. A few years ago, petascale appeared. We will hit exascale in around 2018,” said Wilfried Verachtert, high-performance computing project manager at IMEC, explaining that these machines will be able to perform 10^18 floating-point calculations per second.

The most powerful supercomputer in operation today is the Cray XT5 Jaguar, with a rated performance of close to 2 petaflops.

At a presentation held to celebrate the opening of a new cleanroom at IMEC and the foundation of the ExaScience lab, Martin Curley, senior principal engineer and director of Intel Labs Europe, said: “We are focused on creating the future of supercomputing. We have a job to do of creating a sustainable future. Exascale computing can really change our world.”

Curley said the two main problems will be power consumption and the difficulty of writing highly parallel software. The performance required is the equivalent of 50 million laptops, which would demand thousands of megawatts of power.
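The quoted figures are easy to sanity-check. A minimal back-of-envelope sketch, assuming a circa-2010 laptop delivers roughly 20 gigaflops and draws around 140W (our illustrative assumptions, not Intel's published numbers):

```python
# Back-of-envelope check of the "50 million laptops" claim.
# Assumed laptop performance and power draw are illustrative only.
EXAFLOPS = 1e18          # one exaflop: 10^18 floating-point ops/sec
laptop_flops = 20e9      # assume ~20 GFLOPS per circa-2010 laptop
laptop_watts = 140       # assume ~140 W per laptop under load

laptops = EXAFLOPS / laptop_flops
power_mw = laptops * laptop_watts / 1e6

print(f"{laptops:,.0f} laptops")   # ~50 million
print(f"{power_mw:,.0f} MW")       # ~7,000 MW
```

Under these assumptions the laptop count comes out at 50 million and the power at 7,000MW, consistent with the figure Verachtert cites later in the article.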

He explained that, by the time exascale computers are likely to appear, silicon-chip geometries will have dropped to 10nm. Although these devices can potentially run at tens of gigahertz, Curley said power consumption concerns would force supercomputer makers to run them much more slowly and potentially even slower than today’s processors. The move will demand billions of processing units in one supercomputer. “How are we going to achieve that? The only way is through billion-operation parallelism.”

Curley added: “Even with just 10 to 12 cores, we see the performance of commercial microprocessors begin to degrade. The biggest single challenge is parallelism.”

The ExaScience lab will, as its test application, work on software to predict the damage caused by the powerful magnetic fields that follow solar flares in the hope of providing more accurate information to satellite operators and the power-grid companies.

With current-generation supercomputers, the mesh used to analyse field strength has elements that are a million kilometres across, far larger than the Earth itself. An exascale machine would make it possible to scale the mesh size down to elements that are 10,000km across.
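Shrinking mesh elements by a factor of 100 in each dimension multiplies the element count far more than 100-fold. A rough sketch of the scaling, assuming a three-dimensional mesh (our reading of the simulation, not a figure from the lab):

```python
# Refining a 3D mesh from 1,000,000 km elements to 10,000 km elements.
current_element_km = 1_000_000
exascale_element_km = 10_000

linear_refinement = current_element_km / exascale_element_km  # per dimension
volume_refinement = linear_refinement ** 3                    # total elements

print(f"{linear_refinement:.0f}x finer per axis")   # 100x
print(f"{volume_refinement:,.0f}x more elements")   # 1,000,000x
```

A hundredfold linear refinement means a millionfold increase in mesh elements, which is why the jump from petascale to exascale is needed rather than an incremental upgrade.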

Verachtert said the project aims to cut the power consumption of such a machine from 7,000MW – based on today’s technology – to 50MW, “and that is still higher than we want”.

One problem with a supercomputer that contains millions of discrete processors, each containing thousands of processing elements, is the expected failure rate. “My optimistic projection is that there will be a failure every minute. It’s possible that there will be a failure every second. We have to do something about that.”
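The projection follows from simple reliability arithmetic: with N independent parts, the machine's mean time between failures is roughly the per-part figure divided by N. A sketch, assuming a per-processor MTBF of about two years (our illustrative assumption):

```python
# Why failures become near-constant at exascale: machine MTBF shrinks
# in proportion to the number of parts. Per-processor MTBF is assumed.
processors = 1_000_000
part_mtbf_hours = 2 * 365 * 24   # assume ~2 years per processor

machine_mtbf_minutes = part_mtbf_hours * 60 / processors
print(f"One failure roughly every {machine_mtbf_minutes:.1f} minutes")
```

Even with highly reliable individual processors, a million of them yields a failure roughly every minute, matching Verachtert's "optimistic" projection.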

The failure rate will have a knock-on effect on programming. Today, it is possible to break up applications so that portions can be re-run after a hardware failure, which may happen once a day. That is impossible as the size of the machine scales up. Verachtert said the methods programmers use will have to take account of processors failing, using checkpoints and other techniques such as transactional memory – which Intel has already researched heavily – to allow code to be re-run automatically without disrupting other parts of the application.
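The checkpoint-and-re-run pattern Verachtert describes can be sketched in a few lines. This is a minimal single-process illustration, not the lab's actual approach: completed work units are saved to a checkpoint, and a failed unit is simply retried without touching the units that already succeeded. The random fault injection stands in for hardware failures.

```python
# Minimal checkpoint/restart sketch: only the failed work unit is
# re-run; completed results in the checkpoint are never recomputed.
import random

def run_with_checkpoints(work_items, failure_rate=0.3, seed=42):
    random.seed(seed)                    # deterministic fault injection
    checkpoint = {}                      # completed results survive faults
    for item in work_items:
        while item not in checkpoint:
            try:
                if random.random() < failure_rate:
                    raise RuntimeError("simulated hardware failure")
                checkpoint[item] = item * item   # stand-in computation
            except RuntimeError:
                continue                 # retry only this failed unit
    return [checkpoint[i] for i in work_items]

print(run_with_checkpoints(range(5)))    # [0, 1, 4, 9, 16]
```

Despite injected failures, the final result is identical to a fault-free run; at exascale the same idea has to work across many nodes, which is where techniques such as transactional memory come in.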