Top scientist urges 'ambitious' U.S. exascale supercomputer plan

There is an international race to build an exascale supercomputer, and one of the people leading it is Peter Beckman, a top computer scientist at the U.S. Department of Energy's Argonne National Laboratory.

The DOE has been working on exascale computing planning for two years, said Beckman, but the funding to actually build such powerful systems has not been approved. And unless the U.S. makes a push for exascale computing, he said, it's not going to happen. The cost of an exascale project is expected to run into the billions of dollars; the department has not announced an exact figure.

The most powerful systems today are measured in petaflops, meaning they're capable of quadrillions of floating point operations per second. The fastest system, according to the latest Top500 supercomputing list, released this month, is China's 2.5-petaflop Tianhe-1A. An exascale system is measured in exaflops; an exaflop is 1 quintillion (or 1 million trillion) floating point operations per second. China, Europe and Japan are all working on exascale computing platforms.
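For a rough sense of the gap, here is a back-of-the-envelope sketch (not from the article; it simply uses the published prefix definitions and Tianhe-1A's reported 2.5-petaflop figure):

    # Illustrative arithmetic only: comparing petaflop and exaflop scales.
    petaflop = 10**15   # 1 quadrillion floating point operations per second
    exaflop = 10**18    # 1 quintillion floating point operations per second

    tianhe_1a = 2.5 * petaflop          # China's Tianhe-1A, fastest on the Top500 list
    print(exaflop // petaflop)          # 1000 -- an exaflop is 1,000 petaflops
    print(round(exaflop / tianhe_1a))   # 400 -- a 1-exaflop machine vs. Tianhe-1A

In other words, a machine that reaches one exaflop would be roughly 400 times faster than the fastest system on today's list.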

Beckman, recently named director of the newly created Exascale Technology and Computing Institute and the Leadership Computing Facility at Argonne, spoke to Computerworld about some of the challenges ahead.

What is the exascale effort at this point? It is the realization that we need to move the hardware, software and applications to a new model. The DOE and others are looking to fund this, but so far there is only initial planning funding.

The software effort that I'm leading with Jack Dongarra [a professor of computer science at the University of Tennessee and a distinguished research staff member at Oak Ridge National Laboratory] and some of the co-design pieces have planning money to get started, but the next step is for the government to come forward with a truly ambitious, fully funded plan to do this.

What's happening, as I'm sure your readers and others know, is that power constraints, budgets, architectures and clock speeds have transformed what happens at every level of computing. Where in the past you had one CPU, maybe two, you are now looking at laptops with four cores, eight cores, and we just see this ramp happening where parallelism is going to explode. We have to adjust the algorithms and applications to use that parallelism.
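To make that point concrete, here is a minimal sketch of what "using that parallelism" looks like in practice (an illustrative example, not from the interview; it assumes nothing beyond Python's standard multiprocessing module, and the squared-sum workload is made up):

    # Minimal sketch: the same computation done serially and split across four cores.
    from multiprocessing import Pool

    def partial_sum(bounds):
        lo, hi = bounds
        return sum(i * i for i in range(lo, hi))

    if __name__ == "__main__":
        n = 10_000_000
        serial = sum(i * i for i in range(n))           # one core does all the work

        chunks = [(i * n // 4, (i + 1) * n // 4) for i in range(4)]
        with Pool(processes=4) as pool:                 # four workers, one chunk each
            parallel = sum(pool.map(partial_sum, chunks))

        assert serial == parallel

The answer is identical either way; restructuring the algorithm into independent chunks is the kind of adjustment Beckman is describing, only across vastly more cores than four.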

At the same time, from a hardware and systems software perspective, there's a tremendous shift with power management and data center issues -- everything that's happening in the standard Web server space is happening in high-performance computing. But in high-performance computing, we are looking forward three to five years.

Think of it as a time machine. What happens in high-performance computing then happens in high-performance technical servers, and finally your laptop.

We're looking at that big change and saying what we need is a real organized effort on the hardware, software and applications to tackle this. It can't be just one of those. In the past, the vendors have designed a new system, it comes out, and users look at it and ask: "How do I port my code to this?" What we're looking at is improving that model to "co-design" -- a notion that comes from the embedded computing space, where the users of the system, the hardware architects and the software people all get together and make trade-offs on what the best optimized supercomputer will look like to answer science questions.

In the end, it's about answering fundamental science questions: designing more fuel-efficient cars, designing better lithium batteries, understanding our climate, developing new drugs, all of that.