The US may have just pulled even with China in the race to build supercomputing’s next big thing
There was much celebrating in America last month when the US Department of Energy unveiled Summit, the world’s fastest supercomputer. Now the race is on to achieve the next significant milestone in processing power: exascale computing.
This involves building a machine within the next few years that’s capable of a billion billion calculations per second, or one exaflop, which would make it roughly five times faster than Summit. Every person on Earth would have to do a calculation every second of every day for just over four years to match what an exascale machine will be able to do in a flash.
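For readers who want to check that comparison, it is simple arithmetic. Here is a minimal sketch in Python, assuming a rounded world population of about 7.6 billion (my assumption, not a figure from the article):

```python
# Back-of-envelope check of the "four years" comparison. The world-population
# figure is a rounded assumption, not a number from the article.
EXAFLOP = 1e18                      # calculations per second for an exascale machine
population = 7.6e9                  # approximate world population in 2018 (assumed)
seconds_per_year = 365 * 24 * 3600

# One calculation per person per second, sustained for a year:
calcs_per_year = population * seconds_per_year
years_needed = EXAFLOP / calcs_per_year

print(f"Humanity would need about {years_needed:.1f} years")   # -> about 4.2
```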
This phenomenal power will enable researchers to run massively complex simulations that spark advances in many fields, from climate science to genomics, renewable energy, and artificial intelligence. “Exascale computers are powerful scientific instruments, much like [particle] colliders or giant telescopes,” says Jack Dongarra, a supercomputing expert at the University of Tennessee.
The machines will also be useful in industry, where they will be used for things like speeding up product design and identifying new materials. The military and intelligence agencies, too, will be keen to get their hands on the computers for national security applications.
The race to hit the exascale milestone is part of a burgeoning competition for technological leadership between China and the US. (Japan and Europe are also working on their own computers; the Japanese hope to have a machine running in 2021 and the Europeans in 2023.)
In 2015, China unveiled a plan to produce an exascale machine by the end of 2020, and multiple reports over the past year or so have suggested it’s on track to achieve its ambitious goal. But in an interview with MIT Technology Review, Depei Qian, a professor at Beihang University in Beijing who helps manage the country’s exascale effort, explained it could fall behind schedule. “I don’t know if we can still make it by the end of 2020,” he said. “There may be a year or half a year’s delay.”
Teams in China have been working on three prototype exascale machines, two of which use homegrown chips derived from work on existing supercomputers the country has developed. The third uses licensed processor technology. Qian says that the pros and cons of each approach are still being evaluated, and that a call for proposals to build a fully functioning exascale computer has been pushed back.
Given the huge challenges involved in creating such a powerful computer, timetables can easily slip, which could create an opening for the US. China’s initial goal forced the American government to accelerate its own road map and commit to delivering its first exascale computer in 2021, two years ahead of its original target. The American machine, called Aurora, is being developed for the Department of Energy’s Argonne National Laboratory in Illinois. Supercomputing company Cray is building the system for Argonne, and Intel is making chips for the machine.
To boost supercomputers’ performance, engineers working on exascale systems around the world are relying on parallelism, packing in many thousands of chips that together hold millions of processing units known as cores. Finding the best way to get all these cores to work in harmony requires time-consuming experimentation.
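To make the idea of parallelism concrete, here is a toy Python sketch that splits one calculation across several worker processes and combines the partial results. It is only an illustration of the principle; real supercomputer codes coordinate millions of cores with tools such as MPI, not Python’s multiprocessing module:

```python
# Toy illustration of data parallelism: split one big job across many workers
# and combine the partial results. Real exascale codes coordinate millions of
# cores; this sketch just makes the idea concrete.
from multiprocessing import Pool

def partial_sum(bounds):
    """Each worker independently handles one chunk of the range."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n, workers = 10_000_000, 8
    step = n // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, chunks))   # combine partial results
    print(total)
```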
Moving data between processors, and into and out of storage, also soaks up a lot of energy, which means the cost of operating a machine over its lifetime can exceed the cost of building it. The DoE has set an upper limit of 40 megawatts of power for an exascale computer, which would roughly translate into an electricity budget of $40 million a year.
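The $40 million figure follows from straightforward arithmetic. A minimal sketch, assuming an electricity price of roughly 11 cents per kilowatt-hour (my assumption, not a number from the DoE):

```python
# Rough check of how a 40-megawatt cap becomes a roughly $40 million annual
# electricity bill. The price per kilowatt-hour is an assumed round figure,
# not a number from the article or the DoE.
power_mw = 40
hours_per_year = 24 * 365
price_per_kwh = 0.11                               # assumed ~11 cents per kWh

kwh_per_year = power_mw * 1_000 * hours_per_year   # megawatts -> kilowatt-hours
annual_cost = kwh_per_year * price_per_kwh

print(f"about ${annual_cost / 1e6:.0f} million per year")   # -> about $39 million
```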
To lower power consumption, engineers are placing three-dimensional stacks of memory chips as close as possible to compute cores to reduce the distance data has to travel, explains Steve Scott, the chief technology officer of Cray. And they’re increasingly using flash memory, which uses less power than alternative systems such as disk storage. Reducing these power needs makes it cheaper to store data at various points during a calculation, and that saved data can help an exascale machine recover quickly if a glitch occurs.
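The “saved data” Scott describes is what supercomputing people call checkpointing: periodically writing a run’s state to fast storage so the job can resume from the last checkpoint after a failure rather than starting over. Here is a minimal sketch of the pattern in Python; the file name, interval, and stand-in workload are all placeholders, not anything from Cray or the DoE:

```python
# Minimal sketch of the checkpoint-and-restart pattern: periodically save a
# run's state so a crashed job resumes from the last checkpoint instead of
# starting over. The file name, interval, and "work" are all placeholders.
import os
import pickle

CHECKPOINT = "state.pkl"       # hypothetical checkpoint file
INTERVAL = 1_000               # save every 1,000 steps

def load_state():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0.0}   # fresh start if no checkpoint exists

def save_state(state):
    with open(CHECKPOINT, "wb") as f:
        pickle.dump(state, f)

state = load_state()
for step in range(state["step"], 100_000):
    state["total"] += step * 1e-6      # stand-in for one unit of real work
    state["step"] = step + 1
    if state["step"] % INTERVAL == 0:
        save_state(state)              # a glitch now costs at most 1,000 steps of work
```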
Such advances have helped the team behind Aurora. “We’re confident of [our] ability to deliver it in 2021,” says Scott.
More US machines will follow. In April the DoE announced a request for proposals worth up to $1.8 billion for two more exascale computers to come online between 2021 and 2023. These are expected to cost $400 million to $600 million each, with the remaining money being used to upgrade Aurora or even create a follow-on machine.
Both China and America are also funding work on software for exascale machines. China reportedly has teams working on some 15 application areas, while in the US, teams are working on 25, including applications in fields such as astrophysics and materials science. “Our goal is to deliver as many breakthroughs as possible,” says Katherine Yelick, the associate director for computing sciences at Lawrence Berkeley National Laboratory, who is part of the leadership team coordinating the US initiative.
While there’s plenty of national pride wrapped up in the race to get to exascale first, the work Yelick and other researchers are doing is a reminder that raw exascale computing power isn’t the true test of success here; what really matters is how well it’s harnessed to solve some of the world’s toughest problems.