The ability to scale current computing designs is reaching a breaking point, and chipmakers like Intel, Qualcomm, and AMD are putting their minds together on an alternative architecture to advance computing.
Chipmakers are uniting around a sparse computing approach, which brings the computation to the data rather than moving data to the computation, as current architectures do.
The concept is still a long way off, but a new design is needed because the current model of computing used to scale the world’s fastest supercomputers is unsustainable in the long run, said William Harrod, program manager at the Intelligence Advanced Research Projects Activity (IARPA), during a keynote at the SC22 conference last week.
The current model is inefficient because it cannot keep up with the proliferation of data. Users can wait hours for results as data is shipped to compute hubs stocked with accelerators and other resources. The new approach will reduce the distance data travels, process information more efficiently and intelligently, and generate results faster, Harrod said during the keynote.
“There needs to be an open discussion because we are moving from a world of dense computation… to a world of sparse computation. It’s a big transition and companies won’t move forward with changing designs until we can test and validate these ideas,” Harrod said.
One of the goals of the sparse computing* approach is to generate results in near real time and to see those results update as the data changes, said Harrod, who previously ran research programs at the Department of Energy that ultimately led to the development of exascale systems.
The current computing architecture pushes all data and processing problems, large and small, over networks into a web of processors, accelerators, and memory subsystems. There are more efficient ways to solve problems, Harrod said.
The intent of a sparse computing system is to solve the problem of data movement. Current network designs and interfaces can bog down computing by moving data over long distances. Sparse computing reduces the distance data travels by intelligently processing it on the nearest chips and by placing equal emphasis on software and hardware.
“I don’t see that the future depends on just getting a better accelerator, because getting a better accelerator isn’t going to solve the data transfer problem. In fact, most likely, the accelerator will be some sort of standard interface to the rest of the system that isn’t designed for this problem at all,” Harrod said.
Harrod learned a lot from designing exascale systems. One lesson was that increasing computational speed with the current architecture, which is modeled on the von Neumann architecture, would not be feasible in the long run.
Another conclusion was that the energy cost of moving data over long distances is wasteful. The Department of Energy’s original goal was to field an exascale system in 2015-2016 that would run at 20 megawatts, but it took much longer. The world’s first exascale system, Frontier, which topped the Top500 list earlier this year, draws 21 megawatts.
“We have incredibly sparse datasets and very few operations that are performed on the datasets. So you do a lot of data movement, but you don’t get a lot of operations from it. What you really want to do is move data efficiently,” Harrod said.
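Harrod’s point can be made concrete with a back-of-the-envelope arithmetic-intensity estimate: useful operations per byte of data moved. The sketch below is illustrative only; the function name, the 8-byte values, 4-byte indices, and the toy matrix dimensions are my own assumptions, not figures from the keynote.

```python
# Rough arithmetic intensity (flops per byte moved) for a sparse
# matrix-vector product y = A @ x, with A an n x n matrix holding
# nnz nonzero entries in a compressed format.

def spmv_intensity(n, nnz, bytes_per_value=8, bytes_per_index=4):
    flops = 2 * nnz  # one multiply + one add per stored nonzero
    # Bytes moved: nonzero values with their column indices,
    # plus the input and output dense vectors.
    bytes_moved = nnz * (bytes_per_value + bytes_per_index) + 2 * n * bytes_per_value
    return flops / bytes_moved

# A 1,000,000 x 1,000,000 matrix with roughly 10 nonzeros per row:
intensity = spmv_intensity(n=1_000_000, nnz=10_000_000)
print(f"{intensity:.3f} flops per byte moved")  # far below 1: movement dominates
```

With well under one operation per byte transferred, such a workload is bound by data movement, not by the speed of the floating-point units, which is exactly the imbalance Harrod describes.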
Not all computing problems are created equal, and throwing GPUs at problems big and small isn’t always the answer, Harrod said. In a dense computing model, moving smaller problems to high-performance accelerators is inefficient.
IARPA’s computing initiative, called AGILE (short for Advanced Graphical Intelligence Logical Computing Environment), is designed to “define the future of computing based on the data movement problem, not the floating-point units or ALUs,” Harrod said.
Computing typically involves generating results from unstructured data distributed across a large network of sources. The sparse computing model breaks the dense model into a more distributed, asynchronous system in which the computation is brought to the data wherever it resides. The assumption is that localized computing does a better job of reducing data travel time.
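The traffic savings of moving computation to the data can be sketched with a toy model. Everything here is an illustrative assumption of mine (the record and summary sizes, the site counts), not a figure from IARPA’s program; the point is only the orders-of-magnitude gap.

```python
# Toy comparison: ship every record to a central hub (dense model)
# versus reducing locally at each source and shipping a small
# summary (compute-to-data model).

RECORD_BYTES = 64    # assumed size of one raw record
SUMMARY_BYTES = 16   # assumed size of a local summary, e.g. a count and a partial sum

def ship_to_hub(sources):
    """Dense model: every record crosses the network."""
    return sum(len(records) for records in sources) * RECORD_BYTES

def compute_at_source(sources):
    """Sparse model: only one small summary per source travels."""
    return len(sources) * SUMMARY_BYTES

# 100 data sources, 10,000 records each:
sources = [list(range(10_000)) for _ in range(100)]
print(ship_to_hub(sources), "bytes vs", compute_at_source(sources), "bytes")
# prints: 64000000 bytes vs 1600 bytes
```

For reductions like counts or sums the trade-off is this stark; workloads whose results cannot be summarized locally see a smaller benefit, which is why the program treats runtime orchestration as a first-class problem.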
Software carries equal weight, with a focus on applications such as graph analysis, where the strength of connections between data points is continuously analyzed. The sparse computing model also applies to machine learning, statistical methods, linear algebra, and data filtering.
IARPA has signed six contracts with organizations including AMD, Georgia Tech, Indiana University, Intel Federal LLC, Qualcomm, and the University of Chicago to explore the best approaches to developing non-von Neumann computing models.
“There will be an open discussion about the ideas being funded,” Harrod said.
The proposals suggest technological approaches such as developing elements of data-driven computing, and some of the building blocks already exist, such as CPUs with HBM and memory modules on substrates, Harrod said, adding, “it doesn’t solve all the problems you have here, but it’s a step in that direction.”
The second technological approach involves intelligent mechanisms for moving data. “It’s not just a matter of a floating-point unit sitting there doing loads and stores; you need intelligent mechanisms to move data around,” Harrod said.
More important still is the runtime system, which acts as the orchestrator of the sparse computing system.
“The assumption here is that these systems are always doing something. You really need to have something that’s looking to see what’s going on. If a programmer has to take total control of all of this, then we’re all in serious trouble,” Harrod said.
The runtime will be important in creating the real-time nature of the computing environment.
“We want to be in a predictive environment versus a forensic environment,” Harrod said.
Proposals will need to be tested and validated using tools like FireSim, which measure the performance of new architectures, Harrod said.
Approaches of the Six Partners (aka Performers in IARPA-speak):
* Sparse computation here is distinct from the well-established concept of “sparsity” in HPC and AI, where a matrix is sparse if it contains mostly zeros.