High-performance computing (HPC) has become an essential tool for processing large datasets and simulating nature’s most complex systems. However, researchers face difficulties in developing more intensive models because Moore’s Law – the observation that the number of transistors on a chip, and with it computational power, doubles roughly every two years – is slowing, and memory bandwidth still cannot keep pace with processing power.
A team led by computer scientist Hatem Ltaief are tackling this problem head-on by employing hardware designed for artificial intelligence (AI) to help scientists make their code more efficient. They now report making simulations up to 150 times faster in the diverse fields of climate modeling, astronomy, seismic imaging and wireless communications[1].
Previously, Ltaief and co-workers showed that many scientists were riding the wave of hardware development and “over-solving” their models, carrying out lots of unnecessary calculations.
“With the increasing energy cost of data movement and hardware limitations in terms of energy efficiency, we need algorithmic innovations to rescue the scientific community, which is in panic mode,” explains Ltaief. “Reducing data movement becomes like reducing fuel consumption for airlines — a must. What if we could solve a huge memory footprint problem by only operating on the most significant information, and yet still achieve the required accuracy?”
Ltaief and co-workers started their quest to reduce data movement about five years ago. Their approach involves restructuring the HPC workloads so that they can run on AI-focused Intelligence Processing Units (IPUs) made by Graphcore, which has provided invaluable technical support. Crucially, the team organize the code into matrices – single mathematical objects that work efficiently with numerical libraries that are optimized for the IPUs.
“We can perform compression on the matrix operator that describes the physics of the problem, while maintaining a satisfactory accuracy level as if no compression was done,” says Ltaief. “We can still manipulate the resulting compressed data structures by executing linear algebra matrix operations and leverage the high bandwidth of the IPUs.”
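The idea of compressing a matrix operator and then computing directly on the compressed form can be illustrated with a small sketch. The article does not say which compression scheme the team uses, so this example assumes a plain truncated singular value decomposition (a low-rank approximation) as a stand-in. The kernel matrix, the tolerance and the function names `compress_low_rank` and `compressed_matvec` are all hypothetical, and the code runs on an ordinary CPU with NumPy rather than on an IPU.

```python
import numpy as np


def compress_low_rank(A, tol):
    """Truncated SVD: keep singular values larger than tol times the largest."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    rank = max(1, int(np.sum(s > tol * s[0])))
    # Fold the retained singular values into the left factor.
    return U[:, :rank] * s[:rank], Vt[:rank, :]


def compressed_matvec(Us, Vt, x):
    """Apply the operator through its two thin factors, never rebuilding A."""
    return Us @ (Vt @ x)


# Hypothetical dense operator: a smooth kernel sampled on a 1D grid.
# Smooth kernels like this tend to have rapidly decaying singular values.
n = 512
pts = np.linspace(0.0, 1.0, n)
A = 1.0 / (1.0 + (pts[:, None] - pts[None, :]) ** 2)

Us, Vt = compress_low_rank(A, tol=1e-8)

rng = np.random.default_rng(0)
x = 1.0 + rng.standard_normal(n)

y_full = A @ x
y_comp = compressed_matvec(Us, Vt, x)

rel_err = np.linalg.norm(y_full - y_comp) / np.linalg.norm(y_full)
stored = Us.size + Vt.size  # numbers kept after compression

print("retained rank:", Us.shape[1])
print("stored entries:", stored, "of", A.size)
print("relative error of matrix-vector product:", rel_err)
```

In this toy case the two thin factors hold far fewer numbers than the dense matrix, and the matrix-vector product touches only those factors, which is the sense in which compression can cut both the memory footprint and the data that has to be moved.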
This approach saves on memory footprint, data transfer and algorithmic complexity. It is already increasing the speed at which scientists can tackle problems such as adapting astronomical telescopes to real-time changes in the atmosphere or post-processing field data in seismic imaging. “I am lucky to work with scientists that embrace hardware technologies and understand how impactful multidisciplinary work can be,” says Ltaief.
“Linear algebra operations are the bottleneck of many applications,” says Ltaief’s colleague David Keyes. “You can expect to see more headlines about compression technology as it permeates new applications and hardware.”
REFERENCE
- Ltaief, H., Hong, Y., Dabah, A., Alomairy, R., Abdulah, S., Goreczny, C., Gepner, P., Ravasi, M., Gratadour, D. & Keyes, D. Steering customized AI architectures for HPC scientific applications. International Supercomputing Conference, Hamburg, Germany (2023).