This thesis addresses the exponential growth of experimental data and the resulting computational complexity seen in two major scientific applications, which account for most cycles consumed on today's supercomputers.
The first application is about deploying the next-generation ground-based telescopes, code-named Extremely Large Telescopes (ELT), that will enable us to gain much more knowledge of the universe. But it also confronts the astronomy community with daunting real-time computation requirements. The second application deals with emerging frequency-domain methods, such as Seismic Redatuming by Inversion, which are game-changers in exploration geophysics. They are used in the context of geothermal exploration and carbon capture. However, these methods may not be attractive as commonly posed since they are computationally intensive and expensive in terms of memory footprint.
We tackle the aforementioned challenges by designing and implementing high-performance algebraic and stochastic algorithms, which exploit the data sparsity structure of the matrix operator. Data sparsity structures arises in randomized optimization algorithms such as stochastic gradient descent and Adam, which become popular in the machine learning community. They also appear when solving large covariance matrix problems that capture the correlations between regularly-spaced sources and receivers for seismic imaging or wavefront sensors detecting the atmospheric turbulence for ground-based computational astronomy. Algebraic compression based on low-rank approximations can be used to retain the most significant spectrum of the operator and provide numerical solutions at the required application's accuracy. In addition to algebraic compression, multiple precisions can further reduce the data volume by trading off application accuracy for memory footprint.
We exploit the data sparsity of several data matrices representative of these two scientific applications and design three algorithms to accelerate the corresponding computation workload. Firstly, we design a stochastic Levenberg-Marquardt method to accelerate the soft real-time control of ground-based telescopes. Secondly, we design a tile low-rank matrix-vector multiplication algorithm used in both hard real-time control of ground-based telescopes and seismic redatuming. Lastly, we leverage multiple precisions to mitigate communication overheads within the memory subsystem and further improve the performance of seismic redatuming applications. In both applications, we achieve an order of magnitude in time to solution and data size reduction.
We deploy our algorithms on various hardware platforms such as x86_64, ARM A64FX, NEC vector engine, GPU-based accelerators and customized AI accelerators. Our algorithmic solutions deliver significant performance enhancement, while maintaining application accuracy.
Yuxi Hong is a PhD student in the Extreme Computing Research Center at King Abdullah University of Science and Technology (KAUST). He received an MS degree in Electronics Engineering and a BSc in Electrical Engineering from Tsinghua University. His current research interests include High Performance Computing, Numerical Linear Algebra, GPU programming, HPC Applications Performance Optimization, Efficient Machine Learning/Deep learning.