  • ECRC has developed highly efficient and concurrent GPU and manycore CPU implementations of hierarchically low-rank methods of linear algebra ("H-matrices"), which are its secret sauce for many applications in integral equations in wave scattering, covariances in statistics, Hessians in optimization, and Schur complements in electrodynamics and mechanics; these accommodate much larger problems in the fast but small memories of GPUs than any other method and run at close to the limiting resource, whether it be memory bandwidth or compute capacity.
  • ECRC has pioneered the use of the data-sparse technologies in developing second-order optimization methods for seismic inversion (and generically other problems of PDE-constrained optimization) without first requiring the construction or storage of Hessian matrices; this is a key technology for future growth of the center, in both simulation and machine learning contexts.
  • ECRC has developed highly efficient GPU and manycore CPU implementations of tile low-rank (TLR) methods of linear algebra for the same applications listed above; these readily leverage massively parallel task-based dynamic runtime systems in shared and distributed memory.
  • ECRC has developed a universal application programmer interface (API) for asynchronous task-based runtime systems that allows runtime-dynamic switches between different runtime systems that possess complementary advantages, to employ the best-performing system for each phase and each hardware-application combination without recoding.
  • ECRC has pioneered and demonstrated high-order PDE discretizations for the Navier-Stokes equations of fluid dynamics that are discretely entropy-stable and thus able to exploit emerging high-performance architectures while retaining the desired stability properties of low-order methods that are memory-bandwidth bound to low performance on emerging architectures.
  • ECRC has pioneered and demonstrated energy-conserving PDE discretizations for linear wave propagation problems that preserve stability at discontinuous changes of discretization type, such as finite-element-to-finite-difference interfaces, acoustic-to-elastic interfaces, and grid resolution differences; these are important in practice for multiphysics and multiscale phenomena.
  • ECRC has applied many of the techniques above to scaling the performance of practical problems in computational science and engineering – in integral equations, differential equations, generalized eigenproblems (e.g., Schroedinger), and statistics, and is now pursuing applications in large-scale data analytics beyond simple statistical regression and in machine learning applications.
  • ECRC-funded researchers have published approximately 120 papers in refereed journals and conference proceedings, without benefit of a CCF until April 2018 (15 months ago) or of any Centre Partnership Funds or Wedge Funds; this is essentially the work of the software engineer research scientists on the base budget and their collaborations with KAUST doctoral students, plus the small faculty-researcher teams at CCF partners University of Texas and New York University.

 

Translational Accomplishments

  • ECRC has published on GitHub 8 pieces of open-source software, and will add software from at least 3 more projects developed over the past 5 years when it is mature and when the PhD theses under which it was developed are securely accepted.
  • KBLAS is a package of many linear algebra routines for dense symmetric matrices on GPUs that outperformed all other packages and continues to do so; NVIDIA adopted it and distributes it in cuBLAS with every scientific-purpose GPU globally.
  • QDWH is a polar decomposition and QDWH_SVD a singular value decomposition for distributed memory that outperformed all other packages for the corresponding solvers in Cray’s LibSci; Cray adopted it and distributes it in LibSci with every Cray system licensed globally.
  • ExaGeoStat and ExaGeoStatR are front ends for Maximum Likelihood Estimation that layer on ECRC linear algebra software exploiting concurrency, mixed precision, and data sparsity to offer covariance matrix manipulations key to geospatial statistics at unprecedented data sizes; these are now used at KAUST with large data sets and are being picked up by external R users (R is the lingua franca “MATLAB” of statistics).
  • GIRIH is a mesh traversal package for shared memory manycore processors that beat all known packages when released and has been widely adopted; Intel has hired GIRIH's author out of his Berkeley post-doc to develop their wave propagation solvers for Aramco’s seismic inversion.
  • MOAO is a package for several dense matrix manipulations in multi-objective adaptive optics; it is in use in the real-time application of the Subaru telescope of Japan in Hawaii, and it is being used in the development of, and is destined for field use in, the European Extremely Large Telescope (E-ELT) being built on a mountaintop in Chile.
  • MLBS (not yet on GitHub) is a multi-layer buffer system for speeding up I/O using excess processor resources; it was co-developed with Aramco in an ECRC collaboration and is being incorporated into Aramco’s GeoDrive.
  • ECRC-funded post-doc Longfei Gao has collaborated with Aramco on new wave propagation techniques for the treatment of salt bodies that resolve transverse (elastic) waves heretofore neglected in purely compressive (acoustic) models of immersed salt bodies.
  • Two ECRC-funded research scientists are among the 14 developers globally of the Portable Extensible Toolkit for Scientific Computation (PETSc), which won a US R&D 100 Award, is distributed in Cray’s LibSci and is incorporated into literally dozens of open source and commercially licensed packages for the parallel solution of partial differential equations (only Argonne National Lab employs more PETSc developers than the ECRC).
  • An ECRC-funded research scientist is part of the US DOE MFEM Exascale Computing Project for high-order mimetic finite element modeling for vector-field computations that preserve at the discrete level various continuous properties (such as div B=0 in electromagnetism).
  • An ECRC-funded research scientist is the lead software architect behind the unstructured high-order code that has been licensed to McLaren as part of the KAUST aerodynamics collaboration with this Formula-1 racing team.
  • Together with the group of Professor Slim Alouini, the ECRC has patented a “sphere detector” for massive multiple-input multiple-output (MIMO) decoding of wireless communications, satisfying real-time and lower-power constraints.