2020

Cao, Q., Pei, Y., Akbudak, K., Mikhalev, A., Bosilca, G., Ltaief, H., … Dongarra, J. (2020). Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications. Proceedings of the Platform for Advanced Scientific Computing Conference. https://doi.org/10.1145/3394277.3401846
Al-Harthi, N., Alomairy, R., Akbudak, K., Chen, R., Ltaief, H., Bagci, H., & Keyes, D. (2020). Solving Acoustic Boundary Integral Equations Using High Performance Tile Low-Rank LU Factorization. High Performance Computing, 209–229. https://doi.org/10.1007/978-3-030-50743-5_11
Akbudak, K., Ltaief, H., Etienne, V., Abdelkhalak, R., Tonellot, T., & Keyes, D. (2020). Asynchronous computations for solving the acoustic wave propagation equation. The International Journal of High Performance Computing Applications, 34(4), 377–393. https://doi.org/10.1177/1094342020923027
Alomairy, R., Ltaief, H., Abduljabbar, M., & Keyes, D. (2020). Abstraction Layer For Standardizing APIs of Task-Based Engines. IEEE Transactions on Parallel and Distributed Systems, 31(11), 2482–2495. https://doi.org/10.1109/tpds.2020.2992923
Litvinenko, A., Logashenko, D., Tempone, R., Wittum, G., & Keyes, D. (2020). Solution of the 3D density-driven groundwater flow problem with uncertain porosity and permeability. GEM - International Journal on Geomathematics, 11(1). https://doi.org/10.1007/s13137-020-0147-1
Alturkestani, T., Tonellot, T., Ltaief, H., Abdelkhalak, R., Etienne, V., & Keyes, D. (2019). MLBS: Transparent Data Caching in Hierarchical Storage for Out-of-Core HPC Applications. 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC). https://doi.org/10.1109/hipc.2019.00046
Abdulah, S., Ltaief, H., Sun, Y., Genton, M. G., & Keyes, D. E. (2019). Geostatistical Modeling and Prediction Using Mixed Precision Tile Cholesky Factorization. 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC). https://doi.org/10.1109/hipc.2019.00028
Luo, L., Liu, L., Cai, X.-C., & Keyes, D. E. (2020). Fully implicit hybrid two-level domain decomposition algorithms for two-phase flows in porous media on 3D unstructured grids. Journal of Computational Physics, 409, 109312. https://doi.org/10.1016/j.jcp.2020.109312
Keyes, D. E., Ltaief, H., & Turkiyyah, G. (2020). Hierarchical algorithms on hierarchical architectures. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 378(2166), 20190055. https://doi.org/10.1098/rsta.2019.0055
Cao, Q., Pei, Y., Herault, T., Akbudak, K., Mikhalev, A., Bosilca, G., … Dongarra, J. (2019). Performance Analysis of Tile Low-Rank Cholesky Factorization Using PaRSEC Instrumentation Tools. 2019 IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools). https://doi.org/10.1109/protools49597.2019.00009

2019

Doucet, N., Ltaief, H., Gratadour, D., & Keyes, D. (2019). Mixed-Precision Tomographic Reconstructor Computations on Hardware Accelerators. 2019 IEEE/ACM 9th Workshop on Irregular Applications: Architectures and Algorithms (IA3). https://doi.org/10.1109/ia349570.2019.00011
Sukkari, D., Ltaief, H., Keyes, D., & Faverge, M. (2019). Leveraging Task-Based Polar Decomposition Using PARSEC on Massively Parallel Systems. 2019 IEEE International Conference on Cluster Computing (CLUSTER). https://doi.org/10.1109/cluster.2019.8891024
AlOnazi, A., Ltaief, H., Keyes, D., Said, I., & Thibault, S. (2019). Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry. 2019 IEEE International Conference on Cluster Computing (CLUSTER). https://doi.org/10.1109/cluster.2019.8891054
Dalcin, L., Rojas, D., Zampini, S., Del Rey Fernández, D. C., Carpenter, M. H., & Parsani, M. (2019). Conservative and entropy stable solid wall boundary conditions for the compressible Navier–Stokes equations: Adiabatic wall and heat entropy transfer. Journal of Computational Physics, 397, 108775. https://doi.org/10.1016/j.jcp.2019.06.051
Litvinenko, A., Kriemann, R., Genton, M. G., Sun, Y., & Keyes, D. E. (2019). HLIBCov: Parallel hierarchical matrix approximation of large covariance matrices and likelihoods with applications in parameter identification. MethodsX, 7, 100600. https://doi.org/10.1016/j.mex.2019.07.001
Boukaram, W., Turkiyyah, G., & Keyes, D. (2019). Randomized GPU Algorithms for the Construction of Hierarchical Matrices from Matrix-Vector Operations. SIAM Journal on Scientific Computing, 41(4), C339–C366. https://doi.org/10.1137/18m1210101
Ltaief, H., Sukkari, D., Esposito, A., Nakatsukasa, Y., & Keyes, D. (2019). Massively Parallel Polar Decomposition on Distributed-memory Systems. ACM Transactions on Parallel Computing, 6(1), 1–15. https://doi.org/10.1145/3328723
Charara, A., Keyes, D., & Ltaief, H. (2019). Batched Triangular Dense Linear Algebra Kernels for Very Small Matrix Sizes on GPUs. ACM Transactions on Mathematical Software, 45(2), 1–28. https://doi.org/10.1145/3267101
Sukkari, D., Ltaief, H., Esposito, A., & Keyes, D. (2019). A QDWH-based SVD Software Framework on Distributed-memory Manycore Systems. ACM Transactions on Mathematical Software, 45(2), 1–21. https://doi.org/10.1145/3309548
Mortensen, M., Dalcin, L., & Keyes, D. (2019). mpi4py-fft: Parallel Fast Fourier Transforms with MPI for Python. Journal of Open Source Software, 4(36), 1340. https://doi.org/10.21105/joss.01340
Abdelkhalak, R., Akbudak, K., Etienne, V., Ltaief, H., Tonellot, T., & Keyes, D. (2019). Application of High Performance Asynchronous Acoustic Wave Equation Stencil Solver into a Land Survey. SPE Middle East Oil and Gas Show and Conference. https://doi.org/10.2118/194722-ms
Dalcin, L., Mortensen, M., & Keyes, D. E. (2019). Fast parallel multidimensional FFT using advanced MPI. Journal of Parallel and Distributed Computing, 128, 137–150. https://doi.org/10.1016/j.jpdc.2019.02.006
Boukaram, W., Turkiyyah, G., & Keyes, D. (2019). Hierarchical Matrix Operations on GPUs. ACM Transactions on Mathematical Software, 45(1), 1–28. https://doi.org/10.1145/3232850
Genduso, G., Litwiller, E., Ma, X., Zampini, S., & Pinnau, I. (2019). Mixed-gas sorption in polymers via a new barometric test system: sorption and diffusion of CO2-CH4 mixtures in polydimethylsiloxane (PDMS). Journal of Membrane Science, 577, 195–204. https://doi.org/10.1016/j.memsci.2019.01.046
Franzone, P. C., Pavarino, L. F., Scacchi, S., & Zampini, S. (2018). Scalable Cardiac Electro-Mechanical Solvers and Reentry Dynamics. Domain Decomposition Methods in Science and Engineering XXIV, 31–43. https://doi.org/10.1007/978-3-319-93873-8_3
Zampini, S., Vassilevski, P., Dobrev, V., & Kolev, T. (2018). Balancing Domain Decomposition by Constraints Algorithms for Curl-Conforming Spaces of Arbitrary Order. Domain Decomposition Methods in Science and Engineering XXIV, 103–116. https://doi.org/10.1007/978-3-319-93873-8_8

2018

AlOnazi, A., Rogowski, M., Al-Zawawi, A., & Keyes, D. (2018). Performance Assessment of Hybrid Parallelism for Large-Scale Reservoir Simulation on Multi- and Many-core Architectures. 2018 IEEE High Performance Extreme Computing Conference (HPEC). https://doi.org/10.1109/hpec.2018.8547565
Abdulah, S., Ltaief, H., Sun, Y., Genton, M. G., & Keyes, D. E. (2018). Parallel Approximation of the Maximum Likelihood Estimation for the Prediction of Large-Scale Geostatistics Simulations. 2018 IEEE International Conference on Cluster Computing (CLUSTER). https://doi.org/10.1109/cluster.2018.00089
Gharti, H. N., Tromp, J., & Zampini, S. (2018). Spectral-infinite-element simulations of gravity anomalies. Geophysical Journal International, 215(2), 1098–1117. https://doi.org/10.1093/gji/ggy324
Ltaief, H., Charara, A., Gratadour, D., Doucet, N., Hadri, B., Gendron, E., … Keyes, D. (2018). Real-Time Massively Distributed Multi-object Adaptive Optics Simulations for the European Extremely Large Telescope. 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS). https://doi.org/10.1109/ipdps.2018.00018
Akbudak, K., Ltaief, H., Mikhalev, A., Charara, A., Esposito, A., & Keyes, D. (2018). Exploiting Data Sparsity for Large-Scale Matrix Computations. Lecture Notes in Computer Science, 721–734. https://doi.org/10.1007/978-3-319-96983-1_51
Charara, A., Keyes, D., & Ltaief, H. (2018). Tile Low-Rank GEMM Using Batched Operations on GPUs. Lecture Notes in Computer Science, 811–825. https://doi.org/10.1007/978-3-319-96983-1_57
Cao, J., Genton, M. G., Keyes, D. E., & Turkiyyah, G. M. (2018). Hierarchical-block conditioning approximations for high-dimensional multivariate normal probabilities. Statistics and Computing, 29(3), 585–598. https://doi.org/10.1007/s11222-018-9825-3
Doucet, N., Gratadour, D., Ltaief, H., Kriemann, R., Gendron, E., & Keyes, D. (2018). Scalable soft real-time supervisor for tomographic AO. Adaptive Optics Systems VI. https://doi.org/10.1117/12.2313273
Ltaief, H., Sukkari, D., Guyon, O., & Keyes, D. (2018). Extreme Computing for Extreme Adaptive Optics. Proceedings of the Platform for Advanced Scientific Computing Conference. https://doi.org/10.1145/3218176.3218225
Abdulah, S., Ltaief, H., Sun, Y., Genton, M. G., & Keyes, D. E. (2018). ExaGeoStat: A High Performance Unified Software for Geostatistics on Manycore Systems. IEEE Transactions on Parallel and Distributed Systems, 29(12), 2771–2784. https://doi.org/10.1109/tpds.2018.2850749
Pavarino, L. F., Scacchi, S., Widlund, O. B., & Zampini, S. (2018). Isogeometric BDDC deluxe preconditioners for linear elasticity. Mathematical Models and Methods in Applied Sciences, 28(7), 1337–1370. https://doi.org/10.1142/s0218202518500367
Genton, M. G., Keyes, D. E., & Turkiyyah, G. (2018). Hierarchical Decompositions for the Computation of High-Dimensional Multivariate Normal Probabilities. Journal of Computational and Graphical Statistics, 27(2), 268–277. https://doi.org/10.1080/10618600.2017.1375936
Liu, L., Keyes, D. E., & Krause, R. (2018). A Note on Adaptive Nonlinear Preconditioning Techniques. SIAM Journal on Scientific Computing, 40(2), A1171–A1186. https://doi.org/10.1137/17m1128502
Al Farhan, M. A., & Keyes, D. E. (2018). Optimizations of Unstructured Aerodynamics Computations for Many-core Architectures. IEEE Transactions on Parallel and Distributed Systems, 29(10), 2317–2332. https://doi.org/10.1109/tpds.2018.2826533

2017

Malas, T. M., Hager, G., Ltaief, H., & Keyes, D. E. (2018). Multidimensional Intratile Parallelization for Memory-Starved Stencil Computations. ACM Transactions on Parallel Computing, 4(3), 1–32. https://doi.org/10.1145/3155290
Chávez, G., Turkiyyah, G., Zampini, S., Ltaief, H., & Keyes, D. (2018). Accelerated Cyclic Reduction: A distributed-memory fast solver for structured linear systems. Parallel Computing, 74, 65–83. https://doi.org/10.1016/j.parco.2017.12.001
Chávez, G., Turkiyyah, G., Zampini, S., & Keyes, D. (2018). Parallel accelerated cyclic reduction preconditioner for three-dimensional elliptic PDEs with variable coefficients. Journal of Computational and Applied Mathematics, 344, 760–781. https://doi.org/10.1016/j.cam.2017.11.035
Peng, C., Zhang, Z., Wong, K.-C., Zhang, X., & Keyes, D. E. (2017). A scalable community detection algorithm for large graphs using stochastic block models. Intelligent Data Analysis, 21(6), 1463–1485. https://doi.org/10.3233/IDA-163156
Ibeid, H., Yokota, R., Pestana, J., & Keyes, D. (2017). Fast multipole preconditioners for sparse matrices arising from elliptic equations. Computing and Visualization in Science, 18(6), 213–229. https://doi.org/10.1007/s00791-017-0287-5
Gao, L., Ketcheson, D., & Keyes, D. (2017). On long-time instabilities in staggered finite difference simulations of the seismic acoustic wave equations on discontinuous grids. Geophysical Journal International, 212(2), 1098–1110. https://doi.org/10.1093/gji/ggx470
Yokota, R., Ibeid, H., & Keyes, D. (2017). Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation. Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing, 267–286. https://doi.org/10.1007/978-3-319-62426-6_17
Sukkari, D., Ltaief, H., Faverge, M., & Keyes, D. (2018). Asynchronous Task-Based Polar Decomposition on Single Node Manycore Architectures. IEEE Transactions on Parallel and Distributed Systems, 29(2), 312–323. https://doi.org/10.1109/tpds.2017.2755655
Boukaram, W. H., Turkiyyah, G., Ltaief, H., & Keyes, D. E. (2018). Batched QR and SVD algorithms on GPUs with applications in hierarchical matrix compression. Parallel Computing, 74, 19–33. https://doi.org/10.1016/j.parco.2017.09.001
Zampini, S., & Tu, X. (2017). Multilevel Balancing Domain Decomposition by Constraints Deluxe Algorithms with Adaptive Coarse Spaces for Flow in Porous Media. SIAM Journal on Scientific Computing, 39(4), A1389–A1415. https://doi.org/10.1137/16m1080653
Abduljabbar, M., Al Farhan, M., Yokota, R., & Keyes, D. (2017). Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture. Euro-Par 2017: Parallel Processing, 553–564. https://doi.org/10.1007/978-3-319-64203-1_40
AlOnazi, A., Markomanolis, G. S., & Keyes, D. (2017). Asynchronous Task-Based Parallelization of Algebraic Multigrid. Proceedings of the Platform for Advanced Scientific Computing Conference (PASC ’17). https://doi.org/10.1145/3093172.3093230
Oh, D.-S., Widlund, O. B., Zampini, S., & Dohrmann, C. R. (2017). BDDC Algorithms with deluxe scaling and adaptive selection of primal constraints for Raviart-Thomas vector fields. Mathematics of Computation, 87(310), 659–692. https://doi.org/10.1090/mcom/3254
Charara, A., Keyes, D., & Ltaief, H. (2017). A framework for dense triangular matrix kernels on various manycore architectures. Concurrency and Computation: Practice and Experience, 29(15), e4187. https://doi.org/10.1002/cpe.4187
Unat, D., Dubey, A., Hoefler, T., Shalf, J., Abraham, M., Bianco, M., … Pericas, M. (2017). Trends in Data Locality Abstractions for HPC Systems. IEEE Transactions on Parallel and Distributed Systems, 28(10), 3007–3020. https://doi.org/10.1109/tpds.2017.2703149
Abduljabbar, M., Markomanolis, G. S., Ibeid, H., Yokota, R., & Keyes, D. (2017). Communication Reducing Algorithms for Distributed Hierarchical N-Body Problems with Boundary Distributions. High Performance Computing, 79–96. https://doi.org/10.1007/978-3-319-58667-0_5
Akbudak, K., Ltaief, H., Mikhalev, A., & Keyes, D. (2017). Tile Low Rank Cholesky Factorization for Climate/Weather Modeling Applications on Manycore Architectures. High Performance Computing, 22–40. https://doi.org/10.1007/978-3-319-58667-0_2
Pavarino, L. F., Scacchi, S., Verdi, C., Zampieri, E., & Zampini, S. (2017). Scalable BDDC Algorithms for Cardiac Electromechanical Coupling. Domain Decomposition Methods in Science and Engineering XXIII, 261–268. https://doi.org/10.1007/978-3-319-52389-7_26
Zampini, S. (2017). Adaptive BDDC Deluxe Methods for H(curl). Domain Decomposition Methods in Science and Engineering XXIII, 285–292. https://doi.org/10.1007/978-3-319-52389-7_29
Liu, L., Zhang, W., & Keyes, D. E. (2017). Nonlinear Multiplicative Schwarz Preconditioning in Natural Convection Cavity Flow. Domain Decomposition Methods in Science and Engineering XXIII, 227–235. https://doi.org/10.1007/978-3-319-52389-7_22
Chávez, G., Turkiyyah, G., & Keyes, D. E. (2017). A Direct Elliptic Solver Based on Hierarchically Low-Rank Schur Complements. Domain Decomposition Methods in Science and Engineering XXIII, 135–143. https://doi.org/10.1007/978-3-319-52389-7_12
Da Veiga, L. B., Pavarino, L. F., Scacchi, S., Widlund, O. B., & Zampini, S. (2017). Parallel Sum Primal Spaces for Isogeometric Deluxe BDDC Preconditioners. Domain Decomposition Methods in Science and Engineering XXIII, 17–29. https://doi.org/10.1007/978-3-319-52389-7_2
Da Veiga, L. B., Pavarino, L. F., Scacchi, S., Widlund, O. B., & Zampini, S. (2017). Adaptive Selection of Primal Constraints for Isogeometric BDDC Deluxe Preconditioners. SIAM Journal on Scientific Computing, 39(1), A281–A302. https://doi.org/10.1137/15m1054675

2016

Chen, Y., Keyes, D., Law, K. J. H., & Ltaief, H. (2016). Accelerated Dimension-Independent Adaptive Metropolis. SIAM Journal on Scientific Computing, 38(5), S539–S565. https://doi.org/10.1137/15m1026432
Zampini, S. (2016). PCBDDC: A Class of Robust Dual-Primal Methods in PETSc. SIAM Journal on Scientific Computing, 38(5), S282–S306. https://doi.org/10.1137/15m1025785
Liu, L., & Keyes, D. E. (2016). Convergence Analysis for the Multiplicative Schwarz Preconditioned Inexact Newton Algorithm. SIAM Journal on Numerical Analysis, 54(5), 3145–3166. https://doi.org/10.1137/15m1028182
Litvinenko, A., Genton, M., Sun, Y., & Keyes, D. (2016). ℋ-matrix techniques for approximating large covariance matrices and estimating its parameters. PAMM, 16(1), 731–732. https://doi.org/10.1002/pamm.201610354
Sukkari, D., Ltaief, H., & Keyes, D. (2016). A High Performance QDWH-SVD Solver Using Hardware Accelerators. ACM Transactions on Mathematical Software, 43(1), 1–25. https://doi.org/10.1145/2894747
Sukkari, D., Ltaief, H., & Keyes, D. (2016). High Performance Polar Decomposition on Distributed Memory Systems. Lecture Notes in Computer Science, 605–616. https://doi.org/10.1007/978-3-319-43659-3_44
Charara, A., Ltaief, H., & Keyes, D. (2016). Redesigning Triangular Dense Matrix Computations on GPUs. Lecture Notes in Computer Science, 477–489. https://doi.org/10.1007/978-3-319-43659-3_35
Ibeid, H., Yokota, R., & Keyes, D. (2016). A performance model for the communication in fast multipole methods on high-performance computing platforms. The International Journal of High Performance Computing Applications, 30(4), 423–437. https://doi.org/10.1177/1094342016634819
Malas, T. M., Hornich, J., Hager, G., Ltaief, H., Pflaum, C., & Keyes, D. E. (2016). Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS). https://doi.org/10.1109/ipdps.2016.87
Hutchinson, M., Heinecke, A., Pabst, H., Henry, G., Parsani, M., & Keyes, D. (2016). Efficiency of High Order Spectral Element Methods on Petascale Architectures. High Performance Computing, 449–466. https://doi.org/10.1007/978-3-319-41321-1_23
Bao, K., Yan, M., Allen, R., Salama, A., Lu, L., Jordan, K. E., … Keyes, D. (2016). High-Performance Modeling of Carbon Dioxide Sequestration by Coupling Reservoir Simulation and Molecular Dynamics. SPE Journal, 21(3), 853–863. https://doi.org/10.2118/163621-pa
Al Farhan, M. A., Kaushik, D. K., & Keyes, D. E. (2016). Unstructured computational aerodynamics on many integrated core architecture. Parallel Computing, 59, 97–118. https://doi.org/10.1016/j.parco.2016.06.001
Ltaief, H., Gratadour, D., Charara, A., & Gendron, E. (2016). Adaptive Optics Simulation for the World’s Largest Telescope on Multicore Architectures with Multiple GPUs. Proceedings of the Platform for Advanced Scientific Computing Conference (PASC ’16). https://doi.org/10.1145/2929908.2929920
Zampini, S., & Keyes, D. E. (2016). On the Robustness and Prospects of Adaptive BDDC Methods for Finite Element Discretizations of Elliptic PDEs with High-Contrast Coefficients. Proceedings of the Platform for Advanced Scientific Computing Conference (PASC ’16). https://doi.org/10.1145/2929908.2929919
Arfaoui, M.-A., Ltaief, H., Rezki, Z., Alouini, M.-S., & Keyes, D. (2016). Efficient Sphere Detector Algorithm for Massive MIMO Using GPU Hardware Accelerator. Procedia Computer Science, 80, 2169–2180. https://doi.org/10.1016/j.procs.2016.05.377
Dalcin, L., Collier, N., Vignal, P., Côrtes, A. M. A., & Calo, V. M. (2016). PetIGA: A framework for high-performance isogeometric analysis. Computer Methods in Applied Mechanics and Engineering, 308, 151–181. https://doi.org/10.1016/j.cma.2016.05.011
Espath, L. F. R., Sarmiento, A. F., Vignal, P., Varga, B. O. N., Cortes, A. M. A., Dalcin, L., & Calo, V. M. (2016). Energy exchange analysis in droplet dynamics via the Navier–Stokes–Cahn–Hilliard model. Journal of Fluid Mechanics, 797, 389–430. https://doi.org/10.1017/jfm.2016.277
Abdelfattah, A., Ltaief, H., Keyes, D., & Dongarra, J. (2016). Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs. Concurrency and Computation: Practice and Experience, 28(12), 3447–3465. https://doi.org/10.1002/cpe.3874
Abdelfattah, A., Keyes, D., & Ltaief, H. (2016). KBLAS. ACM Transactions on Mathematical Software, 42(3), 1–31. https://doi.org/10.1145/2818311
Da Veiga, L. B., Pavarino, L. F., Scacchi, S., Widlund, O. B., & Zampini, S. (2016). BDDC Deluxe for Isogeometric Analysis. Domain Decomposition Methods in Science and Engineering XXII, 15–28. https://doi.org/10.1007/978-3-319-18827-0_2

2015

Pardo, D., Álvarez-Aramberri, J., Paszynski, M., Dalcin, L., & Calo, V. M. (2015). Impact of element-level static condensation on iterative solver performance. Computers & Mathematics with Applications, 70(10), 2331–2341. https://doi.org/10.1016/j.camwa.2015.09.005
Malas, T., Hager, G., Ltaief, H., & Keyes, D. (2015). Towards Fast Reverse Time Migration Kernels using Multi-threaded Wavefront Diamond Tiling. Second EAGE Workshop on High Performance Computing for Upstream. https://doi.org/10.3997/2214-4609.201414025
Zampini, S., Widlund, O. B., & Keyes, D. E. (2015). Scalable and Robust BDDC Preconditioners for Reservoir and Electromagnetics Modeling. Second EAGE Workshop on High Performance Computing for Upstream. https://doi.org/10.3997/2214-4609.201414030
Abdelfattah, A., Ltaief, H., & Keyes, D. (2015). High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications. Euro-Par 2015: Parallel Processing, 601–612. https://doi.org/10.1007/978-3-662-48096-0_46
Vignal, P., Dalcin, L., Brown, D. L., Collier, N., & Calo, V. M. (2015). An energy-stable convex splitting for the phase-field crystal equation. Computers & Structures, 158, 355–368. https://doi.org/10.1016/j.compstruc.2015.05.029
Mudigere, D., Sridharan, S., Deshpande, A., Park, J., Heinecke, A., Smelyanskiy, M., … Keyes, D. (2015). Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems. 2015 IEEE International Parallel and Distributed Processing Symposium. https://doi.org/10.1109/ipdps.2015.114
Pavarino, L. F., Scacchi, S., & Zampini, S. (2015). Newton–Krylov-BDDC solvers for nonlinear cardiac mechanics. Computer Methods in Applied Mechanics and Engineering, 295, 562–580. https://doi.org/10.1016/j.cma.2015.07.009
Malas, T., Hager, G., Ltaief, H., Stengel, H., Wellein, G., & Keyes, D. (2015). Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates. SIAM Journal on Scientific Computing, 37(4), C439–C464. https://doi.org/10.1137/140991133
Al-Omairy, R., Miranda, G., Ltaief, H., Badia, R., Martorell, X., Labarta, J., & Keyes, D. (2015). Dense Matrix Computations on NUMA Architectures with Distance-Aware Work Stealing. Supercomputing Frontiers and Innovations, 2(1). https://doi.org/10.14529/jsfi150103
Liu, L., & Keyes, D. E. (2015). Field-Split Preconditioned Inexact Newton Algorithms. SIAM Journal on Scientific Computing, 37(3), A1388–A1409. https://doi.org/10.1137/140970379
Schaefer, R., Smołka, M., Dalcin, L., & Paszyński, M. (2015). A New Time Integration Scheme for Cahn-Hilliard Equations. Procedia Computer Science, 51, 1003–1012. https://doi.org/10.1016/j.procs.2015.05.244
Paszyńska, A., Jopek, K., Banaś, K., Paszyński, M., Gurgul, P., Lenerth, A., … Calo, V. (2015). Telescopic Hybrid Fast Solver for 3D Elliptic Problems with Point Singularities. Procedia Computer Science, 51, 2744–2748. https://doi.org/10.1016/j.procs.2015.05.415
Łoś, M., Woźniak, M., Paszyński, M., Dalcin, L., & Calo, V. M. (2015). Dynamics with Matrices Possessing Kronecker Product Structure. Procedia Computer Science, 51, 286–295. https://doi.org/10.1016/j.procs.2015.05.243
Vignal, P., Sarmiento, A., Côrtes, A. M. A., Dalcin, L., & Calo, V. M. (2015). Coupling Navier-Stokes and Cahn-Hilliard Equations in a Two-dimensional Annular Flow Configuration. Procedia Computer Science, 51, 934–943. https://doi.org/10.1016/j.procs.2015.05.228
Côrtes, A. M. A., Coutinho, A. L. G. A., Dalcin, L., & Calo, V. M. (2015). Performance evaluation of block-diagonal preconditioners for the divergence-conforming B-spline discretization of the Stokes system. Journal of Computational Science, 11, 123–136. https://doi.org/10.1016/j.jocs.2015.01.005
Espath, L. F. R., Braun, A. L., Awruch, A. M., & Dalcin, L. D. (2015). A NURBS-based finite element model applied to geometrically nonlinear elastodynamics using a corotational approach. International Journal for Numerical Methods in Engineering, 102(13), 1839–1868. https://doi.org/10.1002/nme.4870
Charara, A., Ltaief, H., Gratadour, D., Keyes, D., Sevin, A., Abdelfattah, A., … Vidal, F. (2014). Pipelining Computational Stages of the Tomographic Reconstructor for Multi-Object Adaptive Optics on a Multi-GPU System. SC14: International Conference for High Performance Computing, Networking, Storage and Analysis. https://doi.org/10.1109/sc.2014.27
Zheng, X., Yang, C., Cai, X.-C., & Keyes, D. (2015). A parallel domain decomposition-based implicit method for the Cahn–Hilliard–Cook phase-field equation in 3D. Journal of Computational Physics, 285, 55–70. https://doi.org/10.1016/j.jcp.2015.01.016

2014

Woźniak, M., Paszyński, M., Pardo, D., Dalcin, L., & Calo, V. M. (2015). Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines. Computer Methods in Applied Mechanics and Engineering, 284, 971–987. https://doi.org/10.1016/j.cma.2014.11.020
Yan, Y., & Keyes, D. E. (2015). Smooth and robust solutions for Dirichlet boundary control of fluid–solid conjugate heat transfer problems. Journal of Computational Physics, 281, 759–786. https://doi.org/10.1016/j.jcp.2014.10.049
Côrtes, A. M. A., Vignal, P., Sarmiento, A., García, D., Collier, N., Dalcin, L., & Calo, V. M. (2014). Solving Nonlinear, High-Order Partial Differential Equations Using a High-Performance Isogeometric Analysis Framework. High Performance Computing, 236–247. https://doi.org/10.1007/978-3-662-45483-1_17
Collier, N., Dalcin, L., & Calo, V. M. (2014). On the computational efficiency of isogeometric methods for smooth elliptic problems using direct solvers. International Journal for Numerical Methods in Engineering, 100(8), 620–632. https://doi.org/10.1002/nme.4769
Abdelfattah, A., Gendron, E., Gratadour, D., Keyes, D., Ltaief, H., Sevin, A., & Vidal, F. (2014). High Performance Pseudo-analytical Simulation of Multi-Object Adaptive Optics over Multi-GPU Systems. Euro-Par 2014 Parallel Processing, 704–715. https://doi.org/10.1007/978-3-319-09873-9_59
Gendron, É., Charara, A., Abdelfattah, A., Gratadour, D., Keyes, D., Ltaief, H., … Rousset, G. (2014). A novel fast and accurate pseudo-analytical simulation approach for MOAO. Adaptive Optics Systems IV. https://doi.org/10.1117/12.2055911
Sarmiento, A., Garcia, D., Dalcin, L., Collier, N., & Calo, V. (2014). Micropolar Fluids Using B-spline Divergence Conforming Spaces. Procedia Computer Science, 29, 991–1001. https://doi.org/10.1016/j.procs.2014.05.089
Vignal, P., Dalcin, L., Collier, N. O., & Calo, V. M. (2014). Modeling Phase-transitions Using a High-performance, Isogeometric Analysis Framework. Procedia Computer Science, 29, 980–990. https://doi.org/10.1016/j.procs.2014.05.088
Yang, C., Cai, X.-C., Keyes, D. E., & Pernice, M. (2014). NKS Method for the Implicit Solution of a Coupled Allen-Cahn/Cahn-Hilliard System. Domain Decomposition Methods in Science and Engineering XXI, 819–827. https://doi.org/10.1007/978-3-319-05789-7_79

2013

Dongarra, J., Faverge, M., Ltaief, H., & Luszczek, P. (2013). Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting. Concurrency and Computation: Practice and Experience, 26(7), 1408–1431. https://doi.org/10.1002/cpe.3110
Ltaief, H., & Yokota, R. (2013). Data-driven execution of fast multipole methods. Concurrency and Computation: Practice and Experience, 26(11), 1935–1946. https://doi.org/10.1002/cpe.3132
Liu, L., Keyes, D. E., & Sun, S. (2013). Fully Implicit Two-phase Reservoir Simulation With the Additive Schwarz Preconditioned Inexact Newton Method. SPE Reservoir Characterization and Simulation Conference and Exhibition. https://doi.org/10.2118/166062-ms
Downes, T. P., Roller, S., Seitsonen, A. P., Valcke, S., Keyes, D., Sawley, M.-C., … Shalf, J. (2013). Topic 14+16: High-Performance and Scientific Applications and Extreme-Scale Computing. Lecture Notes in Computer Science, 737–738. https://doi.org/10.1007/978-3-642-40047-6_73
Abdelfattah, A., Dongarra, J., Keyes, D., & Ltaief, H. (2013). Optimizing Memory-Bound SYMV Kernel on GPU Hardware Accelerators. High Performance Computing for Computational Science - VECPAR 2012, 72–79. https://doi.org/10.1007/978-3-642-38718-0_10
Ltaief, H., Luszczek, P., & Dongarra, J. (2013). High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures. ACM Transactions on Mathematical Software, 39(3), 1–22. https://doi.org/10.1145/2450153.2450154
Pardo, D., Paszynski, M., Collier, N., Alvarez, J., Dalcin, L., & Calo, V. M. (2012). A survey on direct solvers for Galerkin methods. SeMA Journal, 57(1), 107–134. https://doi.org/10.1007/bf03322602
Collier, N., Dalcin, L., Pardo, D., & Calo, V. M. (2013). The Cost of Continuity: Performance of Iterative Solvers on Isogeometric Finite Elements. SIAM Journal on Scientific Computing, 35(2), A767–A784. https://doi.org/10.1137/120881038
Dongarra, J., Ltaief, H., Luszczek, P., & Weaver, V. M. (2012). Energy Footprint of Advanced Dense Numerical Linear Algebra Using Tile Algorithms on Multicore Architectures. 2012 Second International Conference on Cloud and Green Computing. https://doi.org/10.1109/cgc.2012.113
Abdelfattah, A., Keyes, D., & Ltaief, H. (2013). Systematic Approach in Optimizing Numerical Memory-Bound Kernels on GPU. Euro-Par 2012: Parallel Processing Workshops, 207–216. https://doi.org/10.1007/978-3-642-36949-0_23
Keyes, D. E., McInnes, L. C., Woodward, C., Gropp, W., Myra, E., Pernice, M., … Wohlmuth, B. (2013). Multiphysics simulations. The International Journal of High Performance Computing Applications, 27(1), 4–83. https://doi.org/10.1177/1094342012468181
Peng, C., Wong, K.-C., Rockwood, A., Zhang, X., Jiang, J., & Keyes, D. (2012). Multiplicative Algorithms for Constrained Non-negative Matrix Factorization. 2012 IEEE 12th International Conference on Data Mining. https://doi.org/10.1109/icdm.2012.106
Yuan, X., Li, X. S., Yamazaki, I., Jardin, S. C., Koniges, A. E., & Keyes, D. E. (2013). Application of PDSLin to the magnetic reconnection problem. Computational Science & Discovery, 6(1), 014002. https://doi.org/10.1088/1749-4699/6/1/014002

2012

Haidar, A., Ltaief, H., & Dongarra, J. (2012). Toward a High Performance Tile Divide and Conquer Algorithm for the Dense Symmetric Eigenvalue Problem. SIAM Journal on Scientific Computing, 34(6), C249–C274. https://doi.org/10.1137/110823699
He, Y., & Keyes, D. E. (2012). Large-scale parameter extraction in electrocardiology models through Born approximation. Inverse Problems, 29(1), 015001. https://doi.org/10.1088/0266-5611/29/1/015001
Bosilca, G., Ltaief, H., & Dongarra, J. (2012). Power profiling of Cholesky and QR factorizations on distributed memory systems. Computer Science - Research and Development, 29(2), 139–147. https://doi.org/10.1007/s00450-012-0224-2
Haidar, A., Ltaief, H., Luszczek, P., & Dongarra, J. (2012). A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction. 2012 IEEE 26th International Parallel and Distributed Processing Symposium. https://doi.org/10.1109/ipdps.2012.13
Niemi, A. H., Collier, N., Dalcin, L., Ghommem, M., & Calo, V. M. (n.d.). Isogeometric Shell Formulation based on a Classical Shell Model. Proceedings of the Eleventh International Conference on Computational Structures Technology. https://doi.org/10.4203/ccp.99.221
Ltaief, H., Luszczek, P., & Dongarra, J. (2012). Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures Using Tree Reduction. Lecture Notes in Computer Science, 661–670. https://doi.org/10.1007/978-3-642-31464-3_67
Alvarez-Aramberri, J., Pardo, D., Paszynski, M., Collier, N., Dalcin, L., & Calo, V. M. (2012). On Round-off Error for Adaptive Finite Element Methods. Procedia Computer Science, 9, 1474–1483. https://doi.org/10.1016/j.procs.2012.04.162
Malas, T., Ahmadia, A. J., Brown, J., Gunnels, J. A., & Keyes, D. E. (2012). Optimizing the performance of streaming numerical kernels on the IBM Blue Gene/P PowerPC 450 processor. The International Journal of High Performance Computing Applications, 27(2), 193–209. https://doi.org/10.1177/1094342012444795
Yuan, X., Jardin, S. C., & Keyes, D. E. (2012). Numerical simulation of four-field extended magnetohydrodynamics in dynamically adaptive curvilinear coordinates via Newton–Krylov–Schwarz. Journal of Computational Physics, 231(17), 5822–5853. https://doi.org/10.1016/j.jcp.2012.05.009
Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Langou, J., Ltaief, H., & Tomov, S. (2011). LU factorization for accelerator-based systems. 2011 9th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA). https://doi.org/10.1109/aiccsa.2011.6126599
Dongarra, J., Faverge, M., Ltaief, H., & Luszczek, P. (2011). High performance matrix inversion based on LU factorization for multicore architectures. Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers - MTAGS ’11. https://doi.org/10.1145/2132876.2132885

2011

Collier, N., Pardo, D., Dalcin, L., Paszynski, M., & Calo, V. M. (2012). The cost of continuity: A study of the performance of isogeometric finite elements using direct solvers. Computer Methods in Applied Mechanics and Engineering, 213–216, 353–361. https://doi.org/10.1016/j.cma.2011.11.002
Haidar, A., Ltaief, H., & Dongarra, J. (2011). Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels. Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC ’11. https://doi.org/10.1145/2063384.2063394
Ltaief, H., Luszczek, P., & Dongarra, J. (2011). Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency. Computer Science - Research and Development, 27(4), 277–287. https://doi.org/10.1007/s00450-011-0191-z
Collier, N., Radwan, H., Dalcin, L., & Calo, V. M. (2013). Time adaptivity in the diffusive wave approximation to the shallow water equations. Journal of Computational Science, 4(3), 152–156. https://doi.org/10.1016/j.jocs.2011.07.004
Haidar, A., Ltaief, H., YarKhan, A., & Dongarra, J. (2011). Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures. Concurrency and Computation: Practice and Experience, 24(3), 305–321. https://doi.org/10.1002/cpe.1829
Kaushik, D., Keyes, D., Balay, S., & Smith, B. (2011). Hybrid Programming Model for Implicit PDE Simulations on Multicore Architectures. Lecture Notes in Computer Science, 12–21. https://doi.org/10.1007/978-3-642-21487-5_2
Collier, N., Radwan, H., Dalcin, L., & Calo, V. M. (2011). Diffusive Wave Approximation to the Shallow Water Equations: Computational Approach. Procedia Computer Science, 4, 1828–1833. https://doi.org/10.1016/j.procs.2011.04.198
Keyes, D. E. (2011). Exaflop/s: The why and the how. Comptes Rendus Mécanique, 339(2–3), 70–77. https://doi.org/10.1016/j.crme.2010.11.002

2010

Kaushik, D., Keyes, D., Allsopp, N., Balay, S., Smith, B., Simos, T. E., … Tsitouras, C. (2010). Hierarchical Programming Models for Exascale Computing—Potential and Challenges. https://doi.org/10.1063/1.3498226
Bhowmick, S., Eijkhout, V., Freund, Y., Fuentes, E., & Keyes, D. (2010). Application of Alternating Decision Trees in Selecting Sparse Linear Solvers. Software Automatic Tuning, 153–173. https://doi.org/10.1007/978-1-4419-6935-4_10
Yuan, X., Jardin, S. C., & Keyes, D. E. (2011). Moving grids for magnetic reconnection via Newton–Krylov methods. Computer Physics Communications, 182(1), 173–176. https://doi.org/10.1016/j.cpc.2010.06.009