Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (10): 2265-2277. DOI: 10.3778/j.issn.1673-9418.2303076

• Frontiers·Surveys •

Survey of Linear Algebra Solvers for Exascale Computing

HE Lianhua, XU Shun, JIN Zhong

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  • Online: 2023-10-01 Published: 2023-10-01

Abstract: Scientific and engineering applications built on exascale computing bring both greater opportunities and greater challenges to the development of numerical linear algebra algorithms. This paper first analyzes the characteristics of exascale computing: parallel programming for large-scale heterogeneous parallel architectures has become the mainstream approach; reducing the extremely high energy cost of running large-scale applications has become a primary concern; and multi-precision heterogeneous computing hardware has spurred further research and development of mixed-precision algorithms. It then reviews the functional and performance optimizations that mainstream dense and sparse linear algebra solvers have made for high-performance computing architectures, and compares the characteristics and advantages of each solver. Next, it summarizes progress in the core technologies of linear algebra solvers, mainly: isolating heterogeneous computing modules and designing new unified programming frameworks to achieve performance portability of software and algorithms; using mixed-precision methods to improve the performance of numerical computation and data storage while still meeting the overall requirements of scientific and engineering applications; and developing advanced parallel algorithms that exploit the characteristics of multi-level hardware caches and network communication to avoid or reduce inefficient large-scale data communication. Finally, it gives an outlook on future research.

Key words: high-performance computing, exascale computing, numerical linear algebra, mixed precision