Journal of Frontiers of Computer Science and Technology ›› 2012, Vol. 6 ›› Issue (2): 118-124.DOI: 10.3778/j.issn.1673-9418.2012.02.003

Previous Articles     Next Articles

Optimization of Parallel Principal Eigenvectors Computing for Large-Scale Sparse Matrixes

WANG Wei, CHEN Jianping, ZENG Guosun, YU Lihua, TAN Yiming   

  1. 1. Department of Computer Science and Engineering, Tongji University, Shanghai 200092, China 2. Tongji Branch, National Engineering & Technology Center of High Performance, Shanghai 200092, China 3. Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 200092, China
  • Online:2012-02-01 Published:2012-02-01

大规模稀疏矩阵的主特征向量计算优化方法

王 伟, 陈建平, 曾国荪, 俞莉花, 谭一鸣   

  1. 1. 同济大学 计算机科学与技术系, 上海 200092
    2. 国家高性能计算机工程技术中心 同济分中心, 上海 200092
    3. 同济大学 嵌入式系统与服务计算教育部重点实验室, 上海 200092

Abstract: The principal eigenvectors computing (PEC) is a paramount operation in engineering and scientific computing. Since the general-purpose computing on graphics processing unit (GPGPU) emerges for the outstanding acceleration factors, PEC implementations on graphics processing unit (GPU) have appeared on the scene. This paper analyzes PEC performance bottleneck from the characteristic of application and GPU architecture, and therefore proposes a new implementation of PEC based on a new matrix storage format, called GPU-ELL, and an optimized thread mapping strategy of GPU. It evaluates the proposed approach over ATI HD Radeon 5850 GPU, and the experimental results show its good performance with average 200 times acceleration of other existing algorithm on CPU, and 2 times of that on GPU.

Key words: general-purpose computing on graphics processing unit (GPGPU), principal eigenvectors computing (PEC), sparse matrix vector (SpMV), thread optimization

摘要: 矩阵主特征向量(principal eigenvectors computing, PEC)的求解是科学与工程计算中的一个重要问题。随着图形处理单元通用计算(general-purpose computing on graphics processing unit, GPGPU)的兴起, 利用GPU来优化大规模稀疏矩阵的图形处理单元求解得到了广泛关注。分别从应用特征和GPU体系结构特征两方面分析了PEC运算的性能瓶颈, 提出了一种面向GPU的稀疏矩阵存储格式——GPU-ELL和一个针对GPU的线程优化映射策略, 并设计了相应的PEC优化执行算法。在ATI HD Radeon 5850上的实验结果表明, 相对于传统CPU, 该方案获得了最多200倍左右的加速, 相对于已有GPU上的实现, 也获得了2倍的加速。

关键词: 图形处理单元通用计算(GPGPU), 主特征向量计算, 稀疏矩阵向量乘, 线程优化