Journal of Frontiers of Computer Science and Technology, 2023, Vol. 17, Issue (7): 1576-1585. DOI: 10.3778/j.issn.1673-9418.2111062

• Theory · Algorithm •


Multi-GPU Programming Model for Subgraph Matching in Large Graphs

LI Cenhao, CUI Pengjie, YUAN Ye, WANG Guoren   

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110000, China
    2. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  • Online: 2023-07-01  Published: 2023-07-01


Abstract: Subgraph matching is an important means of data mining in complex networks. In recent years, subgraph matching algorithms based on GPUs (graphics processing units) have shown a clear speed advantage. However, owing to the sheer scale of large-graph data and the large number of intermediate results produced by subgraph matching, the memory capacity of a single GPU quickly becomes the main bottleneck when processing subgraph matching on large graphs. Therefore, this paper proposes a multi-GPU programming model for subgraph matching in large graphs. Firstly, a multi-GPU subgraph matching framework is proposed, which realizes the cooperative execution of subgraph matching across multiple GPUs and thereby resolves the graph-scale problem of GPU-based subgraph matching on large graphs. Secondly, a dynamic adjustment technique based on the query graph is adopted to handle cross-partition subgraph sets, solving the cross-partition matching problem caused by graph partitioning. Finally, exploiting the characteristics of the GPU's SIMT (single instruction multiple threads) architecture, a priority scheduling strategy is proposed to ensure load balancing inside each GPU, and a shared-memory pipeline mechanism is designed to mitigate the cache contention of concurrent multi-core execution. Experiments show that the proposed multi-GPU programming model obtains correct matching results on billion-scale datasets, and that the proposed framework achieves a speedup of 1.2 to 2.6 times over the latest GPU-based solutions.
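To make the division of work across devices concrete, the CUDA C++ sketch below shows one plausible skeleton for distributing matching over multiple GPUs; it is not the authors' implementation. It assumes a CSR-format data graph, contiguous vertex-range partitioning with vertex labels replicated on every device, and a query consisting of a single labelled edge; the kernel name matchEdgeKernel and all variable names are hypothetical. The framework described in the paper additionally performs multi-vertex query extension, query-graph-based dynamic adjustment for cross-partition candidates, priority scheduling, and the shared-memory pipeline, none of which appear in this minimal sketch.

#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the paper): each thread scans the adjacency
// list of one data vertex in this GPU's partition and counts edges whose
// endpoint labels match a single query edge (qa -> qb).
__global__ void matchEdgeKernel(const int *rowPtr, const int *colIdx,
                                const int *label, int numVertices,
                                int vertexOffset, int qa, int qb,
                                unsigned long long *count) {
    int v = blockIdx.x * blockDim.x + threadIdx.x;
    if (v >= numVertices) return;
    if (label[vertexOffset + v] != qa) return;       // global id = offset + local id
    unsigned long long local = 0;
    for (int e = rowPtr[v]; e < rowPtr[v + 1]; ++e)
        if (label[colIdx[e]] == qb) ++local;
    if (local) atomicAdd(count, local);
}

int main() {
    // Toy data graph in CSR form: 6 labelled vertices, 8 directed edges.
    std::vector<int> label  = {0, 1, 0, 1, 0, 1};
    std::vector<int> rowPtr = {0, 2, 3, 4, 6, 7, 8};
    std::vector<int> colIdx = {1, 3, 2, 3, 4, 5, 5, 0};
    int qa = 0, qb = 1;                               // query edge: label 0 -> label 1

    int numGPUs = 0;
    cudaGetDeviceCount(&numGPUs);
    if (numGPUs == 0) { std::printf("no CUDA device found\n"); return 0; }

    int totalV = static_cast<int>(label.size());
    int perGPU = (totalV + numGPUs - 1) / numGPUs;    // contiguous vertex-range partitioning
    unsigned long long totalCount = 0;

    for (int d = 0; d < numGPUs; ++d) {
        int begin = d * perGPU;
        int end   = std::min(totalV, begin + perGPU);
        int nV    = end - begin;
        if (nV <= 0) continue;
        cudaSetDevice(d);

        // Rebase this partition's row pointers so they index its own edge slice.
        std::vector<int> localRow(nV + 1);
        for (int i = 0; i <= nV; ++i)
            localRow[i] = rowPtr[begin + i] - rowPtr[begin];
        int nE = localRow[nV];

        int *dRow, *dCol, *dLab;
        unsigned long long *dCnt;
        cudaMalloc(&dRow, (nV + 1) * sizeof(int));
        cudaMalloc(&dCol, (nE > 0 ? nE : 1) * sizeof(int));
        cudaMalloc(&dLab, totalV * sizeof(int));      // labels replicated on every GPU
        cudaMalloc(&dCnt, sizeof(unsigned long long));
        cudaMemcpy(dRow, localRow.data(), (nV + 1) * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemcpy(dCol, colIdx.data() + rowPtr[begin], nE * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemcpy(dLab, label.data(), totalV * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemset(dCnt, 0, sizeof(unsigned long long));

        int threads = 256, blocks = (nV + threads - 1) / threads;
        matchEdgeKernel<<<blocks, threads>>>(dRow, dCol, dLab, nV, begin, qa, qb, dCnt);
        cudaDeviceSynchronize();

        unsigned long long cnt = 0;
        cudaMemcpy(&cnt, dCnt, sizeof(cnt), cudaMemcpyDeviceToHost);
        totalCount += cnt;
        cudaFree(dRow); cudaFree(dCol); cudaFree(dLab); cudaFree(dCnt);
    }
    std::printf("query-edge matches found across %d GPU(s): %llu\n", numGPUs, totalCount);
    return 0;
}

Partitioning by contiguous vertex ranges keeps each GPU's edge slice and rebased row pointers trivially computable, which is why it is used in this sketch; any edge whose endpoints fall in different partitions would need the cross-partition handling that the paper's dynamic adjustment technique addresses.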

Key words: graph analysis, multi-GPU, subgraph matching in large graphs, priority scheduling, parallel programming model