计算机科学与探索 ›› 2018, Vol. 12 ›› Issue (8): 1214-1224.DOI: 10.3778/j.issn.1673-9418.1709043

• 数据库技术 • 上一篇    下一篇

基于结构分解的动态图增量匹配算法

许  嘉1,2,3,张千桢1,赵  翔4+,吕  品1,2,3,李陶深1,2   

  1. 1. 广西大学 计算机与电子信息学院,南宁 530004
    2. 广西高校并行分布计算技术重点实验室,南宁 530004
    3. 广西多媒体通信与网络技术重点实验室,南宁 530004
    4. 国防科技大学 信息系统与管理学院,长沙 410073
  • 出版日期:2018-08-01 发布日期:2018-08-09

Incremental Pattern Matching Algorithm for Dynamic Graph Based on Structure Decomposition

XU Jia1,2,3, ZHANG Qianzhen1, ZHAO Xiang4+, LV Pin1,2,3, LI Taoshen1,2   

  1. 1. School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
    2. Guangxi Colleges and Universities Key Laboratory of Parallel and Distributed Computing, Nanning 530004, China
    3. Guangxi Key Laboratory of Multimedia Communications Network Technology, Nanning 530004, China
    4. College of Information System and Management, National University of Defense Technology, Changsha 410073, China
  • Online:2018-08-01 Published:2018-08-09

摘要: 在大数据时代,图数据的规模急剧增长,增量图模式匹配技术能够在数据图发生变化时避免重新对整个数据图进行匹配,进而减少匹配时间,提高整体执行效率,因此成为研究热点。然而,现有的增量匹配算法处理规模较大的模式图时效率会降低。针对该问题,提出了一种基于结构分解的增量图模式匹配算法Inc_CFLS。在匹配过程中,为中间匹配结果构建高效索引,用于后续的模式匹配计算。基于构建的索引信息对数据图增加边事件进行分类,进而为每类增加边事件设计查询剪枝优化策略,从而有效提高匹配效率。在真实数据集上进行实验,结果表明Inc_CFLS算法比目前最好的增量匹配算法在执行效率上平均提升了1~2倍,能更有效支持大规模动态图上的模式匹配。

关键词: 动态图, 图模式匹配, 增量算法, 结构分解, 大图数据

Abstract: In the age of big data, the scale of data graphs expands rapidly. The incremental graph pattern matching technique becomes a research focus, since it avoids re-computing matches in the whole data graph, as a result reducing the matching time and improving the overall execution performance when the data graph changes. However, the efficiency of existing incremental matching algorithms will reduce when dealing with large scale patterns. To tackle this problem, this paper proposes a novel incremental graph pattern matching algorithm based on structural decomposition named Inc_CFLS. The Inc_CFLS algorithm constructs an efficient index based on intermediate matched results, and the index is utilized to speed up subsequent pattern matching calculations. The Inc_CFLS algorithm further makes use of the index information to classify the edge insertion events over the data graph into different event classes, and designs query pruning optimization strategies for each of these event classes, which effectively improves the matching efficiency. The experimental results over physical-world datasets show that the proposed Inc_CFLS algorithm runs 1—2 times faster than the state-of-the-art incremental graph matching algorithm. Especially, the Inc_CFLS algorithm can effectively handle graph pattern matching analysis over large dynamic graphs.

Key words: dynamic graphs, graph pattern matching, incremental algorithm, structure decomposition, big graph data