计算机科学与探索 ›› 2018, Vol. 12 ›› Issue (11): 1740-1747.DOI: 10.3778/j.issn.1673-9418.1709048

• 数据库技术 • 上一篇    下一篇

大规模动态图中标签约束的频繁子图Top-K查询

单晓欢,王广香,宋宝燕,丁琳琳,许岩   

  1. 辽宁大学 信息学院,沈阳 110036
  • 出版日期:2018-11-01 发布日期:2018-11-12

Frequent Subgraph Top-K Query with Label Constraint on Large-Scale Dynamic Graph

SHAN Xiaohuan, WANG Guangxiang, SONG Baoyan, DING Linlin, XU Yan   

  1. School of Information, Liaoning University, Shenyang 110036, China
  • Online:2018-11-01 Published:2018-11-12

摘要:

Top-K子图查询作为重要的图搜索技术,因可更具针对性地为用户返回查询结果而被广泛应用于社交网、生物信息网等新兴领域。随着图规模增大且动态演变,用户通常希望通过增加约束条件而快速、准确获得查询结果。鉴于上述查询需求,提出了一种标签约束的频繁子图Top-K查询方法(LVC-FS Top-K)。该方法通过建立频繁结构映射与标签值聚合的二级索引(FSM-LVA),快速准确地锁定查询图结构并根据约束限制剪枝过滤,缩小查询范围,提高查询效率;利用FSM-LVA索引对同构于查询图的频繁结构进行查找以实现频繁结构查询,同时结合查询图的约束条件及K值限制对频繁子图进行匹配筛选,缩小比较空间,加快查询效率。实验结果表明提出的方法能快速准确地在大规模动态图中进行具有约束限制的频繁子图Top-K查询。

关键词: 大规模动态图, 标签约束, 聚合划分, Top-K查询

Abstract:

As an important technology of graph search, Top-K subgraph query is widely used in some emerging fields such as social network, biological information networks and so on, because of its pointedly returning query  results to users. With the increasing of the scale and dynamic evolution of the graph, users often hope to obtain the query results quickly and accurately by adding constraints. In view of the above query requirement, this paper proposes a method called frequent subgraph Top-K query method under label value constrained (LVC-FS Top-K). This method establishes the two level index with the frequent structure mapping and label value aggregation (FSM-LVA), which will lock the query graph structure quickly and accurately, and prune and filter according to the constraint limit so that it can narrow the search range and improve the query efficiency. It uses FSM-LVA index to find isomorphic query structure, and combines with the constraint conditions and K to filter frequent subgraphs that can narrow the space and speed up the query efficiency. Experiments have proven that the method can quickly and accurately perform the frequent subgraph Top-K query with constraint limit on large-scale dynamic graphs.

Key words: large-scale dynamic graph, label constraint, aggregation partition, Top-K query