Journal of Frontiers of Computer Science and Technology ›› 2016, Vol. 10 ›› Issue (4): 451-465. DOI: 10.3778/j.issn.1673-9418.1507072

• Academic Research •

MPPIE: RDFS Parallel Inference Framework Based on Message Passing

LV Xiaoling1,2, WANG Xin1,2,3+, FENG Zhiyong1,2, RAO Guozheng1,2, ZHANG Xiaowang1,2, XU Guangquan1,2   

  1. School of Computer Science and Technology, Tianjin University, Tianjin 300072, China
    2. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300072, China
    3. General Data Technology Co., Ltd., Tianjin 300384, China
  • Online: 2016-04-01 Published: 2016-04-01

Abstract: With the rapid development of the Semantic Web, the volume of published RDF (resource description framework) data has grown explosively, which makes reasoning over large-scale semantic data a serious challenge. This paper proposes a new parallel inference scheme for RDFS (RDF schema) based on a message passing mechanism. The graph structure of RDF data is exploited to model the RDFS inference process as edge addition on the graph. Computation is vertex-centric: each vertex sends reasoning messages to other vertices, according to the applicable inference pattern, to carry out the inference. The computation terminates once every newly derived triple has been added to the original RDF graph as an edge. MPPIE (message passing parallel inference engine), an RDFS parallel inference framework, is implemented on top of Giraph, an open source framework based on the message passing model. Experimental results on both the benchmark dataset LUBM and the real-world dataset DBpedia show that MPPIE runs an order of magnitude faster than WebPIE, the best-performing scalable semantic inference engine at the time, and that it scales well.

Key words: resource description framework (RDF), RDFS inference, message passing, Pregel, parallel inference
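
The edge-addition view of RDFS inference described in the abstract can be illustrated with a small single-process sketch. This is not MPPIE's algorithm and does not use the Giraph API. It is a plain-Python simulation of vertex-centric supersteps, restricted by assumption to two standard RDFS rules: rdfs9 (type inheritance through rdfs:subClassOf) and rdfs11 (transitivity of rdfs:subClassOf). The function name `rdfs_closure` and the sample triples are hypothetical.

```python
from collections import defaultdict

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Return the input triples plus everything derivable by rdfs9 and rdfs11.

    Illustrative simulation only: every vertex resends all of its messages
    in every superstep, which is simple but not efficient.
    """
    # out_edges[v] holds the (predicate, object) pairs of edges leaving v.
    out_edges = defaultdict(set)
    for s, p, o in triples:
        out_edges[s].add((p, o))

    while True:
        # Phase 1: each vertex sends (sender, predicate) along its
        # rdf:type and rdfs:subClassOf edges.
        inbox = defaultdict(list)
        for v, edges in out_edges.items():
            for p, target in edges:
                if p in (TYPE, SUBCLASS):
                    inbox[target].append((v, p))

        # Phase 2: each receiving vertex combines its messages with its own
        # rdfs:subClassOf edges and requests new edges for the senders.
        requests = set()
        for v, messages in inbox.items():
            supers = [o for p, o in out_edges.get(v, ()) if p == SUBCLASS]
            for sender, p in messages:
                for c in supers:
                    requests.add((sender, p, c))

        # Phase 3: apply the edge additions; stop once nothing new is derived.
        added = False
        for s, p, o in requests:
            if (p, o) not in out_edges[s]:
                out_edges[s].add((p, o))
                added = True
        if not added:
            break

    return {(s, p, o) for s, edges in out_edges.items() for p, o in edges}


sample = [
    ("alice", TYPE, "Student"),
    ("Student", SUBCLASS, "Person"),
    ("Person", SUBCLASS, "Agent"),
]
closure = rdfs_closure(sample)
```

In this sketch the loop ends when a superstep adds no new edge, which mirrors the termination condition stated in the abstract. A real Pregel-style system would distribute the vertices across workers and activate only the vertices that received messages.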