Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2018, Vol. 12 ›› Issue (7): 1064-1074. DOI: 10.3778/j.issn.1673-9418.1709092

• Database Technology •

Mutual Information Based Modeling and Completion of Correlations in Knowledge Graphs

XIA Wei (夏维), WANG Shanlei (王珊蕾), YIN Zidu (尹子都), YUE Kun (岳昆)

  1. School of Information Science and Engineering, Yunnan University, Kunming 650500, China
  • Online: 2018-07-01 Published: 2018-07-06

Abstract:

Completing the missing relationships between entities in a knowledge graph (KG) is an active topic in KG research. With the rapid development of Web 2.0, the associations between entities reflected in user-generated data (UGD) usefully complement the knowledge described in a KG. Existing path-based KG reasoning methods suffer from sparse or erroneous entity relations and poor connectivity, so the relationships they extract between entities are inaccurate. To address this problem, this paper proposes a method that completes a KG by using the associations between entities in UGD. First, starting from the UGD, mutual information is used to compute the associations between entity nodes, from which an entity association graph (EAG) is built. Second, an association-impact superposition method is given to quantify the potential association between non-adjacent entities in the EAG, yielding an association impact value. Finally, the multiple association impact values between a pair of non-adjacent entity nodes are superposed once more to determine whether a strong potential association exists between the entities; adding edges between such entity nodes completes the KG. Experimental results on real data sets show that the proposed KG completion method is efficient and effective.

Key words: knowledge graph, completion, user-generated data, mutual information, association impact
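The first step of the method, building an EAG from mutual information over UGD, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes that each UGD record is a set of the entities it mentions, that the association between two entities is the mutual information of their presence/absence indicators across records, and that an edge is kept when this value exceeds a cutoff. The names `mutual_information`, `build_eag`, and `threshold` are hypothetical.

```python
import math
from itertools import combinations


def mutual_information(records, a, b):
    """Mutual information (in bits) between the presence indicators of a and b.

    Assumption: `records` is a list of sets, each holding the entities
    mentioned in one piece of user-generated data.
    """
    n = len(records)
    # Joint counts over (a present?, b present?).
    counts = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    for rec in records:
        counts[(int(a in rec), int(b in rec))] += 1

    mi = 0.0
    for (x, y), c in counts.items():
        if c == 0:
            continue  # the term 0 * log(0) is taken as 0
        p_xy = c / n
        p_x = (counts[(x, 0)] + counts[(x, 1)]) / n  # marginal of a's indicator
        p_y = (counts[(0, y)] + counts[(1, y)]) / n  # marginal of b's indicator
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def build_eag(records, threshold):
    """Return {frozenset({a, b}): mi} for entity pairs whose MI exceeds threshold.

    Assumption: `threshold` is an illustrative cutoff, not a value from the paper.
    """
    entities = sorted(set().union(*records))
    edges = {}
    for a, b in combinations(entities, 2):
        mi = mutual_information(records, a, b)
        if mi > threshold:
            edges[frozenset((a, b))] = mi
    return edges
```

For example, `build_eag([{"A", "B"}, {"A", "B"}, {"C"}, {"C"}], 0.5)` keeps the pair A and B, which always co-occur. Mutual information over indicators also scores strong negative dependence (here, A and C never co-occur and receive the same value), so a practical pipeline under these assumptions would likely need an additional check on the direction of the dependence.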