Journal of Frontiers of Computer Science and Technology ›› 2017, Vol. 11 ›› Issue (4): 511-519. DOI: 10.3778/j.issn.1673-9418.1509084

• Academic Research •

融合马尔可夫聚类的实体间关系消解方法

常雨骁1,2+,庞  琳3,贾岩涛1,林海伦1,2,王元卓1,刘  悦1,刘春阳3   

  1. 中国科学院 计算技术研究所 网络数据科学与技术重点实验室,北京 100190
    2. 中国科学院大学,北京 100049
    3. 国家计算机网络应急技术处理协调中心,北京 100029
  • Publication date: 2017-04-12 Release date: 2017-04-12

Entity Relation Resolution Method by Integrating Markov Cluster Algorithm

CHANG Yuxiao1,2+, PANG Lin3, JIA Yantao1, LIN Hailun1,2, WANG Yuanzhuo1, LIU Yue1, LIU Chunyang3   

  1. Research Center of Web Data Science & Engineering, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
  • Online: 2017-04-12 Published: 2017-04-12

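Abstract (translation of the Chinese abstract): As knowledge bases built from web-scale big data continue to appear, each of them contains a massive number of entities and relations between entities. However, many relations with the same meaning do not share a unified name. To address this, an entity relation fusion method based on the Markov cluster algorithm (MCL) is proposed. The method first computes the semantic similarity between relations, then builds an undirected graph whose edge weights are those similarities, and finally clusters the graph with the Markov cluster algorithm. Experiments show that, compared with hierarchical clustering and k-means clustering, the method achieves somewhat higher clustering purity and is more convenient to use.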

Key words (translation of the Chinese key words): Markov clustering, knowledge base, relations between entities

Abstract: In recent years, knowledge bases have developed rapidly. They store entities and the relations between them at large scale. However, many relations that have the same meaning do not share the same name, so these relations need to be resolved. To this end, this paper proposes an approach based on the Markov cluster algorithm (MCL) to cluster relations with the same meaning. First, the semantic similarity between every pair of relations is calculated; these similarities are then used as edge weights to build an undirected graph. Finally, the Markov cluster algorithm is run on the graph to obtain the relation clusters. Experiments show that the proposed approach achieves higher purity than hierarchical clustering and k-means clustering.

Key words: Markov cluster, knowledge base, relation between entities
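
The pipeline summarized in the abstract (pairwise similarities, a weighted undirected graph, MCL, evaluation by purity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how semantic similarity is computed, so the similarity matrix, the relation names, the gold labels, and the parameters `inflation`, `expansion`, `max_iter`, and `tol` below are all hypothetical values chosen for illustration. The MCL loop follows the textbook expansion/inflation scheme.

```python
from collections import Counter

import numpy as np


def markov_cluster(similarity, inflation=2.0, expansion=2, max_iter=100, tol=1e-6):
    """Cluster the nodes of a weighted undirected graph with a basic MCL loop.

    `similarity` is a symmetric matrix of non-negative edge weights;
    a weight of 0 means "no edge". All parameter defaults are assumptions.
    """
    m = np.array(similarity, dtype=float)
    np.fill_diagonal(m, 1.0)             # add self-loops, a common MCL convention
    m /= m.sum(axis=0, keepdims=True)    # normalize columns to sum to 1

    for _ in range(max_iter):
        previous = m
        m = np.linalg.matrix_power(m, expansion)  # expansion: longer random walks
        m = np.power(m, inflation)                # inflation: favor strong flows
        m /= m.sum(axis=0, keepdims=True)
        if np.allclose(m, previous, atol=tol):
            break

    # Rows that keep mass act as attractors; the columns they reach form a cluster.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > 1e-8))
        if members:
            clusters.add(members)
    return sorted(clusters)


def purity(clusters, gold_labels):
    """Fraction of items that carry the majority gold label of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(gold_labels[i] for i in c).most_common(1)[0][1] for c in clusters
    )
    return majority / total


# Hypothetical relation names and similarity scores, for illustration only.
relations = ["born_in", "birthplace", "place_of_birth", "spouse", "married_to"]
similarity = [
    [0.00, 0.90, 0.80, 0.05, 0.00],
    [0.90, 0.00, 0.85, 0.00, 0.00],
    [0.80, 0.85, 0.00, 0.00, 0.05],
    [0.05, 0.00, 0.00, 0.00, 0.90],
    [0.00, 0.00, 0.05, 0.90, 0.00],
]
gold = ["birth", "birth", "birth", "marriage", "marriage"]  # hypothetical labels

clusters = markov_cluster(similarity)
for cluster in clusters:
    print([relations[i] for i in cluster])
print("purity:", purity(clusters, gold))
```

Unlike k-means, this loop does not take the number of clusters as an input; granularity is steered by `inflation` instead, which may be part of what the abstract means by the method being more convenient to use.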