Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (12): 2999-3009. DOI: 10.3778/j.issn.1673-9418.2208105

• Artificial Intelligence · Pattern Recognition •

Noisy Knowledge Graph Representation Learning: A Rule-Enhanced Method

SHAO Tianyang (邵天阳), XIAO Weidong (肖卫东), ZHAO Xiang (赵翔)

  1. Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
  • Online: 2023-12-01 Published: 2023-12-01

Abstract: Knowledge graphs store structured facts as triples of the form (head entity, relation, tail entity). Large-scale knowledge graphs are currently built with (semi-)automated knowledge extraction, which inevitably introduces noise that can degrade the quality of the learned knowledge representations. Most traditional representation learning methods nevertheless assume that every triple in the knowledge graph is correct and learn distributed representations on that basis, which makes noise detection on knowledge graphs a crucial task. The incompleteness of knowledge graphs has also attracted wide attention. To address both problems, this paper proposes a knowledge representation learning framework that combines logical rules with relation path information. The framework detects possible noise while learning noise-free knowledge representations, so that the two tasks reinforce each other. Specifically, it consists of a triple embedding module and a triple trustworthiness estimation module. The triple embedding module augments the structural information of triples with relation path information and logical rule information to build a more complete knowledge representation; the logical rules strengthen relation path reasoning and make the representation learning more interpretable. The triple trustworthiness estimation module then uses all three types of information to judge the trustworthiness of each triple and thereby detect possible noise. Experiments on three public benchmark datasets show that, compared with all baseline methods, the model achieves significant performance improvements on tasks such as knowledge graph noise detection and knowledge graph completion.

Key words: knowledge graph, knowledge graph completion, noise detection, triple trustworthiness, triple embedding
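
The abstract does not state the scoring function or the loss used by the framework, so the following is only an illustrative sketch of the general idea of trustworthiness-aware triple embedding. It assumes a TransE-style translation energy and a single scalar confidence per triple. The function names, the toy embeddings, and the example triples are all hypothetical, and the relation path and logical rule components of the paper are not modeled here.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)
Vector = List[float]


def energy(h: Vector, r: Vector, t: Vector) -> float:
    """Assumed TransE-style L1 energy ||h + r - t||_1 (lower = more plausible)."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def weighted_margin_loss(pos_energy: float, neg_energy: float,
                         confidence: float, margin: float = 1.0) -> float:
    """Margin ranking loss scaled by a triple confidence in [0, 1].

    A low confidence shrinks the contribution of a suspected noisy triple,
    so it influences the embeddings less during training.
    """
    return confidence * max(0.0, margin + pos_energy - neg_energy)


def rank_by_suspicion(triples: List[Triple],
                      ent: Dict[str, Vector],
                      rel: Dict[str, Vector]) -> List[Triple]:
    """Sort triples from most to least suspicious (highest energy first)."""
    return sorted(
        triples,
        key=lambda tr: energy(ent[tr[0]], rel[tr[1]], ent[tr[2]]),
        reverse=True,
    )


# Hypothetical 2-dimensional embeddings, chosen by hand for illustration.
ent = {"Paris": [1.0, 0.0], "France": [1.0, 1.0], "Tokyo": [0.0, 3.0]}
rel = {"capital_of": [0.0, 1.0]}

triples = [("Paris", "capital_of", "France"),
           ("Tokyo", "capital_of", "France")]  # second triple is injected noise

ranked = rank_by_suspicion(triples, ent, rel)
```

With these toy vectors, the injected triple ("Tokyo", "capital_of", "France") has the higher energy and is ranked as the most suspicious. In a full system the confidence passed to `weighted_margin_loss` would come from a trustworthiness estimator rather than being set by hand.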