Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (2): 486-495. DOI: 10.3778/j.issn.1673-9418.2211011

• Artificial Intelligence · Pattern Recognition •

Knowledge Graph Link Prediction Fusing Description and Structural Features

CHEN Jiaxing, HU Zhiwei, LI Ru, HAN Xiaoqi, LU Jiang, YAN Zhichao   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
  • Online: 2024-02-01    Published: 2024-02-01

Abstract: Knowledge graphs are generally incomplete, which makes link prediction an important research topic for knowledge graphs. Existing models focus only on the embedding representation of triples: on the one hand, the entity and relation embeddings fed to the model are randomly initialized and do not incorporate the description information of entities and relations, so semantic information is lost; on the other hand, during decoding, the influence of the structural features of the triple itself on the link prediction results is ignored. To address these problems, this paper proposes BFGAT (graph attention network link prediction based on the fusion of description information and structural features), a knowledge graph link prediction model that fuses description information and structural features. BFGAT uses the pretrained BERT model to encode the descriptions of entities and relations and integrates this description information into their embedding representations, which addresses the loss of semantic information. During encoding, a graph attention mechanism aggregates information from adjacent nodes so that the target node obtains richer information. During decoding, the embedding representations of a triple are concatenated into a matrix and scored with a CNN-based convolution and pooling method, which captures the structural features of the triple. Detailed experiments on the public datasets FB15k-237 and WN18RR show that BFGAT can effectively improve the performance of knowledge graph link prediction.

Key words: knowledge graph, link prediction, BERT, convolutional neural networks (CNN)
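
For illustration only, the following minimal PyTorch sketch shows the kind of CNN-based decoding step described in the abstract: the head, relation, and tail embeddings are stacked into a matrix and scored with convolution and pooling. It is not the authors' implementation; the class name TripleCNNDecoder, the embedding dimension, the filter count, and the kernel size are all assumptions.

import torch
import torch.nn as nn

class TripleCNNDecoder(nn.Module):
    def __init__(self, embed_dim: int = 200, num_filters: int = 64):
        super().__init__()
        # Treat each triple as a 1-channel "image" of shape (3, embed_dim):
        # one row each for the head, relation, and tail embedding.
        self.conv = nn.Conv2d(1, num_filters, kernel_size=(3, 3), padding=(0, 1))
        self.pool = nn.AdaptiveMaxPool2d((1, 1))
        self.fc = nn.Linear(num_filters, 1)

    def forward(self, h, r, t):
        # h, r, t: (batch, embed_dim) embeddings produced by the encoder.
        x = torch.stack([h, r, t], dim=1).unsqueeze(1)  # (batch, 1, 3, embed_dim)
        x = torch.relu(self.conv(x))                    # (batch, filters, 1, embed_dim)
        x = self.pool(x).flatten(1)                     # (batch, filters)
        return self.fc(x).squeeze(-1)                   # one plausibility score per triple

# Toy usage: score a batch of 4 triples with random 200-dimensional embeddings.
decoder = TripleCNNDecoder()
h, r, t = (torch.randn(4, 200) for _ in range(3))
print(decoder(h, r, t).shape)  # torch.Size([4])

In a full pipeline, h, r, and t would come from the description-aware graph attention encoder rather than random tensors, and the scores would be trained with a standard ranking or cross-entropy loss over corrupted triples.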