Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (10): 1870-1879. DOI: 10.3778/j.issn.1673-9418.2106094

• Knowledge Question Answering Systems •

融合子图结构的神经推理式知识库问答方法

陈子阳,廖劲智,赵翔,陈盈果   

  1. 国防科技大学 信息系统工程重点实验室,长沙 410073
  • 出版日期:2021-10-01 发布日期:2021-09-30

Incorporating Subgraph Structure Knowledge Base Question Answering via Neural Reasoning

CHEN Ziyang, LIAO Jinzhi, ZHAO Xiang, CHEN Yingguo   

  1. Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China
  • Published: 2021-10-01  Online: 2021-09-30

摘要:

知识库(或知识图谱)作为一种对现实世界的有效表征模式,引起了学术界和工业界广泛关注。近年来,随着大规模知识库的出现,知识库问答技术作为知识库的基础应用技术同样备受关注。基于语义解析的代表方法通过对查询句的解析将问题转化为图上的答案检索,但知识库中往往存在缺失的链接,导致上述过程无法顺利开展;基于神经推理的代表模型通过对问题进行编码来进行实体相似度排序,但其无法解决动态场景下的实体冷启动问题。针对上述问题,提出了一种融合子图结构的神经推理式知识库问答方法,实现了在问答推理过程中兼顾实体的语义与结构信息,从而进行更充分的推理。首先,通过预训练模型RoBERTa将问句转换为包含语义的向量;其次,根据问句中的实体构建相应的问答子图,并利用图神经网络提取子图的结构信息;再次,基于背景知识库进行实体表示预训练,并与对应的结构表示进行融合;最后,根据融合后的向量对候选答案进行评分,将评分最高的实体作为答案。在WebQuestionsSP数据集上进行了对比测试,实验结果表明,提出的模型优于其他基准模型。

关键词: 知识库问答, 神经推理, 子图结构, 图卷积网络

Abstract:

As an effective way of representing real-world knowledge, the knowledge base (or knowledge graph) has attracted wide attention from academia and industry. In recent years, with the emergence of large-scale knowledge bases, knowledge base question answering has also drawn attention as a fundamental application of knowledge bases. Representative semantic-parsing methods parse the query sentence to turn the question into answer retrieval over the graph, but knowledge bases often contain missing links, which can prevent this process from succeeding. Representative neural-reasoning models encode the question and rank entities by similarity, but they cannot handle the cold-start problem for entities in dynamic scenarios. To address these problems, a neural-reasoning knowledge base question answering method that incorporates subgraph structure is proposed. It takes both the semantic and the structural information of entities into account during question answering, which allows more thorough reasoning. First, the question is converted into a vector carrying semantic information by the pre-trained model RoBERTa. Second, a question answering subgraph is constructed from the entities in the question, and a graph neural network extracts the structural information of the subgraph. Third, entity representations are pre-trained on the background knowledge base and fused with the corresponding structural representations. Finally, the candidate answers are scored using the fused vectors, and the entity with the highest score is returned as the answer. Comparative experiments on the WebQuestionsSP dataset show that the proposed model outperforms the other baseline models.

Key words: knowledge base question answering, neural reasoning, subgraph structure, graph convolutional network
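
The four steps listed in the abstract can be illustrated with a minimal, untrained Python sketch. It is not the authors' implementation, and every component is a stand-in chosen for illustration: an average of seeded random token vectors replaces RoBERTa, a single parameter-free mean-aggregation layer replaces the graph neural network, seeded random vectors replace pre-trained entity embeddings, fusion is assumed to be concatenation, and the scoring rule is an arbitrary choice. The function names (`encode_question`, `build_subgraph`, `graph_conv`, `answer`) and the toy triples are hypothetical.

```python
import random

DIM = 8  # toy embedding size (assumption, not from the paper)

def toy_vector(key, dim=DIM):
    """Deterministic pseudo-random vector standing in for a learned embedding."""
    rng = random.Random(key)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def encode_question(question):
    """Step 1 stand-in: average of per-token toy vectors (the paper uses RoBERTa)."""
    vecs = [toy_vector("tok:" + t) for t in question.lower().split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def build_subgraph(triples, topic_entities, hops=2):
    """Step 2a: keep the triples reachable within `hops` of the topic entities."""
    frontier, nodes, edges = set(topic_entities), set(topic_entities), set()
    for _ in range(hops):
        reached = set()
        for h, r, t in triples:
            if h in frontier or t in frontier:
                edges.add((h, r, t))
                reached.update({h, t})
        frontier = reached - nodes
        nodes |= reached
    return nodes, edges

def graph_conv(nodes, edges, features):
    """Step 2b: one mean-aggregation layer over each node and its neighbours."""
    neigh = {n: {n} for n in nodes}  # self-loop for every node
    for h, _, t in edges:
        neigh[h].add(t)
        neigh[t].add(h)
    return {
        n: [sum(features[m][i] for m in sorted(neigh[n])) / len(neigh[n])
            for i in range(len(features[n]))]
        for n in nodes
    }

def answer(question, topic_entities, triples):
    """Steps 1-4: encode, extract structure, fuse, score, pick the best entity."""
    q_vec = encode_question(question)
    nodes, edges = build_subgraph(triples, topic_entities)
    # Initial structural feature: 1.0 for topic entities, 0.0 otherwise.
    feats = {n: [1.0 if n in topic_entities else 0.0] for n in nodes}
    struct = graph_conv(nodes, edges, feats)
    scores = {}
    for cand in sorted(nodes - set(topic_entities)):
        # Step 3 (assumed fusion): concatenate entity and structural vectors.
        fused = toy_vector("ent:" + cand) + struct[cand]
        # Step 4 (assumed scorer): semantic dot product plus structural signal.
        semantic = sum(a * b for a, b in zip(q_vec, fused[:DIM]))
        scores[cand] = semantic + sum(fused[DIM:])
    best = max(scores, key=scores.get)
    return best, scores

# Toy knowledge base (hypothetical triples).
TRIPLES = [
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Barack Obama", "spouse", "Michelle Obama"),
    ("Paris", "capital_of", "France"),
]

best, scores = answer("where was Barack Obama born", {"Barack Obama"}, TRIPLES)
print(best, scores)
```

Because nothing here is trained, the resulting ranking is arbitrary. The sketch only shows how a semantic signal (question versus entity embedding) and a structural signal (proximity to the topic entity in the subgraph) can be combined into one score per candidate.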