Journal of Frontiers of Computer Science and Technology, 2021, Vol. 15, Issue (10): 1843-1869. DOI: 10.3778/j.issn.1673-9418.2106095

• Knowledge-Based Question Answering Systems •

Survey of Open-Domain Knowledge Graph Question Answering

CHEN Zirui, WANG Xin, WANG Lin, XU Dawei, JIA Yongzhe   

  1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
  2. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China
  3. Tianjin TechFantasy Co., Ltd., Tianjin 300457, China
  • Online: 2021-10-01   Published: 2021-09-30

开放领域知识图谱问答研究综述

陈子睿,王鑫,王林,徐大为,贾勇哲

  1. 天津大学 智能与计算学部,天津 300350
  2. 天津市认知计算与应用重点实验室,天津 300350
  3. 天津泰凡科技有限公司,天津 300457

Abstract:

Knowledge graph question answering (KGQA) is the process of interpreting natural language questions posed by users and retrieving the relevant answers from some form of knowledge graph (KG). Constrained by knowledge scale, computing power, and natural language processing capability, early knowledge base question answering systems were limited to closed-domain questions. In recent years, with the development of KGs and the construction of open-domain question answering (QA) datasets, KGs have been applied to open-domain QA research and practice. This paper surveys open-domain KGQA along the line of its technical development. First, five rule- and template-based KGQA methods are reviewed: traditional semantic parsing, traditional information retrieval, triplet matching, utterance templates, and query templates. These methods rely mainly on manually defined rules and templates to complete the QA task. Second, five deep learning-based KGQA methods are introduced, which use neural network models to complete the subtasks of the QA process: knowledge graph embedding, memory networks, neural network-based semantic parsing, neural network-based query graphs, and neural network-based information retrieval. Third, four general-domain KGs and eleven open-domain QA datasets commonly used in KGQA are described. Fourth, three classic KGQA datasets are selected according to question difficulty, and the performance metrics of KGQA systems on them are compared to analyze the performance differences among the above methods. Finally, future research directions for open-domain KGQA are discussed.

Key words: knowledge graph question answering (KGQA), open-domain, deep learning, semantic parsing, information retrieval

摘要:

知识图谱问答是通过处理用户提出的自然语言问题,基于知识图谱的某种形式,从中获取相关答案的过程。由于知识规模、计算能力及自然语言处理能力的制约,早期知识库问答系统被应用于限定领域。近年来,随着知识图谱的发展,以及开放领域问答数据集的陆续提出,知识图谱已用于开放领域问答研究与实践。以技术发展为主线,对开放领域知识图谱问答进行综述。首先,介绍五种基于规则模板的开放领域知识图谱问答方法:传统语义解析、传统信息检索、三元组匹配、话语模板和查询模板,这类方法主要依赖人工定义的规则模板完成问答工作。其次,描述五种基于深度学习的方法,这类方法采用神经网络模型完成问答过程的各类子任务,包括知识图谱嵌入、记忆网络、基于神经网络的语义解析、基于神经网络的查询图、基于神经网络的信息检索。接着,介绍开放领域知识图谱问答常用的4个通用领域知识图谱和11个开放领域问答数据集。随后,按照问题的难易程度选择3个经典问答数据集比较各问答系统的性能指标,对比不同方法间的性能差异并进行分析。最后,展望开放领域知识图谱问答的未来研究方向。

关键词: 知识图谱问答(KGQA), 开放领域, 深度学习, 语义解析, 信息检索