Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (7): 1439-1461. DOI: 10.3778/j.issn.1673-9418.2108105

• Survey · Exploration •

Review of Knowledge-Enhanced Pre-trained Language Models

HAN Yi1, QIAO Linbo2, LI Dongsheng2, LIAO Xiangke2,+

  1. College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
    2. College of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2021-08-30  Revised: 2022-03-22  Online: 2022-07-01  Published: 2022-07-25
  • About the authors: HAN Yi, born in 1993 in Qingdao, Shandong, Ph.D., lecturer. His research interests include natural language processing, knowledge graph, etc.
    QIAO Linbo, born in 1987 in Wanzhou, Chongqing, Ph.D., assistant researcher. His research interests include structured sparse learning, distributed optimization, deep learning, etc.
    LI Dongsheng, born in 1978 in Tongcheng, Anhui, Ph.D., professor, Ph.D. supervisor. His research interests include parallel and distributed computing, cloud computing, large-scale data management, etc.
    LIAO Xiangke, born in 1963 in Lianyuan, Hunan, Ph.D., professor, Ph.D. supervisor. His research interests include parallel and distributed computing, high-performance computer systems, operating systems, etc.
  • Supported by:
    the Open Fund of Science and Technology on Parallel and Distributed Processing Laboratory (PDL) (6142110200203); the Open Fund of Science and Technology on Parallel and Distributed Processing Laboratory (PDL) (WDZC20205500101)

Abstract:

Knowledge-enhanced pre-trained language models use the structured knowledge stored in knowledge graphs to strengthen pre-trained language models, so that they learn not only general semantic knowledge from free text but also the factual entity knowledge behind the text, and can therefore handle downstream knowledge-driven tasks effectively. Although this is a promising research direction, existing work is still at an early exploratory stage, and no comprehensive survey or systematic review of it exists. To fill this gap, on the basis of collating a large body of relevant literature, this paper first explains the background of knowledge-enhanced pre-trained language models from three aspects: the reasons for, the advantages of, and the difficulties in introducing knowledge, and summarizes the basic concepts involved. It then reviews three families of knowledge-enhancement methods: using knowledge to expand the input features, using knowledge to modify the model architecture, and using knowledge to constrain the training tasks. Finally, it compiles the scores of various knowledge-enhanced pre-trained language models on standard evaluation tasks, and analyzes their performance, the current challenges, and possible future research directions.
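
As a concrete illustration of the first method family mentioned above (using knowledge to expand input features), the following minimal sketch fuses each token embedding with the embedding of its linked knowledge-graph entity, in the spirit of ERNIE-style models. It is a hypothetical example rather than code from any surveyed model; the class name, dimensions, and the tanh fusion are all illustrative assumptions.

import torch
import torch.nn as nn

class KnowledgeFusedEmbedding(nn.Module):
    """Hypothetical input-feature enhancement: fuse token and KG-entity embeddings."""
    def __init__(self, vocab_size=30522, entity_count=5000,
                 hidden_dim=768, entity_dim=100):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden_dim)
        # In practice the entity embeddings are pre-trained on the knowledge
        # graph (e.g. with TransE) and loaded here; random init for the sketch.
        self.entity_emb = nn.Embedding(entity_count + 1, entity_dim, padding_idx=0)
        # Project the concatenated token+entity vector back to the model width.
        self.fuse = nn.Linear(hidden_dim + entity_dim, hidden_dim)

    def forward(self, token_ids, entity_ids):
        # entity_ids aligns each token with a KG entity; id 0 means "no entity".
        tok = self.token_emb(token_ids)    # (batch, seq, hidden_dim)
        ent = self.entity_emb(entity_ids)  # (batch, seq, entity_dim)
        return torch.tanh(self.fuse(torch.cat([tok, ent], dim=-1)))

# Usage: two sequences of four tokens, each with a linked entity at position 1.
emb = KnowledgeFusedEmbedding()
tokens = torch.randint(0, 30522, (2, 4))
entities = torch.zeros(2, 4, dtype=torch.long)
entities[:, 1] = 42  # hypothetical entity id produced by an entity linker
print(emb(tokens, entities).shape)  # torch.Size([2, 4, 768])

The fused representation then feeds the Transformer encoder exactly as ordinary token embeddings would, which is why methods in this family can leave the backbone architecture unchanged.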

Key words: knowledge graph, pre-trained language models, natural language processing
