Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (8): 1359-1389. DOI: 10.3778/j.issn.1673-9418.2012109

• Survey · Exploration •


Review of Pre-training Techniques for Natural Language Processing

CHEN Deguang, MA Jinlin, MA Ziping, ZHOU Jie   

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
    2. School of Mathematics and Information Science, North Minzu University, Yinchuan 750021, China
    3. Key Laboratory for Intelligent Processing of Computer Images and Graphics of National Ethnic Affairs Commission of the PRC, Yinchuan 750021, China
  • Online:2021-08-01 Published:2021-08-02


Abstract:

Most published surveys of pre-training techniques for natural language processing cover only neural network pre-training, or give traditional pre-training techniques only a cursory treatment, which artificially severs the development of pre-training from the broader history of natural language processing. To avoid this, this paper takes the development of natural language pre-training as its main thread and organizes the work in four parts. First, traditional pre-training techniques and neural network pre-training techniques are introduced in the order in which they evolved; their characteristics are analyzed and compared, and from this comparison the development trajectory and trends of natural language processing technology are summarized. Second, natural language processing models that improve on BERT (bidirectional encoder representations from transformers) are introduced from two aspects, and these models are summarized in terms of pre-training mechanism, strengths and weaknesses, and performance. Third, the main application fields of natural language processing are presented, and the challenges currently facing natural language processing, together with corresponding solutions, are discussed. Finally, the paper concludes its work and predicts future research directions. The aim is to help researchers understand the development of natural language pre-training techniques more comprehensively and to provide ideas for the design of new models and new pre-training methods.
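
The pre-training-then-transfer paradigm that the survey traces can be made concrete with a short sketch. The example below is illustrative only and rests on assumptions not made by the paper: it uses the Hugging Face transformers library and the public bert-base-uncased checkpoint to demonstrate BERT's masked-language-model pre-training objective, in which the model reconstructs a token hidden behind [MASK].

    # Minimal sketch of BERT's masked-language-model objective.
    # Assumptions (not from the paper): the Hugging Face `transformers`
    # library and the public `bert-base-uncased` checkpoint.
    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    # Hide one token and let the pre-trained model recover it from context.
    text = "Pre-training produces [MASK] representations of language."
    inputs = tokenizer(text, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Locate the [MASK] position and take the model's top prediction there.
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    predicted_id = logits[0, mask_pos].argmax(dim=-1)
    print(tokenizer.decode(predicted_id))

The contrast with the traditional techniques the survey covers is the point of the sketch: a static word embedding assigns each word one vector regardless of context, whereas the prediction above depends on the whole sentence, which is what Transformer-based pre-training added.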

Key words: pre-training techniques, natural language processing, neural network