Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (7): 1725-1747. DOI: 10.3778/j.issn.1673-9418.2311027

• Frontiers · Surveys •

Survey of Neural Machine Translation Based on Knowledge Distillation

MA Chang, TIAN Yonghong, ZHENG Xiaoli, SUN Kangkang   

  1. College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010000, China
  • Online: 2024-07-01    Published: 2024-06-28

Abstract: Machine translation (MT) is the process of using a computer to convert text in one language into semantically equivalent text in another language. With the advent of neural networks, neural machine translation (NMT) has achieved remarkable success in automatic translation and artificial intelligence. Because traditional neural translation models suffer from redundant parameters and structures, knowledge distillation (KD) has been introduced to compress NMT models and accelerate their inference, an approach that has attracted wide attention in machine learning and natural language processing. This paper systematically surveys and compares translation models that incorporate knowledge distillation, from the perspectives of evaluation metrics and technical innovations. Firstly, it briefly reviews the development history, mainstream frameworks and evaluation metrics of machine translation. Secondly, knowledge distillation techniques are introduced in detail. Thirdly, the development directions of knowledge-distillation-based neural machine translation are detailed from four perspectives: multilingual models, multimodal translation, low-resource languages, and autoregressive versus non-autoregressive models, and the state of research in other related fields is briefly summarized. Finally, open problems in existing large language models, zero-resource languages and multimodal machine translation are analyzed, and future trends in neural machine translation are discussed.
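To make the KD objective mentioned above concrete, the following PyTorch sketch shows word-level knowledge distillation for NMT as commonly formulated in the literature (it is an illustration under our own assumptions, not code from the surveyed paper): the student is trained on a weighted sum of the usual cross-entropy against the reference translation and the KL divergence between the teacher's and student's softened per-token distributions. The function name, the temperature T and the mixing weight alpha are hypothetical choices for this sketch.

```python
import torch
import torch.nn.functional as F

def word_level_kd_loss(student_logits, teacher_logits, gold_ids,
                       pad_id, T=2.0, alpha=0.5):
    """Word-level KD loss for NMT (illustrative sketch).

    student_logits, teacher_logits: (batch, tgt_len, vocab) raw logits
    gold_ids: (batch, tgt_len) reference target token ids
    pad_id: id of the padding token, excluded from both loss terms
    """
    vocab = student_logits.size(-1)
    # Standard cross-entropy against the reference translation.
    ce = F.cross_entropy(student_logits.reshape(-1, vocab),
                         gold_ids.reshape(-1), ignore_index=pad_id)
    # Softened distributions; T > 1 exposes the teacher's probability
    # mass over near-synonymous target words ("dark knowledge").
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # Per-position KL divergence KL(teacher || student).
    kd = F.kl_div(log_p_student, p_teacher, reduction="none").sum(-1)
    # Average only over non-padding target positions.
    mask = gold_ids.ne(pad_id).float()
    kd = (kd * mask).sum() / mask.sum()
    # T^2 rescales the KD gradients back to the cross-entropy scale,
    # following Hinton et al. (2015).
    return alpha * (T * T) * kd + (1.0 - alpha) * ce
```

Sequence-level KD, the other standard variant discussed in this line of work, instead replaces (or augments) the reference translations with the teacher's beam-search outputs and trains the student on them with ordinary cross-entropy; no change to the loss function itself is required.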

Key words: machine translation, neural machine translation, knowledge distillation, model compression