Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (3): 780-794. DOI: 10.3778/j.issn.1673-9418.2211052

• Artificial Intelligence·Pattern Recognition •

Named Entity Recognition Model Based on k-best Viterbi Decoupling Knowledge Distillation

ZHAO Honglei, TANG Huanling, ZHANG Yu, SUN Xueyuan, LU Mingyu

  1. School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai, Shandong 264005, China
  2. School of Computer Science and Technology, Shandong Technology and Business University, Yantai, Shandong 264005, China
  3. Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Yantai, Shandong 264005, China
  4. Key Laboratory of Intelligent Information Processing in Universities of Shandong (Shandong Technology and Business University), Yantai, Shandong 264005, China
  5. Information Science and Technology College, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Online: 2024-03-01 Published: 2024-03-01

Abstract: Knowledge distillation is a common way to improve the performance of named entity recognition (NER) models. However, the terms of the classical knowledge distillation loss function are coupled, which degrades logit (output-layer) distillation. To remove this coupling and make logit distillation more effective, this paper proposes k-best Viterbi decoupling knowledge distillation (kvDKD), a decoupled knowledge distillation method that uses the k-best Viterbi algorithm to improve computational efficiency while improving model performance. In addition, deep-learning-based NER is prone to introducing noise during data augmentation. A data augmentation method that combines data filtering with an entity rebalancing algorithm is therefore proposed, aiming to reduce both the noise carried over from the original dataset and the mislabeling in the augmented data, thereby improving dataset quality and reducing overfitting. Building on these methods, a new named entity recognition model, NER-kvDKD (named entity recognition model based on k-best Viterbi decoupling knowledge distillation), is proposed. Comparative experiments on the MSRA, Resume, Weibo, CLUENER and CoNLL-2003 datasets show that the proposed method improves the generalization ability of the model and effectively improves the performance of the student model.

Key words: named entity recognition (NER), knowledge distillation, k-best Viterbi decoding, data augmentation
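
The abstract names k-best Viterbi decoding but does not describe how the k best label sequences are obtained. As a rough illustration only, the sketch below shows a generic k-best Viterbi decoder for a linear-chain model. It assumes per-position emission scores and a tag-to-tag transition matrix; the function name `k_best_viterbi` and the parameters `emissions`, `transitions` and `k` are hypothetical and do not come from the paper. It is not the authors' kvDKD implementation, and it does not cover the distillation loss.

```python
import heapq


def k_best_viterbi(emissions, transitions, k):
    """Return up to k highest-scoring tag paths as (score, path), best first.

    emissions[t][j]   : score of tag j at position t (assumed input format)
    transitions[i][j] : score of moving from tag i to tag j (assumed input format)
    """
    n_tags = len(emissions[0])
    # beams[t][j] keeps up to k entries (score, prev_tag, prev_rank)
    # for the best partial paths that end in tag j at position t.
    beams = [[[(emissions[0][j], -1, -1)] for j in range(n_tags)]]
    for t in range(1, len(emissions)):
        row = []
        for j in range(n_tags):
            # Extend every kept partial path from every previous tag.
            candidates = [
                (score + transitions[i][j] + emissions[t][j], i, r)
                for i in range(n_tags)
                for r, (score, _, _) in enumerate(beams[t - 1][i])
            ]
            row.append(heapq.nlargest(k, candidates))
        beams.append(row)

    # Select the k best complete paths over all final tags.
    finals = heapq.nlargest(
        k,
        (
            (score, j, r)
            for j in range(n_tags)
            for r, (score, _, _) in enumerate(beams[-1][j])
        ),
    )

    results = []
    for score, tag, rank in finals:
        path = []
        # Follow the back-pointers from the last position to the first.
        for t in range(len(emissions) - 1, -1, -1):
            path.append(tag)
            _, tag, rank = beams[t][tag][rank]
        results.append((score, path[::-1]))
    return results


# Toy example with 3 positions and 2 tags (made-up scores).
toy_emissions = [[3, 0], [0, 4], [2, 1]]
toy_transitions = [[1, -2], [-1, 3]]
print(k_best_viterbi(toy_emissions, toy_transitions, k=3))
# [(11, [1, 1, 1]), (9, [0, 1, 1]), (8, [1, 1, 0])]
```

With k = 1 this reduces to ordinary Viterbi decoding. Keeping k entries per tag and position raises the cost by roughly a factor of k, which is far cheaper than enumerating every possible label sequence.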