Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (8): 2174-2187. DOI: 10.3778/j.issn.1673-9418.2407077

• Artificial Intelligence · Pattern Recognition •


Study on Interpretability of Knowledge Distillation Based on Convex Hulls from the Perspective of Natural Language

ZHANG Cheng, CAO Jingxu, LYU Jinxin, YAN Dongmei   

1. School of Digital Economy and Management, Tianjin University of Finance and Economics, Tianjin 300221, China
  2. School of Science and Technology, Tianjin University of Finance and Economics, Tianjin 300221, China
• Online: 2025-08-01  Published: 2025-07-31

Abstract: Knowledge distillation (KD) is an effective model compression technique that leverages the knowledge in a larger teacher model to train a smaller student model while keeping the performance degradation within acceptable bounds. It is particularly important for large Transformer-based models, where smaller, high-performance models are better suited to scenarios that require privacy protection and edge computing. However, current research on KD focuses predominantly on the performance of student models, with insufficient exploration of the essence of knowledge transfer between teacher and student models and of the interpretability of KD itself. To fill this gap, this paper proposes a knowledge distillation interpretability framework (Exp-KD) that systematically analyzes the knowledge transfer process between teacher and student models from the perspective of natural language knowledge. Firstly, this paper distills, from the perspective of knowledge, three key questions that remain unclear in the distillation process: how to quantify knowledge, how to quantify transferred knowledge, and how to quantify the impact of transferred knowledge on the student model. Secondly, exploiting the structural similarity between natural language knowledge and convex hulls in Euclidean space, natural language knowledge is represented as a convex hull in Euclidean space, and, with reference to the three semantic concepts of meaning, central idea, and keywords, three features for quantifying natural language knowledge are proposed: knowledge scope, knowledge core, and knowledge frame. On this basis, a series of indicators, such as knowledge transfer rate, knowledge absorption rate, and knowledge expansion degree, is constructed from the two perspectives of the teacher and the student to measure how knowledge changes in the “teacher-student” paradigm, thereby forming the interpretability analysis framework for knowledge distillation. Experiments conducted on multiple datasets with large language models (Bert-base-cased and Llama2-7B) as teacher models explore the relationship between knowledge transfer and student learning outcomes, elucidating the interpretability of KD from a knowledge perspective. The results show that teacher models can typically transfer about 50% to 60% of their knowledge to student models, and that relational knowledge is the easiest for student models to absorb; this finding can guide the design of loss functions better suited to relational knowledge, thereby improving the effect of knowledge distillation.
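
To make the convex-hull representation concrete, the following is a minimal sketch, assuming the hull is built over sentence embeddings and using illustrative proxies for the three proposed features: hull volume for knowledge scope, the centroid for knowledge core, and the hull vertices for knowledge frame. These proxies, the shared low-dimensional projection, and the random stand-in embeddings are assumptions of this sketch, not the paper's exact definitions.

```python
# Hedged sketch of a convex-hull view of natural language knowledge.
# The proxies below (volume = scope, centroid = core, vertices = frame)
# are illustrative assumptions, not the paper's definitions.
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.decomposition import PCA

def hull_features(points: np.ndarray):
    """Return (scope, core, frame) proxies for one set of embeddings."""
    hull = ConvexHull(points)
    return hull.volume, points.mean(axis=0), points[hull.vertices]

rng = np.random.default_rng(0)
teacher = rng.normal(size=(200, 768))                             # stand-in teacher embeddings
student = 0.7 * teacher + rng.normal(scale=0.3, size=(200, 768))  # stand-in student embeddings

# Qhull cannot triangulate hundreds of dimensions, so both sets are
# projected with one shared PCA to keep their hulls comparable.
pca = PCA(n_components=3).fit(np.vstack([teacher, student]))
t_scope, t_core, t_frame = hull_features(pca.transform(teacher))
s_scope, s_core, s_frame = hull_features(pca.transform(student))
print(f"scope ratio (student/teacher): {s_scope / t_scope:.2f}")
```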
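A transfer-rate-style indicator can likewise be sketched as the overlap between the two hulls. The Monte Carlo estimate below of how much of the teacher's hull also lies inside the student's hull is again an assumed proxy, not the paper's metric definition; it reuses the projected point sets from the sketch above.

```python
# Hedged proxy for a knowledge-transfer-rate-style metric:
# the fraction of the teacher's hull volume covered by the student's hull.
from scipy.spatial import Delaunay
import numpy as np

def hull_overlap(teacher_pts: np.ndarray, student_pts: np.ndarray,
                 n_samples: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate of P(point in student hull | point in teacher hull)."""
    rng = np.random.default_rng(seed)
    tri_t, tri_s = Delaunay(teacher_pts), Delaunay(student_pts)
    lo, hi = teacher_pts.min(axis=0), teacher_pts.max(axis=0)
    samples = rng.uniform(lo, hi, size=(n_samples, teacher_pts.shape[1]))
    in_teacher = tri_t.find_simplex(samples) >= 0   # find_simplex < 0 means outside hull
    in_both = in_teacher & (tri_s.find_simplex(samples) >= 0)
    return in_both.sum() / max(in_teacher.sum(), 1)

# Usage, continuing from the previous sketch:
rate = hull_overlap(pca.transform(teacher), pca.transform(student))
print(f"hull-overlap transfer proxy: {rate:.2f}")
```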

Key words: natural language processing, knowledge distillation, interpretability, convex hull, knowledge representation