Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (8): 2180-2189. DOI: 10.3778/j.issn.1673-9418.2307060

• Artificial Intelligence · Pattern Recognition •


Few-Shot Named Entity Recognition with Prefix-Tuning

LYU Haixiao, LI Yihong, ZHOU Xiaoyi   

  1. School of Cyberspace Security, Hainan University, Haikou 570228, China
  • Online: 2024-08-01  Published: 2024-07-29



Abstract: Few-shot named entity recognition (NER) commonly relies on similarity-based metrics. To fully exploit the knowledge transferred through model parameters, this paper proposes a prefix-tuning method for few-shot NER (P-NER). Feature vectors of the input text are fed into an embedding module for feature extraction. The vector parameters of the prefix prompts are concatenated to the front end of the encoding-layer model, whose parameters are kept frozen. The encoder outputs are decoded with a cross-entropy model; for each training sample, two sub-models are sampled, and the model's predictions are regularized by minimizing the relative entropy between the two sub-models. Agreement between the predicted label of each word and its actual label is measured by comparing the output probability with the true-label probability, yielding the final classification results. Experimental results show that on the CoNLL2003 dataset, the method achieves an average F1 score of 84.92% for in-domain few-shot entity recognition, and it outperforms the baseline methods on all three cross-domain few-shot datasets: MIT Movie, MIT Restaurant, and ATIS. The method thus significantly improves few-shot NER while tuning only 2.9% of the parameters updated by previous fine-tuning methods.
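As a rough illustration of the training objective described in the abstract (this is not the authors' implementation: the toy linear "encoder", all dimensions, and every function name here are invented for the sketch), the per-token loss combines cross-entropy against the gold label with a symmetric relative-entropy term between two dropout-sampled sub-models, while trainable prefix vectors are concatenated in front of the frozen encoder's input:

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # Relative entropy KL(p || q) between two label distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def sub_model_probs(hidden, classifier, dropout_rate, rng):
    # One stochastic sub-model: dropout on the encoder output, then a
    # linear classifier and a softmax over the entity labels.
    scale = 1.0 / (1.0 - dropout_rate)
    dropped = [h * (0.0 if rng.random() < dropout_rate else scale) for h in hidden]
    logits = [sum(w * h for w, h in zip(row, dropped)) for row in classifier]
    return softmax(logits)

def p_ner_token_loss(prefix, token_vec, encoder, classifier, gold, alpha, rng,
                     dropout_rate=0.3):
    # Prefix-tuning: trainable prefix vectors are concatenated in front of
    # the token's features; the encoder weights themselves stay frozen.
    inputs = prefix + token_vec
    hidden = [sum(w * x for w, x in zip(row, inputs)) for row in encoder]
    # Sample two sub-models for the same training sample (two dropout passes).
    p1 = sub_model_probs(hidden, classifier, dropout_rate, rng)
    p2 = sub_model_probs(hidden, classifier, dropout_rate, rng)
    # Cross-entropy against the gold label for both passes...
    ce = -(math.log(p1[gold]) + math.log(p2[gold]))
    # ...plus the symmetric relative entropy between the two predictions,
    # which regularizes the model toward consistent outputs.
    sym_kl = 0.5 * (kl_divergence(p1, p2) + kl_divergence(p2, p1))
    return ce + alpha * sym_kl
```

In a real system the frozen encoder would be a pretrained Transformer and the prefix would be prepended at every layer; the sketch only shows how the regularized objective is assembled from the two sub-model predictions.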

Key words: named entity recognition (NER), few-shot learning, prompt learning