Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (2): 506-515. DOI: 10.3778/j.issn.1673-9418.2210078

• Artificial Intelligence · Pattern Recognition •

Named Entity Recognition Based on Multi-scale Attention

TANG Ruixue, QIN Yongbin, CHEN Yanping   

  1. School of Information, Guizhou University of Finance and Economics, Guiyang 550025, China
  2. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  3. State Key Laboratory of Public Big Data, Guiyang 550025, China
  • Online: 2024-02-01 Published: 2024-02-01

Abstract: Accurate named entity recognition (NER) benefits many downstream tasks in natural language processing. Text contains a large amount of nested semantics, which makes named entities hard to recognize and remains a difficult problem in the field. Previous studies extract features at a single scale and under-utilize boundary information. They overlook many details that are visible at other scales, which leads to incorrect or missed entities. To address these problems, a multi-scale attention method for named entity recognition (MSA-NER) is proposed. First, a BERT model produces representation vectors that contain context information, and a BiLSTM network strengthens the contextual representation of the text. Second, the representation vectors are enumerated and concatenated to form a span information matrix, and direction information is fused in to obtain richer interaction information. Third, multi-head attention constructs multiple subspaces, and two-dimensional convolution selectively aggregates text information at different scales within each subspace, so that multi-scale feature fusion takes place in every attention layer. Finally, the fused matrix is used for span classification to identify named entities. Experimental results show that the proposed method reaches F1 scores of 81.7% and 86.8% on the GENIA and ACE2005 English datasets, respectively, which is better recognition performance than existing mainstream models.

Key words: named entity recognition (NER), nested semantics, multi-scale attention, convolutional neural network, subspace
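
The second and third steps of the abstract (span matrix construction and per-subspace multi-scale convolution) can be illustrated with a small pure-Python sketch. It is not the authors' implementation. All function names are hypothetical, the token vectors are hand-written stand-ins for BERT and BiLSTM outputs, the "subspaces" are plain channel slices rather than learned attention heads, the convolution uses a fixed averaging kernel instead of learned weights, and direction information and span classification are left out.

```python
from typing import List

Vector = List[float]
Grid = List[List[Vector]]  # grid[i][j] is the feature vector of span (i, j)


def build_span_matrix(tokens: List[Vector]) -> Grid:
    """Enumerate every (start, end) pair and concatenate the two token vectors.

    Cells with start > end are not valid spans and are filled with zeros.
    """
    n, d = len(tokens), len(tokens[0])
    zero = [0.0] * (2 * d)
    return [
        [tokens[i] + tokens[j] if i <= j else list(zero) for j in range(n)]
        for i in range(n)
    ]


def split_heads(grid: Grid, num_heads: int) -> List[Grid]:
    """Slice the channel dimension into equal subspaces, one per head."""
    channels = len(grid[0][0])
    assert channels % num_heads == 0, "channels must divide evenly across heads"
    width = channels // num_heads
    return [
        [[cell[h * width:(h + 1) * width] for cell in row] for row in grid]
        for h in range(num_heads)
    ]


def mean_conv2d(grid: Grid, kernel: int) -> Grid:
    """2D convolution with a fixed averaging kernel (odd size) and zero padding.

    A trained model would learn the kernel weights; averaging keeps this short.
    """
    n, c, r = len(grid), len(grid[0][0]), kernel // 2
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            acc = [0.0] * c
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < n and 0 <= x < n:
                        for k in range(c):
                            acc[k] += grid[y][x][k]
            row.append([v / (kernel * kernel) for v in acc])
        out.append(row)
    return out


def multi_scale_fuse(grid: Grid, kernels: List[int]) -> Grid:
    """Give each head its own kernel size, then concatenate the heads again."""
    heads = split_heads(grid, num_heads=len(kernels))
    scaled = [mean_conv2d(head, k) for head, k in zip(heads, kernels)]
    n = len(grid)
    return [
        [[v for head in scaled for v in head[i][j]] for j in range(n)]
        for i in range(n)
    ]


# Three tokens with 2-dimensional embeddings (assumed values for illustration).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
spans = build_span_matrix(tokens)                 # 3 x 3 grid, 4 channels per cell
fused = multi_scale_fuse(spans, kernels=[1, 3])   # head 0: 1x1, head 1: 3x3
```

Here each cell of `fused` keeps the channel count of `spans`, but the second half of the channels now summarizes a 3x3 neighbourhood of adjacent spans while the first half is unchanged. In a real model the fused grid would then go to a span classifier.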