Journal of Frontiers of Computer Science and Technology

• Academic Research •


Entity relation extraction method integrating pre-trained model and attention

LI Zhijie, HAN Ruirui, LI Changhua, ZHANG Jie, SHI Haoqi   

  1. School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China


Abstract: Entity relation extraction aims to detect entities and the relations between entity pairs in unstructured documents, and is an important step in constructing a domain knowledge graph. To address the poor semantic expression ability of existing extraction models and their low accuracy on overlapping triples, this paper studies joint entity relation extraction integrating a pre-trained model and attention, and decomposes the extraction task into two tagging modules. The head entity tagging module encodes sentences with a pre-trained model; to further learn the internal features of sentences, a feature enhancement layer is formed from a bi-directional long short-term memory network (BiLSTM) and a self-attention mechanism. Binary classifiers serve as the decoder of the model, marking the start and end positions of head entities in a sentence. To strengthen the connection between the two tagging modules, a feature fusion layer is placed before the tail entity tagging task: head entity features and sentence vectors are fused through convolutional neural networks (CNN) and an attention mechanism, and multiple identical, independent binary classifiers determine the relations between entities and mark the tail entities. The result is JPEA, a joint extraction model integrating the pre-trained model and attention. Experimental results show that the method significantly improves extraction performance, and a comparison of the extraction task under different pre-trained models further demonstrates the superiority of the model.
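To make the two-module cascade concrete, the following PyTorch sketch outlines a pipeline of the kind the abstract describes: a pre-trained encoder with a BiLSTM and self-attention feature enhancement layer feeding binary start/end taggers for head entities, and a CNN-plus-attention fusion layer feeding per-relation binary taggers for tail entities. The encoder interface, layer sizes, fusion scheme, and relation count are illustrative assumptions, not the paper's exact JPEA implementation.

```python
# Minimal sketch of the two tagging modules, assuming a BERT-style encoder
# with a HuggingFace-like interface; all dimensions are illustrative.
import torch
import torch.nn as nn

class HeadEntityTagger(nn.Module):
    """Pre-trained encoder -> BiLSTM + self-attention -> start/end binary taggers."""
    def __init__(self, encoder, hidden=768):
        super().__init__()
        self.encoder = encoder                       # assumed BERT-style model
        self.bilstm = nn.LSTM(hidden, hidden // 2, batch_first=True,
                              bidirectional=True)    # feature enhancement layer
        self.attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.start = nn.Linear(hidden, 1)            # binary start-position tagger
        self.end = nn.Linear(hidden, 1)              # binary end-position tagger

    def forward(self, input_ids, mask):
        h = self.encoder(input_ids, attention_mask=mask).last_hidden_state
        h, _ = self.bilstm(h)                        # learn sequential features
        h, _ = self.attn(h, h, h)                    # self-attention over tokens
        return h, torch.sigmoid(self.start(h)), torch.sigmoid(self.end(h))

class TailEntityTagger(nn.Module):
    """Fuses head-entity features with sentence vectors (CNN + attention),
    then applies one pair of binary taggers per relation type."""
    def __init__(self, hidden=768, num_relations=24):
        super().__init__()
        self.conv = nn.Conv1d(2 * hidden, hidden, kernel_size=3, padding=1)
        self.attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.start = nn.Linear(hidden, num_relations)  # start tagger per relation
        self.end = nn.Linear(hidden, num_relations)    # end tagger per relation

    def forward(self, sent, head_feat):
        # Broadcast the pooled head-entity feature to every token, then fuse.
        fused = torch.cat([sent, head_feat.unsqueeze(1).expand_as(sent)], dim=-1)
        fused = self.conv(fused.transpose(1, 2)).transpose(1, 2)  # CNN fusion
        fused, _ = self.attn(fused, fused, fused)                 # attention fusion
        return torch.sigmoid(self.start(fused)), torch.sigmoid(self.end(fused))
```

In this reading, tail tagging is conditioned on the detected head entity, so a sentence can yield several (head, relation, tail) triples that share a head or a tail, which is how cascade taggers of this family handle overlapping triples.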

Key words: domain knowledge graph, pre-trained model, self-attention mechanism, feature fusion