Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (7): 1690-1699. DOI: 10.3778/j.issn.1673-9418.2111103

• Artificial Intelligence · Pattern Recognition •

Dual Set Prediction Networks Based Joint Extraction of Entity and Relation

PENG Yanfei, WANG Ruihua, ZHANG Ruisi

  1. School of Electronics and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125100, China
  • Online: 2023-07-01 Published: 2023-07-01

Abstract: Entity and relation extraction aims to identify entities and the relations between them in unstructured text, and it is a core technique for building and updating large-scale knowledge graphs. Among existing joint extraction methods, those that decode triples in parallel generate triples efficiently through set prediction. However, they ignore the interaction between entities and relations and between subject and object entities, which leads to invalid triples. To address this problem, this paper proposes a joint entity and relation extraction model based on dual set prediction networks. To strengthen the interaction between relations and entities, the dual set prediction networks decode triples in parallel while generating the entity information and the relation type of each triple sequentially: the first set prediction network models the set of triples and decodes the subject and object information of each triple, and the second models the set of triple embeddings fused with that subject-object information and decodes the relation type between subject and object. For the subject and object entities, an entity filter is designed that predicts the subject-object correlation between entities in a sentence and filters out triples whose subject-object correlation is low. Experiments on the public NYT (New York Times) and WebNLG datasets show that, with BERT as the encoder, the proposed model outperforms the baseline model in accuracy and F1, which verifies its effectiveness.

Key words: joint extraction of entity and relation, dual set prediction networks, entity filter, parallel decoding
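
The entity filter described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Triple` layout, the `filter_triples` function, the correlation matrix format, and the 0.5 threshold are all assumptions made for illustration, since the abstract only states that triples with low subject-object correlation are filtered out.

```python
from typing import List, NamedTuple, Sequence


class Triple(NamedTuple):
    subject: int   # index of the subject entity in the sentence's entity list
    relation: str  # predicted relation type (hypothetical label names below)
    obj: int       # index of the object entity


def filter_triples(
    triples: Sequence[Triple],
    correlation: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off; the abstract gives no value
) -> List[Triple]:
    """Keep triples whose subject-object correlation reaches the threshold.

    correlation[i][j] is an assumed score in [0, 1] that entity i is the
    subject and entity j the object of the same triple.
    """
    return [t for t in triples if correlation[t.subject][t.obj] >= threshold]


# Toy candidates, standing in for the output of the two decoding stages.
candidates = [
    Triple(0, "born_in", 1),
    Triple(1, "born_in", 0),
    Triple(0, "works_for", 2),
]
# Toy subject-object correlation scores for three entities.
scores = [
    [0.0, 0.9, 0.7],
    [0.1, 0.0, 0.2],
    [0.3, 0.4, 0.0],
]
kept = filter_triples(candidates, scores)
```

In this toy input the reversed pair (entity 1 as subject, entity 0 as object) scores 0.1 and is dropped, while the two triples with entity 0 as subject are kept. In the paper the scores come from a learned predictor over the sentence; here they are hard-coded.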