Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (4): 1047-1056. DOI: 10.3778/j.issn.1673-9418.2301061

• Artificial Intelligence·Pattern Recognition •

Potential Relationship Based Joint Entity and Relation Extraction

PENG Yanfei, ZHANG Ruisi, WANG Ruihua, GUO Jialong

  1. School of Electronics and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125100, China
  • Online: 2024-04-01 Published: 2024-04-01

Abstract: Joint entity and relation extraction identifies entities and the relations between them in a given text, and it underpins the construction and updating of knowledge graphs. Existing joint extraction methods pursue performance while overlooking the information redundancy that arises during extraction. To address this problem, a joint entity and relation extraction model based on potential relations is proposed. A new decoding scheme is designed to reduce redundant relations, entities and triples during prediction, and the extraction of triples from a sentence is divided into two steps: extracting potential entity pairs and decoding relations. First, a potential entity pair extractor predicts whether a potential relation exists between entities and selects the high-confidence entity pairs as the final potential entity pairs. Second, relation decoding is treated as a multi-label binary classification task, in which a relation decoder predicts the confidence of every relation for each potential entity pair. Finally, the number and types of relations are determined from these confidences, which completes the triple extraction. Experimental results on two general-purpose datasets show that the proposed model outperforms the baseline models in accuracy and F1, which verifies its effectiveness. Ablation experiments also confirm the contribution of each component of the model.

Key words: joint entity and relation extraction, potential relationship, potential entity pairs, multi-label binary classification task, information redundancy
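
The two-step decoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 thresholds, the relation labels and the hand-written confidence scores are all assumptions made for illustration, and the scores stand in for what the potential entity pair extractor and the relation decoder would output.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]          # (subject, object)
Triple = Tuple[str, str, str]   # (subject, relation, object)


def select_potential_pairs(pair_scores: Dict[Pair, float],
                           pair_threshold: float = 0.5) -> List[Pair]:
    """Step 1: keep only entity pairs whose 'some relation exists' confidence is high.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [pair for pair, score in pair_scores.items() if score >= pair_threshold]


def decode_relations(pairs: List[Pair],
                     relation_scores: Dict[Pair, Dict[str, float]],
                     rel_threshold: float = 0.5) -> List[Triple]:
    """Step 2: multi-label binary classification over relations.

    Each relation is an independent yes/no decision per pair, so a pair
    may yield zero, one, or several triples.
    """
    triples: List[Triple] = []
    for subj, obj in pairs:
        for relation, score in relation_scores.get((subj, obj), {}).items():
            if score >= rel_threshold:
                triples.append((subj, relation, obj))
    return triples


def extract_triples(pair_scores: Dict[Pair, float],
                    relation_scores: Dict[Pair, Dict[str, float]],
                    pair_threshold: float = 0.5,
                    rel_threshold: float = 0.5) -> List[Triple]:
    """Run both steps: filter potential pairs, then decode their relations."""
    pairs = select_potential_pairs(pair_scores, pair_threshold)
    return decode_relations(pairs, relation_scores, rel_threshold)


# Made-up scores standing in for model outputs (illustration only).
pair_scores = {
    ("Paris", "France"): 0.93,
    ("France", "Paris"): 0.88,
    ("Paris", "Seine"): 0.12,   # filtered out in step 1
}
relation_scores = {
    ("Paris", "France"): {"capital_of": 0.91, "located_in": 0.84, "born_in": 0.03},
    ("France", "Paris"): {"capital_of": 0.05, "contains": 0.77},
    ("Paris", "Seine"): {"located_in": 0.60},  # never decoded: pair was discarded
}

triples = extract_triples(pair_scores, relation_scores)
print(triples)
```

In this sketch the pair ("Paris", "Seine") is discarded in the first step, so its relation scores are never examined. This is the kind of redundant prediction the decoding scheme aims to avoid. The pair ("Paris", "France") yields two triples because each relation is decided independently.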