Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (12): 3010-3019. DOI: 10.3778/j.issn.1673-9418.2208102

• Artificial Intelligence·Pattern Recognition •

片段级别的双编码器方面情感三元组抽取模型

张韵琪,李松达,兰于权,李东旭,赵慧   

  1. 华东师范大学 软件工程学院,上海 200062
  2. 华东师范大学 上海市高可信计算重点实验室,上海 200062
  • 出版日期:2023-12-01 发布日期:2023-12-01

Span-Level Dual-Encoder Model for Aspect Sentiment Triplet Extraction

ZHANG Yunqi, LI Songda, LAN Yuquan, LI Dongxu, ZHAO Hui   

  1. School of Software Engineering, East China Normal University, Shanghai 200062, China
  2. Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China
  • Online:2023-12-01 Published:2023-12-01

摘要: 方面情感三元组抽取(ASTE)是方面级情感分析的子任务之一,旨在识别出句子中所有的方面词及其对应的观点词和情感极性。目前,ASTE任务通过流水线模型或端到端模型完成,前者无法解决三元组方面词重叠问题,且忽视了观点词和情感极性之间的依赖关系;后者将ASTE任务分解为方面词和观点词抽取子任务以及情感极性分类子任务,通过共享编码器进行多任务学习,未区分两个子任务的特征差异,导致特征混淆问题。针对上述问题,提出了片段级别的双编码器方面情感三元组抽取模型(SD-ASTE)。该模型是流水线模型,分为两个模块。第一个模块基于片段抽取方面词和观点词,在片段特征表示中融入片段首尾和长度信息,关注方面词和观点词的边界信息;第二个模块判断方面词-观点词片段对表达的情感极性,采用基于悬浮标记的片段对特征表示方式,侧重于学习三元组各元素之间的依赖关系。模型利用两个独立编码器,分别为两模块提取不同的特征信息。多个数据集上的对比实验结果表明,该模型相较于目前最优的流水线模型和端到端模型具有更优的效果。通过有效性实验,验证了片段特征表示和片段对特征表示以及两个独立编码器的有效性。

关键词: 情感分析, 方面情感三元组抽取(ASTE), 流水线模型, 片段, 独立编码器

Abstract: Aspect sentiment triplet extraction (ASTE) is a subtask of aspect-based sentiment analysis that aims to identify every aspect term in a sentence together with its corresponding opinion term and sentiment polarity. Existing approaches use either pipeline or end-to-end models. Pipeline models cannot handle triplets that share an overlapping aspect term, and they ignore the dependency between opinion terms and sentiment polarities. End-to-end models split ASTE into an aspect-and-opinion extraction subtask and a sentiment polarity classification subtask and train both through multi-task learning on a shared encoder; because the shared encoder does not distinguish between the features of the two subtasks, it leads to feature confusion. To address these problems, SD-ASTE (span-level dual-encoder model for ASTE), a pipeline model with two modules, is proposed. The first module extracts aspect terms and opinion terms at the span level; its span feature representation incorporates span head, tail and length information to capture the boundaries of aspect terms and opinion terms. The second module determines the sentiment polarity expressed by each aspect-opinion span pair; its span-pair feature representation is based on levitated markers and focuses on the dependencies among the triplet elements. The model uses two independent encoders to extract different features for the two modules. Comparative experiments on multiple datasets show that the model outperforms the state-of-the-art pipeline and end-to-end models. Validation experiments confirm the effectiveness of the span feature representation, the span-pair feature representation and the two independent encoders.

Key words: sentiment analysis, aspect sentiment triplet extraction (ASTE), pipeline model, span, independent encoders
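
Illustrative sketch: the abstract states that the span feature representation incorporates span head, tail and length information, but it does not give the exact formulation. The Python below is one possible reading, not the authors' implementation. It assumes the three parts are simply concatenated, and the names `enumerate_spans`, `span_feature` and `length_table`, the maximum span length, and the toy vectors are all hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def enumerate_spans(num_tokens: int, max_len: int) -> List[Span]:
    """Return every candidate span of at most max_len tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_len, num_tokens))
    ]


def span_feature(token_vecs: List[Vector], span: Span,
                 length_table: List[Vector]) -> Vector:
    """Concatenate head-token vector, tail-token vector and a length embedding.

    Concatenation is an assumption made for this sketch; the abstract only
    says that head, tail and length information are incorporated.
    """
    start, end = span
    length = end - start + 1
    # length_table[k - 1] is the (hypothetical) embedding for spans of k tokens.
    return token_vecs[start] + token_vecs[end] + length_table[length - 1]


# Toy inputs: 4 tokens with 2-dimensional vectors standing in for encoder outputs.
token_vecs: List[Vector] = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
# One 1-dimensional embedding per allowed span length (1, 2 or 3 tokens).
length_table: List[Vector] = [[1.0], [2.0], [3.0]]

spans = enumerate_spans(num_tokens=len(token_vecs), max_len=3)
features: Dict[Span, Vector] = {
    span: span_feature(token_vecs, span, length_table) for span in spans
}
```

In a model of the kind described, each candidate span's feature vector would then be classified as an aspect term, an opinion term, or neither; that classifier is omitted here because the abstract gives no details of it.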