Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2020, Vol. 14 ›› Issue (9): 1554-1562. DOI: 10.3778/j.issn.1673-9418.1909066

• Artificial Intelligence •

Distant Supervision Relation Extraction Combining Attention Mechanism and Ontology

LI Yanjuan (李艳娟), ZANG Mingzhe (臧明哲), LIU Xiaoyan (刘晓燕), LIU Yang (刘扬), GUO Maozu (郭茂祖)

  1. School of Information and Computer Engineering, Northeast Forestry University, Harbin 150000, China
  2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  3. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  • Online: 2020-09-01 Published: 2020-09-07

Abstract:

Relation extraction identifies relations in unstructured text and outputs them in structured form. To improve extraction accuracy and reduce the dependence on manual annotation, this paper proposes a distant supervision relation extraction model based on an attention mechanism and ontology: attention piecewise convolutional neural networks with ontology restriction (APCNNs+OR). The model consists of a feature engineering extraction module, a classifier module, and an ontology restriction layer. In the classifier module, the instance-level attention mechanism is introduced and improved so that the model better learns the weight of each sentence in a data bag, which effectively reduces both the noise introduced by the distant supervision assumption and the interference from the words between the two entities in a sentence. In the ontology restriction layer, a domain ontology is introduced to constrain the extraction results, which improves the accuracy of the extracted relations. Experimental results on the SemMed and GoldStandard corpora show that the model effectively reduces the noise caused by wrong labels and achieves better relation extraction performance than existing models.

Key words: relation extraction, ontology, distant supervision, attention mechanism
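
The two ideas named in the abstract, instance-level attention over the sentences in a bag and an ontology restriction on the predicted relation, can be illustrated with a minimal Python sketch. This is not the authors' APCNNs+OR implementation: the abstract does not give the attention formula or the ontology rules, so the dot-product scoring, the fallback behavior, and every name below (`bag_representation`, `ontology_restrict`, the relation labels and entity types) are assumptions chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bag_representation(sentence_vecs, relation_query):
    """Instance-level attention (assumed dot-product form).

    Each sentence vector in the bag is scored against a relation query
    vector; the softmax of the scores gives per-sentence weights, and the
    bag vector is the weighted sum of the sentence vectors.
    """
    weights = softmax([dot(s, relation_query) for s in sentence_vecs])
    dim = len(sentence_vecs[0])
    bag = [sum(w * s[i] for w, s in zip(weights, sentence_vecs))
           for i in range(dim)]
    return bag, weights


def ontology_restrict(relation_probs, head_type, tail_type, allowed):
    """Ontology restriction (assumed masking form).

    Relations whose (head type, tail type) signature is not permitted by
    the ontology get probability 0; the rest are renormalized. If nothing
    is permitted, the unconstrained scores are returned unchanged (an
    assumption of this sketch, not something stated in the paper).
    """
    masked = {
        r: (p if (head_type, tail_type) in allowed.get(r, set()) else 0.0)
        for r, p in relation_probs.items()
    }
    total = sum(masked.values())
    if total == 0.0:
        return dict(relation_probs)
    return {r: p / total for r, p in masked.items()}


# Hypothetical toy data: three sentence vectors mentioning the same entity pair.
bag = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [2.0, 0.0]
bag_vec, weights = bag_representation(bag, query)
# The second sentence is least similar to the query, so it gets the
# smallest weight and contributes least to bag_vec.

# Hypothetical ontology: which (head type, tail type) pairs each relation allows.
allowed = {
    "TREATS": {("Drug", "Disease")},
    "CAUSES": {("Drug", "Disease"), ("Gene", "Disease")},
    "LOCATION_OF": {("Organ", "Disease")},
}
probs = {"TREATS": 0.5, "LOCATION_OF": 0.3, "CAUSES": 0.2}
restricted = ontology_restrict(probs, "Drug", "Disease", allowed)
# LOCATION_OF is ruled out for a (Drug, Disease) pair; the remaining
# probability mass is shared between TREATS and CAUSES.
```

In this sketch the attention step down-weights sentences that are likely mislabeled by the distant supervision assumption, while the restriction step acts as a post-hoc filter on the classifier output. A real system would learn the sentence vectors and the query jointly and would take the type signatures from a domain ontology rather than a hand-written dictionary.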