Journal of Frontiers of Computer Science and Technology, 2016, Vol. 10, Issue (9): 1310-1319. DOI: 10.3778/j.issn.1673-9418.1509086

Entity Relation Extraction Based on Rule Inference Engine

XUE Lijuan, XI Menglong, WANG Mengjie, WANG Haofen, RUAN Tong+   

  1. College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Online: 2016-09-01 Published: 2016-09-05

基于规则推理引擎的实体关系抽取研究

薛丽娟,席梦隆,王梦婕,王昊奋,阮  彤+   

  1. 华东理工大学 信息科学与工程学院,上海 200237

Abstract: Entity relation extraction refers to extracting semantic relationships between entities from unstructured natural language text and expressing them in a structured form. Traditional entity relation extraction methods focus on only one particular type of data source and rely on large amounts of manually labeled training data to train the extraction model; such manual labeling is labor-intensive and time-consuming. This paper therefore proposes a method that integrates diverse data sources and combines them with a rule-based inference engine to discover relation triples. More precisely, the method integrates structured and unstructured data sources: starting from a small number of seeds provided by the structured data, the rule-based inference engine infers a large number of entity relationships. The newly inferred entity relationships are then fed as seeds to distantly supervise the learning process that extracts entity relationships from unstructured text. The final entity relationships are obtained through multiple iterations. The experimental results show the effectiveness of the proposed method.

Key words: relation extraction, relation reasoning, distant supervision, rule-based inference engine
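
The two central steps named in the abstract, rule-based inference over seed triples and distant supervision on raw text, can be illustrated with a minimal Python sketch. It is not the paper's implementation. The rule format (composition of two relations), the relation names, the toy entities, and the substring-based entity matching are all assumptions made for illustration, and the step that trains an extractor and feeds its output back as new seeds is left out.

```python
from itertools import product

# Hypothetical composition rules (not from the paper):
# (a, r1, b) and (b, r2, c) together imply (a, RULES[(r1, r2)], c).
RULES = {("father_of", "father_of"): "grandfather_of"}


def infer(triples, rules):
    """Forward-chain the composition rules until no new triple appears."""
    known = set(triples)
    while True:
        new = {
            (a, rules[(r1, r2)], c)
            for (a, r1, b), (b2, r2, c) in product(known, repeat=2)
            if b == b2 and (r1, r2) in rules
        } - known
        if not new:
            return known
        known |= new


def distant_label(sentences, triples):
    """Distant supervision heuristic: a sentence that mentions both entities
    of a known triple is labeled with that triple's relation.
    Entity matching is plain substring search, a deliberate simplification."""
    labeled = []
    for s in sentences:
        for h, r, t in sorted(triples):
            if h in s and t in s:
                labeled.append((s, h, r, t))
    return labeled


# Toy seeds standing in for triples taken from a structured source.
seeds = {("Al", "father_of", "Bo"), ("Bo", "father_of", "Cy")}
facts = infer(seeds, RULES)

# Toy sentences standing in for the unstructured text.
sentences = ["Al often visits his grandson Cy.", "Bo lives in Shanghai."]
training_examples = distant_label(sentences, facts)
```

In this toy run the inferred triple ("Al", "grandfather_of", "Cy") is what lets the first sentence be labeled, since no seed triple connects Al and Cy directly. In a full pipeline, triples extracted by the trained model would be added to the seed set and the two steps repeated.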
