Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2017, Vol. 11 ›› Issue (8): 1288-1295. DOI: 10.3778/j.issn.1673-9418.1706033

• Artificial Intelligence and Pattern Recognition •

Semantic Analysis of Question Based on Shallow Parsing and Maximum Entropy

LI Dongmei1+, ZHANG Qi1, WANG Xuan2, TAN Wen1

  1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  2. School of Information, Renmin University of China, Beijing 100872, China
  • Online: 2017-08-01 Published: 2017-08-09

Abstract: To enable a Chinese question answering system to recognize the semantics of questions accurately and efficiently, this paper builds a biomedical domain ontology and, on that basis, proposes a semantic analysis algorithm that combines shallow parsing with a maximum entropy model. The algorithm first performs semantic chunk recognition on the natural language question. If recognition succeeds, a question vector is formed and a SPARQL query is run against the ontology. If recognition fails, the maximum entropy model is invoked to determine the semantic roles in the question. The maximum entropy model is trained on a semantically annotated corpus and extracts semantic chunk features to determine the most probable sentence pattern; the resulting question vector is then used to query the ontology and obtain the answer. In experiments comparing it with other methods, the new algorithm achieves higher precision and recall.

Key words: Chinese question answering system, ontology, shallow parsing, maximum entropy, SPARQL query
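
The two-stage flow described in the abstract (chunk recognition first, maximum entropy fallback second, then a SPARQL query) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The lexicon, the `ex:` namespace, the relation labels, and the hand-set weights are all hypothetical stand-ins; English tokens are used for readability even though the paper targets Chinese questions; and the "sentence pattern" is reduced here to a single relation label. A real system would learn the weights from the annotated corpus rather than hard-code them.

```python
import math

# Hypothetical namespace for an illustrative biomedical ontology.
PREFIX = "PREFIX ex: <http://example.org/biomed#>"

# Hypothetical lexicon used by the rule-based chunk recognizer.
ENTITIES = {"aspirin", "diabetes"}
RELATION_WORDS = {
    "treat": "treats", "treats": "treats",
    "symptom": "hasSymptom", "symptoms": "hasSymptom",
}

# Hypothetical stand-in for trained maximum entropy parameters:
# one weight per (token feature, label) pair.
LABELS = ["treats", "hasSymptom"]
WEIGHTS = {
    ("cure", "treats"): 2.0,
    ("show", "hasSymptom"): 1.5,
    ("up", "hasSymptom"): 0.5,
}


def tokenize(question):
    """Lowercase, drop a trailing question mark, split on whitespace."""
    return question.lower().rstrip("?").split()


def recognize_chunks(tokens):
    """Rule-based stage: look up an entity chunk and a relation chunk."""
    entity = next((t for t in tokens if t in ENTITIES), None)
    relation = next(
        (RELATION_WORDS[t] for t in tokens if t in RELATION_WORDS), None
    )
    return entity, relation


def maxent_predict(tokens):
    """Fallback stage: p(label | tokens) = exp(sum of weights) / Z."""
    scores = {
        label: sum(WEIGHTS.get((t, label), 0.0) for t in tokens)
        for label in LABELS
    }
    z = sum(math.exp(s) for s in scores.values())
    probs = {label: math.exp(s) / z for label, s in scores.items()}
    best = max(probs, key=probs.get)
    return best, probs[best]


def build_sparql(entity, relation):
    """Turn the (entity, relation) question vector into a SPARQL query."""
    return (
        f"{PREFIX}\n"
        f"SELECT ?answer WHERE {{ ex:{entity} ex:{relation} ?answer . }}"
    )


def analyze(question):
    """Chunk recognition first; maximum entropy only if no relation is found."""
    tokens = tokenize(question)
    entity, relation = recognize_chunks(tokens)
    if entity is None:
        return None  # nothing to query about
    if relation is None:
        relation, _ = maxent_predict(tokens)
    return build_sparql(entity, relation)
```

With these assumed tables, `analyze("What does aspirin treat?")` is resolved by the rule-based stage alone, while `analyze("How does diabetes show up?")` contains no relation word from the lexicon and therefore goes through `maxent_predict` before the query is built.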