Journal of Frontiers of Computer Science and Technology ›› 2017, Vol. 11 ›› Issue (7): 1114-1121. DOI: 10.3778/j.issn.1673-9418.1605034

• Artificial Intelligence and Pattern Recognition •

Method for Chinese Parsing Based on Fusion of Multiple Structural Information

ZHAO Guorong, WANG Wenjian+

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Online: 2017-07-01 Published: 2017-07-07

Abstract: Syntactic parsing is a fundamental technology for natural language understanding and a cornerstone of deep language understanding. The grammar models behind commonly used parsing methods rest on the assumption of a context-free grammar. In fact, the nodes of a phrase structure tree are strongly context-dependent, and making full use of this structural information can further improve parsing accuracy. This paper fuses multiple kinds of structural information from the syntactic tree (annotating non-terminal nodes with the labels of their parent node and of their left and right sister nodes) to strengthen the contextual constraints on grammar rules, and then parses sentences with structural support vector machines (SSVMs). Experimental results show that the proposed method can resolve structural ambiguity and improve parsing precision and F1 score.

Key words: structural support vector machines, context-free grammar, structural context dependency, Chinese parsing
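The annotation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree encoding (nested tuples), the label format `X[p=...,l=...,r=...]`, the function names, the choice to annotate only phrasal nodes (leaving part-of-speech tags untouched), and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple, Union

# A tree is either a word (str) or a (label, children) pair.
Tree = Union[str, Tuple[str, List["Tree"]]]

NONE = "-"  # placeholder used when a parent or sister node is absent


def base_label(node: Tree) -> str:
    """Return the original label of a node (the word itself for leaves)."""
    return node if isinstance(node, str) else node[0]


def is_phrasal(node: Tree) -> bool:
    """A node is phrasal if at least one of its children is a non-terminal."""
    return not isinstance(node, str) and any(
        not isinstance(child, str) for child in node[1]
    )


def annotate(node: Tree, parent: str = NONE,
             left: str = NONE, right: str = NONE) -> Tree:
    """Attach parent and left/right sister labels to every phrasal node."""
    if isinstance(node, str):
        return node  # words are left unchanged
    label, children = node
    new_children = []
    for i, child in enumerate(children):
        # Sisters are read from the ORIGINAL labels, not the annotated ones.
        l = base_label(children[i - 1]) if i > 0 else NONE
        r = base_label(children[i + 1]) if i + 1 < len(children) else NONE
        new_children.append(annotate(child, label, l, r))
    if is_phrasal(node):
        label = f"{label}[p={parent},l={left},r={right}]"
    return (label, new_children)


# Hypothetical example sentence: "我 喜欢 音乐" ("I like music").
tree: Tree = ("IP", [
    ("NP", [("PN", ["我"])]),
    ("VP", [("VV", ["喜欢"]), ("NP", [("NN", ["音乐"])])]),
])

annotated = annotate(tree)
print(annotated)
```

In this sketch the subject NP and the object NP, which share one label under a plain context-free grammar, receive different annotated labels because their parents and sisters differ. Rules read off the annotated tree are therefore more specific, which is the kind of contextual constraint the abstract refers to.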