计算机科学与探索 ›› 2015, Vol. 9 ›› Issue (1): 43-50.DOI: 10.3778/j.issn.1673-9418.1407042

• 数据库技术 • 上一篇    下一篇

基于规则推导的正规式相交判定算法

刘  嘉+,廖湖声   

  1. 北京工业大学 计算机学院,北京 100124
  • 出版日期:2015-01-01 发布日期:2014-12-31

Intersection Checking for Regular Expressions Based on Inference System

LIU Jia+, LIAO Husheng   

  1. College of Computer Science, Beijing University of Technology, Beijing 100124, China
  • Online:2015-01-01 Published:2014-12-31

摘要: 正规式相交判定问题在扩展标记语言(extensible markup language,XML)类型检查中起着十分重要的作用。传统方法是将其转化为自动机的相交问题,在转化过程中会产生大量计算。基于XML模式语言的特点,提出了一种基于规则推导的正规式相交判定算法。该算法直接根据输入的正规式进行推导而无需进行任何转化计算。对于一般的正规式,尽管其仍然是指数级算法,但无需进行复杂的构造自动机的计算;而对于一些特殊的正规式,特别是在XML类型检查中广泛使用的One-Unambiguous正规式,该算法的时间复杂度降为多项式级。最后证明了该算法所使用的推导规则的正确性和完备性。

关键词: XML类型检查, 正规式, 相交判定, 推导规则

Abstract: Decision problem of intersection checking for regular expressions plays an important role in the extensible markup language (XML) type checking. The typical technique converts the problem of intersection checking into the problem of automata intersection, which may generate a lot of redundant computing during the conversion. According to the features of XML schema languages, this paper proposes a new intersection checking algorithm based on inference system for regular expressions. This algorithm is derived directly based on regular expressions without any conversion. For general regular expressions, this algorithm is an exponential time algorithm, but without constructing automata, and for some special cases, especially for the One-Unambiguous regular expression used in XML type checking, it is the polynomial time algorithm. Finally, this paper proves the correctness and completeness of the inference rules.

Key words: XML type checking, regular expression, intersection checking, inference rule