Journal of Frontiers of Computer Science and Technology ›› 2010, Vol. 4 ›› Issue (9): 830-839. DOI: 10.3778/j.issn.1673-9418.2010.09.006

• Academic Research •

Using DTD to Optimize XPath Query over XML Data Stream*

WANG Lanye+; HONG Xiaoguang

  1. School of Computer Science and Technology, Shandong University, Jinan 250101, China
  • Online: 2010-09-09 Published: 2010-09-09
  • Contact: WANG Lanye

Abstract: Efficiently processing XPath queries over XML (extensible markup language) streams is a fundamental problem in XML data stream management. DTD (document type definition) structural information can greatly improve XML query efficiency, yet most existing algorithms do not exploit it. This paper proposes a method that uses DTD structural information to process queries over XML data streams. The method has the following features: tree automata are used to represent XPath queries; by matching the XPath tree automaton against the DTD tree, DTD nodes that do not match the query structure are marked in advance; a novel index for XML data streams, DBXSI (DTD-based XML stream index), is proposed; and during query execution, nodes and subtrees that do not match the query are skipped directly using the stream index. Related algorithms are also introduced. Experimental results show that the proposed method effectively supports XPath queries and is more efficient than previous algorithms.

Key words: XML (extensible markup language), data stream, XPath, stream index, tree automata

CLC Number:
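
The pipeline outlined in the abstract (mark in advance the DTD nodes that cannot match the query, then use a stream index to jump over their subtrees) can be illustrated with a minimal Python sketch. This is not the paper's DBXSI algorithm or its tree automaton: the DTD is reduced to a hypothetical parent-to-children map, the query is limited to a child-axis path without predicates, and the "index" is simply a jump offset stored on each start event. All names (`DTD`, `useful_paths`, `flatten`, `run_query`) and the sample data are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, simplified DTD: element name -> allowed child elements.
DTD: Dict[str, List[str]] = {
    "site": ["regions", "people"],
    "regions": ["item"],
    "item": ["name"],
    "people": ["person"],
    "person": ["name", "email"],
    "name": [],
    "email": [],
}


def useful_paths(dtd: Dict[str, List[str]], root: str,
                 steps: List[str]) -> Set[Tuple[str, ...]]:
    """Pre-mark the DTD paths that can still lead to a query match."""
    useful: Set[Tuple[str, ...]] = set()

    def walk(elem: str, depth: int, path: Tuple[str, ...]) -> bool:
        # Depth is bounded by the query length, so recursive DTDs terminate.
        if depth >= len(steps) or steps[depth] != elem:
            return False
        path = path + (elem,)
        if depth == len(steps) - 1:
            useful.add(path)
            return True
        found = False
        for child in dtd.get(elem, []):
            found = walk(child, depth + 1, path) or found
        if found:
            useful.add(path)
        return found

    walk(root, 0, ())
    return useful


def flatten(node, events: list) -> None:
    """Turn a (tag, children-or-text) tree into a flat event stream.

    Each start event stores the position just past its matching end
    event; this offset plays the role of the stream index here.
    """
    tag, content = node
    start = len(events)
    events.append(["start", tag, None])
    if isinstance(content, str):
        events.append(["text", content, None])
    else:
        for child in content:
            flatten(child, events)
    events.append(["end", tag, None])
    events[start][2] = len(events)


def run_query(events: list, useful: Set[Tuple[str, ...]], n_steps: int):
    """Return matched text values and the number of events inspected."""
    results, path, i, visited = [], (), 0, 0
    while i < len(events):
        kind, value, jump = events[i]
        visited += 1
        if kind == "start":
            candidate = path + (value,)
            if candidate not in useful:
                i = jump  # skip the whole non-matching subtree
                continue
            path = candidate
        elif kind == "text":
            if len(path) == n_steps:
                results.append(value)
        else:  # "end" event of an element we descended into
            path = path[:-1]
        i += 1
    return results, visited


# Sample document and the query /site/people/person/name.
doc = ("site", [
    ("regions", [("item", [("name", "Bike")])]),
    ("people", [
        ("person", [("name", "Ann"), ("email", "a@x")]),
        ("person", [("name", "Bob"), ("email", "b@x")]),
    ]),
])
steps = ["site", "people", "person", "name"]
events: list = []
flatten(doc, events)
marked = useful_paths(DTD, "site", steps)
results, visited = run_query(events, marked, len(steps))
print(results, visited, len(events))
```

In this sketch the `regions` subtree and every `email` element are never descended into, so fewer events are inspected than the stream contains. A real stream index would have to be built as the data arrives, which this in-memory example does not attempt.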