Journal of Frontiers of Computer Science and Technology, 2008, Vol. 2, Issue (6): 601-613. DOI: 10.3778/j.issn.1673-9418.2008.06.004

An instance-based result schema matching technique for Deep Web resources

NIE Tiezheng, YU Ge+, SHEN Derong, KOU Yue   

  1. College of Information Science and Engineering, Northeastern University, Shenyang 110004, China
基于实例的Deep Web数据源结果模式匹配技术

聂铁铮,于 戈+,申德荣,寇 月


  1. 东北大学 信息科学与工程学院,沈阳 110004
Abstract: To address the problem of result schema matching for Deep Web data sources, an instance-based schema matching approach is presented. The approach matches and verifies the attributes of a data source's result schema and records the structural position of the data in result pages. A two-phase schema matching method based on query relaxation is used to match schema attributes more accurately, and the degree of co-occurrence between schema attributes is used to improve the precision and recall of attribute matching. Experimental results show that the instance-based approach effectively identifies the result schemas of data sources and improves the precision and recall of schema attribute matching.

Key words: Deep Web, query instance, result schema, schema matching, attribute co-occurrence
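
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea behind instance-based result schema matching: values of known query instances are looked up in the cells of returned result rows, and each column position is assigned the attribute whose values it contains most often. The function name `match_columns`, the `min_support` threshold, and the sample book data are all hypothetical, and the sketch does not cover the query relaxation or attribute co-occurrence steps.

```python
from collections import Counter, defaultdict

def match_columns(instances, result_rows, min_support=0.5):
    """Map column index -> attribute name by voting over known instance values.

    Hypothetical illustration only, not the method from the paper.
    instances:   list of dicts, attribute name -> known value (query instances)
    result_rows: list of rows, each a list of cell strings from a result page
    min_support: assumed threshold, the fraction of rows that must support a mapping
    """
    if not result_rows:
        return {}
    votes = defaultdict(Counter)  # column index -> Counter of attribute hits
    for row in result_rows:
        for col, cell in enumerate(row):
            text = cell.strip().lower()
            # Attributes with at least one known value contained in this cell.
            hit_attrs = {
                attr
                for inst in instances
                for attr, value in inst.items()
                if value.lower() in text
            }
            for attr in hit_attrs:
                votes[col][attr] += 1
    mapping = {}
    for col, counter in votes.items():
        attr, hits = counter.most_common(1)[0]
        # Keep the mapping only if enough rows agree on it.
        if hits / len(result_rows) >= min_support:
            mapping[col] = attr
    return mapping

# Made-up sample data for illustration.
instances = [
    {"title": "Database Systems", "author": "Garcia-Molina", "price": "59.99"},
    {"title": "Data Mining", "author": "Han", "price": "72.50"},
]
result_rows = [
    ["Database Systems: The Complete Book", "by Garcia-Molina", "$59.99"],
    ["Data Mining: Concepts and Techniques", "Han", "USD 72.50"],
    ["Unrelated", "Nobody", "N/A"],
]
mapping = match_columns(instances, result_rows)
```

The returned column indices stand in for the position of each attribute in the result page. A real system would presumably record positions in the page structure rather than in a pre-split row.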

摘要: 针对Deep Web数据源结果模式信息的匹配问题,提出了一种基于实例的结果模式匹配的方法。该方法能够匹配并验证数据源的结果模式属性信息,同时记录数据在结果页面中的结构信息。利用基于查询请求松弛的两段模式匹配方法精确地匹配模式属性,并基于模式属性间共现度信息来提高属性匹配的查全率和查准率。从实验结果分析可以看出,基于实例的方法能够有效地识别数据源模式信息,提高模式属性查全率和查准率。

关键词: Deep Web, 查询实例, 结果模式, 模式匹配, 属性共现度