Journal of Frontiers of Computer Science and Technology ›› 2020, Vol. 14 ›› Issue (8): 1307-1314. DOI: 10.3778/j.issn.1673-9418.1908036

• Database Technology •

Research of Skyline Query Method for Hidden Web Database

LI Zhengyu, LI Gui², CAO Keyan

  1. School of Computer Science & Engineering, Northeastern University, Shenyang 110004, China
  2. Faculty of Information & Control Engineering, Shenyang Jianzhu University, Shenyang 110168, China
  • Online: 2020-08-01  Published: 2020-08-07

Abstract:

Discovering the Skyline of a "hidden" server-side database through its Web interface can enable many new applications in Web information integration. Such access is constrained in several ways: the number of requests a client may issue per IP address is limited, a top-k query returns at most k of the matching tuples, and the Web interface supports only certain query types. Under these constraints the Skyline of a hidden Web database can still be obtained with a basic query method, but the query cost is too high. To address this, the paper proposes a heuristic method for retrieving the Skyline of a hidden Web database with mixed attributes. It first uses a parallel coordinate system to analyze the intersection properties of Skyline tuples, then constructs a heuristic search decomposition tree over intersecting tuples and proves that the tree is guaranteed to discover all Skyline tuples. Finally, it gives heuristic methods for typical Web interface types. Theoretical analysis and experimental results confirm the effectiveness of the heuristic algorithm and its advantage over the basic query method.

Key words: database Skyline, hidden Web database, intersecting tuples, parallel coordinate system, search decomposition tree
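
The abstract does not define the Skyline itself, so the following minimal sketch shows the standard notion it builds on: a tuple belongs to the Skyline if no other tuple dominates it. The sketch assumes that smaller values are preferred on every attribute and that all tuples are available locally. Neither assumption comes from the paper, whose setting only allows restricted top-k access through a Web interface. The names `dominates` and `skyline` and the sample data are hypothetical, and this is not the heuristic method the paper proposes.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every attribute and
    strictly better on at least one (assumption: smaller is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(rows: List[Row]) -> List[Row]:
    """Naive quadratic Skyline: keep the rows that no other row dominates."""
    return [r for r in rows if not any(dominates(o, r) for o in rows)]


# Hypothetical (price, distance) tuples, for illustration only.
rows = [(50, 8), (60, 3), (70, 9), (55, 8), (90, 1)]
result = skyline(rows)
print(result)
```

Here `(70, 9)` and `(55, 8)` are dropped because `(50, 8)` dominates both. In the hidden-database setting this full scan is not possible, because each query returns at most k tuples and the number of queries is limited, which is why query cost becomes the main concern.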