计算机科学与探索 ›› 2016, Vol. 10 ›› Issue (7): 936-947.DOI: 10.3778/j.issn.1673-9418.1506091

• 数据库技术 • 上一篇    下一篇

结合非空间属性的通用Skyline查询处理技术

王海翔1,郑吉平1,2,3+,王永阁1   

  1. 1. 南京航空航天大学 计算机科学与技术学院,南京 211106
    2. 南京大学 计算机软件新技术国家重点实验室,南京 210023
    3. 新南威尔士大学 计算机科学与工程学院,悉尼,澳大利亚 NSW 2052
  • 出版日期:2016-07-01 发布日期:2016-07-01

General Skyline Query Processing Technology Combining with Non-spatial Attributes

WANG Haixiang1, ZHENG Jiping1,2,3+, WANG Yongge1   

  1. 1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
    3. School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
  • Online:2016-07-01 Published:2016-07-01

摘要: Skyline查询作为多目标决策的重要手段之一,近年来在各个领域得到广泛的应用。提出了结合非空间属性的通用Skyline查询处理技术,采用R树对设施集及数据集建立索引,并提出了两种方法来计算Skyline。第一种是基于全最近邻算法的扩展,通过计算静态Skyline结果来裁剪部分数据集。另一种是基于渐进最近邻的算法,采用查询点导向的搜索方法,利用静态Skyline结果计算与每一类设施最远的距离,将其作为边界阈值对数据点集进行裁剪,采用数据点导向的搜索方法,为裁剪后的每一个数据点计算距其最近的设施,并将数据点与设施的距离映射到多维距离空间中,结合非空间属性进行Skyline计算。实验结果表明,第二种方法减少了I/O次数,降低了CPU执行时间,提高了计算效率。

关键词: 通用Skyline查询, R树索引, 非空间属性, 最近邻

Abstract: As one of the important methods for multi-criteria decision making problems, the Skyline query processing technologies have been widely studied in many applications. This paper proposes the general Skyline query processing technology combining with non-spatial attributes while R tree is adopted to index the facility set and the dataset. Two algorithms are provided to compute the Skyline results. The first algorithm is the extension of Skyline query algorithm based on all nearest neighbor method, which cuts off part of the dataset by computing the static Skyline results. The second algorithm is based on incremental nearest neighbor method, which utilizes the facility oriented searching method. The algorithm calculates the farthest distances between the static Skyline results and each type of facilities. The distances are set to bound thresholds so as to cut off data points which are farther than them. For the left points, data oriented search method is used to compute the nearest facilities of all types. After that, the distances between the data points and the facilities are mapped to the multi-dimensional distance space so as to compute the Skyline result combining with non-spatial attributes. The experimental results show that for the second algorithm the number of I/Os and CPU time are greatly reduced thus improves the computational efficiency.

Key words: general Skyline queries, R-tree index, non-spatial attributes, nearest neighbor