计算机科学与探索 ›› 2010, Vol. 4 ›› Issue (10): 951-960.DOI: 10.3778/j.issn.1673-9418.2010.10.009

• 学术研究 • 上一篇    

分布式不确定数据上的概率Skyline计算*

王晓伟+;黄九鸣;贾 焰

  

  1. 国防科技大学 计算机学院, 长沙 410073

  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2010-10-01 发布日期:2010-10-01
  • 通讯作者: 王晓伟

Probabilistic Skyline Computation on Distributed Uncertain Data*

WANG Xiaowei+;HUANG Jiuming;JIA Yan

  

  1. School of Computer, National University of Defense Technology, Changsha 410073, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2010-10-01 Published:2010-10-01
  • Contact: WANG Xiaowei

摘要: 提出了分布式不确定数据上概率skyline的低通信开销算法。首先给出了一种间接的对象分布信息—— 剪枝空间, 分布节点通过共享全局剪枝空间, 能够减少通信开销。为了降低传输剪枝空间带来的额外通信开销, 对表示剪枝空间的虚拟对象集合进行基于距离的压缩。与基本算法相比, 100个分布节点时, 在真实数据集上节省了69%的通信开销; 在均匀、正相关、反相关三种标准模拟数据上分别节省60.5%、41.8%、24.5%的通信开销。

关键词: 分布式不确定数据, 概率skyline, 剪枝空间, 虚拟对象集合

Abstract: A communication-efficient algorithm of probabilistic skyline on distributed uncertain data is proposed. Firstly, an indirect data distribution summary called pruning space (PS) is introduced. By sharing the global pruning space, significant amount of communication will be reduced. In order to reduce the additional communication cost caused by PS sharing, a distance based compression method is applied to virtual object sets (VOS) which represent the pruning space. SPS (sharing pruning space) outperforms the naive algorithm by 69% for real dataset, by 60.5%, 41.8% and 24.5% respectively for the three synthetic datasets.

Key words: distributed uncertain data, probabilistic skyline, pruning space (PS), virtual object sets (VOS)

中图分类号: