计算机科学与探索 (Journal of Frontiers of Computer Science and Technology) ›› 2012, Vol. 6 ›› Issue (12): 1087-1097. DOI: 10.3778/j.issn.1673-9418.2012.12.003

• Academic Research •

Clustering Algorithm of Uncertain Data in Obstacle Space
(Chinese title: 障碍空间中不确定数据聚类算法)

CAO Keyan+ (曹科研), WANG Guoren (王国仁), HAN Donghong (韩东红), YUAN Ye (袁野), HU Yachao (胡雅超), QI Baolei (齐宝雷)

  1. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
  • Online: 2012-12-01 Published: 2012-12-03

Abstract: In recent years, uncertainty has become pervasive in location data, owing to imprecise data collection and to the inherent uncertainty of the data itself. Obstacles in the space pose new challenges for clustering uncertain spatial data. This paper proposes the OBS-UK-means (obstacle uncertain K-means) algorithm for clustering uncertain data in obstacle space. It also proposes two pruning strategies, based on the R-tree and the Voronoi diagram respectively, together with the concept of the shortest distance area, which greatly reduce the amount of computation. Experiments verify the efficiency and accuracy of the OBS-UK-means algorithm and show that the pruning strategies improve clustering efficiency without compromising clustering effectiveness.

Key words: clustering, uncertain data, obstacle space
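
The abstract does not spell out the steps of OBS-UK-means, so the following is only a minimal sketch of the classical UK-means idea it builds on: each uncertain object is assigned to the cluster center with the smallest expected distance. It assumes an uncertain object is given as a list of equally weighted 2-D sample positions, and it uses plain Euclidean distance as the default `dist`; an obstacle-aware shortest-path distance would be passed in instead for obstacle space. All names here (`expected_distance`, `uk_means`, and so on) are hypothetical, and the R-tree and Voronoi pruning strategies of the paper are not reproduced.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
# Assumption: an uncertain object is a set of equally weighted sample positions.
UncertainObject = Sequence[Point]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Stand-in distance; an obstacle-aware distance would replace this."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def expected_distance(obj: UncertainObject, center: Point, dist: Distance) -> float:
    """Average distance from the object's samples to a cluster center."""
    return sum(dist(p, center) for p in obj) / len(obj)


def assign(objects: Sequence[UncertainObject], centers: Sequence[Point],
           dist: Distance) -> List[int]:
    """Label each object with the index of its nearest center in expectation."""
    return [
        min(range(len(centers)),
            key=lambda j: expected_distance(obj, centers[j], dist))
        for obj in objects
    ]


def mean_point(points: Sequence[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def update_centers(objects: Sequence[UncertainObject], labels: Sequence[int],
                   centers: Sequence[Point]) -> List[Point]:
    """Move each center to the mean of its members' expected positions.

    Caveat: with obstacles, a mean position may not minimize the
    obstacle-aware cost and can even fall inside an obstacle.
    """
    new_centers = []
    for j, old in enumerate(centers):
        members = [mean_point(o) for o, lab in zip(objects, labels) if lab == j]
        new_centers.append(mean_point(members) if members else old)
    return new_centers


def uk_means(objects: Sequence[UncertainObject], centers: Sequence[Point],
             dist: Distance = euclidean,
             max_iter: int = 100) -> Tuple[List[int], List[Point]]:
    """Alternate assignment and center update until labels stop changing."""
    centers = list(centers)
    labels = assign(objects, centers, dist)
    for _ in range(max_iter):
        centers = update_centers(objects, labels, centers)
        new_labels = assign(objects, centers, dist)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


if __name__ == "__main__":
    objs = [
        [(0.0, 0.0), (1.0, 0.0)],
        [(0.0, 1.0), (1.0, 1.0)],
        [(10.0, 10.0), (11.0, 10.0)],
        [(10.0, 11.0), (11.0, 11.0)],
    ]
    print(uk_means(objs, [(0.0, 0.0), (10.0, 10.0)]))
```

Keeping `dist` as a parameter is a design choice for illustration: the expected-distance loop stays unchanged while the distance function carries the obstacle handling. The cost of evaluating many expected distances in this loop is the kind of computation that pruning strategies aim to reduce.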