计算机科学与探索 ›› 2022, Vol. 16 ›› Issue (9): 2041-2049.DOI: 10.3778/j.issn.1673-9418.2102013

• 数据库技术 • 上一篇    下一篇

障碍空间中Voronoi图优化的反向近邻数聚类算法

何云斌(), 刘婉旭, 万静   

  1. 哈尔滨理工大学 计算机科学与技术学院,哈尔滨 150080
  • 收稿日期:2021-02-02 修回日期:2021-06-02 出版日期:2022-09-01 发布日期:2021-06-28
  • 通讯作者: + E-mail: hybha@163.com
  • 作者简介:何云斌(1972—),男,教授,硕士生导师,主要研究方向为数据库理论与应用。
    刘婉旭(1995—),女,硕士研究生,主要研究方向为数据库理论与应用。
    万静(1972—),女,教授,硕士生导师,主要研究方向为数据库理论与应用、空间数据库。
  • 基金资助:
    国家自然科学基金(61872105);黑龙江省教育厅科学技术研究项目(12531z004)

Optimized Number of Reverse Neighbor Clustering Algorithm by Voronoi Diagram in Obstacle Space

HE Yunbin(), LIU Wanxu, WAN Jing   

  1. College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  • Received:2021-02-02 Revised:2021-06-02 Online:2022-09-01 Published:2021-06-28
  • About author:HE Yunbin, born in 1972, professor, M.S. super-visor. His research interests include database theory and application.
    LIU Wanxu, born in 1995, M.S. candidate. Her research interests include database theory and application.
    WAN Jing, born in 1972, professor, M.S. super-visor. Her research interests include database theory and application and spatial database.
  • Supported by:
    National Natural Science Foundation of China(61872105);Science and Technology Research Project of Heilongjiang Provincial Education Department(12531z004)

摘要:

针对现有的障碍空间聚类算法需要人工选取聚类中心及设定阈值等问题,提出了一种障碍空间中Voronoi图优化的反向近邻数聚类算法(OBRK-means)。该算法从聚类中心的选取、离群点的筛选和广义覆盖圆三方面进行讨论和分析。首先,该算法引入Voronoi图来计算反向近邻数,进而确定聚类中心的候选集合;其次,利用Voronoi图和样本点密度进行数据集中离群点的筛选和剪枝;最后,引入广义覆盖圆来进行初始聚类,针对初始聚类结果不精确的问题提出内边界点和外边界点,并在内边界点和外边界点中根据公式分别计算出剔除点和拓展点来提高聚类准确性。理论研究和实验表明,该算法在处理障碍空间中的数据时具有更高的效率,能够得到更好的聚类结果。

关键词: 聚类, Voronoi图, 障碍空间, 反向近邻数

Abstract:

In order to solve the problem that the existing obstacle space clustering algorithm needs to select the clustering center and set the threshold value manually, OBRK-means (obstacle based on nearest K-means) clus-tering algorithm based on Voronoi diagram is proposed. The algorithm is discussed and analyzed from three aspects: the selection of cluster center, the selection of outliers and the generalized coverage circle. Firstly, Voronoi diagram is introduced to calculate the reverse nearest neighbor number to determine the cluster center. Secondly, Voronoi diagram and density of sample points are used to screen and prune outliers in the dataset. Finally, the generalized covering circle is introduced to carry out the initial clustering, and the inner and outer boundary points are proposed to solve the problem that the initial clustering results are not accurate. The exclusion points and expansion points are calculated respectively from the inner and outer boundary points to improve the accuracy of clustering. Theoretical research and experimental results show that the algorithm has higher efficiency in processing data in obstacle space and gets better clustering results.

Key words: clustering, Voronoi diagram, obstacle space, number of reverse neighbor

中图分类号: