Optimized Density Peak Clustering Algorithm by Natural Reverse Nearest Neighbor

doi:10.3778/j.issn.1673-9418.2007017

Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (10): 1888-1899. DOI: 10.3778/j.issn.1673-9418.2007017

LIU Juan, WAN Jing

College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China

Online: 2021-10-01 Published: 2021-09-30

Abstract:

The density peak clustering algorithm is a density-based clustering algorithm. It is sensitive to its parameters and produces poor results on complex manifold data. To address these shortcomings, this paper proposes a new density peak clustering algorithm based on the natural reverse nearest neighbor structure. First, the algorithm uses reverse nearest neighbors to compute the local density of each data object. Second, it selects the initial cluster centers by combining representative points with density. Third, it measures the distance between initial cluster centers with a density-adaptive distance, builds a decision graph over the initial cluster centers from the reverse-nearest-neighbor local density and the density-adaptive distance, and selects the final cluster centers from that decision graph. Finally, each remaining data object is assigned to the cluster of its nearest initial cluster center. Experiments on synthetic data sets and real UCI data sets show that the algorithm clusters more effectively and more accurately than the comparison algorithms, and that its advantage is most pronounced on complex manifold data.
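The abstract names the steps of the pipeline but not its formulas, so the following is only a rough sketch of the general idea, not the authors' method. It assumes the commonly used natural-neighbor stopping rule (grow the neighborhood size r until the number of points with no reverse neighbor reaches zero or stops changing), uses the reverse nearest neighbor count as the local density, and substitutes plain Euclidean distance for the paper's density-adaptive distance. The representative-point stage is omitted entirely. All function names (`natural_rnn_density`, `dpc_delta`, `cluster`) are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix (stand-in for the paper's density-adaptive distance)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def natural_rnn_density(D):
    """Reverse nearest neighbor counts, with r grown until the number of
    points that have no reverse neighbor is zero or stops changing."""
    n = D.shape[0]
    masked = D.copy()
    np.fill_diagonal(masked, np.inf)            # a point is never its own neighbor
    order = np.argsort(masked, axis=1)[:, : n - 1]
    rho = np.zeros(n, dtype=int)
    prev_orphans = -1
    for r in range(n - 1):
        np.add.at(rho, order[:, r], 1)          # each point votes for its (r+1)-th neighbor
        orphans = int((rho == 0).sum())
        if orphans == 0 or orphans == prev_orphans:
            return rho, r + 1
        prev_orphans = orphans
    return rho, n - 1

def dpc_delta(D, rho):
    """Distance from each point to the nearest point ranked denser than it."""
    n = len(rho)
    rank = np.lexsort((np.arange(n), -rho))     # densest first, ties broken by index
    delta = np.empty(n)
    delta[rank[0]] = D[rank[0]].max()           # densest point gets its largest distance
    for p in range(1, n):
        i = rank[p]
        delta[i] = D[i, rank[:p]].min()
    return delta

def cluster(X, n_clusters):
    D = pairwise_distances(X)
    rho, r = natural_rnn_density(D)
    gamma = rho * dpc_delta(D, rho)             # decision-graph score: dense and isolated
    centers = np.argsort(-gamma)[:n_clusters]
    labels = np.argmin(D[:, centers], axis=1)   # assign each point to its nearest center
    return labels, centers, rho, r

# Two well-separated blobs of six points each (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.random((6, 2)), rng.random((6, 2)) + 20.0])
labels, centers, rho, r = cluster(X, n_clusters=2)
print(labels, centers, r)
```

Because r is derived from the data rather than supplied by the user, the density estimate in this sketch has no cutoff-distance parameter, which is the kind of parameter sensitivity the abstract says the algorithm is meant to avoid.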

Key words: natural neighbor, reverse nearest neighbor, representative points, local density, clustering

LIU Juan, WAN Jing. Optimized Density Peak Clustering Algorithm by Natural Reverse Nearest Neighbor[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(10): 1888-1899.
