Optimized Density Peak Clustering Algorithm by Natural Nearest Neighbor

doi:10.3778/j.issn.1673-9418.1804033

Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (4): 711-720. DOI: 10.3778/j.issn.1673-9418.1804033

• Theory and Algorithm •

Optimized Density Peak Clustering Algorithm by Natural Nearest Neighbor

JIN Hui+, QIAN Xuezhong

Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China

Online: 2019-04-01 Published: 2019-04-10

Abstract

Abstract: Existing density-based clustering algorithms are sensitive to their parameters and perform poorly on non-spherical data and complex manifold data. To address these problems, a new clustering algorithm based on density peaks is proposed. The algorithm first determines the local density of each data point using the concept of the natural nearest neighbor. It then determines the cluster centers from the observation that density peaks have the highest local density and are separated by sparse regions. Finally, a new notion of similarity between clusters is introduced to handle complex manifold data. In experiments on synthetic and real data sets, the algorithm outperforms DPC (clustering by fast search and find of density peaks), DBSCAN (density-based spatial clustering of applications with noise) and K-means, and its advantage is especially pronounced on non-spherical data and complex manifold data.

Key words: density peak, natural nearest neighbor, local density, sparse regions, similarity between clusters
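The first two steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the paper's density formula or its similarity measure between clusters, so the density below (inverse mean distance to the r nearest neighbors, with r found by a natural neighbor search) is a stand-in assumption, and the cluster-merging step is omitted. The function names are hypothetical. The stopping rule follows one common formulation of natural neighbor search, and `dpc_delta` is the standard DPC distance to the nearest point of higher density. The sketch assumes all points are distinct.

```python
import numpy as np

def natural_neighbor_search(X):
    """Grow the neighborhood size r until the search stabilizes.

    Returns (r, nb, dist, order), where nb[i] counts how many points
    have point i among their r nearest neighbors.
    """
    n = len(X)
    # Pairwise Euclidean distances; O(n^2) memory, fine for a small sketch.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)  # order[i, 0] is i itself for distinct points
    nb = np.zeros(n, dtype=int)
    prev_orphans = -1
    for r in range(1, n):
        # Every point votes for its r-th nearest neighbor.
        np.add.at(nb, order[:, r], 1)
        orphans = int(np.sum(nb == 0))
        # Stop when every point has a reverse neighbor, or when the number
        # of points without one has stopped changing.
        if orphans == 0 or orphans == prev_orphans:
            return r, nb, dist, order
        prev_orphans = orphans
    return n - 1, nb, dist, order

def local_density(dist, order, r):
    """ASSUMPTION: inverse mean distance to the r nearest neighbors.
    This is a stand-in, not the density defined in the paper."""
    knn_d = np.take_along_axis(dist, order[:, 1:r + 1], axis=1)
    return 1.0 / (knn_d.mean(axis=1) + 1e-12)

def dpc_delta(dist, rho):
    """Standard DPC delta: distance to the nearest point of higher density.
    The densest point gets its largest distance to any other point."""
    n = len(rho)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return delta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated square blobs of 50 points each.
    X = np.vstack([rng.random((50, 2)), rng.random((50, 2)) + 10.0])
    r, nb, dist, order = natural_neighbor_search(X)
    rho = local_density(dist, order, r)
    delta = dpc_delta(dist, rho)
    # Points with large rho * delta are candidate cluster centers in DPC.
    candidates = np.argsort(rho * delta)[::-1][:2]
    print("r =", r, "candidate centers:", candidates)
```

The neighborhood size r is derived from the data rather than supplied by the user, which is the sense in which a natural neighbor search reduces parameter sensitivity. Points that combine high density with a large delta are density peaks separated from other peaks, matching the abstract's description of cluster centers.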

JIN Hui, QIAN Xuezhong. Optimized Density Peak Clustering Algorithm by Natural Nearest Neighbor[J]. Journal of Frontiers of Computer Science and Technology, 2019, 13(4): 711-720.

