Optimized Density Peak Clustering Algorithm by Adaptive Aggregation Strategy

doi:10.3778/j.issn.1673-9418.1902022

Journal of Frontiers of Computer Science and Technology, 2020, Vol. 14, Issue (4): 712-720.

• Theory and Algorithm •

QIAN Xuezhong, JIN Hui

Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China

Online: 2020-04-01 Published: 2020-04-10

Abstract:

The density peak clustering (DPC) algorithm depends heavily on manual intervention and is sensitive to its parameters: an improper cutoff distance dc leads to wrong initial cluster centers, and in some cases, even when dc is set properly, the initial cluster centers are still hard to select by hand from the decision graph. To overcome these defects, a new clustering algorithm based on density peaks is proposed. The algorithm first determines the local density of each data point using the idea of K-nearest neighbors. It then applies a new adaptive aggregation strategy: a threshold given by the algorithm identifies the initial cluster centers, each remaining point is assigned to its nearest initial cluster center, and similar clusters are finally merged when they are density reachable from one another. In experiments on synthetic and real datasets, the algorithm performs better than DPC, DBSCAN, KNNDPC and K-means, and effectively improves clustering accuracy and quality.

Key words: density peak, K-nearest neighbor (KNN), local density, merging strategy, density reachability between clusters
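The abstract names four steps (KNN-based local density, threshold-based selection of initial centers, nearest-center assignment, and merging of density-reachable clusters) but gives no formulas. The following is a minimal Python sketch of that pipeline under stated assumptions, not the authors' implementation: the density formula, the mean-plus-standard-deviation threshold on rho * delta, the mutual-KNN test used as a stand-in for density reachability, and the names `knn_density_peak_sketch`, `k` and `n_std` are all hypothetical choices made for illustration.

```python
import math


def knn_density_peak_sketch(points, k=3, n_std=1.0):
    """Illustrative sketch only; every formula below is an assumption."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # K nearest neighbours of each point, excluding the point itself.
    knn = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]

    # Assumed local density: exp(-mean distance to the K nearest neighbours).
    rho = [math.exp(-sum(dist[i][j] for j in knn[i]) / k) for i in range(n)]

    # DPC-style delta: distance to the nearest point of strictly higher density
    # (the largest distance in the row for points of maximal density).
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    # Assumed threshold: a point is an initial center when gamma = rho * delta
    # exceeds mean(gamma) + n_std * std(gamma).
    gamma = [r * d for r, d in zip(rho, delta)]
    mean = sum(gamma) / n
    std = math.sqrt(sum((g - mean) ** 2 for g in gamma) / n)
    centers = [i for i in range(n) if gamma[i] > mean + n_std * std]
    if not centers:  # fall back to the single strongest candidate
        centers = [max(range(n), key=gamma.__getitem__)]

    # Assign every point to its nearest initial center.
    labels = [min(centers, key=lambda c: dist[i][c]) for i in range(n)]

    # Merge clusters with a union-find structure. Assumed stand-in for
    # "density reachable": two clusters are merged when they contain a pair
    # of points that are mutual K nearest neighbours.
    parent = {c: c for c in centers}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for i in range(n):
        for j in knn[i]:
            if i in knn[j] and labels[i] != labels[j]:
                parent[find(labels[i])] = find(labels[j])

    return [find(label) for label in labels]


# Usage: two small, well-separated groups of 2-D points.
blob = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
pts = blob + [(x + 10.0, y + 10.0) for x, y in blob]
labels = knn_density_peak_sketch(pts, k=3)
print(labels)
```

Each label is the index of the center point that represents the cluster after merging. The usage data is separated widely enough that no merge is triggered, so it only exercises the first three steps; the merge rule would need to be replaced by the paper's own definition of density reachability between clusters before any comparison with the reported results.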

QIAN Xuezhong, JIN Hui. Optimized Density Peak Clustering Algorithm by Adaptive Aggregation Strategy[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(4): 712-720.

