Optimized Density Peak Clustering Algorithm by Natural Reverse Nearest Neighbor

doi:10.3778/j.issn.1673-9418.2007017

Abstract

Density peak clustering is a density-based clustering algorithm. Its shortcomings are sensitivity to parameters and poor clustering results on complex manifold data sets. This paper proposes a novel density peak clustering algorithm based on the natural reverse nearest neighbor structure. First, reverse nearest neighbors are introduced to calculate the local density of data objects. Second, the initial cluster centers are selected by combining representative points with density. Third, the density-adaptive distance is used to calculate the distance between the initial cluster centers; a decision graph is then constructed on the initial cluster centers from the reverse-nearest-neighbor local density and the density-adaptive distance, and the final cluster centers are selected according to the decision graph. Finally, each remaining data object is assigned to the cluster of its nearest initial cluster center. Experimental results on synthetic data sets and UCI real data sets show that the algorithm achieves better clustering quality and accuracy than the comparison algorithms, with a greater advantage on complex manifold data sets.

Key words: natural neighbor, reverse nearest neighbor, representative points, local density, clustering
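
The first and last steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the density formula, so the reverse-neighbor count is used here as a simple stand-in for local density, and a fixed neighborhood size `k` is assumed in place of the parameter-free natural neighbor search. The function names `reverse_knn_counts` and `assign_to_nearest_center` are hypothetical, and plain Euclidean distance is assumed throughout.

```python
import numpy as np


def reverse_knn_counts(X, k):
    """Count how many other points list each point among their k nearest neighbors.

    Assumption: fixed k and Euclidean distance; the count serves only as a
    rough proxy for a reverse-nearest-neighbor local density.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # A point is never counted as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Row i holds the indices of the k nearest neighbors of point i.
    knn = np.argsort(dist, axis=1)[:, :k]
    # Each appearance of j in any row means j is a reverse neighbor of that row's point.
    return np.bincount(knn.ravel(), minlength=n)


def assign_to_nearest_center(X, center_idx, center_labels):
    """Give every point the cluster label of its nearest initial center.

    Assumption: center_idx indexes rows of X, and center_labels holds the
    cluster label already decided for each of those centers.
    """
    X = np.asarray(X, dtype=float)
    centers = X[np.asarray(center_idx)]
    # Distance from every point to every center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    nearest = np.argmin(d, axis=1)
    return np.asarray(center_labels)[nearest]


if __name__ == "__main__":
    points = [[0, 0], [0, 1], [1, 0], [10, 10]]
    print(reverse_knn_counts(points, k=2))
    print(assign_to_nearest_center(points, center_idx=[0, 3], center_labels=[0, 1]))
```

In this toy input the isolated point at (10, 10) appears in no other point's neighbor list, so its count is zero, which is the behavior that makes reverse-neighbor counts useful for separating dense regions from outliers.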

LIU Juan, WAN Jing. Optimized Density Peak Clustering Algorithm by Natural Reverse Nearest Neighbor[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(10): 1888-1899.
