Journal of Frontiers of Computer Science and Technology ›› 2018, Vol. 12 ›› Issue (12): 1996-2006. DOI: 10.3778/j.issn.1673-9418.1711034

• Theory and Algorithm •

A New Density Attraction Clustering Algorithm for Data Points

WEN Xiaofang, YANG Zhichong, CHEN Mei

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070
  • Publication date: 2018-12-01 Release date: 2018-12-07

Density Attraction Clustering Algorithm Between Data Points

WEN Xiaofang, YANG Zhichong, CHEN Mei   

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
  • Online: 2018-12-01 Published: 2018-12-07

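Abstract:

Many existing clustering algorithms fail to achieve good performance when detecting arbitrary clusters in various datasets. By treating each data point as a particle in nature, the concept of density attraction between data points is defined, and on this basis a new, robust density attraction clustering algorithm is proposed. First, the local density of each data point is obtained from how sparsely its surrounding neighbors are distributed. Then, each data point is iteratively assigned to the nearest mutual neighbor whose density is greater than its own, forming initial clusters. Finally, initial clusters that share data points are merged to obtain the final clusters. In the experiments, the new algorithm is tested against three classical algorithms and three recent algorithms on six datasets of different dimensions and types. The results show that its clustering performance is better than that of the compared algorithms and that it can discover arbitrary clusters in datasets of different dimensions.

Key words: cluster analysis, arbitrary clusters, density attraction, local density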

Abstract:

Many existing clustering algorithms perform poorly when detecting arbitrary clusters in diverse datasets. By regarding each data point as a particle and defining a density attraction between data points, a new robust density attraction clustering algorithm is proposed. First, the local density of each point is obtained from how densely its neighbors are distributed. Then, each point is iteratively assigned to its nearest mutual neighbor with greater density to form initial clusters. Finally, initial clusters that share points are merged to form the final clusters. The new algorithm is compared with three classical and three state-of-the-art methods on six datasets of different dimensions and types. The results show that it outperforms the compared methods and can find arbitrary clusters in datasets of different dimensions.

Key words: clustering, various clusters, density attraction, local density
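
The three steps named in the abstract (local density, assignment to a denser mutual neighbor, merging of initial clusters that share points) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives neither the density formula nor the exact definition of a mutual neighbor, so the sketch assumes that local density is the inverse of the mean distance to the k nearest neighbors, and that two points are mutual neighbors when each appears in the other's k-nearest-neighbor list. The function names and the parameter `k` are hypothetical. A union-find structure is used so that linking a point to its target also merges any groups that share a point.

```python
import math


def knn_indices(points, k):
    """For each point, return the indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def local_density(points, neighbors):
    """ASSUMPTION: density = 1 / mean distance to the k nearest neighbors."""
    density = []
    for i, nbrs in enumerate(neighbors):
        mean_d = sum(math.dist(points[i], points[j]) for j in nbrs) / len(nbrs)
        density.append(1.0 / (mean_d + 1e-12))  # small constant avoids division by zero
    return density


def density_attraction_clusters(points, k=3):
    """Return one cluster label per point (labels are 0, 1, 2, ...)."""
    n = len(points)
    if n <= k:
        raise ValueError("need more than k points")

    neighbors = knn_indices(points, k)
    density = local_density(points, neighbors)

    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        # ASSUMPTION: "mutual neighbor" means each point is in the other's k-NN list.
        candidates = [
            j for j in neighbors[i]
            if i in neighbors[j] and density[j] > density[i]
        ]
        if candidates:
            # Attach the point to the closest denser mutual neighbor.
            target = min(candidates, key=lambda j: math.dist(points[i], points[j]))
            # The union merges every group that already contains i or target.
            parent[find(i)] = find(target)

    # Relabel union-find roots as consecutive integers.
    root_to_label = {}
    labels = []
    for i in range(n):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.5, 0.5),
              (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (8.5, 8.5)]
    print(density_attraction_clusters(sample, k=2))
```

Under these assumptions a point with no denser mutual neighbor stays the root of its own group, so the number of clusters is not set in advance. The brute-force neighbor search is quadratic in the number of points and is meant only for small illustrative inputs.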