Journal of Frontiers of Computer Science and Technology ›› 2016, Vol. 10 ›› Issue (2): 248-256. DOI: 10.3778/j.issn.1673-9418.1507069

• Artificial Intelligence and Pattern Recognition •

基于距离关联性动态模型的聚类改进算法

陈雄韬+,闫秋艳   

  1. 中国矿业大学 计算机科学与技术学院,江苏 徐州 221116
  • Publication date: 2016-02-01 Release date: 2016-02-03

Clustering Improved Algorithm Based on Distance-Relatedness Dynamic Model

CHEN Xiongtao+, YAN Qiuyan   

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Online: 2016-02-01 Published: 2016-02-03

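Abstract: Most clustering algorithms cannot efficiently discover clusters of arbitrary shape and varying density. To address this, an efficient improved clustering algorithm based on a distance-relatedness dynamic model is proposed. First, to improve clustering efficiency, a hierarchical clustering algorithm performs an initial clustering of the data set, and clusters containing too few sample points are removed. Second, to discover clusters of arbitrary shape and varying density, the centroid of each initial cluster is taken as its representative point, clustering is performed with the distance-relatedness dynamic model, and the tree structure of the hierarchical clustering is used to prune the computation effectively. Finally, the effectiveness of the algorithm is verified. Experiments on the Chameleon data sets show that the algorithm effectively identifies clusters of arbitrary shape and varying density, and that its time efficiency is significantly better than that of comparable algorithms.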

Keywords: clustering, arbitrarily shaped clusters, clusters of different densities, distance-relatedness, dynamic model

Abstract: Most clustering algorithms fail to find clusters of arbitrary shape and different density efficiently. This paper proposes an efficient improved clustering algorithm based on a distance-relatedness dynamic model. Firstly, to improve clustering efficiency, a hierarchical clustering algorithm is applied to the data set to obtain the initial clusters, and abnormal clusters are removed. Secondly, to obtain arbitrarily shaped clusters, the centroid of each initial cluster is taken as the representative point of all points in it, the distance-relatedness dynamic model is run for clustering, and the tree structure of the hierarchical clustering is used for pruning. Finally, the effectiveness of the proposed algorithm is verified. The algorithm is tested on the Chameleon dataset; the experimental results show that it can obtain clusters of arbitrary shape and different density and that, compared with similar algorithms, its time efficiency is improved significantly.

Key words: clustering, arbitrary shaped clusters, different density clusters, distance-relatedness, dynamic model
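
The first stage described in the abstract (an initial hierarchical clustering, removal of clusters with too few points, and use of each remaining cluster's centroid as its representative point) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the linkage criterion or any parameter values, so the single-linkage rule, the distance `threshold`, the `min_size` cutoff, and the function names `single_linkage` and `representatives` are all hypothetical. The distance-relatedness dynamic model itself is not defined in the abstract and is therefore not sketched here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, threshold):
    """Naive agglomerative clustering (assumed single linkage).

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold` (hypothetical cutoff).
    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


def representatives(points, clusters, min_size):
    """Drop clusters with fewer than `min_size` points (hypothetical
    cutoff) and return (centroid, size) for each remaining cluster."""
    dim = len(points[0])
    reps = []
    for members in clusters:
        if len(members) < min_size:
            continue  # treated as an abnormal, too-sparse cluster
        centroid = tuple(
            sum(points[i][k] for i in members) / len(members)
            for k in range(dim)
        )
        reps.append((centroid, len(members)))
    return reps


# Toy data: two tight groups of four points plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (50.0, 50.0)]
clusters = single_linkage(points, threshold=2.0)
reps = representatives(points, clusters, min_size=2)
print(reps)
```

On this toy input the isolated point forms a singleton cluster and is discarded, leaving one centroid per dense group. A second-stage clustering would then operate on these few representative points rather than on every sample, which is the source of the efficiency gain the abstract claims.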