计算机科学与探索 ›› 2018, Vol. 12 ›› Issue (3): 462-471.DOI: 10.3778/j.issn.1673-9418.1703048

• 人工智能与模式识别 • 上一篇    下一篇

局部相似性优化的p-谱聚类算法

胡乾坤,丁世飞+   

  1. 中国矿业大学 计算机科学与技术学院,江苏 徐州 221116
  • 出版日期:2018-03-01 发布日期:2018-03-08

p-Spectral Clustering Algorithm with Optimization of Local Similarity

HU Qiankun, DING Shifei+   

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Online:2018-03-01 Published:2018-03-08

摘要: 通过引入p-Laplacian算子,谱聚类算法得以获得较好的图切判据。但算法中的相似矩阵未能充分挖掘数据样本的局部结构信息,同时相似性的计算与数据样本的聚类是在两个不同的步骤中实现的,故得到的相似矩阵并不一定是最适合此聚类方法的,从而得不到最优的聚类结果。因此,提出了基于局部相似性优化的p-谱聚类算法。该算法通过数据样本的自适应和最优近邻之间的局部距离来优化相似性测度的方法,同时通过p-Laplacian矩阵的秩约束,可以得到对应无向图中连通分量的数目等于聚类数目。实验表明,基于局部相似性优化的p-谱聚类算法可以获得更好的聚类效果。

关键词: p-Laplacian算子, 局部相似性, 自适应和最优近邻, 秩约束

Abstract: Spectral clustering algorithm can obtain a better graph cut criterion by introducing p-Laplacian operator. But the similarity matrix doesn't fully exploit the local information of the dataset in the algorithm. Simultaneously, the similarity measurement and data clustering are often conducted in two seperated steps, the learned data similarity may not be optimal one for data clustering and lead to the suboptimal results. Therefore, this paper proposes p-spectral clustering algorithm with the optimization of local similarity. The algorithm learns the data similarity matrix by assigning the adaptive and optimal neighbors for each data point based on the local distances. Meanwhile, the new rank constraint is imposed to the Laplacian matrix of the data similarity matrix, such that the connected components in the resulted similarity matrix are exactly equal to the cluster number. The experiments show that p-spectral clustering algorithm based on local similarity optimization can produce better clustering results.

Key words: p-Laplacian operator, local similarity, adaptive and optimal neighbors, rank constraint