Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2023, Vol. 17, Issue (5): 1147-1156. DOI: 10.3778/j.issn.1673-9418.2109025

• Artificial Intelligence · Pattern Recognition •

Dynamic-Fusion Multi-view Projection Clustering Algorithm

JIANG Kaibin (姜凯彬), ZHOU Shibing (周世兵), QIAN Xuezhong (钱雪忠), GUAN Jiaojiao (管娇娇)

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2023-05-01 Published: 2023-05-01

Abstract: Multi-view clustering is an active research topic that has attracted increasing attention. Most existing multi-view clustering methods first perform graph learning on the data and then cluster the fused unified graph to obtain the final result. This two-step strategy of graph learning followed by graph clustering may introduce randomness into the clustering results. In addition, multi-view data inevitably contains noise and the views differ considerably from one another, so ineffective fusion in the original high-dimensional data space may lose important information, and different multi-view datasets may be sensitive to parameter selection. To address these problems, a dynamic-fusion multi-view projection clustering algorithm is proposed that integrates adaptive dimensionality-reduction graph learning, parameter-free self-weighted graph fusion, and spectral clustering in a single framework. The three processes reinforce one another, jointly optimizing the projection matrix, similarity matrix, consensus matrix, and cluster labels. A rank constraint is imposed on the Laplacian matrix of the consensus matrix obtained during dynamic fusion, so that the clustering result is obtained directly. Moreover, the heuristic hyperparameters that are introduced are adjusted automatically at each optimization iteration. To solve the joint optimization problem, an effective alternating iterative method is designed. Experimental results on synthetic and real-world datasets demonstrate the superiority of the algorithm.

Key words: multi-view clustering, projection-based dimensionality reduction, graph fusion, consensus matrix
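
The rank constraint mentioned in the abstract rests on a standard graph-theory fact: for a nonnegative symmetric affinity matrix on n points, the Laplacian L = D - S has rank n - c exactly when the graph has c connected components. Once that rank is reached, cluster labels can be read off the components with no separate k-means step. The sketch below only checks this fact on a toy matrix. It is not the paper's optimization procedure, and the matrix `S` and all function names are hypothetical illustrations.

```python
from collections import deque
from fractions import Fraction


def laplacian(S):
    """Unnormalized graph Laplacian L = D - S of a symmetric affinity matrix."""
    n = len(S)
    return [[(sum(S[i]) if i == j else 0) - S[i][j] for j in range(n)]
            for i in range(n)]


def matrix_rank(M):
    """Exact rank via Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    rank, rows, cols = 0, len(A), len(A[0])
    for col in range(cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, rows):
            factor = A[r][col] / A[rank][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[rank])]
        rank += 1
    return rank


def component_labels(S):
    """Label each node by its connected component (edges where S[i][j] > 0)."""
    n = len(S)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if S[i][j] > 0 and labels[j] == -1:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# Hypothetical 5-point consensus matrix with two blocks: {0, 1, 2} and {3, 4}.
S = [
    [0, 2, 1, 0, 0],
    [2, 0, 3, 0, 0],
    [1, 3, 0, 0, 0],
    [0, 0, 0, 0, 4],
    [0, 0, 0, 4, 0],
]

labels = component_labels(S)          # cluster label per point
n_clusters = max(labels) + 1          # number of connected components
rank_L = matrix_rank(laplacian(S))    # equals len(S) - n_clusters
```

Exact rational arithmetic is used here only to keep the toy rank check free of floating-point tolerance choices. A practical implementation would more likely inspect the smallest eigenvalues of the Laplacian numerically.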