Journal of Frontiers of Computer Science and Technology

• Academic Research •

Multi-view clustering via diversity induction and orthogonal non-negative graph reconstruction

WANG Xi, ZHOU Shibing, YANG Mingrui, SONG Wei

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China

Abstract: Multi-view clustering algorithms based on graph learning have attracted wide attention in recent years because they are simple and efficient. Most existing multi-view clustering algorithms consider only the consistency shared across views and ignore the diversity between views. Moreover, most of these methods learn the similarity graph directly from the original data points, so the learned graph neither shows a clear cluster structure nor accurately captures the underlying information of the dataset. To address these problems, a multi-view clustering algorithm via diversity induction and orthogonal non-negative graph reconstruction is proposed. First, the consistency and diversity of multiple views are exploited within a unified framework, and the consistent parts are fused into the target graph with adaptive weights to produce a more reasonable clustering structure. Then, spectral clustering and non-negative matrix factorization are integrated to obtain a target graph with a clearer cluster structure. Finally, one factor matrix of the non-negative matrix factorization is constrained to be an orthogonal indicator matrix, so that the clustering result is obtained directly. The algorithm automatically assigns each view a weight according to its clustering ability. In addition, an alternating iterative strategy is introduced to solve the joint optimization problem defined by the objective function. Experimental results show that the algorithm effectively improves clustering accuracy: on the HW2, 100leaves, and Mfeat datasets it reaches 99.21%, 89.56%, and 87.85%, respectively, improvements of 0.81%, 5.12%, and 3.75% over the second-best method. Theoretical analysis and experiments demonstrate the effectiveness and strong performance of the proposed algorithm.

Key words: multi-view clustering, diversity induction, graph reconstruction, spectral embedding, non-negative matrix factorization
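
Two ideas from the abstract can be illustrated with a minimal sketch: fusing per-view similarity graphs with adaptive weights through alternating updates, and reading cluster labels directly from a non-negative orthogonal indicator matrix. The sketch is not the objective function of the paper. The function names `fuse_views` and `labels_from_indicator` are hypothetical, and the inverse-distance weight rule is a common auto-weighting heuristic assumed here purely for illustration. The diversity-induction and graph-reconstruction terms are omitted.

```python
import numpy as np


def fuse_views(graphs, n_iter=10, eps=1e-12):
    """Alternately update a fused graph and the view weights.

    Assumed heuristic (not taken from the paper): each weight is the
    inverse of twice the Frobenius distance between a view graph and
    the current fused graph.
    """
    graphs = [np.asarray(g, dtype=float) for g in graphs]
    weights = np.full(len(graphs), 1.0 / len(graphs))
    for _ in range(n_iter):
        # Step 1: with the weights fixed, the fused graph is a weighted average.
        fused = sum(w * g for w, g in zip(weights, graphs)) / weights.sum()
        # Step 2: with the fused graph fixed, views closer to it get larger weights.
        dists = np.array([np.linalg.norm(fused - g) for g in graphs])
        weights = 1.0 / (2.0 * dists + eps)
    return fused, weights / weights.sum()


def labels_from_indicator(G):
    """Read labels from a non-negative matrix with orthonormal columns.

    If G >= 0 and G.T @ G = I, each row has at most one non-zero entry,
    so the column index of the row maximum is the cluster label.
    """
    return np.argmax(np.asarray(G), axis=1)


# Two clean two-cluster graphs and one noisy view.
clean = np.kron(np.eye(2), np.ones((3, 3)))
rng = np.random.default_rng(0)
noisy = clean + 0.5 * rng.random((6, 6))
fused, weights = fuse_views([clean, clean, noisy])

# A non-negative indicator matrix with orthonormal columns.
G = np.array([[2 ** -0.5, 0.0], [2 ** -0.5, 0.0], [0.0, 1.0]])
labels = labels_from_indicator(G)
```

In this toy setting the noisy view ends up with the smallest weight, and the labels follow from the indicator matrix without a separate k-means step, which is the practical benefit of constraining the factor to be an orthogonal indicator.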