Fast Multi-view Clustering with Sparse Matrix and Improved Normalized Cut

doi:10.3778/j.issn.1673-9418.2309037

Abstract

Multi-view clustering is a novel clustering approach that effectively explores the inherent cluster structure among data. However, most existing methods are sensitive to noise when constructing similarity graphs and can lose important information during clustering, which lowers accuracy. Moreover, the alternating iterative optimization that these algorithms typically rely on can be time-consuming and can exhaust memory. To address these limitations, a fast multi-view clustering algorithm with sparse matrix and improved normalized cut (SINFMC) is proposed. It first constructs a similarity graph for each view from the original data and fuses these graphs into a consensus graph matrix. An l1-norm constraint is then applied to the consensus graph matrix to obtain a sparse matrix, which denoises the data and speeds up computation. Finally, an improved normalized spectral clustering algorithm clusters the sparse consensus graph to obtain a cluster indicator matrix, which yields the clustering result directly and avoids information loss and bias during clustering. The proposed algorithm requires no alternating iterative optimization and streamlines computation through sparse matrix representation, substantially reducing time and space complexity. Experimental results on both artificial and real-world datasets demonstrate that the proposed algorithm outperforms the compared algorithms in clustering quality and efficiency.

Key words: multi-view clustering, sparse matrix, normalized cuts, soft threshold, graph fusion
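
The graph fusion and sparsification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the unweighted mean used for fusion, the threshold `tau`, the toy data, and all function names are assumptions made for illustration, and the improved normalized cut step is omitted because the abstract does not specify it. The soft threshold is the standard proximal operator of the l1 norm, which sets small (noisy) edge weights to exactly zero so that only the nonzero entries need to be stored.

```python
import math


def gaussian_similarity(points, sigma=1.0):
    """Dense Gaussian-kernel similarity graph for one view (zero diagonal)."""
    n = len(points)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            graph[i][j] = graph[j][i] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return graph


def fuse_graphs(graphs):
    """Consensus graph as the unweighted mean of the view graphs (assumed rule)."""
    n = len(graphs[0])
    return [
        [sum(g[i][j] for g in graphs) / len(graphs) for j in range(n)]
        for i in range(n)
    ]


def soft_threshold(value, tau):
    """Proximal operator of tau * |x|: shrink the value toward zero by tau."""
    return math.copysign(max(abs(value) - tau, 0.0), value)


def sparsify(graph, tau):
    """Soft-threshold every entry and keep only nonzeros as {(i, j): weight}."""
    sparse = {}
    for i, row in enumerate(graph):
        for j, value in enumerate(row):
            shrunk = soft_threshold(value, tau)
            if shrunk != 0.0:
                sparse[(i, j)] = shrunk
    return sparse


# Hypothetical toy data: two views of four samples; samples 0-1 and 2-3
# are close to each other in both views.
view_a = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.1, 3.0]]
view_b = [[1.0], [1.1], [5.0], [5.2]]

consensus = fuse_graphs([gaussian_similarity(view_a), gaussian_similarity(view_b)])
sparse_consensus = sparsify(consensus, tau=0.1)
print(sorted(sparse_consensus))
```

In this toy case the weak cross-group edges fall below `tau` and are removed, so the surviving edges connect only samples within the same group. A dictionary of nonzero entries stands in for a sparse matrix format here; a practical implementation would more likely use a compressed sparse representation.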

YANG Mingrui, ZHOU Shibing, WANG Xi, SONG Wei. Fast Multi-view Clustering with Sparse Matrix and Improved Normalized Cut[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(11): 3027-3040.

杨明瑞, 周世兵, 王茜, 宋威. 稀疏矩阵和改进归一化切割的快速多视图聚类[J]. 计算机科学与探索, 2024, 18(11): 3027-3040.
