Structured Sparsity Graph Learning for Unsupervised Feature Extraction

doi:10.3778/j.issn.1673-9418.2406069

Abstract

Abstract: Unsupervised feature extraction has attracted increasing attention as a way to alleviate the “curse of dimensionality” caused by high-dimensional data. However, existing methods typically construct low-rank graphs or nearest-neighbor graphs to find projection directions for high-dimensional data, overlooking the global correlation structure of the data and the sparsity of the representation. To address these issues, a novel dimensionality reduction method called structured sparse graph learning for unsupervised feature extraction (SSGL) is proposed. SSGL uses the representation to construct a nearest-neighbor graph among samples, which preserves the local structure of the data, and uses least squares regression to model the global correlation structure. Consequently, SSGL preserves both the local and the global correlation structure of the data. Moreover, SSGL employs sparse regularization to disconnect samples from different clusters in the affinity graph, making the learned projection more discriminative. To validate the effectiveness of SSGL, extensive experiments are conducted on eight public image datasets. The results indicate that SSGL outperforms other advanced feature extraction methods in clustering accuracy and significantly improves both clustering and classification performance.

Key words: feature extraction, sparse graph, affinity relationship, local structure
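
The three ingredients named in the abstract (least squares regression for global structure, a nearest-neighbor graph built from the representation, and a sparse affinity graph that drives a projection) can be illustrated with a rough NumPy sketch. This is not the SSGL objective or its solver, which the abstract does not specify. The ridge-regularized self-representation, the top-k row truncation used as a stand-in for sparse regularization, the orthogonality-constrained projection, and all names and parameters (`lsr_representation`, `sparse_knn_affinity`, `graph_projection`, `lam`, `k`, `dim`) are assumptions made for illustration only.

```python
import numpy as np

def lsr_representation(X, lam=0.1):
    """Least squares regression self-representation (assumed ridge form).

    X has shape (d, n) with samples as columns. Returns the Z of shape (n, n)
    that minimizes ||X - X Z||_F^2 + lam * ||Z||_F^2.
    """
    n = X.shape[1]
    G = X.T @ X
    return np.linalg.solve(G + lam * np.eye(n), G)

def sparse_knn_affinity(Z, k=5):
    """Keep the k largest |Z| entries per row, then symmetrize.

    Hard truncation is a simple stand-in for sparse regularization here.
    """
    A = np.abs(Z)
    np.fill_diagonal(A, 0.0)            # no self-loops
    idx = np.argsort(-A, axis=1)[:, :k]  # indices of the k strongest links per row
    rows = np.arange(A.shape[0])[:, None]
    W = np.zeros_like(A)
    W[rows, idx] = A[rows, idx]
    return 0.5 * (W + W.T)

def graph_projection(X, W, dim=2):
    """Orthonormal P minimizing tr(P^T X L X^T P), with L the graph Laplacian of W."""
    L = np.diag(W.sum(axis=1)) - W
    M = X @ L @ X.T
    M = 0.5 * (M + M.T)                  # enforce symmetry against round-off
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    return vecs[:, :dim]

# Toy data: two well-separated clusters of 20 samples each in 10 dimensions.
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(0, 0.1, (10, 20)) + 1.0,
               rng.normal(0, 0.1, (10, 20)) - 1.0])

Z = lsr_representation(X)
W = sparse_knn_affinity(Z, k=5)
P = graph_projection(X, W, dim=2)
Y = P.T @ X                              # low-dimensional features, shape (dim, n)
```

In this sketch the projected features `Y` would then be passed to a clustering or classification step; how SSGL couples graph learning and projection learning in one objective is left to the paper itself.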

ZHU Yike, DING Jianhao, YIN Xuesong, WANG Yigang. Structured Sparsity Graph Learning for Unsupervised Feature Extraction[J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(4): 964-975.
