Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2021, Vol. 15 ›› Issue (6): 1062-1073. DOI: 10.3778/j.issn.1673-9418.2007003

• Academic Research •

Robust Auto-weighted Multi-view Subspace Clustering

FAN Ruidong (范瑞东), HOU Chenping (侯臣平)

  1. Department of Systems Science, College of Liberal Arts and Sciences, National University of Defense Technology, Changsha 410073, China
  • Online: 2021-06-01 Published: 2021-06-03

Abstract:

As the ability to collect and store data improves, real-world data are often composed of several different representations (views). Multi-view learning therefore plays an increasingly important role in machine learning and pattern recognition. In recent years, a variety of multi-view learning methods have been proposed and applied in different practical scenarios. However, because most data points enter the objective function through squared residuals, a few outliers with large errors can easily invalidate the objective function, so handling such redundant data is an important challenge for multi-view learning. To address this problem, this paper proposes a robust auto-weighted multi-view subspace clustering model. The model uses the Frobenius norm to handle the squared error of the data and, at the same time, the ℓ1-norm to handle outliers, which effectively balances the influence of outliers and ordinary data points on model performance. Furthermore, unlike traditional methods that introduce hyper-parameters to measure the influence of each view, the proposed model learns the weight of each view automatically. Because the model is a non-smooth, non-convex problem that is difficult to solve directly, this paper designs an effective algorithm for it and analyzes the algorithm's convergence and computational complexity. Experimental results on several multi-view datasets demonstrate the effectiveness of the proposed algorithm compared with traditional multi-view subspace clustering algorithms.

Key words: robustness, auto-weighted, multi-view subspace clustering, matrix factorization
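
The abstract's two main ideas can be illustrated with a minimal NumPy sketch. It is not the paper's objective function or solver. The first part compares how much of the total loss a single outlying sample contributes under the squared Frobenius norm versus the ℓ1-norm. The second part shows one parameter-free view-weighting rule, w_v proportional to 1 / (2 * sqrt(loss_v)), which is used in some auto-weighted multi-view methods and is assumed here purely for illustration; the abstract does not state the paper's actual weighting rule. All function names and the toy residual matrix are hypothetical.

```python
import numpy as np


def frobenius_sq(residual):
    """Squared Frobenius norm: sum of squared entries."""
    return float(np.sum(residual ** 2))


def l1_norm(residual):
    """Entry-wise l1-norm: sum of absolute entries."""
    return float(np.sum(np.abs(residual)))


def sample_share(residual, row, loss):
    """Fraction of the total loss contributed by one sample (one row)."""
    only_row = np.zeros_like(residual)
    only_row[row] = residual[row]
    return loss(only_row) / loss(residual)


def auto_weights(view_losses):
    """Illustrative (assumed) rule: weight each view by 1 / (2 * sqrt(loss)),
    then normalize so the weights sum to 1. Views with smaller loss get
    larger weight, with no extra hyper-parameter."""
    losses = np.asarray(view_losses, dtype=float)
    raw = 1.0 / (2.0 * np.sqrt(losses))
    return raw / raw.sum()


# Toy residual matrix: 100 samples, 5 features, small errors everywhere
# except one corrupted sample (row 0) with large errors.
residual = np.full((100, 5), 0.1)
residual[0] = 10.0

share_fro = sample_share(residual, 0, frobenius_sq)  # roughly 0.99
share_l1 = sample_share(residual, 0, l1_norm)        # roughly 0.50

print(f"outlier share under squared Frobenius loss: {share_fro:.3f}")
print(f"outlier share under l1 loss:                {share_l1:.3f}")
print("weights for view losses [1.0, 4.0]:", auto_weights([1.0, 4.0]))
```

In this toy setting the single corrupted sample accounts for almost all of the squared loss but only about half of the ℓ1 loss, which is the sensitivity the abstract refers to when it says a few outliers can invalidate a squared-residual objective.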