Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2020, Vol. 14 ›› Issue (7): 1211-1220. DOI: 10.3778/j.issn.1673-9418.1902011

• Artificial Intelligence •

Nonnegative Matrix Factorization with Joint Regularization of Manifold Learning and Pairwise Constraints

CAO Jiawei (曹佳伟), QIAN Pengjiang (钱鹏江)

  1. School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2020-07-01 Published: 2020-08-12

Abstract:

To handle semi-supervised clustering scenarios in which only partial pairwise constraint information is available for the target dataset, this paper proposes a clustering method built on the nonnegative matrix factorization (NMF) framework: NMF with joint regularization of manifold learning and pairwise constraints (NMF-JRMLPC). The method learns from the given pairwise constraints and applies manifold regularization theory. On the one hand, it introduces a graph Laplacian to capture the manifold structure carried by the large number of unlabeled samples; on the other hand, it incorporates the known must-link and cannot-link pairwise constraints between samples into the optimization objective, which improves the clustering performance of the resulting algorithm to a large extent. In addition, a loss function based on the $\ell_{2,1}$ norm helps improve the robustness of NMF-JRMLPC. Experimental results on eight real-world datasets confirm the effectiveness of the proposed method.

Key words: clustering, nonnegative matrix factorization (NMF), manifold regularization, pairwise constraints, semi-supervised learning
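
The abstract names three ingredients: an $\ell_{2,1}$-norm reconstruction loss, a graph-Laplacian manifold term, and must-link/cannot-link penalties. The exact objective of NMF-JRMLPC is not given on this page, so the following is only a minimal sketch of how terms of this kind are commonly combined. The function names, the k-nearest-neighbour graph, the specific penalty forms, and the weights `lam` and `beta` are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E (one column per sample).
    Assumed convention; some papers sum over rows instead."""
    return float(np.sqrt((E ** 2).sum(axis=0)).sum())

def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetric 0/1 k-NN graph.
    X has shape (n_features, n_samples). Hypothetical helper."""
    n = X.shape[1]
    sq = (X ** 2).sum(axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbours
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)                            # symmetrize
    return np.diag(W.sum(axis=1)) - W

def objective(X, U, V, L, must_link, cannot_link, lam=1.0, beta=1.0):
    """Illustrative objective for X ~ U @ V.T with V of shape (n_samples, n_clusters).
    Assumed penalties: squared distance for must-link pairs,
    inner product for cannot-link pairs."""
    recon = l21_norm(X - U @ V.T)                     # robust reconstruction loss
    manifold = float(np.trace(V.T @ L @ V))           # graph smoothness of V
    ml = sum(float(np.sum((V[i] - V[j]) ** 2)) for i, j in must_link)
    cl = sum(float(V[i] @ V[j]) for i, j in cannot_link)
    return recon + lam * manifold + beta * (ml + cl)

# Toy data: four samples forming two obvious clusters.
U = np.eye(2)
V_good = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
V_bad = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
X = U @ V_good.T
L = knn_graph_laplacian(X, k=1)
good = objective(X, U, V_good, L, must_link=[(0, 1)], cannot_link=[(0, 2)])
bad = objective(X, U, V_bad, L, must_link=[(0, 1)], cannot_link=[(0, 2)])
```

In this toy setup, the assignment `V_good` respects the graph and both constraints, so it should score lower than `V_bad`. An actual algorithm would minimize such an objective over nonnegative `U` and `V`, which this sketch does not attempt.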