Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (5): 822-833. DOI: 10.3778/j.issn.1673-9418.1806014

• Artificial Intelligence and Pattern Recognition •

Whole Unsupervised Domain Adaptation Using Sparse Representation of Parameter Dictionary

YU Huanhuan+, CHEN Songcan   

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Online: 2019-05-01  Published: 2019-05-08

Abstract: Unsupervised domain adaptation (UDA) addresses the learning problem in which the source domain contains labeled samples while the target domain contains only unlabeled ones. Its goal is to use the “knowledge” learned from a source domain with a large number of labeled samples to improve learning performance in the target domain, where all samples are unlabeled. In practice, however, the source-domain samples are often unlabeled as well, giving rise to so-called whole unsupervised domain adaptation, which poses a severe challenge to domain adaptation learning. To address this problem, inspired by the previously proposed soft large margin clustering (SLMC), this paper proposes a parameter-transfer method: whole unsupervised domain adaptation using sparse representation of a parameter dictionary (WUDA). Specifically, borrowing SLMC's idea of clustering the given data in the output (label) space, WUDA realizes knowledge transfer from the perspective of a common dictionary over the parameters (the weight matrices of the decision functions), learning the parameter dictionary mutually adapted between the source-domain and target-domain weights. In addition, an ℓ2,1-norm regularization term is introduced to constrain the dictionary coefficient matrix, so that the weights of each domain can adaptively select atoms from the common dictionary, thereby achieving domain adaptation learning. Finally, experimental results on related datasets show the significant effectiveness of WUDA in clustering performance.

Key words: whole unsupervised domain adaptation (WUDA), common parameter dictionary, sparse representation, unlabeled small-sample problem, soft large margin clustering (SLMC)
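
The abstract specifies WUDA's ingredients (a parameter dictionary shared across domains, with an ℓ2,1-norm penalty, ‖A‖2,1 = Σi ‖Ai,:‖2, on the coefficient matrix) but not its exact objective or solver. The NumPy sketch below is only a hedged illustration of that generic recipe, not the authors' algorithm: it alternates gradient steps on a common dictionary D with proximal (row-wise soft-thresholding) steps on the codes, so that the source and target weight matrices Ws and Wt are reconstructed as D @ As and D @ At from a few shared atoms. The names fit_shared_dictionary and prox_l21, and the choice of an alternating proximal-gradient solver, are illustrative assumptions.

import numpy as np

def prox_l21(A, tau):
    # Proximal operator of tau * ||A||_{2,1} (the sum of row l2-norms):
    # row-wise soft-thresholding that zeros whole rows, dropping unused atoms.
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    return np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12)) * A

def fit_shared_dictionary(Ws, Wt, n_atoms, lam=0.1, lr=1e-2, n_iter=1000, seed=0):
    # Hypothetical sketch: learn a common dictionary D (columns = atoms) and
    # row-sparse codes As, At such that Ws ~ D @ As and Wt ~ D @ At.
    assert Ws.shape[0] == Wt.shape[0], "domains must share the feature dimension"
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Ws.shape[0], n_atoms))
    As = np.zeros((n_atoms, Ws.shape[1]))
    At = np.zeros((n_atoms, Wt.shape[1]))
    for _ in range(n_iter):
        Rs, Rt = D @ As - Ws, D @ At - Wt              # reconstruction residuals
        D -= lr * (Rs @ As.T + Rt @ At.T)              # gradient step on the dictionary
        As = prox_l21(As - lr * (D.T @ Rs), lr * lam)  # prox steps on the codes
        At = prox_l21(At - lr * (D.T @ Rt), lr * lam)
    return D, As, At

Under this sketch, the rows of As and At that survive the thresholding indicate which shared atoms each domain's weights actually select, mirroring the adaptive selection that the ℓ2,1 constraint is described as enabling.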