Possibilistic Distribution Distance Measure: Robust Domain Adaptation Learning Method

doi:10.3778/j.issn.1673-9418.2211120

Abstract

Domain adaptation (DA) learning addresses the mismatch between the distributions of the training and test datasets and has attracted extensive attention. Most existing DA methods reduce this mismatch by minimizing the maximum mean discrepancy (MMD) between domains, or one of its variants. However, noisy data in a domain can cause the domain mean to drift significantly, which degrades the adaptation performance of methods based on MMD and its variants to some extent. To address this, this paper proposes a robust domain adaptation learning method based on a possibilistic distribution distance measure. First, the traditional MMD criterion is transformed into a novel possibilistic clustering model to weaken the influence of noisy data, yielding a robust possibilistic distribution distance measure (P-DDM) criterion; a fuzzy entropy regularization term is introduced to further improve the robustness and effectiveness of domain distribution alignment. Second, based on the P-DDM criterion, a robust domain adaptation visual classifier (C-PDDM) is proposed. It introduces a graph Laplacian matrix to preserve the geometric structure consistency of the data within the source and target domains, which improves label propagation, and it minimizes the domain discrimination error by making maximal use of the discriminative information in the source domain, which further improves the generalization performance of the learned model. Theoretical analysis confirms that, under certain conditions, the proposed P-DDM is an upper bound of the traditional MMD criterion, so minimizing P-DDM effectively optimizes the MMD objective. Finally, the method is compared with several representative domain adaptation methods; experimental results on six visual benchmark datasets (Office31, Office-Caltech, Office-Home, PIE, MNIST-UPS, and COIL20) show that it improves generalization performance by about 5% on average and robustness by about 10% on average.

Key words: domain adaptation (DA), possibilistic clustering, maximum mean discrepancy (MMD), fuzzy entropy
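
The abstract's motivating claim, that noisy samples drag the domain mean and so distort a mean-based discrepancy, can be illustrated with a toy sketch. The code below is not the paper's P-DDM criterion. It assumes a linear kernel (so squared MMD reduces to the squared distance between domain means) and uses an exponential membership `exp(-d^2 / eta)`, a form associated with entropy-regularized possibilistic clustering, as a stand-in weighting. The function names, the data, and the scale parameter `eta` are all hypothetical.

```python
import math

def weighted_mean(points, weights):
    """Weighted mean of a list of equal-length coordinate tuples."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * p[k] for w, p in zip(weights, points)) / total
        for k in range(dim)
    )

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def plain_mean(points):
    return weighted_mean(points, [1.0] * len(points))

def linear_mmd2(source, target):
    """Squared MMD under a linear kernel: squared gap between domain means."""
    return sq_dist(plain_mean(source), plain_mean(target))

def possibilistic_center(points, eta=4.0, iters=20):
    """Alternate between memberships and the membership-weighted mean.

    Assumption: memberships u_i = exp(-d_i^2 / eta); points far from the
    current center receive near-zero weight. eta is a hypothetical scale.
    """
    center = plain_mean(points)
    for _ in range(iters):
        weights = [math.exp(-sq_dist(p, center) / eta) for p in points]
        center = weighted_mean(points, weights)
    return center

def robust_mean_gap2(source, target, eta=4.0):
    """Squared gap between membership-weighted domain centers."""
    return sq_dist(possibilistic_center(source, eta),
                   possibilistic_center(target, eta))

# Toy data: the target is the source shifted by (1, 1), plus one noisy sample.
source = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1), (0.1, 0.2), (-0.2, -0.2)]
target_clean = [(x + 1.0, y + 1.0) for x, y in source]
target_noisy = target_clean + [(13.0, 13.0)]

mmd_clean = linear_mmd2(source, target_clean)
mmd_noisy = linear_mmd2(source, target_noisy)
robust_noisy = robust_mean_gap2(source, target_noisy)

print("plain gap, clean target:", mmd_clean)
print("plain gap, noisy target:", mmd_noisy)
print("weighted gap, noisy target:", robust_noisy)
```

In this toy setting a single outlier raises the plain mean gap from about 2 to about 18, while the membership-weighted gap stays close to the clean value because the outlier's membership is close to zero. The choice of `eta` controls how quickly memberships decay with distance and would need tuning on real data.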

DAN Yufang, TAO Jianwen. Possibilistic Distribution Distance Measure: Robust Domain Adaptation Learning Method[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(3): 674-692.
