Research on Fourier Augmented Unbiased Cross-Domain Object Detection

doi:10.3778/j.issn.1673-9418.2308032

Abstract

Abstract: Unbiased cross-domain object detection aims to make maximal use of source-domain knowledge through knowledge distillation and to narrow the model's cross-domain gap through domain adaptation. However, the pseudo labels produced by the mean teacher method commonly used for this task are unreliable, which leaves a significant domain bias between the teacher and student models. Inspired by the invariance of phase information under the Fourier transform, this paper proposes the Fourier augmentation unbiased mean teacher (FAUMT) model, built on the mean teacher. Exploiting the invariance of Fourier phase information, the paper designs an amplitude mixing data augmentation (AMDA) module, which can effectively mix phase information between the source and target domains to achieve data augmentation. Because data augmentation introduces additional noise, the paper designs two consistency losses to keep predictions consistent before and after augmentation. In addition, to balance the cross-domain bias between the source and target domains during training, the paper designs a multi-layer adversarial learning (MAL) module that aligns pixel-level features at different levels across domains. On the three benchmark datasets Clipart1K, Watercolor2K and Comic2K, the proposed method achieves mAP of 47.5%, 58.9% and 46.1%, respectively, outperforming other algorithms.

Key words: domain adaptation, cross-domain object detection, deep neural network, mean teacher
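
The abstract names an amplitude mixing augmentation but gives no formula, so the following is only a minimal sketch of one common way such a step can be written, not the authors' AMDA implementation. It assumes that the amplitude spectra of a source image and a target image are linearly interpolated while the source phase is kept unchanged. The function name `amplitude_mix` and the mixing weight `lam` are hypothetical.

```python
import numpy as np


def amplitude_mix(source, target, lam=0.5):
    """Blend the Fourier amplitudes of two same-shaped images, keeping the source phase.

    Illustrative assumption only: `lam` is a hypothetical mixing weight in [0, 1];
    lam=0 returns the source image unchanged (up to floating-point error).
    """
    # 2-D FFT over the spatial axes (H, W); a trailing channel axis is left alone.
    spec_src = np.fft.fft2(source, axes=(0, 1))
    spec_tgt = np.fft.fft2(target, axes=(0, 1))

    # Split the source spectrum into amplitude and phase; only the target amplitude is used.
    amp_src, phase_src = np.abs(spec_src), np.angle(spec_src)
    amp_tgt = np.abs(spec_tgt)

    # Interpolate amplitudes, then recombine with the untouched source phase.
    amp_mixed = (1.0 - lam) * amp_src + lam * amp_tgt
    spec_mixed = amp_mixed * np.exp(1j * phase_src)

    # The mixed spectrum stays conjugate-symmetric, so the imaginary part is numerical noise.
    return np.real(np.fft.ifft2(spec_mixed, axes=(0, 1)))


# Example with random arrays standing in for a source and a target image.
rng = np.random.default_rng(0)
src = rng.random((32, 32, 3))
tgt = rng.random((32, 32, 3))
augmented = amplitude_mix(src, tgt, lam=0.3)
```

Keeping the phase fixed follows the abstract's premise that phase information is invariant, while the amplitude carries the domain-specific appearance being mixed. How the paper samples the mixing weight, and whether it mixes the full spectrum or only part of it, is not stated here.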


WANG Bing, XU Pei, ZHANG Xingpeng. Research on Fourier Augmented Unbiased Cross-Domain Object Detection[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2436-2448.

