Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (9): 2384-2394. DOI: 10.3778/j.issn.1673-9418.2307016
• Theory·Algorithm •
JIANG Youpeng, HUA Yang, SONG Xiaoning
Online: 2024-09-01
Published: 2024-09-01
姜友鹏,华阳,宋晓宁
JIANG Youpeng, HUA Yang, SONG Xiaoning. Domain Adaptation Algorithm for 3D Human Pose Estimation with Spatial Attention and Position Optimization[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2384-2394.
姜友鹏, 华阳, 宋晓宁. 空间注意力与位置优化的三维人体姿态估计域适应算法[J]. 计算机科学与探索, 2024, 18(9): 2384-2394.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2307016
[1] LI Mengyun, ZHANG Jing, ZHANG Huanxiang, ZHANG Xiaolin, LIU Luyao. Multimodal Sentiment Analysis Based on Cross-Modal Semantic Information Enhancement[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2476-2486.
[2] XU Zhihong, ZHANG Huibin, DONG Yongfeng, WANG Liqin, WANG Xu. Question Feature Enhanced Knowledge Tracing Model[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2466-2475.
[3] YUAN Heng, WANG Xiaoxue, ZHANG Shengchong. No-Reference Low-Light Image Enhancement with Enhanced Feature Map[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2449-2465.
[4] YE Qingwen, ZHANG Qiuju. Multi-label Image Recognition Using Channel Pixel Attention[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(8): 2109-2117.
[5] LI Zhengwei, WANG Xili, AI Mei. Prototype-Combined Two-Stage Unsupervised Domain Adaptation Segmentation Model for Remote Sensing Images[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(8): 2091-2108.
[6] WANG Guokai, ZHANG Xiang, WANG Shunfang. Multi-scale and Boundary Fusion Network for Skin Lesion Regions Segmentation[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1826-1837.
[7] WANG Yonggui, LIU Danni. Cross-Domain Recommendation Algorithm Combining Multi-personalized Bridges and Self-supervised Learning[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1792-1805.
[8] WEN Wen, DENG Fengying, HAO Zhifeng, CAI Ruichu, LIANG Fangyu. Recommendation Method for Time-Sequence Point of Interest via Spatio-Temporal Vicinity Perception[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1865-1878.
[9] XU Zhihong, HAO Xuemei, WANG Liqin, DONG Yongfeng, WANG Xu. Research on Knowledge Graph Entity Prediction Method of Multi-modal Curriculum Learning[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(6): 1590-1599.
[10] XIA Qingfeng, XU Ke'er, LI Mingyang, HU Kai, SONG Lipeng, SONG Zhiqiang, SUN Ning. Review of Attention Mechanisms in Reinforcement Learning[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(6): 1457-1475.
[11] YANG Li, ZHONG Junhong, ZHANG Yun, SONG Xinyu. Temporal Multimodal Sentiment Analysis with Composite Cross Modal Interaction Network[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(5): 1318-1327.
[12] WANG Xiang, MAO Li, CHEN Qidong, SUN Jun. Sentiment Analysis Combining Dynamic Gradient and Multi-view Co-attention[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(5): 1328-1338.
[13] WANG Longye, XIAO Yue, ZENG Xiaoli, ZHANG Kaixin, MA Ao. Skin Disease Segmentation Method Combining Dense Encoder and Dual-Path Attention[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(4): 978-989.
[14] MA Jinlin, CUI Qilei, MA Ziping, YAN Qi, CAO Haojie, WU Jiangtao. Pre-weighted Modulated Dense Graph Convolutional Networks for 3D Human Pose Estimation[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(4): 963-977.
[15] CHEN Linying, LIU Jianhua, ZHENG Zhixiong, LIN Jie, XU Ge, SUN Shuihua. Multi-feature Interaction for Aspect Sentiment Triplet Extraction[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(4): 1057-1067.