计算机科学与探索 (Journal of Frontiers of Computer Science and Technology)

• Academic Research •

空间注意力与位置优化的三维人体姿态估计域适应算法

姜友鹏, 华阳, 宋晓宁   

  1. 江南大学 人工智能与计算机学院 江苏省模式识别与计算智能工程实验室, 江苏 无锡 214122

Domain Adaptation Algorithm for 3D Human Pose Estimation With Spatial Attention and Position Optimization

JIANG Youpeng, HUA Yang, SONG Xiaoning   

  1. Jiangsu Engineering Laboratory of Pattern Recognition and Computational Intelligence, School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China

摘要: 现有三维人体姿态估计器在单个数据集上表现较好,但受限于训练数据姿态结构的单一,其在跨域实验上的泛化性不足。现有方法通过增加姿态多样性来弥补该缺陷,然而这些方法生成的新姿态缺乏真实有效性且姿态全局位置的分布与目标数据集仍存在显著差距。针对上述问题,提出一种基于生成对抗网络(generative adversarial network, GAN)的空间注意力与全局位置优化的三维人体姿态估计域适应算法。算法引入空间节点注意力模块约束生成器产生更自然的人体姿态,并结合姿态位置修正模块促使生成姿态向目标数据域对齐,从而解决以上域适应问题。此外,为了提升估计器训练的稳定性,提出一种端到端随机混合的训练策略,使姿态估计器可兼顾新旧数据信息的学习。作为一种生成式的域适应方法,本算法可以高效地应用于各种二阶段三维人体姿态估计器。通过跨场景实验与跨数据集实验,结果表明所提算法在多个基准数据集上的表现均达到当前最佳。其中在3DHP数据集中,本方法MPJPE与AUC指标相比最优工作优化了1.7%和1.4%,验证了所提算法可有效提高三维人体姿态估计器的泛化性。

关键词: 三维人体姿态估计, 无监督域适应, 生成对抗网络, 注意力机制

Abstract: Existing 3D human pose estimators perform well on a single dataset, but because their training data cover a limited range of pose structures, they generalize poorly in cross-domain experiments. Existing methods compensate by increasing pose diversity, but the poses they generate are often unrealistic, and the distribution of global pose positions still differs significantly from that of the target dataset. To address these issues, a domain adaptation algorithm for 3D human pose estimation is proposed, based on a generative adversarial network (GAN) with spatial attention and global position optimization. Specifically, a spatial node attention module is introduced to constrain the generator to produce more natural human poses, and a pose position correction module is incorporated to align the generated poses with the target data domain. Furthermore, to stabilize estimator training, an end-to-end random mixing training strategy is proposed so that the pose estimator learns from both old and new data. As a generative domain adaptation approach, the algorithm can be applied efficiently to a variety of two-stage 3D human pose estimators. Cross-scene and cross-dataset experiments show that the proposed algorithm achieves state-of-the-art performance on several benchmark datasets. In particular, on the 3DHP dataset, it improves the MPJPE and AUC metrics by 1.7% and 1.4%, respectively, over the previous best method, verifying that it effectively improves the generalization of 3D human pose estimators.

Key words: 3D human pose estimation, unsupervised domain adaptation, generative adversarial network (GAN), attention mechanism
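
For readers unfamiliar with the MPJPE metric reported in the abstract, the following is a minimal sketch of how mean per-joint position error is conventionally computed: the average Euclidean distance between predicted and ground-truth 3D joints. The function name `mpjpe` and the toy joint coordinates are illustrative assumptions and are not taken from the paper; the paper's own evaluation protocol (for example, any root-joint alignment) is not specified here.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: sequences of (x, y, z) joint coordinates in the same units.
    Returns the average Euclidean distance between corresponding joints.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Toy pose with three joints, in millimetres (made-up values for illustration).
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 12.0)]

# Per-joint distances are 5, 0, and 12 mm, so the mean is 17/3 mm.
error_mm = mpjpe(pred, gt)
```

A lower MPJPE indicates a more accurate pose, whereas AUC (area under the curve of the percentage of correct keypoints across distance thresholds) is better when higher.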