Journal of Frontiers of Computer Science and Technology

• Academic Research •

A Survey on 3D Human Pose Estimation of Deep Learning

WANG Shichen, HUANG Kai, CHEN Zhigang, ZHANG Wendong

  1. School of Software, Xinjiang University, Urumqi 830046, China
  2. School of Computer Science and Engineering, Central South University, Changsha 410083, China

Abstract: Three-dimensional (3D) human pose estimation aims to predict the 3D coordinates and angles of human joints from input data such as RGB and RGB-D images, and to construct a human body representation (such as a human skeleton) for further analysis of human pose. In recent years, with the development of artificial intelligence and 5G technology, 3D pose estimation has attracted wide attention for its rich information and more intuitive representation, and has been widely applied in motion analysis, virtual reality, medical assistance, film production, and other fields. With the continuous advancement of deep learning, an increasing number of high-performance deep learning-based 3D human pose estimation methods have been proposed. However, 3D human pose estimation remains challenging because of occlusion of the human body in images and the need for large-scale training data. This paper reviews a number of research papers published since 2017, analyzes and compares the inference processes and core elements of these methods, and gives a comprehensive account of recent deep learning-based 3D pose estimation methods. In addition, it introduces the relevant datasets and evaluation metrics, and compares and analyzes the experimental results of selected models on the Human3.6M dataset. Finally, based on the results of this survey, it discusses the current challenges facing 3D human pose estimation and offers an outlook on its future development.

Key words: 3D human pose estimation, deep learning, neural networks, keypoint detection