Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2025, Vol. 19, Issue (3): 682-692. DOI: 10.3778/j.issn.1673-9418.2403059

• Graphics · Image •

Research on Lightweight Model of Multi-person Pose Estimation Based on Improved YOLOv8s-Pose

FU Yu (傅裕), GAO Shuhui (高树辉)

  1. School of Investigation, People’s Public Security University of China, Beijing 100038, China
  • Online: 2025-03-01 Published: 2025-02-28

Abstract: To address the high computational load and slow detection speed of existing human pose estimation models, this paper proposes a lightweight improved algorithm based on the YOLOv8s-Pose model. First, the lightweight module C2f-GhostNetBottleNeckV2 replaces the original C2f in the backbone, reducing the number of parameters and increasing model speed. Second, the Non_Local attention mechanism is introduced to capture and propagate the position information of human key points, integrating it into the channel dimension and fusing global information directly; this gives subsequent layers richer semantic information, strengthens feature extraction, and mitigates the accuracy loss that often follows model lightweighting. Third, a weighted bidirectional feature pyramid network is incorporated into the neck layer; its bidirectional fusion re-plans the top-down and bottom-up information paths so that features of different scales are well balanced. Fourth, a small object detection head is added to the network to reduce missed detections of small objects. Finally, the CIOU loss function is replaced with the Focal-EIOU loss function to improve the accuracy of human key point regression and the robustness in complex and multi-target scenes. Experimental results show that the improved model reduces the number of parameters by 9.3% and, compared with the original model on the COCO2017 human key points dataset, improves mAP@0.50 by 0.4 percentage points and mAP@0.50:0.95 by 0.6 percentage points. The proposed algorithm therefore reduces the number of model parameters while improving the accuracy of human pose estimation, especially for small targets, and provides an effective means of achieving real-time, accurate pose estimation.
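
The weighted fusion used in a bidirectional feature pyramid can be illustrated with a minimal sketch. It assumes the fast normalized fusion rule commonly associated with BiFPN, in which each input feature gets a learnable non-negative weight and the weights are normalized by their sum. The abstract does not state which fusion rule the authors use, so the function name `fast_normalized_fusion`, the flat-list feature representation, and the `eps` value are illustrative assumptions, not the paper's implementation.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with normalized, non-negative weights.

    Hypothetical helper: features are plain lists of floats standing in for
    feature maps that have already been resized to a common shape.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per input feature")

    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")

    # Clamp each weight at zero (ReLU) so that no input contributes negatively.
    clamped = [max(0.0, w) for w in weights]
    # eps keeps the division stable when every weight is clamped to zero.
    total = sum(clamped) + eps

    fused = [0.0] * length
    for feature, w in zip(features, clamped):
        for k, value in enumerate(feature):
            fused[k] += (w / total) * value
    return fused


if __name__ == "__main__":
    top_down = [1.0, 2.0]    # e.g. an upsampled coarse-scale feature
    lateral = [3.0, 4.0]     # e.g. the same-scale backbone feature
    print(fast_normalized_fusion([top_down, lateral], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; during training the weights would be learned, letting the network decide how much each scale contributes.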

Key words: pose estimation, YOLOv8s-Pose, GhostNetV2 network, weighted bidirectional feature pyramid network, loss function
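
The Focal-EIOU loss named in the abstract can be sketched for a single pair of boxes. The sketch assumes the usual formulation: the EIoU term adds penalties for center distance, width difference, and height difference (each normalized by the smallest enclosing box) to `1 - IoU`, and the focal variant reweights that term by `IoU ** gamma`. The function name, the `(x1, y1, x2, y2)` box format, and the default `gamma=0.5` are assumptions for illustration; the paper's own implementation and hyperparameters are not given in the abstract.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-7):
    """Focal-EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Width and height of the smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    center_dist2 = (((px1 + px2) - (tx1 + tx2)) ** 2
                    + ((py1 + py2) - (ty1 + ty2)) ** 2) / 4.0

    eiou = (1.0 - iou
            + center_dist2 / (cw ** 2 + ch ** 2 + eps)   # center penalty
            + (pw - tw) ** 2 / (cw ** 2 + eps)           # width penalty
            + (ph - th) ** 2 / (ch ** 2 + eps))          # height penalty

    # Focal reweighting: boxes with higher IoU get a larger weight.
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0)))
```

One consequence of the `IoU ** gamma` weighting in this form is that a pair of non-overlapping boxes yields zero loss, so practical implementations typically pair it with an assignment step that only scores overlapping predictions.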