Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (12): 2942-2953. DOI: 10.3778/j.issn.1673-9418.2301006

• Graphics · Image •

Object Detection Algorithm with Dynamic Loss and Enhanced Feature Fusion

ZHAO Qiming, ZHANG Tao, SUN Jun   

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online:2023-12-01 Published:2023-12-01

Abstract: Object detection is one of the most active research directions in computer vision. To further improve the performance of object detection algorithms and address the limitations of the position loss function during training, a dynamic intersection over union loss (DYIoU Loss) based on the intersection over union (IoU) is proposed. It fully accounts for the relationships among the components of the position loss and dynamically assigns them different weights at different stages of training, so that the network is constrained in a more targeted way: in the early, middle, and late stages of training, different components are optimized in a manner better suited to the characteristics of the object detection task. In addition, to remedy the shortcomings of the feature fusion stage in object detection networks, deformable convolution is introduced into the PAN (path aggregation network) structure, and a plug-and-play deformable path aggregation network neck (DePAN Neck) is designed to improve the model's ability to fuse multi-scale features and thereby its detection performance on small objects. The above methods are applied to YOLOv6 models of three sizes, YOLOv6-N, YOLOv6-T, and YOLOv6-S, and extensive experiments on the COCO2017 dataset validate their effectiveness, with the mean average precision (mAP) improving by 2.0 percentage points on average.
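
For illustration, the following is a minimal PyTorch sketch of a dynamically weighted IoU-based box loss in the spirit of DYIoU Loss. The abstract does not specify which components the loss decomposes into or how the weights evolve, so the CIoU-style decomposition (overlap, center distance, aspect-ratio consistency) and the linear weight schedule over training progress are illustrative assumptions, not the paper's exact formulation.

import math
import torch

def bbox_ciou_terms(pred, target, eps=1e-7):
    # CIoU-style components for boxes given as (x1, y1, x2, y2) tensors of shape (N, 4).
    # Overlap term (IoU)
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Center-distance term, normalized by the diagonal of the enclosing box
    cxp, cyp = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cxt, cyt = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    ew = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    eh = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    dist = ((cxp - cxt) ** 2 + (cyp - cyt) ** 2) / (ew ** 2 + eh ** 2 + eps)
    # Aspect-ratio consistency term
    wp = (pred[:, 2] - pred[:, 0]).clamp(min=eps)
    hp = (pred[:, 3] - pred[:, 1]).clamp(min=eps)
    wt = (target[:, 2] - target[:, 0]).clamp(min=eps)
    ht = (target[:, 3] - target[:, 1]).clamp(min=eps)
    shape = (4 / math.pi ** 2) * (torch.atan(wt / ht) - torch.atan(wp / hp)) ** 2
    return iou, dist, shape

def dynamic_iou_loss(pred, target, progress):
    # progress in [0, 1] is the fraction of training completed.
    # Assumed schedule: emphasize coarse overlap early, then shift weight
    # toward the finer-grained center-distance and shape terms.
    iou, dist, shape = bbox_ciou_terms(pred, target)
    w_iou = 1.0
    w_dist = 0.5 + progress        # 0.5 -> 1.5 over training
    w_shape = 2.0 * progress       # 0.0 -> 2.0 over training
    return (w_iou * (1.0 - iou) + w_dist * dist + w_shape * shape).mean()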
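
Similarly, below is a minimal sketch of a plug-and-play deformable fusion block in the spirit of the DePAN Neck, assuming torchvision's DeformConv2d and one top-down fusion step: the deeper feature is upsampled, concatenated with the lateral feature, and passed through a deformable convolution whose offsets are predicted by a plain convolution. The block layout and channel choices are assumptions; the abstract does not describe the exact DePAN structure.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d

class DeformFusion(nn.Module):
    # One top-down fusion step of a PAN-style neck with deformable convolution.
    def __init__(self, c_top, c_lateral, c_out, k=3):
        super().__init__()
        c_in = c_top + c_lateral
        # A plain conv predicts 2 sampling offsets (dx, dy) per kernel tap.
        self.offset = nn.Conv2d(c_in, 2 * k * k, kernel_size=k, padding=k // 2)
        nn.init.zeros_(self.offset.weight)   # start from the regular sampling grid
        nn.init.zeros_(self.offset.bias)
        self.dcn = DeformConv2d(c_in, c_out, kernel_size=k, padding=k // 2)
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, top, lateral):
        # Upsample the deeper (coarser) map and fuse by concatenation.
        top = F.interpolate(top, size=lateral.shape[-2:], mode="nearest")
        x = torch.cat([top, lateral], dim=1)
        # The deformable conv samples at learned offsets, letting the fusion
        # adapt its receptive field to object scale and shape.
        return F.relu(self.bn(self.dcn(x, self.offset(x))))

# Usage: fuse a stride-32 feature into a stride-16 level.
p5 = torch.randn(1, 256, 20, 20)
p4 = torch.randn(1, 128, 40, 40)
fuse = DeformFusion(c_top=256, c_lateral=128, c_out=128)
print(fuse(p5, p4).shape)   # torch.Size([1, 128, 40, 40])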

Key words: object detection, deep learning, position loss, feature fusion