Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (12): 3247-3259. DOI: 10.3778/j.issn.1673-9418.2312050

• Graphics·Image •

Target Detection Algorithm Based on Global Feature Fusion in Parallel Dual Path Backbone

QIU Yunfei, XIN Hao   

  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Online: 2024-12-01 Published: 2024-11-29

并行双路径主干下全局特征融合的目标检测算法

邱云飞,辛浩   

  1. 辽宁工程技术大学 软件学院,辽宁 葫芦岛 125105

Abstract: Aggressive downsampling in the backbone of a conventional single-path architecture often leads to insufficient feature extraction and information loss. Moreover, simply adding or concatenating the levels of a feature pyramid does not integrate shallow and deep features well. To address these problems, a target detection algorithm based on global feature fusion with a parallel dual-path backbone is proposed. First, a dual-path backbone extracts spatial and semantic information in parallel, and a dual-path fusion module lets the two kinds of feature information complement each other. Second, the top-level feature is added in turn to the multi-scale pooled maps produced by pyramid pooling, and an attention mechanism aggregates the multi-scale pooled features into it, further improving multi-scale detection performance. Third, global-scale information is aggregated and injected into the features of different layers by a self-attention mechanism; repeating this step several times builds a global feature fusion neck network, which effectively improves the neck's ability to fuse global context information. Finally, the detection head uses Ghost Conv combined with a channel shuffle operation to maintain model performance while reducing parameter redundancy. Experiments on the KITTI, BDD100K and PASCAL VOC datasets show that the average precision of the proposed algorithm is 3.5, 3.4 and 2.7 percentage points higher, respectively, than that of the baseline model (YOLOv7-tiny). The results indicate that the proposed algorithm improves detection performance in complex scenes while placing low demands on computing power and other resources.

Key words: target detection, dual path backbone, pooling attention, global feature fusion neck network, Ghost detection head

摘要: 常规单路径架构主干经过积极的下采样,往往导致特征信息的丢失。同时,仅依靠特征金字塔简单地相加或拼接不利于浅层到深层的特征集成。针对上述问题,提出一种并行双路径主干下全局特征融合的目标检测算法。采用双路径架构主干并行地提取空间与语义信息,并通过双路径融合模块,促进特征信息间的相互补充。顶部特征依次与金字塔池化多尺度池映射相加,利用注意力机制将多尺度池化特征聚集其中,进一步提高多尺度的检测性能。聚集全局尺度信息,利用自注意机制将其融入不同层特征,并重复多次以构建全局特征融合的颈部网络结构,有效提升颈部网络融合全局上下文信息的能力。头部采用Ghost Conv并结合通道混洗操作,维持模型性能的同时减少参数冗余。在KITTI、BDD100K和PASCAL VOC数据集上展开实验,所提算法的平均精度值相较于基线模型(YOLOv7-tiny)分别提高了3.5、3.4和2.7个百分点。实验结果表明,提出的算法提升了复杂场景下的检测性能,而且对算力等资源的要求较低。

关键词: 目标检测, 双路径主干, 池化注意力, 全局特征融合颈部网络, Ghost检测头
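
Illustrative note: the abstract states that the detection head combines Ghost Conv with channel shuffling to cut parameter redundancy. The sketch below is not the authors' implementation. It is a minimal pure-Python illustration under stated assumptions: the parameter count follows the usual Ghost module layout (a primary convolution producing 1/ratio of the output channels, plus cheap depthwise operations producing the rest, biases ignored), and the shuffle is the standard grouped channel interleave. The function names, the layer sizes, and the defaults `ratio=2` and `dw_k=3` are hypothetical choices for illustration and do not come from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def ghost_conv_params(c_in, c_out, k, ratio=2, dw_k=3):
    """Weight count of a Ghost-style module (assumed layout, bias ignored).

    A primary k x k convolution produces c_out / ratio "intrinsic" channels;
    each intrinsic channel then yields (ratio - 1) extra "ghost" channels
    through cheap dw_k x dw_k depthwise filters.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = k * k * c_in * intrinsic
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap


def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to viewing the channel axis as (groups, per_group),
    transposing to (per_group, groups), and flattening again.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
    print(standard_conv_params(128, 256, 3))  # 294912
    print(ghost_conv_params(128, 256, 3))     # 148608

    # Concatenated Ghost output: intrinsic channels first, ghost channels after.
    concatenated = ["p0", "p1", "g0", "g1"]
    print(channel_shuffle(concatenated, 2))   # ['p0', 'g0', 'p1', 'g1']
```

For this hypothetical layer the Ghost-style module needs roughly half the weights of the ordinary convolution. The shuffle then interleaves intrinsic and ghost channels, so a later grouped operation sees both kinds in every group.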