Journal of Frontiers of Computer Science and Technology, 2023, Vol. 17, Issue (5): 1112-1125. DOI: 10.3778/j.issn.1673-9418.2109115

• Graphics·Image •

Object Detection Algorithm Based on Channel Separation Dual Attention Mechanism

ZHAO Shan (赵珊), ZHENG Ailing (郑爱玲), LIU Zilu (刘子路), GAO Yu (高雨)

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454003, China
  • Online: 2023-05-01 Published: 2023-05-01

Abstract: To address the low detection accuracy and high miss rate for small objects in two-stage object detection algorithms, an object detection algorithm based on a channel separation dual attention mechanism is proposed, which raises small-object detection accuracy by improving the Faster+FPN backbone network. First, because neural networks cannot automatically learn the relative importance of features, a dual attention mechanism is introduced into the channel separation process to build the deep neural network, and group convolution and dilated convolution are combined to reduce the number of network parameters. Second, to address the information loss that high-resolution features suffer after passing through a deep CNN, a detail extraction module and a channel attention feature fusion module are added to extract more detailed features. Finally, since general loss functions cannot focus on assessing the confidence of the object's location, KL divergence is incorporated to optimize the loss function, so that training brings the predicted distribution closer to the true distribution, effectively addressing the problems of applying neural networks directly to object detection. The network is trained on three datasets, PASCAL VOC2007, KITTI and Pedestrian, and the proposed model is compared with several object detection algorithms. Experimental results show that the algorithm recognizes images efficiently and achieves high detection accuracy.

Key words: channel separation, dual attention mechanism, feature pyramid networks (FPN), KL divergence, object detection
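
The abstract says the loss is optimized with KL divergence so that the predicted distribution moves closer to the true distribution, but it does not give the formula. The following is a minimal sketch of the generic discrete KL divergence only, not the authors' loss function. The function name `kl_divergence` and the example distributions `target`, `far` and `near` are hypothetical and chosen purely for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D_KL(p || q) in nats.

    p is treated as the reference ("true") distribution and q as the
    prediction. eps guards against log(0) when q has a zero entry.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical reference distribution over four candidate box positions.
target = [0.05, 0.80, 0.10, 0.05]

# Two hypothetical predictions: a uniform guess and a closer estimate.
far = [0.25, 0.25, 0.25, 0.25]
near = [0.05, 0.70, 0.20, 0.05]

loss_far = kl_divergence(target, far)
loss_near = kl_divergence(target, near)

# The closer prediction yields the smaller divergence.
print(loss_far > loss_near)
```

Minimizing a term of this kind during training penalizes predictions whose distribution differs from the reference, which is the behavior the abstract describes. How the paper parameterizes the two distributions for bounding-box localization is not stated here and is left open in this sketch.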