Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (4): 927-937. DOI: 10.3778/j.issn.1673-9418.2108087

• Graphics and Image

Deep Small Object Detection Algorithm Integrating Attention Mechanism

ZHAO Pengfei, XIE Linbo+, PENG Li

  1. Engineering Research Center of Internet of Things Technology Applications (School of Internet of Things Engineering, Jiangnan University), Ministry of Education, Wuxi, Jiangsu 214122, China
  • Received: 2021-07-22 Revised: 2021-09-30 Online: 2022-04-01 Published: 2021-10-18
  • About author:ZHAO Pengfei, born in 1996, M.S. candidate. His research interests include visual object detection and deep learning.
    XIE Linbo, born in 1973, Ph.D., professor, Ph.D. supervisor, member of CAA. His research interests include process modeling and control, intelligent detection and system safety.
    PENG Li, born in 1967, Ph.D., professor, Ph.D. supervisor, member of CAAI and CCF. His research interests include the visual Internet of Things and intelligent detection.
  • Supported by:
    National Natural Science Foundation of China (61873112); National Key Research and Development Program of China (2018YFD0400902)

融合注意力机制的深层次小目标检测算法

赵鹏飞, 谢林柏+, 彭力

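  1. Engineering Research Center of Internet of Things Technology Applications, Ministry of Education (School of Internet of Things Engineering, Jiangnan University), Wuxi, Jiangsu 214122, China
  • Corresponding author: + E-mail: xie_linbo@jiangnan.edu.cn
  • About the authors: ZHAO Pengfei (born 1996), male, from Yancheng, Jiangsu, M.S. candidate. His research interests include object detection and deep learning.
    XIE Linbo (born 1973), male, from Yongzhou, Hunan, Ph.D., professor, Ph.D. supervisor, member of CAA. His research interests include process modeling and control, intelligent detection and system safety.
    PENG Li (born 1967), male, from Tangshan, Hebei, Ph.D., professor, Ph.D. supervisor, member of CAAI and CCF. His research interests include the visual Internet of Things and intelligent detection.
  • Supported by:
    National Natural Science Foundation of China (61873112); National Key Research and Development Program of China (2018YFD0400902)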

Abstract:

Insufficient feature extraction in the backbone network and a lack of semantic information in shallow convolutional layers often lead to poor detection results on small objects. To improve the accuracy and robustness of small object detection, this paper proposes a deep small object detection algorithm that integrates an attention mechanism. First, to address the insufficient feature extraction capability of the backbone network, Darknet-53 is selected as the feature extraction network, and a new grouped residual connection is proposed to replace the residual connection structure of the original Darknet-53, forming a new enhanced backbone network named I-Darknet53. By interweaving the feature information of different channels, this grouped residual structure effectively enlarges the receptive field of the output. Second, in the multi-scale detection phase, a shallow feature enhancement network is proposed. It combines a feature enhancement module with an efficient feature fusion strategy guided by a channel attention mechanism, fusing shallow and deep layers to obtain enhanced shallow features and thereby compensating for the lack of semantic information in shallow features. Experimental results show that the proposed algorithm outperforms the SSD algorithm on the PASCAL VOC dataset. When the input image size is 300×300, the mean average precision (mAP) of the proposed model is 80.2%; when the input image size is 500×500, the mAP is 82.3%. In addition, the model improves detection accuracy on small objects while maintaining detection speed.

Key words: small object detection, feature extraction, feature fusion, attention mechanism

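Abstract (translated from the Chinese version):

Insufficient feature extraction in the backbone network and the lack of semantic information in shallow convolutional layers often lead to poor detection of small objects. To improve the accuracy and robustness of small object detection, a deep small object detection algorithm integrating an attention mechanism is proposed. First, to address the backbone network's insufficient feature extraction capability, Darknet-53 is chosen as the feature extraction network, and a new grouped residual connection is constructed to replace the residual connection structure of the original Darknet-53, forming a new enhanced backbone network, I-Darknet53. By interweaving the feature information of different channels, this grouped residual structure effectively enlarges the receptive field of the output. Second, in the multi-scale detection stage, a shallow feature enhancement network is proposed, which uses a feature enhancement module and an efficient feature fusion strategy guided by a channel attention mechanism to fuse shallow and deep layers into enhanced shallow features, thereby alleviating the shortage of semantic information in shallow features. Experimental results show that, compared with the SSD algorithm, the proposed algorithm achieves better detection results on the PASCAL VOC dataset. With an input image size of 300×300, the model's mean average precision (mAP) is 80.2%; with an input image size of 500×500, the mAP is 82.3%. Moreover, the model's detection of small objects is enhanced while detection speed is maintained.

Key words: small object detection, feature extraction, feature fusion, attention mechanism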
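
The abstract does not give the internals of the channel-attention-guided fusion, so the following is only a rough illustration of the general idea and not the authors' module. It assumes a generic squeeze-and-excitation style channel attention, an element-wise sum as the fusion step, nearest-neighbour upsampling, and shallow and deep feature maps that already share the same channel count. All function names, shapes, the reduction ratio `r`, and the random weights are hypothetical.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel weights for a (C, H, W) map."""
    squeezed = x.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # bottleneck FC + ReLU -> (C // r,)
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # FC + sigmoid -> (C,)

def fuse(shallow, deep, w1, w2):
    """Fuse a shallow map with an upsampled deep map, then reweight channels."""
    factor = shallow.shape[1] // deep.shape[1]
    merged = shallow + upsample_nearest(deep, factor)  # element-wise sum (assumed)
    weights = channel_attention(merged, w1, w2)
    return merged * weights[:, None, None]             # scale each channel

rng = np.random.default_rng(0)
C, r = 8, 4                                   # hypothetical channels and reduction ratio
shallow = rng.standard_normal((C, 16, 16))    # high-resolution, low-semantic features
deep = rng.standard_normal((C, 8, 8))         # low-resolution, high-semantic features
w1 = rng.standard_normal((C // r, C)) * 0.1   # random stand-ins for learned weights
w2 = rng.standard_normal((C, C // r)) * 0.1
fused = fuse(shallow, deep, w1, w2)
print(fused.shape)  # (8, 16, 16)
```

In a trained detector the two weight matrices would be learned, and the deep features would typically pass through a projection layer to match the shallow channel count before fusion; both steps are omitted here for brevity.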

CLC Number: