Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2022, Vol. 16, Issue (2): 438-447. DOI: 10.3778/j.issn.1673-9418.2105048

• Graphics and Image •

注意力与多尺度有效融合的SSD目标检测算法

王燕妮, 余丽仙+

  1. 西安建筑科技大学 信息与控制工程学院,西安 710055
  • 收稿日期:2021-05-13 修回日期:2021-07-16 出版日期:2022-02-01 发布日期:2021-07-22
  • 通讯作者: + E-mail: ylx@xauat.edu.cn
  • 作者简介:王燕妮(1975—),女,陕西渭南人,博士,副教授,主要研究方向为智能信息处理、图像处理。
    余丽仙(1996—),女,浙江淳安人,硕士研究生,主要研究方向为图像处理、深度学习、目标检测。
  • 基金资助:
    陕西省自然科学基础研究项目(2020JM-499);陕西省自然科学基础研究项目(2020JQ-684)

SSD Object Detection Algorithm with Effective Fusion of Attention and Multi-scale

WANG Yanni, YU Lixian+

  1. School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
  • Received: 2021-05-13 Revised: 2021-07-16 Published: 2022-02-01 Online: 2021-07-22
  • Corresponding author: + E-mail: ylx@xauat.edu.cn
  • About the authors: WANG Yanni, born in 1975 in Weinan, Shaanxi, Ph.D., associate professor. Her research interests include intelligent information processing and image processing.
    YU Lixian, born in 1996 in Chun'an, Zhejiang, M.S. candidate. Her research interests include image processing, deep learning and object detection.
  • Supported by:
    Natural Science Basic Research Project of Shaanxi Province (2020JM-499); Natural Science Basic Research Project of Shaanxi Province (2020JQ-684)

摘要:

针对传统的SSD目标检测算法在进行多尺度目标检测时,存在特征图有效信息弱和困难目标漏检率大等问题,提出一种改进的SSD目标检测算法。首先,在网络特征图输出处引入即插即用的轻量级注意力机制,通过不降维、局部跨通道交互以及核大小自适应选择等操作,在保持网络原始计算量的同时有效突出特征图中关键信息。该模块有利于增强背景信息和目标信息差,可以在有效提升网络性能的同时,不增加网络的复杂性。然后,构造了一种新的特征融合模块,可以将不同尺度的特征图进行有效融合,使浅层特征层既含有丰富的细节信息,又能充分利用上下文语义信息。多尺度融合模块有利于丰富特征图信息,提升网络对困难目标的检测性能。使用公开的PASCAL VOC数据集验证该方法,改进后的网络在PASCAL VOC2007测试集上的检测精度达到了79.6%,比原始SSD算法提升了2.4个百分点,在遮挡目标数据集上提升了4.7个百分点,充分证明改进方法具有一定的时效性和鲁棒性。

关键词: 目标检测, 深度学习, 轻量级注意力机制, 多尺度特征融合

Abstract:

To address the weak effective information in feature maps and the high miss rate for difficult objects that affect the traditional single shot multibox detector (SSD) in multi-scale object detection, an improved SSD object detection algorithm is proposed. First, a plug-and-play lightweight attention mechanism is introduced at the feature map outputs of the network. By avoiding dimensionality reduction, using local cross-channel interaction, and adaptively selecting the kernel size, it highlights the key information in the feature maps while preserving the original computational cost of the network. This module widens the gap between background information and object information, improving network performance without increasing network complexity. Then, a new feature fusion module is constructed to fuse feature maps of different scales, so that the shallow feature layers both retain rich detail information and make full use of contextual semantic information. The multi-scale fusion module enriches the feature map information and improves the detection performance of the network on difficult objects. The method is validated on the public PASCAL VOC dataset: the improved network reaches a detection accuracy of 79.6% on the PASCAL VOC2007 test set, 2.4 percentage points higher than the original SSD algorithm, and improves by 4.7 percentage points on an occluded-object dataset. These results demonstrate that the improved method has a degree of timeliness and robustness.

Key words: object detection, deep learning, lightweight attention mechanism, multi-scale feature fusion

CLC Number:
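
The attention step described in the abstract (no dimensionality reduction, local cross-channel interaction, adaptive kernel size) can be illustrated with a minimal sketch. The abstract does not name the module or give its formulas, so this assumes an ECA-style channel attention. The kernel-size rule, the defaults `gamma=2` and `b=1`, the uniform placeholder kernel, and the function names are all assumptions made for illustration, not the authors' implementation.

```python
import math


def adaptive_kernel_size(channels, gamma=2, b=1):
    """Pick an odd 1-D kernel size from the channel count.

    Assumed ECA-style rule: k = |(log2(C) + b) / gamma|, rounded to an odd integer.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def channel_attention(feature_map, kernel_weights=None):
    """Reweight channels of a feature map given as C lists of H x W values.

    Returns (rescaled feature map, per-channel attention weights).
    """
    c = len(feature_map)
    if kernel_weights is None:
        k = adaptive_kernel_size(c)
        # Placeholder uniform kernel; in a trained network these are learned.
        kernel_weights = [1.0 / k] * k
    k = len(kernel_weights)
    if k % 2 == 0:
        raise ValueError("kernel size must be odd")

    # Global average pooling: one descriptor per channel, with no
    # dimensionality reduction (the vector keeps length C).
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]

    # Local cross-channel interaction: 1-D convolution over the k
    # neighbouring channel descriptors, zero-padded at both ends.
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[i + j] for j, w in enumerate(kernel_weights))
        for i in range(c)
    ]

    # Sigmoid gate, then rescale every value in each channel.
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    scaled = [
        [[v * weights[i] for v in row] for row in ch]
        for i, ch in enumerate(feature_map)
    ]
    return scaled, weights
```

Under this assumed rule, a 512-channel feature map gets a kernel of size 5 and a 64-channel map a kernel of size 3, so the only added parameters per attention module are the few kernel weights. This is consistent with the abstract's claim that the attention mechanism does not increase network complexity.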