Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (4): 857-867. DOI: 10.3778/j.issn.1673-9418.2112052

• Graphics and Images •

轻量化高精度双通道注意力机制模块

陈晓雷,卢禹冰,曹宝宁,林冬梅   

  1. 兰州理工大学 电气工程与信息工程学院,兰州 730000
  • Publication date: 2023-04-01    Release date: 2023-04-01

Lightweight and High-Precision Dual-Channel Attention Mechanism Module

CHEN Xiaolei, LU Yubing, CAO Baoning, LIN Dongmei   

  1. School of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou 730000, China
  • Online: 2023-04-01    Published: 2023-04-01

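Abstract: Most existing attention mechanism modules improve the accuracy of deep learning models at the cost of increased model complexity. To address this problem, a lightweight, high-precision dual-channel attention mechanism module (EDCA) is proposed. EDCA compresses the feature map along the channel, width, and height directions and rearranges and combines the results, uses a one-dimensional convolution to obtain weight information for the combined features, and then splits the weight information and applies it to the corresponding dimensions to obtain feature attention. Extensive experiments on the image classification dataset miniImageNet and the object detection dataset Pascal VOC2007 show that EDCA requires less computation and fewer parameters than SENet, CBAM, SGE, ECA-Net, and Coordinate Attention. With ResNet50+EDCA on miniImageNet, Top-1 accuracy improves over these methods by 0.0243, 0.0218, 0.0221, 0.0225, and 0.0141, respectively; with MobileNetV3+YOLOv4+EDCA on Pascal VOC2007, AP50 improves over SENet, CBAM, ECA-Net, and Coordinate Attention by 0.0094, 0.0046, 0.0059, and 0.0014, respectively.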

Keywords: attention mechanism, one-dimensional convolution, image classification, object detection

Abstract: At present, most attention mechanism modules improve the accuracy of deep learning models but also increase model complexity. To address this problem, this paper proposes a lightweight and efficient dual-channel attention mechanism module (EDCA). EDCA compresses the feature map along three directions (channel, width, and height) and rearranges the results. A one-dimensional convolution is used to obtain the weight information of the combined features, and the weight information is then split and applied to the corresponding dimensions to obtain feature attention. EDCA is evaluated extensively on the image classification dataset miniImageNet and the object detection dataset Pascal VOC2007. Compared with SENet, CBAM, SGE, ECA-Net, and Coordinate Attention, EDCA requires less computation and fewer parameters. When ResNet50+EDCA is used on miniImageNet, Top-1 accuracy improves over these five methods by 0.0243, 0.0218, 0.0221, 0.0225, and 0.0141, respectively. When MobileNetV3+YOLOv4+EDCA is used on Pascal VOC2007, AP50 improves by 0.0094, 0.0046, 0.0059, and 0.0014 compared with SENet, CBAM, ECA-Net, and Coordinate Attention, respectively.

Key words: attention mechanism, one-dimensional convolution, image classification, object detection
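
The compress, recombine, convolve, split, and reweight pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the pooling operator, the kernel size, the gating function, or how the split weights are applied, so average pooling, a single zero-padded 1D kernel, sigmoid gating, and element-wise multiplication are all assumptions here, and the names `edca_like_attention` and `conv1d_same` are hypothetical.

```python
import math


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def conv1d_same(seq, kernel):
    """Zero-padded 1D cross-correlation that keeps the sequence length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(seq))]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def edca_like_attention(x, kernel):
    """x: nested list of shape C x H x W; kernel: odd-length 1D filter."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # 1. Compress (assumed: average pooling) into one descriptor per
    #    channel, per row (height), and per column (width).
    ch = [mean(x[c][h][w] for h in range(H) for w in range(W)) for c in range(C)]
    hd = [mean(x[c][h][w] for c in range(C) for w in range(W)) for h in range(H)]
    wd = [mean(x[c][h][w] for c in range(C) for h in range(H)) for w in range(W)]
    # 2. Recombine into one sequence of length C + H + W and run a single
    #    1D convolution over it (assumed: sigmoid gate on the result).
    gates = [sigmoid(v) for v in conv1d_same(ch + hd + wd, kernel)]
    # 3. Split the weights back into channel, height, and width groups.
    gc, gh, gw = gates[:C], gates[C:C + H], gates[C + H:]
    # 4. Apply each weight to its own dimension (assumed: multiplication).
    return [[[x[c][h][w] * gc[c] * gh[h] * gw[w] for w in range(W)]
             for h in range(H)] for c in range(C)]


# Usage: a 2 x 3 x 4 feature map and a smoothing kernel of size 3.
x = [[[float(c + h + w) for w in range(4)] for h in range(3)] for c in range(2)]
y = edca_like_attention(x, kernel=[0.25, 0.5, 0.25])
print(len(y), len(y[0]), len(y[0][0]))
```

Because one small kernel is shared across the whole C + H + W sequence, the only learnable parameters in this sketch are the kernel entries, which is one plausible reading of why such a design stays lightweight. A real module would learn the kernel during training and operate on batched tensors.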