Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (2): 345-362. DOI: 10.3778/j.issn.1673-9418.2305057
• Frontiers·Surveys •
QI Xuanhao, ZHI Min
Online: 2024-02-01
Published: 2024-02-01
QI Xuanhao, ZHI Min. Review of Attention Mechanisms in Image Processing[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(2): 345-362.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2305057