Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (7): 1838-1851. DOI: 10.3778/j.issn.1673-9418.2308091

• Graphics · Image •

Feature Refinement and Multi-scale Attention for Transformer Image Denoising Network

YUAN Heng, GENG Yikun   

  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Online: 2024-07-01    Published: 2024-06-28

Abstract: To enhance the relevance of global context information, strengthen attention to multi-scale features, and improve image denoising performance while preserving detail as much as possible, a Transformer-based image denoising network with feature refinement and multi-scale attention (TFRADNet) is proposed. The network employs Transformer blocks in the encoder-decoder to model long-range dependencies in large-scale images and improve denoising efficiency, and adds a position-aware layer after the up-sampling operation to strengthen the network's perception of pixel positions in the feature maps. To counter the Transformer's neglect of spatial relationships among pixels, which may distort local detail, a feature refinement block (FRB) is designed for the feature reconstruction stage; its serial structure introduces nonlinear transformations layer by layer to improve the recognition of local features in images with complex noise levels. Meanwhile, a multi-scale attention block (MAB) with a parallel dual-branch structure jointly models spatial attention and channel attention, effectively capturing and weighting image features at different scales and improving the model's perception of multi-scale features. Experimental results on the real-noise datasets SIDD, DND, and RNI15 show that TFRADNet balances global information and local detail, and exhibits stronger noise suppression and robustness than other state-of-the-art methods.
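
The abstract does not give implementation details, so the following is a minimal, hypothetical PyTorch-style sketch of the two components it describes most concretely: a parallel dual-branch multi-scale attention block (MAB) and a serial feature refinement block (FRB). All concrete choices here (kernel sizes, channel counts, the reduction ratio, and the residual connections) are illustrative assumptions, not taken from the paper or the authors' code.

# Hypothetical sketch (not the authors' released implementation):
# MAB = parallel spatial- and channel-attention branches fused by a 1x1 conv;
# FRB = serial stack of conv + nonlinearity layers for local detail refinement.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Channel branch: squeeze spatial dims, then re-weight channels."""
    def __init__(self, channels: int, reduction: int = 8):  # reduction ratio is an assumption
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.mlp(self.pool(x))


class SpatialAttention(nn.Module):
    """Spatial branch: multi-scale convolutions produce a per-pixel weight map."""
    def __init__(self, channels: int):
        super().__init__()
        # Two receptive fields (3x3 and 5x5) stand in for "multi-scale";
        # the kernel sizes are assumptions.
        self.conv3 = nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(channels, channels // 2, kernel_size=5, padding=2)
        self.fuse = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, x):
        feats = torch.cat([self.conv3(x), self.conv5(x)], dim=1)
        return x * self.fuse(feats)


class MAB(nn.Module):
    """Parallel dual-branch block: spatial and channel attention fused by 1x1 conv."""
    def __init__(self, channels: int):
        super().__init__()
        self.spatial = SpatialAttention(channels)
        self.channel = ChannelAttention(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        out = self.fuse(torch.cat([self.spatial(x), self.channel(x)], dim=1))
        return x + out  # residual connection is an assumption


class FRB(nn.Module):
    """Serial refinement: stacked conv + ReLU layers introduce nonlinearity layer by layer."""
    def __init__(self, channels: int, depth: int = 3):  # depth is an assumption
        super().__init__()
        layers = []
        for _ in range(depth):
            layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)   # dummy feature map
    y = FRB(64)(MAB(64)(x))          # refine after multi-scale attention
    print(y.shape)                   # torch.Size([1, 64, 32, 32])

The dual-branch fusion mirrors the abstract's description of jointly modelling spatial and channel attention before re-weighting the feature map, and the stacked conv + ReLU layers in FRB stand in for the "layer-by-layer nonlinear transformations" of the serial refinement structure; how TFRADNet actually wires these blocks into its Transformer encoder-decoder is not specified on this page.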

Key words: image denoising, feature refinement, multi-scale attention, Transformer, real noise