Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (5): 1342-1352. DOI: 10.3778/j.issn.1673-9418.2405052

• Artificial Intelligence · Pattern Recognition •

Efficient Seismic Denoising Transformer with Gradient Prediction and Parameter-Free Attention

GAO Lei, QIAO Haowei, LIANG Dongsheng, MIN Fan, YANG Mei   

  1. School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, China
    2. Institute for Artificial Intelligence, Southwest Petroleum University, Chengdu 610500, China
    3. Lab of Machine Learning, Southwest Petroleum University, Chengdu 610500, China
  • Online: 2025-05-01    Published: 2025-04-28

Abstract: Suppressing random noise can effectively improve the signal-to-noise ratio (SNR) of seismic data. In recent years, deep learning methods based on convolutional neural networks (CNN) have shown remarkable performance in seismic data denoising. However, owing to its limited receptive field, the convolution operation in a CNN usually captures only local information and cannot establish long-range connections over global information, which may lead to the loss of detail. To address the seismic data denoising problem, an efficient Transformer model with gradient prediction and parameter-free attention (ETGP) is proposed. Firstly, a multi-Dconv head “transposed” attention is introduced in place of conventional multi-head attention; it computes attention across channels to represent global information and alleviates the high computational complexity of conventional multi-head attention. Secondly, a parameter-free attention feed-forward network is proposed, which computes attention weights over both the spatial and channel dimensions without adding any parameters to the network. Lastly, a gradient prediction network (GPN) is designed to extract edge information and adaptively add it to the input of the parallel Transformer, so as to obtain high-quality seismic data. Experiments are conducted on synthetic and field data, and the proposed method is compared with classical and state-of-the-art denoising methods. The results show that the ETGP method not only suppresses random noise more effectively, but also has significant advantages in weak-signal preservation and event continuity.
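To make the two attention mechanisms named in the abstract concrete, the following is a minimal PyTorch sketch, assuming a Restormer-style channel-wise (“transposed”) multi-Dconv head attention and a SimAM-style parameter-free weighting over spatial and channel positions. The class names, hyperparameters (num_heads, eps), and the toy input are illustrative assumptions, not the authors' released implementation.

# Hypothetical sketch of the two attention ideas described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransposedAttention(nn.Module):
    """Multi-head attention computed across channels instead of spatial
    positions, so the attention map is (channels x channels) rather than
    quadratic in the number of pixels of the seismic section."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        # 1x1 conv followed by a depth-wise 3x3 conv to form Q, K, V
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1)
        self.qkv_dw = nn.Conv2d(dim * 3, dim * 3, kernel_size=3,
                                padding=1, groups=dim * 3)
        self.project_out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv_dw(self.qkv(x)).chunk(3, dim=1)
        # reshape to (batch, heads, channels_per_head, pixels)
        q = q.reshape(b, self.num_heads, c // self.num_heads, h * w)
        k = k.reshape(b, self.num_heads, c // self.num_heads, h * w)
        v = v.reshape(b, self.num_heads, c // self.num_heads, h * w)
        q = F.normalize(q, dim=-1)
        k = F.normalize(k, dim=-1)
        # channel-to-channel attention map: global by construction
        attn = (q @ k.transpose(-2, -1)) * self.temperature
        attn = attn.softmax(dim=-1)
        out = (attn @ v).reshape(b, c, h, w)
        return self.project_out(out)

def parameter_free_attention(x, eps=1e-4):
    """SimAM-style weighting: attention derived from an energy function
    over both spatial and channel positions, with no learnable weights."""
    b, c, h, w = x.shape
    n = h * w - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)) ** 2
    v = d.sum(dim=(2, 3), keepdim=True) / n
    energy_inv = d / (4 * (v + eps)) + 0.5
    return x * torch.sigmoid(energy_inv)

if __name__ == "__main__":
    x = torch.randn(1, 32, 64, 64)   # toy feature map from a seismic patch
    y = TransposedAttention(dim=32)(x)
    z = parameter_free_attention(y)
    print(y.shape, z.shape)          # both torch.Size([1, 32, 64, 64])

Because the attention map here is channels-by-channels, its cost grows with the channel count rather than with the squared number of samples in the seismic section, which is the complexity advantage the abstract refers to; the SimAM-style weighting adds no parameters because its weights are computed directly from feature statistics.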

Key words: seismic data denoising, convolutional neural network, Transformer, attention module, gradient fusion