Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (1): 154-165. DOI: 10.3778/j.issn.1673-9418.2111121

• Graphics and Image •

Salient Object Detection Based on Coordinate Attention Feature Pyramid

WANG Jianzhe, WU Qin

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2023-01-01 Published: 2023-01-01

Abstract: Salient object detection aims to locate the visually salient objects in an image and is an important research topic in computer vision. Compared with traditional methods based on hand-crafted features, methods based on fully convolutional neural networks have shown clear advantages in this field. However, salient object detection still faces some problems. In complex scenes, the background may contain noise that is easily mistaken for salient objects, which degrades detection performance. In addition, when the contour of a salient object is complex, its boundary pixels become difficult to detect. To address these problems, this paper proposes a salient object detection algorithm based on a coordinate attention feature pyramid. A feature pyramid network structure is used to extract features of salient objects at different levels, and a feature refinement module is designed to fuse the features of different levels effectively. To address background misjudgment, a coordinate attention module is adopted to increase the weight of salient regions while suppressing background noise. To handle complex boundaries, a boundary-aware loss function is designed and combined with multi-level supervision, helping the network pay more attention to boundary pixels and generate high-quality saliency maps with clear boundaries. Experimental results on five commonly used salient object detection datasets show that the algorithm achieves better detection results on five evaluation metrics.

Keywords: salient object detection, deep learning, coordinate attention, feature pyramid, boundary awareness
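
The abstract describes a boundary-aware loss that makes the network pay more attention to boundary pixels, but it does not state the formula. The following is only a minimal sketch of one common way to realize that idea: a binary cross-entropy in which pixels on the object contour receive a larger weight. The function names `boundary_mask` and `boundary_weighted_bce`, the 4-neighbour boundary definition, and the `boundary_weight` value are all assumptions made for illustration and are not taken from the paper. Multi-level supervision is not shown.

```python
import math


def boundary_mask(gt):
    """Return a 0/1 mask marking pixels that have a 4-neighbour with a different label.

    Hypothetical helper: the paper's own boundary definition is not given in the abstract.
    """
    h, w = len(gt), len(gt[0])
    mask = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and gt[ni][nj] != gt[i][j]:
                    mask[i][j] = 1
                    break
    return mask


def boundary_weighted_bce(pred, gt, boundary_weight=4.0, eps=1e-7):
    """Weighted mean of per-pixel binary cross-entropy.

    Boundary pixels get weight 1 + boundary_weight, all others weight 1.
    The weighting scheme is an assumption for illustration only.
    """
    mask = boundary_mask(gt)
    total, weight_sum = 0.0, 0.0
    for i, row in enumerate(gt):
        for j, y in enumerate(row):
            # Clamp the prediction so that log() never receives 0.
            p = min(max(pred[i][j], eps), 1.0 - eps)
            bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            weight = 1.0 + boundary_weight * mask[i][j]
            total += weight * bce
            weight_sum += weight
    return total / weight_sum


# Toy 4x4 ground truth with a 2x2 salient square in the centre.
gt = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# A prediction that is mostly right (0.1 for background, 0.9 for foreground).
good = [[0.9 if y else 0.1 for y in row] for row in gt]

# Same prediction with one confident mistake at a non-boundary corner pixel ...
wrong_corner = [row[:] for row in good]
wrong_corner[0][0] = 0.9
# ... and with the same mistake at a background pixel adjacent to the object.
wrong_edge = [row[:] for row in good]
wrong_edge[0][1] = 0.9

loss_corner = boundary_weighted_bce(wrong_corner, gt)
loss_edge = boundary_weighted_bce(wrong_edge, gt)
```

In this toy setting the same mistake is penalized more heavily next to the object contour (`loss_edge`) than in a background corner (`loss_corner`), which is the behaviour a boundary-aware loss is meant to encourage.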