计算机科学与探索 (Journal of Frontiers of Computer Science and Technology) ›› 2025, Vol. 19 ›› Issue (9): 2408-2418. DOI: 10.3778/j.issn.1673-9418.2409073

• Graphics and Image •

Blind Lane Segmentation Method Based on Position-Enhanced and Dual-Path Edge-Enhanced Decoding

WENG Jingliang (翁静梁), WANG Longye (王龙业), ZENG Xiaoli (曾晓莉), LIU Mengyao (刘梦瑶), YI Ting (易婷)

  1. School of Electrical Information, Southwest Petroleum University, Chengdu 610500, China
    2. School of Information Science and Technology, Xizang University, Lhasa 850000, China
  • Online: 2025-09-01  Published: 2025-09-01

Abstract: To address the poor segmentation results caused by complex backgrounds, blurred boundaries, and the diverse shapes of blind lanes (tactile paving) in blind lane segmentation tasks, a blind lane segmentation method based on position enhancement and dual-path edge-enhanced decoding is proposed. First, a position-enhanced feature extraction module is designed: a backbone that incorporates MobileViTv3 into the down-sampling process strengthens the model's perception of blind lane features and fully preserves contextual information. Second, a fused channel-position enhanced attention module is proposed, which reinforces feature extraction in the channel and spatial dimensions respectively and improves the model's ability to distinguish blind lanes from the background in low-contrast scenes. Finally, a dual-path edge-enhanced decoding scheme decodes the region and boundary information of the blind lane, and a joint loss function further refines the handling of boundary details. In addition, to address the current lack of large-scale public blind lane datasets, a multi-scenario blind lane dataset (MSBD) is constructed, providing richer data support for model training and experimental validation. Experimental results show that the network achieves an mIoU, Precision, Recall, and F1-score of 96.82%, 96.84%, 96.48%, and 96.66% on the MSBD dataset, outperforming SegFormer, DeepLabv3+, and other compared networks. With an input image size of 512×512×3, the network has 1.73×10⁶ parameters and a computational cost of 1.93 GFLOPs, and its inference speed reaches 86 FPS, giving better overall performance than the compared networks. It also outperforms the compared networks in overall metrics on the public blind road and pedestrian crosswalk dataset (BACD) and the Cityscapes dataset.
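To make the dual-path decoding idea concrete, the following is a minimal PyTorch-style sketch and not the paper's implementation: it assumes a decoding head with separate region and boundary branches and an illustrative joint loss that combines cross-entropy on the region mask with binary cross-entropy on the boundary map. The module names, branch design, and the 0.4 edge weight are assumptions made for illustration only.

# Minimal sketch (illustrative, not the paper's code): a dual-path decoding head
# that predicts a region mask and a boundary map from shared decoder features,
# trained with a joint region + boundary loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualPathHead(nn.Module):
    """Decodes shared features along a region path and a boundary path."""

    def __init__(self, in_channels: int, num_classes: int = 2):
        super().__init__()
        # Region path: per-pixel class logits (blind lane vs. background).
        self.region_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, num_classes, 1),
        )
        # Boundary path: a single-channel edge-probability map.
        self.boundary_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, 1, 1),
        )

    def forward(self, feats: torch.Tensor):
        return self.region_branch(feats), self.boundary_branch(feats)


def joint_loss(region_logits, boundary_logits, region_gt, boundary_gt,
               edge_weight: float = 0.4):
    """Joint loss: cross-entropy on the region mask plus weighted BCE on the
    boundary map. The 0.4 weight is an assumed hyper-parameter, not the paper's."""
    region_loss = F.cross_entropy(region_logits, region_gt)
    boundary_loss = F.binary_cross_entropy_with_logits(boundary_logits, boundary_gt)
    return region_loss + edge_weight * boundary_loss


if __name__ == "__main__":
    # Toy forward/backward pass with random tensors at 1/4 of a 512x512 input.
    head = DualPathHead(in_channels=64, num_classes=2)
    feats = torch.randn(2, 64, 128, 128)
    region_gt = torch.randint(0, 2, (2, 128, 128))            # class indices
    boundary_gt = torch.randint(0, 2, (2, 1, 128, 128)).float()  # edge labels
    region_logits, boundary_logits = head(feats)
    loss = joint_loss(region_logits, boundary_logits, region_gt, boundary_gt)
    loss.backward()
    print(loss.item())

Keeping an explicit boundary branch lets the loss penalize edge errors directly, which is one common way to sharpen blurred boundaries in segmentation and reflects the role the joint loss plays in the abstract above.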

Key words: blind lane segmentation, position enhancement, attention module, multi-scenario, boundary