Journal of Frontiers of Computer Science and Technology


Blind lane segmentation method based on position enhancement and dual-path edge-enhanced decoding

WENG Jingliang,  WANG Longye,  ZENG Xiaoli,  LIU Mengyao,  YI Ting   

1. School of Electrical Information, Southwest Petroleum University, Chengdu 610500, China
2. School of Information Science and Technology, Tibet University, Lhasa 850000, China

Abstract: To address the poor segmentation caused by complex backgrounds, blurred boundaries, and the diverse shapes of blind lanes in blind lane segmentation tasks, a blind lane segmentation method based on position enhancement and dual-path edge-enhanced decoding is proposed. First, a position-enhanced feature extraction module is designed, which integrates MobileViTv3 into the down-sampling stage of the backbone to improve the model's ability to perceive blind lane features and to fully preserve contextual information. Second, a fused channel and position-enhanced attention module is proposed, which strengthens feature extraction in both the channel and spatial dimensions and improves the model's ability to distinguish blind lanes from background information in low-contrast scenes. Finally, a dual-path edge-enhanced decoding scheme decodes the region and boundary information of the blind lane, and a joint loss function further refines the boundary details. In addition, to address the lack of large-scale public blind lane datasets, a Multi-scenario Blind Track Dataset (MSBD) is constructed to provide richer data support for model training and experimental validation. Experimental results show that the network achieves an mIoU, Precision, Recall, and F1-score of 96.82%, 96.84%, 96.48%, and 96.66% on the MSBD dataset, outperforming networks such as SegFormer and DeepLabv3+. With an input image size of 512×512×3, the parameter count and FLOPs are 1.73M and 1.93G, and the inference frame rate reaches 86 FPS, giving a better overall trade-off than the compared networks. The network also outperforms the compared networks on the public Blind and Cross Datasets (BACD) and the Cityscapes dataset.
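To make the dual-path decoding and joint-loss idea concrete, the following is a minimal PyTorch sketch, not the authors' released code: a decoder with separate region and boundary heads, trained with a region loss (cross-entropy plus Dice) and a boundary loss (BCE against edge maps derived from the ground-truth mask). The module names, channel sizes, the morphological edge-extraction step, and the weighting factor lambda_b are illustrative assumptions.

```python
# Illustrative sketch only: a two-head (region + boundary) decoder with a joint loss.
# Names, channel sizes, and loss weighting are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualPathDecoder(nn.Module):
    """Decodes shared encoder features into a region mask and a boundary map."""

    def __init__(self, in_ch: int = 96, num_classes: int = 2):
        super().__init__()
        self.region_head = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, num_classes, 1),
        )
        self.boundary_head = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, feats: torch.Tensor, out_size):
        region = F.interpolate(self.region_head(feats), size=out_size,
                               mode="bilinear", align_corners=False)
        boundary = F.interpolate(self.boundary_head(feats), size=out_size,
                                 mode="bilinear", align_corners=False)
        return region, boundary


def edges_from_mask(mask: torch.Tensor) -> torch.Tensor:
    """Cheap boundary target: pixels where 3x3 dilation and erosion disagree (assumed scheme)."""
    m = mask.float().unsqueeze(1)                      # (B, 1, H, W)
    dilated = F.max_pool2d(m, 3, stride=1, padding=1)
    eroded = -F.max_pool2d(-m, 3, stride=1, padding=1)
    return (dilated != eroded).float()


def joint_loss(region_logits, boundary_logits, gt_mask, lambda_b: float = 0.5):
    """Region CE + Dice, plus a BCE boundary term weighted by lambda_b (assumed weighting)."""
    ce = F.cross_entropy(region_logits, gt_mask)
    prob = torch.softmax(region_logits, dim=1)[:, 1]   # foreground (blind lane) probability
    gt_fg = (gt_mask == 1).float()
    dice = 1 - (2 * (prob * gt_fg).sum() + 1) / (prob.sum() + gt_fg.sum() + 1)
    bce_edge = F.binary_cross_entropy_with_logits(boundary_logits, edges_from_mask(gt_mask))
    return ce + dice + lambda_b * bce_edge


if __name__ == "__main__":
    feats = torch.randn(2, 96, 32, 32)                 # stand-in for encoder output
    gt = torch.randint(0, 2, (2, 512, 512))            # binary blind-lane ground truth
    region, boundary = DualPathDecoder()(feats, out_size=(512, 512))
    print(joint_loss(region, boundary, gt).item())
```

The two heads share the same upsampled encoder features; supervising the boundary head on edge maps derived from the region labels is one common way to realize the edge-enhanced decoding described in the abstract, though the paper's exact formulation may differ.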

Key words: Blind lane segmentation, Position enhancement, Attention module, Multi-scenario, Boundary
