Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2021, Vol. 15 ›› Issue (6): 1114-1121. DOI: 10.3778/j.issn.1673-9418.2005060

• Artificial Intelligence •

Traffic Sign Semantic Segmentation Based on Convolutional Neural Network

MA Yu, ZHANG Liguo, DU Huimin, MAO Zhili   

  1. School of Electronic Engineering, Xi’an University of Posts & Telecommunications, Xi’an 710121, China
  • Online: 2021-06-01 Published: 2021-06-03

Abstract:

Image semantic segmentation is a necessary part of modern autonomous driving systems, because capturing road condition information accurately and in real time is key to navigation and action planning. Traffic signs are an important source of road condition information, so a traffic sign semantic segmentation algorithm that is stable, runs in real time, and is accurate enough for practical use is a foundation for active safety driving systems and autonomous driving systems. First, based on an analysis of practical application requirements, the GTSDB database is selected as the raw data, and a traffic sign dataset is designed that can comprehensively evaluate the performance of semantic segmentation algorithms. Then, building on U-Net, a classical semantic segmentation network with stable performance, a deep neural network structure named D-Unet ("D" stands for dilated convolution) is proposed, which offers better segmentation performance and higher real-time performance for small targets such as traffic signs. The method uses fewer pooling layers to retain more image information, and replaces conventional convolution with dilated convolution to enlarge the convolutional receptive field and make better use of global information. Finally, tests on the designed dataset show that, compared with image segmentation network models such as FCN-8s, SegNet, and U-Net, the proposed model improves the mean intersection over union (MIoU) by about 11.9, 6.09, and 3.71 percentage points, respectively, while its parameter count is only 4.94%, 22.5%, and 85.5% of theirs, respectively.
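
The claim that dilated convolution enlarges the receptive field without extra parameters can be illustrated with a small calculation. The sketch below is not taken from the paper: the helper names and the dilation rates (1, 2, 4) are assumptions chosen only for illustration, and the actual D-Unet configuration is not stated in the abstract. It compares three stacked 3×3 convolutions with and without dilation; both stacks have the same number of weights per layer.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a kernel once gaps of (dilation - 1) are inserted."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a layer stack, listed from input to output.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks for illustration only (not the D-Unet layer settings).
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]    # ordinary 3x3 convolutions
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]  # same kernels, growing dilation

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

Under these assumed settings the dilated stack sees a 15×15 region instead of 7×7 without any pooling, which is consistent with the abstract's motivation of keeping resolution for small targets while still gathering wider context.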

Key words: road traffic sign, deep learning, semantic segmentation, dilated convolution
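
For readers unfamiliar with the MIoU metric reported in the abstract, the following is a minimal sketch of one common way to compute it from flattened label maps. The function name and the toy labels are hypothetical, and the paper's exact evaluation protocol (for example, how classes absent from an image are handled) is not given in the abstract, so this should be read as an assumption rather than the authors' procedure.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for flat sequences of integer class labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: class 0 = background, class 1 = traffic sign.
target = [0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 1, 1]

# Class 0: 3 / 4 = 0.75; class 1: 2 / 3; MIoU is their average.
print(round(mean_iou(pred, target, num_classes=2), 4))  # 0.7083
```

A gain stated in percentage points, as in the abstract, is the absolute difference between two such scores expressed as percentages, not a relative improvement.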