Image Semantic Segmentation Method with Fusion of Transposed Convolution and Deep Residual

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (9): 2132-2142. DOI: 10.3778/j.issn.1673-9418.2012063

LIU Lamei, WANG Xiaona+, LIU Wanjun, QU Haicheng

College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China

Received: 2020-12-17 Revised: 2021-04-08 Published: 2022-09-01 Released online: 2021-04-19
+ Corresponding author. E-mail: 1242161005@qq.com
About the authors: LIU Lamei, born in 1979, M.S., lecturer, member of CCF. Her research interests include graphics and image processing.
WANG Xiaona, born in 1994 in Shuozhou, Shanxi, M.S. candidate. Her research interests include image and intelligent information processing.
LIU Wanjun, born in 1959 in Jinzhou, Liaoning, M.S., professor, Ph.D. supervisor, senior member of CCF. His research interests include image and intelligent information processing.
QU Haicheng, born in 1981, Ph.D., associate professor, M.S. supervisor, member of CCF. His research interests include rapid processing of remote sensing images and intelligent big data processing.
Supported by:
Young Scientists Fund of the National Natural Science Foundation of China (41701479); Natural Science Foundation of Liaoning Province (20180550529)

Abstract

Abstract:

To address the low segmentation accuracy and high loss of deep learning methods for image semantic segmentation, an image semantic segmentation method that fuses transposed convolution with deep residual learning is proposed. First, because increasing the depth of a neural network tends to reduce segmentation accuracy and slow convergence, a deep residual learning module is designed to improve the training efficiency and convergence speed of the network. Second, to fuse the feature maps from the upsampling path and the feature extraction path more accurately, the two upsampling methods in the deep residual U-net model, UpSampling2D and transposed convolution, are concatenated to form a new upsampling module. Finally, to counter the overfitting of weights observed between the training set and the validation set during training, Dropout is introduced in the skip connection layers of the improved network, which enhances the learning ability of the model. The algorithm is evaluated on the CamVid dataset. Its semantic segmentation accuracy reaches 89.93% and its loss falls to 0.23. Compared with the U-net model, validation accuracy improves by 13.13 percentage points and the loss is reduced by 1.20, outperforming the compared image semantic segmentation methods. The proposed model builds on the strengths of U-net, segments images more accurately with better visual results, and improves the robustness of the algorithm.

Keywords: image semantic segmentation, U-net model, deep residual network, transposed convolution
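
The fused upsampling module described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it works on a single-channel map in pure Python, the function names are hypothetical, and the transposed-convolution kernel is a fixed stand-in for weights that would be learned. In Keras terms the two branches would presumably correspond to `UpSampling2D` and `Conv2DTranspose` followed by a channel-wise concatenation, but the exact layer configuration used in the paper is not given here.

```python
def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling: repeat every pixel factor x factor times."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out


def transposed_conv(x, kernel, stride=2):
    """Transposed convolution without padding.

    Each input pixel stamps a copy of the kernel, scaled by the pixel value,
    onto the output at an offset of stride * (row, col).
    """
    k = len(kernel)
    h, w = len(x), len(x[0])
    out_h = (h - 1) * stride + k
    out_w = (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]
    return out


def fused_upsample(x, kernel):
    """Concatenate both upsampled maps along a new channel axis (list of channels)."""
    return [upsample_nearest(x, 2), transposed_conv(x, kernel, 2)]


# Toy 2x2 feature map; the kernel values are assumed, not taken from the paper.
x = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
channels = fused_upsample(x, kernel)
for name, channel in zip(("nearest", "transposed"), channels):
    print(name, channel)
```

With a 2x2 kernel and stride 2 both branches produce a 4x4 map, so they can be stacked as channels without cropping. The interpolation branch carries no parameters, while the transposed-convolution branch has learnable weights; concatenating them lets later layers draw on both.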

CLC number: TP391

LIU Lamei, WANG Xiaona, LIU Wanjun, QU Haicheng. Image Semantic Segmentation Method with Fusion of Transposed Convolution and Deep Residual[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 2132-2142.

Figures and tables (14)

Fig.1 U-net model structure

Fig.2 Deep residual learning module

Fig.3 Overall framework of the improved U-net

Fig.4 Original U-net module

Fig.5 Deep-ResUnet module

Table 1 Comparison of models with the improved residual module

Model	Loss	Accuracy/%	Validation loss	Validation accuracy/%
U-net+BN	0.3058	87.31	0.3047	85.60
ResUnet	0.2090	89.39	0.2924	86.57
DResUnet	0.1811	90.79	0.3075	86.95

Table 2 Comparison of models with the fused transposed convolution (TC) module

Model	Loss	Accuracy/%	Validation loss	Validation accuracy/%
U-net	0.9074	81.02	1.4300	76.80
U-net+TC	0.3058	85.31	0.4713	81.62
ResUnet+TC	0.1954	90.39	0.2875	86.57
DResUnet+TC	0.1698	91.18	0.2653	87.79

Table 3 Comparison of models with the Dropout (DO) module

Model	Loss	Accuracy/%	Validation loss	Validation accuracy/%
ResUnet+TC+DO	0.1904	90.53	0.3995	85.12
DResUnet+DO	0.1845	90.66	0.3236	85.45
Ours	0.1769	90.92	0.2346	89.93

Table 4 Loss and accuracy for different numbers of iterations

Steps per epoch	Epochs	Loss	Accuracy/%
100	30	0.3022	88.47
300	30	0.3027	87.17
400	30	0.3144	87.85
200	10	0.4038	85.33
200	20	0.5547	81.03
200	40	0.3918	83.65
200	30	0.2346	89.93

Fig.6 Loss and accuracy trends of the proposed model

Fig.7 Loss and accuracy trends of U-net

Table 5 Accuracy and loss of different semantic segmentation algorithms under the same conditions

Model	Loss	Accuracy/%
U-net	1.4300	76.80
SegNet	0.8893	77.06
DResUnet	0.3075	86.95
Ours	0.2346	89.93
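
The gains over U-net quoted in the abstract follow from the Table 5 rows. The short check below redoes that arithmetic; the dictionary and variable names are chosen here for illustration only.

```python
# Loss and accuracy (%) copied from the U-net and Ours rows of Table 5.
u_net = {"loss": 1.4300, "accuracy": 76.80}
ours = {"loss": 0.2346, "accuracy": 89.93}

# Accuracy gain in percentage points and absolute reduction in loss.
accuracy_gain = round(ours["accuracy"] - u_net["accuracy"], 2)
loss_reduction = round(u_net["loss"] - ours["loss"], 2)

print(f"accuracy gain: {accuracy_gain:.2f} percentage points")
print(f"loss reduction: {loss_reduction:.2f}")
```

The unrounded loss difference is 1.1954, which the abstract reports as 1.20; the accuracy difference is exactly the 13.13 percentage points stated there.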

Fig.8 Comparison of segmentation results of the network models

Table 6 Comparison with recent semantic segmentation algorithms on the CamVid dataset

Model	Backbone	Accuracy/%
SegNet	-	77.06
DeconvNet	-	85.89
BiSeNet	Xception	88.82
LEDNet	-	87.09
ASN	-	88.40
Ours	-	89.93

