Image Semantic Segmentation Method with Fusion of Transposed Convolution and Deep Residual

doi:10.3778/j.issn.1673-9418.2012063

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue 9: 2132-2142. DOI: 10.3778/j.issn.1673-9418.2012063

LIU Lamei, WANG Xiaona, LIU Wanjun, QU Haicheng

College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China

Received: 2020-12-17; Revised: 2021-04-08; Online: 2022-09-01; Published: 2021-04-19
About the authors: LIU Lamei, born in 1979, M.S., lecturer, member of CCF. Her research interests include graphics and image processing.
WANG Xiaona, born in 1994 in Shuozhou, Shanxi, M.S. candidate. Her research interests include image and intelligent information processing.
LIU Wanjun, born in 1959 in Jinzhou, Liaoning, M.S., professor, Ph.D. supervisor, senior member of CCF. His research interests include image and intelligent information processing.
QU Haicheng, born in 1981, Ph.D., associate professor, M.S. supervisor, member of CCF. His research interests include rapid processing of remote sensing images and intelligent big data processing.
Supported by: Young Scientists Fund of the National Natural Science Foundation of China (41701479); Natural Science Foundation of Liaoning Province (20180550529)

Corresponding author: E-mail: 1242161005@qq.com

Abstract

To address the low segmentation accuracy and high loss of deep-learning-based image semantic segmentation methods, an image semantic segmentation method that fuses transposed convolution with deep residual learning is proposed. First, to counter the drop in segmentation accuracy and the slow convergence that come with increasing network depth, a deep residual learning module is designed to improve the training efficiency and convergence speed of the network. Second, to fuse feature maps more accurately between the upsampling path and the feature extraction path, the two upsampling methods in the deep residual U-net model, UpSampling2D and transposed convolution, are concatenated to form a new upsampling module. Finally, to curb the overfitting of weights observed between the training set and the validation set during training, Dropout is introduced in the skip connection layers of the improved network, which strengthens the learning ability of the model. The algorithm is evaluated on the CamVid dataset, where its semantic segmentation accuracy reaches 89.93% and its loss falls to 0.23. Compared with the U-net model, validation accuracy improves by 13.13 percentage points and loss decreases by 1.20, and the method outperforms the image semantic segmentation methods it is compared against. The proposed model builds on the strengths of U-net, producing more accurate segmentation and improving the robustness of the algorithm.
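The fused upsampling module described in the abstract can be illustrated with a minimal pure-Python sketch. The abstract only states that UpSampling2D and transposed convolution are concatenated; the 2x scale factor, the 2x2 kernel, its values, and the function names below are assumptions made for illustration, not the paper's configuration. Nearest-neighbour repetition is used for the first branch because that is the default behaviour of the Keras `UpSampling2D` layer.

```python
# Illustrative only: one single-channel feature map stored as nested lists.
# Scale factor, kernel size and kernel values are assumptions, not trained weights.

def upsample_nearest_2x(fmap):
    """Repeat every pixel twice along both axes (nearest-neighbour upsampling)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def transposed_conv_2x(fmap, kernel):
    """Stride-2 transposed convolution with a 2x2 kernel.

    Each input pixel scatters a scaled copy of the kernel into its own
    2x2 output patch, so patches do not overlap at this stride.
    """
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            for di in range(2):
                for dj in range(2):
                    out[2 * i + di][2 * j + dj] += fmap[i][j] * kernel[di][dj]
    return out


def fused_upsample(fmap, kernel):
    """Concatenate both upsampled maps along a new channel axis: [2][2H][2W]."""
    return [upsample_nearest_2x(fmap), transposed_conv_2x(fmap, kernel)]


feature = [[1.0, 2.0],
           [3.0, 4.0]]
kernel = [[1.0, 0.5],
          [0.5, 0.25]]  # hypothetical learned weights

fused = fused_upsample(feature, kernel)
print(len(fused), len(fused[0]), len(fused[0][0]))  # channels, height, width
for channel in fused:
    for row in channel:
        print(row)
```

The first channel is a fixed, parameter-free enlargement of the input, while the second is learned; concatenating them lets the following convolution weigh both versions of the upsampled features.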

Key words: image semantic segmentation, U-net model, deep residual network, transposed convolution

CLC Number: TP391

LIU Lamei, WANG Xiaona, LIU Wanjun, QU Haicheng. Image Semantic Segmentation Method with Fusion of Transposed Convolution and Deep Residual[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(9): 2132-2142.

Model	Loss	Accuracy (%)	val_loss	val_acc (%)
U-net+BN	0.3058	87.31	0.3047	85.60
ResUnet	0.2090	89.39	0.2924	86.57
DResUnet	0.1811	90.79	0.3075	86.95

Model	Loss	Accuracy (%)	val_loss	val_acc (%)
U-net	0.9074	81.02	1.4300	76.80
U-net+TC	0.3058	85.31	0.4713	81.62
ResUnet+TC	0.1954	90.39	0.2875	86.57
DResUnet+TC	0.1698	91.18	0.2653	87.79

Model	Loss	Accuracy (%)	val_loss	val_acc (%)
ResUnet+TC+DO	0.1904	90.53	0.3995	85.12
DResUnet+DO	0.1845	90.66	0.3236	85.45
Ours	0.1769	90.92	0.2346	89.93
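The headline comparison in the abstract (13.13 percentage points higher validation accuracy and 1.20 lower loss than U-net) can be checked against the validation columns of the tables above. The short sketch below simply redoes that arithmetic; the dictionary names are illustrative.

```python
# Validation-set values transcribed from the result tables above.
unet = {"val_loss": 1.4300, "val_acc": 76.80}
ours = {"val_loss": 0.2346, "val_acc": 89.93}

# Accuracy gain in percentage points and absolute loss reduction, rounded to 2 places.
acc_gain = round(ours["val_acc"] - unet["val_acc"], 2)
loss_drop = round(unet["val_loss"] - ours["val_loss"], 2)

print(acc_gain, loss_drop)
```

The unrounded loss difference is 1.1954, which the abstract reports as 1.20.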

Steps per epoch	Epochs	Loss	Accuracy (%)
100	30	0.3022	88.47
300	30	0.3027	87.17
400	30	0.3144	87.85
200	10	0.4038	85.33
200	20	0.5547	81.03
200	40	0.3918	83.65
200	30	0.2346	89.93
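One way to read the table above is to compare each setting's accuracy with its total number of training steps. The sketch below assumes that one step is one batch update, as with the `steps_per_epoch` argument of Keras `Model.fit`, so that total updates equal steps per epoch times epochs; the page itself does not state this, and the variable names are illustrative.

```python
# (steps_per_epoch, epochs, loss, accuracy %) transcribed from the table above.
runs = [
    (100, 30, 0.3022, 88.47),
    (300, 30, 0.3027, 87.17),
    (400, 30, 0.3144, 87.85),
    (200, 10, 0.4038, 85.33),
    (200, 20, 0.5547, 81.03),
    (200, 40, 0.3918, 83.65),
    (200, 30, 0.2346, 89.93),
]

# Setting with the highest accuracy, and setting with the most updates
# (assumption: total updates = steps_per_epoch * epochs).
best = max(runs, key=lambda r: r[3])
most_updates = max(runs, key=lambda r: r[0] * r[1])

print("best setting:", best[:2], "updates:", best[0] * best[1])
print("most updates:", most_updates[:2], "updates:", most_updates[0] * most_updates[1])
```

Under that assumption the best-scoring setting (200 steps, 30 epochs) uses half as many updates as the longest run, so in this table more training steps do not translate into higher accuracy.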

Model	Base model	Accuracy (%)
SegNet	—	77.06
DeconvNet	—	85.89
BiSeNet	Xception	88.82
LEDNet	—	87.09
ASN	—	88.40
Ours	—	89.93
