Improved Lightweight Network in Image Recognition

doi:10.3778/j.issn.1673-9418.2004057

Abstract

To reduce the model complexity and large parameter counts of convolutional neural networks in image recognition tasks, this paper proposes a lightweight network, SepNet, whose classifier module replaces the traditional fully-connected layer with a Kronecker product. To further optimize the network structure, the feature extraction module balances network depth and width, and a separable residual module is designed that combines depthwise separable convolution with a residual network. The result is a lightweight architecture that supports end-to-end training, called sep_res18_s3. Experiments are conducted on the MNIST, CIFAR-10 and CIFAR-100 datasets. Compared with the VGG10 network, SepNet reduces both the number of parameters and the amount of computation by 94.15% without loss of accuracy. Compared with the residual-like network cov_res18_s3, sep_res18_s3 still reduces the parameter count by 58.33% and FLOPs by 81.82%. The results show that replacing the fully-connected layer with a Kronecker product maintains accuracy while significantly reducing the number of parameters and the computational cost, and to a certain extent helps prevent overfitting. Building on this, the combination of depthwise separable convolution and the residual-like structure demonstrates the effectiveness of sep_res18_s3.

Key words: image recognition, convolutional neural network (CNN), lightweight, Kronecker product, depthwise separable convolution
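
As a rough illustration of why the two substitutions named in the abstract shrink a model, the following sketch counts weights for a fully-connected layer versus a Kronecker-factored one, and for a standard convolution versus a depthwise separable one. It is only a minimal sketch: the function names and all layer sizes are hypothetical choices made for this example, biases are ignored, and the resulting numbers are not the paper's reported reductions.

```python
# Illustrative sketch only: function names and layer sizes are hypothetical,
# not taken from the paper. Biases are ignored throughout.

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]

def dense_params(n_in, n_out):
    """Weights in an ordinary fully-connected layer."""
    return n_in * n_out

def kron_params(m1, n1, m2, n2):
    """Weights stored when the (m1*m2) x (n1*n2) weight matrix is
    represented as kron(A, B) with A of shape m1 x n1 and B of shape m2 x n2."""
    return m1 * n1 + m2 * n2

def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution."""
    return k * k * c_in + c_in * c_out

# A tiny Kronecker product, to show how two small factors expand into
# one larger weight matrix.
small = kron([[1, 2], [3, 4]], [[0, 1], [1, 0]])
# small == [[0, 1, 0, 2], [1, 0, 2, 0], [0, 3, 0, 4], [3, 0, 4, 0]]

# Hypothetical classifier layer mapping 1024 inputs to 512 outputs.
full = dense_params(1024, 512)          # 524288 weights
factored = kron_params(16, 32, 32, 32)  # 1536 weights for a 512 x 1024 matrix

# Hypothetical 3 x 3 convolution from 64 to 128 channels.
standard = conv_params(3, 64, 128)              # 73728 weights
separable = separable_conv_params(3, 64, 128)   # 8768 weights

print(f"fully-connected: {full}, Kronecker-factored: {factored}")
print(f"standard conv: {standard}, depthwise separable conv: {separable}")
```

The Kronecker factorization restricts the weight matrix to a structured subset of all 512 x 1024 matrices, which is one plausible reading of the abstract's remark that the replacement also helps limit overfitting; the sketch does not test that claim.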

XIAO Zhenjiu, YANG Xiaodi, WEI Xian, TANG Xiaoliang. Improved Lightweight Network in Image Recognition[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(4): 743-753.
