Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2022, Vol. 16 ›› Issue (3): 512-528. DOI: 10.3778/j.issn.1673-9418.2107056
MA Jinlin 1,2,+, ZHANG Yu 1,2, MA Ziping 3, MAO Kaiji 1,2
Received: 2021-07-14
Revised: 2021-09-29
Online: 2022-03-01
Published: 2021-09-29
Corresponding author (+): E-mail: 624160@163.com
About author: MA Jinlin, born in 1976 in Qingtongxia, Ningxia, Ph.D., associate professor. His research interests include computer vision, deep learning, and machine learning.
Supported by:
Abstract:
Traditional neural networks depend heavily on hardware resources and place high demands on the performance of the devices that run them, so they cannot be deployed on edge devices and mobile terminals with limited computing power, which restricts the application of artificial intelligence to some extent. Driven by user demand, however, artificial intelligence urgently needs to run tasks such as computer vision applications on portable devices. This paper therefore takes the convolution components of the lightweight neural networks popular in recent years as its research object, compares in detail how convolutions are constructed in various lightweight models, and describes the main ideas and characteristics of their convolution designs. First, the concept of lightweight neural networks is introduced, together with their current state of development and the convolution-related problems they face. Then, convolution lightweighting is divided into three aspects: lightweight convolution structures, lightweight convolution modules, and lightweight convolution operations; by examining the convolution designs of various lightweight models, the lightweighting effect of different convolutions is demonstrated and the advantages and disadvantages of the optimization methods are discussed. Finally, the main ideas and usage of the convolution designs of all the lightweight models covered are summarized and analyzed, and possible future developments are discussed.
CLC number:
MA Jinlin, ZHANG Yu, MA Ziping, MAO Kaiji. Research Progress of Lightweight Neural Network Convolution Design[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 512-528.
| Improvement method | Approach | Advantages | Disadvantages |
|---|---|---|---|
| Adjust convolution kernel size | Replace larger kernels with 3×3 convolutions | The receptive field stays the same while fewer operations are needed, reducing the model's computation | If 3×3 convolutions are used too frequently, the saving in computation disappears |
| Adjust convolution kernel size | Use 1×1 convolutions | The number of channels can be set manually, allowing dimensionality to be raised or reduced | Must be combined with other convolution operations, and the number used needs to be controlled |
| Change the number of convolution kernels | Decouple the spatial and channel dimensions | A 3D input feature can be split into independent 2D spatial features and 1D channel features; computing them in separate steps lowers the cost of convolution | Some of the 2D convolutions easily become ineffective during training, and their outputs are more likely to be zeroed out by ReLU |
| Change the number of convolution kernels | Factorize in the spatial dimension | Computation is reduced at the mathematical level by splitting a large convolution into smaller factorized ones | The structure between convolutions becomes more complex |
| Change the number of kernels and the network width | Combine the two methods above to build lightweight convolutions, then widen the network through skip connections, parallel branches, etc. | Lightweight group convolutions, with the relationships between groups adjusted, speed up the model | The number of groups must be controlled, since too many groups increase the model's computation |

Table 1 Comparison of lightweight convolution structure techniques
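To make the structural techniques in Table 1 concrete, the sketch below (a minimal PyTorch illustration written for this summary, not code taken from any of the surveyed models; the class and function names are ours) decouples the spatial and channel dimensions with a depthwise separable convolution, uses a 1×1 pointwise convolution to set the output channel count, and compares its weight count against a standard 3×3 convolution.

```python
# Minimal sketch of two ideas from Table 1: decoupling spatial/channel dimensions
# (depthwise separable convolution) and using 1x1 convolutions to control channels.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: a 1x1 convolution mixes channels and sets the output width.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))


def conv_params(in_ch: int, out_ch: int, k: int = 3) -> tuple[int, int]:
    """Weight counts of a standard kxk conv vs. its depthwise separable version."""
    standard = k * k * in_ch * out_ch
    separable = k * k * in_ch + in_ch * out_ch   # depthwise + pointwise
    return standard, separable


if __name__ == "__main__":
    x = torch.randn(1, 64, 56, 56)
    y = DepthwiseSeparableConv(64, 128)(x)
    print(y.shape)               # torch.Size([1, 128, 56, 56])
    print(conv_params(64, 128))  # (73728, 8768): roughly 8.4x fewer weights
```

For 64 input and 128 output channels the separable form needs roughly 8.4 times fewer weights, which is the kind of saving exploited by the MobileNet-style designs listed in Table 3.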
| Lightweight module | Approach | Advantages | Disadvantages |
|---|---|---|---|
| Residual module | Skip connections and an hourglass-shaped bottleneck structure | An optimized convolution structure; its convolutions can be swapped for lightweight ones and the structure can be modified without introducing extra parameters | Produces a large number of 1×1 or 3×3 convolutions |
| ResNeXt module | Group convolution | Splitting the channels into groups reduces model depth, and running the groups in parallel can also improve performance | The number of groups depends on the hardware resources and the network structure design |
| SE module | Nearly all convolutions in the module are 1×1; recalibration is performed by changing the number of channels | 1×1 convolutions consume few computing resources, and model performance improves after recalibration | The contextual information obtained in the squeeze step cannot be exploited effectively |
| Attention module | CBAM consists of two parts, CAM and SAM | An upgraded version of the SE module along the channel dimension, with the channel-wise result further adjusted in the spatial dimension | When the network has many channels, the FC layers in CAM introduce considerable latency |

Table 2 Comparison of lightweight module techniques
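As a concrete reference for the SE module row in Table 2, the following sketch (our own minimal illustration following common squeeze-and-excitation conventions; the reduction ratio of 16 is an assumption, not a value taken from the paper) recalibrates channels using only global average pooling and two 1×1 convolutions, which is why the module adds little computation.

```python
# Minimal SE-style block: squeeze (global average pooling) then excitation
# (two 1x1 convolutions) producing per-channel weights that rescale the input.
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: HxW -> 1x1 per channel
        self.fc = nn.Sequential(                 # excitation as two 1x1 convolutions
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(self.pool(x))                # per-channel weights in (0, 1)
        return x * w                             # recalibrate the input channels


if __name__ == "__main__":
    x = torch.randn(2, 256, 14, 14)
    print(SEBlock(256)(x).shape)                 # torch.Size([2, 256, 14, 14])
```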
| Model | Release date | Params/10^6 | MAC/GB | Processor | Runtime | Dataset | Accuracy/% | Input size/pixel |
|---|---|---|---|---|---|---|---|---|
| PVANet | 2016-09-30 | 3.28 | 7.92 | Single-core CPU: Intel I7-6700k; single-core GPU: NVIDIA Titan X | 750.00 ms/image (1.31 FPS); 46.00 ms/image (21.73 FPS) | VOC2007; VOC2012 | mAP: 83.8; mAP: 82.5 | 1 065×640 |
| SqueezeNet | 2016-11-04 | 1.24 | 0.35 | Single-core GPU: NVIDIA Titan X | 15.74 ms/image | ImageNet | Top-1: 57.5; Top-5: 80.3 | 227×227 |
| Xception | 2017-04-04 | 22.85 | 8.42 | Single-core GPU: NVIDIA GTX 1070 | 112.00 ms/image | ImageNet | Top-1: 78.8; Top-5: 94.3 | 299×299 |
| MobileNet | 2017-04-17 | 4.24 | 0.57 | Quad-core CPU: Xeon E3-1231 v3; quad-core GPU: NVIDIA TX2 | 503.00 ms/image; 136.00 FPS | ImageNet | Top-1: 70.6; Top-5: 91.7 | 224×224 |
| ShuffleNet | 2017-12-07 | 2.40 | 0.14 | Single-core GPU: NVIDIA TX2 | 164.18 FPS | ImageNet | Top-1: 68.0; Top-5: 86.4 | 512×512 |
| PeleeNet | 2018-04-18 | 2.80 | 0.51 | Single-core GPU: NVIDIA TX2 | 239.58 FPS | ImageNet | Top-1: 72.6; Top-5: 86.4 | 304×304 |
| DetNet | 2018-04-19 | 25.43 | 0.54 | 8-core GPU: NVIDIA Titan XP | 18.54 ms/image | ImageNet | Top-1: 76.2; Top-5: 89.3 | 1 333×800 |
| EffNet | 2018-06-05 | 4.91 | 0.08 | Single-core GPU: NVIDIA TX2 | 151.43 FPS | CIFAR-10 | Top-1: 80.2 | 224×224 |
| ShuffleNet V2 | 2018-07-30 | 2.32 | 0.15 | Single-core GPU: NVIDIA GTX 1080 Ti | 24.53 ms/image | ImageNet | Top-1: 69.4; Top-5: 88.4 | 224×224 |
| SqueezeNext | 2018-08-27 | 0.72 | 0.28 | Single-core GPU: NVIDIA Titan X | 15.69 ms/image | ImageNet | Top-1: 57.2; Top-5: 80.2 | 227×227 |
| MobileNet V2 | 2019-03-21 | 3.47 | 0.30 | Single-core GPU: NVIDIA TX2 | 123.00 FPS | ImageNet | Top-1: 72.0 | 224×224 |
| DFANet | 2019-04-03 | 7.69 | 3.37 | Single-core GPU: NVIDIA Titan X | 10.25 ms/image | CityScapes | class mIoU: 71.3 | 1 024×1 024 |
| GCNet | 2019-04-25 | 28.08 | 3.86 | Single-core CPU: Intel Xeon 4114 | 543.00 ms/image | COCO | mAP: 60.8 | 255×255 |
| CornerNet-Lite | 2019-04-18 | 171.63 | 11.37 | Quad-core GPU: NVIDIA GTX 1080 Ti | 114.73 ms/image | COCO | mAP: 63.0 | 255×255 |
| MobileNet V3 | 2019-05-06 | 5.37 | 0.22 | Single-core CPU; single-core GPU | 134.55 ms/image; 15.44 ms/image | ImageNet | Top-1: 75.2 | 224×224 |
| LEDNet | 2019-05-13 | 0.94 | 0.35 | Single-core GPU: GeForce 1080Ti | 14.00 ms/image (71.00 FPS) | CityScapes | class mIoU: 70.6; category mIoU: 87.1 | 1 024×512 |
| SENet | 2019-05-16 | 28.14 | 20.67 | 8-core GPU: NVIDIA Titan X | 167.31 ms/image | ImageNet | Top-1: 81.3; Top-5: 95.5 | 224×224 |
| DeepShift | 2019-06-06 | — | — | Single-core CPU: Intel I7-8086k | 750.00 ms/image | ImageNet | Top-1: 70.9; Top-5: 90.1 | 224×224 |
| ThunderNet | 2019-07-26 | 2.08 | 0.46 | Quad-core CPU: Xeon E5-2682v4; quad-core GPU: NVIDIA GeForce 1080Ti | 47.30 FPS; 267.00 FPS | COCO | mAP: 75.1 | 320×320 |
| YOLO Nano | 2019-10-03 | 4.03 | 0.40 | Quad-core CPU: Kirin 990 | 26.37 ms/image | VOC 2012 | mAP: 69.1 | 416×416 |
| GhostNet | 2019-11-27 | 5.17 | 0.14 | Single-core GPU: NVIDIA Titan X | 15.44 ms/image | ImageNet | Top-1: 73.9; Top-5: 91.4 | 224×224 |
| Rethinking Depthwise Separable Convolutions | 2020-03-31 | 5.03 | 0.74 | Single-core GPU: NVIDIA Titan X | 89.00 ms/image | ImageNet | Top-1: 72.0; Top-5: 93.1 | 224×224 |
| Coordinate Attention | 2021-03-04 | 3.95 | 0.30 | Single-core GPU: NVIDIA Titan X | 25.98 ms/image | ImageNet | Top-1: 74.3 | 255×255 |

Table 3 Comparison of model lightweighting effects
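The Params/10^6 and MAC/GB columns of Table 3 can be approximated for any PyTorch model with a few lines of code. The sketch below (ours, a simplified estimate rather than the measurement protocol behind Table 3; it counts only Conv2d and Linear layers, ignores bias, normalization, and activation costs, and the torchvision usage example is an assumption) accumulates multiply-accumulate operations with forward hooks.

```python
# Rough estimate of parameter count (millions) and MACs (billions) for a model.
import torch
import torch.nn as nn


def count_params_and_macs(model: nn.Module, input_size=(1, 3, 224, 224)):
    params = sum(p.numel() for p in model.parameters())
    macs = 0

    def conv_hook(module, inputs, output):
        nonlocal macs
        # MACs of a conv layer: kh * kw * (in_ch / groups) * out_ch * out_h * out_w
        out_h, out_w = output.shape[2], output.shape[3]
        kh, kw = module.kernel_size
        macs += kh * kw * (module.in_channels // module.groups) \
                * module.out_channels * out_h * out_w

    def linear_hook(module, inputs, output):
        nonlocal macs
        macs += module.in_features * module.out_features

    handles = []
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            handles.append(m.register_forward_hook(conv_hook))
        elif isinstance(m, nn.Linear):
            handles.append(m.register_forward_hook(linear_hook))

    model.eval()
    with torch.no_grad():
        model(torch.randn(*input_size))
    for h in handles:
        h.remove()
    return params / 1e6, macs / 1e9   # millions of parameters, GMACs


if __name__ == "__main__":
    import torchvision.models as models      # assumes torchvision is installed
    net = models.mobilenet_v2()
    print(count_params_and_macs(net))        # roughly (3.5, 0.3) for 224x224 input
```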