Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (3): 512-528. DOI: 10.3778/j.issn.1673-9418.2107056
• Surveys and Frontiers •
MA Jinlin1,2,+, ZHANG Yu1,2, MA Ziping3, MAO Kaiji1,2
Received: 2021-07-14
Revised: 2021-09-29
Online: 2022-03-01
Published: 2021-09-29
About author: MA Jinlin, born in 1976 in Qingtongxia, Ningxia, Ph.D., associate professor. His research interests include computer vision, deep learning and machine learning.
Corresponding author: MA Jinlin, E-mail: 624160@163.com
MA Jinlin, ZHANG Yu, MA Ziping, MAO Kaiji. Research Progress of Lightweight Neural Network Convolution Design[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 512-528.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2107056
Improvement method | Approach | Advantages | Disadvantages
---|---|---|---
Adjust kernel size | Replace larger kernels with 3×3 convolutions | Keeps the receptive field unchanged while using fewer operations, reducing the model's computational cost | If 3×3 convolutions are used too frequently, the computational advantage disappears
Use 1×1 convolutions | Manually control the number of channels to raise or reduce dimensionality | — | Must be used jointly with other convolution operations, and the number used requires care
Change the number of kernels | Decouple the spatial and channel dimensions | A 3D input feature can be split into independent 2D spatial features and a 1D channel feature; computing in steps reduces the convolution cost | Some 2D convolutions tend to "die" during training: after ReLU their outputs are more likely to be all zeros
Factorize in the spatial dimension | Reduce computation at the mathematical level by factorizing convolutions | — | The structure between convolutions becomes more complex
Change kernel count and network width | After building lightweight convolutions by combining the two methods above, widen the network with skip connections and parallel branches | Lightweight grouped convolutions, with adjusted relations between the groups, speed up the model | The number of groups must be controlled, since too many groups increase the model's computational cost
Table 1 Comparison of convolution lightweight technology
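The savings summarized in Table 1 can be made concrete by counting weights. A minimal sketch in pure Python (the formulas are the standard parameter counts for convolutions; the channel count is a hypothetical example, not a figure from the paper):

```python
def conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (bias ignored): k*k*c_in*c_out."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k conv (one filter per channel) followed by a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

c = 256  # hypothetical channel count

# Two stacked 3x3 convolutions cover the same 5x5 receptive field with fewer weights.
stacked_3x3 = 2 * conv_params(3, c, c)   # 18 * c^2
single_5x5 = conv_params(5, c, c)        # 25 * c^2
print(stacked_3x3 < single_5x5)          # True

# Decoupling the spatial and channel dimensions (depthwise separable) is cheaper still.
standard = conv_params(3, c, c)
separable = depthwise_separable_params(3, c, c)
print(f"standard={standard}, separable={separable}, ratio={standard / separable:.1f}x")
```

The roughly 8-9x reduction at 256 channels is why depthwise separable convolutions anchor most of the networks compared in Table 3.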
Module lightweighting | Approach | Advantages | Disadvantages
---|---|---|---
Residual module | Skip connections and an hourglass-shaped bottleneck | An optimized convolution structure; its convolutions can be swapped for lightweight ones, and modifying the structure introduces no extra parameters | Produces a large number of 1×1 or 3×3 convolutions
ResNeXt module | Grouped convolution | Splitting the channels into groups reduces model depth, and executing the groups in parallel can further improve performance | The number of groups depends on hardware resources and the network design
SE module | Uses almost exclusively 1×1 convolutions, recalibrating features by changing the channel count | 1×1 convolutions consume little computation, and recalibration improves model performance | The context captured in the Squeeze step cannot be exploited effectively
Attention module | CBAM consists of a channel attention module (CAM) and a spatial attention module (SAM) | An upgraded SE module along the channel dimension, whose channel-wise results are further adjusted in the spatial dimension | When the network has many channels, the FC layers in CAM become a serious time cost
Table 2 Comparison of lightweight technology of modules
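The SE module's squeeze-excite-scale pipeline described in Table 2 can be sketched without any framework. A toy, dependency-free version (real implementations use learned 1×1 convolution / FC weights; the 2-channel input and weight matrices below are made up purely to show the data flow):

```python
import math

def global_avg_pool(feature_map):
    """Squeeze: reduce each channel (a 2D grid) to one scalar by global average pooling."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_module(feature_map, w_reduce, w_expand):
    """Excite: a channel-reducing FC layer with ReLU, a channel-expanding FC layer
    with sigmoid gates; then rescale each input channel by its gate."""
    z = global_avg_pool(feature_map)
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(feature_map, gates)]

# Hypothetical 2-channel 2x2 input and made-up weights.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 4.0]]]
w_reduce = [[0.5, 0.5]]      # 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]   # 1 hidden unit -> 2 gates
y = se_module(x, w_reduce, w_expand)
```

Because the gates are computed only from channel-wise averages, the Squeeze step discards spatial context, which is the weakness Table 2 notes and which CBAM's spatial attention (SAM) addresses.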
Model | Release date | Params/10^6 | MAC/GB | Processor | Runtime | Dataset | Accuracy/% | Input size/pixels
---|---|---|---|---|---|---|---|---
PVANet | 2016-09-30 | 3.28 | 7.92 | Single-core CPU: Intel i7-6700K | 750.00 ms/image (1.31 FPS) | VOC2007 / VOC2012 | mAP: 83.8 / mAP: 82.5 | 1 065×640
 | | | | Single GPU: NVIDIA Titan X | 46.00 ms/image (21.73 FPS) | | |
SqueezeNet | 2016-11-04 | 1.24 | 0.35 | Single GPU: NVIDIA Titan X | 15.74 ms/image | ImageNet | Top-1: 57.5, Top-5: 80.3 | 227×227
Xception | 2017-04-04 | 22.85 | 8.42 | Single GPU: NVIDIA GTX 1070 | 112.00 ms/image | ImageNet | Top-1: 78.8, Top-5: 94.3 | 299×299
MobileNet | 2017-04-17 | 4.24 | 0.57 | Quad-core CPU: Xeon E3-1231 v3 | 503.00 ms/image | ImageNet | Top-1: 70.6, Top-5: 91.7 | 224×224
 | | | | 4× GPU: NVIDIA TX2 | 136.00 FPS | | |
ShuffleNet | 2017-12-07 | 2.40 | 0.14 | Single GPU: NVIDIA TX2 | 164.18 FPS | ImageNet | Top-1: 68.0, Top-5: 86.4 | 512×512
PeleeNet | 2018-04-18 | 2.80 | 0.51 | Single GPU: NVIDIA TX2 | 239.58 FPS | ImageNet | Top-1: 72.6, Top-5: 86.4 | 304×304
DetNet | 2018-04-19 | 25.43 | 0.54 | 8× GPU: NVIDIA Titan XP | 18.54 ms/image | ImageNet | Top-1: 76.2, Top-5: 89.3 | 1 333×800
EffNet | 2018-06-05 | 4.91 | 0.08 | Single GPU: NVIDIA TX2 | 151.43 FPS | CIFAR-10 | Top-1: 80.2 | 224×224
ShuffleNet V2 | 2018-07-30 | 2.32 | 0.15 | Single GPU: NVIDIA GTX 1080 Ti | 24.53 ms/image | ImageNet | Top-1: 69.4, Top-5: 88.4 | 224×224
SqueezeNext | 2018-08-27 | 0.72 | 0.28 | Single GPU: NVIDIA Titan X | 15.69 ms/image | ImageNet | Top-1: 57.2, Top-5: 80.2 | 227×227
MobileNet V2 | 2019-03-21 | 3.47 | 0.30 | Single GPU: NVIDIA TX2 | 123.00 FPS | ImageNet | Top-1: 72.0 | 224×224
DFANet | 2019-04-03 | 7.69 | 3.37 | Single GPU: NVIDIA Titan X | 10.25 ms/image | CityScapes | class mIoU: 71.3 | 1 024×1 024
GCNet | 2019-04-25 | 28.08 | 3.86 | Single-core CPU: Intel Xeon 4114 | 543.00 ms/image | COCO | mAP: 60.8 | 255×255
CornerNet-Lite | 2019-04-18 | 171.63 | 11.37 | 4× GPU: NVIDIA GTX 1080 Ti | 114.73 ms/image | COCO | mAP: 63.0 | 255×255
MobileNet V3 | 2019-05-06 | 5.37 | 0.22 | Single-core CPU / Single GPU | 134.55 ms/image / 15.44 ms/image | ImageNet | Top-1: 75.2 | 224×224
LEDNet | 2019-05-13 | 0.94 | 0.35 | Single GPU: GeForce 1080 Ti | 14.00 ms/image (71.00 FPS) | CityScapes | class mIoU: 70.6, category mIoU: 87.1 | 1 024×512
SENet | 2019-05-16 | 28.14 | 20.67 | 8× GPU: NVIDIA Titan X | 167.31 ms/image | ImageNet | Top-1: 81.3, Top-5: 95.5 | 224×224
DeepShift | 2019-06-06 | — | — | Single-core CPU: Intel i7-8086K | 750.00 ms/image | ImageNet | Top-1: 70.9, Top-5: 90.1 | 224×224
ThunderNet | 2019-07-26 | 2.08 | 0.46 | Quad-core CPU: Xeon E5-2682 v4 | 47.30 FPS | COCO | mAP: 75.1 | 320×320
 | | | | 4× GPU: NVIDIA GeForce 1080 Ti | 267.00 FPS | | |
YOLO Nano | 2019-10-03 | 4.03 | 0.40 | Quad-core CPU: Kirin 990 | 26.37 ms/image | VOC 2012 | mAP: 69.1 | 416×416
GhostNet | 2019-11-27 | 5.17 | 0.14 | Single GPU: NVIDIA Titan X | 15.44 ms/image | ImageNet | Top-1: 73.9, Top-5: 91.4 | 224×224
Rethinking Depthwise Separable Convolutions | 2020-03-31 | 5.03 | 0.74 | Single GPU: NVIDIA Titan X | 89.00 ms/image | ImageNet | Top-1: 72.0, Top-5: 93.1 | 224×224
Coordinate Attention | 2021-03-04 | 3.95 | 0.30 | Single GPU: NVIDIA Titan X | 25.98 ms/image | ImageNet | Top-1: 74.3 | 255×255
Table 3 Comparison of model lightweight effects
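Table 3 reports latency in two conventions, ms/image and FPS, which are reciprocals of each other (FPS = 1000 / ms). A small helper for putting rows on a common scale (illustrative only; small rounding differences against the table's reported values are expected):

```python
def ms_to_fps(ms_per_image):
    """Convert per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image

def fps_to_ms(fps):
    """Convert frames per second to per-image latency in milliseconds."""
    return 1000.0 / fps

# e.g. PVANet's GPU entry: 46.00 ms/image is about 21.7 FPS,
print(round(ms_to_fps(46.0), 1))
# and ShuffleNet's 164.18 FPS is about 6.1 ms/image.
print(round(fps_to_ms(164.18), 1))
```

Note that such conversions only make comparisons meaningful within the same hardware class; the table itself mixes desktop GPUs, embedded modules (TX2, Kirin 990) and server CPUs.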