[1] LeCun Y, Denker J S, Solla S A. Optimal brain damage[J]. Advances in Neural Information Processing Systems, 1990, 2: 598-605.
[2] Han S, Mao H, Dally W J. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding[J]. arXiv:1510.00149, 2015.
[3] Zhang C, Tian J, Wang Y S, et al. Survey of model compression method for neural networks[J]. Computer Science, 2018, 45(10): 1-5. 张弛, 田锦, 王永森, 等. 神经网络模型压缩方法综述[J]. 计算机科学, 2018, 45(10): 1-5.
[4] Cao W L, Rui J W, Li M. Survey of neural network model compression methods[J]. Application Research of Computers, 2018, 36(3): 649-656. 曹文龙, 芮建武, 李敏. 神经网络模型压缩方法综述[J]. 计算机应用研究, 2018, 36(3): 649-656.
[5] Luo J, Wu J. An entropy-based pruning method for CNN compression[J]. arXiv:1706.05791, 2017.
[6] Yang T, Chen Y, Sze V. Designing energy-efficient convolutional neural networks using energy-aware pruning[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 6071-6079.
[7] Hu Y, Sun S, Li J, et al. A novel channel pruning method for deep neural network compression[J]. arXiv:1805.11394, 2018.
[8] He Y H, Zhang X Y, Sun J. Channel pruning for accelerating very deep neural networks[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 1389-1397.
[9] Anwar S, Sung W Y. Coarse pruning of convolutional neural networks with random masks[C]//Proceedings of the 2017 International Conference on Learning Representations, Toulon, Apr 24-26, 2017: 134-145.
[10] Li H, Kadav A, Durdanovic I, et al. Pruning filters for efficient ConvNets[J]. arXiv:1608.08710, 2016.
[11] Molchanov P, Tyree S, Karras T, et al. Pruning convolutional neural networks for resource efficient inference[J]. arXiv:1611.06440, 2016.
[12] Hu H, Peng R, Tai Y W, et al. Network trimming: a data-driven neuron pruning approach towards efficient deep architectures[J]. arXiv:1611.05128, 2016.
[13] Mittal D, Bhardwaj S, Khapra M M, et al. Recovering from random pruning: on the plasticity of deep convolutional neural networks[J]. arXiv:1801.10447, 2018.
[14] He Y, Liu P, Wang Z, et al. Filter pruning via geometric median for deep convolutional neural networks acceleration[C]//Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Washington: IEEE Computer Society, 2019: 4340-4349.
[15] Yu R, Li A, Chen C F, et al. NISP: pruning networks using neuron importance score propagation[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 9194-9203.
[16] Guo Y, Yao A, Chen Y. Dynamic network surgery for efficient DNNs[J]. arXiv:1608.04493, 2016.
[17] Tian Q, Arbel T, Clark J J. Deep LDA-pruned nets for efficient facial gender classification[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 512-521.
[18] Cheng Y, Yu F X, Feris R S, et al. An exploration of parameter redundancy in deep networks with circulant projections[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Washington: IEEE Computer Society, 2015: 2857-2865.
[19] Hsiao T Y, Chang Y C, Chou H, et al. Filter-based deep-compression with global average pooling for convolutional networks[J]. Journal of Systems Architecture, 2019, 95: 9-18.
[20] Lin S, Ji R, Yan C, et al. Towards optimal structured CNN pruning via generative adversarial learning[C]//Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Washington: IEEE Computer Society, 2019: 2790-2799.
[21] Anwar S, Hwang K, Sung W. Structured pruning of deep convolutional neural networks[J]. ACM Journal on Emerging Technologies in Computing Systems, 2017, 13(3): 32.
[22] Jin L L, Yang W Z, Wang S L, et al. Mixed pruning method for convolutional neural network compression[J]. Journal of Chinese Computer Systems, 2018, 39(12): 2596-2601. 靳丽蕾, 杨文柱, 王思乐, 等. 一种用于卷积神经网络压缩的混合剪枝方法[J]. 小型微型计算机系统, 2018, 39(12): 2596-2601.
[23] Huang C, Chang T, Tan H, et al. Neural network pruning based on weight similarity[J]. Journal of Frontiers of Computer Science and Technology, 2018, 12(8): 1278-1285. 黄聪, 常滔, 谭虎, 等. 基于权值相似性的神经网络剪枝[J]. 计算机科学与探索, 2018, 12(8): 1278-1285.
[24] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv:1503.02531, 2015.
[25] Zagoruyko S, Komodakis N. Paying more attention to attention: improving the performance of convolutional neural networks via attention transfer[J]. arXiv:1612.03928, 2016.
[26] Wen W, Wu C, Wang Y, et al. Learning structured sparsity in deep neural networks[J]. arXiv:1608.03665, 2016.
[27] Heo D Y, Nam J Y, Ko B C. Estimation of pedestrian pose orientation using soft target training based on teacher-student framework[J]. Sensors, 2019, 19(5): 1147.
[28] Min R, Hai L, Zong J, et al. A gradually distilled CNN for SAR target recognition[J]. IEEE Access, 2019, 7: 42190-42200.
[29] Xu Z, Song Z Q. Convolution neural network compression method with scale factor[J]. Computer Engineering and Applications, 2018, 54(12): 105-109. 徐喆, 宋泽奇. 带比例因子的卷积神经网络压缩方法[J]. 计算机工程与应用, 2018, 54(12): 105-109.
[30] Rastegari M, Ordonez V, Redmon J, et al. XNOR-Net: ImageNet classification using binary convolutional neural networks[J]. arXiv:1603.05279, 2016.
[31] Lin X, Zhao C, Pan W. Towards accurate binary convolutional neural network[J]. arXiv:1711.11294, 2017.
[32] Li Z, Ni B, Zhang W, et al. Performance guaranteed network acceleration via high-order residual quantization[J]. arXiv:1708.08687, 2017.
[33] Liu Z, Wu B, Luo W, et al. Bi-Real Net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm[J]. arXiv:1808.00278, 2018.
[34] Courbariaux M, Bengio Y, David J P. BinaryConnect: training deep neural networks with binary weights during propagations[J]. arXiv:1511.00363, 2015.
[35] Zhu C, Han S, Mao H, et al. Trained ternary quantization[J]. arXiv:1612.01064, 2016.
[36] Xu Y, Dong X, Li Y, et al. A main/subsidiary network framework for simplifying binary neural networks[J]. arXiv:1812.04210, 2018.
[37] Li F, Zhang B, Liu B. Ternary weight networks[J]. arXiv:1605.04711, 2016.
[38] Jacob B, Kligys S, Chen B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[J]. arXiv:1712.05877, 2017.
[39] Jung S, Son C, Lee S, et al. Learning to quantize deep networks by optimizing quantization intervals with task loss[J]. arXiv:1808.05779, 2018.
[40] Dong Y, Ni R, Li J, et al. Learning accurate low-bit deep neural networks with stochastic quantization[J]. arXiv:1708.01001, 2017.
[41] Zhou A J, Yao A B, Guo Y W, et al. Incremental network quantization: towards lossless CNNs with low-precision weights[J]. arXiv:1702.03044, 2017.
[42] Wang Y, Xu C, You S, et al. CNNpack: packing convolutional neural networks in the frequency domain[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41: 2495-2510.
[43] Qin Z D, Zhu D, Zhu X W, et al. Accelerating deep neural networks by combining block-circulant matrices and low-precision weights[J]. Electronics, 2019, 8(1): 78.
[44] Seo S, Kim J. Efficient weights quantization of convolutional neural networks using kernel density estimation based non-uniform quantizer[J]. Applied Sciences, 2019, 9(12): 2559.
[45] Tan W R, Chan C S, Aguirre H E, et al. Fuzzy qualitative deep compression network[J]. Neurocomputing, 2017, 251: 1-15.
[46] Lin J, Gan C, Han S. Defensive quantization: when efficiency meets robustness[J]. arXiv:1904.08444, 2019.
[47] Li Y, Lin S, Zhang B, et al. Exploiting kernel sparsity and entropy for interpretable CNN compression[C]//Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Washington: IEEE Computer Society, 2019: 2800-2809.
[48] Wang K, Liu Z, Lin Y, et al. HAQ: hardware-aware automated quantization[J]. arXiv:1811.08886, 2018.
[49] Howard A G, Zhu M, Chen B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[J]. arXiv:1704.04861, 2017.
[50] Sandler M, Howard A, Zhu M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 4510-4520.
[51] Zhang X Y, Zhou X Y, Lin M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 6848-6856.
[52] Ma N N, Zhang X Y, Zheng H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 122-138.
[53] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size[J]. arXiv:1602.07360, 2016.
[54] Chollet F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 1251-1258.
[55] Mehta S, Rastegari M, Shapiro L, et al. ESPNetv2: a light-weight, power efficient, and general purpose convolutional neural network[C]//Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Washington: IEEE Computer Society, 2019: 9190-9200.
[56] Li X, Long R, Yan J, et al. TANet: a tiny plankton classification network for mobile devices[J]. Mobile Information Systems, 2019(4): 1-8.
[57] Jin X, Yuan X, Feng J, et al. Training skinny deep neural networks with iterative hard thresholding methods[J]. arXiv:1607.05423, 2016.
[58] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[J]. arXiv:1611.05431, 2016.
[59] Yu Y, Huang J, Du W, et al. Design and analysis of a lightweight context fusion CNN scheme for crowd counting[J]. Sensors, 2019, 19(9): 2013.
[60] Zhu G, Wang J, Wang P, et al. Feature distilled tracking[J]. IEEE Transactions on Cybernetics, 2019, 49(2): 440-452.
[61] Elsken T, Metzen J H, Hutter F. Neural architecture search: a survey[J]. arXiv:1808.05377, 2018.
[62] Zoph B, Le Q V. Neural architecture search with reinforcement learning[J]. arXiv:1611.01578, 2016.
[63] Fan S, Yu H, Lu D, et al. CSCC: convolution split compression calculation algorithm for deep neural network[J]. IEEE Access, 2019, 7: 71607-71615.
[64] Liu H, Simonyan K, Yang Y. DARTS: differentiable architecture search[J]. arXiv:1806.09055, 2018.
[65] Frankle J, Carbin M. The lottery ticket hypothesis: finding sparse, trainable neural networks[J]. arXiv:1803.03635, 2018.
[66] Tan M, Chen B, Pang R, et al. MnasNet: platform-aware neural architecture search for mobile[C]//Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Washington: IEEE Computer Society, 2019: 2820-2828.
[67] Cai H, Zhu L G, Han S. ProxylessNAS: direct neural architecture search on target task and hardware[J]. arXiv:1812.00332, 2018.
[68] Bergomi M G, Frosini P, Giorgi D, et al. Towards a topological-geometrical theory of group equivariant non-expansive operators for data analysis and machine learning[J]. Nature Machine Intelligence, 2019, 1: 423-433.
[69] Prost-Boucle A, Bourge A, Petrot F. High-efficiency convolutional ternary neural networks with custom adder trees and weight compression[J]. ACM Transactions on Reconfigurable Technology and Systems, 2018, 11(3): 1-24.
[70] Tran D T, Iosifidis A, Gabbouj M. Improving efficiency in convolutional neural networks with multilinear filters[J]. Neural Networks, 2018, 105: 328-339.
[71] Tan M, Le Q V. EfficientNet: rethinking model scaling for convolutional neural networks[J]. arXiv:1905.11946, 2019.
[72] Zhu S L, Dong X, Su H. Binary ensemble neural network: more bits per network or more networks per bit?[J]. arXiv:1806.07550, 2018.
[73] Wang X, Kan M, Shan S, et al. Fully learnable group convolution for acceleration of deep neural networks[J]. arXiv:1904.00346, 2019.
[74] Pham H, Guan M Y, Zoph B, et al. Efficient neural architecture search via parameter sharing[J]. arXiv:1802.03268, 2018.
[75] Baker B, Gupta O, Naik N, et al. Designing neural network architectures using reinforcement learning[J]. arXiv:1611.02167, 2016.
[76] Zhong Z, Yan J, Wu W, et al. Practical block-wise neural network architecture generation[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 2423-2432.
[77] Zhong Z, Yan J, Liu C L. Practical network blocks design with Q-learning[J]. arXiv:1708.05552, 2017.
[78] Luo R, Tian F, Qin T, et al. Neural architecture optimization[J]. arXiv:1808.07233, 2018.
[79] Liu C, Zoph B, Neumann M, et al. Progressive neural architecture search[J]. arXiv:1712.00559, 2017.
[80] Kandasamy K, Neiswanger W, Schneider J, et al. Neural architecture search with Bayesian optimisation and optimal transport[J]. arXiv:1802.07191, 2018.
[81] Zela A, Klein A, Falkner S, et al. Towards automated deep learning: efficient joint neural architecture and hyperparameter search[J]. arXiv:1807.06906, 2018.
[82] Hsu C H, Chang S H, Liang J H, et al. MONAS: multi-objective neural architecture search using reinforcement learning[J]. arXiv:1806.10332, 2018.
[83] Dong J D, Cheng A C, Juan D C, et al. DPP-Net: device-aware progressive search for Pareto-optimal neural architectures[J]. arXiv:1806.08198, 2018.
[84] Cheng Y, Wang D, Zhou P, et al. Model compression and acceleration for deep neural networks: the principles, progress, and challenges[J]. IEEE Signal Processing Magazine, 2018, 35(1): 126-136.