Journal of Frontiers of Computer Science and Technology ›› 2018, Vol. 12 ›› Issue (8): 1278-1285.DOI: 10.3778/j.issn.1673-9418.1710016


Neural Network Pruning Based on Weight Similarity

HUANG Cong1, CHANG Tao1, TAN Hu2, LV Shaohe1, WANG Xiaodong1+   

  1. China National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China
    2. Xinjiang Armed Police Corps, Urumqi 830063, China
  • Online: 2018-08-01  Published: 2018-08-09



Abstract: With the development of deep learning, deep neural network architectures have become increasingly complex, driving up model depth, size, and computation, as well as memory and GPU cost. This makes such models difficult to deploy on mobile terminals and embedded devices, such as autonomous-driving systems, which have limited hardware resources and strict real-time requirements. Lightweight networks are the direction of future development, and building small, efficient neural networks has become a research hotspot. By studying how weight updates propagate in a neural network, this paper proposes a simple and easily understood network compression method. For convolutional layers, it applies a separable-convolution-based acceleration method that effectively reduces the computation of the convolutional (and pooling) layers, and evaluates it on a standard five-layer network. For fully connected layers, it applies a pruning method based on weight similarity, which can prune more than 90% of the fully connected units on the MNIST and CIFAR-10 datasets without an obvious decline in classification accuracy.
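The paper does not include code; the following is a minimal sketch of pruning by weight similarity, assuming the similarity criterion is cosine similarity between the incoming weight vectors of fully connected units (the function name and threshold are illustrative, not from the paper):

```python
import numpy as np

def prune_similar_units(W, threshold=0.95):
    """Keep one representative of each group of similar units.

    W: (n_units, n_inputs) weight matrix of a fully connected layer.
    A unit is dropped when its weight vector has absolute cosine
    similarity above `threshold` with an already-kept unit.
    Returns the indices of kept units.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    unit_dirs = W / np.maximum(norms, 1e-12)  # unit-normalized weight vectors
    kept = []
    for i in range(W.shape[0]):
        # keep unit i only if it is not too similar to any kept unit
        if all(abs(unit_dirs[i] @ unit_dirs[j]) < threshold for j in kept):
            kept.append(i)
    return kept
```

After pruning, the kept rows of `W` (and the corresponding columns of the next layer's weight matrix) would form the compressed layer; the paper reports that this style of pruning removes over 90% of fully connected units with little accuracy loss.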

Key words: deep learning, neural network, network compression, model acceleration, unit pruning
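Assuming "separable convolution" here refers to depthwise separable convolution (a depthwise k×k convolution per input channel followed by a 1×1 pointwise convolution, as in MobileNet-style designs, which the paper does not spell out), the parameter savings for a single layer can be illustrated as:

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (biases omitted)."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise k x k filters per input channel + 1 x 1 pointwise mixing."""
    return k * k * c_in + c_in * c_out
```

For example, with k=3, 64 input channels, and 128 output channels, the standard convolution uses 73,728 parameters while the separable version uses 8,768, roughly a factor of 1/c_out + 1/k² (here about 0.12); the per-position multiply count shrinks by the same factor.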
