Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (6): 1027-1037. DOI: 10.3778/j.issn.1673-9418.1805051

Deep Sparse Auto-Encoder Network for Pedestrian Detection

CUI Peng+, ZHAO Shasha, FAN Zhixu   

  1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
  • Online: 2019-06-01 Published: 2019-06-14

Abstract: Traditional convolutional neural networks applied to pedestrian detection suffer from slow convolution, weak noise resistance, and high redundancy. To address these problems, a method based on a deep sparse auto-encoder network (DSAEN) is proposed. First, an interested layer is added after the input layer. It enriches the symmetry features of the image on the basis of non-neighboring and neighboring features (NNNF) and fuses cross-channel features from multiple channels, such as the LUV color space and the gradient magnitude channel. An improved stochastic pooling method is applied in the region processing of the non-neighboring features to eliminate the influence of pedestrian deformation, yielding the main features that represent pedestrian information. Next, four auto-encoders are used to build a deep network with four hidden layers. Cross-entropy serves as the loss function and an improved ReLU (rectified linear unit) as the activation function; combined with sparse representation theory, they define a new objective function with which the network is trained to discover the internal structure of the data. Finally, a classifier is trained on the effective features output by the fourth hidden layer. Experiments on a public database show that, compared with other existing methods, the proposed method lowers the average miss rate, reduces running time, and is robust.

Key words: interested layer, non-neighboring and neighboring features (NNNF), stochastic pooling, deep sparse auto-encoder

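As a rough illustration of two ingredients named in the abstract, the sketch below shows plain stochastic pooling over one region and a sparse auto-encoder objective that adds a sparsity penalty to a cross-entropy reconstruction loss. The abstract does not give the paper's improved pooling rule, its improved ReLU, or its exact objective, so everything here is an assumption: standard stochastic pooling, an unmodified ReLU encoder, a sigmoid decoder, inputs scaled to [0, 1], and an L1 penalty on the hidden activations with a hypothetical weight `lam`. All function names are illustrative only.

```python
import math
import random


def stochastic_pool(region, rng):
    """Standard stochastic pooling for one region of non-negative activations.

    One activation is sampled with probability proportional to its value.
    This is the textbook rule, not the paper's improved variant.
    """
    total = sum(region)
    if total == 0:
        return 0.0  # all-zero region: nothing to sample from
    return rng.choices(region, weights=region, k=1)[0]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, W, b):
    """Hidden activations h = relu(W x + b); W is a list of weight rows."""
    return [relu(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def decode(h, W2, b2):
    """Reconstruction in (0, 1) via a sigmoid, so cross-entropy is defined."""
    return [sigmoid(sum(w * hj for w, hj in zip(row, h)) + bi)
            for row, bi in zip(W2, b2)]


def cross_entropy(x, x_hat, eps=1e-12):
    """Element-wise binary cross-entropy, summed over input dimensions."""
    return -sum(xi * math.log(xh + eps) + (1.0 - xi) * math.log(1.0 - xh + eps)
                for xi, xh in zip(x, x_hat))


def sparse_ae_loss(batch, W, b, W2, b2, lam=0.1):
    """Mean over the batch of: reconstruction cross-entropy + lam * L1(h).

    The L1 term is an assumed stand-in for the paper's sparsity constraint.
    """
    total = 0.0
    for x in batch:
        h = encode(x, W, b)
        x_hat = decode(h, W2, b2)
        total += cross_entropy(x, x_hat) + lam * sum(abs(hj) for hj in h)
    return total / len(batch)


if __name__ == "__main__":
    rng = random.Random(0)
    print(stochastic_pool([0.1, 0.7, 0.2, 0.0], rng))
    W, b = [[0.5, -0.5]], [0.1]            # 2 inputs -> 1 hidden unit
    W2, b2 = [[0.3], [-0.3]], [0.0, 0.0]   # 1 hidden unit -> 2 outputs
    print(sparse_ae_loss([[1.0, 0.0], [0.0, 1.0]], W, b, W2, b2))
```

To build a four-hidden-layer network from four auto-encoders, a common convention is to feed the hidden activations of one trained auto-encoder to the next as its input. That greedy layer-wise scheme is assumed here and is not confirmed by the abstract.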