Journal of Frontiers of Computer Science and Technology ›› 2021, Vol. 15 ›› Issue (3): 545-552. DOI: 10.3778/j.issn.1673-9418.2005001

• Graphics and Image •

Optimized Pedestrian Detection Algorithm for Norm-DP Model

CHAI Enhui, MA Zhanfei, ZHI Min

  1. School of Information Science and Technology, Inner Mongolia University of Science and Technology Baotou Teachers' College, Baotou, Inner Mongolia 014030, China
  2. School of Computer Science, Inner Mongolia Normal University, Hohhot 010022, China
  • Online: 2021-03-01 Published: 2021-03-05

Abstract:

The traditional deep pyramid model has attracted considerable attention as an effective pedestrian detection algorithm. It fuses the deformable part model with a convolutional neural network model. However, the algorithms used in its feature extraction stage operate on pixel regions of different sizes, so the two models cannot be fully fused, and detection performance is unsatisfactory when there are many pedestrians, complex postures, or occlusions. This paper therefore proposes a deep pyramid model algorithm based on a normalization function (Norm-DP). The algorithm uses a normalization function to fuse the deformable part model and the convolutional neural network model, extracts positive and negative samples directly from the pyramid features, and trains the model with a latent support vector machine. The soft non-maximum suppression (soft-NMS) and bounding box regression (BBR) algorithms are then combined to refine the localization boxes. Experiments on the INRIA and MS COCO datasets show that, in scenes with many pedestrians, complex postures, and occlusions, the detection accuracy of the proposed algorithm is higher than that of the best deformable part model algorithm, convolutional neural network algorithm, deep pyramid model algorithm, and convolutional neural network algorithm combined with region selection.

Key words: convolutional neural network (CNN), deformable part model algorithm, normalized deep pyramid (Norm-DP), soft non-maximum suppression (soft-NMS), bounding box regression (BBR)
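
For readers unfamiliar with soft-NMS, the box-refinement step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which soft-NMS variant or parameters the authors use, so the Gaussian score decay, the value `sigma=0.5`, the `score_threshold` cutoff, and the function names `iou` and `soft_nms` are all assumptions made for illustration and are not the paper's implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian soft-NMS (assumed variant; parameters are illustrative).

    Instead of discarding boxes that overlap the current best box, their
    scores are decayed by exp(-iou^2 / sigma). Boxes are dropped only when
    their decayed score falls below score_threshold.
    Returns a list of (box, score) pairs in order of selection.
    """
    candidates = [(tuple(b), float(s)) for b, s in zip(boxes, scores)]
    kept = []
    while candidates:
        # Select the remaining candidate with the highest current score.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))
        # Decay the scores of the remaining candidates by their overlap.
        decayed = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((box, new_score))
        candidates = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(boxes, scores):
        print(box, round(score, 4))
```

In crowded scenes, where neighbouring pedestrians overlap heavily, decaying scores rather than deleting boxes keeps a partially occluded person as a lower-scored detection, which is the usual motivation for preferring soft-NMS over hard NMS.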