计算机科学与探索

• Academic Research •

深度在线多目标跟踪算法综述

刘文强,裘杭萍,李航,杨利,李阳,苗壮,李一,赵昕昕   

  1. 陆军工程大学 指挥控制工程学院,南京 210007

A Survey of Deep Online Multi-object Tracking Algorithms

LIU Wenqiang, QIU Hangping, LI Hang, YANG Li, LI Yang, MIAO Zhuang, LI Yi, ZHAO Xinxin   

  1. Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China

摘要: 视频多目标跟踪是计算机视觉领域的一个关键任务,在工业、商业及军事领域有着广泛的应用前景。目前,深度学习的快速发展为解决多目标跟踪问题提供了多种方案。然而,目标外观发生突变、目标区域被严重遮挡以及目标的消失和出现等挑战性的问题还未完全解决。本文重点关注基于深度学习的在线多目标跟踪算法,总结了该领域的最新进展,按照目标特征预测、表观特征提取和数据关联三个重要模块,依据基于检测跟踪(Detection-Based-Tracking,DBT)和联合检测跟踪(Joint-Detection-Tracking,JDT)两个经典框架将深度在线多目标跟踪算法分为了六个小类,讨论不同类别算法的原理和优缺点。其中,DBT算法的多阶段设计结构清晰,容易优化,但多阶段的训练可能导致次优解;JDT算法融合检测和跟踪的子模块达到了更快的推理速度,但存在各模块协同训练的问题。目前,多目标跟踪开始关注目标的长期特征提取、遮挡目标处理、关联策略改进以及端到端框架的设计。最后,结合已有算法,本文总结了深度在线多目标跟踪亟待解决的问题并展望未来可能的研究方向。

关键词: 在线多目标跟踪, 深度学习, 特征提取, 数据关联

Abstract: Video multi-object tracking is a key task in computer vision with broad application prospects in industrial, commercial, and military fields. The rapid development of deep learning has provided many solutions to the multi-object tracking problem. However, challenges such as abrupt changes in target appearance, severe occlusion of target regions, and the disappearance and appearance of targets have not been fully solved. This paper focuses on deep-learning-based online multi-object tracking algorithms and summarizes the latest progress in the field. According to three important modules (target feature prediction, appearance feature extraction, and data association) and two classic frameworks, Detection-Based-Tracking (DBT) and Joint-Detection-Tracking (JDT), deep online multi-object tracking algorithms are divided into six sub-classes, and the principles, advantages, and disadvantages of each class are discussed. The multi-stage design of DBT algorithms has a clear structure and is easy to optimize, but multi-stage training may lead to sub-optimal solutions; JDT algorithms integrate the detection and tracking sub-modules to achieve faster inference, but face the problem of training the modules jointly. Current research on multi-object tracking has begun to focus on long-term feature extraction, handling of occluded targets, improved association strategies, and end-to-end framework design. Finally, based on existing algorithms, this paper summarizes the open problems in deep online multi-object tracking and discusses possible future research directions.

Key words: Online multi-object tracking, Deep learning, Feature extraction, Data association