Survey on Video Object Tracking Algorithms

Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (7): 1504-1515. DOI: 10.3778/j.issn.1673-9418.2111105

LIU Yi¹,⁺, LI Mengmeng¹, ZHENG Qibin², QIN Wei¹, REN Xiaoguang¹

1. Defense Innovation Institute, Beijing 100071, China
2. Academy of Military Science, Beijing 100091, China

Received: 2021-11-22 Revised: 2022-01-20 Online: 2022-07-01 Published: 2022-07-25
About the authors:
LIU Yi, born in 1990 in Bengbu, Anhui, Ph.D., assistant researcher. His research interests include robot operating systems, data quality and evolutionary algorithms.
LI Mengmeng, born in 1992 in Handan, Hebei, M.S. candidate. Her research interests include evolutionary algorithms, data quality, object tracking, etc.
ZHENG Qibin, born in 1990 in Lanzhou, Gansu, Ph.D., assistant researcher. His research interests include data engineering, data mining, machine learning, etc.
QIN Wei, born in 1983 in Fuyang, Anhui, M.S., assistant researcher. His research interest is intelligent information system management.
REN Xiaoguang, born in 1986 in Suizhou, Hubei, Ph.D., associate research fellow. His research interests include high performance computing, numerical computation and simulation, robot operating systems, etc.
Supported by:
the National Natural Science Foundation for Young Scientists of China (61802426)

Abstract

Video object tracking is an important research topic in computer vision; it studies how to locate objects of interest in video streams or image sequences. It is widely used in video surveillance, autonomous driving, precision guidance and other fields, so a comprehensive review of video object tracking algorithms is of great significance. First, according to their source, the challenges faced by video object tracking are divided into two groups, factors of the object itself and background factors, and each group is summarized. Second, the typical video object tracking algorithms of recent years are divided into correlation filtering based algorithms and deep learning based algorithms. The correlation filtering based algorithms are further divided into three categories: kernel correlation filtering algorithms, scale-adaptive correlation filtering algorithms and multi-feature fusion correlation filtering algorithms. The deep learning based algorithms are divided into two categories: algorithms based on Siamese networks and algorithms based on convolutional neural networks. Each category is analyzed in terms of research motivation, algorithmic idea, advantages and disadvantages. Then, the commonly used datasets and evaluation metrics are introduced. Finally, the paper is summarized and future development trends in video object tracking are pointed out.

Key words: computer vision, video object tracking, correlation filtering, deep learning

CLC number: TP391.4

LIU Yi, LI Mengmeng, ZHENG Qibin, QIN Wei, REN Xiaoguang. Survey on Video Object Tracking Algorithms[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(7): 1504-1515. (in Chinese)

Figures/Tables: 6

References: 74

[1]	LI X, ZHA Y F, ZHANG T Z, et al. Survey of visual object tracking algorithms based on deep learning[J]. Journal of Image and Graphics, 2019, 24(12): 2057-2080. (in Chinese)
[2]	LIANG J W, JIANG L, NIEBLES J C, et al. Peeking into the future: predicting future person activities and locations in videos[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 5725-5734.
[3]	JANI D, MANKODIA A. Comprehensive analysis of object detection and tracking methodologies from surveillance videos[C]// Proceedings of the 2021 International Conference on Computing Methodologies and Communication, Erode, Apr 8-10, 2021. Piscataway: IEEE, 2021: 963-970.
[4]	LI P, CHEN X, SHEN S. Stereo R-CNN based 3D object detection for autonomous driving[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 7644-7652.
[5]	BOLME D S, BEVERIDGE J R, DRAPER B A, et al. Visual object tracking using adaptive correlation filters[C]// Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, Jun 13-18, 2010. Washington: IEEE Computer Society, 2010: 2544-2550.
[6]	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[7]	HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]// LNCS 7575: Proceedings of the 12th European Conference on Computer Vision, Florence, Oct 7-13, 2012. Berlin, Heidelberg: Springer, 2012: 702-715.
[8]	HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596.
[9]	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, Sep 21-23, 2005. Washington: IEEE Computer Society, 2005: 886-893.
[10]	BAO F, CAO Y, ZHANG S, et al. Using segmentation with multi-scale selective kernel for visual object tracking[J]. IEEE Signal Processing Letters, 2022, 29: 553-557.
[11]	LI H, PU L. Correlation filtering tracking algorithm with joint scale estimation and occlusion processing[C]// Proceedings of the 2021 International Conference on Intelligent Transportation, Big Data & Smart City, Xi’an, Mar 27-28, 2021. Piscataway: IEEE, 2021: 663-667.
[12]	FANG Y, JO G S, LEE C H. RSINet: rotation-scale invariant network for online visual tracking[C]// Proceedings of the 2021 International Conference on Pattern Recognition, Milan, Jan 10-15, 2021. Piscataway: IEEE, 2021: 4153-4160.
[13]	SHAO J, DU B, WU C, et al. Can we track targets from space? A hybrid kernel correlation filter tracker for satellite video[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(11): 8719-8731.
[14]	MA J, WANG Y H. Object tracking algorithm based on adaptive update strategy and re-detection technology[J]. Computer Engineering and Applications, 2021, 57(9): 217-224. (in Chinese)
[15]	YUAN Y, CHU J, LENG L, et al. A scale-adaptive object-tracking algorithm with occlusion detection[J]. Eurasip Journal on Image and Video Processing, 2020(1): 7.
[16]	HE X, ZHAO L, CHEN Y C. Variable scale learning for visual object tracking[J]. Journal of Ambient Intelligence and Humanized Computing, 2021. DOI: 10.1007/s12652-021-03469-2.
[17]	HAN W, DONG X, KHAN F S, et al. Learning to fuse asymmetric feature maps in siamese trackers[C]// Proceedings of the 2021 IEEE Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Washington: IEEE Computer Society, 2021: 16570-16580.
[18]	LI C, LIU X, ZHANG X, et al. Design of UAV single object tracking algorithm based on feature fusion[C]// Proceedings of the 2021 Chinese Control Conference, Shanghai, Jul 26-28, 2021. Piscataway: IEEE, 2021: 3088-3092.
[19]	ZHANG K H, ZHANG L, LIU Q S, et al. Fast visual tracking via dense spatio-temporal context learning[C]// LNCS 8693: Proceedings of the 13th European Conference on Computer Vision, Zurich, Sep 6-12, 2014. Cham: Springer, 2014: 127-141.
[20]	COMANICIU D, RAMESH V, MEER P. Kernel-based object tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(5): 564-575.
[21]	DANELLJAN M, KHAN F S, FELSBERG M, et al. Adaptive color attributes for real-time visual tracking[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Jun 23-28, 2014. Washington: IEEE Computer Society, 2014: 1090-1097.
[22]	YIN Y K, DU X P, CHU W B, et al. A color histogram based large motion trend fusion algorithm for vehicle tracking[J]. IEEE Access, 2021, 9: 83394-83401.
[23]	MORRIS R, MIRZAEI S. Efficient FPGA implementation of parameterized real time color based object tracking[C]// Proceedings of the 2021 IEEE Annual Information Technology, Electronics and Mobile Communication Conference, Vancouver, Oct 27-30, 2021. Piscataway: IEEE, 2021: 102-105.
[24]	ZHANG P, ZHAO J, BO C, et al. Jointly modeling motion and appearance cues for robust RGB-T tracking[J]. IEEE Transactions on Image Processing, 2021, 30: 3335-3347.
[25]	DANELLJAN M, HAGER G, KHAN F S, et al. Convolutional features for correlation filter based visual tracking[C]// Proceedings of the 2015 International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 621-629.
[26]	DANELLJAN M, HAGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]// Proceedings of the 2015 International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 4310-4318.
[27]	DANELLJAN M, ROBINSON A, KHAN F S, et al. Beyond correlation filters: learning continuous convolution operators for visual tracking[C]// LNCS 9909: Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 8-16, 2016. Cham: Springer, 2016: 472-488.
[28]	WANG L, GUO S, HUANG W, et al. Places205-VGGNet models for scene recognition[J]. arXiv:1508.01667v1, 2015.
[29]	DANELLJAN M, BHAT G, KHAN F S, et al. ECO: efficient convolution operators for tracking[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 6931-6939.
[30]	BHAT G, JOHNANDER J, DANELLJAN M, et al. Unveiling the power of deep tracking[C]// LNCS 11206: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 493-509.
[31]	LI D L, LU R T, YANG X G. Object tracking based on kernel correlation filter and multi-feature fusion[C]// Proceedings of the 2019 Chinese Automation Congress, Hangzhou, Nov 23-24, 2019. Piscataway: IEEE, 2019: 4192-4196.
[32]	BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking[C]// LNCS 9914: Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 8-16, 2016. Cham: Springer, 2016: 850-865.
[33]	LI B, YAN J, WU W, et al. High performance visual tracking with siamese region proposal network[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-22, 2018. Washington: IEEE Computer Society, 2018: 8971-8980.
[34]	WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: a unifying approach[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 1328-1338.
[35]	FAN H, LING H. Siamese cascaded region proposal networks for real-time visual tracking[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 7952-7961.
[36]	WANG G, LUO C, XIONG Z, et al. SPM-Tracker: series-parallel matching for real-time visual object tracking[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 3643-3652.
[37]	WANG L, WANG J P, WANG P, et al. Siamese network tracking algorithms for hierarchical fusion of attention mechanism[J]. Computer Engineering and Applications, 2021, 57(8): 169-174. (in Chinese)
[38]	LI Y, YANG D D, HAN Y J, et al. Siamese neural network object tracking with distractor-aware model[J]. Acta Optica Sinica, 2020, 40(4): 114-125. (in Chinese)
[39]	LI X, MA C, WU B Y, et al. Target-aware deep tracking[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 1369-1378.
[40]	ZHU Z, WANG Q, LI B, et al. Distractor-aware siamese networks for visual object tracking[C]// LNCS 11213: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 103-119.
[41]	GUO Q, FENG W, ZHOU C, et al. Learning dynamic siamese network for visual object tracking[C]// Proceedings of the 2017 International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 1781-1789.
[42]	ABDELPAKEY M H, SHEHATA M S. DP-Siam: dynamic policy siamese network for robust object tracking[J]. IEEE Transactions on Image Processing, 2020, 29: 1479-1492.
[43]	WANG Q, GAO J, XING J, et al. DCFNet: discriminant correlation filters network for visual tracking[J]. arXiv:1704.04057v1, 2017.
[44]	VALMADRE J, BERTINETTO L, HENRIQUES J, et al. End-to-end representation learning for correlation filter based tracking[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 5000-5008.
[45]	GUPTA D K, ARYA D, GAVVES E. Rotation equivariant siamese networks for tracking[C]// Proceedings of the 2021 IEEE Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Washington: IEEE Computer Society, 2021: 12362-12371.
[46]	YAN B, ZHANG X, WANG D, et al. Alpha-refine: boosting tracking performance by precise bounding box estimation[C]// Proceedings of the 2021 IEEE Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Washington: IEEE Computer Society, 2021: 5289-5298.
[47]	ZHANG Z, PENG H. Deeper and wider siamese networks for real-time visual tracking[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 4591-4600.
[48]	LI B, WU W, WANG Q, et al. SiamRPN++: evolution of siamese visual tracking with very deep networks[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 4282-4291.
[49]	HE A, LUO C, TIAN X, et al. A twofold siamese network for real-time object tracking[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-22, 2018. Washington: IEEE Computer Society, 2018: 4834-4843.
[50]	HE A F, LUO C, TIAN X M, et al. Towards a better match in siamese network based visual object tracker[C]// LNCS 11129: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 132-147.
[51]	TIAN L, HUANG P M, LV T J. SA-Siam++: two-branch siamese network-based object tracking algorithm[J]. Journal of Beijing University of Posts and Telecommunications, 2019, 42(6): 105-110. (in Chinese)
[52]	TANG Y. Deep learning using linear support vector machines[J]. arXiv:1306.0239v4, 2013.
[53]	NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 4293-4302.
[54]	NAM H, BAEK M, HAN B. Modeling and propagating CNNs in a tree structure for visual tracking[J]. arXiv:1608.07242v1, 2016.
[55]	YUN S, CHOI J, YOO Y, et al. Action-decision networks for visual tracking with deep reinforcement learning[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 1349-1358.
[56]	FAN H, LING H. SANet: structure-aware network for visual tracking[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 2217-2224.
[57]	VOIGTLAENDER P, LUITEN J, TORR P H S, et al. Siam R-CNN: visual tracking by re-detection[C]// Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition, Seattle, Jun 14-19, 2020. Washington: IEEE Computer Society, 2020: 6577-6587.
[58]	DANELLJAN M, BHAT G, KHAN F S, et al. ATOM: accurate tracking by overlap maximization[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 4655-4664.
[59]	JIANG B, LUO R, MAO J, et al. Acquisition of localization confidence for accurate object detection[C]// LNCS 11218: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 816-832.
[60]	YAN B, PENG H, WU K, et al. LightTrack: finding lightweight neural networks for object tracking via one-shot architecture search[C]// Proceedings of the 2021 IEEE Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Washington: IEEE Computer Society, 2021: 15180-15189.
[61]	WU Y, LIM J, YANG M H. Online object tracking: a benchmark[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, Jun 23-28, 2013. Washington: IEEE Computer Society, 2013: 2411-2418.
[62]	WU Y, LIM J, YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848.
[63]	KRISTAN M, PFLUGFELDER R, LEONARDIS A, et al. The visual object tracking VOT2013 challenge results[C]// Proceedings of the 2013 International Conference on Computer Vision, Berlin, Oct 1-8, 2013. Washington: IEEE Computer Society, 2013: 98-111.
[64]	KRISTAN M, PFLUGFELDER R P, LEONARDIS A, et al. The visual object tracking VOT2014 challenge results[C]// LNCS 8926: Proceedings of the 13th European Conference on Computer Vision, Zurich, Sep 6-12, 2014. Cham: Springer, 2014: 191-217.
[65]	KRISTAN M, MATAS J, LEONARDIS A, et al. The visual object tracking VOT2015 challenge results[C]// Proceedings of the 2015 International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 564-586.
[66]	KRISTAN M, LEONARDIS A, MATAS J, et al. The visual object tracking VOT2016 challenge results[C]// LNCS 9914: Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 8-16, 2016. Cham: Springer, 2016: 777-823.
[67]	KRISTAN M, LEONARDIS A, MATAS J, et al. The visual object tracking VOT2017 challenge results[C]// Proceedings of the 2017 International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 1949-1972.
[68]	KRISTAN M, LEONARDIS A, MATAS J, et al. The sixth visual object tracking VOT2018 challenge results[C]// LNCS 11129: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 3-53.
[69]	KRISTAN M, BERG A, ZHENG L, et al. The seventh visual object tracking VOT2019 challenge results[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-28, 2019. Piscataway: IEEE, 2019: 2206-2241.
[70]	MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking[C]// LNCS 9905: Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 8-16, 2016. Cham: Springer, 2016: 445-461.
[71]	MÜLLER M, BIBI A, GIANCOLA S, et al. TrackingNet: a large-scale dataset and benchmark for object tracking in the wild[C]// LNCS 11205: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 310-327.
[72]	HUANG L, ZHAO X, HUANG K. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577.
[73]	FAN H, LIN L, YANG F, et al. LaSOT: a high-quality benchmark for large-scale single object tracking[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 5374-5383.
[74]	REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 658-666.


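Feature	Advantages	Disadvantages
Deep features	Contain high-level semantic information; invariant to changes in target appearance; more robust	Low spatial resolution; cannot localize precisely; prone to target drift; less accurate
Shallow features	High spatial resolution; suited to high-precision localization; more accurate	Less robust tracking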

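Type	Ref.	Algorithm	Characteristics	Advantages	Disadvantages
Correlation filtering	[5]	MOSSE	Introduced correlation filtering into video object tracking; the filter is convolved with the feature map of the candidate region, and the position of the maximum response is the target position in the current frame	Fast, up to 669 frame/s	Low accuracy; single-channel grayscale features
Kernel correlation	[7]	CSK	Added a regularization term that effectively prevents the filter from overfitting; dense sampling by means of circulant matrices; introduced the kernel trick, speeding up computation in high-dimensional space	Fast; reduced computation	Single scale; single-channel grayscale features
	[8]	KCF/DCF	Trains an object detector to judge whether the predicted position is the target position; introduces multi-channel HOG features	Fast, up to 172 frame/s; multi-channel HOG features give a marked gain in accuracy	Single scale
	[13]	Robust tracking algorithm	Fuses grayscale, HOG and LAB color features; proposes target-loss discrimination and relocation strategies to mitigate target occlusion; uses a multi-scale filter to mitigate target drift	Low center location error	Hand-crafted features only; no deep features
	[14]	HKCF	Designed for satellite data; effectively mitigates the problem of small targets that resemble the background	Feature fusion; fast, up to 100 frame/s	Hand-crafted features only; no deep features
Multi-scale tracking	[11]	DSST	Treats video object tracking as two independent problems, translation of the target center and change of target scale, and trains two filters: a translation filter and a scale filter	33 scales; multi-scale tracking; high accuracy	Slow, 25.4 frame/s; boundary effects
	[10]	SAMF	Fuses HOG, color and grayscale features; proposes a scale pool strategy that achieves scale-adaptive tracking within a small range	Fusion of HOG, color and grayscale features; tracking over 7 scales; improved accuracy	Works well only within the scale pool; not truly adaptive
	[16]	Scale-adaptive algorithm	Extracts features from different layers of ResNet to generate response maps, fuses them with the AdaBoost algorithm, then uses a scale filter to estimate the target size for accurate tracking	Multi-feature fusion; scale filter	Slow; no hand-crafted features; less robust
	[17]	Variable scale learning tracking algorithm	The scale factor is learnable and continually adjusted; combined with a multi-scale bounding-box aspect-ratio method to mitigate target scale change	Handles scale change well	No feature fusion
Multi-feature fusion	[27]	C-COT	Fuses deep features with hand-crafted features (HOG and color features)	13 filters; high tracking accuracy	Slow, 1.5 frame/s; many parameters
	[30]	UPDT	Systematically analyzes the effect of deep and shallow features on video object tracking; proposes a tracker that adaptively fuses deep and shallow features	High accuracy	Faster than before, but still slow
	[31]	ACM	Fuses feature maps of different sizes from the target and the search region; combines prior information with visual features; can be easily integrated into existing trackers	Generalizes well; can be plugged directly into other trackers	Tracking performance depends heavily on the chosen tracker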
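
The detection step listed for MOSSE in the table above (correlate a filter with the candidate region, then take the position of the maximum response) can be sketched as follows. This is a minimal illustration in plain Python, not the MOSSE implementation: MOSSE learns its filter in the Fourier domain, whereas the filter here is a hand-made template, and the names `correlation_response`, `peak_location`, `search` and `filt` are assumptions introduced only for this example.

```python
def correlation_response(search, filt):
    """Slide `filt` over `search` and return the response map (valid positions only)."""
    sh, sw = len(search), len(search[0])
    fh, fw = len(filt), len(filt[0])
    response = []
    for y in range(sh - fh + 1):
        row = []
        for x in range(sw - fw + 1):
            score = 0.0
            for dy in range(fh):
                for dx in range(fw):
                    score += search[y + dy][x + dx] * filt[dy][dx]
            row.append(score)
        response.append(row)
    return response


def peak_location(response):
    """Return (row, col) of the largest value in the response map."""
    best = max(
        ((value, y, x) for y, row in enumerate(response) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return best[1], best[2]


# Hypothetical 6x6 search region: zeros with a bright 2x2 "target" at rows 3-4, cols 2-3.
search = [[0] * 6 for _ in range(6)]
for y in (3, 4):
    for x in (2, 3):
        search[y][x] = 1

# Hand-made 2x2 template standing in for a learned filter (an assumption).
filt = [[1, 1], [1, 1]]

response = correlation_response(search, filt)
peak = peak_location(response)
print(peak)  # (3, 2): top-left corner of the window that best matches the template
```

A real correlation filter tracker would compute the same response with FFTs, which is where the high frame rates in the table come from; the nested loops here only make the "maximum response is the target position" idea explicit.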

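Dataset	Year	Videos	Frames	Avg. length (frames)	Classes	Characteristics
OTB-2013	2013	51	29 000	578	10	25% of the sequences are grayscale; annotated with 11 common video attributes: illumination variation, scale variation, occlusion, deformation, motion blur, fast motion, in-plane rotation, out-of-plane rotation, out of view, background clutter, low resolution; starts from a random frame
OTB-2015	2015	98	59 000	598	16	Adds video sequences to OTB-2013
VOT	2013	16	—	—	—	Color sequences; short average duration; high resolution; initialized from the first frame; VOT2018 and VOT2019 both add long-term tracking sequences to VOT2017
	2014	25	10 000	409	11
	2015	60	22 000	358	24
	2016	60	22 000	358	24
	2017	60	22 000	356	24
	2018	60	22 000	356	24
	2019	60	22 000	356	24
UAV123	2016	123	113 000	915	9	Special-scenario dataset captured entirely by low-altitude UAVs; clean backgrounds and rich viewpoint changes
UAV20L	2016	20	59 000	2 934	5	Long average sequence duration; commonly used for long-term tracking
TrackingNet	2018	30 643	14 432 000	467	27	Large scale; mainly targets short-term tracking of objects in the wild; training and test sets are disjoint
GOT-10K	2019	10 000	1 500 000	150	563	Many object classes; short sequences; commonly used for short-term tracking; training and test sets are disjoint
LaSOT	2019	1 400	3 520 000	2 506	70	Large-scale long-term tracking dataset; provides visualized bounding-box annotations; a "target absent" annotation is given when the target disappears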
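
As a quick consistency check on the dataset table, the listed average length should be close to the number of frames divided by the number of videos. The sketch below copies seven rows from the table (the VOT rows are left out because the 2013 entry has no frame count) and computes the relative gap between the two. The frame counts in the table appear to be rounded, so small gaps are expected. The names `DATASETS` and `relative_gap` are illustrative assumptions.

```python
# (name, videos, frames, listed average length), copied from the dataset table.
DATASETS = [
    ("OTB-2013", 51, 29_000, 578),
    ("OTB-2015", 98, 59_000, 598),
    ("UAV123", 123, 113_000, 915),
    ("UAV20L", 20, 59_000, 2_934),
    ("TrackingNet", 30_643, 14_432_000, 467),
    ("GOT-10K", 10_000, 1_500_000, 150),
    ("LaSOT", 1_400, 3_520_000, 2_506),
]


def relative_gap(videos, frames, listed_avg):
    """Relative difference between frames/videos and the listed average length."""
    return abs(frames / videos - listed_avg) / listed_avg


gaps = {name: relative_gap(v, f, avg) for name, v, f, avg in DATASETS}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2%}")
```

Under these inputs every gap stays below 2%, which is consistent with the frame totals having been rounded to the nearest thousand rather than with an error in the table.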
