深度学习中的单阶段小目标检测方法综述

doi:10.3778/j.issn.1673-9418.2110003

计算机科学与探索 ›› 2022, Vol. 16 ›› Issue (1): 41-58.DOI: 10.3778/j.issn.1673-9418.2110003

深度学习中的单阶段小目标检测方法综述

李科岑¹, 王晓强¹^,⁺(), 林浩², 李雷孝³, 杨艳艳³, 孟闯³, 高静⁴

1.内蒙古工业大学信息工程学院,呼和浩特 010080
2.天津理工大学计算机科学与工程学院,天津 300384
3.内蒙古工业大学数据科学与应用学院,呼和浩特 010080
4.内蒙古农业大学计算机与信息工程学院,呼和浩特 010011

收稿日期:2021-09-17 修回日期:2021-11-08 出版日期:2022-01-01 发布日期:2021-11-20
通讯作者: + E-mail: wangxiaoqiang@imut.edu.cn
作者简介:李科岑（1997—）,女,山西人,硕士研究生,主要研究方向为深度学习、目标检测。
王晓强（1978—）,男,内蒙古人,副教授,硕士生导师,主要研究方向为大数据分析、智能交通。
林浩（1995—）,男,天津人,博士研究生,主要研究方向为网络安全、数据挖掘。
李雷孝（1978—）,男,山东人,博士,教授,主要研究方向为云计算、大数据分析、数据挖掘、目标检测。
杨艳艳（1997—）,女,山东人,硕士研究生,主要研究方向为云计算、大数据分析、计算机视觉。
孟闯（1997—）,男,山东人,硕士研究生,主要研究方向为人工智能、目标检测、图像处理、云计算、大数据分析。
高静（1970—）,女,博士,教授,主要研究方向为云计算、大数据分析、农业信息化。
基金资助:
内蒙古自治区关键技术攻关计划项目(2019GG273);内蒙古自治区科技成果转化专项资金项目(2020CG0073);内蒙古自治区科技成果转化专项资金项目(2021CG0033);内蒙古自治区科技重大专项(2019ZD015);内蒙古自治区科技重大专项(2019ZD016);内蒙古自治区科技计划项目(2020GG0104)

Survey of One-Stage Small Object Detection Methods in Deep Learning

LI Kecen¹, WANG Xiaoqiang¹^,⁺(), LIN Hao², LI Leixiao³, YANG Yanyan³, MENG Chuang³, GAO Jing⁴

1. College of Information Engineering, Inner Mongolia University of Technology, Hohhot 010080, China
2. College of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China
3. College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010080, China
4. College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot 010011, China

Received:2021-09-17 Revised:2021-11-08 Online:2022-01-01 Published:2021-11-20
About author:LI Kecen, born in 1997, M.S. candidate. Her research interests include deep learning and object detection.
WANG Xiaoqiang, born in 1978, associate professor, M.S. supervisor. His research interests include big data analysis and intelligent transportation.
LIN Hao, born in 1995, Ph.D. candidate. His research interests include network security and data mining.
LI Leixiao, born in 1978, Ph.D., professor. His research interests include cloud computing, big data analysis, data mining and object detection.
YANG Yanyan, born in 1997, M.S. candidate. Her research interests include cloud computing, big data analysis and computer vision.
MENG Chuang, born in 1997, M.S. candidate. His research interests include artificial intelligence, object detection, image processing, cloud computing and big data analysis.
GAO Jing, born in 1970, Ph.D., professor. Her research interests include cloud computing, big data analysis and agricultural informatization.
Supported by:
Key Technology Research Program of Inner Mongolia(2019GG273);Project for Transformation of Scientific and Technological Achievements of Inner Mongolia(2020CG0073);Project for Transformation of Scientific and Technological Achievements of Inner Mongolia(2021CG0033);Major Science and Technology Project of Inner Mongolia(2019ZD015);Major Science and Technology Project of Inner Mongolia(2019ZD016);Science and Technology Plan Project of Inner Mongolia(2020GG0104)

摘要/Abstract

摘要：

随着深度学习的不断发展,目标检测技术逐步从基于传统的手工检测方法向基于深度神经网络的检测方法转变。在众多基于深度学习的目标检测方法中,基于深度学习的单阶段目标检测方法因其网络结构较简单、运行速度较快以及具有更高的检测效率而被广泛运用。但现有的基于深度学习的单阶段目标检测方法由于小目标物体包含的特征信息较少、分辨率较低、背景信息较复杂、细节信息不明显以及定位精度要求较高等原因,导致在检测过程中对小目标物体的检测效果不理想,使得模型检测精度降低。针对目前基于深度学习的单阶段目标检测方法存在的问题,研究了大量基于深度学习的单阶段小目标检测技术。首先从单阶段目标检测方法的Anchor Box、网络结构、交并比函数以及损失函数等几个方面,系统地总结了针对小目标检测的优化方法;其次列举了常用的小目标检测数据集及其应用领域,并给出在各小目标检测数据集上的检测结果图;最后探讨了基于深度学习的单阶段小目标检测方法的未来研究方向。

关键词: 深度学习, 单阶段目标检测, 小目标检测

Abstract:

With the development of deep learning, object detection technology has gradually changed from traditional manual detection methods to deep neural network detection methods. Among many object detection algorithms based on deep learning, the one-stage object detection method based on deep learning is widely used because of its simple network structure, fast running speed and higher detection efficiency. However, the existing one-stage object detection methods based on deep learning do not have ideal detection results for small target objects in the detection process due to the lack of feature information, low resolution, complicated background information, unobvious details and higher positioning accuracy, which reduces the detection accuracy of the model. Aiming at the existing problems of one-stage object detection method based on deep learning, a large amount of one-stage small object detection technologies based on deep learning are studied. Firstly, the optimization methods for small object detection are systematically summarized from the aspects of Anchor Box, network structure, IoU (intersection over union) and loss function in the one-stage object detection methods. Secondly, the commonly used small object detection datasets and their application fields are listed, and the detection graphs on each small object detection dataset are given. Finally, the future research direction of one-stage small object detection methods based on deep learning is investigated.

Key words: deep learning, one-stage object detection, small object detection

中图分类号:

TP391.41

李科岑, 王晓强, 林浩, 李雷孝, 杨艳艳, 孟闯, 高静. 深度学习中的单阶段小目标检测方法综述[J]. 计算机科学与探索, 2022, 16(1): 41-58.

LI Kecen, WANG Xiaoqiang, LIN Hao, LI Leixiao, YANG Yanyan, MENG Chuang, GAO Jing. Survey of One-Stage Small Object Detection Methods in Deep Learning[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(1): 41-58.

图/表 11

参考文献 110

[1]	GU J, WANG Z, KUEN J, et al. Recent advances in convolutional neural networks[J]. arXiv:1512.07108, 2015.
[2]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 779-788.
[3]	REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 6517-6525.
[4]	REDMON J, FARHADI A. YOLOv3: an incremental improvement[J]. arXiv:1804.02767, 2018.
[5]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 11-14, 2016. Cham: Springer, 2016: 21-37.
[6]	FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[J]. arXiv:1701.06659, 2017.
[7]	SHEN Z Q, LIU Z, LI J G, et al. DSOD: learning deeply supervised object detectors from scratch[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 1937-1945.
[8]	JEONG J, PARK H, KWAK N. Enhancement of SSD by concatenating feature maps for object detection[J]. arXiv:1705.09587, 2017.
[9]	LI Z X, ZHOU F Q. FSSD: feature fusion single shot multibox detector[J]. arXiv:1712.00960, 2018.
[10]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 2999-3007.
[11]	王迪聪, 白晨帅, 邬开俊. 基于深度学习的视频目标检测综述[J]. 计算机科学与探索, 2021, 15(9):1563-1577.
	WANG D C, BAI C S, WU K J. Survey of video object detection based on deep learning[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(9):1563-1577.
[12]	史彩娟, 张卫明, 陈厚儒, 等. 基于深度学习的显著性目标检测综述[J]. 计算机科学与探索, 2021, 15(2):219-232.
	SHI C J, ZHANG W M, CHEN H R, et al. Survey of salient object detection based on deep learning[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(2):219-232.
[13]	蔡强, 刘亚奇, 曹健, 等. 图像目标类别检测综述[J]. 计算机科学与探索, 2015, 9(3):257-265.
	CAI Q, LIU Y Q, CAO J, et al. Review on object class detection of images[J]. Journal of Frontiers of Computer Science and Technology, 2015, 9(3):257-265.
[14]	许德刚, 王露, 李凡. 深度学习的典型目标检测算法研究综述[J]. 计算机工程与应用, 2021, 57(8):10-25.
	XU D G, WANG L, LI F. Review of typical object detection algorithms for deep learning[J]. Computer Engineering and Applications, 2021, 57(8):10-25.
[15]	谢文亮, 朱丹, 佟新鑫. 一种基于视觉注意的小目标检测方法[J]. 计算机工程与应用, 2013, 49(12):125-128.
	XIE W L, ZHU D, TONG X X. Small target detection method based on visual attention[J]. Computer Engineering and Applications, 2013, 49(12):125-128.
[16]	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 13th European Conference on Computer Vision, Zurich, Sep 6-12, 2014. Cham: Springer, 2014: 740-755.
[17]	CHEN C, LIU M Y, TUZEL O, et al. R-CNN for small object detection[C]// Proceedings of the 13th Asian Conference on Computer Vision, Taipei, China, Nov 20-24, 2016. Cham: Springer, 2016: 214-230.
[18]	BRAUN M, KREBS S, FLOHR F, et al. The EuroCity persons dataset: a novel benchmark for object detection[J]. arXiv.1805.07193, 2018.
[19]	XIA G S, BAI X, DING J, et al. DOTA: a large-scale dataset for object detection in aerial images[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Piscataway: IEEE, 2018: 3974-3983.
[20]	YANG S, LUO P, LOY C C, et al. WIDER FACE: a face detection benchmark[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 5525-5533.
[21]	ZHANG S, BENENSON R, SCHIELE B. CityPersons: a diverse dataset for pedestrian detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 3213-3221.
[22]	YU X, GONG Y, JIANG N, et al. Scale match for tiny person detection[J]. arXiv:1912.10664, 2019.
[23]	MACQUEEN J. Some methods for classification and analysis of multivariate observations[C]// Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of California Press, 1967: 281-297.
[24]	KE W, ZHANG T, HUANG Z, et al. Multiple anchor learning for visual object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10206-10215.
[25]	周慧, 严凤龙, 褚娜, 等. 一种改进复杂场景下小目标检测模型的方法[J/OL]. 计算机工程与应用 [2021-09-04]. http://kns.cnki.net/kcms/detail/11.2127.TP.20210419.1404.049.html.
	ZHOU H, YAN F L, CHU N, et al. An approach to improve detection model for small object in complex scenes[J/OL]. Computer Engineering and Applications [2021-09-04]. http://kns.cnki.net/kcms/detail/11.2127.TP.20210419.1404.049.html.
[26]	ESTER M, KRIEGEL H P, SANDER J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]// Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, 1996. Menlo Park: AAAI, 1996: 226-231.
[27]	岳晓新, 贾君霞, 陈喜东, 等. 改进YOLO V3的道路小目标检测[J]. 计算机工程与应用, 2020, 56(21):218-223.
	YUE X X, JIA J X, CHEN X D, et al. Road small target detection algorithm based on improved YOLO V3[J]. Computer Engineering and Applications, 2020, 56(21):218-223.
[28]	李云红, 张轩, 李传真, 等. 融合DBSCAN的改进YOLOv3目标检测算法[J/OL]. 计算机工程与应用 [2021-09-04]. http://kns.cnki.net/kcms/detail/11.2127.TP.20210327.1437.002.html.
	LI Y H, ZHANG X, LI C Z, et al. Improved YOLOv3 target detection algorithm combined with DBSCAN[J/OL]. Computer Engineering and Applications [2021-09-04]. .
[29]	罗建华, 黄俊, 白鑫宇, 改进YOLOv3的道路小目标检测方法[J/OL]. 小型微型计算机系统 [2021-09-12]. http://kns.cnki.net/kcms/detail/21.1106.TP.20210420.1111.023.html.
	LUO J H, HUANG J, BAI X Y. Road small target detection method based on improved YOLOv3[J/OL]. Journal of Chinese Computer Systems [2021-09-12]. http://kns.cnki.net/kcms/detail/21.1106.TP.20210420.1111.023.html.
[30]	刘家乐, 吴怀宇, 陈志环. 改进YOLOv3的工业指针式仪表检测方法[J]. 计算机工程与设计, 2021, 42(7):2001-2008.
	LIU J L, WU H Y, CHEN Z H. Improved YOLOv3 industrial pointer instrument detection method[J]. Computer Engineering and Design, 2021, 42(7):2001-2008.
[31]	MU X, LIN Y, LIU J, et al. Surface navigation target detection and recognition based on SSD[C]// Proceedings of the 2019 3rd International Conference on Electronic Information Technology and Computer Engineering, Xiamen, Oct 18-20, 2019. Washington: IEEE Computer Society, 2019: 649-653.
[32]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. arXiv1409. 1556, 2014.
[33]	HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1904-1916. DOI URL
[34]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 770-778.
[35]	HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[J]. arXiv:1704.04861, 2017.
[36]	HUANG G, LIU Z, et al. Densely connected convolutional networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 2261-2269.
[37]	SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, Jun 7-12, 2015. Washington: IEEE Computer Society, 2015: 1-8.
[38]	ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 6848-6856.
[39]	ADAMDAD. Keras-YOLOv3-mobilenet.[EB/OL]. (2019)[2021-08-20]. https://github.com/Adamdad/keras-YOLOv3-mobilenet.
[40]	王建军, 魏江, 梅少辉, 等. 面向遥感图像小目标检测的改进YOLOv3算法[J]. 计算机工程与应用, 2021, 57(20):133-141.
	WANG J J, WEI J, MEI S H, et al. Improved YOLOv3 for small object detection in remote sensing images[J]. Computer Engineering and Applications, 2021, 57(20):133-141.
[41]	LI Y, LV C. SS-YOLO: an object detection algorithm based on YOLOv3 and ShuffleNet[C]// Proceedings of the 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference, Chongqing, Jun 12-14, 2020. Piscataway: IEEE, 2020: 769-772.
[42]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-22, 2018. Washington: IEEE Computer Society, 2018: 7132-7141.
[43]	BOCHKOVSKIY A, WANG C Y, LIAO H . M. YOLOv4: optimal speed and accuracy of object detection[J]. arXiv: 2004.10934, 2020.
[44]	MAHTO P, GARG P, SETH P, et al. Refining YOLOV4 for vehicle detection[J]. International Journal of Advanced Research in Engineering and Technology, 2020, 11(5):409-419.
[45]	谢俊章, 彭辉, 唐健峰, 等. 改进YOLOv4的密集遥感目标检测[J]. 计算机工程与应用, 2021, 57(22):247-256.
	XIE J Z, PENG H, TANG J F, et al. Improve YOLOv4 for dense remote sensing target detection[J]. Computer Engineering and Applications, 2021, 57(22):247-256.
[46]	JOCHER G. YOLOv5[EB/OL]. [2020-08-10]. https://github.com/ultralytics/yolov5.
[47]	PAN Z H, CHEN Y. Object detection algorithm based on dense connection[C]// Proceedings of the 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference, Chengdu, Dec 20-22, 2019. Piscataway: IEEE, 2019: 1558-1562.
[48]	李航, 朱明. 基于深度卷积神经网络的小目标检测算法[J]. 计算机工程与科学, 2020, 42(4):649-657.
	LI H, ZHU M. A small object detection algorithm based on deep convolutional neural network[J]. Computer Engineering and Science, 2020, 42(4):649-657.
[49]	齐榕, 贾瑞生, 徐志峰, 等. 基于YOLOv3的轻量级目标检测网络[J]. 计算机应用与软件, 2020, 37(10):208-213.
	QI R, JIA R S, XU Z F, et al. Lightweight object detection network based on YOLOv3[J]. Computer Applications and Software, 2020, 37(10):208-213.
[50]	郑秋梅, 王璐璐, 王风华. 基于改进卷积神经网络的交通场景小目标检测[J]. 计算机工程, 2020, 46(6):26-33.
	ZHENG Q M, WANG L L, WANG F H. Small object detection in traffic scene based on improved convolutional neural network[J]. Computer Engineering, 2020, 46(6):26-33.
[51]	张侣, 周博文, 吴亮红. 基于改进卷积注意力模块与残差结构的SSD网络[J/OL]. 计算机科学 [2021-09-12]. .
	ZHANG L, ZHOU B W, WU L H. SSD network based on improved convolutional attention module and residual structure[J/OL]. Computer Science [2021-09-12]. http://kns.cnki.net/kcms/detail/50.1075.TP.20211012.1354.006. html.
[52]	赵鹏飞, 谢林柏, 彭力. 融合注意力机制的深层次小目标检测算法[J/OL]. 计算机科学与探索 [2021-10-18]. http://kns.cnki.net/kcms/detail/11.5602.TP20211014.2108.010.html.
	ZHAO P F, XIE L B, PENG L. Deep small object detection algorithm integrating attention mechanism[J/OL]. Journal of Frontiers of Computer Science and Technology [2021-10-18]. http://kns.cnki.net/kcms/detail/11.5602.TP.20211014.2108.010.html.
[53]	奚琦, 张正道, 彭力. 基于改进密集网络与二次回归的小目标检测算法[J]. 计算机工程, 2021, 47(4):241-247.
	XI Q, ZHANG Z D, PENG L. Small object detection algorithm based on improved dense network and quadratic regression[J]. Computer Engineering, 2021, 47(4):241-247.
[54]	徐先峰, 赵万福, 邹浩泉, 等. 基于MobileNet-SSD的安全帽佩戴检测算法[J]. 计算机工程, 2021, 47(10):298-305.
	XU X F, ZHAO W F, ZOU H Q, et al. Safety helmet wearing detection method based on MobileNet-SSD[J]. Computer Engineering, 2021, 47(10):298-305.
[55]	LU X, KANG X, NISHIDE S, et al. Object detection based on SSD-ResNet[C]// Proceedings of the 2019 IEEE 6th International Conference on Cloud Computing and Intelligence Systems, Singapore, Dec 19-21, 2019. Piscataway: IEEE, 2019: 89-92.
[56]	CHENG M, BAI J, LI L, et al. Tiny-RetinaNet: a one-stage detector for realtime object detection[C]// SPIE 11373: Proceedings of the 11th International Conference on Graphics and Image Processing, Hangzhou, 2020.
[57]	李文涛, 彭力. 多尺度通道注意力融合网络的小目标检测算法[J]. 计算机科学与探索, 2021, 15(12):2390-2400.
	LI W T, PENG L. Small objects detection algorithm with multi-scale channel attention fusion network[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(12):2390-2400.
[58]	ZHAI S P, SHANG D R, WANG S H, et al. DF-SSD: an improved SSD object detection algorithm based on DenseNet and feature fusion[J]. IEEE Access, 2020, 8:24344-24357. DOI URL
[59]	ZHU X, CHENG D, ZHANG Z, et al. An empirical study of spatial attention mechanisms in deep networks[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 6687-6696.
[60]	徐诚极, 王晓峰, 杨亚东. Attention-YOLO: 引入注意力机制的YOLO检测算法[J]. 计算机工程与应用, 2019, 55(6):13-23.
	XU C J, WANG X F, YANG Y D. Attention-YOLO: YOLO detection algorithm that introduces attention mechanism[J]. Computer Engineering and Applications, 2019, 55(6):13-23.
[61]	张陶宁, 陈恩庆, 肖文福. 一种改进MobileNet_YOLOv3网络的快速目标检测方法[J]. 小型微型计算机系统, 2021, 42(5):1008-1014.
	ZHAG T N, CHEN E Q, XIAO W F. Fast target detection method for improving MobileNet_YOLOv3 network[J]. Journal of Chinese Computer Systems, 2021, 42(5):1008-1014.
[62]	LI Y, LI S, DU H, et al. YOLO-ACN: focusing on small target and occluded object detection[J]. IEEE Access, 2020, 8:227288-227303. DOI URL
[63]	麻森权, 周克. 基于注意力机制和特征融合改进的小目标检测算法[J]. 计算机应用与软件, 2020, 37(5):194-199.
	MA S Q, ZHOU K. An improved small object detection algorithm based on attention mechanism and feature fusion[J]. Computer Applications and Software, 2020, 37(5):194-199.
[64]	WANG Q, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition, Seattle, Jun 14-19, 2020. Piscataway: IEEE, 2020: 11531-11539.
[65]	于敏, 屈丹, 司念文. 改进型RetinaNet的目标检测算法[J/OL]. 计算机工程 [2021-08-27]. https://doi.org/10.19678/j.issn.1000-3428.0062134.
	YU M, QU D, SI N W. Object detection algorithm for improved RetinaNet[J/OL]. Computer Engineering [2021-08-27]. https://doi.org/10.19678/j.issn.1000-3428.0062134.
[66]	QIN Z Q, ZHANG P Y, WU F, et al. FcaNet: frequency channel attention networks[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 783-792.
[67]	刘建政, 梁鸿, 崔学荣, 等. 融入特征融合与特征增强的SSD目标检测[J/OL]. 计算机工程与应用 [2021-09-16]. http://kns.cnki.net/kcms/detail/11.2127.TP.20210203.1148.022.html.
	LIU J Z, LIANG H, CUI X R, et al. SSD visual target detector based on feature integration and feature enhancement[J/OL]. Computer Engineering and Applications [2021-09-16]. http://kns.cnki.net/kcms/detail/11.2127.TP.20210203.1148.022.html.
[68]	张海涛, 张梦. 引入通道注意力机制的SSD目标检测算法[J]. 计算机工程, 2020, 46(8):264-270.
	ZHANG H T, ZHANG M. SSD target detection algorithm with channel attention mechanism[J]. Computer Engineering, 2020, 46(8):264-270.
[69]	PAN H, CHEN G, JIANG J. Adaptively dense feature pyramid network for object detection[J]. IEEE Access, 2019, 7:81132-81144. DOI URL
[70]	ZHU G, WEI Z, LIN F. An object detection method combining multi-level feature fusion and region channel attention[J]. IEEE Access, 2021, 9:25101-25109. DOI URL
[71]	赵文清, 杨盼盼. 双向特征融合与注意力机制结合的目标检测[J]. 智能系统学报, 2021, 16(6):1098-1105.
	ZHAO W Q, YANG P P. Target detection based on bidirectional feature fusion and attention mechanism[J]. CAAI Transactions on Intelligent Systems, 2021, 16(6):1098-1105.
[72]	鞠默然, 罗江宁, 王仲博, 等. 融合注意力机制的多尺度目标检测算法[J]. 光学学报, 2020, 40(13):132-140.
	JU M R, LUO J N, WANG Z B, et al. Multi-scale target detection algorithm based on attention mechanism[J]. Acta Optica Sinica, 2020, 40(13):132-140.
[73]	宋忠浩, 谷雨, 陈旭, 等. 基于加权策略的高分辨率遥感图像目标检测[J]. 计算机工程与应用, 2021, 57(13):199-206.
	SONG Z H, GU Y, CHEN X, et al. Target detection in high-resolution remote sensing image based on weighted strategy[J]. Computer Engineering and Applications, 2021, 57(13):199-206.
[74]	SHI W, BAO S, TAN D. FFESSD: an accurate and efficient single-shot detector for target detection[J]. Applied Sciences, 2019, 9(20):4276. DOI URL
[75]	WOO S, HWANG S, KWEON I S. StairNet: top-down semantic aggregation for accurate one shot detection[C]// Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, Mar 12-15, 2018. Piscataway: IEEE, 2018: 1093-1102.
[76]	张思宇, 张轶. 基于多尺度特征融合的小目标行人检测[J]. 计算机工程与科学, 2019, 41(9):1627-1634.
	ZHANG S Y, ZHANG Y. Small target pedestrian detection based on multi-scale feature fusion[J]. Computer Engineering and Science, 2019, 41(9):1627-1634.
[77]	王燕妮, 余丽仙. 注意力与多尺度有效融合的SSD目标检测算法[J/OL]. 计算机科学与探索 [2021-09-12]. http://kns.cnki.net/kcms/detail/11.5602.TP.20210722.1315.004.html.
	WANG Y N, YU L X. SSD object detection algorithm with effective fusion of attention and multi-scale[J/OL]. Journal of Frontiers of Computer Science and Technology [2021-09-12]. http://kns.cnki.net/kcms/detail/11.5602.TP.20210722.1315.004.html.
[78]	赵文清, 孔子旭, 周震东, 等. 增强小目标特征的航空遥感目标检测[J]. 中国图象图形学报, 2021, 26(3):644-653.
	ZHAO W Q, KONG Z X, ZHOU Z D, et al. Target detection algorithm of aerial remote sensing based on feature enhancement technology[J]. Journal of Image and Graphics, 26(3):644-653.
[79]	SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 5693-5703.
[80]	SHI X, CHEN Z, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation[J]. Advances in Neural Information Processing Systems, 2015, 28:802-810.
[81]	鲁博, 瞿绍军. 融合BiFPN和改进YOLOv3-Tiny网络的航拍图像车辆检测方法[J]. 小型微型计算机系统, 2021, 42(8):1694-1698.
	LU B, QU S J. Vehicle detection method in aerial images based on BiFPN and improved YOLOv3-Tiny network[J]. Journal of Chinese Computer Systems, 2021, 42(8):1694-1698.
[82]	潘昕晖, 邵清, 卢军国. 基于CBD-YOLOv3的小目标检测算法[J/OL]. 小型微型计算机系统 [2021-09-05]. http://kns.cnki.net/kcms/detail/21.1106.TP.20210818.1055.032.html.
	PAN X H, SHAO Q, LU J G. Small object detection algorithm based on CBD-YOLOv3[J/OL]. Journal of Chinese Computer Systems [2021-09-05]. http://kns.cnki.net/kcms/detail/21.1106.TP.20210818.1055.032.html.
[83]	REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 658-666.
[84]	YANG Y M, LIAO Y R, CHENG L F, et al. Remote sensing image aircraft target detection based on GIoU-YOLOv3[C]// Proceedings of the 2021 6th International Conference on Intelligent Computing and Signal Processing, Xi’an, Apr 9-11, 2021. Piscataway: IEEE, 2021: 474-478.
[85]	邹承明, 薛榕刚. 融合GIoU和Focal loss的YOLOv3目标检测算法[J]. 计算机工程与应用, 2020, 56(24):214-222.
	ZOU C M, XUE R G. Improved YOLOv3 object detection algorithm: combining GIoU and Focal loss[J]. Computer Engineering and Applications, 2020, 56(24):214-222.
[86]	ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence, the 32nd Innovative Applications of Artificial Intelligence Conference, the 10th AAAI Symposium on Educational Advances in Artificial Intelligence, New York, Feb 7-12, 2020. Menlo Park: AAAI, 2020: 12993-13000.
[87]	LIANG T J, BAO H. A optimized YOLO method for object detection[C]// Proceedings of the 2020 16th International Conference on Computational Intelligence and Security, Nanning, Nov 27-30, 2020. Piscataway: IEEE, 2020: 30-34.
[88]	ZHENG Z, WANG P, REN D, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. arXiv:2005.03572, 2020.
[89]	单美静, 秦龙飞, 张会兵. L-YOLO: 适用于车载边缘计算的实时交通标识检测模型[J]. 计算机科学, 2021, 48(1):89-95.
	SHAN M J, QIN L F, ZHANG H B. L-YOLO: real time traffic sign detection model for vehicle edge computing[J]. Computer Science, 2021, 48(1):89-95.
[90]	张炳力, 秦浩然, 江尚, 等. 基于RetinaNet及优化损失函数的夜间车辆检测方法[J]. 汽车工程, 2021, 43(8):1195-1202.
	ZHANG B L, QIN H R, JIANG S, et al. A method of vehicle detection at night based on RetinaNet and optimized loss functions[J]. Automotive Engineering, 2021, 43(8):1195-1202.
[91]	HU Y Y, WU X J, ZHENG G D, et al. Object detection of UAV for Anti-UAV based on improved YOLO v3[C]// Proceedings of the 2019 Chinese Control Conference, Guangzhou, Jul 27-30 2019. Piscataway: IEEE, 2019: 8386-8390.
[92]	黄凤琪, 陈明, 冯国富. 基于可变形卷积改进的YOLO目标检测算法[J]. 计算机工程, 2021, 47(10):269-275.
	HUANG F Q, CHEN M, FENG G F. Improved YOLO object detection algorithm based on deformable convolution[J]. Computer Engineering, 2021, 47(10):269-275.
[93]	WANG Z H, LI L H, LI L, et al. Object detection algorithm based on improved YOLOv3-Tiny network in traffic scenes[C]// Proceedings of the 2020 4th CAA International Conference on Vehicular Control and Intelligence, Hangzhou, Dec 18-20, 2020. Piscataway: IEEE, 2020: 514-518.
[94]	黄同愿, 杨雪姣, 向国徽, 等. 基于单目视觉的小目标行人检测与测距研究[J]. 计算机科学, 2020, 47(11):205-211.
	HUANG T Y, YANG X J, XIANG G H, et al. Study on small target pedestrian detection and ranging based on monocular vision[J]. Computer Science, 2020, 47(11):205-211.
[95]	CHENG X B, QIU G H, JIANG Y, et al. An improved small object detection method based on YOLO V3[J]. Pattern Analysis and Applications, 2021, 24(3):1347-1355. DOI URL
[96]	ZHAO Q J, SHENG T, WANG Y T, et al. M2Det: a single-shot object detector based on multi-level feature pyramid network[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, Jan 27-Feb 1, 2019. Menlo Park, AAAI, 2019: 9259-9266.
[97]	王鹏, 陆振宇, 詹天明, 等. 基于PDSSD改进型神经网络的小目标检测算法[J]. 计算机应用与软件, 2021, 38(1):149-156.
	WANG P, LU Z Y, ZHAN T M, et al. Small object detection algorithm based on PDSSD improved neural network[J]. Computer Applications and Software, 2021, 38(1):149-156.
[98]	张新良, 谢恒, 赵运基, 等. 融合多维空洞卷积算子和多层次特征的深度网络检测算法[J]. 模式识别与人工智能, 2020, 33(10):898-905.
	ZHANG X L, XIE H, ZHAO Y J, et al. Deep networks detection algorithm fusing multiple dilated convolution operator and multi-level characteristics[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(10):898-905.
[99]	陈灏然, 彭力. 感受野下的小目标检测算法[J]. 计算机科学与探索, 2021, 15(2):346-353.
	CHEN H R, PENG L. Detection algorithm of small target in receptive field block[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(2):346-353.
[100]	宋云博, 陈冬艳, 郝赟, 等. 基于级联卷积神经网络的高效目标检测方法[J]. 计算机工程与应用, 2021, 57(5):139-145.
	SONG Y B, CHEN D Y, HAO Y, et al. Efficient object detection method based on cascaded convolutional neural network[J]. Computer Engineering and Applications, 2021, 57(5):139-145.
[101]	LAW H, DENG J. CornerNet: detecting objects as paired keypoints[C]// Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 765-781.
[102]	DUAN K, BAI S, XIE L, et al. CenterNet: keypoint triplets for object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 6569-6578.
[103]	TIAN Z, SHEN C, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 9626-9635.
[104]	BEHRENDT K, NOVAK L, BOTROS R. A deep learning approach to traffic lights: detection, tracking, and classification[C]// Proceedings of the 2017 IEEE International Conference on Robotics and Automation, Singapore, May 29- Jun 3, 2017. Piscataway: IEEE, 2017: 1370-1377.
[105]	WANG J W, YANG W, GUO H W, et al. Tiny object detection in aerial images[C]// Proceedings of the 2020 25th International Conference on Pattern Recognition, Milan, Jan 10-15, 2021. Piscataway: IEEE, 2021: 3791-3798.
[106]	ZAMIR S W, ARORA A, GUPTA A, et al. iSAID: a large-scale dataset for instance segmentation in aerial images[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, Jun 16-20, 2019. Piscataway: IEEE, 2019: 28-37.
[107]	BONDI E, JAIN R, AGGRAWAL P, et al. BIRDSAI: a data-set for detection and tracking in aerial thermal infrared videos[C]// Proceedings of the 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, Mar 1-5, 2020. Piscataway: IEEE, 2020: 1747-1756.
[108]	ZHANG S F, XIE Y L, WAN J, et al. WiderPerson: a diverse dataset for dense pedestrian detection in the wild[J]. IEEE Transactions on Multimedia, 2020, 22(2):380-393. DOI URL
[109]	NEUMANN L, KARG M, ZHANG S, et al. NightOwls: a pedestrians at night dataset[C]// Proceedings of the 14th Asian Conference on Computer Vision, Perth, Dec 2-6, 2018. Cham: Springer, 2018: 691-705.
[110]	TUGGENER L, ELEZI I, SCHMIDHUBER J, et al. DeepScores—a dataset for segmentation, detection and classification of tiny objects[C]// Proceedings of the 2018 24th International Conference on Pattern Recognition, Beijing, Aug 20-24, 2018. Piscataway: IEEE, 2018: 3704-3709.

文献	年份	骨干网络	方法	优点	局限性
Adamdad^[39]	2019	MobileNet	该模型将YOLO V3中的骨干网络修改为MobileNet网络,去掉了原网络中的全连接层和SoftMax层	避免了普通卷积层中任意卷积核都需要对所有通道进行操作的缺陷	下采样操作导致浅层特征信息被忽略;待检目标被遮挡时,模型检测精度较低
Li等人^[41]	2021	Shuffle_SENet	在ShuffleNet模块之后引入SENet,使用全局平均池化将信道空间特征转化为全局特征,然后使用FC层降低模型复杂度	使用较少的组卷积来最小化内存访问成本,在保证速度的同时提升了模型的准确性	易产生边界效应,某个输出通道可能仅来自输入通道的一部分
YOLO V4^[43]	2020	CSPDarkNet-53	包含29个3×3的卷积层,725×725大小的感受野以及2.76×10⁷个参数	具有更大的感受野以及大量参数模型,能够在一张图像上检测不同尺寸的多个物体	采用CIoU损失,不能同时增大或缩小长和宽,导致收敛速度减慢
谢俊章等人^[45]	2021	CSPDarkNet-53	修改残差卷积次数为（×1,×3,×4,×6,×3）,结合PANet,使用残差连接代替连续卷积操作,生成3个不同尺度的特征图	并行提取目标的类别和位置信息,适用于密集分布和混合分布目标	模型无法达到较高的检测精度
YOLO V5^[46]	2020	Focus+CSP1_X	Focus模块进行切片操作,CSP1_X借鉴CSPNet,由3个卷积层和X个Res unint模块拼接组成	使用更轻量级的网络,模型简单,速度更快	通过长宽比筛选并过滤了大小和长宽比较极端的真实目标框
Pan等人^[47]	2019	DenseNet	在每个dense block中设置相同的特征图输出,每个dense block之间增加1×1卷积	减少了冗余参数,避免了梯度消失问题	与YOLOv3_Tiny相比,检测速度较慢,占用内存较大
李航等人^[48]	2020	slim-densenet	采用跳跃式连接的方式,使得特征可以跳过部分网络直接传递至较深的网络层,并将7×7、5×5和3×3卷积换为深度可分离卷积	减少参数量,加速特征在网络中的传递,减轻了梯度消失问题	模型通道冗余,可通过剪枝操作压缩模型,进一步提升训练速度
齐榕等人^[49]	2020	MobileNet	在Tiny-YOLO V3的基础上修改骨干网络为MobileNet,将原来的双尺度预测变为三尺度预测	适应不同类型的物体,增加网络层数,提高了检测精度	未考虑复杂环境下的目标,对于极端拍摄环境易造成漏检
郑秋梅等人^[50]	2020	DarkNet-50	过渡层使用1×1和3×3交替卷积,利用ResNet进行特征提取,并去除YOLO检测前的两组卷积层	避免了下采样操作导致的信息丢失问题,注重浅层特征信息,去除了冗余卷积层	依赖于车辆运动信息,模型泛化能力弱

文献	年份	骨干网络	方法	优点	局限性
Adamdad^[39]	2019	MobileNet	该模型将YOLO V3中的骨干网络修改为MobileNet网络,去掉了原网络中的全连接层和SoftMax层	避免了普通卷积层中任意卷积核都需要对所有通道进行操作的缺陷	下采样操作导致浅层特征信息被忽略;待检目标被遮挡时,模型检测精度较低
Li等人^[41]	2021	Shuffle_SENet	在ShuffleNet模块之后引入SENet,使用全局平均池化将信道空间特征转化为全局特征,然后使用FC层降低模型复杂度	使用较少的组卷积来最小化内存访问成本,在保证速度的同时提升了模型的准确性	易产生边界效应,某个输出通道可能仅来自输入通道的一部分
YOLO V4^[43]	2020	CSPDarkNet-53	包含29个3×3的卷积层,725×725大小的感受野以及2.76×10⁷个参数	具有更大的感受野以及大量参数模型,能够在一张图像上检测不同尺寸的多个物体	采用CIoU损失,不能同时增大或缩小长和宽,导致收敛速度减慢
谢俊章等人^[45]	2021	CSPDarkNet-53	修改残差卷积次数为（×1,×3,×4,×6,×3）,结合PANet,使用残差连接代替连续卷积操作,生成3个不同尺度的特征图	并行提取目标的类别和位置信息,适用于密集分布和混合分布目标	模型无法达到较高的检测精度
YOLO V5^[46]	2020	Focus+CSP1_X	Focus模块进行切片操作,CSP1_X借鉴CSPNet,由3个卷积层和X个Res unint模块拼接组成	使用更轻量级的网络,模型简单,速度更快	通过长宽比筛选并过滤了大小和长宽比较极端的真实目标框
Pan等人^[47]	2019	DenseNet	在每个dense block中设置相同的特征图输出,每个dense block之间增加1×1卷积	减少了冗余参数,避免了梯度消失问题	与YOLOv3_Tiny相比,检测速度较慢,占用内存较大
李航等人^[48]	2020	slim-densenet	采用跳跃式连接的方式,使得特征可以跳过部分网络直接传递至较深的网络层,并将7×7、5×5和3×3卷积换为深度可分离卷积	减少参数量,加速特征在网络中的传递,减轻了梯度消失问题	模型通道冗余,可通过剪枝操作压缩模型,进一步提升训练速度
齐榕等人^[49]	2020	MobileNet	在Tiny-YOLO V3的基础上修改骨干网络为MobileNet,将原来的双尺度预测变为三尺度预测	适应不同类型的物体,增加网络层数,提高了检测精度	未考虑复杂环境下的目标,对于极端拍摄环境易造成漏检
郑秋梅等人^[50]	2020	DarkNet-50	过渡层使用1×1和3×3交替卷积,利用ResNet进行特征提取,并去除YOLO检测前的两组卷积层	避免了下采样操作导致的信息丢失问题,注重浅层特征信息,去除了冗余卷积层	依赖于车辆运动信息,模型泛化能力弱

文献	年份	骨干网络	方法	优点	局限性
Fu等人^[6]	2017	ResNet101	在ResNet101后增加残差块,并使用卷积层3、5、6、7、8、9的输出作为预测层的输入;增加反卷积模块和预测模块	相比SSD,对小目标的检测效果得到显著提升	模型检测速度远不及SSD检测模型
Shen等人^[7]	2017	DenseNet	对DenseNet变形,包含Stem块、4个密集连接块、2个transition layer以及2个transition w/o pooling layer	缓解梯度消失,保证特征图分辨率,增加模型鲁棒性	预训练模型大,参数多,优化空间存在差异,应用领域受限
张侣等人^[51]	2021	Res-Am	一条独立的支路进行LCBAM操作,另一支路进行两次残差连接,最后使用Add进行特征融合	增强了网络的特征提取能力,引入注意力机制进行特征自适应学习,减少数据冗余	模型结构较复杂,较难满足网络实时性要求
赵鹏飞等人^[52]	2021	I-Darknet53	将1×1后的卷积层分为s个宽、高相同的通道组,输出包含不同感受野大小的组合	扩展网络宽度,提取更多全局信息,提升骨干网络的特征提取能力	在一定程度上加重了模型训练参数,速度减慢
奚琦等人^[53]	2021	DenseNet	采用3个连续的3×3卷积核代替7×7卷积核,得到19×19像素的特征图	减少参数,降低输入图像特征信息的消耗,最大程度地保留了目标细节信息	需进行多次concat操作,数据需多次复制,增加显存消耗
徐先峰等人^[54]	2020	MobileNet	构建深度可分离卷积,将Conv11和Conv13卷积层送入预测器预测	减少参数,降低模型复杂度,降低内存占用空间	检测精度不及SSD,当目标遮挡面积较大时容易产生漏检
Lu等人^[55]	2019	ResNet	将SSD中的VGG16至Conv8_2网络框架换为ResNet网络,对38×38、19×19、10×10、5×5、3×3、1×1像素的特征图进行检测	在复杂环境下学习更多特征,解决了网络退化的问题	网络层数增加,训练参数增多

文献	年份	骨干网络	方法	优点	局限性
Fu等人^[6]	2017	ResNet101	在ResNet101后增加残差块,并使用卷积层3、5、6、7、8、9的输出作为预测层的输入;增加反卷积模块和预测模块	相比SSD,对小目标的检测效果得到显著提升	模型检测速度远不及SSD检测模型
Shen等人^[7]	2017	DenseNet	对DenseNet变形,包含Stem块、4个密集连接块、2个transition layer以及2个transition w/o pooling layer	缓解梯度消失,保证特征图分辨率,增加模型鲁棒性	预训练模型大,参数多,优化空间存在差异,应用领域受限
张侣等人^[51]	2021	Res-Am	一条独立的支路进行LCBAM操作,另一支路进行两次残差连接,最后使用Add进行特征融合	增强了网络的特征提取能力,引入注意力机制进行特征自适应学习,减少数据冗余	模型结构较复杂,较难满足网络实时性要求
赵鹏飞等人^[52]	2021	I-Darknet53	将1×1后的卷积层分为s个宽、高相同的通道组,输出包含不同感受野大小的组合	扩展网络宽度,提取更多全局信息,提升骨干网络的特征提取能力	在一定程度上加重了模型训练参数,速度减慢
奚琦等人^[53]	2021	DenseNet	采用3个连续的3×3卷积核代替7×7卷积核,得到19×19像素的特征图	减少参数,降低输入图像特征信息的消耗,最大程度地保留了目标细节信息	需进行多次concat操作,数据需多次复制,增加显存消耗
徐先峰等人^[54]	2020	MobileNet	构建深度可分离卷积,将Conv11和Conv13卷积层送入预测器预测	减少参数,降低模型复杂度,降低内存占用空间	检测精度不及SSD,当目标遮挡面积较大时容易产生漏检
Lu等人^[55]	2019	ResNet	将SSD中的VGG16至Conv8_2网络框架换为ResNet网络,对38×38、19×19、10×10、5×5、3×3、1×1像素的特征图进行检测	在复杂环境下学习更多特征,解决了网络退化的问题	网络层数增加,训练参数增多

文献	骨干网络	速度/(frame/s)	mAP/%
文献	骨干网络	速度/(frame/s)	VOC2007	VOC2012	MS COCO
DSSD321^[6]	ResNet-101	9.5	78.60	76.3	33.20
DSSD513^[6]	ResNet-101	5.5	81.50	80.8	33.20
Shen等人^[7]	DenseNet	17.4	77.70	76.3	29.30
Adamdad^[39]	MobileNet	29.0	64.22	—	—
齐榕等人^[49]	MobileNet	—	73.30	—	40.20
Li等人^[41]	Shuffle_SENet	29.0	64.70	—	—
YOLOv3_Tiny	Darknet-Tiny	25.0	58.20	—	33.30
YOLO V5l^[46]	CSPNet	99.0	—	—	48.80
Pan等人^[47]	DenseNet	12.0	65.93	—	—
张侣等人^[51]	Res-Am	51.0	71.40	—	—
赵鹏飞等人^[52]	I-Darknet53	32.0	82.30	—	—
奚琦等人^[53]	DenseNet	58.0	82.30	—	—
Cheng等人^[56]	MobileNetV2-FPN	97.0	73.80	71.4	—
李文涛等人^[57]	ResNet	30.0	82.70	—	—
Zhai等人^[58]	DenseNet-S-32-1	11.6	78.90	76.5	29.50

深度学习中的单阶段小目标检测方法综述

Survey of One-Stage Small Object Detection Methods in Deep Learning

RichHTML

PDF

可视化

摘要/Abstract

引用本文

使用本文

图/表 11

参考文献 110

相关文章 15

编辑推荐

Metrics

文献	算法	速度/(frame/s)	mAP/%
文献	算法	速度/(frame/s)	VOC2007	COCO
李文涛等人^[57]	KNCA-Fusion	30.0	82.70	—
赵鹏飞等人^[52]	—	32.0	82.30	—
徐诚极等人^[60]	Attention- YOLO-A	26.0	81.70	32.6
张陶宁等人^[61]	MSFAN	46.0	75.50	33.6
Li等人^[62]	Attention-YOLO-B	25.0	81.90	33.5
于敏等人^[65]	改进型RetinaNet	—	79.50	40.9
刘建政等人^[67]	FIENet	36.0	82.30	33.8
张海涛等人^[68]	—	25.0	79.70	—
Pan等人^[69]	ADFPNet300	62.5	81.10	31.8
Pan等人^[69]	ADFPNet512	43.5	82.50	36.4
Zhu等人^[70]	ODMC300	34.8	79.60	30.2
Zhu等人^[70]	ODMC512	14.5	81.80	34.6
赵文清等人^[71]	改进型SSD300	33.0	80.60	—
	改进型SSD321	30.0	81.50
	改进型SSD512	12.0	82.50
鞠默然等人^[72]	AM-YOLO V3 416	38.9	82.69	—
鞠默然等人^[72]	AM-YOLO V3 544	26.5	83.43	—

文献	算法	特征融合策略	速度/(frame/s)	mAP/%
文献	算法	特征融合策略	速度/(frame/s)	VOC2007	VOC2012	COCO
Li等人^[9]	FSSD512	将Conv4_3、Fc7以及Conv7_2的特征图进行串联融合操作	35.7	80.9	82.4	31.8
赵鹏飞等人^[52]	—	采用多尺度空洞卷积级联扩大特征图的感受野,然后使用1×1卷积将特征信息融合;并将38×38大小的特征图与10×10和19×19的特征图拼接,最后采用ECAM模块进行通道加权	32.0	82.3	—	—
于敏等人^[65]	改进型RetinaNet	设计了一个自底向上的路径聚合模块,将聚合后的结果经过缩放整合与优化,最后得到多特征{M3,M4,M5,M6,M7},并与N3~N7的特征层再次相加融合	—	79.5	—	40.9
刘建政等人^[67]	FIENet	将Conv4_3和Fc7两个特征层融合,增加不同特征层之间的特征映射关系;将输入通道通过1×1卷积分成4份之后进行级联,增强细节语义信息	36.1	82.3	—	33.8
Shi等人^[74]	FFESSD300 FFESSD512	设计了两种不同模式的FFM模块,在其中一种FFM模块中利用反卷积调整特征图大小;同时,设计了浅层特征增强和深层特征增强模块	54.3 30.2	79.1 81.8	—	—
Woo等人^[75]	StairNet	通过反卷积与浅层特征信息融合,同时将融合后的特征传递到下一个反卷积层,以自顶向下的方式增强目标语义信息	30.0	78.8	76.4	33.6
张思宇等人^[76]	hgSSD	在SSD300的基础上对不同尺度的特征图进行反卷积操作与浅层特征融合,并在不同尺度的特征图上进行特征预测	36.0	82.3	82.3	33.8
王燕妮等人^[77]	改进的SSD	结合CSPNet的思想设计跳变连接网络,将特征图划分为两个分支,通过跨阶段层次结构合并	37.3	79.6	—	—

数据集名称	图像总数	实例数	类别数量	年份	应用场景
EuroCity Persons^[18]	47 300	238 200	7	2019	行人检测
DOTA^[19]	2 806	188 282	15	2018	航空图像检测
WIDER FACE^[20]	32 203	393 703	60	2016	人脸检测
CityPersons^[21]	5 000	约35 000	4	2017	行人检测
TinyPerson^[22]	1 610	72 651	5	2020	航空图像中的行人检测
Behrendt^[104]	8 334	13 493	4	2017	交通灯检测
AI-TOD^[105]	28 036	700 621	8	2021	航空图像检测
iSAID^[106]	2 806	655 451	15	2019	航空图像检测
WiderPerson^[108]	13 382	399 786	5	2019	行人检测
DeepScores^[110]	300 000	数百万	118	2018	乐谱图像检测

[1]	安凤平, 李晓薇, 曹翔. 权重初始化-滑动窗口CNN的医学图像分类[J]. 计算机科学与探索, 2022, 16(8): 1885-1897.
[2]	曾凡智, 许露倩, 周燕, 周月霞, 廖俊玮. 面向智慧教育的知识追踪模型研究综述[J]. 计算机科学与探索, 2022, 16(8): 1742-1763.
[3]	刘艺, 李蒙蒙, 郑奇斌, 秦伟, 任小广. 视频目标跟踪算法综述[J]. 计算机科学与探索, 2022, 16(7): 1504-1515.
[4]	赵小明, 杨轶娇, 张石清. 面向深度学习的多模态情感识别研究进展[J]. 计算机科学与探索, 2022, 16(7): 1479-1503.
[5]	夏鸿斌, 肖奕飞, 刘渊. 融合自注意力机制的长文本生成对抗网络模型[J]. 计算机科学与探索, 2022, 16(7): 1603-1610.
[6]	孙方伟, 李承阳, 谢永强, 李忠博, 杨才东, 齐锦. 深度学习应用于遮挡目标检测算法综述[J]. 计算机科学与探索, 2022, 16(6): 1243-1259.
[7]	刘雅芬, 郑艺峰, 江铃燚, 李国和, 张文杰. 深度半监督学习中伪标签方法综述[J]. 计算机科学与探索, 2022, 16(6): 1279-1290.
[8]	程卫月, 张雪琴, 林克正, 李骜. 融合全局与局部特征的深度卷积神经网络算法[J]. 计算机科学与探索, 2022, 16(5): 1146-1154.
[9]	钟梦圆, 姜麟. 超分辨率图像重建算法综述[J]. 计算机科学与探索, 2022, 16(5): 972-990.
[10]	赵鹏飞, 谢林柏, 彭力. 融合注意力机制的深层次小目标检测算法[J]. 计算机科学与探索, 2022, 16(4): 927-937.
[11]	裴利沈, 赵雪专. 群体行为识别深度学习方法研究综述[J]. 计算机科学与探索, 2022, 16(4): 775-790.
[12]	许嘉, 韦婷婷, 于戈, 黄欣悦, 吕品. 题目难度评估方法研究综述[J]. 计算机科学与探索, 2022, 16(4): 734-759.
[13]	朱伟杰, 陈莹. 双流时间域信息交互的微表情识别卷积网络[J]. 计算机科学与探索, 2022, 16(4): 950-958.
[14]	姜艺, 胥加洁, 柳絮, 朱俊武. 边缘指导图像修复算法研究[J]. 计算机科学与探索, 2022, 16(3): 669-682.
[15]	张全贵, 胡嘉燕, 王丽. 耦合用户公共特征的单类协同过滤推荐算法[J]. 计算机科学与探索, 2022, 16(3): 637-648.