Journal of Frontiers of Computer Science and Technology ›› 2024, Vol. 18 ›› Issue (9): 2221-2238. DOI: 10.3778/j.issn.1673-9418.2402044

• Frontiers·Survey •

Survey of Development of YOLO Object Detection Algorithms

XU Yanwei, LI Jun, DONG Yuanfang, ZHANG Xiaoli

  1. School of Management Science and Information Engineering, Jilin University of Finance and Economics, Changchun 130117, China
  2. Center for Artificial Intelligence, Jilin University of Finance and Economics, Changchun 130117, China
  3. School of Economics and Management, Changchun University of Science and Technology, Changchun 130022, China
  4. College of Computer Science and Technology, Jilin University, Changchun 130015, China
  • Online: 2024-09-01 Published: 2024-09-01

Abstract: In recent years, deep learning-based object detection has been an active topic in computer vision research, and YOLO (you only look once) is one of its most prominent algorithms. Improvements to its network architecture over successive versions have played an important role in raising detection speed and accuracy. This paper presents a horizontal analysis of the overall frameworks of YOLOv1 to YOLOv9, comparing their network architectures (backbone, neck, and head) and loss functions. It discusses the strengths and limitations of the different improvement methods and evaluates their effect on model accuracy. It also discusses dataset selection and construction methods, the rationale for choosing different evaluation metrics, and their applicability and limitations in different application scenarios. It then examines specific improvements to the YOLO algorithm in five application domains (industry, transportation, remote sensing, agriculture, and biology) and discusses the balance among detection speed, detection accuracy, and complexity. Finally, the paper analyzes the current state of YOLO in these domains, summarizes open problems in YOLO research through specific examples, and, in light of trends in the application domains, gives an outlook on the future of the YOLO series, discussing four research directions in detail: multi-task learning, edge computing, multimodal integration, and virtual and augmented reality technology.

Key words: YOLO algorithm, object detection, computer vision, feature extraction, convolutional neural network