Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (5): 1301-1317. DOI: 10.3778/j.issn.1673-9418.2401007

• Graphics and Image •

UAV Remote Sensing Object Detection Based on 3D Multi-layer Feature Collaboration

LYU Fu, FU Yuheng, HE Lina, YANG Dongpeng   

  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
  2. Department of Basic Teaching, Liaoning Technical University, Huludao, Liaoning 125105, China
  3. Wuxi Feipu Electronic Information Technology Co., Ltd., Wuxi, Jiangsu 214000, China
  4. Liaoning Water Conservancy and Hydropower Survey and Design Research Institute Co., Ltd., Shenyang 110000, China
  • Published: 2024-05-01  Released: 2024-04-29

Abstract: UAV (unmanned aerial vehicle) aerial images contain a large proportion of small targets against complex backgrounds, and current object detection models suffer from low accuracy and missed detections of small targets on such images. Building on the YOLOv8s model, this paper proposes a UAV remote sensing object detection algorithm based on 3D multi-layer feature collaboration. Firstly, 3D multi-branch coordinate attention (MBCA) is proposed on the basis of coordinate attention; by adding information interaction along the channel dimension and splitting and fusing extended branches, it reduces computation in the spatial dimension and improves the model's global feature extraction ability. Secondly, SPD-Conv replaces some of the standard convolutions, which retains more feature information during downsampling and speeds up inference. Then, a more efficient FastDBB_Bottleneck module is used in the C2f module, combining PConv with the structural reparameterization of DBB to further reduce the model's computation. Finally, the PG-Detect detection head is introduced, which significantly reduces computation and lowers the missed detection rate for small targets. Experiments on the VisDrone2019 dataset show that the proposed method reaches an mAP50 of 44.5%, which is 5.7 percentage points higher than the YOLOv8s baseline. In a crack detection verification experiment on a self-built dam crack dataset, the improved method raises mAP50 by 3.3 percentage points over YOLOv8s and reaches 289 FPS. These results indicate that the proposed method improves both the accuracy and the real-time performance of the detection model in complex scenes, and has good adaptability and robustness.
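
The abstract does not spell out how SPD-Conv keeps feature information during downsampling. As the technique is generally described, a space-to-depth rearrangement moves spatial samples into the channel dimension and is followed by a non-strided convolution, so no input value is discarded the way it is in a strided convolution. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the function names, the scale of 2, and the 1x1 convolution standing in for the non-strided convolution are all assumptions made for illustration.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) map into (C * scale**2, H // scale, W // scale).

    Every input value is kept; spatial resolution is traded for channels.
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Each (row offset, column offset) pair selects one sub-sampled copy of the map.
    slices = [x[:, i::scale, j::scale] for i in range(scale) for j in range(scale)]
    return np.concatenate(slices, axis=0)


def pointwise_conv(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Stride-1 1x1 convolution: mixes channels, leaves resolution unchanged.

    weight has shape (C_out, C_in). The 1x1 kernel is an illustrative
    stand-in (assumption) for the non-strided convolution after the rearrangement.
    """
    return np.einsum("oc,chw->ohw", weight, x)


# Hypothetical 2-channel, 4x4 feature map.
x = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
y = space_to_depth(x)                      # (8, 2, 2): half the resolution, 4x the channels
w = np.ones((3, 8), dtype=np.float32) / 8  # hypothetical weights, 8 -> 3 channels
z = pointwise_conv(y, w)                   # (3, 2, 2)
```

The rearranged tensor `y` holds exactly the same values as `x`, which is the sense in which downsampling here loses no information before the convolution compresses the channels.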

Key words: UAV remote sensing, 3D multi-branch coordinate attention (MBCA), YOLOv8, multi-layer feature fusion, small target detection