YOLO-Mamba contraband detection method integrating global perception and multi-scale collaboration

doi:10.3778/j.issn.1673-9418.2503063

Abstract: X-ray security images pose several challenges for contraband detection: multi-scale spatial variation, overlapping and occluded targets, and high false-negative and false-positive rates for small objects. To address these, this paper proposes a YOLO-Mamba contraband detection algorithm that integrates global perception and multi-scale collaboration. With YOLOv10 as the baseline, a GSSBlock-enhanced state-space model is incorporated into the backbone to improve spatial information modeling and focus on key region features. A Multi-Residual Connected Pooling (M-RCP) structure is also designed to strengthen the perception of global and edge information, which improves foreground-background differentiation in X-ray images and reduces interference from complex backgrounds. In the neck, a Deep Feature Fusion Pyramid Network (DFFPN) uses bidirectional cross-scale interaction and multi-level feature fusion to strengthen multi-scale feature perception and reduce false positives and false negatives. Within the DFFPN, a fusion module built from dual-branch depthwise separable convolutions (DWConv) extracts information from different receptive fields, capturing the fine-grained details of small contraband objects while keeping the computational cost low. The method is trained and evaluated on three public datasets, OPIXray, HIXray, and SIXray, where it achieves mAP50 scores of 94.0%, 82.9%, and 94.6%, improvements of 5.7%, 2.7%, and 4.6% over the baseline, respectively. Experimental results show that the approach outperforms many state-of-the-art algorithms while maintaining lower complexity, balancing detection accuracy and computational efficiency and making it a competitive solution for contraband detection in X-ray security screening.
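The efficiency claim for the dual-branch DWConv fusion module can be illustrated with a simple weight count. The following is a minimal sketch, not the authors' implementation: the channel width (256), the kernel sizes (3 and 5), and the function names are assumptions chosen for illustration, since the abstract does not specify them. It compares two parallel standard convolutions with two parallel depthwise separable convolutions (a per-channel k x k depthwise filter followed by a 1 x 1 pointwise convolution), ignoring biases.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dwsep_params(c_in, c_out, k):
    """Weights in a depthwise separable convolution (bias omitted).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    return c_in * k * k + c_in * c_out


def dual_branch_params(c_in, c_out, kernels=(3, 5)):
    """Two parallel depthwise separable branches with different kernel
    sizes, i.e. different receptive fields. The kernel sizes are a
    hypothetical choice, not taken from the paper."""
    return sum(dwsep_params(c_in, c_out, k) for k in kernels)


channels = 256  # assumed channel width, for illustration only
standard = sum(conv_params(channels, channels, k) for k in (3, 5))
separable = dual_branch_params(channels, channels)

print("standard dual-branch weights: ", standard)
print("separable dual-branch weights:", separable)
print("reduction factor:", round(standard / separable, 1))
```

Under these assumptions the separable variant needs roughly one sixteenth of the weights, which is consistent with the abstract's point that different receptive fields can be covered without a large increase in computation.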

Keywords: X-ray images, contraband detection, state-space model, multi-residual connected pooling structure, multi-scale fusion

SHENG Chunlei, LIU Chengkai, LI Zelong, LU Shuhua. YOLO-Mamba contraband detection method integrating global perception and multi-scale collaboration[J]. Journal of Frontiers of Computer Science and Technology, DOI: 10.3778/j.issn.1673-9418.2503063.
