Journal of Frontiers of Computer Science and Technology

• Science Researches •

YOLO-Mamba contraband detection method integrating global perception and multi-scale collaboration

SHENG Chunlei,  LIU Chengkai,  LI Zelong,  LU Shuhua   

  1. College of Information and Cyber Security, People’s Public Security University of China, Beijing 102600, China

融合全局感知与多尺度协同的YOLO-Mamba违禁品检测方法

生春雷,刘成恺,李泽龙,卢树华   

  1. 中国人民公安大学 信息网络安全学院,北京 102600

Abstract: X-ray security images exhibit large multi-scale variation and overlapping, occluded targets, and small contraband objects are prone to missed and false detections. To address these problems, this paper proposes YOLO-Mamba, a contraband detection method that integrates global perception and multi-scale collaboration. With YOLOv10 as the baseline, a state-space model improved with a gated structure-aware block (GSSBlock) is introduced into the backbone to strengthen spatial information modeling and focus on key-region features. A Multi-Residual Connected Pooling (M-RCP) structure is also designed to improve the perception of global and edge information, which helps separate foreground from background in X-ray images and reduces interference from complex backgrounds. In the neck, a Deep Feature Fusion Pyramid Network (DFFPN) strengthens feature fusion through bidirectional cross-scale and multi-level interaction, improving multi-scale feature perception and reducing missed and false detections. Within the DFFPN, a fusion module built from dual-branch depthwise separable convolutions (DWConv) gathers information from different receptive fields, capturing the fine details of small contraband objects at low computational cost. The method is trained and tested on three public datasets, OPIXray, HIXray, and SIXray, where it reaches mAP50 of 94.0%, 82.9%, and 94.6%, respectively, which is 5.7, 2.7, and 4.6 percentage points above the baseline. It outperforms a range of state-of-the-art algorithms with a small parameter count, balancing detection accuracy and speed for contraband detection in X-ray security screening.
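The efficiency claim for the dual-branch DWConv fusion module can be made concrete with a rough weight count. The sketch below is illustrative only and is not the authors' implementation: the channel width (256), the two kernel sizes (3 and 5), and all function names are assumptions chosen for the example, and bias terms and normalization layers are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dw_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def dual_branch_params(c_in, c_out, kernels=(3, 5)):
    """Hypothetical two-branch module: one depthwise separable branch per kernel size."""
    return sum(dw_separable_params(c_in, c_out, k) for k in kernels)


if __name__ == "__main__":
    c = 256  # assumed channel width, not taken from the paper
    print("standard 3x3 conv:      ", conv_params(c, c, 3))
    print("depthwise separable 3x3:", dw_separable_params(c, c, 3))
    print("dual branch (3x3 + 5x5):", dual_branch_params(c, c))
```

Under these assumptions, a standard 3x3 convolution has 589,824 weights, while the two depthwise separable branches together have 139,776, which is under a quarter of that. This suggests how a module can cover two receptive fields and still stay lightweight.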

Key words: X-ray images, contraband detection, state-space model, multi-residual connected pooling structure, multi-scale fusion
