Adaptive panoramic focusing X-ray image contraband detection algorithm

Citation: Cui L Q, Yang Y Y, Jin H B, et al. Adaptive panoramic focusing X-ray image contraband detection algorithm[J]. Opto-Electron Eng, 2025, 52(4): 240286. doi: 10.12086/oee.2025.240286


  • Funding: National Natural Science Foundation of China (62173171); Basic Scientific Research Project of Liaoning Provincial Universities (LJKMZ20220699)
  • *Corresponding author: Yang Yingying, 1429337791@qq.com
  • CLC number: TP391.4
  • CSTR: 32245.14.oee.2025.240286

  • Abstract: To address the missed and false detections caused by heavy sample overlap and occlusion, difficult key-feature extraction, and strong background noise in X-ray security images, an adaptive panoramic focusing X-ray image contraband detection algorithm is proposed. First, a foreground feature awareness module is designed to reinforce the edge structure and texture detail of foreground targets, accurately separating contraband from background noise and improving the accuracy and completeness of feature representation. Then, a multi-path two-dimensional information integration module is constructed by combining a multi-branch structure with a dual cross attention mechanism; it optimizes feature interaction and fusion across the channel and spatial dimensions, strengthens key-feature extraction, and effectively suppresses background interference. Finally, a panoramic dynamic focus detection head is built, in which frequency-adaptive dilated convolution dynamically adjusts the receptive field to match the feature frequency distribution of small contraband targets, enhancing the model's ability to recognize small objects. Trained and tested on the public SIXray and OPIXray datasets, the model reaches mAP@0.5 of 93.3% and 92.5%, respectively, outperforming the other compared algorithms. The experimental results show that the model markedly reduces missed and false detections of contraband in X-ray images and has high accuracy and robustness.
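The foreground feature awareness module above builds on classical edge cues. As a rough illustration only (the paper's FFAM is a learned module; the function below is a hypothetical stand-in), a minimal NumPy Sobel gradient-magnitude sketch shows the kind of edge response that foreground enhancement amplifies:

```python
import numpy as np

def sobel_edge_magnitude(img: np.ndarray) -> np.ndarray:
    """Classical Sobel gradient magnitude over a 2D grayscale image:
    the edge cue that foreground-enhancement modules typically strengthen."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # vertical-gradient kernel is the transpose
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            gx = (patch * kx).sum()  # horizontal gradient
            gy = (patch * ky).sum()  # vertical gradient
            out[i, j] = np.hypot(gx, gy)
    return out

# A vertical step edge: the response concentrates at the boundary columns.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edge_magnitude(img)
print(edges.max())  # → 4.0, peaked at the step; flat regions give 0
```

In an X-ray scan, an edge map like this highlights object contours that overlap suppresses in the raw intensities, which is why edge structure is a useful cue for separating contraband from cluttered backgrounds.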

  • Overview: X-ray image detection of prohibited items plays a crucial role in various fields, including public transportation, logistics, and customs inspection. It is a key technology in image processing and object detection, with the primary task of accurately identifying the category and location of prohibited items within complex background environments to ensure the safety of human life, property, and goods transportation. Unlike natural images, prohibited item images are generated using X-ray imaging technology, where the targets exhibit diverse categories and varying shapes. Moreover, these images are often affected by challenges such as target stacking, occlusion, low contrast, and complex backgrounds, making it difficult to accurately identify the correct targets, thereby leading to missed and false detections. Consequently, achieving precise identification of prohibited items and improving detection efficiency have become critical challenges and focal points in current research. To address the issues of target overlap and occlusion, difficulty in key feature extraction, and missed detection of small-sized contraband in X-ray images, this paper proposes an adaptive panoramic focus X-ray contraband detection algorithm based on the YOLOv8n model. This algorithm incorporates several novel components designed to enhance detection accuracy and efficiency. First, a foreground feature awareness module (FFAM) is proposed to significantly enhance the model's ability to represent the features of foreground targets, enabling accurate identification of contraband objects in overlapping and occluded scenes. Second, a multi-path two-dimensional information integration (MPTI) module is designed to enhance the model's ability to recognize key features by optimizing the interaction and integration of multi-scale features across both channel and spatial dimensions, enabling the extraction of more comprehensive and richer contextual information. 
Finally, a panoramic dynamic focus detection head (PDF_Detect) is introduced. By incorporating frequency-adaptive dilated convolutions and a dynamic focusing mechanism, the model can adaptively select the optimal receptive field size based on the frequency distribution of features. This enhances the model's ability to focus on small-sized contraband targets, effectively improving the detection of small targets and reducing both missed and false detections in complex scenes. Experiments were conducted on the public datasets SIXray and OPIXray. The experimental results show that the proposed method achieved mAP@0.5 values of 93.3% and 92.5%, representing improvements of 3.6% and 2.8% over the baseline model, respectively, and outperforming other comparative algorithms. These results demonstrate that the proposed algorithm significantly reduces missed and false detections of contraband in X-ray images, exhibiting high accuracy and robustness.
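The receptive-field arithmetic behind frequency-adaptive dilated convolution is simple to state: a single k×k convolution with dilation d covers d·(k−1)+1 input positions per axis, so raising d widens context without adding parameters. The sketch below is a toy stand-in for FADC's learned dilation selection (the thresholds and the quarter-radius high-frequency cutoff are our own assumptions, not the paper's): it assigns a smaller dilation where a patch's spectrum is rich in high frequencies, keeping sampling dense enough for small-object detail.

```python
import numpy as np

def receptive_field(kernel: int, dilation: int) -> int:
    """Effective receptive field of one dilated conv layer: d*(k-1) + 1."""
    return dilation * (kernel - 1) + 1

def pick_dilation(patch: np.ndarray, rates=(1, 2, 3)) -> int:
    """Toy frequency-adaptive rule: more high-frequency energy in the
    patch -> smaller dilation (denser sampling). Thresholds are illustrative."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
    h, w = spec.shape
    cy, cx = h // 2, w // 2
    yy, xx = np.ogrid[:h, :w]
    high = np.hypot(yy - cy, xx - cx) > min(h, w) / 4  # outside quarter radius
    ratio = spec[high].sum() / (spec.sum() + 1e-9)     # high-frequency share
    if ratio > 0.4:
        return rates[0]
    if ratio > 0.1:
        return rates[1]
    return rates[2]

# 3x3 kernel: dilation 1, 2, 3 -> receptive fields 3, 5, 7.
print(receptive_field(3, 1), receptive_field(3, 2), receptive_field(3, 3))
```

A flat patch (all DC energy) gets the largest dilation, while a checkerboard patch (energy at the Nyquist frequency) gets the smallest, which mirrors the idea of matching the receptive field to the local feature frequency distribution.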

  • Figure 1.  Overall structure of algorithm

    Figure 2.  Structure of foreground feature awareness module (FFAM)

    Figure 3.  Region information aggregation (RIA)

    Figure 4.  Structure of multi-path two-dimensional information integration (MPTI)

    Figure 5.  Structure of dual cross attention mechanism (DCA). (a) DCA module; (b) CCA module; (c) SCA module

    Figure 6.  Structure of frequency adaptive dilated convolution (FADC)

    Figure 7.  Structure of panoramic dynamic focus detection head (PDF_Detect)

    Figure 8.  Comparison of evaluation indicators between the YOLOv8n and improved model

    Figure 9.  F1-curve comparison diagram between the baseline model YOLOv8n (left) and improved model (right) on the SIXray dataset

    Figure 10.  Convergence curves of the improved algorithm on the SIXray dataset

    Figure 11.  Comparison of detection results
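Figure 5's dual cross attention pairs a channel branch (CCA) with a spatial branch (SCA). The paper's modules apply cross attention over multi-scale encoder features; the NumPy sketch below is a deliberately simplified stand-in showing only the channel-then-spatial gating pattern that such dual designs share (both gate functions are our own illustrative simplification, not the DCA implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_gate(x: np.ndarray) -> np.ndarray:
    """x: (C, H, W). Re-weight each channel by its global average response
    (the channel-dimension half of a dual attention design)."""
    w = sigmoid(x.mean(axis=(1, 2)))   # one weight per channel, in (0, 1)
    return x * w[:, None, None]

def spatial_gate(x: np.ndarray) -> np.ndarray:
    """Re-weight each spatial location by its cross-channel average response
    (the spatial-dimension half)."""
    w = sigmoid(x.mean(axis=0))        # one weight per (h, w) location
    return x * w[None, :, :]

x = np.random.randn(4, 8, 8)
y = spatial_gate(channel_gate(x))      # channel branch, then spatial branch
print(y.shape)  # (4, 8, 8): gating preserves the feature map's shape
```

Because each gate lies in (0, 1), the two branches can only attenuate, never amplify, which is the mechanism by which low-response background regions and channels get suppressed while informative ones pass through.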

    Table 1.  Ablation experiment results of the improved algorithm in the SIXray dataset

    | No. | YOLOv8n | A | B | C | P/%  | R/%  | mAP@0.5/% | mAP@0.5:0.95/% | Params/M | GFLOPs |
    |-----|---------|---|---|---|------|------|-----------|----------------|----------|--------|
    | 1   | ✓       | × | × | × | 91.3 | 84.2 | 89.7      | 66.1           | 3.01     | 8.1    |
    | 2   | ✓       | ✓ | × | × | 93.8 | 85.4 | 91.8      | 68.6           | 4.17     | 11.3   |
    | 3   | ✓       | × | ✓ | × | 92.0 | 84.0 | 90.6      | 66.2           | 2.48     | 6.2    |
    | 4   | ✓       | × | × | ✓ | 91.8 | 84.9 | 91.3      | 66.5           | 2.78     | 7.7    |
    | 5   | ✓       | ✓ | ✓ | × | 94.1 | 86.3 | 92.2      | 69.4           | 2.86     | 7.8    |
    | 6   | ✓       | ✓ | × | ✓ | 94.7 | 86.7 | 92.6      | 70.7           | 3.59     | 9.4    |
    | 7   | ✓       | × | ✓ | ✓ | 94.3 | 87.3 | 91.9      | 68.6           | 2.65     | 6.8    |
    | 8   | ✓       | ✓ | ✓ | ✓ | 94.6 | 88.8 | 93.3      | 72.4           | 2.92     | 7.9    |
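The P and R columns of Table 1 combine into the F1 score whose curves are compared in Fig. 9. A quick check of the harmonic-mean arithmetic for the baseline row (No. 1) and the full-model row (No. 8):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall, the quantity behind F1-curves."""
    return 2 * p * r / (p + r)

# Table 1, row 1 (baseline YOLOv8n) and row 8 (all modules enabled):
print(round(f1(91.3, 84.2), 2))  # → 87.61
print(round(f1(94.6, 88.8), 2))  # → 91.61
```

So the full model improves the balanced F1 operating point by about 4 percentage points over the baseline, consistent with the gains in both P and R.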

    Table 2.  Comparison results of the accuracy of ablation experiments by category

    | No. | YOLOv8n | A | B | C | AP(GU)/% | AP(KN)/% | AP(WR)/% | AP(PL)/% | AP(SC)/% | mAP@0.5/% |
    |-----|---------|---|---|---|----------|----------|----------|----------|----------|-----------|
    | 1   | ✓       | × | × | × | 97.8     | 86.5     | 87.6     | 94.5     | 84.8     | 89.7      |
    | 2   | ✓       | ✓ | × | × | 98.5     | 87.8     | 90.5     | 95.3     | 87.4     | 91.8      |
    | 3   | ✓       | × | ✓ | × | 98.2     | 86.9     | 88.6     | 95.1     | 85.2     | 90.6      |
    | 4   | ✓       | × | × | ✓ | 98.4     | 87.4     | 88.4     | 94.6     | 87.9     | 91.3      |
    | 5   | ✓       | ✓ | ✓ | × | 99.2     | 88.4     | 90.7     | 96.0     | 86.5     | 92.2      |
    | 6   | ✓       | ✓ | × | ✓ | 99.1     | 88.2     | 92.1     | 95.8     | 86.8     | 92.6      |
    | 7   | ✓       | × | ✓ | ✓ | 98.7     | 86.8     | 92.6     | 95.6     | 85.8     | 91.9      |
    | 8   | ✓       | ✓ | ✓ | ✓ | 99.4     | 89.0     | 93.5     | 96.4     | 87.1     | 93.3      |

    Table 3.  Comparison experiment results on the SIXray dataset

    | Model        | AP(GU)/% | AP(KN)/% | AP(WR)/% | AP(PL)/% | AP(SC)/% | P/%  | R/%  | mAP@0.5/% | FPS   | Params/M | GFLOPs |
    |--------------|----------|----------|----------|----------|----------|------|------|-----------|-------|----------|--------|
    | SSD          | 91.6     | 74.8     | 69.8     | 80.9     | 83.4     | 83.5 | 70.8 | 79.6      | 60.4  | 26.28    | 62.7   |
    | Faster R-CNN | 89.2     | 79.4     | 80.1     | 86.4     | 84.2     | 87.9 | 75.1 | 85.5      | 31    | 136.72   | 369.8  |
    | YOLOv5n      | 98.7     | 88.2     | 82.6     | 90.5     | 79.7     | 92.7 | 78.7 | 87.9      | 101   | 2.50     | 7.1    |
    | YOLOv6       | 97.1     | 83.7     | 87.0     | 92.2     | 81.1     | 87.4 | 82.1 | 88.2      | 121.6 | 4.23     | 11.8   |
    | YOLOv7-Tiny  | 97.7     | 82.6     | 85.0     | 89.8     | 78.7     | 86.6 | 81.3 | 86.7      | 94.5  | 6.02     | 13.2   |
    | YOLOv8n      | 97.8     | 86.5     | 87.6     | 94.5     | 84.8     | 91.3 | 84.2 | 89.7      | 108.6 | 3.01     | 8.1    |
    | YOLOv9       | 98.4     | 86.6     | 88.0     | 92.9     | 84.8     | 91.4 | 82.5 | 90.1      | 111.5 | 2.27     | 8.2    |
    | YOLOv10n     | 98.3     | 87.5     | 90.2     | 95.8     | 86.0     | 90.1 | 85.7 | 91.4      | 116.3 | 2.71     | 8.4    |
    | Ours         | 99.4     | 89.0     | 93.5     | 96.4     | 87.1     | 94.6 | 88.8 | 93.3      | 125.9 | 2.92     | 7.9    |

    Table 4.  Comparison experiment results on the OPIXray dataset

    | Model        | AP(ST)/% | AP(FO)/% | AP(SC)/% | AP(UT)/% | AP(MU)/% | P/%  | R/%  | mAP@0.5/% | FPS   | Params/M | GFLOPs |
    |--------------|----------|----------|----------|----------|----------|------|------|-----------|-------|----------|--------|
    | SSD          | 33.5     | 73.4     | 89.5     | 64.1     | 80.8     | 70.5 | 62.8 | 65.7      | 53.2  | 26.28    | 62.7   |
    | Faster R-CNN | 68.3     | 88.7     | 90.0     | 82.5     | 89.4     | 89.0 | 82.7 | 84.8      | 25.4  | 136.72   | 369.8  |
    | YOLOv5n      | 72.6     | 92.4     | 98.5     | 85.1     | 93.3     | 87.8 | 83.9 | 88.4      | 109.8 | 2.50     | 7.1    |
    | YOLOv6       | 78.6     | 92.0     | 98.3     | 87.1     | 94.6     | 89.2 | 86.1 | 90.1      | 119.5 | 4.23     | 11.8   |
    | YOLOv7-Tiny  | 65.7     | 91.6     | 97.9     | 84.0     | 84.0     | 90.7 | 81.4 | 86.4      | 102.9 | 6.02     | 13.2   |
    | YOLOv8n      | 76.9     | 94.3     | 98.0     | 85.3     | 93.8     | 90.5 | 85.5 | 89.7      | 116.8 | 3.01     | 8.1    |
    | YOLOv9       | 75.0     | 91.6     | 98.6     | 83.0     | 93.5     | 88.5 | 85.2 | 88.3      | 111.7 | 2.27     | 8.2    |
    | YOLOv10n     | 68.9     | 93.8     | 97.4     | 81.8     | 93.4     | 88.3 | 81.5 | 84.9      | 108.5 | 2.71     | 8.4    |
    | Ours         | 79.4     | 95.9     | 99.2     | 88.5     | 96.1     | 93.2 | 89.4 | 92.5      | 123.8 | 2.92     | 7.9    |
Publication history
Received: 2024-12-05
Revised: 2025-02-16
Accepted: 2025-02-17
Published: 2025-04-25
