Abstract
To improve the detection of small defects in steel surface inspection, an improved model, YOLOv8-SOE, is proposed. An FSCConv module is designed to process the P2-layer features: it compresses them and fuses them deeply with the P3-layer features, which increases the model's sensitivity to small-target features while avoiding the computational cost of an additional detection layer. A cross stage partial omni-kernel (CSP-OK) module is adopted to optimize multi-scale feature fusion and integrate features of different scales more efficiently. The SIoU loss function is introduced for bounding-box regression to further improve localization accuracy. Experimental results show that YOLOv8-SOE reaches an mAP of 80.7% on the NEU-DET dataset, 5.4 percentage points higher than the baseline model, and generalizes well to the VOC2012 dataset. The model improves small-target detection accuracy while maintaining high computational efficiency, and shows good application potential.
Key words:
- YOLOv8
- defect detection
- small object detection
- feature fusion
- loss function
Overview
In industrial applications such as steel surface defect detection, small-target detection remains challenging because conventional detection layers have limited resolution and struggle to capture fine-grained defect details. Although YOLOv8 performs well on multi-scale targets and in complex environments, it still struggles with small targets, especially tiny defects on steel surfaces. The usual remedy, adding a detection head on the P2 layer, increases computational overhead and inference time. To address this issue, this paper proposes an improved model, YOLOv8-SOE, that specifically targets small defects on steel surfaces while keeping the computational cost low.

First, a novel feature processing module, FSCConv, is introduced to handle the P2-layer features. FSCConv uses dilated convolutions to capture multi-scale contextual information while preserving the fine details of small targets. The processed P2 features are then fused with the P3-layer features, which improves small-target perception without an additional detection layer and the computational burden it would bring. Second, to optimize feature fusion, a cross stage partial network combined with Omni-Kernel (CSP-OK) is proposed. CSP-OK follows the CSPNet approach to reduce redundant gradient computation and integrates the Omni-Kernel to avoid repeatedly extracting similar information at different layers. This reduces redundant computation and makes better use of inter-layer features, giving a more efficient and detailed integration of multi-scale information. Third, the model uses the SIoU loss function for bounding-box regression. In addition to the overlap between the predicted and ground-truth boxes, this loss accounts for their center distance, angular deviation, and shape similarity, which provides a more comprehensive optimization target and improves localization accuracy.

Experimental results show that YOLOv8-SOE achieves a mean average precision (mAP) of 80.7% on the NEU-DET dataset, 5.4 percentage points above the baseline YOLOv8 model. The model also generalizes well to the VOC2012 dataset. Overall, YOLOv8-SOE improves small-target detection precision while maintaining high computational efficiency, making it a promising solution for real-world industrial defect detection.
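The four ingredients of the SIoU loss named in the Overview (overlap, center distance, angular deviation, and shape similarity) can be illustrated with a minimal sketch. It follows the formulation in the SIoU paper [17] as it is commonly implemented, and is not taken from the authors' code. The function name `siou_loss`, the `(x1, y1, x2, y2)` box format, the shape exponent `theta=4.0`, and the stabilizer `eps` are assumptions made for illustration.

```python
import math


def siou_loss(box, gt, theta=4.0, eps=1e-9):
    """Illustrative SIoU loss for two boxes given as (x1, y1, x2, y2)."""
    w, h = box[2] - box[0], box[3] - box[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]

    # Overlap term: plain IoU.
    inter_w = max(0.0, min(box[2], gt[2]) - max(box[0], gt[0]))
    inter_h = max(0.0, min(box[3], gt[3]) - max(box[1], gt[1]))
    inter = inter_w * inter_h
    union = w * h + w_gt * h_gt - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(box[2], gt[2]) - min(box[0], gt[0])
    ch = max(box[3], gt[3]) - min(box[1], gt[1])

    # Offset between the two box centers.
    dx = (gt[0] + gt[2] - box[0] - box[2]) / 2.0
    dy = (gt[1] + gt[3] - box[1] - box[3]) / 2.0
    sigma = math.hypot(dx, dy)

    # Angle cost: 0 when the centers are axis-aligned, 1 at 45 degrees.
    sin_alpha = min(abs(dy) / sigma, 1.0) if sigma > 0 else 0.0
    angle_cost = 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4.0) ** 2

    # Distance cost, weighted by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(w - w_gt) / (max(w, w_gt) + eps)
    omega_h = abs(h - h_gt) / (max(h, h_gt) + eps)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0


if __name__ == "__main__":
    print(siou_loss((0, 0, 10, 10), (0, 0, 10, 10)))   # identical boxes
    print(siou_loss((0, 0, 10, 10), (2, 2, 12, 12)))   # shifted box
    print(siou_loss((0, 0, 10, 10), (20, 0, 30, 10)))  # disjoint boxes
```

Unlike plain IoU, the loss keeps growing with the center distance once the boxes no longer overlap, so it still gives a useful signal for predictions that miss a small defect entirely.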
Table 1. Experimental results for FSCConv at different positions

| Scheme | Precision | Recall | FLOPs/G | FPS | mAP/% |
|---|---|---|---|---|---|
| Baseline model | 0.759 | 0.722 | 8.1 | 218.2 | 75.3 |
| A | 0.827 | 0.677 | 11.6 | 112.2 | 77.0 |
| B | 0.754 | 0.768 | 9.1 | 187.9 | 79.4 |

Table 2. Experimental results for the CSP-OK module at different positions

| Scheme | Precision | Recall | FLOPs/G | FPS | mAP/% |
|---|---|---|---|---|---|
| Baseline model | 0.759 | 0.722 | 8.1 | 218.2 | 75.3 |
| C | 0.766 | 0.779 | 19.9 | 118.8 | 79.1 |
| D | 0.762 | 0.743 | 9.8 | 116.6 | 79.1 |

Table 3. Results of ablation experiments

| FSCConv | CSP-OK | SIoU | Precision | Recall | FLOPs/G | FPS | mAP/% | rm/% | rf/% |
|---|---|---|---|---|---|---|---|---|---|
| - | - | - | 0.759 | 0.722 | 8.1 | 218.2 | 75.3 | 27.33 | 21.78 |
| √ | - | - | 0.760 | 0.762 | 9.1 | 185.3 | 79.2 | 26.18 | 21.40 |
| √ | - | √ | 0.754 | 0.768 | 9.1 | 187.9 | 79.4 | 24.89 | 20.48 |
| - | √ | - | 0.762 | 0.743 | 9.8 | 116.6 | 79.1 | 25.72 | 23.79 |
| - | √ | √ | 0.731 | 0.786 | 9.8 | 137.2 | 79.3 | 21.58 | 26.89 |
| - | - | √ | 0.738 | 0.761 | 8.1 | 215.7 | 77.4 | 23.88 | 26.21 |
| √ | √ | - | 0.783 | 0.746 | 11.8 | 164.0 | 80.4 | 24.93 | 19.41 |
| √ | √ | √ | 0.739 | 0.767 | 11.8 | 172.0 | 80.7 | 21.48 | 23.79 |

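To make the accuracy and cost trade-off in the ablation results easier to read, the short sketch below computes, for a few rows of Table 3, the mAP gain in percentage points and the relative change in FLOPs and FPS against the baseline. The numbers are copied from the table. The helper names `relative_change` and `tradeoff` are hypothetical and only serve this illustration.

```python
# (FLOPs in G, FPS, mAP in %) copied from Table 3.
BASELINE = (8.1, 218.2, 75.3)
CONFIGS = {
    "FSCConv": (9.1, 185.3, 79.2),
    "CSP-OK": (9.8, 116.6, 79.1),
    "SIoU": (8.1, 215.7, 77.4),
    "FSCConv + CSP-OK + SIoU": (11.8, 172.0, 80.7),
}


def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0


def tradeoff(config, baseline=BASELINE):
    """mAP gain (percentage points) and relative FLOPs/FPS change (%)."""
    flops, fps, m_ap = config
    base_flops, base_fps, base_map = baseline
    return {
        "map_gain_points": m_ap - base_map,
        "flops_change_pct": relative_change(flops, base_flops),
        "fps_change_pct": relative_change(fps, base_fps),
    }


if __name__ == "__main__":
    for name, config in CONFIGS.items():
        t = tradeoff(config)
        print(
            f"{name}: mAP {t['map_gain_points']:+.1f} points, "
            f"FLOPs {t['flops_change_pct']:+.1f}%, FPS {t['fps_change_pct']:+.1f}%"
        )
```

Read this way, the SIoU loss changes accuracy at no FLOPs cost because it only affects training, while the two architectural modules account for the extra computation.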
Table 4. Results of comparative experiments (columns Cr to Sc give the per-class AP/%)

| Model | Cr | In | Pa | Ps | Rs | Sc | FLOPs/G | FPS | mAP/% |
|---|---|---|---|---|---|---|---|---|---|
| Faster-RCNN | 37.2 | 84.1 | 89.7 | 82.3 | 72.7 | 93.2 | 134.0 | 34.200 | 76.5 |
| RT-DETR | 44.5 | 77.9 | 89.0 | 67.5 | 56.1 | 92.2 | 57.0 | 48.100 | 72.5 |
| YOLOv5s | 53.4 | 79.8 | 93.2 | 81.1 | 51.5 | 96.0 | 15.8 | 74.505 | 75.8 |
| YOLOv7 | 57.2 | 84.6 | 91.1 | 84.9 | 54.8 | 93.3 | 103.2 | 65.790 | 77.7 |
| YOLOX | 30.2 | 77.5 | 85.2 | 75.2 | 39.0 | 88.8 | 13.32 | 43.200 | 66.6 |
| YOLOv8n | 44.1 | 75.7 | 91.9 | 82.7 | 61.0 | 96.3 | 8.1 | 218.200 | 75.3 |
| YOLOv10n | 42.2 | 78.2 | 88.9 | 76.9 | 55.3 | 86.5 | 6.5 | 192.500 | 71.3 |
| GBS-YOLOv7t[24] | 32.7 | 69.6 | 92.4 | 96.5 | 57.7 | 88.6 | - | 104.1 | 72.9 |
| PIC2f-YOLO[25] | - | - | - | - | - | - | 10.6 | 80 | 78.0 |
| RFB-YOLOv5-E[26] | 52.1 | 75.5 | 95.5 | 97.3 | 62.3 | 92.4 | 22.4 | 122 | 79.2 |
| Ref. [27] | 39.5 | 86.0 | 92.1 | 78.9 | 62.6 | 85.3 | 12.2 | - | 74.1 |
| Ours | 60.0 | 85.6 | 93.1 | 84.8 | 62.2 | 98.2 | 11.8 | 172.0 | 80.7 |

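The mAP column in Table 4 is the mean of the six per-class AP values. As a quick worked check, the sketch below recomputes it for the YOLOv8n and Ours rows using the values copied from the table. The helper name `mean_ap` and the dictionary layout are illustrative assumptions.

```python
# Per-class AP (%) on NEU-DET copied from Table 4,
# in the column order Cr, In, Pa, Ps, Rs, Sc.
PER_CLASS_AP = {
    "YOLOv8n": [44.1, 75.7, 91.9, 82.7, 61.0, 96.3],
    "Ours": [60.0, 85.6, 93.1, 84.8, 62.2, 98.2],
}

# mAP (%) as reported in the last column of Table 4.
REPORTED_MAP = {"YOLOv8n": 75.3, "Ours": 80.7}


def mean_ap(aps):
    """Mean average precision as the unweighted mean of per-class AP values."""
    return sum(aps) / len(aps)


if __name__ == "__main__":
    for model, aps in PER_CLASS_AP.items():
        print(
            f"{model}: mean of class APs = {mean_ap(aps):.2f}, "
            f"reported mAP = {REPORTED_MAP[model]}"
        )
```

The per-class view also shows where the improvement concentrates: the largest change between the two rows is in the Cr column.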
Table 5. Comparative experimental results under different lighting environments

| Dataset | Model | Precision | Recall | mAP/% |
|---|---|---|---|---|
| NEU-DET | YOLOv8n | 0.759 | 0.722 | 75.3 |
| NEU-DET | Ours | 0.739 | 0.767 | 80.7 |
| NEU-DET-brightness | YOLOv8n | 0.730 | 0.672 | 74.9 |
| NEU-DET-brightness | Ours | 0.757 | 0.745 | 79.6 |

Table 6. Comparative experimental results in blurred and noisy environments

| Dataset | Model | Precision | Recall | mAP/% |
|---|---|---|---|---|
| NEU-DET | YOLOv8n | 0.759 | 0.722 | 75.3 |
| NEU-DET | Ours | 0.739 | 0.767 | 80.7 |
| NEU-DET-augmented | YOLOv8n | 0.644 | 0.688 | 72.7 |
| NEU-DET-augmented | Ours | 0.789 | 0.726 | 78.0 |

Table 7. Comparative experimental results on the VOC2012 dataset

| Model | Precision | Recall | mAP/% |
|---|---|---|---|
| YOLOv5s | 0.724 | 0.646 | 68.2 |
| YOLOX | 0.715 | 0.608 | 61.6 |
| YOLOv10n | 0.702 | 0.536 | 60.3 |
| YOLOv8n | 0.689 | 0.569 | 63.0 |
| Ours | 0.754 | 0.621 | 69.4 |

References
[1] Luo Q W, Fang X X, Liu L, et al. Automated visual defect detection for flat steel surface: a survey[J]. IEEE Trans Instrum Meas, 2020, 69(3): 626−644. doi: 10.1109/TIM.2019.2963555
[2] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580–587. https://doi.org/10.1109/CVPR.2014.81.
[3] Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440–1448. https://doi.org/10.1109/ICCV.2015.169.
[4] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans Patt Anal Mach Intell, 2017, 39(6): 1137−1149. doi: 10.1109/TPAMI.2016.2577031
[5] Shi X C, Zhou S K, Tai Y C, et al. An improved faster R-CNN for steel surface defect detection[C]//2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022: 1–5. https://doi.org/10.1109/MMSP55362.2022.9949350.
[6] Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision, 2016: 21–37. https://doi.org/10.1007/978-3-319-46448-0_2.
[7] Zhao Y A, Lv W Y, Xu S L, et al. DETRs beat YOLOs on real-time object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 16965–16974. https://doi.org/10.1109/CVPR52733.2024.01605.
[8] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779–788. https://doi.org/10.1109/CVPR.2016.91.
[9] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7263–7271. https://doi.org/10.1109/CVPR.2017.690.
[10] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2980–2988. https://doi.org/10.1109/ICCV.2017.324.
[11] Li J J, Chen M X. DEW-YOLO: an efficient algorithm for steel surface defect detection[J]. Appl Sci, 2024, 14(12): 5171. doi: 10.3390/app14125171
[12] Yang S X, Xie Y, Wu J Q, et al. CFE-YOLOv8s: improved YOLOv8s for steel surface defect detection[J]. Electronics, 2024, 13(14): 2771. doi: 10.3390/electronics13142771
[13] Peng J H, Zhang C, Gao Q, et al. Steel defect detection based on improved YOLOv8 algorithm[J/OL]. Comput Eng, 1-9. https://doi.org/10.19678/j.issn.1000-3428.00EC0069283. (in Chinese)
[14] Zhang X R, Wang Y L, Fang H S. Steel surface defect detection algorithm based on ESI-YOLOv8[J]. Mater Res Express, 2024, 11(5): 056509. doi: 10.1088/2053-1591/ad46ec
[15] Huang Y, Tan W Z, Li L, et al. WFRE-YOLOv8s: a new type of defect detector for steel surfaces[J]. Coatings, 2023, 13(12): 2011. doi: 10.3390/coatings13122011
[16] Cui Y N, Ren W Q, Knoll A. Omni-kernel network for image restoration[C]//Proceedings of the 38th AAAI Conference on Artificial Intelligence, 2024: 1426–1434. https://doi.org/10.1609/aaai.v38i2.27907.
[17] Gevorgyan Z. SIoU loss: more powerful learning for bounding box regression[Z]. arXiv: 2205.12740, 2022. https://arxiv.org/abs/2205.12740.
[18] Wang C Y, Liao H Y M, Wu Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020: 390–391. https://doi.org/10.1109/CVPRW50498.2020.00203.
[19] Zheng Z H, Wang P, Ren D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Trans Cybern, 2022, 52(8): 8574−8586. doi: 10.1109/TCYB.2021.3095305
[20] Tsai F J, Peng Y T, Lin Y Y, et al. Stripformer: strip transformer for fast image deblurring[C]//Proceedings of the 17th European Conference on Computer Vision, 2022: 146–162. https://doi.org/10.1007/978-3-031-19800-7_9.
[21] Bao Y Q, Song K C, Liu J, et al. Triplet-graph reasoning network for few-shot metal generic surface defect segmentation[J]. IEEE Trans Instrum Meas, 2021, 70: 5011111. doi: 10.1109/TIM.2021.3083561
[22] Song K C, Yan Y H. A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects[J]. Appl Surf Sci, 2013, 285: 858−864. doi: 10.1016/j.apsusc.2013.09.002
[23] He Y, Song K C, Meng Q G, et al. An end-to-end steel surface defect detection approach via fusing multiple hierarchical features[J]. IEEE Trans Instrum Meas, 2020, 69(4): 1493−1504. doi: 10.1109/TIM.2019.2915404
[24] Liang L M, Long P W, Lu B H, et al. Improvement of GBS-YOLOv7t for steel surface defect detection[J]. Opto-Electron Eng, 2024, 51(5): 240044. doi: 10.12086/oee.2024.240044 (in Chinese)
[25] Hu Y L, Yang J, Xu C Y, et al. PIC2f-YOLO: a lightweight method for the detection of metal surface defects[J]. Opto-Electron Eng, 2025, 52(1): 240250. doi: 10.12086/oee.2025.240250 (in Chinese)
[26] Huang S Q, Huang J G. Improved steel defect detection method based on enhanced fusion of RFB and YOLOv5 features[J]. Comput Eng, 2025, 51(4): 249−260. doi: 10.19678/j.issn.1000-3428.0068476 (in Chinese)
[27] Yang L S, Li M J, Hu J W, et al. Strip steel surface defect detection algorithm based on improved YOLOv7-tiny[J]. Comput Eng, 2025, 51(1): 208−215. doi: 10.19678/j.issn.1000-3428.0068397 (in Chinese)
[28] Everingham M, Eslami S M A, Van Gool L, et al. The pascal visual object classes challenge: A retrospective[J]. Int J Comput Vis, 2015, 111(1): 98−136. doi: 10.1007/s11263-014-0733-5