Small target detection in sonar images with multilevel feature screening and task dynamic alignment

Citation: Wang Y, Wang H H, Liu S D, et al. Small target detection in sonar images with multilevel feature screening and task dynamic alignment[J]. Opto-Electron Eng, 2024, 51(10): 240196. doi: 10.12086/oee.2024.240196

  • Fund Project:
    Tianjin Philosophy and Social Science Planning Project (TJGL19XSX-045)
  • Corresponding author: Liu Shudong, liushudong@tcu.edu.cn
  • CLC number: TP391.4

  • CSTR: 32245.14.oee.2024.240196

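  • Abstract: To address the difficulty, low accuracy, and frequent false and missed detections of small targets in sonar images, this paper proposes an improved small-target detection algorithm for sonar images based on YOLOv8s. First, because small targets in sonar images usually have low contrast and are easily submerged in noise, an efficient multilevel screening feature pyramid network (EMS-FPN) is proposed. Second, because the classification and localization branches of the decoupled head are independent, which increases the number of model parameters and makes it hard to adapt to targets at different scales, leading to poor small-target detection, a task dynamic alignment detection head (TDADH) module is designed. Finally, to verify the effectiveness of the proposed model, experiments were conducted on the URPC2021 dataset and the augmented SCTD sonar dataset: compared with YOLOv8s, mAP0.5 improved by 0.3 and 1.8 percentage points, respectively, and the number of parameters fell by 22.5%. The results show that the proposed method not only improves accuracy in sonar image target detection but also significantly reduces the number of model parameters.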

  • Overview: Sonar technology has important applications in the marine field and is widely used in seabed geological exploration, marine pollution monitoring, underwater target detection, and marine resource development. However, detecting small targets in sonar images remains challenging because sonar imaging is affected by many factors, such as the marine environment and the characteristics of underwater targets. Small targets, such as round cages and balls, suffer from weak signals, complex backgrounds, low resolution, and noise interference in sonar images, yet detecting them reliably is crucial for safe underwater navigation and for developing marine resources. To address the difficulty, low precision, and frequent false and missed detections of small targets in sonar images, a lightweight small-target detection algorithm for sonar images based on YOLOv8s with efficient multilevel feature fusion is proposed. First, because small targets in sonar images usually have low contrast and are easily overwhelmed by noise, an efficient multilevel screening feature pyramid network (EMS-FPN) module is proposed. Its screening mechanism highlights important features and suppresses irrelevant background noise, and it fuses features extracted at different scales to improve small-target detection. Second, because the classification and localization branches of the decoupled head are independent, the head increases the number of model parameters, provides no interaction between the two tasks, and adapts poorly to targets at different scales, which degrades small-target detection. The task dynamic alignment detection head (TDADH) module is therefore designed: a feature extractor learns task-interaction features from multiple convolutional layers to obtain joint features that adapt to targets at different scales. Finally, to validate the model, experiments are carried out on the URPC2021 and SCTD sonar datasets: compared with YOLOv8s, mAP0.5 improves by 0.3 and 1.8 percentage points, respectively, and the number of parameters is reduced by 22.5%. The results show that the proposed sonar image target detection algorithm improves accuracy while significantly reducing the number of model parameters.
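The screening idea described in the overview (highlight informative responses, suppress background) can be illustrated with a toy gate. The sketch below is not the authors' EMS-FPN or ELA-SFF module. It is a minimal, pure-Python illustration that assumes a single-channel feature map and uses only strip pooling followed by a sigmoid. The ELA attention cited as [22] additionally applies a 1D convolution and group normalization to the pooled vectors, and both are omitted here. The function name `strip_pool_gate` and the toy input are hypothetical.

```python
import math


def strip_pool_gate(feature):
    """Re-weight a 2D feature map (list of rows) by row and column attention.

    Simplified illustration only: no 1D convolution, no normalization.
    """
    h, w = len(feature), len(feature[0])
    # Strip pooling: average each row over its width and each column over its height.
    row_mean = [sum(row) / w for row in feature]
    col_mean = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    # Squash pooled responses into (0, 1) attention weights.
    row_w = [sigmoid(m) for m in row_mean]
    col_w = [sigmoid(m) for m in col_mean]
    # Scale every position by the weight of its row and of its column.
    return [[feature[i][j] * row_w[i] * col_w[j] for j in range(w)] for i in range(h)]


# Toy input: one strong response (a "small target") on a weak background.
feature = [
    [0.1, 0.1, 0.1],
    [0.1, 4.0, 0.1],
    [0.1, 0.1, 0.1],
]
gated = strip_pool_gate(feature)
contrast_before = feature[1][1] / feature[0][0]
contrast_after = gated[1][1] / gated[0][0]
print(contrast_after > contrast_before)  # True: the target stands out more after gating
```

Because every weight lies in (0, 1), the gate never amplifies a response. It only attenuates positions whose row and column carry little energy, which is why the target-to-background contrast rises in this toy case.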

  • Figure 1.  YOLOv8s network model

    Figure 2.  EMS-FPN module

    Figure 3.  ELA attention mechanism

    Figure 4.  ELA-SFF module

    Figure 5.  TDADH module

    Figure 6.  Task decomposition module

    Figure 7.  Improved YOLOv8s target detection network model

    Figure 8.  SCTD sonar image data augmentation. (a) Original image; (b) Random pixel removal; (c) Sharpening; (d) Brightness adjustment; (e) Hue adjustment; (f) Image flipping

    Figure 9.  Comparison of detection results of different methods on sonar images in URPC2021. (a) RetinaNet[28]; (b) PAA[29]; (c) CenterNet[30]; (d) SparseRCNN[31]; (e) YOLOF[32]; (f) TOOD[33]; (g) VarifocalNet[34]; (h) DSA-Net[35]

    Figure 10.  Comparison of detection results of different methods on sonar images in URPC2021. (a) YOLOv5; (b) YOLOv6; (c) YOLOv7; (d) YOLOv8s; (e) Proposed model

    Table 1.  Results of ablation experiments on the URPC2021 dataset

    Algorithm  HSFPN  EMS-FPN  TDADH  Precision/%  Recall/%  Params/M  mAP0.5/%
    YOLOv8s    ×      ×        ×      97.3         96.4      11.1      97.9
    YOLOv8s    ✓      ×        ×      97.4         96.1      7.1       97.8
    YOLOv8s    ×      ✓        ×      96.9         96.6      7.3       98.1
    YOLOv8s    ×      ×        ✓      96.9         96.2      8.8       97.8
    YOLOv8s    ×      ✓        ✓      97.4         96.8      8.6       98.2

    Table 2.  Results of ablation experiments for small-target categories in the URPC2021 dataset

    Algorithm  HSFPN  EMS-FPN  TDADH  Params/M  Precision/%  Recall/%  Ball/%  Cylinder/%  Tyre/%
    YOLOv8s    ×      ×        ×      11.1      97.3         96.4      98.6    96.8        96.7
    YOLOv8s    ✓      ×        ×      7.1       97.4         96.1      98.8    96.8        97.0
    YOLOv8s    ×      ✓        ×      7.3       96.9         96.6      98.9    97.4        97.5
    YOLOv8s    ×      ×        ✓      8.8       96.9         96.2      98.3    96.6        96.5
    YOLOv8s    ×      ✓        ✓      8.6       97.4         96.8      99.3    98.2        97.5

    Table 3.  Comparison of per-class accuracy and mAP0.5 with other methods on the URPC2021 dataset

    Algorithm         Cube/%  Ball/%  Cylinder/%  Human body/%  Tyre/%  Circle cage/%  Square cage/%  Metal bucket/%  mAP0.5/%
    RetinaNet[28]     92.6    94.0    83.0        86.5          72.7    83.3           92.0           58.5            82.8
    PAA[29]           95.3    93.6    88.2        94.1          73.7    91.5           95.1           81.4            89.1
    CenterNet[30]     91.9    96.8    84.2        96.4          89.5    85.2           97.7           88.1            91.2
    SparseRCNN[31]    97.4    96.8    89.0        97.6          89.2    93.1           98.5           88.0            93.7
    YOLOF[32]         95.5    91.9    87.1        92.0          66.7    72.9           88.8           62.0            82.1
    TOOD[33]          96.4    95.1    88.6        93.4          76.9    91.2           96.9           85.8            90.5
    VarifocalNet[34]  96.4    94.6    90.1        95.0          79.5    91.5           96.7           85.4            91.2
    DSA-Net[35]       97.9    98.0    94.2        98.4          94.9    93.2           99.2           94.0            96.2
    YOLOv5            97.9    98.3    95.9        98.8          97.5    96.0           99.4           95.5            97.4
    YOLOv6            98.0    98.7    96.2        99.4          96.9    96.1           99.3           97.4            97.7
    YOLOv7            96.2    97.5    92.9        94.9          91.3    91.6           98.5           98.0            95.1
    YOLOv8s           97.7    98.6    96.8        99.5          96.7    97.5           99.2           97.2            97.9
    Proposed model    97.7    99.3    98.2        99.5          97.5    97.3           98.7           97.8            98.2

    Table 4.  Results of comparison experiments on the SCTD dataset

    Algorithm  HSFPN  EMS-FPN  TDADH  Params/M  Precision/%  Recall/%  Ship(small)/%  mAP0.5/%
    YOLOv8s    ×      ×        ×      11.1      94.0         92.4      93.4           96.3
    YOLOv8s    ✓      ×        ×      7.1       95.2         91.5      95.4           97.0
    YOLOv8s    ×      ✓        ×      7.3       96.1         92.1      95.6           97.0
    YOLOv8s    ×      ×        ✓      8.8       97.4         94.1      97.3           97.4
    YOLOv8s    ×      ✓        ✓      8.6       97.1         94.4      97.8           98.1
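
The headline figures in the abstract can be checked against the table values with a few lines of arithmetic. The sketch below takes the baseline YOLOv8s row and the final row of the ablation tables (11.1 M and 8.6 M parameters; mAP0.5 of 97.9 and 98.2 on URPC2021, 96.3 and 98.1 on SCTD). The variable and function names are illustrative only. It shows that the parameter figure is a relative reduction, while the mAP0.5 gains are absolute differences in percentage points.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Values read from the tables: baseline YOLOv8s vs. the final model.
params_baseline, params_final = 11.1, 8.6   # millions of parameters
map_urpc = (97.9, 98.2)                     # mAP0.5 on URPC2021, in percent
map_sctd = (96.3, 98.1)                     # mAP0.5 on SCTD, in percent

param_cut = round(relative_reduction(params_baseline, params_final), 1)
gain_urpc = round(map_urpc[1] - map_urpc[0], 1)
gain_sctd = round(map_sctd[1] - map_sctd[0], 1)

print(param_cut)  # 22.5 (percent, relative)
print(gain_urpc)  # 0.3 (percentage points)
print(gain_sctd)  # 1.8 (percentage points)
```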
  • [1] Li H S, Xu C, Zhou T. High-resolution integrated detection of underwater topography and geomorphology based on multibeam interferometric echo sounder[J]. Appl Mech Mater, 2012, 212-213: 345–350. doi: 10.4028/www.scientific.net/AMM.212-213.345

    [2] Wang L, Ye X F, Wang S L, et al. ULO: an underwater light-weight object detector for edge computing[J]. Machines, 2022, 10(8): 629. doi: 10.3390/machines10080629

    [3] Wang Z Y, Ye X F, Han Y T, et al. Improved real-time target detection algorithm for similar multiple targets in complex underwater environment based on YOLOv3[C]//Global Oceans 2020: Singapore – U.S. Gulf Coast, Biloxi, 2020: 1–6. doi: 10.1109/IEEECONF38699.2020.9389108

    [4] Lange H, Vincent L M. Advanced gray-scale morphological filters for the detection of sea mines in side-scan sonar imagery[J]. Proc SPIE, 2000, 4038: 362–372. doi: 10.1117/12.396263

    [5] Zhang W Y, Zhou T, Li J H, et al. An efficient method for detection and quantitation of underwater gas leakage based on a 300-kHz multibeam sonar[J]. Remote Sens, 2022, 14(17): 4301. doi: 10.3390/rs14174301

    [6] Li J W, An W, Xu C, et al. Sunken oil detection and classification using MBES backscatter data[J]. Mar Pollut Bull, 2022, 180: 113795. doi: 10.1016/j.marpolbul.2022.113795

    [7] Zhou T, Si J K, Wang L Y, et al. Automatic detection of underwater small targets using forward-looking sonar images[J]. IEEE Trans Geosci Remote Sens, 2022, 60: 4207912. doi: 10.1109/TGRS.2022.3181417

    [8] Park C, Kim Y, Lee H, et al. Development of a 2 MHz sonar sensor for inspection of bridge substructures[J]. Sensors, 2018, 18(4): 1222. doi: 10.3390/s18041222

    [9] Zhao D D, Xie D H, Chen P, et al. Lightweight YOLOv5 sonar image object detection algorithm and implementation based on ZYNQ[J]. Opto-Electron Eng, 2024, 51(1): 230284. doi: 10.12086/oee.2024.230284

    [10] Abu A, Diamant R. A statistically-based method for the detection of underwater objects in sonar imagery[J]. IEEE Sensors J, 2019, 19(16): 6858–6871. doi: 10.1109/JSEN.2019.2912325

    [11] Negahdaripour S. Application of forward-scan sonar stereo for 3-D scene reconstruction[J]. IEEE J Oceanic Eng, 2020, 45(2): 547–562. doi: 10.1109/JOE.2018.2875574

    [12] Shang Z G, Zhao C H, Wan J. Application of multi-resolution analysis in sonar image denoising[J]. J Syst Eng Electron, 2008, 19(6): 1082–1089. doi: 10.1016/S1004-4132(08)60201-7

    [13] Jin Y, Ku B, Ahn J, et al. Nonhomogeneous noise removal from side-scan sonar images using structural sparsity[J]. IEEE Geosci Remote Sens Lett, 2019, 16(8): 1215–1219. doi: 10.1109/LGRS.2019.2895843

    [14] Wang Z, Zhang S W, Huang W Z, et al. Sonar image target detection based on adaptive global feature enhancement network[J]. IEEE Sensors J, 2022, 22(2): 1509–1530. doi: 10.1109/JSEN.2021.3131645

    [15] Zhao D D, Ye Y F, Chen P, et al. Sonar image denoising method based on residual and attention network[J]. Opto-Electron Eng, 2023, 50(6): 230017. doi: 10.12086/oee.2023.230017

    [16] Ge X Y, Wei N Y, Zhou H K, et al. Research on small underwater target detection technology based on side-scan sonar[J]. Digit Ocean Underwater Warf, 2023, 6(2): 155–161. doi: 10.19838/j.issn.2096-5753.2023.02.004

    [17] Wang C Y, Bochkovskiy A, Liao H Y M. Scaled-yolov4: scaling cross stage partial network[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13024–13033. doi: 10.1109/CVPR46437.2021.01283

    [18] Ge Z, Liu S T, Wang F, et al. YOLOX: exceeding yolo series in 2021[Z]. arXiv: 2107.08430, 2021. doi: 10.48550/arXiv.2107.08430

    [19] Chen Y F, Zhang C Y, Chen B, et al. Accurate leukocyte detection based on deformable-DETR and multi-level feature fusion for aiding diagnosis of blood diseases[J]. Comput Biol Med, 2024, 170: 107917. doi: 10.1016/j.compbiomed.2024.107917

    [20] Hou Q B, Zhou D Q, Feng J S. Coordinate attention for efficient mobile network design[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13708–13717. doi: 10.1109/CVPR46437.2021.01350

    [21] Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on Machine Learning, Lille, 2015: 448–456.

    [22] Xu W, Wan Y. ELA: efficient local attention for deep convolutional neural networks[Z]. arXiv: 2403.01123, 2024. doi: 10.48550/arXiv.2403.01123

    [23] Hou Q B, Zhang L, Cheng M M, et al. Strip pooling: rethinking spatial pooling for scene parsing[C]//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 4002–4011. doi: 10.1109/CVPR42600.2020.00406

    [24] Wu Y X, He K M. Group normalization[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, 2018: 3–19. doi: 10.1007/978-3-030-01261-8_1

    [25] Tian Z, Shen C H, Chen H, et al. FCOS: a simple and strong anchor-free object detector[J]. IEEE Trans Pattern Anal Mach Intell, 2022, 44(4): 1922–1933. doi: 10.1109/TPAMI.2020.3032166

    [26] Zhou Y, Chen S C, Wu K, et al. SCTD 1.0: sonar common target detection dataset[J]. Comput Sci, 2021, 48(11A): 334–339. doi: 10.11896/jsjkx.210100138

    [27] Xie K B, Yang J, Qiu K. A dataset with multibeam forward-looking sonar for underwater object detection[J]. Sci Data, 2022, 9(1): 739. doi: 10.1038/s41597-022-01854-w

    [28] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision, Venice, 2017: 2999–3007. doi: 10.1109/ICCV.2017.324

    [29] Kim K, Lee H S. Probabilistic anchor assignment with IoU prediction for object detection[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, 2020: 355–371. doi: 10.1007/978-3-030-58595-2_22

    [30] Zhou X Y, Wang D Q, Krähenbühl P. Objects as points[Z]. arXiv: 1904.07850, 2019. doi: 10.48550/arXiv.1904.07850

    [31] Sun P Z, Zhang R F, Jiang Y, et al. Sparse R-CNN: end-to-end object detection with learnable proposals[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 14449–14458. doi: 10.1109/CVPR46437.2021.01422

    [32] Chen Q, Wang Y M, Yang T, et al. You only look one-level feature[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13034–13043. doi: 10.1109/CVPR46437.2021.01284

    [33] Feng C J, Zhong Y J, Gao Y, et al. TOOD: task-aligned one-stage object detection[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision, Montreal, 2021: 3490–3499. doi: 10.1109/ICCV48922.2021.00349

    [34] Zhang H Y, Wang Y, Dayoub F, et al. VarifocalNet: an IoU-aware dense object detector[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 8510–8519. doi: 10.1109/CVPR46437.2021.00841

    [35] Li Z K, Xie Z J, Duan P H, et al. Dual spatial attention network for underwater object detection with sonar imagery[J]. IEEE Sensors J, 2024, 24(5): 6998–7008. doi: 10.1109/JSEN.2023.3336899

Publication history
Received:  2024-08-20
Revised:  2024-09-21
Accepted:  2024-09-21
Published:  2024-10-25
