面向SAR图像舰船检测的多粒度特征与形位相似度量方法

李士博,肖振久,曲海成,等. 面向SAR图像舰船检测的多粒度特征与形位相似度量方法[J]. 光电工程,2025,52(2): 240254. doi: 10.12086/oee.2025.240254
引用本文: 李士博,肖振久,曲海成,等. 面向SAR图像舰船检测的多粒度特征与形位相似度量方法[J]. 光电工程,2025,52(2): 240254. doi: 10.12086/oee.2025.240254
Li S B, Xiao Z J, Qu H C, et al. Multi-granularity feature and shape-position similarity metric method for ship detection in SAR images[J]. Opto-Electron Eng, 2025, 52(2): 240254. doi: 10.12086/oee.2025.240254
Citation: Li S B, Xiao Z J, Qu H C, et al. Multi-granularity feature and shape-position similarity metric method for ship detection in SAR images[J]. Opto-Electron Eng, 2025, 52(2): 240254. doi: 10.12086/oee.2025.240254

面向SAR图像舰船检测的多粒度特征与形位相似度量方法

  • 基金项目:
    辽宁省高等学校基本科研项目 (LJKMZ20220699);辽宁工程技术大学学科创新团队资助项目 (LNTU20TD-23)
详细信息

Multi-granularity feature and shape-position similarity metric method for ship detection in SAR images

  • Fund Project: The Scientific Research Foundation of the Higher Education Institutions of Liaoning Province (LJKMZ20220699), Discipline Innovation Team of Liaoning Technical University (LNTU20TD-23)
More Information
  • 针对合成孔径雷达 (SAR)图像背景复杂、目标尺度变化大,尤其在小目标密集场景中容易出现误检和漏检问题,提出一种面向SAR图像舰船检测的多粒度特征与形位相似度量方法。在特征提取阶段,设计包含双分支多粒度特征聚合结构。一个分支通过Haar小波变换对特征图级联分解,以扩大全局感受野,从而提取粗粒度特征;另一分支引入空间和通道重建卷积,用于捕捉细节纹理信息,以减少特征图的上下文信息损失。两分支通过协同利用局部和非局部特征的相互作用,有效抑制复杂背景和杂波干扰,实现多尺度特征的精确提取。在检测回归阶段,利用欧几里得距离,并结合位置与形状信息,提出形位相似度量方法,以解决小目标密集场景中位置偏差敏感性问题,从而平衡正负样本的分配。在SSDD和HRSID数据集上与双阶段、单阶段及DETR系列共11种检测器进行综合对比,本文方法在两数据集上mAP和mAP50分别达到68.8%、98.3%和70.8%、93.8%。此外,模型参数量仅为2.4 M,计算量为6.4 GFLOPs,优于对比方法。本文方法在复杂背景和不同尺度舰船目标下表现出优异的检测性能,在降低误检率和漏检率的同时,具有更低的模型参数量和计算量。

  • Overview: Synthetic aperture radar (SAR) images have all-weather and all-day observation capabilities, can penetrate clouds, rain, and fog, provide high-resolution images, and achieve large-scale coverage and fast scanning. Based on these advantages, SAR images have been widely used in ship detection, such as maritime patrol, search and rescue, fishery supervision, traffic management, and military surveillance. However, in actual detection, due to factors such as complex background interference, large changes in target scale, and dense distribution of small targets, missed detections and false detections are prone to occur. This paper proposes a multi-granularity feature and shape-position similarity method for ship detection in SAR images to meet these challenges and improve the detection effect. In the feature extraction stage, a dual-branch multi-granularity feature aggregation (MGFA) structure is designed to cope with the complex background and multi-scale targets in SAR images. MGFA uses two branches, coarse-grained and fine-grained, for bidirectional feature extraction. The coarse-grained feature extraction branch is implemented through the Haar wavelet transform, which can decompose the input features at multiple levels, thereby expanding the receptive field of the model and enhancing the ability to capture global context information. The fine-grained feature extraction branch uses spatial and channel reconstruction convolution (SCConv), which focuses more on the extraction of details and local information. SCConv can efficiently capture the detailed texture of small targets, reduce the influence of clutter interference and background noise, and thus improve the accuracy of small target detection. In the detection regression stage, the shape-position similarity (SPS) metric is proposed using Euclidean distance to replace the traditional IoU. For small ships, SPS solves the sensitivity problem of position deviation through center distance, and can accurately reflect the position deviation even when there is no overlap in the bounding box. For large ships, SPS introduces shape constraints, enhances the focus on the shape of the prediction box, and makes up for the deficiency of IoU focusing only on position overlap. By considering the similarity of position and shape, SPS effectively addresses the problem of ship detection in SAR images with dense small targets and significant scale changes. In a comprehensive comparison with 11 detectors from one-stage, two-stage, and DETR series on the SSDD and HRSID datasets, our method achieves mAP scores of 68.8% and 98.3%, and mAP50 scores of 70.8% and 93.8%, respectively. In addition, our model is highly efficient, with just 2.4 M parameters and a computational load of only 6.4 GFLOPs, outperforming the comparison methods.

  • 加载中
  • 图 1  本文方法整体网络结构图

    Figure 1.  Diagram of the overall network structure of the proposed method

    图 2  MGFA网络结构图

    Figure 2.  MGFA network structure diagram

    图 3  利用HWT增大感受野

    Figure 3.  Increase the receptive field with HWT

    图 4  单通道下二级HWT分解过程

    Figure 4.  Two-level HWT decomposition process in a single channel

    图 5  模型在使用HWT前后的感受野变化。使用HWT前(a) Stage2、(b) Stage3、(c) Stage4、(d) Stage5的感受野; 使用HWT后(e) Stage2、(f) Stage3、(g) Stage4、(h) Stage5的感受野

    Figure 5.  Changes in receptive field of the model before and after using HWT. Receptive fields of (a) Stage2, (b) Stage3, (c) Stage4, and (d) Stage5 before HWT; Receptive fields of (e) Stage2, (f) Stage3, (g) Stage4, and (h) Stage5 after HWT

    图 6  不同尺寸舰船IoU变化情况。(a) 小型舰船IoU变化;(b) 大型舰船IoU变化

    Figure 6.  IoU changes of ships with different sizes. (a) Changes in IoU for small ship; (b) Changes in IoU for large ship

    图 7  不同度量在不同偏差下的仿真对比。(a)偏差;(b) IoU偏移曲线; (c) SPS偏移曲线

    Figure 7.  Simulation comparison of different metrics under different deviations. (a) Deviation; (b) IoU deviation curves; (c) SPS deviation curves

    图 8  SSDD和HRSID舰船目标分布。(a) SSDD;(b) HRSID

    Figure 8.  SSDD and HRSID ship target distributions. (a) SSDD; (b) HRSID

    图 9  使用SPS前后模型回归损失、精度和召回率的变化情况。(a)回归损失;(b)精度;(d)召回率

    Figure 9.  Changes in regression loss, accuracy, and recall rates of the model before and after using SPS. (a) Regression loss; (b) Precision; (c) Recall

    图 10  不同方法PR曲线对比

    Figure 10.  Comparison of PR curves of different methods

    图 11  不同方法性能对比

    Figure 11.  Comparison of performances of different methods

    图 12  SSDD数据集上不同方法的可视化对比。(a)真实标注;(b) Dynamic R-cnn;(c) YOLOv8n;(d) YOLO11n;(e) Mamba YOLO;(f) 本文;(g) DINO;(h) RT-DETR

    Figure 12.  Visual comparison of different methods on SSDD dataset. (a) True labeling; (b) Dynamic R-cnn; (c) YOLOv8n; (d) YOLO11n; (e) Mamba YOLO; (f) Ours; (g) DINO; (h) RT-DETR

    图 13  HRSID数据集上不同方法的可视化对比。(a)真实标注;(b) Dynamic R-CNN;(c) YOLOv8n;(d) YOLO11n;(e) Mamba YOLO;(f) 本文;(g) DINO;(h) RT-DETR

    Figure 13.  Visual comparison of different methods on HRSID dataset. (a) True labeling; (b) Dynamic R-CNN; (c) YOLOv8n; (d) YOLO11n; (e) Mamba YOLO; (f) Ours; (g) DINO; (h) RT-DETR

    表 1  HWT中不同配置对模型性能影响

    Table 1.  Influence of different configurations in HWT on model performance

    LevelKernel sizeRFSmAP/%mAP50/%Params/kFLOPs/M
    13 × 3667.095.82.97.4
    5 × 51067.496.38.120.5
    7 × 71467.896.915.740.1
    23 × 31267.696.85.38.3
    5 × 52068.197.814.523.0
    7 × 72868.197.928.345.2
    33 × 32467.997.47.68.5
    5 × 54067.697.120.923.7
    7 × 75667.397.040.846.4
    下载: 导出CSV

    表 2  MGFA消融实验结果

    Table 2.  Results of MGFA ablation experiments

    Coarse-grained featureFine-grained featuremAP/%mAP50/%
    67.697.5
    66.995.8
    68.197.8
    下载: 导出CSV

    表 3  权重系数对模型性能的影响

    Table 3.  Effect of weighting coefficients on model performance

    α00.10.20.30.40.50.60.70.80.91.0
    mAP/%67.667.567.767.867.867.767.367.266.966.966.8
    mAP50/%96.796.797.197.397.196.996.496.496.095.895.6
    下载: 导出CSV

    表 4  本文方法消融实验结果

    Table 4.  Results of ablation experiments

    MGFASPSmAP/%mAP50/%
    67.296.1
    68.197.8
    67.897.3
    68.898.3
    下载: 导出CSV

    表 5  SSDD数据集上不同度量对比

    Table 5.  Comparison of different metrics on SSDD dataset

    MetricCIoUGIoUDIoUEIoUSIoUMPDIoUPowerful-IoUSPS
    mAP/%67.267.267.466.867.167.667.567.8
    mAP50/%96.196.096.696.496.196.696.897.3
    下载: 导出CSV

    表 6  SSDD数据集上不同方法对比

    Table 6.  Comparison of different methods on SSDD dataset

    Method mAP/% mAP50/% mAPs/% mAPm/% mAPl/% Params/M FLOPs/G
    Two
    stage
    Faster R-CNN 60.7 92.6 55.4 70.1 69.0 41.4 103.2
    Dynamic R-CNN 66.9 96.1 65.1 74.3 66.2 41.3 107.3
    Sparse R-CNN 54.4 91.2 50.2 62.1 60.3 106.0 77.5
    One
    stage
    TOOD 67.1 96.7 62.8 75.0 83.7 32.0 95.3
    RTMDet 68.0 97.7 66.5 78.7 74.5 4.9 8.0
    YOLOv8n 68.3 97.5 65.8 75.5 62.9 3.2 8.7
    YOLO11n 67.2 96.1 64.5 72.0 60.7 2.6 6.5
    Mamba YOLO 68.6 97.1 65.1 74.4 68.4 6.1 14.3
    Ours 68.8 98.3 66.8 72.5 63.1 2.4 6.4
    DETR
    like
    DDQ 67.1 96.4 58.1 72.0 80.1 48.3 123.3
    DINO 69.5 98.1 63.2 75.6 85.0 47.5 139.0
    RT-DETR 64.2 94.0 59.7 69.7 57.0 36.0 100.2
    下载: 导出CSV

    表 7  HRSID数据集上不同度量对比

    Table 7.  Comparison of different metrics on HRSID dataset

    MetricCIoUGIoUDIoUEIoUSIoUMPDIoUPowerful-IoUSPS
    mAP/%68.568.768.966.668.168.669.369.6
    mAP50/%91.392.091.891.091.191.192.292.8
    下载: 导出CSV

    表 8  HRSID数据集上不同方法对比

    Table 8.  Comparison of different methods on HRSID dataset

    Method mAP/% mAP50/% mAPs/% mAPm/% mAPl/% Params/M FLOPs/G
    Two
    stage
    Faster R-CNN 63.5 85.4 48.8 78.7 49.9 41.4 103.2
    Dynamic R-CNN 63.8 85.1 48.7 78.2 55.7 41.3 107.3
    Sparse R-CNN 61.0 83.4 46.4 75.2 46.6 106.0 77.5
    One
    stage
    TOOD 70.3 92.0 58.1 82.2 67.6 32.0 95.3
    RTMDet 69.0 90.5 55.9 81.3 63.7 4.9 8.0
    YOLOv8n 69.4 93.6 57.8 79.7 57.6 3.2 8.7
    YOLO11n 68.5 91.3 57.4 78.5 53.2 2.6 6.5
    Mamba YOLO 70.4 93.7 58.6 80.8 67.7 6.1 14.3
    Ours 70.8 93.8 60.2 79.3 57.9 2.4 6.4
    DETR
    like
    DDQ 70.5 93.4 59.4 81.1 71.3 48.3 123.3
    DINO 69.7 92.5 59.3 79.9 62.7 47.5 139.0
    RT-DETR 61.9 89.5 48.1 75.0 42.6 36.0 100.2
    下载: 导出CSV
  • [1]

    Ai J Q, Yang X Z, Song J T, et al. An adaptively truncated clutter-statistics-based two-parameter CFAR detector in SAR imagery[J]. IEEE J Oceanic Eng, 2018, 43(1): 267−279. doi: 10.1109/joe.2017.2768198

    [2]

    Eldhuset K. An automatic ship and ship wake detection system for spaceborne SAR images in coastal regions[J]. IEEE Trans Geosci Remote Sens, 1996, 34(4): 1010−1019. doi: 10.1109/36.508418

    [3]

    Liu T, Zhang J F, Gao G, et al. CFAR ship detection in polarimetric synthetic aperture radar images based on whitening filter[J]. IEEE Trans Geosci Remote Sens, 2020, 58(1): 58−81. doi: 10.1109/tgrs.2019.2931353

    [4]

    Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 2015: 1440–1448. https://doi.org/10.1109/iccv.2015.169.

    [5]

    Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137−1149. doi: 10.1109/tpami.2016.2577031

    [6]

    Zhang H K, Chang H, Ma B P, et al. Dynamic R-CNN: towards high quality object detection via dynamic training[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, 2020: 260–275. https://doi.org/10.1007/978-3-030-58555-6_16.

    [7]

    Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision, Amsterdam, 2016: 21–37. https://doi.org/10.1007/978-3-319-46448-0_2.

    [8]

    Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 2016: 779–788. https://doi.org/10.1109/cvpr.2016.91.

    [9]

    Tian Z, Shen C H, Chen H, et al. FCOS: a simple and strong anchor-free object detector[J]. IEEE Trans Pattern Anal Mach Intell, 2022, 44(4): 1922−1933. doi: 10.1109/tpami.2020.3032166

    [10]

    Guo H Y, Yang X, Wang N N, et al. A CenterNet++ model for ship detection in SAR images[J]. Pattern Recognit, 2021, 112: 107787. doi: 10.1016/j.patcog.2020.107787

    [11]

    徐安林, 杜丹, 王海红, 等. 结合层次化搜索与视觉残差网络的光学舰船目标检测方法[J]. 光电工程, 2021, 48(4): 200249. doi: 10.12086/oee.2021.200249

    Xu A L, Du D, Wang H H, et al. Optical ship target detection method combining hierarchical search and visual residual network[J]. Opto-Electron Eng, 2021, 48(4): 200249. doi: 10.12086/oee.2021.200249

    [12]

    Yang X, Zhang X, Wang N N, et al. A robust one-stage detector for multiscale ship detection with complex background in massive SAR images[J]. IEEE Trans Geosci Remote Sens, 2022, 60: 5217712. doi: 10.1109/tgrs.2021.3128060

    [13]

    丁俊华, 袁明辉. 基于双分支多尺度融合网络的毫米波SAR图像多目标语义分割方法[J]. 光电工程, 2023, 50(12): 230242. doi: 10.12086/oee.2023.230242

    Ding J H, Yuan M H. A multi-target semantic segmentation method for millimetre wave SAR images based on a dual-branch multi-scale fusion network[J]. Opto-Electron Eng, 2023, 50(12): 230242. doi: 10.12086/oee.2023.230242

    [14]

    Bai L, Yao C, Ye Z, et al. Feature enhancement pyramid and shallow feature reconstruction network for SAR ship detection[J]. IEEE J Sel Top Appl Earth Obs Remote Sens, 2023, 16: 1042−1056. doi: 10.1109/jstars.2022.3230859

    [15]

    Tang J, Han Y H, Xian Y T. SAR-ShipSwin: enhancing SAR ship detection with robustness in complex environment[J]. J Supercomput, 2024, 80(14): 2079−20812. doi: 10.1007/s11227-024-06237-z

    [16]

    Cao S, Zhao C X, Dong J, et al. Ship detection in synthetic aperture radar images under complex geographical environments, based on deep learning and morphological networks[J]. Sensors, 2024, 24(13): 4290. doi: 10.3390/s24134290

    [17]

    Zhang T W, Zhang X L, Ke X. Quad-FPN: a novel quad feature pyramid network for SAR ship detection[J]. Remote Sens, 2021, 13(14): 2771. doi: 10.3390/rs13142771

    [18]

    Tang G, Zhao H R, Claramunt C, et al. PPA-Net: pyramid pooling attention network for multi-scale ship detection in SAR images[J]. Remote Sens, 2023, 15(11): 2855. doi: 10.3390/rs15112855

    [19]

    袁志安, 谷雨, 马淦. 面向多类别舰船多目标跟踪的改进CSTrack算法[J]. 光电工程, 2023, 50(12): 230218. doi: 10.12086/oee.2023.230218

    Yuan Z A, Gu Y, Ma G. Improved CSTrack algorithm for multi-class ship multi-object tracking[J]. Opto-Electron Eng, 2023, 50(12): 230218. doi: 10.12086/oee.2023.230218

    [20]

    Li X N, Chen P, Yang J S, et al. TKP-net: a three keypoint detection network for ships using SAR imagery[J]. IEEE J Sel Top Appl Earth Obs Remote Sens, 2024, 17: 364−376. doi: 10.1109/jstars.2023.3329252

    [21]

    Tang Y, Wang S G, Wei J, et al. Scene-aware data augmentation for ship detection in SAR images[J]. Int J Remote Sens, 2024, 45(10): 3396−3411. doi: 10.1080/01431161.2024.2343433

    [22]

    Chen S W, Mei L Y. Structure similarity virtual map generation network for optical and SAR image matching[J]. Front Phys, 2024, 12: 1287050. doi: 10.3389/fphy.2024.1287050

    [23]

    Li Y X, Li X, Li W J, et al. SARDet-100k: towards open-source benchmark and toolkit for large-scale SAR object detection[Z]. arXiv: 2403.06534, 2024. https://doi.org/10.48550/arxiv.2403.06534.

    [24]

    Khanam R, Hussain M. YOLOv11: an overview of the key architectural enhancements[Z]. arXiv: 2410.17725, 2024. https://doi.org/10.48550/arXiv.2410.17725.

    [25]

    Feng C J, Zhong Y J, Gao Y, et al. TOOD: task-aligned one-stage object detection[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, 2021: 3490–3499. https://doi.org/10.1109/ICCV48922.2021.00349.

    [26]

    Li J F, Wen Y, He L H. SCConv: spatial and channel reconstruction convolution for feature redundancy[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, 2023: 6153–6162. https://doi.org/10.1109/CVPR52729.2023.00596.

    [27]

    Zhang T W, Zhang X L, Li J W, et al. SAR ship detection dataset (SSDD): official release and comprehensive data analysis[J]. Remote Sens, 2021, 13(18): 3690. doi: 10.3390/rs13183690

    [28]

    Wei S J, Zeng X F, Qu Q Z, et al. HRSID: a high-resolution SAR images dataset for ship detection and instance segmentation[J]. IEEE Access, 2020, 8: 120234−120254. doi: 10.1109/access.2020.3005861

    [29]

    Zheng Z H, Wang P, Liu W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]//Proceedings of the AAAI Conference on Artificial Intelligence, New York, 2020: 12993–13000. https://doi.org/10.1609/aaai.v34i07.6999.

    [30]

    Rezatofighi H, Tsoi N, Gwak J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, 2019: 658–666. https://doi.org/10.1109/cvpr.2019.00075.

    [31]

    Zheng Z H, Wang P, Ren D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Trans Cybern, 2022, 52(8): 8574−8586. doi: 10.1109/TCYB.2021.3095305

    [32]

    Gevorgyan Z. SIoU loss: more powerful learning for bounding box regression[Z]. arXiv: 2205.12740, 2022. https://doi.org/10.48550/arxiv.2205.12740.

    [33]

    Ma S L, Xu Y. MPDIoU: a loss for efficient and accurate bounding box regression[Z]. arXiv: 2307.07662, 2023. https://doi.org/10.48550/arxiv.2307.07662.

    [34]

    Liu C, Wang K G, Li Q, et al. Powerful-IoU: more straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. Neural Networks, 2024, 170: 276−284. doi: 10.1016/j.neunet.2023.11.041

    [35]

    Sun P Z, Zhang R F, Jiang Y, et al. Sparse R-CNN: end-to-end object detection with learnable proposals[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 14449–14458. https://doi.org/10.1109/CVPR46437.2021.01422.

    [36]

    Lyu C Q, Zhang W W, Huang H A, et al. RTMDet: an empirical study of designing real-time object detectors[Z]. arXiv: 2212.07784, 2022. https://doi.org/10.48550/ARXIV.2212.07784.

    [37]

    Jocher G, Qiu J, Chaurasia A. YOLOv8 by Ultralytics[EB/OL]//GitHub. (2023-01-01)[2023-12-17]. https://github.com/ultralytics/ultralytics.

    [38]

    Wang Z Y, Li C, Xu H Y, et al. Mamba YOLO: a simple baseline for object detection with state space model[Z]. arXiv preprint arXiv: 2406.05835, 2024. https://doi.org/10.48550/arxiv.2406.05835.

    [39]

    Zhang S L, Wang X J, Wang J Q, et al. Dense distinct query for end-to-end object detection[C]//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, 2023: 7329–7338. https://doi.org/10.1109/cvpr52729.2023.00708.

    [40]

    Zhang H, Li F, Liu S L, et al. DINO: DETR with improved DeNoising anchor boxes for end-to-end object detection[C]//Proceedings of the 11th International Conference on Learning Representations, Kigali, 2022: 1–19.

    [41]

    Zhao Y, Lv W Y, Xu S L, et al. DETRs beat YOLOs on real-time object detection[C]//Proceedings of 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2024: 16965–16974. https://doi.org/10.1109/CVPR52733.2024.01605.

  • 加载中

(14)

(8)

计量
  • 文章访问数: 
  • PDF下载数: 
  • 施引文献:  0
出版历程
收稿日期:  2024-10-30
修回日期:  2024-12-16
录用日期:  2024-12-17
刊出日期:  2025-02-28

目录

/

返回文章
返回