Abstract: A dynamic SAR image target detection algorithm integrating the spatial and frequency domains is proposed to address challenges such as large feature differences across SAR image samples, imbalanced target scales, and heavy speckle noise in the background, which lead to low detection accuracy and slow inference. First, a dual-stream perception strategy is used to construct spatial-frequency perception units that combine dynamic receptive fields with a fractional-order Gabor transform, enhancing the model's ability to capture spatially diverse features and frequency scattering features. This improves the retention of global contextual information, accelerates inference, reduces the similarity of feature mapping patterns, mitigates background noise interference, and effectively reduces missed and false detections. Second, an adaptive feature fusion module based on re-parameterization is designed to optimize the interaction and integration of multi-scale features, enriching feature diversity, alleviating the mapping discrepancies and information loss caused by feature sampling, and strengthening the salience of small-target and key frequency information during fusion, thereby improving detection accuracy on multi-scale samples. Finally, the DY_IoU dynamic regression loss function is introduced; its adaptive scale penalty factor and dynamic non-monotonic attention mechanism address anchor box expansion and positional deviation, further enhancing the localization and detection of multi-scale targets while accelerating convergence and reducing computation. Experiments on the public SAR-AIRcraft-1.0 and HRSID datasets show that the proposed method achieves mAP@0.5 values of 95.9% and 98.8%, respectively, improvements of 5.2% and 1.2% over the baseline model, and that it outperforms the other comparison algorithms. These results indicate that the algorithm significantly improves detection accuracy and exhibits strong robustness and generalization.
Overview: A dynamic SAR image target detection algorithm integrating the spatial and frequency domains is proposed to address several challenges inherent to SAR imagery, including significant feature variability, imbalanced target scales, and high speckle noise in background regions. These challenges reduce detection accuracy and slow inference, posing difficulties for real-time applications. The proposed method is designed to overcome these limitations through three components that improve both detection performance and computational efficiency; illustrative sketches of the three components follow this overview.

The algorithm first employs a dual-stream perception strategy to construct spatial-frequency perception units. This design integrates dynamic receptive fields and fractional-order Gabor transforms, significantly improving the model's ability to capture spatial diversity and frequency scattering features. By expanding the receptive fields adaptively, the algorithm captures both local and global context, leading to more effective extraction of complex patterns in the input data. The fractional-order Gabor transform further sharpens the model's sensitivity to fine-grained texture and frequency features, which helps retain important global contextual information. Together, these improvements speed up inference by minimizing redundant feature representations, reducing interference from background noise, and decreasing the similarity of feature mapping patterns. Consequently, the algorithm effectively addresses missed and false detections, which are typical in cluttered SAR images.

Next, a re-parameterization-based adaptive feature fusion module is introduced to optimize the interaction between multi-scale features. This module facilitates efficient integration of features across scales, enriching feature diversity and mitigating the discrepancies introduced during sampling. The fusion process also highlights the salience of small targets and key frequency information, which are often difficult to detect in traditional SAR detection frameworks. This enhanced multi-scale integration improves detection accuracy, particularly for small and subtle objects, which are crucial in applications such as maritime surveillance and remote sensing.

To further improve localization, a dynamic regression loss function, DY_IoU, is incorporated. This loss employs adaptive scale penalty factors and a dynamic non-monotonic attention mechanism to address anchor box expansion and positional deviation. By dynamically adjusting the focus during training, the model achieves more precise localization of multi-scale targets. The improved loss also accelerates convergence, reduces the computational burden, and keeps the algorithm lightweight and efficient for practical deployment.

The proposed method was evaluated on two publicly available datasets, SAR-AIRcraft-1.0 and HRSID. Experimental results show that the algorithm achieves mAP@0.5 values of 95.9% and 98.8%, respectively, representing improvements of 5.2% and 1.2% over the baseline model, and that it outperforms the other comparison algorithms. These results confirm that the algorithm not only enhances detection accuracy but also exhibits strong robustness and generalization, making it suitable for a wide range of real-world applications.
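To make the frequency branch of the spatial-frequency perception unit concrete, below is a minimal PyTorch sketch of a fractional-order Gabor filter bank applied as a depthwise convolution. The kernel size, orientation count, envelope parameters, and the way the fractional order enters (as a learnable exponent `alpha` on the kernel magnitude) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a frequency-perception branch built on a Gabor filter bank.
# The fractional order "alpha", kernel size, and orientation count are
# illustrative assumptions, not the paper's exact configuration.
import math
import torch
import torch.nn as nn

def gabor_kernel(ksize: int, theta: float, sigma: float, lambd: float) -> torch.Tensor:
    """Real part of a 2-D Gabor kernel at orientation theta."""
    half = ksize // 2
    ys, xs = torch.meshgrid(
        torch.arange(-half, half + 1, dtype=torch.float32),
        torch.arange(-half, half + 1, dtype=torch.float32),
        indexing="ij",
    )
    x_r = xs * math.cos(theta) + ys * math.sin(theta)
    y_r = -xs * math.sin(theta) + ys * math.cos(theta)
    envelope = torch.exp(-(x_r ** 2 + y_r ** 2) / (2 * sigma ** 2))
    carrier = torch.cos(2 * math.pi * x_r / lambd)
    return envelope * carrier

class FractionalGaborBranch(nn.Module):
    """Depthwise Gabor filtering with a learnable fractional-order exponent
    (one simple way to make the fixed bank tunable)."""
    def __init__(self, channels: int, ksize: int = 5, orientations: int = 4):
        super().__init__()
        bank = torch.stack([
            gabor_kernel(ksize, o * math.pi / orientations, sigma=2.0, lambd=4.0)
            for o in range(orientations)
        ])  # (orientations, k, k)
        # Tile the bank over channels for a depthwise convolution.
        weight = bank.repeat(math.ceil(channels / orientations), 1, 1)[:channels]
        self.register_buffer("base_weight", weight.unsqueeze(1))  # (C, 1, k, k)
        self.alpha = nn.Parameter(torch.tensor(1.0))  # fractional order (assumed learnable)
        self.channels, self.ksize = channels, ksize

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Raising |kernel| to alpha while keeping its sign reshapes the
        # envelope; alpha = 1 recovers the ordinary Gabor bank.
        w = torch.sign(self.base_weight) * self.base_weight.abs().pow(self.alpha)
        return nn.functional.conv2d(x, w, padding=self.ksize // 2, groups=self.channels)
```

With `alpha = 1` the block reduces to an ordinary Gabor bank; letting the network learn `alpha` adds a single degree of freedom for tuning how sharply the filters localize scattering energy.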
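The adaptive feature fusion module combines two ideas that the sketch below illustrates: learnable normalized weights that balance shallow and deep feature maps, and structural re-parameterization that folds a train-time multi-branch convolution into a single kernel for inference. The two-branch (3×3 plus 1×1) layout and the softmax scale weighting are assumptions made for illustration.

```python
# Sketch of (a) adaptive weighting of two feature scales and (b) folding a
# parallel 3x3/1x1 branch pair into one 3x3 convolution for inference.
# Branch layout and channel counts are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepFusionBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1, bias=True)
        self.conv1 = nn.Conv2d(channels, channels, 1, bias=True)
        self.scale_logits = nn.Parameter(torch.zeros(2))  # adaptive per-scale weights

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        # Upsample the deeper (coarser) map and mix the two scales with
        # learned, normalized weights so small-target detail is not drowned out.
        deep = F.interpolate(deep, size=shallow.shape[-2:], mode="nearest")
        w = torch.softmax(self.scale_logits, dim=0)
        fused = w[0] * shallow + w[1] * deep
        return self.conv3(fused) + self.conv1(fused)  # multi-branch at train time

    @torch.no_grad()
    def reparameterize(self) -> nn.Conv2d:
        """Fold the parallel 1x1 branch into the 3x3 kernel (zero-padded to
        3x3), yielding one conv with identical outputs but lower inference cost."""
        fused = nn.Conv2d(self.conv3.in_channels, self.conv3.out_channels,
                          3, padding=1, bias=True)
        fused.weight.copy_(self.conv3.weight + F.pad(self.conv1.weight, [1, 1, 1, 1]))
        fused.bias.copy_(self.conv3.bias + self.conv1.bias)
        return fused
```

Because convolution is linear, `reparameterize()` returns a single 3×3 convolution whose output equals the two-branch sum exactly; this equivalence is where the inference-time savings of re-parameterized designs come from.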
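The closed form of DY_IoU is not given in this section; the sketch below only illustrates its two named ingredients, an adaptive scale penalty factor and a dynamic non-monotonic attention weight in the spirit of Wise-IoU [38]. Both functional forms, and the constants `alpha` and `delta`, are assumptions.

```python
# Hedged sketch of a DY_IoU-style regression loss: IoU term weighted by a
# size-aware scale penalty and a non-monotonic attention factor. The exact
# forms used by the paper are not reproduced here.
import torch

def dyiou_loss(pred: torch.Tensor, target: torch.Tensor,
               alpha: float = 1.9, delta: float = 3.0) -> torch.Tensor:
    """pred/target: (N, 4) boxes as (x1, y1, x2, y2)."""
    # Plain IoU.
    inter_wh = (torch.min(pred[:, 2:], target[:, 2:])
                - torch.max(pred[:, :2], target[:, :2])).clamp(min=0)
    inter = inter_wh.prod(dim=1)
    area_p = ((pred[:, 2] - pred[:, 0]).clamp(min=0)
              * (pred[:, 3] - pred[:, 1]).clamp(min=0))
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + 1e-7)

    # Adaptive scale penalty (assumed form): smaller targets get a larger
    # weight, counteracting the bias of IoU toward large boxes.
    scale_w = 1.0 / (area_t.sqrt() + 1.0).log1p()

    # Dynamic non-monotonic attention (Wise-IoU style): moderate-quality
    # boxes get the most gradient, while extreme outliers are down-weighted.
    beta = (1.0 - iou).detach() / (1.0 - iou.mean().detach() + 1e-7)
    attention = beta / (delta ** (beta - alpha))

    return (attention * scale_w * (1.0 - iou)).mean()
```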
Table 1. Ablation experiments of the proposed algorithm on the SAR-AIRcraft-1.0 dataset
| YOLOv10 | SFDS | AFF | DY_IoU | Precision/% | Recall/% | mAP@0.5/% | Params/10⁶ | GFLOPs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| √ | — | — | — | 86.9 | 89.4 | 90.7 | 2.60 | 8.4 |
| √ | √ | — | — | 87.6 | 92.1 | 92.9 | 1.98 | 6.6 |
| √ | — | √ | — | 82.3 | 90.0 | 90.8 | 3.18 | 9.3 |
| √ | — | — | √ | 84.9 | 90.9 | 91.7 | 2.47 | 7.8 |
| √ | √ | √ | — | 88.8 | 94.0 | 94.5 | 2.58 | 8.1 |
| √ | √ | — | √ | 90.9 | 95.7 | 95.1 | 2.43 | 7.7 |
| √ | — | √ | √ | 86.0 | 92.8 | 93.1 | 2.61 | 8.4 |
| √ | √ | √ | √ | 97.1 | 91.1 | 95.9 | 2.55 | 8.0 |

Table 2. Results of comparison experiments on the SAR-AIRcraft-1.0 dataset
| Model | Precision/% | Recall/% | mAP@0.5/% | F1/% | GFLOPs |
| --- | --- | --- | --- | --- | --- |
| Faster R-CNN | 79.0 | 68.5 | 75.9 | 73.4 | 137.5 |
| Center-Net | 62.8 | 71.9 | 70.8 | 67.6 | 51.6 |
| YOLOv5s | 90.5 | 81.1 | 86.9 | 85.5 | 16.5 |
| SKG-Net | 85.6 | 75.8 | 70.6 | 59.7 | 120 |
| YOLOv8s | 92.3 | 81.8 | 90.0 | 86.7 | 28.6 |
| YOLOv10s | 96.9 | 89.4 | 90.7 | 88.1 | 8.4 |
| SFS-CNet | 94.7 | 84.5 | 89.9 | 89.3 | 6.9 |
| Ours | 97.1 | 91.1 | 95.9 | 94.0 | 6.2 |

Table 3. Results of comparison experiments on the HRSID dataset
| Model | Precision/% | Recall/% | mAP@0.5/% | F1/% | GFLOPs |
| --- | --- | --- | --- | --- | --- |
| Faster R-CNN | 86.3 | 81.6 | 86.3 | 83.9 | 137.5 |
| Center-Net | 90.5 | 74.1 | 83.3 | 81.5 | 51.6 |
| YOLOv5s | 94.3 | 89.4 | 88.1 | 91.7 | 16.5 |
| SKG-Net | 77.8 | 81.5 | 81.7 | 79.6 | 120 |
| YOLOv8s | 90.1 | 94.4 | 95.9 | 92.2 | 28.6 |
| YOLOv10s | 97.8 | 96.2 | 97.6 | 95.9 | 8.4 |
| SFS-CNet | 88.8 | 95.3 | 92.9 | 91.9 | 6.9 |
| Ours | 96.2 | 97.8 | 98.8 | 97.0 | 6.2 |
[1] Liang L M, Chen K Q, Wang C B, et al. Remote sensing image detection algorithm integrating visual center mechanism and parallel patch perception[J]. Opto-Electron Eng, 2024, 51 (7): 240099. doi: 10.12086/oee.2024.240099
[2] Xiao Z J, Zhang J H, Lin B H. Feature coordination and fine-grained perception of small targets in remote sensing images[J]. Opto-Electron Eng, 2024, 51 (6): 240066. doi: 10.12086/oee.2024.240066
[3] Ma L, Gou Y T, Lei T, et al. Small object detection based on multi-scale feature fusion using remote sensing images[J]. Opto-Electron Eng, 2022, 49 (4): 210363. doi: 10.12086/oee.2022.210363
[4] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39 (6): 1137−1149. doi: 10.1109/TPAMI.2016.2577031
[5] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779–788. https://doi.org/10.1109/CVPR.2016.91.
[6] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2017: 6517–6525. https://doi.org/10.1109/CVPR.2017.690.
[7] Redmon J, Farhadi A. YOLOv3: an incremental improvement[Z]. arXiv: 1804.02767, 2018. https://doi.org/10.48550/arXiv.1804.02767.
[8] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: optimal speed and accuracy of object detection[Z]. arXiv: 2004.10934, 2020. https://doi.org/10.48550/arXiv.2004.10934.
[9] Ge Z, Liu S T, Wang F, et al. YOLOX: exceeding YOLO series in 2021[Z]. arXiv: 2107.08430, 2021. https://doi.org/10.48550/arXiv.2107.08430.
[10] Lyu Z W, Jin H F, Zhen T, et al. Small object recognition algorithm of grain pests based on SSD feature fusion[J]. IEEE Access, 2021, 9: 43202−43213. doi: 10.1109/ACCESS.2021.3066510
[11] Zhang L P, Liu Y, Zhao W D, et al. Frequency-adaptive learning for SAR ship detection in clutter scenes[J]. IEEE Trans Geosci Remote Sens, 2023, 61: 5215514. doi: 10.1109/TGRS.2023.3249349
[12] Si J H, Song B B, Wu J X, et al. Maritime ship detection method for satellite images based on multiscale feature fusion[J]. IEEE J Sel Top Appl Earth Obs Remote Sens, 2023, 16: 6642−6655. doi: 10.1109/JSTARS.2023.3296898
[13] Qin C, Wang X Q, Li G, et al. A semi-soft label-guided network with self-distillation for SAR inshore ship detection[J]. IEEE Trans Geosci Remote Sens, 2023, 61: 5211814. doi: 10.1109/TGRS.2023.3293535
[14] Xu X W, Zhang X L, Zhang T W, et al. SAR ship detection in complex scenes based on adaptive anchor assignment and IOU supervise[J]. J Radars, 2023, 12 (5): 1097−1111. doi: 10.12000/JR23059
[15] Xiao Z J, Lin B H, Qu H C. SAR ship detection with multi-mechanism fusion[J]. J Image Graphics, 2024, 29 (2): 545−558. doi: 10.11834/jig.230166
[16] Sun P S, Wen X B. An improved algorithm for detecting ship target in SAR images based on YOLOv5 model[J]. Electron-Opt Control, 2024, 31 (8): 32−37, 85. doi: 10.3969/j.issn.1671-637X.2024.08.005
[17] Li K, Wang D, Hu Z Y, et al. Unleashing channel potential: space-frequency selection convolution for SAR object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 17323–17332. https://doi.org/10.1109/CVPR52733.2024.01640.
[18] Zhou J, Xiao C, Peng B, et al. DiffDet4SAR: diffusion-based aircraft target detection network for SAR images[J]. IEEE Geosci Remote Sens Lett, 2024, 21: 4007905. doi: 10.1109/LGRS.2024.3386020
[19] Wang A, Chen H, Liu L H, et al. YOLOv10: real-time end-to-end object detection[Z]. arXiv: 2405.14458, 2024. https://doi.org/10.48550/arXiv.2405.14458.
[20] Zhang P F, Lo E, Lu B T. High performance depthwise and pointwise convolutions on mobile devices[C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence, 2020: 6795–6802. https://doi.org/10.1609/aaai.v34i04.6159.
[21] Guo Y H, Li Y D, Wang L Q, et al. Depthwise convolution is all you need for learning multiple visual domains[C]//Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019: 8368–8375. https://doi.org/10.1609/aaai.v33i01.33018368.
[22] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770–778. https://doi.org/10.1109/CVPR.2016.90.
[23] Wu H B, Kuo H C, Zheng N J, et al. Partially fake audio detection by self-attention-based fake span discovery[C]//ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 9236–9240. https://doi.org/10.1109/ICASSP43922.2022.9746162.
[24] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 6000–6010.
[25] Bebis G, Georgiopoulos M. Feed-forward neural networks[J]. IEEE Potentials, 1994, 13 (4): 27−31. doi: 10.1109/45.329294
[26] Chollet F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1800–1807. https://doi.org/10.1109/CVPR.2017.195.
[27] Neubeck A, Van Gool L. Efficient non-maximum suppression[C]//18th International Conference on Pattern Recognition (ICPR'06), 2006: 850–855. https://doi.org/10.1109/ICPR.2006.479.
[28] Li J F, Wen Y, He L H. SCConv: spatial and channel reconstruction convolution for feature redundancy[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 6153–6162. https://doi.org/10.1109/CVPR52729.2023.00596.
[29] Hsiao T Y, Chang Y C, Chou H H, et al. Filter-based deep-compression with global average pooling for convolutional networks[J]. J Syst Archit, 2019, 95: 9−18. doi: 10.1016/j.sysarc.2019.02.008
[30] McClenny L, Braga-Neto U. Self-adaptive physics-informed neural networks using a soft attention mechanism[Z]. arXiv:2009.04544, 2020. https://doi.org/10.48550/arXiv.2009.04544.
[31] Chen J W, An D X, Ge B B, et al. Detection, parameters estimation, and imaging of moving targets based on extended post-Doppler STAP in multichannel WasSAR-GMTI[J]. IEEE Trans Geosci Remote Sens, 2024, 62: 5223515. doi: 10.1109/TGRS.2024.3465435
[32] Koç E, Alikaşifoğlu T, Aras A C, et al. Trainable fractional Fourier transform[J]. IEEE Signal Process Lett, 2024, 31: 751−755. doi: 10.1109/LSP.2024.3372779
[33] Luan S Z, Chen C, Zhang B C, et al. Gabor convolutional networks[J]. IEEE Trans Image Process, 2018, 27 (9): 4357−4366. doi: 10.1109/TIP.2018.2835143
[34] Yin X Y, Goudriaan J, Lantinga E A, et al. A flexible sigmoid function of determinate growth[J]. Ann Bot, 2003, 91 (3): 361−371. doi: 10.1093/aob/mcg029
[35] Gevorgyan Z. SIoU loss: more powerful learning for bounding box regression[Z]. arXiv: 2205.12740, 2022. https://doi.org/10.48550/arXiv.2205.12740.
[36] Zheng Z H, Wang P, Ren D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Trans Cybern, 2022, 52 (8): 8574−8586. doi: 10.1109/TCYB.2021.3095305
[37] Zhang Y F, Ren W Q, Zhang Z, et al. Focal and efficient IOU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146−157. doi: 10.1016/j.neucom.2022.07.042
[38] Tong Z J, Chen Y H, Xu Z W, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism[Z]. arXiv: 2301.10051, 2023. https://doi.org/10.48550/arXiv.2301.10051.
[39] Wang Z R, Kang Y Z, Zeng X, et al. SAR-AIRcraft-1.0: high-resolution SAR aircraft detection and recognition dataset[J]. J Radars, 2023, 12 (4): 906−922. doi: 10.12000/JR23043
[40] Wei S J, Zeng X F, Qu Q Z, et al. HRSID: a high-resolution SAR images dataset for ship detection and instance segmentation[J]. IEEE Access, 2020, 8: 120234−120254. doi: 10.1109/ACCESS.2020.3005861
[41] Wang Y Y, Wang C, Zhang H, et al. Automatic ship detection based on RetinaNet using multi-resolution Gaofen-3 imagery[J]. Remote Sens, 2019, 11 (5): 531. doi: 10.3390/rs11050531
[42] Pan D C, Gao X, Dai W, et al. SRT-net: scattering region topology network for oriented ship detection in large-scale SAR images[J]. IEEE Trans Geosci Remote Sens, 2024, 62: 5202318. doi: 10.1109/TGRS.2024.3351366