Fusion of dynamic features enhances remote sensing building segmentation

Citation: Xiao Z J, Tian H, Zhang J H, et al. Fusion of dynamic features enhances remote sensing building segmentation[J]. Opto-Electron Eng, 2025, 52(3): 240231. doi: 10.12086/oee.2025.240231

  • Fund project: Basic Scientific Research Project of Liaoning Provincial Universities (LJKMZ20220699)
  • Corresponding author: Tian Hao, t18846338065@163.com
  • CLC number: TP391.4
  • CSTR: 32245.14.oee.2025.240231
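  • Abstract: Building segmentation in remote sensing imagery suffers from large differences in target scale, unbalanced spatial distribution of samples, blurred object boundaries, and wide scene extents. To address the resulting poor segmentation, this paper proposes a high-precision remote sensing building segmentation algorithm with fused dynamic feature enhancement. First, a New_GhostNetV2 network is constructed that uses adaptive context-aware convolution to strengthen the algorithm's ability to capture the spatial features of the samples. Second, a multi-level information enhancement module is designed by combining Ghost Convolution with skip connections and a feature-branching strategy, which improves feature integration. Next, cascaded group attention (CGA) is introduced; computing attention independently within each group strengthens the model's adaptability to diverse object shapes. Finally, a feature fusion module is built from a dynamic depth feature enhancer to further strengthen the model's capture ability. Experiments on the WHU dataset show that the improved algorithm raises F1-Score by 8.57 percentage points and mIoU by 12.48 percentage points over the baseline model, and that the improved DeepLabv3+ achieves better segmentation accuracy than other mainstream semantic segmentation models.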
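The group-wise attention idea can be illustrated with a minimal pure-Python sketch. It follows the general cascaded group attention pattern (split the channels into groups, run self-attention inside each group, and add each group's output to the next group's input), but it is a simplification under stated assumptions: the query, key, and value projections are taken as identity, there is no output projection, and the function names `self_attention` and `cascaded_group_attention` are hypothetical. It is not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V (an assumption)."""
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(dim)]
        )
    return out


def cascaded_group_attention(tokens, num_groups):
    """Attend within each channel group, feeding each output to the next group."""
    channels = len(tokens[0])
    if channels % num_groups != 0:
        raise ValueError("channels must be divisible by num_groups")
    width = channels // num_groups
    outputs = []
    previous = None
    for g in range(num_groups):
        chunk = [tok[g * width:(g + 1) * width] for tok in tokens]
        if previous is not None:
            # Cascade step: add the previous group's output to this group's input.
            chunk = [[a + b for a, b in zip(t, p)] for t, p in zip(chunk, previous)]
        previous = self_attention(chunk)
        outputs.append(previous)
    # Concatenate the group outputs back along the channel axis.
    return [
        [value for group in outputs for value in group[i]]
        for i in range(len(tokens))
    ]


features = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
print(cascaded_group_attention(features, num_groups=2))
```

Because each group only attends over its own slice of channels, the groups see different views of the features, and the cascade lets later groups refine what earlier ones produced.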

  • Overview: Objects in remote sensing images vary widely in shape, size, and texture, and they may overlap one another. Imaging is also affected by environmental factors such as atmospheric conditions, cloud cover, and changes in lighting, which degrade image quality and make accurate segmentation harder. Deep learning has given semantic segmentation models strong robustness and recognition ability in complex scenes, but because regions and application scenarios differ, problems remain: the details of complex scenes are captured insufficiently, long-range dependencies are modeled poorly, and multi-scale features are hard to integrate. To address the poor segmentation caused by large differences in object scale, uneven spatial distribution of samples, blurred object boundaries, and wide scene extents, this paper proposes a high-precision remote sensing building segmentation algorithm with fused dynamic feature enhancement. First, the New_GhostNetV2 network is constructed. Its adaptive context-aware convolution strengthens the discrimination of geometric deformation and the recognition of strongly correlated features, improves the capture of both local spatial features and global long-range dependencies, and provides the initial feature learning on the sample images. Second, multi-level information enhancement modules are designed using Ghost Convolution combined with skip connections and a feature-branching strategy. They address the loss of key information and the feature ambiguity caused by downsampling, enhance feature interaction and integration, and reduce boundary ambiguity and segmentation errors. Then, a feature fusion module is built from the dynamic depth feature enhancer. Based on the spatial position correlation of the features, it adapts the feature maps channel-wise and across scales, which further strengthens the model's ability to mine global key features and local fine-grained features and increases its attention to small objects. Finally, a cascaded group attention mechanism is introduced to adjust the proportion of low-level and high-level features layer by layer, suppressing background interference and progressively refining the segmentation results so that the model copes better with the diverse features in an image. Experimental results on the WHU dataset show that, compared with the baseline model, the improved algorithm raises F1-Score by 8.57 percentage points, mIoU by 12.48 percentage points, Recall by 13.28 percentage points, and Precision by 12.13 percentage points. Compared with other mainstream semantic segmentation models, the improved DeepLabv3+ achieves better segmentation accuracy.
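The four reported metrics can be computed from pixel-level confusion counts. The sketch below assumes a binary building/background labeling and takes mIoU as the mean of the building IoU and the background IoU. These are the conventional definitions, not necessarily the authors' exact evaluation code, and the function names and the toy prediction are hypothetical.

```python
def confusion_counts(pred, label):
    """Count TP, FP, FN, TN over two equal-length sequences of 0/1 pixels.

    1 marks a building pixel and 0 marks background.
    """
    if len(pred) != len(label):
        raise ValueError("pred and label must have the same length")
    tp = fp = fn = tn = 0
    for p, y in zip(pred, label):
        if p == 1 and y == 1:
            tp += 1
        elif p == 1 and y == 0:
            fp += 1
        elif p == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1, and mIoU as fractions in [0, 1]."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    iou_building = _ratio(tp, tp + fp + fn)
    iou_background = _ratio(tn, tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miou": (iou_building + iou_background) / 2,
    }


# Toy flattened masks (hypothetical data, 8 pixels).
pred = [1, 1, 1, 0, 0, 0, 0, 1]
label = [1, 1, 0, 0, 0, 0, 1, 1]
print(segmentation_metrics(*confusion_counts(pred, label)))
```

Multiplying each value by 100 gives percentages comparable to those reported in the tables.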

  • Figure 1.  Overall model structure

    Figure 2.  Structure of the backbone network (New_GhostNetV2)

    Figure 3.  Information integration (MS-II) module

    Figure 4.  Feature fusion module (DyMSLFusion)

    Figure 5.  Attention mechanism (CGA)

    Figure 6.  Example dataset samples and label maps

    Figure 7.  Evaluation metric accuracy of the improved model

    Figure 8.  Comparison of segmentation results from different models

    Table 1.  Ablation experiment results

    Number  A B C D  Precision/%  Recall/%  F1/%   mIoU/%
    1       ××××     82.32        75.58     81.64  74.35
    2       ×××      84.14        77.48     83.68  77.51
    3       ×××      84.45        76.96     82.94  76.23
    4       ×××      85.56        78.95     83.45  78.51
    5       ×××      86.54        79.45     83.96  80.48
    6       ××       87.19        82.49     84.59  81.74
    7       ××       88.94        84.53     85.61  82.48
    8       ×        91.85        83.89     86.95  81.87
    9       ×        91.25        86.57     88.91  82.36
    10               94.45        88.86     90.21  86.83

    Table 2.  Accuracy comparison of different algorithms

    Algorithm    Precision/%  Recall/%  F1/%   mIoU/%  Time/(min/epoch)
    UNet         87.85        86.87     86.95  82.64   4.0
    FCN          86.46        85.76     84.63  79.62   4.5
    SegNet       89.39        88.68     87.25  83.46   3.5
    DeepLabv3+   82.32        75.58     81.64  74.35   5.0
    Our method   94.45        88.86     90.21  86.83   2.5
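The gains quoted in the abstract and overview can be checked directly against Table 2, since they are differences in percentage points between the proposed method's row and the DeepLabv3+ baseline row. The short sketch below recomputes them for every compared model; the values are copied from Table 2, while the dictionary layout, the label `"Proposed"`, and the function name `gains_over` are illustrative choices.

```python
# Table 2 values in percent, ordered as (Precision, Recall, F1, mIoU).
METRICS = ("Precision", "Recall", "F1", "mIoU")
TABLE_2 = {
    "UNet": (87.85, 86.87, 86.95, 82.64),
    "FCN": (86.46, 85.76, 84.63, 79.62),
    "SegNet": (89.39, 88.68, 87.25, 83.46),
    "DeepLabv3+": (82.32, 75.58, 81.64, 74.35),
    "Proposed": (94.45, 88.86, 90.21, 86.83),
}


def gains_over(reference, target="Proposed", table=TABLE_2):
    """Percentage-point gain of `target` over `reference` for each metric."""
    return {
        name: round(t - r, 2)
        for name, r, t in zip(METRICS, table[reference], table[target])
    }


for model in TABLE_2:
    if model != "Proposed":
        print(model, gains_over(model))
```

Against DeepLabv3+, this reproduces the reported gains of 12.13 (Precision), 13.28 (Recall), 8.57 (F1), and 12.48 (mIoU) percentage points.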
Publication history
Received:  2024-09-30
Revised:  2025-01-08
Accepted:  2025-01-08
Published:  2025-03-28