Fusion of dynamic features enhances remote sensing building segmentation

Xiao Zhenjiu; Tian Hao; Zhang Jiehao; Qu Haicheng

doi:10.12086/oee.2025.240231

- School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China

Fund project: Basic Scientific Research Project of Liaoning Provincial Universities (LJKMZ20220699)

About the authors:
Xiao Zhenjiu (b. 1968), male, associate professor and master's supervisor. His research covers machine learning and image and visual information computing. E-mail: xiaozhenjiu@lntu.edu.cn

Tian Hao (b. 2000), male, master's student. His research covers semantic segmentation of remote sensing images. E-mail: t18846338065@163.com

Zhang Jiehao (b. 2000), female, master's student. Her research covers object detection in remote sensing images. E-mail: zjhao0409@163.com

Qu Haicheng (b. 1981), associate professor, master's supervisor, vice dean, and CCF member. His research covers high-performance computing for remote sensing imagery, visual information computing, and object detection and recognition. E-mail: quhaicheng@lntu.edu.cn

*Corresponding author: Tian Hao, t18846338065@163.com

CLC number: TP391.4
CSTR: 32245.14.oee.2025.240231

Received: 30 September 2024

Revised: 8 January 2025

Accepted: 8 January 2025

Published: 28 March 2025

Abstract

To address the poor segmentation caused by large differences in object scale, unevenly distributed samples, blurred object boundaries, and wide scene coverage in remote sensing building imagery, this paper proposes a high-precision remote sensing building segmentation algorithm that fuses dynamic feature enhancement. First, a New_GhostNetV2 network is constructed, in which adaptive context-aware convolution improves the algorithm's ability to capture the spatial features of the samples. Second, a multi-level information enhancement module is designed by combining Ghost Convolution with skip connections and a feature-branching strategy to improve feature integration. Third, cascaded group attention (CGA) is introduced; computing attention independently within each group makes the model more adaptable to diverse ground-object shapes. Finally, a feature fusion module is built on a dynamic deep feature enhancer to further strengthen the model's feature-capture ability. Experiments on the WHU dataset show that the improved algorithm raises the F1-score by 8.57 percentage points and mIoU by 12.48 percentage points over the baseline model, and that the improved DeepLabv3+ achieves higher segmentation accuracy than other mainstream semantic segmentation models.

Keywords:
- remote sensing image
- semantic segmentation
- feature enhancement
- information integration

Overview

Objects in remote sensing images vary widely in shape, size, and texture, and they may overlap one another. Imaging is also affected by environmental factors such as atmospheric conditions, cloud cover, and changing illumination, which degrade image quality and make accurate segmentation harder. Modern deep learning has given semantic segmentation models strong robustness and recognition ability in complex scenes. Because regions and application scenarios differ, however, several problems remain: the details of complex scenes are captured insufficiently, long-range dependencies are modeled poorly, and multi-scale features are hard to integrate. These open problems motivate further research on semantic segmentation. To address the poor segmentation caused by large differences in object scale, unevenly distributed samples, blurred object boundaries, and wide scene coverage, this paper proposes a high-precision remote sensing building segmentation algorithm that fuses dynamic feature enhancement. First, the New_GhostNetV2 network is constructed. Its adaptive context-aware convolution improves the discrimination of geometric deformation and the recognition of strongly correlated features, strengthens the capture of both local spatial features and global long-range dependencies, and provides a preliminary learning of the sample images. Second, a multi-level information enhancement module is designed using Ghost Convolution combined with skip connections and a feature-branching strategy. It counters the loss of key information and the feature ambiguity caused by downsampling, enhances feature interaction and integration, and reduces boundary blur and segmentation errors.
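The Ghost Convolution idea referred to above can be illustrated with a toy example. The sketch below is not the authors' multi-level information enhancement module; it is a minimal pure-Python, one-dimensional illustration of the general Ghost module pattern (a few "intrinsic" channels from a pointwise mix, extra "ghost" channels from cheap per-channel filters, then concatenation), with an optional skip connection. All function names, weights, and kernels are hypothetical.

```python
def pointwise_mix(x, weights):
    """1x1 convolution: each output channel is a weighted sum of input channels."""
    length = len(x[0])
    return [
        [sum(w * channel[i] for w, channel in zip(row, x)) for i in range(length)]
        for row in weights
    ]


def depthwise_filter(channel, kernel):
    """Cheap operation: filter one channel with a small odd-length kernel (zero padding)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(channel) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(channel))
    ]


def ghost_module(x, primary_weights, cheap_kernels, skip=False):
    """Concatenate intrinsic channels with ghost channels derived from them."""
    intrinsic = pointwise_mix(x, primary_weights)
    ghosts = [depthwise_filter(ch, k) for ch, k in zip(intrinsic, cheap_kernels)]
    out = intrinsic + ghosts  # channel-wise concatenation
    if skip:
        # A skip connection adds the input back, so channel counts must match.
        if len(out) != len(x):
            raise ValueError("skip connection needs matching channel counts")
        out = [[a + b for a, b in zip(o, c)] for o, c in zip(out, x)]
    return out


signal = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]  # 2 channels, length 4
features = ghost_module(
    signal,
    primary_weights=[[1.0, 1.0]],      # 1 intrinsic channel (hypothetical weights)
    cheap_kernels=[[1.0, 0.0, -1.0]],  # 1 ghost channel from a difference filter
    skip=True,
)
```

Half of the output channels here come from the inexpensive per-channel filter rather than from a full channel mix, which is the source of the parameter savings usually attributed to Ghost modules.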
Next, a feature fusion module is built on the dynamic deep feature enhancer. Guided by the spatial position correlation of the features, it adapts the feature maps per channel and across scales, which strengthens the model's ability to mine and capture global key features and local fine-grained features and increases the attention paid to small objects. Finally, a cascaded group attention mechanism is introduced to adjust the proportion of low-level and high-level features layer by layer, suppressing background interference and progressively refining the segmentation results so that the model copes better with the diverse features in an image. Experimental results on the WHU dataset show that, compared with the baseline model, the improved algorithm raises the F1-score by 8.57, mIoU by 12.48, recall by 13.28, and precision by 12.13 percentage points. Compared with other mainstream semantic segmentation models, the improved DeepLabv3+ achieves higher segmentation accuracy.
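For readers unfamiliar with the four reported metrics, the following sketch shows how precision, recall, F1-score, and mIoU are commonly computed for a binary building mask. It assumes the standard pixel-wise definitions and a two-class (building/background) mean for mIoU; the page does not state the exact averaging the authors used, so their numbers may be computed differently. The function names and the tiny masks are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count pixel-level TP, FP, FN, TN for binary masks given as nested lists."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a class is absent."""
    return numerator / denominator if denominator else 0.0


def building_metrics(pred, truth):
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    iou_building = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miou": (iou_building + iou_background) / 2,  # mean over the two classes
    }


# Hypothetical 2x3 masks: 1 = building pixel, 0 = background pixel.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
scores = building_metrics(predicted, reference)
```

In this toy case there are two true positives, one false positive, one false negative, and two true negatives, so precision, recall, and F1 are all 2/3 and mIoU is 0.5.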

Figure 1. Overall model structure

Figure 2. Structure of the backbone network (New_GhostNetV2)

Figure 3. Information integration (MS-II) module

Figure 4. Feature fusion module (DyMSLFusion)

Figure 5. Attention mechanism (CGA)

Figure 6. Examples of dataset samples and label maps

Figure 7. Accuracy plot of the evaluation metrics for the improved model

Figure 8. Comparison of segmentation results of different models

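Table 1. Ablation experiment results

| No. | A | B | C | D | Precision/% | Recall/% | F₁/% | mIoU/% |
|---|---|---|---|---|---|---|---|---|
| 1 | × | × | × | × | 82.32 | 75.58 | 81.64 | 74.35 |
| 2 | √ | × | × | × | 84.14 | 77.48 | 83.68 | 77.51 |
| 3 | × | √ | × | × | 84.45 | 76.96 | 82.94 | 76.23 |
| 4 | × | × | √ | × | 85.56 | 78.95 | 83.45 | 78.51 |
| 5 | × | × | × | √ | 86.54 | 79.45 | 83.96 | 80.48 |
| 6 | × | × | √ | √ | 87.19 | 82.49 | 84.59 | 81.74 |
| 7 | √ | √ | × | × | 88.94 | 84.53 | 85.61 | 82.48 |
| 8 | √ | × | √ | √ | 91.85 | 83.89 | 86.95 | 81.87 |
| 9 | × | √ | √ | √ | 91.25 | 86.57 | 88.91 | 82.36 |
| 10 | √ | √ | √ | √ | 94.45 | 88.86 | 90.21 | 86.83 |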
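The headline improvements quoted in the abstract can be reproduced from the first and last rows of the ablation table. The short sketch below does that arithmetic and also lists the mIoU gain of each single-component configuration (rows 2 to 5) over the baseline. The values are copied from the table; the variable and function names are hypothetical, and the components are referred to only by their column letters because this page does not say which module each letter denotes.

```python
METRICS = ("precision", "recall", "f1", "miou")

baseline = (82.32, 75.58, 81.64, 74.35)    # row 1: no component enabled
full_model = (94.45, 88.86, 90.21, 86.83)  # row 10: A, B, C and D enabled

# mIoU of the configurations that enable exactly one component (rows 2 to 5).
single_component_miou = {"A": 77.51, "B": 76.23, "C": 78.51, "D": 80.48}


def gains(before, after):
    """Absolute gain in percentage points for each metric."""
    return {name: round(b - a, 2) for name, a, b in zip(METRICS, before, after)}


improvement = gains(baseline, full_model)
single_gains = {
    component: round(miou - baseline[3], 2)
    for component, miou in single_component_miou.items()
}
```

The differences are absolute (percentage points), which matches the 8.57 and 12.48 figures reported for F1-score and mIoU.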


Table 2. Accuracy comparison of different algorithms

| Algorithm | Precision/% | Recall/% | F₁/% | mIoU/% | Time/(min/epoch) |
|---|---|---|---|---|---|
| UNet | 87.85 | 86.87 | 86.95 | 82.64 | 4.0 |
| FCN | 86.46 | 85.76 | 84.63 | 79.62 | 4.5 |
| SegNet | 89.39 | 88.68 | 87.25 | 83.46 | 3.5 |
| DeepLabv3+ | 82.32 | 75.58 | 81.64 | 74.35 | 5.0 |
| Our method | 94.45 | 88.86 | 90.21 | 86.83 | 2.5 |
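One way to read the comparison table is to ask how far the proposed method is ahead of the strongest competitor on each metric, rather than only ahead of the DeepLabv3+ baseline. The sketch below computes that margin from the table values; the dictionary keys, the `"proposed"` label, and the function name are hypothetical, and the result is plain arithmetic on the reported numbers rather than a new measurement.

```python
METRICS = ("precision", "recall", "f1", "miou")

# (precision, recall, F1, mIoU) in percent, copied from the comparison table.
results = {
    "UNet": (87.85, 86.87, 86.95, 82.64),
    "FCN": (86.46, 85.76, 84.63, 79.62),
    "SegNet": (89.39, 88.68, 87.25, 83.46),
    "DeepLabv3+": (82.32, 75.58, 81.64, 74.35),
    "proposed": (94.45, 88.86, 90.21, 86.83),  # the row for the authors' method
}


def margin_over_best_competitor(rows, ours="proposed"):
    """For each metric, find the best other model and the gap to `ours`."""
    margins = {}
    for index, name in enumerate(METRICS):
        best_model = max(
            (model for model in rows if model != ours),
            key=lambda model: rows[model][index],
        )
        gap = round(rows[ours][index] - rows[best_model][index], 2)
        margins[name] = (best_model, gap)
    return margins


margins = margin_over_best_competitor(results)
```

On these numbers SegNet is the strongest competitor on all four metrics, and the smallest margin is on recall (0.18 percentage points).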


