Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (9): 2430-2444. DOI: 10.3778/j.issn.1673-9418.2411066

• Graphics·Image •

Dual-Branch Network for Building Extraction in Remote Sensing Images with Fusion of Local-Global Features

LIU Erhu, LI Haowen, HU Yu, XU Shengjun, LI Xiaohan, SHI Ya   

  1. College of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China
    2. Geovis Technology Co., Ltd., Beijing 101399, China
  • Online: 2025-09-01 Published: 2025-09-01

融合局部-全局特征的双分支遥感影像建筑物提取网络

刘二虎,李浩文,胡煜,徐胜军,李小晗,史亚   

  1. 西安建筑科技大学 信息与控制工程学院,西安 710055
    2. 中科星图股份有限公司,北京 101399

Abstract: Efficient, automatic extraction of building information from remote sensing images is an important task in intelligent remote sensing interpretation. However, buildings in high-resolution remote sensing images vary widely in size and shape, and background interference is severe, so existing algorithms extract them poorly. To address this issue, a dual-branch network that fuses local and global features is proposed for accurate and efficient building extraction from remote sensing images. First, an encoder with a dual-branch CNN and Transformer structure is designed to capture the local texture information and the global contextual dependencies of buildings simultaneously. Second, to bridge the differences between the features extracted by the CNN branch and those extracted by the Transformer branch, a cross-feature attention fusion module (CFAFM) is designed to aggregate the two sets of features and weight them by importance. Third, to strengthen the decoder's ability to recover fine-grained features, a feature refinement enhancement module (FREM) is designed and inserted at the end of the decoder; it reduces information loss during upsampling and refines building edges and local details. On the WHU, Massachusetts, and Inria building datasets, the proposed network reaches IoU values of 90.84%, 74.94%, and 81.24% and F1-scores of 95.20%, 85.53%, and 89.69%, respectively. The results indicate that the proposed network effectively improves the accuracy of building extraction from remote sensing images and has clear advantages over existing methods in complex scenes.

Key words: remote sensing images, building extraction, dual-branch network, feature fusion, feature refinement and enhancement
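
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how IoU and F1-score are commonly computed for the building class of a binary segmentation mask. It is not the authors' evaluation code: the function name `iou_and_f1`, the toy masks, and the convention for two empty masks are assumptions made for illustration, and the abstract does not state how scores are aggregated over a dataset.

```python
from typing import Sequence, Tuple


def iou_and_f1(
    pred: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
) -> Tuple[float, float]:
    """Return (IoU, F1) for the foreground class of two binary masks.

    Hypothetical helper: masks are nested sequences of 0/1 with equal shape.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # building pixel predicted as building
            elif p and not t:
                fp += 1  # background pixel predicted as building
            elif t and not p:
                fn += 1  # building pixel missed by the prediction

    union = tp + fp + fn
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0

    iou = tp / union                    # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)    # harmonic mean of precision and recall
    return iou, f1


# Toy 3x3 example: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]
truth = [[1, 1, 0], [0, 0, 0], [0, 1, 0]]
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
```

On a single pair of masks the two quantities are tied by F1 = 2·IoU / (1 + IoU). Scores averaged over many images or computed with a different protocol need not satisfy this relation exactly, so it should not be used to cross-check the reported figures.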

摘要: 从遥感影像中高效且自动地提取建筑物信息是遥感智能化解译的一项重要工作,然而高分辨率遥感影像中的建筑物大小不一、形状多变,背景干扰严重,导致现有算法的提取效果不佳。针对此问题,提出了一种融合局部-全局特征的双分支网络,用于遥感影像中建筑物的准确高效提取。设计了一种CNN与Transformer双分支结构的编码器以同时捕获建筑物的局部纹理信息和全局上下文依赖关系;为了克服CNN分支与Transformer分支所提取特征的差异性,设计了跨特征注意力融合模块(CFAFM)来有效地聚合两个分支所提取到的两组不同特征,对其进行重要性加权;为了增强解码器的细粒度特征恢复能力,设计了特征细化增强模块(FREM),插入至解码器的末端以减少上采样过程中的信息丢失,细化建筑物的边缘和局部细节。在WHU、Massachusetts及Inria建筑物数据集中,所提网络的IoU分别达到90.84%、74.94%、81.24%,F1-score分别达到95.20%、85.53%、89.69%。实验结果表明,所提网络可以有效提高遥感影像建筑物的提取精度,且在复杂任务场景下与现有方法相比具有明显的优势。

关键词: 遥感影像, 建筑物提取, 双分支网络, 特征融合, 特征细化增强