
Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (4): 989-1000.DOI: 10.3778/j.issn.1673-9418.2403082
• Graphics·Image •
LI Zhijie, CHENG Xin, LI Changhua, GAO Yuan, XUE Jingyu, JIE Jun
Online: 2025-04-01
Published: 2025-03-28
李智杰,程鑫,李昌华,高元,薛靖裕,介军
LI Zhijie, CHENG Xin, LI Changhua, GAO Yuan, XUE Jingyu, JIE Jun. Cross-Modal Multi-level Feature Fusion for Semantic Segmentation of Remote Sensing Images[J]. Journal of Frontiers of Computer Science and Technology, 2025, 19(4): 989-1000.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2403082