[1] 梁敏, 汪西莉. 结合超分辨率和域适应的遥感图像语义分割方法[J]. 计算机学报, 2022, 45(12): 2619-2636.
LIANG M, WANG X L. Semantic segmentation model for remote sensing images combining super resolution and domain adaption[J]. Chinese Journal of Computers, 2022, 45(12): 2619-2636.
[2] 马妍, 古丽米拉·克孜尔别克. 图像语义分割方法在高分辨率遥感影像解译中的研究综述[J]. 计算机科学与探索, 2023, 17(7): 1526-1548.
MA Y, Gulimila·Kezierbieke. Research review of image semantic segmentation method in high-resolution remote sensing image interpretation[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(7): 1526-1548.
[3] SHELHAMER E, LONG J, DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640-651.
[4] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer, 2015: 234-241.
[5] SUN Y, TIAN Y, XU Y P. Problems of encoder-decoder frameworks for high-resolution remote sensing image segmentation: structural stereotype and insufficient learning[J]. Neurocomputing, 2019, 330: 297-304.
[6] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs[J]. Computer Science, 2014(4): 357-361.
[7] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[8] CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 833-851.
[9] 罗旭东, 吴一全, 陈金林. 无人机航拍影像目标检测与语义分割的深度学习方法研究进展[J]. 航空学报, 2024, 45(6): 028822.
LUO X D, WU Y Q, CHEN J L. Research progress on deep learning methods for object detection and semantic segmentation in UAV aerial images[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(6): 028822.
[10] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[EB/OL]. [2025-02-20]. https://arxiv.org/abs/2010.11929.
[11] STRUDEL R, GARCIA R, LAPTEV I, et al. Segmenter: transformer for semantic segmentation[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 7242-7252.
[12] XIE E Z, WANG W H, YU Z D, et al. SegFormer: simple and efficient design for semantic segmentation with transformers[EB/OL]. [2025-02-20]. https://arxiv.org/abs/2105.15203.
[13] CAO H, WANG Y Y, CHEN J, et al. Swin-Unet: Unet-like pure transformer for medical image segmentation[C]//Proceedings of the 17th European Conference on Computer Vision. Cham: Springer, 2022: 205-218.
[14] WANG L B, LI R, ZHANG C, et al. UNetFormer: a UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 190: 196-214.
[15] WU H L, HUANG P, ZHANG M, et al. CMTFNet: CNN and multiscale transformer fusion network for remote-sensing image semantic segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 2004612.
[16] LIU Z, LIN Y T, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9992-10002.
[17] KITAEV N, KAISER L, LEVSKAYA A. Reformer: the efficient transformer[C]//Proceedings of the 8th International Conference on Learning Representations, 2020: 1-11.
[18] BELTAGY I, PETERS M E, COHAN A. Longformer: the long-document transformer[EB/OL]. [2025-02-20]. https://arxiv.org/abs/2004.05150.
[19] GU A, DAO T. Mamba: linear-time sequence modeling with selective state spaces[C]//Proceedings of the 1st Conference on Language Modeling, 2024: 1-12.
[20] HE H Y, ZHANG J N, CAI Y X, et al. MobileMamba: lightweight multi-receptive visual Mamba network[C]//Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2025: 4497-4507.
[21] ZHU L H, LIAO B C, ZHANG Q, et al. Vision Mamba: efficient visual representation learning with bidirectional state space model[EB/OL]. [2025-02-20]. https://arxiv.org/abs/2401.09417.
[22] LIU Y, TIAN Y J, ZHAO Y Z, et al. VMamba: visual state space model[EB/OL]. [2025-02-20]. https://arxiv.org/abs/2401.10166.
[23] ZHAO S J, CHEN H, ZHANG X L, et al. RS-Mamba for large remote sensing image dense prediction[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5633314.
[24] ZHU Q F, CAI Y Z, FANG Y, et al. Samba: semantic segmentation of remotely sensed images with state space model[J]. Heliyon, 2024, 10(19): e38495.
[25] MA X P, ZHANG X K, PUN M O. RS3Mamba: visual state space model for remote sensing image semantic segmentation[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 6011405.
[26] LIU M, DAN J, LU Z, et al. CM-UNet: hybrid CNN-Mamba UNet for remote sensing image semantic segmentation[EB/OL]. [2025-02-20]. https://arxiv.org/abs/2405.10530.
[27] HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13713-13722.
[28] SI Y Z, XU H Y, ZHU X Z, et al. SCSA: exploring the synergistic effects between spatial and channel attention[J]. Neurocomputing, 2025, 634: 129866.
[29] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539.
[30] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6230-6239.
[31] KIRILLOV A, GIRSHICK R, HE K M, et al. Panoptic feature pyramid networks[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 6392-6401.
[32] HUANG T, PEI X H, YOU S, et al. LocalMamba: visual state space model with windowed selective scan[EB/OL]. [2025-02-21]. https://arxiv.org/abs/2403.09338.
[33] XIAO C D, LI M H, ZHANG Z Q, et al. Spatial-Mamba: effective visual state space models via structure-aware state fusion[C]//Proceedings of the 13th International Conference on Learning Representations, 2025.