[1] MIN F, KUANG Y G, HAO L L, et al. Remote sensing image object detection algorithm based on multi-branch feature mapping[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(6): 1543-1555.
[2] ZHANG C, ATKINSON P M, GEORGE C, et al. Identifying and mapping individual plants in a highly diverse high-elevation ecosystem using UAV imagery and deep learning[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 169: 280-291.
[3] ZHANG T H, GUO X X, ZHANG Y. LRSAR-Net semantic segmentation model for computer aided diagnosis for Covid-19 CT image[J]. Journal of Electronics & Information Technology, 2022, 44(1): 48-58.
[4] DONG R, PAN X, LI F. DenseU-Net-based semantic segmentation of small objects in urban remote sensing images[J]. IEEE Access, 2019, 7: 65347-65356.
[5] FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3146-3154.
[6] LI R, DUAN C, ZHENG S. MACU-Net semantic segmentation from high-resolution remote sensing images[EB/OL]. [2023-09-13]. https://arxiv.org/abs/2007.13083.
[7] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2017: 2881-2890.
[8] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30, Long Beach, Dec 4-9, 2017: 5998-6008.
[9] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[EB/OL]. [2023-09-13]. https://arxiv.org/abs/2010.11929.
[10] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10012-10022.
[11] D'ASCOLI S, TOUVRON H, LEAVITT M L, et al. Convit: improving vision transformers with soft convolutional inductive biases[C]//Proceedings of the 38th International Conference on Machine Learning, Jul 18-24, 2021: 2286-2296.
[12] GRAHAM B, EL-NOUBY A, TOUVRON H, et al. Levit: a vision transformer in convnets clothing for faster inference[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 12259-12269.
[13] CHEN J, LU Y, YU Q, et al. Transunet: transformers make strong encoders for medical image segmentation[EB/OL]. [2023-09-13]. https://arxiv.org/abs/2102.04306.
[14] RONNEBERGER O, FISCHER P, BROX T. U-net: convolutional networks for biomedical image segmentation[C]//Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Oct 5-9, 2015. Cham: Springer, 2015: 234-241.
[15] GULATI A, QIN J, CHIU C C, et al. Conformer: convolution-augmented transformer for speech recognition[EB/OL]. [2023-09-13]. https://arxiv.org/abs/2005.08100.
[16] FANG H, LI D S, JIANG G J. Efficient cross-domain transformer few-shot semantic segmentation network[J]. Computer Engineering and Applications, 2024, 60(4): 142-152.
[17] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2016: 770-778.
[18] MAO J, WANG W, HU Y, et al. A scene recognition algorithm based on deep residual network[J]. Systems Science & Control Engineering, 2019, 7(1): 243-251.
[19] HE X, ZHOU Y, ZHAO J, et al. Swin transformer embedding UNet for remote sensing image semantic segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-15.
[20] LIN R, ZHANG Y, ZHU X, et al. Local-global feature capture and boundary information refinement swin transformer segmentor for remote sensing images[J]. IEEE Access, 2024, 12: 6088-6099.
[21] YUAN H, GENG Y K. Feature refinement and multi-scale attention for Transformer image denoising network[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(7): 1838-1851.
[22] POUDEL R P K, LIWICKI S, CIPOLLA R. Fast-SCNN: fast semantic segmentation network[EB/OL]. [2023-09-13]. https://arxiv.org/abs/1902.04502.
[23] WANG Y W, CHENG J S, YANG Y. Improved semantic segmentation model and its application[J]. Computer Engineering and Applications, 2024, 60(2): 337-343.
[24] CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 801-818.
[25] ROMERA E, ALVAREZ J M, BERGASA L M, et al. ERFNet: efficient residual factorized ConvNet for real-time semantic segmentation[J]. IEEE Transactions on Intelligent Transportation Systems, 2017, 19(1): 263-272.
[26] LIANG J, SUN G, ZHANG K, et al. Mutual affine network for spatially variant kernel estimation in blind image super-resolution[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 4096-4105.
[27] WANG H, JIANG X, REN H, et al. SwiftNet: real-time video object segmentation[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 1296-1305.
[28] WANG L, LI R, ZHANG C, et al. UNetFormer: a UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 190: 196-214.
[29] MA A, WANG J, ZHONG Y, et al. FactSeg: foreground activation-driven small object semantic segmentation in large-scale remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-16.
[30] WANG J, SUN K, CHENG T, et al. Deep high-resolution representation learning for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(10): 3349-3364.
[31] CHEN Y, LIN G, LI S, et al. BANet: bidirectional aggregation network with occlusion handling for panoptic segmentation[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 3793-3802.