[1] WANG X L, LIANG M, LIU T. Feature enhanced single-stage remote sensing image object detection model[J]. Journal of Xidian University, 2022, 49(3): 160-170.
[2] LI K Y, OU O, LIU G B, et al. Target detection algorithm of remote sensing image based on improved YOLOv5[J]. Computer Engineering and Applications, 2023, 59(9): 207-214.
[3] LI A B, GUO H, QI C, et al. Dense object detection in remote sensing images under complex background[J]. Computer Engineering and Applications, 2023, 59(8): 247-253.
[4] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, Aug 23-28, 2020. Cham: Springer, 2020: 213-229.
[5] ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection[J]. arXiv:2010.04159, 2020.
[6] LIU S, LI F, ZHANG H, et al. DAB-DETR: dynamic anchor boxes are better queries for DETR[J]. arXiv:2201.12329, 2022.
[7] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Jun 23-28, 2014. Washington: IEEE Computer Society, 2014: 580-587.
[8] GIRSHICK R. Fast R-CNN[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 1440-1448.
[9] ZHOU X, GIRDHAR R, JOULIN A, et al. Detecting twenty-thousand classes using image-level supervision[C]//Proceedings of the 17th European Conference on Computer Vision, Tel Aviv, Oct 23-27, 2022. Cham: Springer, 2022: 350-368.
[10] CHEN S, SUN P, SONG Y, et al. DiffusionDet: diffusion model for object detection[J]. arXiv:2211.09788, 2022.
[11] TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 13-19, 2020. Piscataway: IEEE, 2020: 10781-10790.
[12] GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021[J]. arXiv:2107.08430, 2021.
[13] LI C, LI L, JIANG H, et al. YOLOv6: a single-stage object detection framework for industrial applications[J]. arXiv:2209.02976, 2022.
[14] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[J]. arXiv:2207.02696, 2022.
[15] ZHU Y, FANG G S, ZHENG B B, et al. Research on detection method of refined rotated boxes in remote sensing[J]. Acta Automatica Sinica, 2023, 49(2): 10-17.
[16] NAIR V, HINTON G E. Rectified linear units improve restricted Boltzmann machines[C]//Proceedings of the 27th International Conference on Machine Learning, Haifa, Jun 21-24, 2010: 807-814.
[17] MAAS A L, HANNUN A Y, NG A Y. Rectifier nonlinearities improve neural network acoustic models[C]//Proceedings of the 30th International Conference on Machine Learning, Atlanta, Jun 16-21, 2013.
[18] CLEVERT D A, UNTERTHINER T, HOCHREITER S. Fast and accurate deep network learning by exponential linear units (ELUs)[J]. arXiv:1511.07289, 2015.
[19] RAMACHANDRAN P, ZOPH B, LE Q V. Searching for activation functions[J]. arXiv:1710.05941, 2017.
[20] HENDRYCKS D, GIMPEL K. Gaussian error linear units (GELUs)[J]. arXiv:1606.08415, 2016.
[21] BISWAS K, KUMAR S, BANERJEE S, et al. SMU: smooth activation function for deep networks using smoothing maximum technique[J]. arXiv:2111.04682, 2021.
[22] OKTAY O, SCHLEMPER J, FOLGOC L L, et al. Attention U-Net: learning where to look for the pancreas[J]. arXiv:1804.03999, 2018.
[23] ZHU Y, ZHAO C, WANG J, et al. CoupleNet: coupling global structure with local parts for object detection[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 4126-4134.
[24] DAI J, LI Y, HE K, et al. R-FCN: object detection via region-based fully convolutional networks[C]//Advances in Neural Information Processing Systems 29, Barcelona, Dec 5-10, 2016: 379-387.
[25] TAN M, LE Q. EfficientNet: rethinking model scaling for convolutional neural networks[C]//Proceedings of the 36th International Conference on Machine Learning, Long Beach, Jun 9-15, 2019: 6105-6114.
[26] TAN M, LE Q. EfficientNetV2: smaller models and faster training[C]//Proceedings of the 38th International Conference on Machine Learning, Jul 18-24, 2021: 10096-10106.
[27] LIU Z, LIN Y, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 10012-10022.
[28] LIU Z, MAO H, WU C Y, et al. A ConvNet for the 2020s[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Jun 18-24, 2022. Piscataway: IEEE, 2022: 11966-11976.
[29] ZHAO G, GE W, YU Y. GraphFPN: graph feature pyramid network for object detection[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 2763-2772.
[30] DING X, ZHANG X, MA N, et al. RepVGG: making VGG-style ConvNets great again[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 19-25, 2021. Piscataway: IEEE, 2021: 13733-13742.
[31] JOCHER G, CHAURASIA A, STOKEN A, et al. Ultralytics/YOLOv5: v6.1-TensorRT, TensorFlow Edge TPU and OpenVINO export and inference[Z]. Zenodo, 2022.
[32] CAO Y, XU J, LIN S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 3, 2019. Piscataway: IEEE, 2019: 1971-1980.
[33] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 7794-7803.
[34] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Washington: IEEE Computer Society, 2018: 7132-7141.
[35] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[J]. arXiv:1810.04805, 2018.
[36] BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[C]//Advances in Neural Information Processing Systems 33, Dec 6-12, 2020: 1877-1901.
[37] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30, Long Beach, Dec 4-9, 2017: 5998-6008.
[38] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 3-19.
[39] LIU Y, SHAO Z, TENG Y, et al. NAM: normalization-based attention module[J]. arXiv:2111.12419, 2021.
[40] LIU H, LIU F, FAN X, et al. Polarized self-attention: towards high-quality pixel-wise regression[J]. arXiv:2107.00782, 2021.