Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (12): 2967-2983. DOI: 10.3778/j.issn.1673-9418.2210120
• Graphics·Image •
XU Shoukun, GU Jianan, ZHUANG Lihua, LI Ning, SHI Lin, LIU Yi
Online: 2023-12-01
Published: 2023-12-01
徐守坤,顾佳楠,庄丽华,李宁,石林,刘毅
XU Shoukun, GU Jianan, ZHUANG Lihua, LI Ning, SHI Lin, LIU Yi. Small Object Detection Based on Two-Stage Calculation Transformer[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(12): 2967-2983.
徐守坤, 顾佳楠, 庄丽华, 李宁, 石林, 刘毅. 基于两阶段计算Transformer的小目标检测[J]. 计算机科学与探索, 2023, 17(12): 2967-2983.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2210120