[1] JI C Q, WANG B B, QIN J, et al. Survey of deep feature instance level image retrieval algorithms[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(7): 1565-1575. (in Chinese)
[2] PANG J, CHEN K, SHI J, et al. Libra R-CNN: towards balanced learning for object detection[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 15-20, 2019. Piscataway: IEEE, 2019: 821-830.
[3] WANG X, ZHANG S, YU Z, et al. Scale-equalizing pyramid convolution for object detection[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 13359-13368.
[4] ZHU X, HU H, LIN S, et al. Deformable ConvNets V2: more deformable, better results[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 15-20, 2019. Piscataway: IEEE, 2019: 9308-9316.
[5] ZHANG S, CHI C, YAO Y, et al. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, Jun 13-19, 2020. Piscataway: IEEE, 2020: 9759-9768.
[6] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Jun 23-28, 2014. Washington: IEEE Computer Society, 2014: 580-587.
[7] HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[8] GIRSHICK R. Fast R-CNN[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 1440-1448.
[9] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Advances in Neural Information Processing Systems 28, Montreal, Dec 7-12, 2015: 91-99.
[10] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 2961-2969.
[11] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 11-14, 2016. Cham: Springer, 2016: 21-37.
[12] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30, Long Beach, Dec 4-9, 2017: 5998-6008.
[13] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Oct 10-17, 2021. Piscataway: IEEE, 2021: 10012-10022.
[14] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, Aug 23-28, 2020. Cham: Springer, 2020: 213-229.
[15] ANSARI M F, LODI K A. A survey of recent trends in two-stage object detection methods[C]//Proceedings of the 2020 International Conference on Renewal Power, Jammu, Apr 17-18, 2020. Singapore: Springer, 2021: 669-677.
[16] ZHANG Y, LI X, WANG F, et al. A comprehensive review of one-stage networks for object detection[C]//Proceedings of the 2021 IEEE International Conference on Signal Processing, Communications and Computing, Xi'an, Aug 17-19, 2021. Piscataway: IEEE, 2021: 1-6.
[17] ZHANG S, YANG J, SCHIELE B. Occluded pedestrian detection through guided attention in CNNs[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Piscataway: IEEE, 2018: 6995-7003.
[18] TIAN Q, WANG M H, ZHANG Y, et al. A research for automatic pedestrian detection with ACE enhancement on Faster R-CNN[C]//Proceedings of the 2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, Beijing, Oct 13-15, 2018. Piscataway: IEEE, 2018: 1-9.
[19] SHAO X, WEI J, GUO D, et al. Pedestrian detection algorithm based on improved faster RCNN[C]//Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference, Chongqing, Mar 12-14, 2021. Piscataway: IEEE, 2021: 1368-1372.
[20] YIN S, CHEN X Y, BEI X Y. Improved Mask RCNN algorithm and its application in pedestrian instance segmentation[J]. Computer Engineering, 2021, 47(6): 271-276. (in Chinese)
[21] DONG X, HAN Y, LI W, et al. Pedestrian detection in metro station based on improved SSD[C]//Proceedings of the 2019 IEEE 14th International Conference on Intelligent Systems and Knowledge Engineering, Dalian, Nov 14-16, 2019. Piscataway: IEEE, 2019: 936-939.
[22] BOYUAN W, MUQING W. Study on pedestrian detection based on an improved YOLOv4 algorithm[C]//Proceedings of the 2020 IEEE 6th International Conference on Computer and Communications, Chengdu, Dec 11-14, 2020. Piscataway: IEEE, 2020: 1198-1202.
[23] DONG C, LUO X. Research on a pedestrian detection algorithm based on improved SSD network[C]//Proceedings of the 7th International Conference on Computer-Aided Design, Manufacturing, Modeling and Simulation, Busan, Nov 14-15, 2021: 032073.
[24] GUO W, SHEN N, ZHANG T. Overlapped pedestrian detection based on YOLOv5 in crowded scenes[C]//Proceedings of the 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications, Changchun, May 20-22, 2022. Piscataway: IEEE, 2022: 412-416.
[25] LIU S, HUANG D, WANG Y. Learning spatial fusion for single-shot object detection[J]. arXiv:1911.09516, 2019.
[26] QUAN Y, ZHANG D, ZHANG L, et al. Centralized feature pyramid for object detection[J]. arXiv:2210.02093, 2022.
[27] CAO Y, XU J, LIN S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-28, 2019. Piscataway: IEEE, 2019: 21-29.
[28] XING H, WANG S, ZHENG D, et al. Dual attention based feature pyramid network[J]. China Communications, 2020, 17(8): 242-252.
[29] PENG H, LI X M. Multi-scale selection pyramid networks for small-sample target detection algorithms[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(7): 1649-1660. (in Chinese)
[30] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[J]. arXiv:1810.04805, 2018.
[31] DAI X, CHEN Y, XIAO B, et al. Dynamic Head: unifying object detection heads with attentions[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, Jun 20-25, 2021. Piscataway: IEEE, 2021: 7373-7382.
[32] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Jul 21-26, 2017. Washington: IEEE Computer Society, 2017: 2117-2125.
[33] LIANG T, WANG Y, TANG Z, et al. OPANAS: one-shot path aggregation network architecture search for object detection[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, Jun 20-25, 2021. Piscataway: IEEE, 2021: 10195-10203.
[34] ZHANG S, XIE Y, WAN J, et al. WiderPerson: a diverse dataset for dense pedestrian detection in the wild[J]. IEEE Transactions on Multimedia, 2019, 22(2): 380-393.
[35] CHEN K, WANG J, PANG J, et al. MMDetection: open MMLab detection toolbox and benchmark[J]. arXiv:1906.07155, 2019.
[36] TIAN Z, SHEN C, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Oct 27-Nov 2, 2019. Piscataway: IEEE, 2019: 9627-9636.
[37] WANG X, KONG T, SHEN C, et al. SOLO: segmenting objects by locations[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, Aug 23-28, 2020. Cham: Springer, 2020: 649-665.
[38] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Piscataway: IEEE, 2017: 2980-2988.
[39] CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, Jun 18-23, 2018. Piscataway: IEEE, 2018: 6154-6162.
[40] ZHENG Z, YE R, WANG P, et al. Localization distillation for dense object detection[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Jun 18-24, 2022. Piscataway: IEEE, 2022: 9407-9416.
[41] CHEN W, XU X, JIA J, et al. Beyond appearance: a semantic controllable self-supervised learning framework for human-centric visual tasks[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Jun 18-22, 2023. Piscataway: IEEE, 2023: 15050-15061.
[42] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. arXiv:1409.1556, 2014.
[43] REDMON J, FARHADI A. YOLOv3: an incremental improvement[J]. arXiv:1804.02767, 2018.
[44] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 770-778.
[45] LIU Z, MAO H, WU C Y, et al. A ConvNet for the 2020s[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, Jun 18-24, 2022. Piscataway: IEEE, 2022: 11976-11986.
[46] SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 15-20, 2019. Piscataway: IEEE, 2019: 5693-5703.
[47] KIRILLOV A, GIRSHICK R, HE K, et al. Panoptic feature pyramid networks[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 15-20, 2019. Piscataway: IEEE, 2019: 6399-6408.
[48] GHIASI G, LIN T Y, LE Q V. NAS-FPN: learning scalable feature pyramid architecture for object detection[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, Jun 15-20, 2019. Piscataway: IEEE, 2019: 7036-7045.