Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue (9): 2239-2260. DOI: 10.3778/j.issn.1673-9418.2311105
• Frontiers·Surveys •
LIAN Zhe, YIN Yanjun, ZHI Min, XU Qiaozhi
Online: 2024-09-01
Published: 2024-09-01
连哲,殷雁君,智敏,徐巧枝
LIAN Zhe, YIN Yanjun, ZHI Min, XU Qiaozhi. Review of Differentiable Binarization Techniques for Text Detection in Natural Scenes[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(9): 2239-2260.
连哲, 殷雁君, 智敏, 徐巧枝. 自然场景文本检测中可微分二值化技术综述[J]. 计算机科学与探索, 2024, 18(9): 2239-2260.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2311105