SMViT: Lightweight Siamese Masked Vision Transformer Model for Diagnosis of COVID-19

doi:10.3778/j.issn.1673-9418.2210070

Abstract

Abstract: Deep learning models for COVID-19 diagnosis suffer from low accuracy, poor generalization ability, and large parameter counts. To address these problems, this paper proposes SMViT (siamese masked vision transformer), a lightweight siamese network for COVID-19 diagnosis based on ViT (vision transformer) and the siamese network. First, a cyclic-substructure lightweight strategy is proposed, which builds the diagnosis network from multiple subnetworks with the same structure, thereby reducing the number of network parameters. Second, a masked self-supervised pre-training model based on ViT is proposed to enhance the latent feature representation ability of the model. Third, the siamese network SMViT is constructed to improve diagnostic accuracy and to mitigate the poor generalization ability of the model under small samples. Finally, ablation experiments are used to verify and determine the structure of the model, and comparative experiments verify its diagnostic performance and lightweight capability. Experimental results show that, compared with the most competitive ViT-based diagnostic model, the accuracy, specificity, sensitivity, and F1 score of SMViT on the X-ray dataset increase by 1.42%, 4.62%, 0.40%, and 2.80%, respectively, and on the CT image dataset by 2.16%, 2.17%, 2.05%, and 2.06%, respectively. The SMViT model has strong generalization ability when the sample size is small. Compared with ViT, SMViT has fewer parameters and higher diagnostic performance.

Key words: diagnosis of COVID-19, siamese network, vision transformer, self-supervised learning, lightweight model
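
The abstract reports accuracy, specificity, sensitivity, and F1 score. As a reminder of how these four quantities relate, the following is a minimal sketch using their standard binary-classification definitions. The function name `binary_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fn: positive (e.g. COVID-19) cases predicted positive/negative.
    tn/fp: negative cases predicted negative/positive.
    """
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = _ratio(tp, tp + fn)  # recall, true positive rate
    specificity = _ratio(tn, tn + fp)  # true negative rate
    precision = _ratio(tp, tp + fp)
    # F1 is the harmonic mean of precision and sensitivity (recall).
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only; not results from the paper.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because specificity depends only on the negative class and sensitivity only on the positive class, the two can improve by very different amounts between models, as the X-ray figures in the abstract illustrate.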

MA Ziping, TAN Lidao, MA Jinlin, CHEN Yong. SMViT: Lightweight Siamese Masked Vision Transformer Model for Diagnosis of COVID-19[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(10): 2499-2510.
