Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (3): 511-532. DOI: 10.3778/j.issn.1673-9418.2210035
AN Shengbiao, GUO Yuqi, BAI Yu, WANG Tengbo
Online: 2023-03-01
Published: 2023-03-01
Abstract: In recent years, aided by large-scale datasets and vast computational resources, artificial intelligence algorithms represented by deep learning have achieved success in many fields. Image classification in computer vision has flourished in particular, and many mature classification models for visual tasks have emerged. These models all require large numbers of labeled samples for training, yet in real-world scenarios data are often scarce due to practical constraints, and high-quality labeled samples at the required scale are hard to obtain. Learning from only a few samples has therefore become a current research hotspot. This paper systematically reviews recent work on few-shot image classification; few-shot learning mainly employs deep learning methods such as meta-learning, metric learning, and data augmentation. It summarizes research progress and representative models from supervised, semi-supervised, and unsupervised perspectives, reports the performance of these models on several public datasets, and compares them in terms of mechanism, strengths, and limitations. Finally, it discusses the technical challenges facing few-shot image classification and future development trends.
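To make the metric-learning idea mentioned above concrete, here is a minimal sketch (not taken from the paper) of one N-way K-shot episode in the style of prototypical networks, one of the representative models such surveys cover: each class's support embeddings are averaged into a prototype, and a query is labeled by its nearest prototype. The 2-D "embeddings" below are toy stand-ins for features from a pretrained extractor.

```python
import numpy as np

def class_prototypes(support, labels, n_way):
    """support: (n_way * k_shot, d) embeddings; labels: ints in [0, n_way)."""
    # Mean support embedding per class = that class's prototype.
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def nearest_prototype(query, protos):
    """Assign each query embedding to the class of its closest prototype."""
    # Squared Euclidean distance from every query to every prototype.
    dists = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy 3-way 5-shot episode: three well-separated cluster centers.
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    labels = np.repeat(np.arange(3), 5)
    support = centers[labels] + 0.1 * rng.standard_normal((15, 2))
    protos = class_prototypes(support, labels, n_way=3)
    query = centers + 0.1 * rng.standard_normal((3, 2))  # one query per class
    print(nearest_prototype(query, protos))  # one prediction per query
```

In the real methods the embeddings come from a network trained episodically so that this nearest-prototype rule generalizes to unseen classes; the sketch only shows the classification step.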
AN Shengbiao, GUO Yuqi, BAI Yu, WANG Tengbo. Survey of Few-Shot Image Classification Research[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(3): 511-532.