Review on Multi-label Classification

doi:10.3778/j.issn.1673-9418.2303082

Abstract

Abstract: Multi-label classification refers to classification problems in which a single sample may carry multiple labels at the same time. It has been widely applied to text classification, image classification, and music and video classification. Unlike traditional single-label classification, multi-label classification is more complex because labels may be correlated with or dependent on one another. In recent years, deep learning has developed rapidly, and multi-label classification methods that incorporate deep learning have become a research hotspot. This paper therefore summarizes multi-label classification methods from both the traditional and the deep learning-based perspectives, and analyzes the key ideas, representative models, and advantages and disadvantages of each method. For traditional multi-label classification, problem transformation methods and algorithm adaptation methods are introduced. For deep learning-based multi-label classification, particular attention is given to the latest Transformer-based methods, which have become one of the mainstream approaches to multi-label classification. In addition, multi-label classification datasets from different domains are introduced, and 15 evaluation metrics for multi-label classification are briefly analyzed. Finally, future work is discussed from three perspectives: multi-label classification of multi-modal data, multi-label classification based on prompt learning, and multi-label classification of imbalanced data, with the aim of further promoting the development and application of multi-label classification.

Key words: multi-label classification, problem transformation, algorithm adaptation, deep learning
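
As a rough illustration of what "problem transformation" can mean, the sketch below rewrites a toy multi-label dataset in two commonly described ways: binary relevance (one independent binary task per label) and label powerset (each distinct label set becomes one class). The abstract does not name these two strategies, so their choice here is an assumption, and the sample data and function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Sample = Tuple[str, FrozenSet[str]]

# Toy multi-label dataset (hypothetical): each sample carries a set of labels.
samples: List[Sample] = [
    ("doc1", frozenset({"sports", "politics"})),
    ("doc2", frozenset({"music"})),
    ("doc3", frozenset({"sports"})),
    ("doc4", frozenset({"sports", "politics"})),
]


def label_space(data: List[Sample]) -> List[str]:
    """Return the sorted list of all labels that appear in the data."""
    return sorted({label for _, labels in data for label in labels})


def binary_relevance(data: List[Sample]) -> Dict[str, List[Tuple[str, int]]]:
    """Build one binary task per label: 1 if the sample carries it, else 0."""
    return {
        label: [(x, int(label in labels)) for x, labels in data]
        for label in label_space(data)
    }


def label_powerset(
    data: List[Sample],
) -> Tuple[List[Tuple[str, int]], Dict[FrozenSet[str], int]]:
    """Build one multi-class task: each distinct label set gets a class id."""
    class_ids: Dict[FrozenSet[str], int] = {}
    transformed: List[Tuple[str, int]] = []
    for x, labels in data:
        # Assign the next free id the first time a label set is seen.
        class_id = class_ids.setdefault(labels, len(class_ids))
        transformed.append((x, class_id))
    return transformed, class_ids


br_tasks = binary_relevance(samples)
lp_task, lp_classes = label_powerset(samples)
```

In this sketch, binary relevance treats every label independently and so ignores any correlation between labels, whereas label powerset keeps co-occurring labels together at the cost of one class per distinct label set. Either transformed task could then be handed to an ordinary single-label classifier.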

LI Dongmei, YANG Yu, MENG Xianghao, ZHANG Xiaoping, SONG Chao, ZHAO Yufeng. Review on Multi-label Classification[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(11): 2529-2542.
