Journal of Frontiers of Computer Science and Technology, 2023, Vol. 17, Issue (11): 2529-2542. DOI: 10.3778/j.issn.1673-9418.2303082

• Frontiers·Surveys •

Review on Multi-label Classification

LI Dongmei, YANG Yu, MENG Xianghao, ZHANG Xiaoping, SONG Chao, ZHAO Yufeng   

  1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
  2. Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China
  3. National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
  • Online: 2023-11-01 Published: 2023-11-01

多标签分类综述

李冬梅,杨宇,孟湘皓,张小平,宋潮,赵玉凤   

  1. 北京林业大学 信息学院,北京 100083
  2. 国家林业草原林业智能信息处理工程技术研究中心,北京 100083
  3. 中国中医科学院 中医药数据中心,北京 100700

Abstract: Multi-label classification is the classification problem in which a single sample may carry multiple labels at once. It has been widely applied in fields such as text classification, image classification, and music and video classification. Unlike traditional single-label classification, multi-label classification is more complex because labels may be correlated with or dependent on one another. In recent years, with the rapid development of deep learning, multi-label classification methods that incorporate deep learning have become a research hotspot. This paper therefore summarizes multi-label classification methods from two perspectives, traditional and deep learning-based, and analyzes the key ideas, representative models, and advantages and disadvantages of each method. For traditional methods, problem transformation methods and algorithm adaptation methods are introduced. For deep learning-based methods, particular attention is given to the latest Transformer-based methods, which have become one of the mainstream approaches to multi-label classification. In addition, multi-label classification datasets from different domains are introduced, and 15 evaluation metrics for multi-label classification are briefly analyzed. Finally, future work is discussed from three perspectives: multi-label classification of multi-modal data, prompt learning-based multi-label classification, and multi-label classification of imbalanced data, with the aim of further promoting the development and application of multi-label classification.

Key words: multi-label classification, problem transformation, algorithm adaptation, deep learning
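
As a rough illustration of what "problem transformation" means in practice, the sketch below recasts a tiny multi-label dataset in two ways and scores a set of predictions. The abstract does not name specific methods or metrics; binary relevance, label powerset, and Hamming loss are used here only as commonly cited examples, and the toy data, label names, and function names are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical toy data: (sample id, set of labels attached to that sample).
Sample = Tuple[str, FrozenSet[str]]

data: List[Sample] = [
    ("d1", frozenset({"sports", "politics"})),
    ("d2", frozenset({"sports"})),
    ("d3", frozenset({"music"})),
    ("d4", frozenset({"sports", "politics"})),
]
labels: List[str] = ["music", "politics", "sports"]


def binary_relevance(samples: Sequence[Sample],
                     label_space: Sequence[str]) -> Dict[str, List[int]]:
    """Build one independent 0/1 target vector per label.

    Each vector can be handed to any binary classifier; correlations
    between labels are ignored by this transformation.
    """
    return {lab: [int(lab in y) for _, y in samples] for lab in label_space}


def label_powerset(samples: Sequence[Sample]
                   ) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Map every distinct label set to one class of a multi-class problem."""
    class_ids: Dict[FrozenSet[str], int] = {}
    targets = [class_ids.setdefault(y, len(class_ids)) for _, y in samples]
    return targets, class_ids


def hamming_loss(true: Sequence[FrozenSet[str]],
                 pred: Sequence[FrozenSet[str]],
                 label_space: Sequence[str]) -> float:
    """Fraction of (sample, label) decisions that are wrong."""
    wrong = sum(len(t ^ p) for t, p in zip(true, pred))
    return wrong / (len(true) * len(label_space))


br_targets = binary_relevance(data, labels)
lp_targets, lp_classes = label_powerset(data)

# Hypothetical predictions: d1 misses "politics", d3 gains a spurious "sports".
predicted = [
    frozenset({"sports"}),
    frozenset({"sports"}),
    frozenset({"music", "sports"}),
    frozenset({"sports", "politics"}),
]
loss = hamming_loss([y for _, y in data], predicted, labels)
```

Binary relevance yields three separate binary problems here, while label powerset yields a single three-class problem because only three distinct label sets occur. The second transformation keeps label co-occurrence information but its class count can grow quickly with the number of labels, which is one reason label correlation makes the multi-label setting harder than the single-label one.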

摘要: 多标签分类是指在一个样本中可能会有多个标签同时存在的分类问题,目前已被广泛应用于文本分类、图像分类、音乐及视频分类等领域。与传统的单标签分类问题不同,由于标签之间可能存在相关性或者依赖关系,多标签分类问题变得更加复杂。近年来,深度学习技术发展迅猛,结合深度学习的多标签分类方法逐渐成为研究热点。因此,从传统的和基于深度学习的角度对多标签分类方法进行了总结,分析了每一种方法的关键思想、代表性模型和优缺点。在传统的多标签分类方法中,分别介绍了问题转换方法和算法自适应方法。在基于深度学习的多标签分类方法中,特别是对最新的基于Transformer的多标签分类方法进行了综述,该方法目前已成为解决多标签分类问题的主流方法之一。此外,介绍了来自不同领域的多标签分类数据集,并简要分析了多标签分类的15个评价指标。最后,从多模态数据多标签分类、基于提示学习的多标签分类和不平衡数据多标签分类三方面对未来工作进行了展望,以期进一步推动多标签分类的发展和应用。

关键词: 多标签分类, 问题转换, 算法自适应, 深度学习