Journal of Frontiers of Computer Science and Technology ›› 2025, Vol. 19 ›› Issue (3): 714-723. DOI: 10.3778/j.issn.1673-9418.2405061

• Artificial Intelligence · Pattern Recognition •

Multi-label Classification Based on Label-Aware Variational Autoencoder

SUN Hongjian, XU Pengyu, LIU Bing, JING Liping, YU Jian   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
    2. Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
  • Online: 2025-03-01    Published: 2025-02-28

Abstract: With the rise of the Internet, data of all kinds are growing rapidly, and making efficient use of these sample data has become an important problem in data mining. Multi-label classification, an important task in machine learning and data mining, aims to annotate each sample with multiple label categories. Most current methods learn embedding representations only for the feature branch; they ignore the semantic relevance between features and labels and place no effective constraint on the feature embedding space, so the learned feature embeddings lack discriminative focus. In terms of label-correlation learning, most existing methods concentrate on low-order label correlations, so the insufficient modeling of high-order correlations among multiple labels becomes more pronounced in complex real-world labeling scenarios. To address these problems, this paper proposes a multi-label classification method based on a label-aware variational autoencoder, approached from both embedding representation learning and label-correlation learning. For embedding representation learning, a dual-stream variational autoencoder for features and labels learns and aligns the two embedding spaces simultaneously, and label guidance is added to the feature embedding space to strengthen the feature embeddings. A cross-attention mechanism based on label semantics then injects label-specific information into the feature embeddings, yielding discriminative, label-aware feature embeddings. For label-correlation learning, the multi-layer self-attention in a shared decoder fully fuses similarity information across labels; through the co-occurrence interactions among different labels, it learns high-order label-correlation representations that are used to cross-perceive the feature embeddings. Experimental results on datasets from four different domains show that the proposed method effectively enhances feature and label embeddings and fully captures high-order correlation information among labels for multi-label classification; comparisons with several state-of-the-art algorithms on multiple evaluation metrics confirm its significant performance advantage.
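The following is a minimal PyTorch sketch of the dual-stream design the abstract describes: a feature encoder and a label encoder that map into one shared latent space, with a KL term that pulls the feature embedding toward the label embedding (the "label guidance" on the feature embedding space). All module names, dimensions, and the specific Gaussian KL alignment term are illustrative assumptions, not the paper's implementation. A companion sketch of the label-aware attention components follows the keyword list below.

```python
# Hypothetical sketch of a dual-stream variational autoencoder encoder;
# not taken from the paper's code.
import torch
import torch.nn as nn


class GaussianEncoder(nn.Module):
    """Encode an input into the mean and log-variance of a diagonal Gaussian."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)
        self.logvar = nn.Linear(512, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.mu(h), self.logvar(h)


class DualStreamVAEEncoder(nn.Module):
    """Feature stream and label stream encode into one shared latent space.

    The label stream encodes the ground-truth label vector; the feature stream
    is pulled toward it with a KL term, so the feature embedding space receives
    label guidance, in the spirit of the abstract's description.
    """

    def __init__(self, feat_dim: int, num_labels: int, latent_dim: int = 128):
        super().__init__()
        self.feat_enc = GaussianEncoder(feat_dim, latent_dim)
        self.label_enc = GaussianEncoder(num_labels, latent_dim)

    @staticmethod
    def reparameterize(mu, logvar):
        # Standard VAE reparameterization trick.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x, y):
        mu_x, logvar_x = self.feat_enc(x)
        mu_y, logvar_y = self.label_enc(y.float())
        z_x = self.reparameterize(mu_x, logvar_x)
        z_y = self.reparameterize(mu_y, logvar_y)
        # KL( q(z|x) || q(z|y) ) between two diagonal Gaussians: aligns the
        # feature embedding distribution with the label embedding distribution.
        align_kl = 0.5 * torch.sum(
            logvar_y - logvar_x
            + (logvar_x.exp() + (mu_x - mu_y) ** 2) / logvar_y.exp()
            - 1.0,
            dim=-1,
        ).mean()
        return z_x, z_y, align_kl
```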

Key words: multi-label classification, embedding space learning, variational autoencoder, Transformer, label correlation
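As a companion to the sketch above, the following illustrates the other two components the abstract outlines: label-semantic cross-attention that injects label-specific information into the feature embedding, and a shared decoder whose stacked self-attention layers fuse co-occurrence information across labels to model high-order label correlations. The learnable label-semantic embeddings, the use of nn.MultiheadAttention and nn.TransformerEncoder, and all hyperparameters are assumptions made only for illustration.

```python
# Hypothetical sketch of the label-aware attention branch; continues the
# DualStreamVAEEncoder sketch above and is not the paper's implementation.
import torch
import torch.nn as nn


class LabelAwareClassifier(nn.Module):
    def __init__(self, num_labels: int, latent_dim: int = 128,
                 num_heads: int = 4, num_decoder_layers: int = 3):
        super().__init__()
        # Learnable label-semantic embeddings used as cross-attention queries.
        self.label_semantics = nn.Parameter(torch.randn(num_labels, latent_dim))
        self.cross_attn = nn.MultiheadAttention(latent_dim, num_heads, batch_first=True)
        # Shared decoder: stacked self-attention over the L label tokens, so every
        # label attends to every other label (high-order co-occurrence interaction).
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=num_heads,
                                           dim_feedforward=4 * latent_dim,
                                           batch_first=True)
        self.label_decoder = nn.TransformerEncoder(layer, num_layers=num_decoder_layers)
        self.score = nn.Linear(latent_dim, 1)

    def forward(self, z):
        # z: (batch, latent_dim) latent embedding from either VAE stream.
        b = z.size(0)
        queries = self.label_semantics.unsqueeze(0).expand(b, -1, -1)  # (b, L, d)
        memory = z.unsqueeze(1)                                        # (b, 1, d)
        # Cross-attention: each label queries the sample embedding, producing
        # label-aware (label-specific) feature embeddings.
        label_aware, _ = self.cross_attn(queries, memory, memory)
        # Multi-layer self-attention fuses similarity/co-occurrence information
        # across all labels before per-label scoring.
        fused = self.label_decoder(label_aware + queries)
        return self.score(fused).squeeze(-1)  # (b, L) logits


# Usage sketch: combine with the dual-stream encoder above and a BCE loss, e.g.
#   enc = DualStreamVAEEncoder(feat_dim=300, num_labels=20)
#   clf = LabelAwareClassifier(num_labels=20)
#   z_x, z_y, align_kl = enc(x, y)
#   loss = nn.functional.binary_cross_entropy_with_logits(clf(z_x), y.float()) \
#          + 0.1 * align_kl   # weighting of the alignment term is an assumption
```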