Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (6): 1279-1290. DOI: 10.3778/j.issn.1673-9418.2111144
LIU Yafen1,2, ZHENG Yifeng1,2,+, JIANG Lingyi1,2, LI Guohe3, ZHANG Wenjie1,2
Received: 2021-11-02
Revised: 2022-01-05
Published: 2022-06-01
Online: 2022-01-17
Corresponding author: + E-mail: zyf@mnnu.edu.cn
About author: LIU Yafen, born in 1999, from Nanping, Fujian, M.S. candidate, member of CCF. Her research interests include machine learning and deep learning.
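Abstract:
With the development of intelligent technology, deep learning has become a research hotspot in machine learning and plays an increasingly important role in many fields. Deep learning requires large amounts of labeled data to improve model performance. To address the labeling problem effectively, researchers have combined semi-supervised learning with deep learning. Building a model from a small amount of labeled data together with a large amount of unlabeled data helps enlarge the sample space. Given the theoretical significance and practical value of deep semi-supervised learning, this paper takes the pseudo-labeling methods of deep semi-supervised learning as its entry point for analysis. First, deep semi-supervised learning is introduced and the advantages of pseudo-labeling methods are pointed out. Second, pseudo-labeling methods are described from the perspectives of self-training and multi-view training, and existing models are analyzed comprehensively. Next, label propagation methods based on graphs and pseudo-labels are examined in detail, and existing pseudo-labeling methods are analyzed experimentally. Finally, the problems facing pseudo-labeling methods and future research directions are summarized in terms of the utility of unlabeled data, noisy data, rationality, and the combination of pseudo-labeling methods.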
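The self-training idea mentioned in the abstract can be illustrated with a toy loop: fit a model on the labeled data, assign pseudo-labels to unlabeled points the model is confident about, add them to the training set, and repeat. The sketch below is only an illustration under stated assumptions; the nearest-centroid classifier on 1-D points, the distance-based confidence score, the `threshold` value, and all function names are hypothetical choices, not the models surveyed in the paper.

```python
# Toy self-training loop with confidence-thresholded pseudo-labels.
# Assumptions (not from the paper): a nearest-centroid classifier on 1-D
# points, at least two classes, and a distance-based confidence score.

def fit_centroids(points, labels):
    """Return {class: mean of the points carrying that class}."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_with_confidence(centroids, x):
    """Predict the nearest centroid's class with a confidence in [0.5, 1]."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    d_best = abs(x - centroids[ranked[0]])
    d_second = abs(x - centroids[ranked[1]])
    total = d_best + d_second
    confidence = 0.5 if total == 0 else d_second / total
    return ranked[0], confidence

def self_train(labeled, unlabeled, threshold=0.75, max_rounds=10):
    """Grow the training set with confident pseudo-labels until none are added."""
    points = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        remaining, added = [], 0
        for x in pool:
            y, confidence = predict_with_confidence(centroids, x)
            if confidence >= threshold:
                # Treat the confident prediction as a (pseudo-)label.
                points.append(x)
                labels.append(y)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break
    return fit_centroids(points, labels), pool

centroids, leftover = self_train([(0.0, "a"), (10.0, "b")],
                                 [1.0, 2.0, 9.0, 8.0, 5.0])
print(centroids, leftover)
```

In this toy run the ambiguous point at 5.0 never crosses the threshold and stays unlabeled, which is the intended behaviour: a threshold keeps low-confidence predictions from being fed back as training noise.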
LIU Yafen, ZHENG Yifeng, JIANG Lingyi, LI Guohe, ZHANG Wenjie. Survey on Pseudo-Labeling Methods in Deep Semi-supervised Learning[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(6): 1279-1290.
Table 1 UCI datasets used in the experiments
Dataset | Nodes (samples) | Features | Classes | Class distribution |
---|---|---|---|---|
Iris | 150 | 4 | 3 | 50, 50, 50 |
Cmc | 1473 | 9 | 3 | 629, 333, 511 |
Ionosphere | 351 | 34 | 2 | 225, 126 |
Table 2 Experimental results of pseudo-labeling methods on different image datasets (%)
Method | CIFAR-10 | CIFAR-100 | ILSVRC-2012 |
---|---|---|---|
Entropy minimization | 86.41 | - | 83.39 |
Proxy label | - | - | 82.41 |
Noisy student | - | - | 88.39 |
Meta pseudo labels | 88.62 | - | 90.20 |
Self-supervised semi-supervised | - | - | 91.23 |
Co-training | 90.97 | 65.37 | - |
Tri-training | 91.55 | 70.26 | - |
Table 3 Experimental results of pseudo-labeling methods on different UCI datasets (%)
Method | Iris | Cmc | Ionosphere |
---|---|---|---|
Co-training | 75.21 | 32.33 | 63.82 |
Tri-training | 80.03 | 35.92 | 64.13 |
Label propagation | 85.02 | 40.95 | 67.65 |
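Label propagation, one of the methods compared in Table 3, spreads labels from a few labeled nodes to their neighbours over a similarity graph. The sketch below shows a generic textbook variant (iterative averaging over neighbours with the labeled nodes clamped). It is an assumption-laden illustration: the function name, the chain graph, and the iteration count are hypothetical, and it is not necessarily the variant evaluated in the paper.

```python
# Minimal label propagation on a weighted graph with clamped seed nodes.
# Assumptions (not from the paper): a dense adjacency list-of-lists, a fixed
# number of iterations, and uniform initial scores for unlabeled nodes.

def propagate_labels(weights, seed_labels, num_classes, iterations=200):
    """Return one predicted class index per node.

    weights      -- symmetric n x n matrix of non-negative edge weights
    seed_labels  -- {node index: class index} for the labeled nodes
    """
    n = len(weights)
    # Row-normalise the weights into transition probabilities.
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total if total else 0.0 for w in row])
    # One-hot scores for seeds, uniform scores elsewhere.
    scores = []
    for i in range(n):
        if i in seed_labels:
            scores.append([1.0 if c == seed_labels[i] else 0.0
                           for c in range(num_classes)])
        else:
            scores.append([1.0 / num_classes] * num_classes)
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seed_labels:
                updated.append(scores[i])  # clamp the known labels
            else:
                updated.append([
                    sum(transition[i][j] * scores[j][c] for j in range(n))
                    for c in range(num_classes)
                ])
        scores = updated
    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]

# A 6-node chain 0-1-2-3-4-5 with the two end nodes labeled.
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(6)] for i in range(6)]
print(propagate_labels(chain, {0: 0, 5: 1}, num_classes=2))
```

On the chain, each unlabeled node ends up with the label of the nearer seed, so the labels produced for unlabeled nodes can then serve as pseudo-labels for a downstream model.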