Maximize AUC for Positive-Unlabeled Classification and Incremental Algorithm

doi:10.3778/j.issn.1673-9418.1912073

Abstract

Positive-unlabeled classification, abbreviated as PU classification, provides only positive samples and unlabeled samples, so traditional classification methods often perform poorly in this setting. Exploiting the relationship between the AUC (area under the receiver operating characteristic curve) in PU classification and the AUC in traditional classification, this paper proposes using the AUC of traditional classification as the objective function for PU classification. A Gaussian kernel function maps the original samples into a high-dimensional space so that the data become linearly separable. Optimizing the AUC objective function yields an analytical solution, which avoids repeated iterations, and an incremental formula can be derived from it to speed up computation. Experimental results show that the proposed algorithm achieves performance close to that of an ideal support vector machine (SVM) for which the labels of all positive and negative examples in the training set are known, and that it supports fast incremental updates. It is a powerful tool for dealing with real-world problems.

Key words: machine learning, positive-unlabeled (PU) classification, AUC, incremental algorithm
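
The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. If a fraction π of the unlabeled samples are positive, the expected AUC measured between positive and unlabeled samples is AUC_PU = (1 - π)·AUC + π/2, so a ranking that maximizes one also maximizes the other. The sketch assumes a squared pairwise loss (which is what makes a closed-form solution possible here), a Gaussian basis on fixed centers, and an L2 penalty. The names PUAucRanker, sigma, lam, and demo are hypothetical, and the running-sum update is a simple stand-in rather than the incremental formula derived in the paper.

```python
import numpy as np


def gaussian_features(X, centers, sigma=1.0):
    """Gaussian basis: phi_j(x) = exp(-||x - c_j||^2 / (2 * sigma^2))."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def empirical_auc(scores_pos, scores_other):
    """Fraction of (pos, other) pairs ranked correctly; ties count as 1/2."""
    diff = scores_pos[:, None] - scores_other[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


class PUAucRanker:
    """Hypothetical helper: squared-loss PU-AUC ranker kept as running sums."""

    def __init__(self, centers, sigma=1.0, lam=1e-2):
        self.centers, self.sigma, self.lam = centers, sigma, lam
        b = centers.shape[0]
        self.n = {"p": 0, "u": 0}                                 # sample counts
        self.s = {"p": np.zeros(b), "u": np.zeros(b)}             # feature sums
        self.G = {"p": np.zeros((b, b)), "u": np.zeros((b, b))}   # Phi^T Phi
        self.w = np.zeros(b)

    def update(self, X_new, kind):
        """Add a chunk of positive ("p") or unlabeled ("u") samples."""
        Phi = gaussian_features(X_new, self.centers, self.sigma)
        self.n[kind] += Phi.shape[0]
        self.s[kind] += Phi.sum(axis=0)
        self.G[kind] += Phi.T @ Phi
        if self.n["p"] and self.n["u"]:
            self._solve()
        return self

    def _solve(self):
        # Minimizes mean over pairs of (1 - w^T(phi_p - phi_u))^2 + lam*||w||^2,
        # whose stationary point is w = (H + lam*I)^{-1} (mu_p - mu_u).
        mu_p = self.s["p"] / self.n["p"]
        mu_u = self.s["u"] / self.n["u"]
        H = (self.G["p"] / self.n["p"] + self.G["u"] / self.n["u"]
             - np.outer(mu_p, mu_u) - np.outer(mu_u, mu_p))
        self.w = np.linalg.solve(H + self.lam * np.eye(H.shape[0]), mu_p - mu_u)

    def score(self, X):
        return gaussian_features(X, self.centers, self.sigma) @ self.w


def demo(seed=0, prior=0.4):
    """Synthetic 2-D example; returns (AUC vs negatives, AUC vs unlabeled, prior)."""
    rng = np.random.default_rng(seed)

    def sample(mean, n):
        return rng.normal(mean, 1.0, size=(n, 2))

    n_u = 500
    n_up = int(prior * n_u)                  # hidden positives among unlabeled
    X_pos = sample(+1.5, 200)
    X_unl = np.vstack([sample(+1.5, n_up), sample(-1.5, n_u - n_up)])
    centers = X_unl[rng.choice(n_u, size=50, replace=False)]

    model = PUAucRanker(centers, sigma=1.5)
    model.update(X_pos, "p").update(X_unl, "u")

    # Fresh test data: labeled positives, plus an unlabeled mix of the same prior.
    s_pos = model.score(sample(+1.5, 300))
    s_hidden_pos = model.score(sample(+1.5, n_up))
    s_neg = model.score(sample(-1.5, n_u - n_up))
    auc_pn = empirical_auc(s_pos, s_neg)
    auc_pu = empirical_auc(s_pos, np.concatenate([s_hidden_pos, s_neg]))
    return auc_pn, auc_pu, prior


if __name__ == "__main__":
    auc_pn, auc_pu, prior = demo()
    print("AUC against true negatives:", auc_pn)
    print("AUC against unlabeled:     ", auc_pu)
    print("(1 - prior)*AUC + prior/2: ", (1 - prior) * auc_pn + prior / 2)
```

Because the model is stored only as counts, feature sums, and Gram matrices, calling update with a new chunk of samples costs time proportional to the chunk size plus one linear solve, and it gives the same weights as refitting on all data at once. This only works with a fixed set of centers, which is a design choice of this sketch.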

MA Yumin, WANG Shitong. Maximize AUC for Positive-Unlabeled Classification and Incremental Algorithm[J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(11): 1879-1887.

马毓敏，王士同. 最大化AUC的正例未标注分类及其增量算法[J]. 计算机科学与探索, 2020, 14(11): 1879-1887.
