Incremental Reduced Lagrangian Asymmetric ν-Twin Support Vector Regression

doi:10.3778/j.issn.1673-9418.2207050

Abstract

Abstract: Lagrangian asymmetric ν-twin support vector regression is a prediction algorithm with good generalization performance. However, it is unsuitable for scenarios in which samples arrive incrementally. To address this, an incremental reduced Lagrangian asymmetric ν-twin support vector regression (IRLAsy-ν-TSVR) algorithm is proposed. Firstly, plus functions are introduced to transform the constrained optimization problems into unconstrained ones, which are solved directly in the primal space by the semi-smooth Newton method to accelerate convergence. Then, the matrix inversion lemma is used to update the inverse of the Hessian matrix in the semi-smooth Newton method incrementally and efficiently, which reduces the time cost. Next, to reduce the memory consumption caused by sample accumulation, a reduction technique selects column vectors and row vectors of the augmented kernel matrix to approximate the original augmented kernel matrix, which ensures the sparsity of the solution. Finally, the feasibility and efficacy of the proposed algorithm are validated on benchmark datasets. The results show that, compared with several representative algorithms, the IRLAsy-ν-TSVR algorithm inherits the generalization performance of the offline algorithm and obtains sparse solutions, making it more suitable for online learning on large-scale datasets.

Key words: twin support vector regression (TSVR), semi-smooth Newton method, online learning, incremental learning, reduction technique
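
The incremental step described in the abstract relies on the matrix inversion lemma. The following is a minimal sketch of its rank-one case (the Sherman-Morrison formula) in Python with NumPy. It assumes a Hessian of the simple form H = I/C + Σ g gᵀ, where each g stands in for one row of an augmented kernel matrix; this is an illustrative simplification, not the Hessian derived in the paper, and the names `sherman_morrison_update`, `C`, and `n_features` are hypothetical.

```python
import numpy as np


def sherman_morrison_update(H_inv, g):
    """Return the inverse of (H + g g^T), given H_inv = H^{-1} for symmetric H."""
    Hg = H_inv @ g                       # H^{-1} g, reused twice because H_inv is symmetric
    denom = 1.0 + g @ Hg                 # scalar 1 + g^T H^{-1} g
    return H_inv - np.outer(Hg, Hg) / denom


rng = np.random.default_rng(0)
n_features, C = 5, 10.0                  # hypothetical size and regularization value

# Start from the regularizer alone: H = I / C, so H^{-1} = C * I.
H = np.eye(n_features) / C
H_inv = C * np.eye(n_features)

# Feed samples one at a time, as in an online setting.
for _ in range(20):
    g = rng.standard_normal(n_features)  # stand-in for one new sample's kernel row
    H += np.outer(g, g)                  # explicit matrix, kept only for verification
    H_inv = sherman_morrison_update(H_inv, g)

# Compare the incrementally maintained inverse with a direct inversion.
max_error = np.max(np.abs(H_inv - np.linalg.inv(H)))
```

Under these assumptions, each new sample costs O(n²) operations for the update instead of the O(n³) needed to invert the matrix from scratch, which is the kind of saving the abstract attributes to the incremental update.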

ZHANG Shuaixin, GU Binjie, PAN Feng. Incremental Reduced Lagrangian Asymmetric ν-Twin Support Vector Regression[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(11): 2640-2650.

ZHANG Shuaixin, born in 1996 in Nantong, Jiangsu, is an M.S. candidate. His research interests include machine learning, pattern recognition, and algorithm optimization.