Journal of Frontiers of Computer Science and Technology ›› 2023, Vol. 17 ›› Issue (11): 2640-2650. DOI: 10.3778/j.issn.1673-9418.2207050

• Theory·Algorithm •

Incremental Reduced Lagrangian Asymmetric ν-Twin Support Vector Regression

ZHANG Shuaixin, GU Binjie, PAN Feng   

  1. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2023-11-01 Published: 2023-11-01

Abstract: Lagrangian asymmetric ν-twin support vector regression is a prediction algorithm with good generalization performance, but it is unsuitable for scenarios in which samples arrive incrementally. An incremental reduced Lagrangian asymmetric ν-twin support vector regression (IRLAsy-ν-TSVR) algorithm is therefore proposed. First, plus functions are introduced to transform the constrained optimization problems into unconstrained ones, which are solved directly in the primal space by the semi-smooth Newton method to accelerate convergence. Second, the matrix inversion lemma is used to update the inverse of the Hessian matrix in the semi-smooth Newton method incrementally and efficiently, which reduces the time cost. Third, to reduce the memory consumed as samples accumulate, a reduced technique selects column and row vectors of the augmented kernel matrix to approximate the original augmented kernel matrix, which ensures the sparsity of the solution. Finally, the feasibility and efficacy of the proposed algorithm are validated on benchmark datasets. The results show that, compared with several representative algorithms, IRLAsy-ν-TSVR inherits the generalization performance of the offline algorithm and obtains sparse solutions, making it better suited to online learning on large-scale datasets.

Key words: twin support vector regression (TSVR), semi-smooth Newton method, online learning, incremental learning, reduced technique
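
The abstract does not give the update formulas, so the following is only a minimal sketch of the general idea behind using the matrix inversion lemma for incremental updates, not the paper's actual procedure. It assumes the Hessian changes by a low-rank term when a new sample arrives; the function name `woodbury_update` and the variables `H`, `g_new` are hypothetical and chosen for illustration.

```python
import numpy as np

def woodbury_update(A_inv, U, V):
    """Return inv(A + U @ V.T) given A_inv = inv(A), via the matrix inversion lemma.

    Only a k-by-k system is solved (k = number of columns of U),
    instead of inverting the full n-by-n matrix again.
    """
    k = U.shape[1]
    AU = A_inv @ U                      # n-by-k
    cap = np.eye(k) + V.T @ AU          # small k-by-k matrix I + V^T A^{-1} U
    # inv(A + U V^T) = A^{-1} - A^{-1} U (I + V^T A^{-1} U)^{-1} V^T A^{-1}
    return A_inv - AU @ np.linalg.solve(cap, V.T @ A_inv)

# Illustrative data (assumption): a symmetric positive definite stand-in for a Hessian.
rng = np.random.default_rng(0)
n = 5
G = rng.standard_normal((20, n))
H = G.T @ G + np.eye(n)
H_inv = np.linalg.inv(H)

# A newly arrived sample contributes a rank-one term g_new g_new^T (assumption).
g_new = rng.standard_normal((n, 1))
H_inv_new = woodbury_update(H_inv, g_new, g_new)

# Compare against recomputing the inverse from scratch.
max_err = np.max(np.abs(H_inv_new - np.linalg.inv(H + g_new @ g_new.T)))
```

In this sketch the rank-one update costs O(n²) per new sample, whereas recomputing the inverse costs O(n³), which is the kind of saving the abstract attributes to the incremental update.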