Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2023, Vol. 17, Issue (7): 1599-1608. DOI: 10.3778/j.issn.1673-9418.2112108

• Theory and Algorithms •

Least Squares Support Vector Machine for Minimizing VC Dimensional Expectation Upper Bound
(Chinese title: VC维期望上界最小化的最小二乘支持向量机)

LI Hao (李昊), WANG Shitong (王士同)

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Jiangsu Key Construction Laboratory of IoT Application Technology, Wuxi, Jiangsu 214122, China
  • Online: 2023-07-01 Published: 2023-07-01

Abstract: Machine learning is an important part of modern computing, and support vector machines (SVMs) have attracted wide attention in recent years and been applied across many fields because of their good performance. The capacity of an SVM can be characterized by its VC (Vapnik-Chervonenkis) dimension, a measure of model complexity; in theory, a low VC dimension leads to good generalization. However, for some classifiers built on the traditional SVM, the upper bound on the VC dimension may be infinite for certain data. Although such classifiers often perform well in practice, good generalization is not guaranteed, and results on some special datasets are poor. This paper therefore proposes an improved LSSVM (least squares support vector machine) algorithm: starting from LSSVM, it minimizes the upper bound of the VC dimension, finds the projection that is optimal in expectation, and then applies that projection within LSSVM to classify the data. Experimental results show that, on the benchmark datasets used, the proposed classifier has a lower error rate than the traditional LSSVM. In other words, with a similar number of support vectors, the proposed algorithm achieves better test accuracy than the comparison algorithms and thus generalizes better.

Key words: least squares support vector machine (LSSVM), machine learning, VC dimension, classification
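
For readers unfamiliar with the baseline, the following is a minimal sketch of a standard LSSVM classifier in its usual linear-system form. It is not the VC-bound-minimizing variant proposed in the paper, whose projection step is not specified in this abstract. The function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the Gaussian kernel, the parameters `gamma` and `sigma`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Train a standard LSSVM classifier by solving one linear system.

    The system is
        [ 0   y^T            ] [ b     ]   [ 0 ]
        [ y   Omega + I/gamma] [ alpha ] = [ 1 ]
    with Omega[i, j] = y_i * y_j * K(x_i, x_j).
    """
    n = X.shape[0]
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha


def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Sign of sum_i alpha_i * y_i * K(x, x_i) + b for each row x of X_new."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)


# Toy usage (hypothetical data): two small, well-separated clusters.
X_toy = np.array([[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]], dtype=float)
y_toy = np.array([-1, -1, -1, 1, 1, 1], dtype=float)
b, alpha = lssvm_fit(X_toy, y_toy)
pred = lssvm_predict(X_toy, y_toy, b, alpha, np.array([[0.5, 0.5], [4.5, 4.5]]))
print(pred)
```

Because the equality constraints of LSSVM generally give every training point a nonzero multiplier, the sketch keeps all training samples at prediction time; any reduction in support vectors would need an additional pruning step that is not shown here.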