Journal of Frontiers of Computer Science and Technology ›› 2019, Vol. 13 ›› Issue (4): 647-656. DOI: 10.3778/j.issn.1673-9418.1805036

• Artificial Intelligence and Pattern Recognition •

Self-Adaptive SVM Incremental Learning Algorithm

HE Li, HAN Keping+, LIU Ying

  1. College of Science and Technology, Tianjin University of Finance & Economics, Tianjin 300222, China
  • Online: 2019-04-01 Published: 2019-04-10

Abstract: The support vector machine (SVM) algorithm is widely used for classification because of its robustness and its strong performance on small training sets. However, the traditional SVM algorithm cannot meet the demands of incremental and large-scale data, and incremental learning is one effective way to address these problems. Based on a structured description of the data distribution, this paper proposes a self-adaptive SVM incremental learning algorithm. Using the geometric distances from the original samples and the newly added samples to the current classification hyperplane, the algorithm builds a self-adaptive incremental sample selection model that effectively selects the boundary samples that take part in incremental training. To balance the speed and performance of incremental learning, the model sets separate adjustment coefficients, based on spatial distribution similarity, for the new samples and for the samples of the original model. Experimental results show that the proposed algorithm achieves higher speed together with better classification performance.

Key words: support vector machine (SVM), incremental learning, data distribution, hyperplane distance
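
The distance-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a linear hyperplane w·x + b = 0, and it uses fixed, hypothetical coefficients (`coeff_original`, `coeff_new`) to scale the margin 1/||w|| into a distance threshold. The abstract says the paper derives its adjustment coefficients from spatial distribution similarity, but it does not say how, so that part is not reproduced here. All function names are illustrative.

```python
import math


def geometric_distance(w, b, x):
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def select_boundary_samples(w, b, samples, coeff):
    """Keep (x, y) pairs whose distance to the hyperplane is at most
    coeff times the margin 1/||w||. The threshold rule is an assumption."""
    margin = 1.0 / math.sqrt(sum(wi * wi for wi in w))
    threshold = coeff * margin
    return [(x, y) for x, y in samples
            if geometric_distance(w, b, x) <= threshold]


def build_incremental_set(w, b, original, new,
                          coeff_original=1.0, coeff_new=2.0):
    """Combine boundary samples from the original and the new data.
    Separate coefficients let the two sets be filtered differently
    (hypothetical fixed values, not the paper's adaptive ones)."""
    kept_original = select_boundary_samples(w, b, original, coeff_original)
    kept_new = select_boundary_samples(w, b, new, coeff_new)
    return kept_original + kept_new


# Toy usage: hyperplane x1 = 0, so the margin is 1.
w, b = (1.0, 0.0), 0.0
original = [((0.5, 0.0), 1), ((3.0, 0.0), 1), ((-1.0, 2.0), -1)]
new = [((1.5, 1.0), 1), ((-4.0, 0.0), -1)]
training_set = build_incremental_set(w, b, original, new)
```

The reduced `training_set` would then be passed to any SVM trainer to update the model. A larger coefficient keeps more samples, which costs training time but is less likely to discard a sample that becomes a support vector after the update.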