Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2014, Vol. 8 ›› Issue (9): 1085-1091. DOI: 10.3778/j.issn.1673-9418.1405034

• Artificial Intelligence and Pattern Recognition •

Least Squares Ordinal Regression Using Doubly Corrupted Features

YU Haiben (余海犇), CHEN Songcan (陈松灿)+

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Online: 2014-09-01 Published: 2014-09-03

Abstract: Ordinal regression is a machine learning paradigm that classifies patterns by exploiting the natural order among class labels. Although many ordinal regression methods have been proposed, their performance is often limited by the small number of training samples available. Inspired by the recently proposed idea of marginalized corrupted features, this paper compensates for limited data by corrupting both sides of each training sample: the inputs with random noise drawn from known distributions, and the outputs (labels) with controllable perturbations of deterministic bias. Built on least squares ordinal regression, the resulting method is least squares ordinal regression using doubly corrupted features (LSOR-DCF). Experimental results show that LSOR-DCF outperforms both the uncorrupted baseline and the variants that corrupt only the inputs or only the outputs, with the advantage most pronounced on small datasets.

Key words: ordinal regression, least squares regression, marginalized corrupted features, double corruption
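
The input-side idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's LSOR-DCF implementation; it only shows what marginalizing a corruption means for a squared loss, under the assumption of additive Gaussian noise. If each input x is corrupted as x + eps with eps ~ N(0, sigma2 * I), the expected squared loss is (w·x - y)^2 + sigma2 * ||w||^2, so no noisy copies need to be sampled and the minimizer has a closed form. The function names (`solve`, `fit_marginalized_ls`, `predict_rank`), the use of integer ranks as regression targets, the unpenalized intercept, and the rounding rule are all illustrative assumptions. The deterministic label perturbation is not shown because the abstract does not specify its form.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_marginalized_ls(X, y, sigma2):
    """Least squares with Gaussian input corruption marginalized out.

    Assumption: each feature vector is corrupted by additive N(0, sigma2 * I)
    noise. Summed over n samples, the expected loss is
        sum_i (w . x_i + b - y_i)^2 + n * sigma2 * ||w||^2,
    so the minimizer solves a ridge-like normal equation.
    Returns the weights followed by the intercept b (last entry).
    """
    n = len(X)
    Xb = [list(row) + [1.0] for row in X]  # append a constant intercept column
    d = len(Xb[0])
    A = [[sum(Xb[i][j] * Xb[i][k] for i in range(n)) for k in range(d)]
         for j in range(d)]
    for j in range(d - 1):  # the intercept column is not corrupted
        A[j][j] += n * sigma2
    rhs = [sum(Xb[i][j] * y[i] for i in range(n)) for j in range(d)]
    return solve(A, rhs)


def predict_rank(w, x, num_ranks):
    """Round the real-valued score to the nearest rank in 1..num_ranks (illustrative rule)."""
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return min(max(int(round(score)), 1), num_ranks)


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [1, 2, 3, 4]  # ordinal labels used directly as regression targets
    w_plain = fit_marginalized_ls(X, y, sigma2=0.0)
    w_noisy = fit_marginalized_ls(X, y, sigma2=0.5)
    print("no corruption:", w_plain)
    print("marginalized corruption:", w_noisy)
    print("predicted rank for x = 3:", predict_rank(w_noisy, [3.0], num_ranks=4))
```

With `sigma2=0.0` the fit reduces to ordinary least squares. A larger `sigma2` shrinks the feature weights in the same way a ridge penalty does, which is one way to see why marginalized corruption can help when training samples are scarce.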