Journal of Frontiers of Computer Science and Technology, 2017, Vol. 11, Issue (7): 1140-1149. DOI: 10.3778/j.issn.1673-9418.1605051

Robust Canonical Correlation Analysis Based on Generalized Mean

GU Gaosheng1, GE Hongwei1,2+, ZHOU Mengxuan1   

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2017-07-01 Published: 2017-07-07

Abstract: Canonical correlation analysis (CCA) is a multivariate statistical method that seeks the maximal linear correlation between two sets of variables describing the same objects. Its criterion function, which is based on the minimum mean square error (MSE) under the L2 norm, is not robust to outliers. The generalized mean has been shown to be robust in theory and has been validated in applications such as clustering and object recognition. This paper applies the generalized mean to CCA and proposes a robust CCA based on the generalized mean (GMCCA), which overcomes the sensitivity of CCA to outliers. GMCCA suppresses the influence of outliers on the criterion function to obtain robust results, and it also avoids the singular covariance matrix that arises with high-dimensional, small-sample data. Experiments on the multiple feature handwriting database (MFD), the ORL face database, and the COIL-20 object image database demonstrate the effectiveness of GMCCA.

Key words: generalized mean, mean square error, canonical correlation analysis, robustness, robust canonical correlation analysis
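
The robustness claim in the abstract can be illustrated with a small numerical sketch. The abstract does not state the GMCCA objective, so the code below is not the authors' algorithm; it only compares the ordinary mean of squared errors with a generalized (power) mean of the same errors. The function name `generalized_mean`, the sample errors, and the exponent `p = -1` are all assumptions chosen for illustration.

```python
import numpy as np

def generalized_mean(values, p):
    """Generalized (power) mean of strictly positive values, for p != 0."""
    values = np.asarray(values, dtype=float)
    return np.mean(values ** p) ** (1.0 / p)

# Hypothetical squared errors: nine ordinary samples and one outlier.
squared_errors = np.array([1.0] * 9 + [100.0])

# p = 1 gives the arithmetic mean, i.e. the usual mean square error.
arithmetic = generalized_mean(squared_errors, p=1.0)

# p = -1 gives the harmonic mean (an assumed choice of exponent),
# in which large terms contribute little to the result.
robust = generalized_mean(squared_errors, p=-1.0)

print(f"p =  1: {arithmetic:.3f}")
print(f"p = -1: {robust:.3f}")
```

With these made-up numbers the single outlier pulls the arithmetic mean far above the typical error of 1, while the generalized mean with a negative exponent stays close to it. This is the kind of outlier suppression the abstract attributes to the generalized mean.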
