Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2017, Vol. 11 ›› Issue (7): 1102-1113. DOI: 10.3778/j.issn.1673-9418.1605038

• Artificial Intelligence and Pattern Recognition •

Neighborhood-Embedded Tensor Learning

LU Mei1,2, LI Fanzhang1+   

  1. College of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  2. College of Computer Science and Technology, Jiangsu Normal University, Xuzhou, Jiangsu 221116, China
  • Online: 2017-07-01 Published: 2017-07-07

Abstract: Most traditional machine learning algorithms represent data as vectors, yet in many real-world applications the data are naturally tensors, such as images and video. Forcing such data into vector form not only causes the curse of dimensionality and the small-sample-size problem, but also destroys the intrinsic spatial structure of the data, which makes a good low-dimensional representation hard to find. Discriminant neighborhood embedding (DNE) is a popular discriminant analysis method, but it operates on vectors. Building on an improved version of DNE, this paper proposes neighborhood-embedded tensor learning (NTL), a locality-consistency-preserving discriminant learning algorithm for tensor data. NTL removes the vector-only restriction of DNE and also addresses a second limitation: DNE emphasizes the neighbors of each data point and ignores the influence of points outside its neighborhood. The objective function of NTL embeds three graphs: an adjacency graph of same-class points (intraclass graph), an adjacency graph of different-class points (interclass graph), and an association graph of the other points. As a result, points of the same class become more compact in the projected subspace and points of different classes become more separated, which strengthens the discriminative power of the algorithm. Experiments on three public databases (ORL, PIE and COIL20) show that NTL achieves higher recognition rates and is also more efficient.

Key words: discriminant neighborhood embedding (DNE), tensor subspace analysis (TSA), dimensionality reduction, discriminant analysis, tensor learning
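
The three graphs named in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: the abstract does not give the edge weights or the neighborhood rule, so the sketch assumes binary edges, a symmetric k-nearest-neighbor rule, and Frobenius distance between samples. The function name `build_three_graphs` and the parameter `k` are hypothetical.

```python
import numpy as np

def build_three_graphs(X, labels, k):
    """Split all sample pairs into three binary graphs (illustrative only).

    X      : array of shape (n, ...); each sample may be a matrix or tensor
    labels : array of shape (n,) with class labels
    k      : neighborhood size (assumed k <= n - 1)
    Returns boolean (n, n) adjacency matrices: intra, inter, other.
    """
    n = X.shape[0]
    # Flattening is used only to compute Frobenius distances between samples.
    flat = X.reshape(n, -1)
    sq = np.sum(flat ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * flat @ flat.T
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor

    # Indices of the k nearest neighbors of every sample.
    knn = np.argsort(d2, axis=1)[:, :k]
    neighbor = np.zeros((n, n), dtype=bool)
    neighbor[np.arange(n)[:, None], knn] = True
    neighbor |= neighbor.T  # assumption: i~j if either is a k-NN of the other

    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(n, dtype=bool)

    intra = neighbor & same        # neighbors from the same class
    inter = neighbor & ~same       # neighbors from a different class
    other = ~neighbor & off_diag   # remaining pairs outside the neighborhood
    return intra, inter, other
```

Under these assumptions every pair of distinct samples falls into exactly one graph, so an objective can weight the compactness of same-class neighbors, the separation of different-class neighbors, and the influence of non-neighbors independently.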