Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2022, Vol. 16 ›› Issue (8): 1809-1818. DOI: 10.3778/j.issn.1673-9418.2103101

• Artificial Intelligence •

Chunk Incremental Canonical Correlation Analysis (块增量典型相关分析)

PAN Yu (潘玉)1, CHEN Xiaohong (陈晓红)+, LI Shunming (李舜酩)2, LI Jiyong (李纪永)3

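  1. College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 211106
    2. College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106
    3. Sichuan Aerospace Zhongtian Power Equipment Co., Ltd., Chengdu 610100
  • Received: 2021-03-29 Revised: 2021-06-15 Online: 2022-08-01 Published: 2021-06-09
  • Corresponding author: +E-mail: lyandcxh@nuaa.edu.cn
  • About the authors: PAN Yu, born in 1997 in Fuyang, Anhui, female, M.S. candidate. Her research interests include pattern recognition and artificial intelligence.
    CHEN Xiaohong, born in 1977 in Linyi, Shandong, female, associate professor, M.S. supervisor. Her research interests include pattern recognition and artificial intelligence.
    LI Shunming, born in 1962 in Qingzhou, Shandong, male, professor, Ph.D. supervisor. His research interests include intelligent machine learning and artificial neural networks, intelligent fault diagnosis, etc.
    LI Jiyong, born in 1985 in Linyi, Shandong, male, Ph.D., senior engineer. His research interests include vibration signal processing and vibration suppression.
  • Supported by:
    the National Natural Science Foundation of China (11971231); the National Natural Science Foundation of China (12111530001); the National Key Research and Development Program of China (2018YFB2003300)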

Chunk Incremental Canonical Correlation Analysis

PAN Yu1, CHEN Xiaohong+, LI Shunming2, LI Jiyong3

  1. College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    2. College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    3. Sichuan Aerospace Zhongtian Power Equipment Co., Ltd., Chengdu 610100, China
  • Received: 2021-03-29 Revised: 2021-06-15 Online: 2022-08-01 Published: 2021-06-09
  • About author: PAN Yu, born in 1997, M.S. candidate. Her research interests include pattern recognition and artificial intelligence.
    CHEN Xiaohong, born in 1977, associate professor, M.S. supervisor. Her research interests include pattern recognition and artificial intelligence.
    LI Shunming, born in 1962, professor, Ph.D. supervisor. His research interests include intelligent machine learning and artificial neural networks, intelligent fault diagnosis, etc.
    LI Jiyong, born in 1985, Ph.D., senior engineer. His research interests include vibration signal processing and vibration suppression.
  • Supported by:
    the National Natural Science Foundation of China (11971231); the National Natural Science Foundation of China (12111530001); the National Key Research and Development Program of China (2018YFB2003300)

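Abstract:

Incremental learning is an important technique for handling large-scale dynamic data streams and is widely used in machine learning. Many researchers have combined it with dimensionality reduction to obtain incremental dimensionality reduction algorithms. Among them, incremental canonical correlation analysis (ICCA), an incremental version of canonical correlation analysis (CCA), can effectively reduce the dimensionality of high-dimensional multi-view data streams. However, ICCA updates the projection vectors using only a single pair of samples at a time, so every newly arriving pair triggers another update, which makes the algorithm time-consuming. To improve efficiency, this paper proposes the chunk incremental canonical correlation analysis (CICCA) algorithm. The algorithm does not need to compute sample covariance matrices. It processes the data stream directly in batches, using the information in each newly arrived batch to correct and update the projection vector from the previous step and thereby obtain the principal projection vector. The other projection vectors are then computed in the orthogonal complement space of the projection vector, and the original high-dimensional multi-view data are projected into a low-dimensional space. Experimental results on synthetic and real datasets show that the low-dimensional features extracted by the algorithm achieve classification performance comparable to CCA and ICCA, while the training time is greatly reduced.

Keywords: canonical correlation analysis (CCA), dimensionality reduction, incremental learning, multi-view classification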

Abstract:

Incremental learning is an effective and efficient technique for large-scale dynamic data streams and is widely used in machine learning. Many researchers have proposed incremental dimensionality reduction algorithms. Among them, incremental canonical correlation analysis (ICCA), an incremental variant of canonical correlation analysis (CCA), can effectively reduce the dimensionality of high-dimensional multi-view data streams. Its drawback is that the projection vector must be updated once for each new pair of samples, which makes online learning time-consuming. To address this problem, this paper proposes chunk incremental canonical correlation analysis (CICCA). CICCA avoids computing sample covariance matrices and processes the data stream directly in batches: each newly added batch of samples is used to revise the projection vector from the previous step, which yields the main projection vector. The other projection vectors are then calculated in the orthogonal complement space of the projection vector, so that the original high-dimensional data can be projected into a low-dimensional space. Experimental results on synthetic and real datasets show that the classification performance of CICCA is comparable to that of CCA and ICCA, while the training time is greatly reduced.

Key words: canonical correlation analysis (CCA), dimensionality reduction, incremental learning, multi-view classification
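
For orientation, the batch CCA baseline that the abstract compares against can be sketched as follows. This is a minimal NumPy illustration of classical CCA (whitening followed by an SVD), not the CICCA update itself, whose update rule the abstract does not give. The function name `batch_cca`, the regularization term `reg`, and the synthetic data in the usage lines are assumptions made for this example.

```python
import numpy as np

def batch_cca(X, Y, k=1, reg=1e-8):
    """Classical (batch) CCA; hypothetical helper, not the paper's CICCA.

    X: (n, p) samples of view 1, Y: (n, q) samples of view 2.
    Returns projection matrices Wx (p, k), Wy (q, k) and the top-k
    canonical correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Sample covariance matrices: exactly what CICCA is said to avoid computing.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)        # Lx^{-1} Cxy
    T = np.linalg.solve(Ly, T.T).T      # Lx^{-1} Cxy Ly^{-T}
    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])     # map back: Lx^{-T} U
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])  # map back: Ly^{-T} V
    return Wx, Wy, s[:k]

# Usage on assumed synthetic two-view data sharing one latent factor.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(500, 5))
Y = z @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(500, 4))
Wx, Wy, corrs = batch_cca(X, Y, k=2)
```

Because this baseline forms and factorizes the full covariance matrices, it has to revisit all samples whenever new data arrive, which is the cost that incremental variants such as ICCA and CICCA aim to avoid.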

CLC number: