Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2021, Vol. 15 ›› Issue (4): 690-701. DOI: 10.3778/j.issn.1673-9418.1912074

• Artificial Intelligence •

Extreme Learning Machine for Optimized Affine Transformation Based on Gaussian Distribution
(Chinese title: 在高斯分布下优化仿射变换的极限学习机)

ZHANG Yi (张毅), WANG Shitong (王士同)

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Key Laboratory of Media Design and Software Technology of Jiangsu Province, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2021-04-01 Published: 2021-04-02

Abstract:

In an extreme learning machine (ELM), a large share of the hidden-layer inputs are mapped into the saturation region of the activation function, and the inputs and outputs of the hidden layer are far from sharing a common distribution. Both effects substantially degrade generalization performance. To address this problem, this paper studies an ELM that optimizes the affine transformation (AT) in the activation function under a Gaussian distribution. The main idea is to introduce a new linear relationship on the hidden-layer input data and to use gradient descent to optimize the scaling and translation parameters in the objective (error) function, so that the hidden-layer outputs closely follow a Gaussian distribution. Computing the affine parameters on the basis of the Gaussian distribution keeps the hidden nodes independent of one another while also emphasizing a high degree of dependency. Experimental results show that, on real-world classification datasets and image regression datasets, the hidden-layer outputs do not follow a uniform distribution well but do tend toward a Gaussian distribution, and that the method achieves better results overall. Compared with the original ELM algorithm and the AT-ELM1 algorithm, the proposed method shows significant improvements.

Key words: extreme learning machine (ELM), affine transformation (AT), Gaussian distribution, classification
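
To make the idea in the abstract concrete, the following is a minimal NumPy sketch of an ELM whose hidden-layer inputs pass through a per-node affine transformation (a scale and a shift) before the activation. The abstract does not state the objective function or the gradient-descent update, so this sketch does not reproduce the paper's procedure. As a clearly labeled stand-in, it sets the scale and shift in closed form by standardizing each hidden node's pre-activations to zero mean and unit variance, which only illustrates how an affine transformation can pull values out of the sigmoid's saturation region. The function names (`standardizing_affine`, `saturated_fraction`), the synthetic data, and the saturation threshold are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation; saturates near 0 and 1 for large |z|."""
    return 1.0 / (1.0 + np.exp(-z))


def standardizing_affine(Z, target_std=1.0):
    """Per-node scale and shift (hypothetical helper, not the paper's method).

    Chooses scale and shift so that each column of scale * Z + shift has
    zero mean and standard deviation target_std. This is a closed-form
    stand-in for the gradient-descent fit described in the abstract.
    """
    mean = Z.mean(axis=0)
    std = Z.std(axis=0) + 1e-12  # guard against a constant column
    scale = target_std / std
    shift = -scale * mean
    return scale, shift


def saturated_fraction(H, eps=0.05):
    """Share of sigmoid outputs within eps of 0 or 1 (eps is an assumption)."""
    return float(np.mean((H < eps) | (H > 1.0 - eps)))


# Synthetic data and random hidden layer, as in a basic ELM (all assumed).
rng = np.random.default_rng(0)
X = 3.0 * rng.standard_normal((200, 10))   # 200 samples, 10 features
T = rng.standard_normal((200, 1))          # regression targets
W = rng.uniform(-1.0, 1.0, (10, 50))       # random input weights, 50 hidden nodes
b = rng.uniform(-1.0, 1.0, 50)             # random hidden biases

Z = X @ W + b                              # hidden-layer inputs (pre-activations)
scale, shift = standardizing_affine(Z)

H_raw = sigmoid(Z)                         # plain ELM hidden-layer output
H_at = sigmoid(scale * Z + shift)          # output after the affine transformation

# Output weights by least squares via the Moore-Penrose pseudoinverse.
beta = np.linalg.pinv(H_at) @ T

print("saturated fraction, plain ELM:      ", saturated_fraction(H_raw))
print("saturated fraction, affine variant: ", saturated_fraction(H_at))
```

In this sketch the affine parameters depend only on the training pre-activations, so the same `scale` and `shift` would be reused unchanged at prediction time. Note also that the stand-in shapes the distribution of the hidden-layer inputs, whereas the abstract describes fitting the parameters so that the hidden-layer outputs follow a Gaussian distribution.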