Journal of Frontiers of Computer Science and Technology, 2022, Vol. 16, Issue (8): 1819-1928. DOI: 10.3778/j.issn.1673-9418.2101001

• Artificial Intelligence •

Ensemble Method of Diverse Regularized Extreme Learning Machines

CHEN Yang1,+, WANG Shitong2

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
    2. Key Laboratory of Media Design and Software Technology of Jiangsu Province, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2021-01-04 Revised: 2021-03-02 Online: 2022-08-01 Published: 2021-03-25
  • Corresponding author: + E-mail: 6191611002@stu.jiangnan.edu.cn
  • About author: CHEN Yang, born in 1995 in Yangzhou, Jiangsu, M.S. candidate. Her research interests include machine learning and pattern recognition.
    WANG Shitong, born in 1964 in Yangzhou, Jiangsu, professor, Ph.D. supervisor and CCF member. His research interests include artificial intelligence, pattern recognition, etc.
  • Supported by:
    the Natural Science Foundation of Jiangsu Province (BK20191331)

Abstract:

As a fast training algorithm for single-hidden-layer feedforward networks, the extreme learning machine (ELM) randomly initializes the input layer weights and hidden layer biases and determines the output layer weights analytically. It overcomes many shortcomings of gradient-based learning algorithms, such as local minima, inappropriate learning rates and slow convergence, but it still inevitably suffers from overfitting and poor stability, especially on large-scale datasets. This paper proposes an ensemble method of diverse regularized extreme learning machines (DRELM) to address these problems. First, the input weights of each ELM are drawn from a different random distribution to ensure diversity among the base learners; leave-one-out (LOO) cross validation with the MSE_PRESS criterion is then used to find the optimal number of hidden nodes for each base learner and to compute its optimal hidden layer output weights, yielding accurate yet mutually different base learners. Next, a new penalty term promoting diversity is explicitly added to the overall objective function, and the hidden layer output weights of each base learner are updated iteratively. Finally, the output of the whole network model is obtained by averaging the outputs of all base learners. This method effectively realizes an ensemble of regularized extreme learning machines (RELM) that balances accuracy and diversity. Experimental results on 10 UCI datasets of different sizes demonstrate the effectiveness of DRELM.
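To make the pipeline described above concrete, the following is a minimal NumPy sketch of the three stages: closed-form RELM training with hidden-node selection via the PRESS-based LOO error, an iterative update with a diversity penalty, and an averaging ensemble. The abstract does not specify the exact form of the diversity penalty, so a negative-correlation-style term (penalizing agreement with the ensemble mean) is used here as a stand-in; the sigmoid activation, uniform weight distribution, and hyper-parameters `C`, `lam`, `n_learners` and `n_iters` are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of the DRELM pipeline from the abstract; Y is a 2-D
# target matrix (e.g., one-hot labels). All names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def hidden_output(X, W, b):
    """Random-feature hidden layer with a sigmoid activation (assumed)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def relm_beta(H, Y, C):
    """Closed-form regularized ELM output weights: (H'H + I/C)^-1 H'Y."""
    L = H.shape[1]
    return np.linalg.solve(H.T @ H + np.eye(L) / C, H.T @ Y)

def mse_press(H, Y, C):
    """LOO mean squared error via the PRESS statistic (no explicit LOO loop)."""
    L = H.shape[1]
    A = np.linalg.solve(H.T @ H + np.eye(L) / C, H.T)   # (L, n)
    hat_diag = np.einsum('ij,ji->i', H, A)              # diagonal of H A
    residual = Y - H @ (A @ Y)
    loo_residual = residual / (1.0 - hat_diag)[:, None]
    return np.mean(loo_residual ** 2)

def select_nodes(X, Y, C, candidates=(10, 20, 40, 80)):
    """Pick the hidden-node count minimizing MSE_PRESS for one base learner."""
    best, d = None, X.shape[1]
    for L in candidates:
        W = rng.uniform(-1, 1, (d, L))   # each learner draws its own weights
        b = rng.uniform(-1, 1, (1, L))
        score = mse_press(hidden_output(X, W, b), Y, C)
        if best is None or score < best[0]:
            best = (score, W, b)
    return best[1], best[2]

def train_drelm(X, Y, n_learners=5, C=100.0, lam=0.2, n_iters=10):
    """Train diverse RELMs, then refine each against the ensemble mean."""
    learners = []
    for _ in range(n_learners):
        W, b = select_nodes(X, Y, C)
        H = hidden_output(X, W, b)
        learners.append((W, b, H, relm_beta(H, Y, C)))
    for _ in range(n_iters):
        F_bar = np.mean([H @ beta for (_, _, H, beta) in learners], axis=0)
        for i, (W, b, H, _) in enumerate(learners):
            L = H.shape[1]
            # Stand-in objective: ||Hb - Y||^2 + ||b||^2/C - lam*||Hb - F_bar||^2;
            # setting its gradient to zero gives the linear system below.
            lhs = (1.0 - lam) * (H.T @ H) + np.eye(L) / C
            beta = np.linalg.solve(lhs, H.T @ (Y - lam * F_bar))
            learners[i] = (W, b, H, beta)
    return learners

def predict(learners, X):
    """Average the base learners' outputs, as in the final ensemble step."""
    outs = [hidden_output(X, W, b) @ beta for (W, b, _, beta) in learners]
    return np.mean(outs, axis=0)
```

For classification, predictions would be taken as the argmax of the averaged output; keeping `lam` well below 1 keeps each learner's linear system positive definite and solvable.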

Key words: extreme learning machine (ELM), ensemble learning, diversity, regularized extreme learning machines (RELM)
