Journal of Frontiers of Computer Science and Technology, 2020, Vol. 14, Issue (3): 426-436. DOI: 10.3778/j.issn.1673-9418.1905020

• Artificial Intelligence •

Surface Water Quality Classification via CMAES Ensemble Method

CHEN Xingguo, XU Xiuying, CHEN Kangyang, YANG Guang

  1. Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  • Online: 2020-03-01 Published: 2020-03-13

Abstract:

To improve people's quality of life, government departments continue to strengthen water quality management. However, manual classification cannot meet the demands of real-time processing, and the classification accuracy of traditional machine learning methods is not high enough. Ensemble learning combines multiple learning algorithms to obtain better predictive performance than a single learning algorithm. This paper first gives an overview of ensemble learning, briefly introduces the Bagging and Boosting algorithms, and proposes an ensemble learning method based on the covariance matrix adaptation evolution strategy (CMAES). It then describes the data processing, the model evaluation method, and the evaluation metrics. Finally, the CMAES ensemble method is used to combine ten models: logistic regression, linear discriminant analysis, support vector machine, decision tree, completely-random tree, naive Bayes, K-nearest neighbors, random forest, completely-random tree forest, and deep cascade forest. Experimental results show that the CMAES ensemble method outperforms all the other models, and the method will continue to be applied in future research.

Key words: water quality classification, Boosting, covariance matrix adaptation evolution strategy (CMAES), ensemble learning, parameter optimization
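The abstract does not state how the ten base models are combined or what objective CMAES optimizes. One plausible reading is that an evolution strategy searches for the weights of a weighted soft vote over the base models' class probabilities. The sketch below illustrates that idea under several loudly stated assumptions: it uses a simplified evolution strategy with a fixed isotropic step size (no covariance matrix adaptation, so it is not CMAES itself), it minimizes log-loss on a tiny made-up data set, and every name in it (`softmax`, `blend`, `log_loss`, `evolve_weights`) is hypothetical rather than taken from the paper.

```python
import math
import random

def softmax(x):
    """Map unconstrained search variables to non-negative weights summing to 1."""
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def blend(weights, probs):
    """Weighted soft vote; probs[m][i][c] is model m's probability of class c for sample i."""
    n_samples, n_classes = len(probs[0]), len(probs[0][0])
    return [[sum(w * probs[m][i][c] for m, w in enumerate(weights))
             for c in range(n_classes)] for i in range(n_samples)]

def log_loss(weights, probs, labels):
    """Mean negative log-probability of the true class (assumed objective, not the paper's)."""
    mixed = blend(weights, probs)
    return -sum(math.log(max(p[y], 1e-12)) for p, y in zip(mixed, labels)) / len(labels)

def evolve_weights(probs, labels, generations=60, pop=20, elite=5, sigma=0.5, seed=0):
    """Simplified evolution strategy: sample around a mean, keep the best, recentre.

    Unlike CMAES, the sampling distribution stays isotropic with a fixed step size.
    """
    rng = random.Random(seed)
    n_models = len(probs)
    mean = [0.0] * n_models  # softmax of all zeros gives equal weights
    best_x, best_loss = mean[:], log_loss(softmax(mean), probs, labels)
    for _ in range(generations):
        cands = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean] for _ in range(pop)]
        cands.sort(key=lambda x: log_loss(softmax(x), probs, labels))
        top = cands[:elite]
        mean = [sum(x[j] for x in top) / elite for j in range(n_models)]
        top_loss = log_loss(softmax(cands[0]), probs, labels)
        if top_loss < best_loss:
            best_x, best_loss = cands[0][:], top_loss
    return softmax(best_x), best_loss

# Made-up validation probabilities: one reliable model and two copies of a weaker one.
labels = [0, 1, 0, 1, 0, 1]
good = [[0.7, 0.3], [0.3, 0.7], [0.7, 0.3], [0.3, 0.7], [0.7, 0.3], [0.3, 0.7]]
weak = [[0.1, 0.9], [0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.6, 0.4], [0.4, 0.6]]
probs = [good, weak, weak]

equal = [1 / 3] * 3
weights, loss = evolve_weights(probs, labels)
print("equal-weight log-loss:", log_loss(equal, probs, labels))
print("evolved weights:", weights, "log-loss:", loss)
```

Because the search keeps the best candidate seen so far and starts from equal weights, the returned loss can never be worse than plain averaging on the data it was tuned on. A real study would tune the weights on held-out validation data and would more likely rely on an established CMAES implementation than on this hand-rolled loop.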