Journal of Frontiers of Computer Science and Technology, 2021, Vol. 15, Issue (7): 1339-1349. DOI: 10.3778/j.issn.1673-9418.2005011

• Theory and Algorithm •

Improved Shuffled Binary Grasshopper Optimization Feature Selection Algorithm

ZHAO Zeyuan, DAI Yongqiang   

  1. College of Information Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
  • Online: 2021-07-01  Published: 2021-07-09

Abstract:

Feature selection picks an optimal or near-optimal feature subset from the original features of a data set, which speeds up classification and improves classification accuracy. This paper proposes an improved shuffled binary grasshopper optimization feature selection algorithm. A binary transformation strategy that uses the step size to guide changes in individual positions reduces the blindness of the binary conversion and improves the algorithm's search performance in the solution space. Shuffled complex evolution divides the grasshopper population into subgroups that evolve independently, which improves the algorithm's diversity and lowers the probability of premature convergence. The improved algorithm is applied to feature selection on several UCI data sets, and a K-nearest neighbor (K-NN) classifier is used to evaluate the selected feature subsets. Experimental results show that, compared with the basic binary grasshopper optimization algorithm, the binary particle swarm optimization algorithm, and the binary gray wolf optimization algorithm, the improved algorithm offers better search and convergence performance and stronger robustness, and it obtains better feature subsets and better classification results.

Key words: binary, grasshopper optimization algorithm, shuffled complex evolution, feature selection, classification, K-nearest neighbor (K-NN) algorithm
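
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the step-guided conversion rule or the fitness function, so `step_to_bits` assumes a generic V-shaped transfer function, `knn_error` assumes a leave-one-out error rate as the subset score, and all function names are hypothetical. `partition_into_complexes` follows the usual shuffled complex evolution scheme, in which individuals are ranked by fitness and dealt round-robin into subgroups.

```python
import math
import random


def step_to_bits(bits, steps, rng):
    """Flip each bit with a probability that grows with the step magnitude.

    Assumption: the V-shaped transfer |tanh(step)| is only a stand-in;
    the paper's exact step-guided rule is not stated in the abstract.
    """
    new_bits = []
    for bit, step in zip(bits, steps):
        flip_prob = abs(math.tanh(step))  # 0 for no movement, close to 1 for large steps
        new_bits.append(1 - bit if rng.random() < flip_prob else bit)
    return new_bits


def partition_into_complexes(population, fitness, n_complexes):
    """Rank individuals by fitness (lower is better) and deal them round-robin."""
    ranked = sorted(population, key=fitness)
    return [ranked[k::n_complexes] for k in range(n_complexes)]


def knn_error(mask, samples, labels, k=3):
    """Leave-one-out K-NN error rate using only the features where mask is 1.

    Assumption: plain error rate as the score; the paper's fitness may differ.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty feature subset cannot classify anything
    errors = 0
    for i, query in enumerate(samples):
        # Squared Euclidean distance to every other sample, paired with its label.
        neighbours = sorted(
            (sum((query[j] - other[j]) ** 2 for j in selected), labels[n])
            for n, other in enumerate(samples)
            if n != i
        )[:k]
        votes = [label for _, label in neighbours]
        predicted = max(set(votes), key=votes.count)  # majority vote
        if predicted != labels[i]:
            errors += 1
    return errors / len(samples)
```

In a wrapper search of this kind, each grasshopper would carry one bit mask, `knn_error` would score it, and each subgroup returned by `partition_into_complexes` would evolve on its own before the subgroups are merged and reshuffled. A mask that keeps an informative feature should score a lower error than one that keeps only noise, which is the signal the search exploits.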