Journal of Frontiers of Computer Science and Technology, 2018, Vol. 12, Issue (9): 1522-1530. DOI: 10.3778/j.issn.1673-9418.1706078

Classifier Selection Method with Hybrid Diversity Measure

MI Aizhong, LU Yao+   

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
  • Online: 2018-09-01 Published: 2018-09-10


米爱中,陆    瑶+   

  1. 河南理工大学 计算机科学与技术学院,河南 焦作 454000

Abstract: Classifier ensemble systems are now widely used in many areas of pattern recognition. However, as the number of classifiers increases, the diversity among them tends to decrease and redundancy arises. It is therefore necessary to study how to remove redundant classifiers and reduce the size of the classifier set without degrading ensemble performance. This paper proposes a classifier selection method based on a hybrid diversity measure. The method first transforms the pairwise diversity matrix into an adjacency matrix, which is represented as one or more graphs. A graph coloring method based on a genetic algorithm is then applied, and the classifiers are grouped according to the coloring result. Finally, an evaluation system based on information entropy and a non-pairwise diversity measure is presented, and one group is selected as the final classifier set according to the weights of the groups. Experimental comparisons with a variety of ensemble methods on 8 UCI datasets demonstrate the feasibility of the proposed method.

Key words: classifier ensemble, diversity, genetic algorithm, graph coloring method, information entropy
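
The pipeline summarized in the abstract (pairwise diversity matrix, adjacency matrix, graph coloring, group scoring) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the pairwise measure is taken to be the disagreement measure, an edge joins two classifiers whose diversity falls below a hypothetical `threshold`, a simple greedy coloring stands in for the genetic-algorithm coloring, and the Kuncheva-Whitaker entropy measure stands in for the entropy-based evaluation system. All function names are illustrative.

```python
from itertools import combinations
from math import ceil

def disagreement(a, b):
    """Pairwise diversity: fraction of samples where exactly one classifier is correct."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def adjacency(oracle, threshold):
    """Assumption: connect two classifiers when their diversity is below threshold,
    so that classifiers sharing a color are mutually diverse."""
    n = len(oracle)
    adj = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if disagreement(oracle[i], oracle[j]) < threshold:
            adj[i][j] = adj[j][i] = True
    return adj

def greedy_coloring(adj):
    """Greedy stand-in for the genetic-algorithm coloring described in the paper."""
    colors = []
    for v in range(len(adj)):
        used = {colors[u] for u in range(v) if adj[v][u]}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return colors

def entropy_measure(oracle, group):
    """Non-pairwise diversity (Kuncheva-Whitaker entropy measure E, in [0, 1]).
    Assumption: used here as the group score; singleton groups score 0."""
    size = len(group)
    if size < 2:
        return 0.0
    n_samples = len(oracle[0])
    total = 0
    for k in range(n_samples):
        correct = sum(oracle[i][k] for i in group)
        total += min(correct, size - correct)
    return total / (n_samples * (size - ceil(size / 2)))

def select_group(oracle, threshold):
    """Group classifiers by color and return the group with the highest score."""
    colors = greedy_coloring(adjacency(oracle, threshold))
    groups = {}
    for idx, c in enumerate(colors):
        groups.setdefault(c, []).append(idx)
    return max(groups.values(), key=lambda g: entropy_measure(oracle, g))

# Toy oracle outputs: rows are classifiers, columns are samples (1 = correct).
oracle = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1],
]
selected = select_group(oracle, threshold=0.5)
print(selected)
```

On this toy input, classifiers 0, 1 and 3 are pairwise similar and receive different colors, while classifier 2 shares a color with classifier 0; that two-member group has the highest entropy score and is selected. The choice of threshold and scoring function strongly affects the grouping, so both should be read as placeholders rather than as the paper's settings.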

摘要: 分类器集成系统已广泛应用于模式识别的各个领域,然而随着分类器数量的增加,导致分类器间差异度的减小而产生冗余。因此需要研究在保障集成性能的同时,剔除冗余分类器,减小分类器集合的规模。提出了一种使用混合差异性度量的分类器选择方法。该方法首先将成对差异度矩阵转换为邻接矩阵,以一个图或多个图的形式表示;然后利用基于遗传算法的图着色方法,根据着色结果将分类器分组;最后给出一种基于信息熵和非成对差异性度量的评价体系,根据各组的权值选出一组作为最终分类器集合。通过与多种集成方法在UCI数据库的8组数据集上的实验对比,证明了所提方法的可行性。

关键词: 分类器集成, 差异性, 遗传算法, 图着色方法, 信息熵