Journal of Frontiers of Computer Science and Technology, 2010, Vol. 4, Issue (5): 455-463. DOI: 10.3778/j.issn.1673-9418.2010.05.008

• Academic Research •

Constructing Ensembles of Rule-based Classifiers Using PCA*

SHI Guoqiang, NIU Changyong, FAN Ming+

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450052, China
  • Online: 2010-05-11 Published: 2010-05-11
  • Contact: FAN Ming

Abstract: A new method, PCARules, is presented for constructing ensembles of rule-based classifiers. Although the class label of a sample is also determined by a weighted vote among the predictions of the base classifiers, the method differs entirely from bagging and boosting in how it creates the training data for each base classifier. Instead of creating the training data by sampling, the method randomly splits the feature set into K subsets and applies principal component analysis (PCA) to each subset to obtain its principal components. All principal components are then combined to form a new feature space, into which all of the original training data are mapped to create the training set for a base classifier. Experiments on 30 benchmark datasets selected randomly from the UCI machine learning repository show that the method not only significantly improves the classification performance of rule-based classifiers, but also achieves higher classification accuracy than traditional ensemble methods such as bagging and boosting on most of the datasets.

Key words: classifier ensemble, feature extraction, principal component analysis (PCA)
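
The construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names neither the rule learner nor the voting weights, so the sketch substitutes a one-rule stand-in (`OneRule`) for the rule-based base classifier, weights each vote by training accuracy, and keeps every principal component of each subset. All function and class names here are hypothetical.

```python
import numpy as np


def majority(labels):
    """Most frequent label in a non-empty array."""
    values, counts = np.unique(labels, return_counts=True)
    return values[counts.argmax()]


def build_projection(X, k, rng):
    """Randomly split the features into k subsets and run PCA on each subset.

    Returns the feature means and a matrix R whose columns hold the principal
    axes of every subset, placed on the rows of that subset's features.
    Assumes k <= number of features.
    """
    n_features = X.shape[1]
    mean = X.mean(axis=0)
    blocks = []
    for idx in np.array_split(rng.permutation(n_features), k):
        # Rows of vt are the principal axes of the centered sub-matrix.
        _, _, vt = np.linalg.svd(X[:, idx] - mean[idx], full_matrices=False)
        block = np.zeros((n_features, vt.shape[0]))
        block[idx, :] = vt.T
        blocks.append(block)
    return mean, np.hstack(blocks)


class OneRule:
    """Stand-in base learner (assumption): one rule 'if z[j] <= t then a else b'."""

    def fit(self, Z, y):
        self.accuracy = -1.0
        for j in range(Z.shape[1]):
            t = np.median(Z[:, j])
            left = Z[:, j] <= t
            a = majority(y[left])
            b = majority(y[~left]) if (~left).any() else a
            acc = np.mean(np.where(left, a, b) == y)
            if acc > self.accuracy:
                self.accuracy, self.rule = acc, (j, t, a, b)
        return self

    def predict(self, Z):
        j, t, a, b = self.rule
        return np.where(Z[:, j] <= t, a, b)


def fit_ensemble(X, y, n_members=10, k=2, seed=0):
    """Train each member on all samples, mapped into its own PCA feature space."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        mean, R = build_projection(X, k, rng)
        rule = OneRule().fit((X - mean) @ R, y)
        members.append((mean, R, rule))
    return np.unique(y), members


def predict_ensemble(model, X):
    """Weighted vote; the weight is training accuracy (an assumption)."""
    classes, members = model
    votes = np.zeros((X.shape[0], len(classes)))
    for mean, R, rule in members:
        pred = rule.predict((X - mean) @ R)
        votes += rule.accuracy * (pred[:, None] == classes[None, :])
    return classes[votes.argmax(axis=1)]
```

Because every member sees all training samples, diversity in this sketch comes only from the random feature split, which changes the rotation applied within each subset. Calling `fit_ensemble(X, y)` on a numeric feature matrix and label vector returns a model that `predict_ensemble(model, X_new)` can apply to new samples.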

CLC Number: