Journal of Frontiers of Computer Science and Technology, 2015, Vol. 9, Issue (3): 368-375. DOI: 10.3778/j.issn.1673-9418.1406046

Feature Selection Algorithm Based on Weighted Positive Region

WANG Chenxi1, LIN Yaojin2+, LIU Jinghua2, LIN Menglei2   

  1. Department of Computer Engineering, Zhangzhou Institute of Technology, Zhangzhou, Fujian 363000, China
    2. School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, China
  • Online: 2015-03-01  Published: 2015-03-09

基于加权正域的特征选择算法

王晨曦1,林耀进2+,刘景华2,林梦雷2   

  1. 漳州职业技术学院 计算机工程系,福建 漳州 363000
    2. 闽南师范大学 计算机学院,福建 漳州 363000

Abstract: Feature selection algorithms based on neighborhood rough sets cannot evaluate the mutual relationship between features and samples. To address this, this paper proposes a feature selection algorithm based on the weighted positive region, which incorporates a large-margin criterion by which samples evaluate features. The algorithm thereby combines the ability of features to discriminate samples with the degree to which samples contribute to features. Experimental results on UCI datasets and five high-dimensional, small-sample datasets show that, compared with traditional single-criterion feature selection methods, the proposed algorithm achieves better classification performance and is particularly well suited to high-dimensional, small-sample data.

Key words: feature selection, positive region, large margin, neighborhood rough sets
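
The abstract does not give the paper's formulas, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes (a) a sample belongs to the neighborhood positive region when every sample within distance `delta` of it, measured on the candidate features, shares its class label; (b) each sample receives a weight from a Relief-style hypothesis margin (distance to the nearest sample of another class minus distance to the nearest sample of the same class); and (c) a feature subset is scored by the total weight of the samples in its positive region. The function names, the `delta` value, the weight normalization, and the greedy forward search are all hypothetical choices made for this example.

```python
import numpy as np


def pairwise_dist(X, feats):
    """Euclidean distances between all samples, restricted to the given features."""
    Z = X[:, feats]
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def positive_region(X, y, feats, delta):
    """Boolean mask: True where every delta-neighbor of a sample shares its label."""
    neighbors = pairwise_dist(X, feats) <= delta  # each sample is its own neighbor
    same_label = y[:, None] == y[None, :]
    return np.all(~neighbors | same_label, axis=1)


def sample_weights(X, y):
    """Assumed weighting: hypothesis margin (nearest miss minus nearest hit),
    shifted to be non-negative and normalized to sum to 1.
    Requires at least two samples per class and at least two classes."""
    D = pairwise_dist(X, list(range(X.shape[1])))
    np.fill_diagonal(D, np.inf)  # a sample is not its own nearest hit
    same_label = y[:, None] == y[None, :]
    nearest_hit = np.where(same_label, D, np.inf).min(axis=1)
    nearest_miss = np.where(~same_label, D, np.inf).min(axis=1)
    margin = nearest_miss - nearest_hit
    w = margin - margin.min() + 1e-12
    return w / w.sum()


def weighted_dependency(X, y, feats, delta, w):
    """Total weight of the samples that fall in the positive region of `feats`."""
    return float(w[positive_region(X, y, feats, delta)].sum())


def greedy_select(X, y, delta=0.15, tol=1e-9):
    """Forward selection: add the feature that most increases the weighted
    dependency; stop when no candidate improves the score by more than tol."""
    w = sample_weights(X, y)
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scored = [(weighted_dependency(X, y, selected + [f], delta, w), f)
                  for f in remaining]
        score, f = max(scored, key=lambda t: t[0])
        if score <= best + tol:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 does not.
    X_demo = np.array([[0.0, 0.5], [0.1, 0.1], [0.2, 0.9],
                       [0.8, 0.4], [0.9, 0.8], [1.0, 0.2]])
    y_demo = np.array([0, 0, 0, 1, 1, 1])
    print(greedy_select(X_demo, y_demo))
```

With uniform weights, the score reduces to the fraction of samples in the positive region, which is the usual single-criterion dependency measure; the margin-based weights are what let samples contribute unequally in this sketch.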
