Journal of Frontiers of Computer Science and Technology (计算机科学与探索) ›› 2012, Vol. 6 ›› Issue (1): 90-96. DOI: 10.3778/j.issn.1673-9418.2012.01.007

• Academic Research •

属性赋权的K-Modes算法优化

李仁侃, 叶东毅   

  1. 福州大学 数学与计算机科学学院, 福州 350108
  • 出版日期:2012-01-01 发布日期:2012-01-01

Optimization of K-Modes Algorithm with Feature Weights

LI Renkan, YE Dongyi   

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China
  • Online: 2012-01-01 Published: 2012-01-01

摘要: 传统K-Modes算法的一个主要问题是属性选择问题。K-Modes算法在聚类过程中对每一个属性都同等看待, 而在实际应用中, 很多数据集仅有几个重要属性对聚类起作用。为了考虑不同属性对聚类的不同影响, 将K-Modes聚类算法与属性权重的最优化结合起来, 提出一种属性自动赋权的FW-K-Modes算法。该算法不仅可以提高传统K-Modes聚类算法的聚类精度, 还能分析各维属性对聚类的贡献程度, 实现关键属性的选择。对多个UCI数据集进行了实验, 验证了该算法的优良特性。

关键词: K-Modes聚类, 属性选择, 自动属性赋权

Abstract: A major problem of the traditional K-Modes algorithm is feature selection. K-Modes treats all features equally during clustering, but in practice many data sets have only a few important features that contribute to the clustering. To account for the different influence of different features, this paper combines the K-Modes clustering algorithm with feature weight optimization and proposes FW-K-Modes, an algorithm with automated feature weighting. The proposed algorithm not only improves clustering accuracy over the traditional K-Modes algorithm, but also analyzes how much each feature contributes to the clustering and thereby selects key features. Experiments on several UCI machine learning data sets validate the effectiveness of the proposed algorithm.

Key words: K-Modes clustering, feature selection, automated feature weighting
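
To illustrate why treating all features equally can hurt, the following is a minimal Python sketch of a K-Modes assignment step under a weighted simple-matching dissimilarity. It is not the FW-K-Modes procedure itself: the abstract does not state how the weights are optimized, so here they are supplied by hand. The function names (`weighted_mismatch`, `column_modes`, `assign`) and the toy data are assumptions made for this example only.

```python
from collections import Counter


def weighted_mismatch(record, mode, weights):
    """Weighted simple-matching dissimilarity: sum the weights of the
    attributes on which the record and the cluster mode disagree."""
    return sum(w for a, b, w in zip(record, mode, weights) if a != b)


def column_modes(records):
    """Cluster mode: the most frequent category of each attribute."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def assign(records, modes, weights):
    """Assign each record to the index of its nearest mode."""
    return [
        min(range(len(modes)),
            key=lambda k: weighted_mismatch(r, modes[k], weights))
        for r in records
    ]


# Toy data (hypothetical): attribute 0 is assumed to be the informative one.
modes = [("a", "x", "p"), ("b", "y", "q")]
record = ("a", "y", "q")

# Equal weights reproduce the classical K-Modes behaviour.
equal_assignment = assign([record], modes, [1.0, 1.0, 1.0])

# Hand-picked weights (an assumption, not learned) emphasise attribute 0.
weighted_assignment = assign([record], modes, [0.8, 0.1, 0.1])
```

With equal weights the record is drawn to the second mode by its two less informative attributes; once the first attribute carries most of the weight, the same record is assigned to the first mode. An automated weighting scheme would have to learn such weights from the data rather than take them as input.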