Journal of Frontiers of Computer Science and Technology, 2009, Vol. 3, Issue 6: 612-620. DOI: 10.3778/j.issn.1673-9418.2009.06.006

• Academic Research •

Gene Selection for Cancer Classification Using DCA

LE THI Hoai An+, NGUYEN Van-Vinh, OUCHANI Samir

  1. Laboratory of Theoretical and Applied Computer Science (LITA), UFR MIM, University of Paul Verlaine-Metz, Ile du Saulcy, 57045 Metz, France
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-11-15 Published: 2009-11-15
  • Contact: LE THI Hoai An

Abstract: The problem of gene selection for cancer classification is considered. A combined SVM-feature selection approach based on the smoothly clipped absolute deviation (SCAD) penalty is developed that directly optimizes classifier performance. To solve the resulting optimization problems, DCA (difference of convex functions algorithms), a general framework for nonconvex continuous optimization, is applied. This leads to a successive linear programming algorithm with finite convergence. Preliminary computational experiments on several real data sets demonstrate that the method accomplishes the desired goal: suppressing a large number of features while keeping the classification error small.

Key words: gene selection, feature selection, cancer classification, support vector machine (SVM), nonconvex optimization, DC programming
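
The abstract names the SCAD penalty and a DC (difference of convex functions) treatment without giving formulas. As a rough illustration only, the sketch below uses the standard SCAD penalty of Fan and Li and one common way of writing it as a difference of two convex functions, λ|t| − h(t). The function names, the default a = 3.7, and this particular decomposition are assumptions for illustration; the paper's own formulation may differ.

```python
def scad_penalty(t, lam, a=3.7):
    """Standard SCAD penalty for a scalar t (assumes lam > 0 and a > 2)."""
    x = abs(t)
    if x <= lam:
        # Behaves like the L1 penalty near zero.
        return lam * x
    if x <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    # Constant for large |t|, so large coefficients are not over-penalized.
    return (a + 1) * lam * lam / 2


def scad_convex_h(t, lam, a=3.7):
    """Convex function h such that scad_penalty(t) == lam * |t| - h(t).

    This is one possible DC decomposition (an assumption, not taken
    from the paper): both lam * |t| and h are convex in t.
    """
    x = abs(t)
    if x <= lam:
        return 0.0
    if x <= a * lam:
        return (x - lam) ** 2 / (2 * (a - 1))
    return lam * x - (a + 1) * lam * lam / 2


if __name__ == "__main__":
    lam = 1.0
    for t in (-5.0, -2.0, -0.5, 0.0, 0.5, 2.0, 5.0):
        dc_value = lam * abs(t) - scad_convex_h(t, lam)
        print(t, scad_penalty(t, lam), dc_value)
```

In a DCA-style scheme, the concave part −h is linearized at the current iterate and the remaining convex problem is solved at each step; with an L1-type convex part and linear constraints, each such subproblem can be a linear program, which is consistent with the successive linear programming algorithm the abstract describes.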
