Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2012, Vol. 6, Issue (2): 183-192. DOI: 10.3778/j.issn.1673-9418.2012.02.010

• Academic Research •

Research on Contribution Factor for Sample Properties in RBF Neural Network
(径向基网络中样本属性的贡献因子研究)

XU Sheng (徐 昇), YE Ning (业 宁), XU Shanshan (徐姗姗)

  1. College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China
  • Online: 2012-02-01 Published: 2012-02-01

Abstract: Improvements to radial basis function (RBF) neural networks usually focus on hidden-node selection, learning speed on large-scale data, or the organization of the basis functions, while the structural information among the initial input samples themselves is often ignored. This paper finds that different properties (attributes) of the input samples influence classification ability to different degrees, so each property should have its own classification weight. After the samples are normalized in preprocessing, the paper studies the contribution factor of each property to classification and proposes a new algorithm model, CFRBF (contribution factor RBF), which uses contribution factors to describe the importance of each sample property. Protein secondary structure prediction is chosen to validate the model. Traditional secondary structure prediction feeds the original samples directly into the network and classifies them by Hamming distance alone, which loses a large amount of information. For the new model, the paper uses a new organization form for the input to handle the prediction problem. Experimental results show that network performance improves markedly once the new organization form is adopted, and that accuracy improves further with the CFRBF algorithm. In addition, the contribution factors can reveal how the conformational states of amino acids influence one another in seemingly irregular protein sequences, and they indicate the classification importance of the different sample properties.

Key words: contribution factor, sample properties, radial basis function network, structural information, protein secondary structure prediction
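
The abstract does not state the CFRBF formula, so the following is only an illustrative sketch of the general idea it describes: normalize the samples, then give each property its own weight inside the RBF distance. The function names, the min-max normalization, the Gaussian kernel, and the `factors` and `sigma` parameters are all assumptions made for this example, not the authors' implementation.

```python
import math


def min_max_normalize(samples):
    """Scale every property (column) to [0, 1]; constant columns map to 0."""
    columns = list(zip(*samples))
    lows = [min(col) for col in columns]
    # A zero span (constant column) is replaced by 1.0 to avoid dividing by zero.
    spans = [(max(col) - lo) or 1.0 for col, lo in zip(columns, lows)]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in samples]


def weighted_rbf(x, center, factors, sigma=1.0):
    """Gaussian RBF activation with a property-weighted squared distance.

    factors[j] is a hypothetical contribution factor for property j.
    Setting every factor to 1.0 recovers the ordinary Gaussian RBF.
    """
    d2 = sum(f * (a - c) ** 2 for f, a, c in zip(factors, x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


# Toy data: three samples with two properties on very different scales.
samples = min_max_normalize([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
center = samples[0]

# Equal factors: both properties contribute to the distance.
plain = weighted_rbf(samples[1], center, factors=[1.0, 1.0])

# A zero factor removes the second property from the distance entirely.
reweighted = weighted_rbf(samples[1], center, factors=[1.0, 0.0])
```

In this sketch a larger factor makes differences in that property count more toward the distance between a sample and a hidden-node center, which is one simple way a per-property weight could change classification. How CFRBF actually computes or learns its contribution factors is described in the full paper, not in the abstract.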