Journal of Frontiers of Computer Science and Technology, 2009, Vol. 3, Issue 3: 293-302. DOI: 10.3778/j.issn.1673-9418.2009.03.007

• Academic Research •

Identification of Interface Residues Involved in Protein-protein Interactions Using Naïve Bayes Classifier

WANG Chishe1,2+, CHENG Jiaxing1, SU Shoubao1, XU Dongzhe3   

  1. Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei 230039, China
    2. Department of Computer Science and Technology, Chaohu College, Chaohu, Anhui 238000, China
    3. Department of Modern Mechanics, University of Science and Technology of China, Hefei 230026, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-05-15 Published: 2009-05-15
  • Contact: WANG Chishe




Abstract: Identifying the interface residues involved in protein-protein interactions (PPIs) has broad applications in rational drug design, the study of metabolism, and related areas. This paper proposes a naïve Bayes classifier for predicting PPI interface residues from two kinds of features: the protein sequence profile and the residue accessible surface area. The feature set is chosen to suit the naïve Bayes classifier, which assumes that the attributes are conditionally independent given the class. On a diverse dataset consisting only of hetero-complex proteins, the method achieves 68.1% overall accuracy, a correlation coefficient of 0.201, 40.2% specificity, and 49.9% sensitivity in identifying interface residues, as estimated by leave-one-out cross-validation. These results indicate that the method performs substantially better than chance (zero correlation). Examining the predictions in the context of the three-dimensional structures of the proteins further demonstrates the method's effectiveness in identifying protein-protein interaction sites.

Key words: naïve Bayes classifier, protein-protein interactions, sequence profile, residue accessible surface area
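
To make the classifier named in the abstract concrete, here is a minimal sketch of a categorical naïve Bayes model with per-sample leave-one-out evaluation. It is an illustration, not the authors' implementation: the function names, the Laplace smoothing parameter `alpha`, and the toy features (a burial bin and a residue class) are all assumptions introduced here, and the abstract does not say whether leave-one-out was run per residue or per protein.

```python
import math
from collections import Counter


def train_naive_bayes(samples, labels, alpha=1.0):
    """Fit a categorical naive Bayes model.

    samples: list of tuples of discrete feature values (hypothetical encoding).
    labels:  list of class labels (1 = interface, 0 = non-interface).
    alpha:   Laplace smoothing constant (an assumption, not from the paper).
    """
    class_counts = Counter(labels)
    n_features = len(samples[0])
    # value_counts[c][j][v] = number of class-c samples whose feature j equals v
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    vocab = [set() for _ in range(n_features)]
    for x, c in zip(samples, labels):
        for j, v in enumerate(x):
            value_counts[c][j][v] += 1
            vocab[j].add(v)
    return class_counts, value_counts, vocab, alpha


def predict(model, x):
    """Return the class maximizing log P(c) + sum_j log P(x_j | c)."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, v in enumerate(x):
            # Features contribute independently given the class: this is the
            # conditional-independence assumption the abstract refers to.
            numerator = value_counts[c][j][v] + alpha
            denominator = n_c + alpha * len(vocab[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


def leave_one_out(samples, labels):
    """Predict each sample with a model trained on all the other samples."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train_naive_bayes(train_x, train_y)
        predictions.append(predict(model, samples[i]))
    return predictions


# Toy data (invented for illustration only).
toy_samples = [
    ("exposed", "hydrophobic"), ("exposed", "hydrophobic"), ("exposed", "polar"),
    ("buried", "polar"), ("buried", "polar"), ("buried", "hydrophobic"),
]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_model = train_naive_bayes(toy_samples, toy_labels)
```

Working in log space avoids numerical underflow when many features are multiplied together, which matters once a real sequence profile contributes many attributes per residue.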

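Abstract (translated from the Chinese): Identifying interface residues in protein-protein interactions has wide application in drug design, the metabolism of organisms, and related areas. Based on the naïve Bayes classifier's requirement that attributes be conditionally independent, a feature model of protein-protein interactions is built from the protein sequence profile and the solvent accessible surface area. On a representative dataset of protein hetero-complexes, the method achieves 68.1% accuracy, a correlation coefficient of 0.201, 40.2% specificity, and 49.9% sensitivity; these results are better than those of other methods and far better than random. A three-dimensional visualization of the results further verifies the effectiveness of the method.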

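Key words (translated from the Chinese): naïve Bayes classifier, protein-protein interaction interface, sequence profile, residue solvent accessible surface area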
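
The abstract reports accuracy, a correlation coefficient, specificity, and sensitivity without defining them. The sketch below computes the usual confusion-matrix versions of these quantities, taking the correlation coefficient to be the Matthews correlation coefficient; that reading is an assumption. Because "specificity" can mean either the true-negative rate or, in some interface-prediction work, the fraction of predicted interface residues that are correct (precision), both are returned. The function name and dictionary keys are hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Summarize binary predictions (1 = interface, 0 = non-interface)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on degenerate inputs.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),          # recall on interface residues
        "precision": ratio(tp, tp + fp),            # one possible "specificity"
        "true_negative_rate": ratio(tn, tn + fp),   # the other possible "specificity"
        "mcc": ratio(tp * tn - fp * fn, mcc_den),   # 0 corresponds to chance
    }
```

A correlation coefficient of zero corresponds to chance-level prediction, which is the baseline the abstract compares against.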