Protein Structure Class Prediction Based on Autocorrelation Coefficient and PseAAC

Journal of Frontiers of Computer Science and Technology, 2014, Vol. 8, Issue (1): 103-110. DOI: 10.3778/j.issn.1673-9418.1307017

ZHANG Yanping1,2, ZHA Yongliang1,2, ZHAO Shu1,2, DU Xiuquan1,2+

1. School of Computer Science and Technology, Anhui University, Hefei 230601, China
2. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China

Online: 2014-01-01; Published: 2014-01-03

Abstract

Traditional prediction methods consider only amino acid composition when constructing the feature vector. The autocorrelation coefficient, by contrast, reflects both the positions of amino acids in the sequence and the interactions between amino acids at different positions. This paper first designs a method that combines amino acid composition with the autocorrelation coefficient to construct the feature vector. Second, building on the pseudo-amino acid composition (PseAAC) model proposed by Chou, it reconstructs the PseAAC model by extending the information it encodes, and combines the reconstructed model with the autocorrelation coefficient to construct the feature vector. Using each of the two encodings with a support vector machine as the prediction tool, several experiments are conducted on the datasets Z277 and Z498 and on the independent test set D138. The comparison shows that the two new methods improve accuracy over the traditional amino acid composition method by 7.43% and 8.53% on average, respectively, which indicates that the new methods are effective.

Key words: protein structure class prediction, autocorrelation coefficient, pseudo-amino acid composition (PseAAC), support vector machine (SVM)
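The first encoding described in the abstract, amino acid composition concatenated with autocorrelation coefficients, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which physicochemical property or which autocorrelation formula the paper uses, so the Kyte-Doolittle hydropathy index, the lag-averaged product form, and the names `aa_composition`, `autocorrelation`, `feature_vector`, and `max_lag` are all assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index. ASSUMPTION: the paper's actual
# physicochemical property is not given in the abstract.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aa_composition(seq):
    """Return the 20 amino acid frequencies of seq, in AMINO_ACIDS order."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def autocorrelation(seq, max_lag):
    """Return r_1..r_max_lag, where r_k is the mean of h_i * h_(i+k).

    ASSUMPTION: this averaged-product form is one common definition;
    the paper may use a different normalization. Raises KeyError for
    residues outside the 20 standard amino acids.
    """
    h = [KD[a] for a in seq]
    n = len(h)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    return [
        sum(h[i] * h[i + k] for i in range(n - k)) / (n - k)
        for k in range(1, max_lag + 1)
    ]


def feature_vector(seq, max_lag=5):
    """Concatenate composition (20 values) and autocorrelation (max_lag values)."""
    return aa_composition(seq) + autocorrelation(seq, max_lag)
```

Calling `feature_vector(sequence, max_lag=5)` on a protein sequence yields a 25-dimensional vector. Vectors of this kind could then be passed to any support vector machine implementation; the choice of `max_lag` here is arbitrary and would need tuning.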

ZHANG Yanping, ZHA Yongliang, ZHAO Shu, DU Xiuquan. Protein Structure Class Prediction Based on Autocorrelation Coefficient and PseAAC[J]. Journal of Frontiers of Computer Science and Technology, 2014, 8(1): 103-110.
