Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (1): 1-20. DOI: 10.3778/j.issn.1673-9418.2107029
PAN Yuliang1,2, GUAN Jihong1,+, YAO Heng1,2, SHI Yunjia1,2, ZHOU Shuigeng3,4
Received:
2021-06-08
Revised:
2021-08-05
Publication date:
2022-01-01
Online release date:
2021-08-09
Corresponding author:
+ E-mail: jhguan@tongji.edu.cn
About author:
PAN Yuliang, born in 1992. Ph.D. candidate, student member of CCF. His research interests include data mining and computational biology.
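Abstract:
Proteins are the material basis of life and directly participate in and carry out life processes. Most proteins perform their biological functions by interacting with one another to form complexes, so predicting protein complexes helps to elucidate the structure and function of complexes and lays an important foundation for studying cellular mechanisms. With the continuing development of high-throughput experimental techniques, genome-wide protein-protein interaction (PPI) data keep growing, and many computational methods for protein complex prediction have been proposed. Although existing methods each have their own characteristics and strengths, they also have shortcomings. This paper first classifies the existing computational methods for protein complex prediction and reviews them comprehensively and in detail. It then introduces the evaluation metrics and main data sets commonly used in complex prediction, and compares and analyzes the prediction performance of several representative methods. Finally, it summarizes complex prediction methods, discusses future prospects, and raises several open problems. By analyzing and comparing the various kinds of methods, the survey aims to provide a useful reference and guidance for those who use and study computational methods for protein complex prediction.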
PAN Yuliang, GUAN Jihong, YAO Heng, SHI Yunjia, ZHOU Shuigeng. Computational Methods for Protein Complex Prediction: A Survey[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(1): 1-20.
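Category | Method | Link |
---|---|---|
Local dense subgraph-based methods | MCODE | |
Local dense subgraph-based methods | ClusterOne | |
Local dense subgraph-based methods | DPClus | |
Local dense subgraph-based methods | RNSC | |
Local dense subgraph-based methods | IPCA | |
Local dense subgraph-based methods | Clique | |
Local dense subgraph-based methods | CFinder | |
Local dense subgraph-based methods | CMC | |
Local dense subgraph-based methods | MCL | |
Local dense subgraph-based methods | NEOComplex | |
Local dense subgraph-based methods | GraphEntropy | |
Local dense subgraph-based methods | ProRank+ | |
Local dense subgraph-based methods | SPICi | |
Core-attachment structure-based methods | CORE | |
Core-attachment structure-based methods | COACH | |
Core-attachment structure-based methods | EWCA | |
Dynamic network-based methods | Zhang et al. | |
Supervised learning-based methods | Qi et al. | |
Function-to-interaction methods | CPredictor2.0 | |
Function-to-interaction methods | CPredictor3.0 | |
Function-to-interaction methods | CPredictor5.0 | |
Multi-source data-based methods | SGNMF | |
Multi-source data-based methods | IdenPC-CAP | |
Other methods | EnsemHC | N/A |
Table 1 Summary of main methods and source code for protein complex prediction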
Database | URL |
---|---|
STRING | |
DIP | |
BioGRID | |
IntAct | |
Table 2 Protein-protein interaction databases
Data set | Number of proteins | Number of interactions |
---|---|---|
Gavin | 1,855 | 7,669 |
Krogan | 2,674 | 7,075 |
Collins | 1,622 | 9,074 |
Table 3 Protein-protein interaction data sets
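The counts in Table 3 can be turned into rough network statistics. The following is a minimal sketch, assuming each data set is a simple undirected graph (one edge per listed interaction, no self-loops or duplicate edges), which the table itself does not state; the names `DATASETS`, `average_degree`, and `density` are illustrative only.

```python
# Protein and interaction counts copied from Table 3.
DATASETS = {
    "Gavin": (1855, 7669),
    "Krogan": (2674, 7075),
    "Collins": (1622, 9074),
}


def average_degree(num_proteins: int, num_interactions: int) -> float:
    """Mean number of interaction partners per protein (2E / N)."""
    return 2 * num_interactions / num_proteins


def density(num_proteins: int, num_interactions: int) -> float:
    """Fraction of all possible protein pairs that interact.

    Assumes a simple undirected graph, so there are N(N-1)/2 possible pairs.
    """
    possible_pairs = num_proteins * (num_proteins - 1) / 2
    return num_interactions / possible_pairs


for name, (n, m) in DATASETS.items():
    print(f"{name}: average degree = {average_degree(n, m):.2f}, "
          f"density = {density(n, m):.4f}")
```

Under these assumptions Collins is the densest of the three networks and Krogan the sparsest, which may be worth keeping in mind when comparing how many complexes a method predicts on each network.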
Data set | Number of proteins | Number of complexes |
---|---|---|
MIPS | 1,627 | 349 |
CYC2008 | 1,237 | 313 |
Table 4 Protein complex data sets
Algorithm | Number of complexes | Small complexes | Large complexes | Average complex size |
---|---|---|---|---|
MCODE | 111 | 36 | 75 | 7.7 |
ClusterOne | 315 | 171 | 144 | 5.4 |
DPClus | 335 | 219 | 116 | 4.5 |
RNSC | 489 | 397 | 92 | 3.1 |
CORE | 299 | 256 | 43 | 2.6 |
Zhang et al. | 496 | 284 | 212 | 5.3 |
MCL | 297 | 190 | 107 | 5.4 |
SPICi | 174 | 68 | 106 | 6.5 |
CMC | 173 | 0 | 173 | 10.8 |
EWCA | 659 | 71 | 588 | 21.6 |
IPCA | 580 | 212 | 368 | 14.6 |
COACH | 246 | 43 | 203 | 17.7 |
GraphEntropy | 350 | 209 | 141 | 5.8 |
Clique | 490 | 146 | 344 | 9.1 |
ProRank+ | 659 | 93 | 566 | 20.2 |
CFinder | 114 | 43 | 71 | 10.6 |
CPredictor | 230 | 118 | 122 | 6.3 |
CPredictor2.0 | 764 | 525 | 239 | 3.9 |
CPredictor3.0 | 203 | 37 | 166 | 12.3 |
CPredictor4.0 | 408 | 215 | 193 | 5.7 |
CPredictor5.0 | 485 | 284 | 201 | 4.8 |
Table 5 Attribute comparison of protein complexes for different computational methods on Collins data set
Algorithm | Number of complexes | Small complexes | Large complexes | Average complex size |
---|---|---|---|---|
MCODE | 94 | 31 | 63 | 11.9 |
ClusterOne | 196 | 20 | 176 | 7.5 |
DPClus | 418 | 297 | 121 | 3.6 |
RNSC | 304 | 56 | 248 | 8.2 |
CORE | 413 | 226 | 187 | 4.7 |
Zhang et al. | 447 | 236 | 211 | 3.8 |
MCL | 320 | 148 | 172 | 5.7 |
SPICi | 149 | 72 | 77 | 4.2 |
CMC | 616 | 327 | 289 | 4.3 |
EWCA | 913 | 120 | 793 | 11.2 |
IPCA | 920 | 595 | 325 | 4.5 |
COACH | 360 | 120 | 240 | 5.6 |
GraphEntropy | 434 | 236 | 198 | 4.8 |
Clique | 1,148 | 346 | 802 | 5.7 |
ProRank+ | 525 | 80 | 445 | 12.5 |
CFinder | 48 | 38 | 10 | 3.2 |
CPredictor | 197 | 76 | 121 | 7.3 |
CPredictor2.0 | 698 | 360 | 338 | 3.8 |
CPredictor3.0 | 320 | 114 | 206 | 7.7 |
CPredictor4.0 | 303 | 140 | 163 | 5.6 |
CPredictor5.0 | 336 | 189 | 147 | 3.9 |
Table 6 Attribute comparison of protein complexes for different computational methods on Gavin data set
Algorithm | Collins Recall | Collins Precision | Collins F1 | Gavin Recall | Gavin Precision | Gavin F1 |
---|---|---|---|---|---|---|
MCODE | 0.26 | 0.71 | 0.38 | 0.18 | 0.56 | 0.27 |
ClusterOne | 0.55 | 0.59 | 0.57 | 0.37 | 0.61 | 0.46 |
DPClus | 0.55 | 0.64 | 0.60 | 0.35 | 0.36 | 0.35 |
RNSC | 0.56 | 0.50 | 0.53 | 0.47 | 0.44 | 0.45 |
CORE | 0.47 | 0.52 | 0.49 | 0.46 | 0.36 | 0.40 |
Zhang et al. | 0.57 | 0.62 | 0.59 | 0.48 | 0.50 | 0.48 |
MCL | 0.55 | 0.56 | 0.56 | 0.41 | 0.39 | 0.40 |
SPICi | 0.35 | 0.63 | 0.45 | 0.29 | 0.62 | 0.40 |
CMC | 0.24 | 0.60 | 0.34 | 0.49 | 0.33 | 0.39 |
EWCA | 0.33 | 0.69 | 0.44 | 0.39 | 0.53 | 0.45 |
IPCA | 0.56 | 0.59 | 0.58 | 0.54 | 0.32 | 0.40 |
COACH | 0.32 | 0.63 | 0.42 | 0.38 | 0.37 | 0.37 |
GraphEntropy | 0.55 | 0.55 | 0.55 | 0.38 | 0.37 | 0.37 |
Clique | 0.34 | 0.39 | 0.36 | 0.41 | 0.30 | 0.35 |
ProRank+ | 0.34 | 0.66 | 0.45 | 0.36 | 0.60 | 0.45 |
CFinder | 0.40 | 0.63 | 0.49 | 0.20 | 0.48 | 0.29 |
CPredictor | 0.47 | 0.66 | 0.55 | 0.38 | 0.62 | 0.48 |
CPredictor2.0 | 0.56 | 0.64 | 0.60 | 0.48 | 0.52 | 0.50 |
CPredictor3.0 | 0.54 | 0.70 | 0.61 | 0.52 | 0.55 | 0.53 |
CPredictor4.0 | 0.56 | 0.73 | 0.63 | 0.43 | 0.72 | 0.54 |
CPredictor5.0 | 0.60 | 0.61 | 0.61 | 0.52 | 0.54 | 0.52 |
Table 7 Comparison of protein complex prediction results for various methods on CYC2008 standard set
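Table 7 reports F1 alongside recall and precision. The following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall. How a predicted complex is matched to a CYC2008 complex is not specified in this excerpt, so that step is not modeled. `f1_score` is an illustrative helper, and the sample rows are copied from the Collins columns of the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F1) copied from Table 7, Collins columns.
COLLINS_ROWS = {
    "MCODE": (0.26, 0.71, 0.38),
    "ClusterOne": (0.55, 0.59, 0.57),
    "CPredictor4.0": (0.56, 0.73, 0.63),
}

for name, (recall, precision, reported) in COLLINS_ROWS.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed F1 = {computed:.2f}, reported F1 = {reported:.2f}")
```

Because the table lists values rounded to two decimals, a recomputed F1 can differ from the reported one by 0.01, as in the DPClus row on Collins (0.59 recomputed versus 0.60 reported).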
[1] EISENBERG D, MARCOTTE E M, XENARIOS I, et al. Protein function in the post-genomic era[J]. Nature, 2000, 405(6788):823-826.
[2] LI M, MENG X M. The construction, analysis, and applications of dynamic protein-protein interaction networks[J]. Journal of Computer Research and Development, 2017, 54(6):1281-1299. (in Chinese)
[3] WANG J, LIANG J Y, ZHENG W P. A graph clustering method for detecting protein complexes[J]. Journal of Computer Research and Development, 2015, 52(8):1784. (in Chinese)
[4] RIGAUT G, SHEVCHENKO A, RUTZ B, et al. A generic protein purification method for protein complex characterization and proteome exploration[J]. Nature Biotechnology, 1999, 17(10):1030-1032.
[5] AEBERSOLD R, MANN M. Mass spectrometry-based proteomics[J]. Nature, 2003, 422(6928):198-207.
[6] ITO T, CHIBA T, OZAWA R, et al. A comprehensive two-hybrid analysis to explore the yeast protein interactome[J]. Proceedings of the National Academy of Sciences, 2001, 98(8):4569-4574.
[7] UETZ P, GIOT L, CAGNEY G, et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae[J]. Nature, 2000, 403(6770):623-627.
[8] GAVIN A C, BÖSCHE M, KRAUSE R, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes[J]. Nature, 2002, 415(6868):141-147.
[9] LI Z J, CHEN Y M, LIU J W, et al. A survey of computational method in protein-protein interaction research[J]. Journal of Computer Research and Development, 2008, 45(12):2129-2137. (in Chinese)
[10] PAN Y, LIU D, DENG L. Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties[J]. PLoS One, 2017, 12(6):e0179314.
[11] PAN Y, ZHOU S, GUAN J. Computationally identifying hot spots in protein-DNA binding interfaces using an ensemble approach[J]. BMC Bioinformatics, 2020, 21(13):1-16.
[12] GIRVAN M, NEWMAN M E J. Community structure in social and biological networks[J]. Proceedings of the National Academy of Sciences, 2002, 99(12):7821-7826.
[13] NEWMAN M E J. Fast algorithm for detecting community structure in networks[J]. Physical Review E, 2004, 69(6):066133.
[14] RADICCHI F, CASTELLANO C, CECCONI F, et al. Defining and identifying communities in networks[J]. Proceedings of the National Academy of Sciences of the United States of America, 2004, 101(9):2658-2663.
[15] DUNN R, DUDBRIDGE F, SANDERSON C M. The use of edge-betweenness clustering to investigate biological function in protein interaction networks[J]. BMC Bioinformatics, 2005, 6(1):39.
[16] YOON J, BLUMER A, LEE K. An algorithm for modularity analysis of directed and weighted biological networks based on edge-betweenness centrality[J]. Bioinformatics, 2006, 22(24):3106-3108.
[17] PALLA G, DERÉNYI I, FARKAS I, et al. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature, 2005, 435(7043):814-818.
[18] WANG R, LIU G, WANG C. Identifying protein complexes based on an edge weight algorithm and core-attachment structure[J]. BMC Bioinformatics, 2019, 20(1):1-20.
[19] YAO H, SHI Y, GUAN J, et al. Accurately detecting protein complexes by graph embedding and combining functions with interactions[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2019, 17(3):777-787.
[20] WANG R, WANG C, LIU G. A novel graph clustering method with a greedy heuristic search algorithm for mining protein complexes from dynamic and static PPI networks[J]. Information Sciences, 2020, 522:275-298.
[21] JALILI S, MARASHI S A. CAMWI: detecting protein complexes using weighted clustering coefficient and weighted density[J]. Computational Biology and Chemistry, 2015, 58:231-240.
[22] TANG X, WANG J, LIU B, et al. A comparison of the functional modules identified from time course and static PPI network data[J]. BMC Bioinformatics, 2011, 12(1):339.
[23] WANG J, PENG X, LI M, et al. Construction and application of dynamic protein interaction network based on time course gene expression data[J]. Proteomics, 2013, 13(2):301-312.
[24] QI Y, BALEM F, FALOUTSOS C, et al. Protein complex identification by supervised graph local clustering[J]. Bioinformatics, 2008, 24(13):i250-i268.
[25] YONG C H, WONG L, MARUYAMA O. Discovery of small protein complexes from PPI networks with size-specific supervised weighting[J]. BMC Systems Biology, 2014, 8(5):S3.
[26] XU B, GUAN J. From function to interaction: a new paradigm for accurately predicting protein complexes based on protein-to-protein interaction networks[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2014, 11(4):616-627.
[27] XU B, WANG Y, WANG Z, et al. An effective approach to detecting both small and large complexes from protein-protein interaction networks[J]. BMC Bioinformatics, 2017, 18(12):419.
[28] CHEN B, FAN W, LIU J, et al. Identifying protein complexes and functional modules—from static PPI networks to dynamic PPI networks[J]. Briefings in Bioinformatics, 2013, 15(2):177-194.
[29] WU Z, LIAO Q, LIU B. A comprehensive review and evaluation of computational methods for identifying protein complexes from protein-protein interaction networks[J]. Briefings in Bioinformatics, 2020, 21(5):1531-1548.
[30] YU Y. A survey of computational methods for protein complexes prediction based on static PPI networks[J]. Software Engineering and Applications, 2018, 7(3):151-159. (in Chinese)
[31] DAI Q G, GUO M Z. Survey on detecting complexes from protein-protein interaction network[J]. Intelligent Computer and Applications, 2015, 5(3):1-3. (in Chinese)
[32] PAN Y, WANG Z, ZHAN W, et al. Computational identification of binding energy hot spots in protein-RNA complexes using an ensemble approach[J]. Bioinformatics, 2018, 34(9):1473-1480.
[33] GAO Y, YAN L, HUANG Y, et al. Structure of the RNA-dependent RNA polymerase from COVID-19 virus[J]. Science, 2020, 368(6492):779-782.
[34] BARABASI A L, OLTVAI Z N. Network biology: understanding the cell’s functional organization[J]. Nature Reviews Genetics, 2004, 5(2):101-113.
[35] GUO M Z, DAI Q G, XU L Q, et al. On protein complexes identifying algorithm based on the novel modularity function[J]. Journal of Computer Research and Development, 2014, 51(10):2178-2186. (in Chinese)
[36] BADER G D, HOGUE C W V. An automated method for finding molecular complexes in large protein interaction networks[J]. BMC Bioinformatics, 2003, 4(1):2.
[37] PEREIRA-LEAL J B, ENRIGHT A J, OUZOUNIS C A. Prediction of functional modules from protein interaction networks[J]. Proteins: Structure, Function, and Bioinformatics, 2004, 54(1):49-57.
[38] NEPUSZ T, YU H, PACCANARO A. Detecting overlapping protein complexes in protein-protein interaction networks[J]. Nature Methods, 2012, 9(5):471-472.
[39] KENLEY E C, CHO Y R. Detecting protein complexes and functional modules from protein interaction networks: a graph entropy approach[J]. Proteomics, 2011, 11(19):3835-3844.
[40] SPIRIN V, MIRNY L A. Protein complexes and functional modules in molecular networks[J]. Proceedings of the National Academy of Sciences, 2003, 100(21):12123-12128.
[41] LI X L, FOO C S, TAN S H, et al. Interaction graph mining for protein complexes using local clique merging[J]. Genome Informatics, 2005, 16(2):260-269.
[42] ADAMCSEK B, PALLA G, FARKAS I J, et al. CFinder: locating cliques and overlapping modules in biological networks[J]. Bioinformatics, 2006, 22(8):1021-1023.
[43] WANG Y, QIAN X. Finding low-conductance sets with dense interactions (FLCD) for better protein complex prediction[J]. BMC Systems Biology, 2017, 11(3):22.
[44] OMRANIAN S, ANGELESKA A, NIKOLOSKI Z. PC2P: parameter-free network-based prediction of protein complexes[J]. Bioinformatics, 2021, 37(1):73-81.
[45] SHEN X, ZHAO Y, LI Y, et al. An integrated approach to identify protein complex based on best neighbour and modularity increment[J]. International Journal of Data Mining and Bioinformatics, 2015, 11(4):458-473.
[46] DIMITRAKOPOULOS C, THEOFILATOS K, PEGKAS A, et al. Predicting overlapping protein complexes from weighted protein interaction graphs by gradually expanding dense neighborhoods[J]. Artificial Intelligence in Medicine, 2016, 71:62-69.
[47] LI P, HE T, HU X, et al. A novel protein complex identification algorithm based on connected affinity clique extension (CACE)[J]. IEEE Transactions on Nanobioscience, 2014, 13(2):89-96.
[48] UCAR D, ASUR S, ÇATALYÜREK Ü V, et al. Improving functional modularity in protein-protein interactions graphs using hub-induced subgraphs[C]// LNCS 4213: Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, Berlin, Sep 18-22, 2006. Berlin, Heidelberg: Springer, 2006: 371-382.
[49] METE M, TANG F, XU X, et al. A structural approach for finding functional modules from large biological networks[J]. BMC Bioinformatics, 2008, 9(9):S19.
[50] ZHANG W, ZOU X. A new method for detecting protein complexes based on the three node cliques[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, 12(4):879-886.
[51] REN J, WANG J, LI M, et al. Identifying protein complexes based on density and modularity in protein-protein interaction network[J]. BMC Systems Biology, 2013, 7(4):S12.
[52] NAVLAKHA S, SCHATZ M C, KINGSFORD C. Revealing biological modules via graph summarization[J]. Journal of Computational Biology, 2009, 16(2):253-264.
[53] GEVA G, SHARAN R. Identification of protein complexes from co-immunoprecipitation data[J]. Bioinformatics, 2010, 27(1):111-117.
[54] JIA S, GAO L, GAO Y, et al. Defining and identifying cograph communities in complex networks[J]. New Journal of Physics, 2015, 17(1):013044.
[55] HU L, YUAN X, XIONG S. Identifying overlapping protein complexes in yeast protein interaction network via fuzzy clustering[C]// Proceedings of the 2017 IEEE International Conference on Fuzzy Systems, Naples, Jul 9-12, 2017. Piscataway: IEEE, 2017: 1-6.
[56] RAHMAN M S, NGOM A. A fast agglomerative community detection method for protein complex discovery in protein interaction networks[C]// LNCS 7986: Proceedings of the 8th IAPR International Conference on Pattern Recognition in Bioinformatics, Nice, Jun 17-20, 2013. Berlin, Heidelberg: Springer, 2013: 1-12.
[57] CHIN E, ZHU J. B3Clustering: identifying protein complexes from protein-protein interaction network[C]// LNCS 7808: Proceedings of the 15th Asia-Pacific Web Conference on Web Technologies and Applications, Sydney, Apr 4-6, 2013. Berlin, Heidelberg: Springer, 2013: 108-119.
[58] ALTAF-UL-AMIN M, SHINBO Y, MIHARA K, et al. Development and implementation of an algorithm for prediction of protein complexes in large interaction networks[J]. BMC Bioinformatics, 2006, 7(1):207.
[59] LI M, CHEN J, WANG J, et al. Modifying the DPClus algorithm for identifying protein complexes based on new topological structures[J]. BMC Bioinformatics, 2008, 9(1):398.
[60] LIU G, WONG L, CHUA H N. Complex discovery from weighted PPI networks[J]. Bioinformatics, 2009, 25(15):1891-1897.
[61] SHEN X, ZHAO Y, LI Y, et al. An efficient protein complex mining algorithm based on multistage kernel extension[J]. BMC Bioinformatics, 2014, 15(12):S7.
[62] NI W, XIONG H, ZHAO B, et al. Predicting overlapping protein complexes in weighted interactome networks[J]. Journal of Zhejiang University: Science C, 2013, 14(10):756-765.
[63] HANNA E M, ZAKI N. Detecting protein complexes in protein interaction networks using a ranking algorithm with a refined merging procedure[J]. BMC Bioinformatics, 2014, 15(1):204.
[64] JIANG P, SINGH M. SPICi: a fast clustering algorithm for large biological networks[J]. Bioinformatics, 2010, 26(8):1105-1111.
[65] BANDYOPADHYAY S, RAY S, MUKHOPADHYAY A, et al. A multiobjective approach for identifying protein complexes and studying their association in multiple disorders[J]. Algorithms for Molecular Biology, 2015, 10(1):24.
[66] THEOFILATOS K, PAVLOPOULOU N, PAPASAVVAS C, et al. Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: evolutionary enhanced Markov clustering[J]. Artificial Intelligence in Medicine, 2015, 63(3):181-189.
[67] RAMADAN E, NAEF A, AHMED M. Protein complexes predictions within protein interaction networks using genetic algorithms[J]. BMC Bioinformatics, 2016, 17(7):269.
[68] CAO B, LUO J, LIANG C, et al. MOEPGA: a novel method to detect protein complexes in yeast protein-protein interaction networks based on multiobjective evolutionary programming genetic algorithm[J]. Computational Biology and Chemistry, 2015, 58:173-181.
[69] ARNAU V, MARS S, MARÍN I. Iterative cluster analysis of protein interaction data[J]. Bioinformatics, 2004, 21(3):364-378.
[70] MA C Y, CHEN Y P P, BERGER B, et al. Identification of protein complexes by integrating multiple alignment of protein interaction networks[J]. Bioinformatics, 2017, 33(11):1681-1688.
[71] LI P, HU X, HE T, et al. Mining protein complexes based on connected affinity clique extension[C]// Proceedings of the 2013 IEEE International Conference on Bioinformatics and Biomedicine, Shanghai, Dec 18-21, 2013. Washington: IEEE Computer Society, 2013: 53-56.
[72] CHUA H N, NING K, SUNG W K, et al. Using indirect protein-protein interactions for protein complex prediction[J]. Journal of Bioinformatics and Computational Biology, 2008, 6(3):435-466.
[73] FRIEDEL C C, KRUMSIEK J, ZIMMER R. Bootstrapping the interactome: unsupervised identification of protein complexes in yeast[J]. Journal of Computational Biology, 2009, 16(8):971-987.
[74] WU Z, LIAO Q, LIU B. idenPC-MIIP: identify protein complexes from weighted PPI networks using mutual important interacting partner relation[J]. Briefings in Bioinformatics, 2021, 22(2):1972-1983.
[75] YAO H, GUAN J, LIU T. Denoising protein-protein interaction network via variational graph auto-encoder for protein complex detection[J]. Journal of Bioinformatics and Computational Biology, 2020, 18(3):2040010.
[76] KOMUROV K, WHITE M. Revealing static and dynamic modular architecture of the eukaryotic protein interaction network[J]. Molecular Systems Biology, 2007, 3(1):110.
[77] FENG J, JIANG R, JIANG T. A max-flow-based approach to the identification of protein complexes using protein interaction and microarray data[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2011, 8(3):621-634.
[78] MARAZIOTIS I A, DIMITRAKOPOULOU K, BEZERIANOS A. Growing functional modules from a seed protein via integration of protein interaction and gene expression data[J]. BMC Bioinformatics, 2007, 8(1):408.
[79] CHEN W, LI M, WU X, et al. Identifying protein complexes based on the integration of PPI network and gene expression data[J]. International Journal of Bioinformatics Research and Applications, 2015, 11(1):30-44.
[80] KERETSU S, SARMAH R. Weighted edge based clustering to identify protein complexes in protein-protein interaction networks incorporating gene expression profile[J]. Computational Biology and Chemistry, 2016, 65:69-79.
[81] ULITSKY I, SHAMIR R. Identification of functional modules using network topology and high-throughput data[J]. BMC Systems Biology, 2007, 1(1):8.
[82] OU-YANG L, DAI D Q, ZHANG X F. Detecting protein complexes from signed protein-protein interaction networks[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, 12(6):1333-1344.
[83] WANG R, WANG C, SUN L, et al. A seed-extended algorithm for detecting protein complexes based on density and modularity with topological structure and GO annotations[J]. BMC Genomics, 2019, 20(1):637.
[84] HU L, CHAN K C C. A density-based clustering approach for identifying overlapping protein complexes with functional preferences[J]. BMC Bioinformatics, 2015, 16(1):174.
[85] CAI B, WANG H, ZHENG H, et al. Identification of protein complexes from tandem affinity purification/mass spectrometry data via biased random walk[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, 12(2):455-466.
[86] ZHANG Y, LIN H, YANG Z, et al. Integrating experimental and literature protein-protein interaction data for protein complex prediction[J]. BMC Genomics, 2015, 16(2):S4.
[87] RIZZETTO S, PRIAMI C, CSIKÁSZ-NAGY A. Qualitative and quantitative protein complex prediction through proteome-wide simulations[J]. PLoS Computational Biology, 2015, 11(10):e1004424.
[88] KING A D, PRŽULJ N, JURISICA I. Protein complex prediction via cost-based clustering[J]. Bioinformatics, 2004, 20(17):3013-3020.
[89] LUBOVAC Z, GAMALIELSSON J, OLSSON B. Combining functional and topological properties to identify core modules in protein interaction networks[J]. Proteins: Structure, Function, and Bioinformatics, 2006, 64(4):948-959.
[90] CHO Y R, HWANG W, RAMANATHAN M, et al. Semantic integration to identify overlapping functional modules in protein interaction networks[J]. BMC Bioinformatics, 2007, 8(1):265.
[91] KIM P M, LU L J, XIA Y, et al. Relating three-dimensional structures to protein networks provides evolutionary insights[J]. Science, 2006, 314(5807):1938-1941.
[92] OU-YANG L, YAN H, ZHANG X F. A multi-network clustering method for detecting protein complexes from multiple heterogeneous networks[J]. BMC Bioinformatics, 2017, 18(13):463.
[93] JUNG S H, JANG W H, HUR H Y, et al. Protein complex prediction based on mutually exclusive interactions in protein interaction network[J]. Genome Informatics, 2008, 21(1):77-88.
[94] WILL T, HELMS V. Identifying transcription factor complexes and their roles[J]. Bioinformatics, 2014, 30(17):i415-i421.
[95] MARUYAMA O, WONG L. Regularizing predicted complexes by mutually exclusive protein-protein interactions[C]// Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Paris, Aug 25-28, 2015. New York: ACM, 2015: 1068-1075.
[96] GAVIN A C, ALOY P, GRANDI P, et al. Proteome survey reveals modularity of the yeast cell machinery[J]. Nature, 2006, 440(7084):631-636.
[97] LEUNG H C M, XIANG Q, YIU S M, et al. Predicting protein complexes from PPI data: a core-attachment approach[J]. Journal of Computational Biology, 2009, 16(2):133-144.
[98] WU M, LI X, KWOH C K, et al. A core-attachment based method to detect protein complexes in PPI networks[J]. BMC Bioinformatics, 2009, 10(1):169.
[99] KOUHSAR M, ZARE-MIRAKABAD F, JAMALI Y. WCOACH: protein complex prediction in weighted PPI networks[J]. Genes & Genetic Systems, 2015, 90(5):317-324.
[100] PENG W, WANG J, ZHAO B, et al. Identification of protein complexes using weighted PageRank-Nibble algorithm and core-attachment structure[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, 12(1):179-192.
[101] LUO J, LIN D, CAO B. A cell-core-attachment approach for identifying protein complexes in yeast protein-protein interaction network[J]. Journal of Intelligent & Fuzzy Systems, 2016, 31(2):967-978.
[102] MEHRANFAR A, GHADIRI N, KOUHSAR M, et al. A Type-2 fuzzy data fusion approach for building reliable weighted protein interaction networks with application in protein complex prediction[J]. Computers in Biology and Medicine, 2017, 88:18-31.
[103] HANNA E M, ZAKI N, AMIN A. Detecting protein complexes in protein interaction networks modeled as gene expression biclusters[J]. PLoS One, 2015, 10(12):e0144163.
[104] SHEN X, YI L, JIANG X, et al. Mining temporal protein complex based on the dynamic PIN weighted with connected affinity and gene co-expression[J]. PLoS One, 2016, 11(4):e0153967.
[105] OU-YANG L, DAI D Q, LI X L, et al. Detecting temporal protein complexes from dynamic protein-protein interaction networks[J]. BMC Bioinformatics, 2014, 15(1):335.
[106] MUCHA P J, RICHARDSON T, MACON K, et al. Community structure in time-dependent, multiscale, and multiplex networks[J]. Science, 2010, 328(5980):876-878.
[107] JIN R, MCCALLEN S, LIU C C, et al. Identifying dynamic network modules with temporal and spatial constraints[C]// Proceedings of the 2009 Pacific Symposium, Kohala Coast, Jan 5-9, 2009: 203-214.
[108] SHEN X J, LI Y, JIANG X P, et al. Detecting temporal protein complexes based on neighbor closeness and time course protein interaction networks[C]// Proceedings of the 2015 IEEE International Conference on Bioinformatics and Biomedicine, Washington, Nov 9-12, 2015. Washington: IEEE Computer Society, 2015: 109-112.
[109] ZHANG Y, LIN H, YANG Z, et al. A method for predicting protein complex in dynamic PPI networks[J]. BMC Bioinformatics, 2016, 17(7):229.
[110] LEI X, WANG F, WU F X, et al. Protein complex identification through Markov clustering with firefly algorithm on dynamic protein-protein interaction networks[J]. Information Sciences, 2016, 329:303-316.
[111] LEI X, ZHANG Y, CHENG S, et al. Topology potential based seed-growth method to identify protein complexes on dynamic PPI data[J]. Information Sciences, 2018, 425:140-153.
[112] SHI Y, YAO H, GUAN J, et al. CPredictor 4.0: effectively detecting protein complexes in weighted dynamic PPI networks[J]. International Journal of Data Mining and Bioinformatics, 2018, 20(4):303-319.
[113] FIANNACA A, LA ROSA M, URSO A, et al. A knowledge-based decision support system in bioinformatics: an application to protein complex extraction[J]. BMC Bioinformatics, 2013, 14(S1):S5.
[114] |
SHI L, LEI X, ZHANG A. Protein complex prediction with semi-supervised learning in protein interaction networks[J]. Proteome Science, 2011, 9(1):S5.
DOI URL |
[115] | YU F Y, YANG Z H, TANG N, et al. Predicting protein complex in protein interaction network—a supervised learning based method[J]. BMC Systems Biology, 2014, 8(3):S4. |
[116] | YU F Y, YANG Z H, HU X H, et al. Protein complex detection in PPI networks based on data integration and supervised learning method[J]. BMC Bioinformatics, 2015, 16(12):S3. |
[117] | SIKANDAR M, ANWAR W, ALMOGREN A, et al. IoMT-based association rule mining for the prediction of human protein complexes[J]. IEEE Access, 2020, 8:6226-6237. |
[118] | XU Y, ZHOU J, ZHOU S, et al. CPredictor3.0: detecting protein complexes from PPI networks with expression data and functional annotations[J]. BMC Systems Biology, 2017, 11(7):45-56. |
[119] | WU Z, LIAO Q, FAN S, et al. idenPC-CAP: identify protein complexes from weighted RNA-protein heterogeneous interaction networks using co-assemble partner relation[J]. Briefings in Bioinformatics, 2021, 22(4):372. |
[120] | SHARAN R, IDEKER T, KELLEY B, et al. Identification of protein complexes by comparative analysis of yeast and bacterial protein interaction data[J]. Journal of Computational Biology, 2005, 12(6):835-846. |
[121] | WU M, OUYANG L, LI X L. Protein complex detection via effective integration of base clustering solutions and co-complex affinity scores[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2017, 14(3):733-739. |
[122] | SZKLARCZYK D, GABLE A L, LYON D, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets[J]. Nucleic Acids Research, 2019, 47(D1):D607-D613. |
[123] | XENARIOS I, SALWINSKI L, DUAN X J, et al. DIP, the database of interacting proteins: a research tool for studying cellular networks of protein interactions[J]. Nucleic Acids Research, 2002, 30(1):303-305. |
[124] | OUGHTRED R, STARK C, BREITKREUTZ B J, et al. The BioGRID interaction database: 2019 update[J]. Nucleic Acids Research, 2019, 47(D1):D529-D541. |
[125] | KERRIEN S, ARANDA B, BREUZA L, et al. The IntAct molecular interaction database in 2012[J]. Nucleic Acids Research, 2012, 40(D1):D841-D846. |
[126] | KROGAN N J, CAGNEY G, YU H, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae[J]. Nature, 2006, 440(7084):637-643. |
[127] | COLLINS S R, KEMMEREN P, ZHAO X C, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae[J]. Molecular & Cellular Proteomics, 2007, 6(3):439-450. |
[128] | MEWES H W, FRISHMAN D, MAYER K F X, et al. MIPS: analysis and annotation of proteins from whole genomes in 2005[J]. Nucleic Acids Research, 2006, 34:D169-D172. |
[129] | PU S, WONG J, TURNER B, et al. Up-to-date catalogues of yeast protein complexes[J]. Nucleic Acids Research, 2009, 37(3):825-831. |
[130] | OZAWA Y, SAITO R, FUJIMORI S, et al. Protein complex prediction via verifying and reconstructing the topology of domain-domain interactions[J]. BMC Bioinformatics, 2010, 11(1):350. |
[131] | JUNG S H, HYUN B, JANG W H, et al. Protein complex prediction based on simultaneous protein interaction network[J]. Bioinformatics, 2009, 26(3):385-391. |
[132] | LIU G, YONG C H, CHUA H N, et al. Decomposing PPI networks for complex discovery[C]// Proceedings of the 2010 IEEE International Conference on Bioinformatics and Biomedicine, Hong Kong, China, Dec 18-21, 2010. Washington: IEEE Computer Society, 2010: 280-283. |
[133] | SRIHARI S, LEONG H W. Employing functional interactions for characterisation and detection of sparse complexes from yeast PPI networks[J]. International Journal of Bioinformatics Research and Applications, 2012, 8(3/4):286-304. |
[134] | GROVER A, LESKOVEC J. node2vec: scalable feature learning for networks[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, Aug 13-17, 2016. New York: ACM, 2016: 855-864. |
[135] | WANG C, PAN S, LONG G, et al. MGAE: marginalized graph autoencoder for graph clustering[C]// Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, Nov 6-10, 2017. New York: ACM, 2017: 889-898. |