Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (1): 1-20.DOI: 10.3778/j.issn.1673-9418.2107029

• Surveys and Frontiers •

Computational Methods for Protein Complex Prediction: A Survey

PAN Yuliang1,2, GUAN Jihong1,+, YAO Heng1,2, SHI Yunjia1,2, ZHOU Shuigeng3,4

    1. College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
    2. Key Laboratory of Embedded System and Service Computing of Ministry of Education, Tongji University, Shanghai 201804, China
    3. School of Computer Science, Fudan University, Shanghai 200433, China
    4. Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China
  • Received: 2021-06-08 Revised: 2021-08-05 Online: 2022-01-01 Published: 2021-08-09
  • About author: PAN Yuliang, born in 1992, Ph.D. candidate, student member of CCF. His research interests include data mining and computational biology.
    GUAN Jihong, born in 1969, professor, Ph.D. supervisor, distinguished member of CCF. Her research interests include databases, data mining, bioinformatics, etc.
    YAO Heng, born in 1993, Ph.D. His research interests include network representation learning and bioinformatics.
    SHI Yunjia, born in 1993, M.S. Her research interest is bioinformatics.
    ZHOU Shuigeng, born in 1966, professor, Ph.D. supervisor, CCF Fellow. His research interests include big data management and analytics, artificial intelligence and bioinformatics.
  • Supported by:
    National Natural Science Foundation of China (61772367); National Natural Science Foundation of China (61972100)

基于计算的蛋白质复合物预测方法综述

潘玉亮1,2, 关佶红1,+, 姚恒1,2, 石运佳1,2, 周水庚3,4

    1.同济大学 电子与信息工程学院,上海 201804
    2.同济大学 嵌入式系统与服务计算教育部重点实验室,上海 201804
    3.复旦大学 计算机科学技术学院,上海 200433
    4.复旦大学 上海市智能信息处理重点实验室,上海 200433
  • Corresponding author: + E-mail: jhguan@tongji.edu.cn
  • 作者简介:潘玉亮(1992—),男,博士研究生,CCF学生会员,主要研究方向为数据挖掘、计算生物学。
    关佶红(1969—),女,教授,博士生导师,CCF杰出会员,主要研究方向为数据库、数据挖掘、生物信息学等。
    姚恒(1993—),男,博士,主要研究方向为网络表征学习、生物信息学。
    石运佳(1993—),女,硕士,主要研究方向为生物信息学。
    周水庚(1966—),男,教授,博士生导师,CCF会士,主要研究方向为大数据管理与分析、人工智能、生物信息学。
  • 基金资助:
    国家自然科学基金(61772367);国家自然科学基金(61972100)

Abstract:

Proteins, the material basis of life, are the ultimate controllers and direct performers of life activities. Most proteins perform their biological functions by binding to other proteins to form complexes, so identifying protein complexes helps clarify how complexes are organized and how they function, and lays a foundation for studying cellular mechanisms. The rapid development of high-throughput experimental technology has produced large amounts of protein-protein interaction (PPI) data, and many computational methods for predicting protein complexes from PPI data have been proposed. These methods each have their own characteristics and advantages, but also inherent drawbacks. This paper first classifies the existing protein complex prediction methods and reviews them comprehensively. It then introduces the evaluation metrics and main data sets commonly used in complex prediction, and compares the prediction performance of several representative methods. Finally, it summarizes state-of-the-art complex prediction methods, highlights future research directions, and raises several issues that remain to be resolved. Through this analysis and comparison, the paper aims to offer useful guidance to users and researchers on applying existing methods and developing new ones for protein complex prediction.

Key words: bioinformatics, protein complexes, protein-protein interaction (PPI) network, prediction methods

摘要:

蛋白质是生命活动的物质基础,直接参与、执行生命的活动过程。大多数蛋白质通过相互作用形成复合物来实现各种生物功能,因此预测蛋白质复合物有助于了解复合物的结构及其功能,也为细胞机制的研究奠定了重要基础。目前,随着高通量实验技术的不断发展,全基因组蛋白质相互作用(PPI)数据日益增多,领域内已经出现了很多基于计算的蛋白质复合物预测方法。虽然现有方法各具特色与优势,但也存在一些不足。首先,针对现有基于计算的蛋白质复合物预测方法进行了分类和比较全面、详细的分析评述;接着,介绍了复合物预测中常用的评价指标和主要数据集,并比较和分析了几种代表性方法的预测性能;最后,对复合物预测方法进行了总结与展望,提出了今后有待解决的若干问题。希望通过对各类方法的分析与比较,为相关人员使用和研究基于计算的蛋白质复合物预测方法提供有价值的参考和方向指引。

关键词: 生物信息学, 蛋白质复合物, 蛋白质相互作用(PPI)网络, 预测方法

CLC Number: