Title: Finding Remote Homologous Proteins: Alignment-Based, Alignment-Free and Cross-Modal Methods
Abstract: Proteins function in living organisms as enzymes, antibodies, sensors, and transporters, among myriad other roles. Understanding protein function has major implications for the biological and medical sciences. Finding remote homologous proteins, which share conserved structural similarity but only limited sequence similarity, is an indispensable step towards understanding protein function.
Here, three novel methods are presented for finding remote homologous proteins, each with a different goal: (a) the PROtein STructure Alignment (PROSTA) methods, which automatically determine and align homologous structures of protein pockets and interaction interfaces; (b) the ContactLib method, which scans tens of thousands of protein structures for homologous structures in seconds; (c) the CMsearch method, which simultaneously explores the sequence space and the structure space to perform cross-modal search for homologous proteins. Our methods improve not only the accuracy of finding homologous proteins but also the accuracy of predicting protein structures. Moreover, case studies are presented in which our methods discover, for the first time, structural similarities between pairs of functionally related protein-DNA complexes.
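Speaker: Xuefeng Cui (崔学峰) is a specially appointed research fellow at the Institute for Interdisciplinary Information Sciences, Tsinghua University. He received his bachelor's, master's, and doctoral degrees from the University of Waterloo, Canada. His master's and doctoral advisor was University Professor Ming Li (李明), a recipient of the Killam Prize (Canada's highest research award), Fellow of the Royal Society of Canada, ACM Fellow, and IEEE Fellow. After completing his Ph.D., he spent more than two years as a postdoctoral researcher at King Abdullah University of Science and Technology in Saudi Arabia. His main research area is bioinformatics, where he designs machine learning and parallel algorithms to solve biological problems closely tied to human life. He has published three first-author papers at Intelligent Systems for Molecular Biology (ISMB, a top bioinformatics conference that accepts only about 40 papers per year). In addition, his research has been covered once by the international media outlet Bio-Techniques and twice by Science X.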