Improving the conformational sampling for protein structure prediction
Wenzhi Mao, Tong Wang, Wenze Ding and Haipeng Gong
School of Life Sciences, Tsinghua University, Beijing, China
Abstract: Conventional ab initio protein structure prediction generally requires intensive sampling of the conformational space to find near-native structures in computer simulations. However, as Levinthal’s paradox illustrates, near-native conformations are almost impossible to reach by random sampling within a finite simulation time. Special strategies are therefore needed to optimize the conformational search scheme and/or to restrict the sampling space. In this work, I will introduce our efforts in two directions to improve sampling efficiency. First, we constructed machine-learning models to optimize the extraction of near-native templates for fragments of 7-15 residues in the target protein. Fragment templates collected using our method resemble the native fragments significantly more closely than those from other state-of-the-art methods, and could thus enhance the efficiency of structure prediction algorithms that use the fragment assembly protocol. Second, we developed several machine-learning models either to predict native residue contacts or to further improve the accuracy of a predicted residue contact map. Our methods perform better than, or at least comparably to, other state-of-the-art methods. The predicted native residue contacts can be used in simulations to restrict the conformational space and thus improve sampling efficiency in practical protein structure prediction.
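As a rough illustration of what a "native residue contact" means and how a predicted contact set might be scored against it, the sketch below builds a contact set from per-residue coordinates and computes the precision of a prediction. It is not the authors' method. The 8 Å distance cutoff, the minimum sequence separation of 6, the one-point-per-residue representation, the function names `contact_map` and `precision`, and the toy hairpin coordinates are all assumptions chosen for illustration.

```python
import math

def contact_map(coords, cutoff=8.0, min_separation=6):
    """Return the set of residue pairs (i, j), i < j, whose representative
    atoms lie closer than `cutoff` (in angstroms) and that are at least
    `min_separation` positions apart in sequence.
    Both thresholds are assumed conventions, not values from the abstract."""
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts

def precision(predicted, native):
    """Fraction of predicted contacts that are also native contacts."""
    if not predicted:
        return 0.0
    return len(predicted & native) / len(predicted)

# Toy 10-residue "hairpin": residues 0-4 run along x, residues 5-9 run back
# on a parallel strand 5 angstroms away (hypothetical coordinates).
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
coords += [(3.8 * (9 - i), 5.0, 0.0) for i in range(5, 10)]

native = contact_map(coords)
predicted = {(0, 9), (1, 8), (2, 8), (3, 9)}  # hypothetical model output
score = precision(predicted, native)
```

In a restrained simulation, pairs in a predicted contact set like this one would typically be turned into distance restraints, so that conformations violating many of them are sampled less often.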