Predicting protein structures has become easier with a method developed by KAUST researchers that outperforms state-of-the-art techniques. The resulting insight into proteins could help to identify drug targets and develop therapeutics.
Proteins have multiple levels of structure, the most basic of which is a string of building blocks called amino acids that are linked in a protein-specific sequence. The complete amino acid chains fold into 3-D structures that are essential for protein function. A protein's 3-D structure reveals its role in cells, but determining this structure is remarkably difficult.
“Searching for protein 3-D structure from scratch is difficult due to the huge search space,” explained Xin Gao from the KAUST Computational Bioscience Research Center. “The most promising way to predict protein structures is by homology modeling, which is based on the observation that homologous proteins have similar structures.”
Homology modeling looks for similarities between the query protein and thousands of proteins with known 3-D structures. Current systems can look for similarities in amino acid sequence through a process called sequence alignment or assess how well the amino acid sequence maps to known 3-D structures in a process called threading. However, they do not account for the entire protein sequence space (all known sequences) and the entire structure space (all known 3-D structures). Gao and colleagues developed a cross-modal method called CMsearch that integrates such information.
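To make the sequence alignment step concrete, here is a minimal sketch of how a query sequence might be scored against a small library of proteins with known structures. It is not CMsearch and not the method of any particular tool: it uses the textbook Needleman-Wunsch global alignment with an assumed toy scoring scheme (+1 match, -1 mismatch, -2 gap), and the function names and example sequences are hypothetical.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score of sequences a and b.

    The scoring values are illustrative assumptions; real tools use
    substitution matrices and more elaborate gap penalties.
    """
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j residues of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def rank_templates(query, templates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, alignment_score(query, seq)) for name, seq in templates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    known = {"template_a": "MKTAYI", "template_b": "GGGGGG"}
    print(rank_templates("MKTAYI", known))
```

A ranking like this looks at one query-template pair at a time. The limitation described above is that such pairwise scores ignore how all known sequences and structures relate to one another.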
“CMsearch systematically and simultaneously incorporates sequence and structure features and sequence space and structure space information,” Gao said. “Its ability to consider more information in a systematic way means that CMsearch can successfully detect some remote homologs with relatively weak sequence alignment or threading scores that existing methods cannot detect.”