“The dream of predicting a protein shape just from its gene sequence is now a reality,” said Paul Adams, Associate Laboratory Director for Biosciences at Berkeley Lab. For Adams and other structural biologists who study proteins, predicting their shape offers a key to understanding their function and accelerating treatments for diseases like cancer and COVID-19.
Current approaches to accurately mapping that shape, however, usually rely on complex experiments at synchrotrons. Even these sophisticated processes have their limitations: the data and quality aren’t always sufficient to understand a protein at an atomic level. By applying powerful machine learning methods to the large library of protein structures, it is now possible to predict a protein’s shape from its gene sequence.
Researchers in Berkeley Lab’s Molecular Biophysics & Integrated Bioimaging Division joined an international effort led by the University of Washington to produce a software tool called RoseTTAFold. The algorithm simultaneously takes into account patterns, distances, and coordinates of amino acids. As these data inputs flow in, the tool assesses relationships within and between structures, eventually helping to build a detailed picture of a protein’s shape.
