Deep Learning in Protein Structure Prediction and Design

Posted by Lisa George on June 2nd, 2021

Artificial intelligence, and deep learning in particular, is now deeply integrated into the biological sciences and has accelerated progress across the field. In late 2020, AlphaFold2, the artificial intelligence system developed by DeepMind (a subsidiary of Alphabet, Google's parent company), achieved remarkable accuracy in the 14th Critical Assessment of protein Structure Prediction (CASP14). Most of the models it predicted were highly consistent with the experimentally determined protein structures, and its performance attracted worldwide attention. Protein structures, however, can be not only predicted but also designed.

What is protein structure design?

Amino acids form peptide chains, which fold into proteins (i.e. biological macromolecules). Peptide chains composed of different amino acids fold into different shapes in space and so perform different functions. Protein structure design means choosing an amino acid sequence so that the chain folds spontaneously into a required three-dimensional structure and carries out a specific function. Protein design can be divided into the artificial modification of proteins and de novo protein design. Artificial modification starts from the structure of an existing protein and introduces mutations or applies directed evolution. De novo design relies entirely on the principles of biophysics and biochemistry: it does not depend on existing natural protein structures, but builds proteins with new structures and functions from scratch. Compared with working only from proteins that evolved in nature, designing proteins from scratch lets researchers explore far more of the space of possible sequences and folds, and tailor a protein more closely to a specific performance requirement. The Institute for Protein Design at the University of Washington, led by David Baker, a leading figure in the field, has produced a series of groundbreaking results and continues to make important progress.

How does deep learning affect the field of protein design?

Most directly, deep learning algorithms can be used to improve the accuracy and success rate of protein design. At present, the success rate of protein design is not high. Most new amino acid sequences designed by computer either cannot fold into the intended structure or fold into it only approximately. For a designed protein to have the desired function, its three-dimensional structure must be highly precise. Traditionally, designers have had to spend a great deal of time and effort in the laboratory, using high-throughput screening and directed evolution to pick out, from a large number of sequences, the proteins with the intended structure and high activity. AlphaFold2 provides a very good structure verification tool: high-accuracy structure prediction can be used to screen for the sequences that fold into the target structure, and the amino acid sequence can then be optimized so that the final three-dimensional structure moves closer to the design target. This removes many tedious laboratory screening and optimization steps, shortens the design cycle, reduces labor costs, and increases the success rate of design. Most importantly, it makes it possible to design proteins with more complex structures and functions.
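The screening step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method of any particular tool: it assumes the predicted C-alpha coordinates for each candidate sequence have already been produced by a structure predictor and loaded as NumPy arrays with the same residue ordering as the target. The function names, the dictionary layout, and the 2.0 Å cutoff are hypothetical choices made for illustration. Candidates are compared with the target by RMSD after optimal superposition (the Kabsch algorithm).

```python
import numpy as np


def kabsch_rmsd(pred, target):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.shape != target.shape or pred.ndim != 2 or pred.shape[1] != 3:
        raise ValueError("expected two coordinate arrays of identical shape (N, 3)")

    # Remove translation by centering both structures on their centroids.
    p = pred - pred.mean(axis=0)
    q = target - target.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    p_rot = p @ rot.T
    return float(np.sqrt(((p_rot - q) ** 2).sum() / len(p)))


def screen_designs(predicted, target, cutoff=2.0):
    """Keep designs whose predicted structure lies within `cutoff` (in Å)
    of the target, sorted from best to worst.

    `predicted` maps a design name to its predicted (N, 3) coordinates.
    The cutoff value is an illustrative assumption, not a standard.
    """
    scored = [(name, kabsch_rmsd(coords, target)) for name, coords in predicted.items()]
    return sorted([(n, r) for n, r in scored if r <= cutoff], key=lambda item: item[1])
```

In practice the surviving sequences would go on to experimental testing, and a real pipeline would likely also consider the predictor's own confidence scores rather than RMSD alone.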

Achievements and challenges of AI in protein design

Scientists have already made many attempts to apply deep learning to protein design, with exciting results. For example, by learning the relationship between structure and sequence in natural proteins, deep neural networks can take a three-dimensional protein structure as input and directly predict an amino acid sequence likely to fold into it. This could greatly accelerate the entire protein design process, and might even replace the traditional process of designing amino acid sequences.
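To make the idea of predicting a sequence from a structure concrete, here is a minimal Python sketch of the last step only. It assumes a hypothetical network has already produced, for each residue position, a probability for each of the 20 standard amino acids. The array layout, the function names, and the alphabet ordering are assumptions for illustration and do not describe any specific model. The sketch picks the most probable amino acid at each position and, where a natural sequence for the structure is known, reports the fraction of positions that match it.

```python
import numpy as np

# One-letter codes of the 20 standard amino acids (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def decode_sequence(probs):
    """Turn an (L, 20) array of per-position probabilities into a sequence
    by taking the most probable amino acid at each position."""
    probs = np.asarray(probs, dtype=float)
    if probs.ndim != 2 or probs.shape[1] != len(AMINO_ACIDS):
        raise ValueError("expected an array of shape (L, 20)")
    return "".join(AMINO_ACIDS[i] for i in probs.argmax(axis=1))


def sequence_recovery(designed, native):
    """Fraction of positions where the designed sequence matches the native one."""
    if not native or len(designed) != len(native):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(designed, native))
    return matches / len(native)
```

A sequence decoded this way is still only a candidate: whether it actually folds into the target structure has to be checked by structure prediction or experiment.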

Protheragen MedAI is an AI-driven drug R&D company that has launched a number of drug discovery prototypes. They span the early development stage (AI-driven drug synthesis, drug design, and drug activity prediction) and the clinical research stage (an AI-driven pharmacovigilance system, registration transaction system, and clinical data programming system), covering key steps across the whole process of new drug research and development.
