New Frontiers of Artificial Intelligence in Protein Research
The Data Engineering Laboratory (LADE) at Area Science Park has recently published an innovative study in Bioinformatics that opens up new perspectives on proteins, the fundamental building blocks of life. The authors, Francesca Cuturello, Marco Celoria, Alessio Ansuini and Alberto Cazzaniga, have demonstrated how artificial intelligence can predict the impact of genetic mutations on protein stability. Such predictions help clarify the mechanisms underlying many diseases and could contribute to the development of new treatments. The genome of living beings mutates constantly, through external agents or random events, and these mutations appear as changes in the sequences of the proteins that organisms synthesise.
Conducted as part of the Pathogen Readiness Platform for CERIC-ERIC (PRP@CERIC) project, the study applies AI models similar to GPT to proteomics. These models rest on an analogy between a protein sequence and a sentence, with amino acids acting as “words”, which makes it possible to use algorithms trained on hundreds of millions of protein sequences. Using this technique, the LADE researchers were able to predict how small variations in the amino acid sequence, such as those induced by mutations, affect protein stability.
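To make the sentence analogy concrete, the short Python sketch below splits a protein sequence into amino acid "words" and scores a single substitution as the log-odds of the mutant residue against the original one. This is a common heuristic for language-model scoring in general and is not necessarily the method used in the study. The names `tokenize`, `parse_mutation`, `mutation_score` and `toy_predict_probs` are hypothetical, and the toy predictor is a simple frequency count standing in for a trained model.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one letter each


def tokenize(sequence):
    """Split a protein sequence into residue tokens, the 'words' of the analogy."""
    tokens = list(sequence.upper())
    unknown = [t for t in tokens if t not in AMINO_ACIDS]
    if unknown:
        raise ValueError(f"non-standard residues: {unknown}")
    return tokens


def parse_mutation(mutation):
    """Parse a substitution such as 'K3E' into (wild type, 0-based position, mutant)."""
    return mutation[0], int(mutation[1:-1]) - 1, mutation[-1]


def toy_predict_probs(tokens, position):
    """Placeholder for a trained model (assumption, NOT a real language model).

    Returns residue probabilities for the given position, estimated from the
    residue counts in the rest of the sequence with add-one smoothing.
    """
    context = tokens[:position] + tokens[position + 1:]
    total = len(context) + len(AMINO_ACIDS)
    return {aa: (context.count(aa) + 1) / total for aa in AMINO_ACIDS}


def mutation_score(sequence, mutation, predict_probs=toy_predict_probs):
    """Log-odds of the mutant residue versus the wild-type residue.

    `predict_probs(tokens, position)` is a hypothetical callable returning a
    dict that maps each residue to its probability at the mutated position.
    """
    tokens = tokenize(sequence)
    wild_type, position, mutant = parse_mutation(mutation)
    if tokens[position] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position + 1}")
    probs = predict_probs(tokens, position)
    return math.log(probs[mutant]) - math.log(probs[wild_type])


if __name__ == "__main__":
    # A negative score means the predictor finds the mutant residue less likely.
    print(mutation_score("MKKKE", "K3E"))
```

In practice the placeholder predictor would be replaced by a model trained on large protein databases, and the resulting scores would be compared against measured stability changes.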
A particularly innovative aspect is the use of the MSA Transformer model, which takes multiple sequence alignments as input and so exploits information on the evolutionary relationships between protein sequences to improve the accuracy of predictions. The algorithm developed by LADE offers state-of-the-art performance and will be made available to the scientific community to encourage further advances in this field.
“Predicting the effect of protein mutations through artificial intelligence allows us to explore, with great precision, complex biological phenomena that, until recently, were difficult to observe directly”, explains Francesca Cuturello, the study’s lead author. “This technology is a step forward towards innovative therapeutic solutions for a wide range of diseases.”
The team’s work has already received widespread recognition, including Francesca Cuturello’s invitation to the prestigious Research Retreat “Physics of Biological Data Analysis” at the Aspen Center for Physics. It will also be presented at other international research centres, such as the ICTP and the Leibniz Center for Informatics.