A chatbot for gene regulation

October 28, 2023

Hello Kevin, Vicente, and Julien. First of all, we would like to congratulate you on receiving an ERC Synergy Grant. Can you tell us more about the project that convinced the ERC jury?

Kevin: Sure! As we all know, the genome encodes the instructions to regulate gene activity. This gene regulation involves two major steps. First, there is the transcription of genes to mRNA, which is largely regulated by the promoter sequence. Second, post-transcriptional mechanisms take cues from codes in the 5’ and 3’ untranslated regions of the mRNA to regulate mRNA stability and the rate at which it is translated into proteins. Despite its importance for biology, medicine, and biotechnology, the regulatory code underpinning this second step remains undeciphered. Because we lack this understanding of post-transcriptional regulation, we still do not have a complete picture of how genes are regulated. In our EPIC project, we aim to unravel the eukaryotic post-transcriptional regulatory code.

Julien: Deciphering the language in which genomes are written, that is, knowing the words, grammar, and instructions made of As, Cs, Gs, and Ts that encode how cells should react to environmental changes, is a long-standing goal of biology. So far, this regulatory code has not been fully cracked for any organism. But we are approaching a turning point, a disruptive one, one may say, through the convergence of three major technologies: omics, AI, and synthetic biology. Omics encompasses an arsenal of high-throughput molecular assays that allow us to quantify every step of gene expression, from DNA to protein abundance, across the entire life cycle of RNAs. Recent AI techniques have the scalability and flexibility to learn complex rules from massive data. And synthetic biology opens the path to synthesizing artificial genes by the thousands: we will design nearly a million of them to systematically test and refine our model of the regulatory code and, eventually, to design genes and cells for biotechnological applications.

I believe we convinced the jury by bringing together three groups at the cutting edge of these three technologies – Vicente for the omics, myself for the AI, and Kevin for the synthetic biology – and proposing an integrated project that systematically attacks this fundamental problem.