PhD in AI-Driven Molecular Biology: Large-Scale Data Generation and Protein/RNA Engineering
This fully funded PhD position at the Centre for Genomic Regulation (CRG) offers an exciting opportunity to advance molecular biology using large-scale data generation and artificial intelligence. The project aims to generate unprecedented datasets to train next-generation AI models for predicting and engineering the sequence-encoded properties of proteins and RNAs. While machine learning has revolutionized protein structure prediction, many other protein properties, such as stability, binding affinities, and regulatory mechanisms, remain difficult to predict because training data are limited. This research will bridge that gap by integrating large-scale experimental approaches with advanced machine learning techniques.
The project can be tailored to be primarily computational, experimental, or a blend of both. Experimentally, you will develop and apply massively parallel DNA synthesis–selection–sequencing experiments to create high-resolution datasets. Computationally, you will design, test, and apply machine learning models to interpret sequence–property relationships, engineer protein and RNA functions, and design novel biomolecules. Central to the project are explainable AI and lab-in-the-loop frameworks, enabling iterative cycles of design, testing, and refinement. The ultimate goal is to deliver well-calibrated datasets and interpretable AI models for protein and RNA variant analysis, design, and optimization, with broad applications in biomedicine, biotechnology, and synthetic biology.
Doctoral training is enhanced by international secondments: a two-month placement at UPF (Year 1) focusing on structural information and bacterial resistance, a one-month secondment at IRB (Year 2) integrating chemical biology, and a one-month placement at MSAID (Year 3) exploring interpretable and generative AI models. These experiences will broaden your scientific and translational impact.
The ideal candidate will have a strong background in machine learning, statistics, mathematics, genomics, biophysics, or molecular biology, and a keen interest in interdisciplinary research. The position is open to applicants of any nationality, subject to the MSCA mobility rule. Funding is provided for 36 months, including full salary, mobility allowance, and family allowance (where eligible) in line with MSCA-DN regulations.
To apply, visit the ProtAIomics website for eligibility criteria and the application form. The deadline for applications is March 23, 2026. For more information, see the full project description and references to recent high-impact publications from the group.