A flexible, scalable, and reproducible automated pipeline that covers the entire process, from a protein’s sequence to a list of prioritized leads, can help shorten the notoriously long time it takes for new medicines to reach the market. Building such a pipeline means integrating complex, state-of-the-art computational tools for modeling, molecular dynamics, and docking, while ensuring that it still works when no crystal structure of the protein is available.
Why is this project particularly interesting for BioExcel?
This project is interesting because it merges several high-impact computer-aided drug discovery techniques into a single, automated workflow. It offers an end-to-end solution for drug design, taking researchers from an initial protein sequence to prioritized ligand binding poses. The pipeline is built on the BioExcel Building Blocks (BioBBs), a collection of modular and interoperable wrappers that make complex biomolecular simulations accessible and reproducible. Its successful validation in collaborative projects with industrial partners demonstrates its real-world applicability and robustness.
What are we doing in BioExcel?
We are developing and showcasing an automated drug design pipeline using the BioBB ecosystem. The process begins with either a protein sequence or an existing structure. If no structure is available, we predict one using homology modeling or AlphaFold. We then prepare the structure for molecular dynamics (MD) simulations, which characterize its dynamic behavior. Using trajectory analysis, we identify the most biologically relevant conformations, which serve as the basis for virtual screening and docking. The screening follows a tiered approach that filters a large library of molecules down to the most promising candidates, using criteria such as docking scores, PAINS filters, and physicochemical properties. The final output is a set of prioritized ligand poses that can be used for further analysis or experimental validation.
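The tiered filtering described above can be illustrated with a minimal Python sketch. It is not the BioBB implementation and uses no BioBB calls. The `Candidate` record, the `prioritize` function, the score cutoff of -7.0, and the example ligands are all hypothetical. The sketch also assumes that more negative docking scores are better, and it uses rule-of-five-style thresholds as a stand-in for the unspecified physicochemical criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one docked ligand (not a BioBB type)."""
    name: str
    docking_score: float  # assumed convention: more negative is better
    pains_alert: bool     # True if a PAINS substructure was flagged upstream
    mol_weight: float     # Da
    logp: float
    h_donors: int
    h_acceptors: int


def passes_rule_of_five(c: Candidate) -> bool:
    # Assumption: rule-of-five-style limits stand in for the
    # pipeline's actual physicochemical criteria.
    return (c.mol_weight <= 500 and c.logp <= 5
            and c.h_donors <= 5 and c.h_acceptors <= 10)


def prioritize(candidates, score_cutoff=-7.0, top_n=10):
    """Apply the three tiers in order and return the best-scoring survivors."""
    # Tier 1: keep only ligands at or below the (hypothetical) score cutoff.
    tier1 = [c for c in candidates if c.docking_score <= score_cutoff]
    # Tier 2: drop ligands flagged as PAINS.
    tier2 = [c for c in tier1 if not c.pains_alert]
    # Tier 3: keep ligands with acceptable physicochemical properties.
    tier3 = [c for c in tier2 if passes_rule_of_five(c)]
    # Rank the survivors, best (most negative) score first.
    return sorted(tier3, key=lambda c: c.docking_score)[:top_n]


# Made-up example library, for illustration only.
library = [
    Candidate("lig_a", -9.1, False, 420.0, 3.2, 2, 6),
    Candidate("lig_b", -8.4, True, 380.0, 2.9, 1, 5),   # fails tier 2 (PAINS)
    Candidate("lig_c", -5.0, False, 300.0, 1.5, 1, 3),  # fails tier 1 (score)
    Candidate("lig_d", -7.8, False, 612.0, 5.9, 3, 8),  # fails tier 3 (properties)
    Candidate("lig_e", -7.5, False, 350.0, 2.1, 1, 4),
]

if __name__ == "__main__":
    for lead in prioritize(library):
        print(lead.name, lead.docking_score)
```

Applying the cheapest and most discriminating filter first keeps the later tiers small. In a real workflow the order and thresholds would depend on the library and the target.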