One of the main challenges in predicting the structures of antibodies and nanobodies is accurately modeling their hypervariable loops, which are crucial for binding to antigens. The hardest of these is the H3 loop on the heavy chain: it has low structural and sequence similarity to known motifs, which makes it difficult for current deep learning methods such as AlphaFold to predict. A second challenge is model selection: many models can be generated, but it is not always easy to identify the most accurate ones. In nanobodies, the H3 loop is typically longer than in antibodies, and the framework residues play a more significant role in antigen binding, which further complicates prediction. Even when different computational methods are combined, a substantial fraction of nanobody H3 loop structures remain difficult to model with high accuracy.

Why is this project particularly interesting for BioExcel?

This project is interesting because it addresses the difficult task of accurately predicting the structure of the H3 loop, a key component of antibody and nanobody binding. The research combines and refines existing machine learning tools to overcome their limitations. Combining AlphaFlow, a new method built on top of AlphaFold, with a hierarchical clustering algorithm to find the best models is a novel approach that significantly improves the success rate of antibody-antigen docking. The project also extends this research to nanobodies, which have unique structural features, providing a deeper understanding of their binding mechanisms. The findings lead to a new pipeline that can be used for epitope-specific antibody design, which has significant implications for drug discovery and other biomedical applications.

What are we doing in BioExcel?

We are using and refining machine learning tools to improve the prediction of antibody and nanobody structures and their interactions with antigens. The main efforts include:

  • Using AlphaFlow to model H3 loops: We apply AlphaFlow, a flow-matching generative model built on AlphaFold, to challenging H3 loops, which are typically low-confidence regions in AlphaFold’s predictions.
  • Clustering to select the best models: We use complete-linkage hierarchical clustering to analyze the models generated by AlphaFlow. This gives us a smaller, more reliable ensemble of models that are closer to the experimental structure.
  • Testing performance with HADDOCK: We test our clustered models in an antibody-antigen docking scenario using HADDOCK, a molecular docking tool. The results show a significantly higher success rate than with standard AlphaFold models.
  • Analyzing nanobodies: We extended our analysis to nanobodies, creating a comprehensive benchmark of nanobody-antigen complexes. We found that combining models from different sources, such as AlphaFold-Multimer and ImmuneBuilder, is key to improving predictions.
  • Developing a new pipeline: We developed a new HADDOCK pipeline to incorporate information about framework residues that are likely to interact with the antigen, which is particularly important for nanobodies. This new pipeline improves the docking success rate in certain scenarios.
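
The clustering step in the list above can be illustrated with a minimal Python sketch. It is not the project's actual code. It assumes that models are compared through a precomputed pairwise H3 RMSD matrix, that clusters are cut at a fixed distance cutoff, and that each cluster is represented by its medoid. The function names, the 1.0 Å cutoff, and the toy matrix are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def complete_linkage_clusters(dist, cutoff):
    """Agglomerative clustering with complete linkage on a square distance matrix.

    Clusters keep merging while the largest pairwise distance between the
    members of the two closest clusters stays at or below `cutoff`.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: cluster distance is the farthest pair of members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        (x, y), d = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        # x < y always holds, so deleting index y does not shift index x.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def representative(cluster, dist):
    """Assumed selection rule: the medoid, i.e. the member with the lowest
    summed distance to the other members of its cluster."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Toy pairwise H3 RMSD matrix (in Å) for five hypothetical models; not real data.
rmsd = [
    [0.0, 0.5, 0.7, 3.0, 3.2],
    [0.5, 0.0, 0.6, 3.1, 3.3],
    [0.7, 0.6, 0.0, 2.9, 3.0],
    [3.0, 3.1, 2.9, 0.0, 0.4],
    [3.2, 3.3, 3.0, 0.4, 0.0],
]

clusters = complete_linkage_clusters(rmsd, cutoff=1.0)  # hypothetical cutoff
ensemble = [representative(c, rmsd) for c in clusters]
print(clusters, ensemble)
```

With this toy matrix, the five models collapse into two clusters, and one representative per cluster forms the reduced ensemble that would be passed on to docking. Complete linkage is a conservative choice here because every member of a cluster is guaranteed to lie within the cutoff of every other member.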