This project addresses a major challenge in molecular dynamics (MD): capturing and quantifying dynamic differences between protein conformational states, such as ligand-free and ligand-bound forms. These states differ not only in structure but also in their internal dynamic fluctuations, and measuring those differences is a complex task.
Why is this project particularly interesting for BioExcel?
This project is interesting because it proposes an unsupervised machine learning approach, based on an autoencoder, for capturing subtle dynamic differences in proteins. The method offers a new way to understand how a protein's flexibility and motion change in response to events such as ligand binding or mutations. It is a novel application of machine learning that can pinpoint specific, functionally significant regions undergoing conformational changes.
What are we doing in BioExcel?
The project uses an autoencoder to detect and localize dynamic differences in proteins. We train the autoencoder on an MD trajectory of a reference protein state (e.g., ligand-free) and then evaluate the trained model on a different state (e.g., ligand-bound). Because the model has only learned the motions of the reference state, its reconstruction errors reveal where the dynamics of the second state deviate from the first. We analyze these errors by comparing the Root Mean Square Fluctuation (RMSF) profiles of the original and reconstructed trajectories to identify regions with altered flexibility. The entire process is implemented with the biobb_pytorch module, which supports a reproducible workflow from data preparation to final analysis.
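The train-on-one-state, evaluate-on-another idea can be illustrated with a minimal NumPy sketch. It is not the biobb_pytorch API and not the project's actual model: a linear autoencoder (equivalent to PCA) stands in for the neural network, the trajectories are synthetic toy data in arbitrary units, and all function names (`rmsf`, `fit_linear_autoencoder`, `reconstruct`, `toy_trajectory`) are hypothetical. The second state is given extra motion on atoms 12 to 15, which the model trained on the reference state should fail to reproduce.

```python
import numpy as np


def rmsf(traj):
    """Per-atom RMSF of a trajectory with shape (n_frames, n_atoms, 3)."""
    mean_pos = traj.mean(axis=0)                    # average structure
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)   # squared displacement per frame and atom
    return np.sqrt(sq_dev.mean(axis=0))


def fit_linear_autoencoder(traj, n_latent):
    """Assumption: a linear encoder/decoder (PCA via SVD) replaces the neural autoencoder."""
    flat = traj.reshape(len(traj), -1)
    mean = flat.mean(axis=0)
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_latent]                      # rows span the latent space


def reconstruct(traj, mean, components):
    """Encode each frame into the latent space, then decode it back to coordinates."""
    flat = traj.reshape(len(traj), -1)
    latent = (flat - mean) @ components.T
    return (latent @ components + mean).reshape(traj.shape)


def toy_trajectory(rng, base, modes, n_frames, noise=0.05):
    """Synthetic frames: base structure + random mix of collective modes + small noise."""
    amplitudes = rng.normal(size=(n_frames, len(modes)))
    collective = np.einsum("fm,mac->fac", amplitudes, modes)
    return base + collective + rng.normal(scale=noise, size=(n_frames,) + base.shape)


rng = np.random.default_rng(0)
n_atoms, n_frames = 20, 500
base = np.zeros((n_atoms, 3))
base[:, 0] = 3.8 * np.arange(n_atoms)               # atoms placed along a line
modes = 0.3 * rng.normal(size=(2, n_atoms, 3))      # two shared collective motions

# Reference state (e.g., ligand-free) and a second state (e.g., ligand-bound).
reference = toy_trajectory(rng, base, modes, n_frames)
target = toy_trajectory(rng, base, modes, n_frames)
perturbed = slice(12, 16)                           # atoms given extra, unseen motion
target[:, perturbed] += rng.normal(scale=1.0, size=(n_frames, 4, 3))

# Train on the reference state only, then reconstruct the second state.
mean, components = fit_linear_autoencoder(reference, n_latent=2)
reconstructed = reconstruct(target, mean, components)

# Flexibility that the model cannot reproduce shows up as lost RMSF.
delta_rmsf = rmsf(target) - rmsf(reconstructed)
flagged = np.sort(np.argsort(delta_rmsf)[-4:])
print("atoms with the largest RMSF loss on reconstruction:", flagged)
```

In this toy setup, the atoms with the largest gap between original and reconstructed RMSF are the ones whose dynamics were altered. A real workflow would use aligned MD coordinates and a nonlinear autoencoder in place of the PCA stand-in.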