This project comprises two sub-projects, one on Proton Dynamics and one on Fluorescent Proteins, which are linked by their shared use of hybrid QM/MM modelling and simulation approaches and by the software they rely on. The software packages involved are GROMACS, CP2K, CPMD, MiMiC, and pmx.
Mass spectrometry has revolutionized proteomics, i.e. the investigation of the myriad protein-protein interactions in the cell. Using only a very small amount of protein (usually at micromolar concentration), a powerful implementation of the technique, electrospray ionization/ion mobility mass spectrometry (ESI/IM-MS), can measure the stoichiometry, shape and subunit architecture of proteins and protein complexes in the gas phase.
Fluorescent proteins have become a reliable tool for monitoring cellular functions, gene expression, protein-protein interactions and intra-cellular interactions in living systems, as well as for understanding diseases and finding novel strategies to tackle them. After the fluorescent protein has been fused to another protein using gene technology, the exact location of that protein can be determined by monitoring the fluorescence emitted by the photo-excited fluorescent protein with a conventional microscope. Consequently, fluorescent proteins are also used routinely to follow phenomena such as protein dynamics, intercellular transport, and biogenesis inside living cells. Recent developments have even pushed the spatial resolution of this technique beyond the diffraction limit (Nobel Prize in Chemistry 2014). One strategy for achieving nanometer resolution in cells is to use reversibly photo-switchable fluorescent proteins (RSFPs), whose fluorescence can be turned on and off repeatedly with light of different wavelengths.
Why is this project particularly interesting for BioExcel?
The key assumption in the ESI/IM-MS technique is that the rearrangement of biomolecular units on passing from solution to the gas phase is minimal. However, proton transfer phenomena may significantly affect structural properties. Understanding the mechanisms and quantifying the extent of these configurational changes will therefore allow significant improvement in the predictions of this already powerful experimental technique.
Current approaches to achieving the full potential of fluorescent proteins rely on random mutagenesis in combination with screening. However, this approach is not only time-consuming but also requires substantial resources, and is thus affordable for only a few bio-imaging laboratories. Furthermore, because it does not provide physical insight into the process that is the target of the optimization, it is also very difficult to optimize more than one parameter simultaneously (e.g. absorption spectrum, emission spectrum, switching efficiency, fluorescence yield, protein stability). Our bottom-up approach for predicting the effects of mutations in fluorescent proteins is based on atomistic computer simulations that will provide insight into the effect of the protein environment on thermo-stability (folding stability and dimerization stability). These simulations use force-field MD, pmx and free-energy calculations, together with the new QM/MM interface between GROMACS and CP2K for predicting photochemical properties (absorption spectrum, emission spectrum, quantum yields of switching/fluorescence) of the fluorescent proteins. This project will also demonstrate how large-scale computational resources can be used to perform biomolecular QM/MM simulations.
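As an illustration of how alchemical free-energy calculations are commonly turned into a stability prediction, the sketch below closes a thermodynamic cycle: the free energy of mutating a residue is computed once in the folded state and once in an unfolded reference state, and the difference gives the change in folding free energy. This is a minimal sketch under stated assumptions, not the project's workflow: the function names, the classification threshold, and the example numbers are hypothetical, and energies are assumed to be in kJ/mol.

```python
import math


def ddg_folding(dg_mut_folded, dg_mut_unfolded):
    """Change in folding free energy caused by a mutation (kJ/mol).

    Thermodynamic cycle:
        ddG_fold = dG_fold(mutant) - dG_fold(wild type)
                 = dG_mut(folded) - dG_mut(unfolded)
    A negative value means the mutation stabilizes the folded state.
    """
    return dg_mut_folded - dg_mut_unfolded


def ddg_error(err_folded, err_unfolded):
    """Propagate independent uncertainties of the two legs in quadrature."""
    return math.hypot(err_folded, err_unfolded)


def classify(ddg, threshold=1.0):
    """Label a mutation; the threshold (kJ/mol) is a hypothetical choice."""
    if ddg < -threshold:
        return "stabilizing"
    if ddg > threshold:
        return "destabilizing"
    return "neutral"


# Hypothetical example values, not results from this project.
ddg = ddg_folding(dg_mut_folded=-10.0, dg_mut_unfolded=-12.5)
err = ddg_error(0.3, 0.4)
label = classify(ddg)
print(f"ddG_fold = {ddg:.1f} +/- {err:.1f} kJ/mol ({label})")
```

The same cycle applies to dimerization stability if the two legs are taken as the mutation in the dimer and in the free monomer.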
What are we doing in BioExcel?
For the Proton Dynamics sub-project, we will use massively parallel hybrid QM/MM simulations to study proton transfer in an important example use case system: the protein beta-lactoglobulin. In particular, we will employ both the already developed MiMiC interface (which couples the ab-initio quantum molecular dynamics code CPMD to GROMACS) and the novel QM/MM interface between GROMACS and CP2K that is being developed by BioExcel. Comparison with available experimental data will allow us to establish the accuracy of the methods used. This project will illustrate the parallel scaling, performance, and efficiency of predictions on an unprecedentedly large scale, using BioExcel’s newly developed high-performance computing approaches to study a highly relevant biological problem.
For the Fluorescent Proteins sub-project, MD, pmx and free-energy calculations will first be validated for Green Fluorescent Protein (GFP) and its known mutants against experimental crystal structures, thermal stabilities, oligomerization (dimerization) affinities, absorption spectra, and emission spectra reported in previous studies. The second validation test will predict the properties of hitherto unknown mutants of RsGreen0.7 (a variant of GFP). At the same time, the group of Peter Dedecker will express these mutants and measure their properties experimentally. While carrying out these critical blind prediction tests, we will meet regularly with this group to discuss results. Bringing together experts with the required experimental know-how and a mechanistic understanding of fluorescent proteins is essential to rapidly identify problems or limitations of our software and simulations. Because testing and code development will go hand in hand at this stage, systematic errors in our implementation or parameters, including model chemistries, can still be corrected if needed.
Forschungszentrum Jülich, University of Jyväskylä, The University of Edinburgh, KTH Royal Institute of Technology