QM/MM with GROMACS & CP2K

Most biochemical systems, such as enzymes, are too large to be described at any level of ab initio or density functional theory. At the same time, the available molecular mechanics force fields are not sufficiently flexible to model processes in which chemical bonds are broken or formed. To overcome the limitations of a full quantum mechanical description on the one hand, and a full molecular mechanics treatment on the other, methods have been developed that treat a small part of the system at the level of quantum chemistry (QM), while retaining the computationally cheaper force field (MM) for the larger part. This hybrid QM/MM strategy was originally introduced by Warshel and Levitt in 1976 and is illustrated in the figure below.
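One common way to write this partitioning down is the additive scheme. It is shown here as an assumption, since the text does not state which coupling scheme is used, and the symbols are introduced only for illustration:

```latex
E_{\mathrm{total}} = E_{\mathrm{QM}}(\text{QM region}) + E_{\mathrm{MM}}(\text{MM region}) + E_{\mathrm{QM/MM}}(\text{QM region}, \text{MM region})
```

The first term is the quantum chemical energy of the small region, the second is the force field energy of the remainder, and the third couples the two, for example through electrostatic and van der Waals interactions between the regions.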

The justification for dividing a system into regions that are described at different levels of theory is the local character of chemical reactions in condensed phases. A distinction can therefore be made between a ‘reaction center’ with atoms that are directly involved in the reaction and a ‘spectator’ region, in which the atoms do not directly participate in the reaction. For example, most reactions in solution involve the reactants and the first few solvation shells. The bulk solvent is hardly affected by the reaction, but can influence the reaction via long-range interactions. The same is true for most enzymes, in which the catalytic process is restricted to an active site located somewhere inside the protein. The rest of the protein provides an electrostatic background that may or may not facilitate the reaction.
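The division into a reaction center and a spectator region can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by the interface: the function name partition_by_distance, the distance cutoff, and the toy coordinates are all hypothetical, and in practice the QM region is usually chosen by chemical insight rather than by distance alone.

```python
import math


def partition_by_distance(coords, center_indices, cutoff):
    """Split atom indices into a QM 'reaction center' and an MM 'spectator' region.

    Hypothetical helper: an atom is assigned to the QM region if it lies
    within `cutoff` of any atom listed in `center_indices`.
    """
    centers = [coords[i] for i in center_indices]
    qm, mm = [], []
    for idx, pos in enumerate(coords):
        # Distance from this atom to the closest reaction-center atom
        nearest = min(math.dist(pos, c) for c in centers)
        (qm if nearest <= cutoff else mm).append(idx)
    return qm, mm


# Toy coordinates in nm (made up for illustration only)
coords = [
    (0.00, 0.00, 0.00),  # reacting atom
    (0.15, 0.00, 0.00),  # bonded neighbor
    (0.30, 0.10, 0.00),  # first solvation shell
    (1.50, 0.00, 0.00),  # bulk solvent
    (0.00, 2.00, 0.00),  # bulk solvent
]

qm_atoms, mm_atoms = partition_by_distance(coords, center_indices=[0], cutoff=0.5)
print(qm_atoms, mm_atoms)  # prints: [0, 1, 2] [3, 4]
```

Here the reacting atom and its close neighbors end up in the QM region, while the distant solvent atoms stay in the MM region, where they can still influence the reaction through long-range interactions.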

To provide access to QM/MM simulations, we are developing an interface between the molecular dynamics program GROMACS and the density functional theory program CP2K. With the new interface, users can automatically locate minima and transition states on QM/MM potential energy surfaces, as well as the minimum energy pathways connecting these stationary points. While the enthalpy profile obtained from such calculations is usually sufficient to understand the catalytic mechanism of an enzyme, the missing entropic contributions can be added afterwards by sampling the orthogonal degrees of freedom on tomorrow’s exascale computer hardware.