Alchemical Free Energy Calculations in Biomolecules

Free energies and molecular dynamics

Free energy gradients underlie all biomolecular motions and thus govern the molecular-level processes that are essential to maintaining life, such as protein folding, DNA recognition by transcription factors, and substrate binding to enzymes. Identifying these free energy differences makes it possible to understand and, subsequently, to control properties such as the affinity of a drug for its target, a protein’s resistance to extreme temperatures, or the precise recognition of an intruding antigen by an antibody. Molecular dynamics (MD) based simulations are particularly suited to investigating free energy gradients, as they incorporate, with physical rigour, both the entropic and the enthalpic contributions to the free energy for a well-defined statistical ensemble. While there are many approaches to extracting free energies from simulations, in the current BioExcel Use Case we concentrate on alchemical methods.


At its core, the method of computational alchemy rests on the notion that differences in thermodynamic properties such as the free energy are path independent, so even pathways that are inaccessible in practice, such as alchemical transitions, yield meaningful quantities from simulation. To establish a link to experiments, thermodynamic cycles are frequently employed.

For example, as depicted in the figures, cycles can be used to efficiently compute the change in protein stability upon an amino acid mutation or the change in protein-DNA affinity upon a base mutation in the DNA. Instead of computing the cumbersome folding/unfolding or binding/unbinding free energies (vertical arrows), the alchemical mutation free energies (horizontal arrows) can be computed efficiently. As the difference between the two vertical transitions must be identical to the difference between the two horizontal ones, this setup gives direct access to the desired change in stability or binding free energy. Naturally, the setup of such alchemical simulations differs from a standard MD setup and can become technically quite involved. To facilitate the setup of free energy calculations and the subsequent simulations, we are developing the software package pmx.
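The cycle closure argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the four free energy values are hypothetical numbers chosen for the example, and the helper names `folding_dg` and `alchemical_dg` are not part of pmx. It shows that the difference between the two vertical (folding) legs equals the difference between the two horizontal (alchemical) legs.

```python
# Hypothetical free energies (kJ/mol) of the four corners of the cycle.
# These numbers are made up for illustration, not simulation results.
G = {
    ("wt", "unfolded"): 0.0,
    ("wt", "folded"): -30.0,
    ("mut", "unfolded"): 9.0,
    ("mut", "folded"): -17.5,
}


def folding_dg(variant):
    """Vertical leg: unfolded -> folded for one variant."""
    return G[(variant, "folded")] - G[(variant, "unfolded")]


def alchemical_dg(state):
    """Horizontal leg: wild type -> mutant in one conformational state."""
    return G[("mut", state)] - G[("wt", state)]


# Change in folding free energy obtained along the vertical legs ...
ddg_vertical = folding_dg("mut") - folding_dg("wt")
# ... and along the horizontal (alchemical) legs.
ddg_horizontal = alchemical_dg("folded") - alchemical_dg("unfolded")

print(f"vertical:   {ddg_vertical:+.1f} kJ/mol")
print(f"horizontal: {ddg_horizontal:+.1f} kJ/mol")
```

With the sign convention used here, a positive value means that the mutation makes folding less favourable than in the wild type. In a real calculation only the two horizontal legs would be simulated; the corner values themselves are never needed.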

pmx for proteins

pmx readily allows setting up free energy calculations for amino acid mutations. All combinations of the canonical amino acids have been collected in a dedicated set of libraries that make it easy to automate the setup procedure. As MD simulations rely on empirical force fields, the pmx mutation libraries are provided for a number of contemporary molecular mechanics force fields. The approach makes it possible to perform large scale mutation scans that assess changes in protein thermodynamic stability and in protein-protein or protein-ligand interactions.

pmx for DNA

Similarly to the amino acid mutation setup, pmx also supports nucleotide mutations in DNA. A large scale nucleotide mutation scan over a number of protein-DNA complexes has demonstrated that alchemical pmx based free energy calculations are capable of capturing the correct trends in DNA interactions with various transcription factors and nucleases. Furthermore, we are working on providing support for nucleotide mutations in RNA as well.

pmx for ligands

Automating alchemical ligand modifications entails an additional challenge: in contrast to amino and nucleic acids, a library of mutations cannot be pre-generated for an arbitrary set of molecules. For that purpose we are developing an algorithm that identifies an optimal atom mapping for any pair of organic molecules. Furthermore, the prototype of this approach is already capable of suggesting the best suited pairs of ligands to be modified, thus providing an efficient way to navigate a chemical library guided by alchemical free energy calculations.
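One simple way to picture how ligand pairs might be selected is sketched below. This is not the pmx algorithm, which is not described here. It is a generic illustration that assumes a pairwise similarity score is already available for every ligand pair (the scores and ligand names are hypothetical) and picks a maximum spanning tree with Kruskal's method, so that every ligand is connected through the most similar pairs.

```python
def select_ligand_pairs(similarity):
    """Return pairs forming a maximum spanning tree over the ligands.

    `similarity` maps (ligand_a, ligand_b) tuples to a score where a
    higher value means an easier alchemical transformation (assumption).
    """
    ligands = sorted({name for pair in similarity for name in pair})
    parent = {name: name for name in ligands}

    def find(x):
        # Union-find root lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    # Visit pairs from most to least similar (Kruskal's algorithm).
    for (a, b), _score in sorted(similarity.items(), key=lambda kv: -kv[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the pair links two separate groups
            parent[root_a] = root_b
            chosen.append((a, b))
    return chosen


# Hypothetical similarity scores for four ligands.
scores = {
    ("lig1", "lig2"): 0.9,
    ("lig1", "lig3"): 0.4,
    ("lig2", "lig3"): 0.7,
    ("lig3", "lig4"): 0.8,
    ("lig1", "lig4"): 0.2,
    ("lig2", "lig4"): 0.3,
}
pairs = select_ligand_pairs(scores)
print(pairs)
```

A spanning tree uses the minimum number of transformations needed to relate all ligands to each other. Adding redundant pairs that close cycles would cost more simulations but would allow consistency checks between the computed free energy differences.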


The pmx utilities can be used from the command line. While this approach is mainly attractive to power users who need to perform large scale mutation scans, we also provide support for hybrid structure/topology generation via a web-based interface. Both amino acid and nucleic acid mutations can be generated online. The webserver interface not only allows performing single point mutations, but also enables setting up a scan over a protein with a selected amino acid or a full scan of a DNA chain.
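To make the idea of a scan with a selected amino acid concrete, the sketch below enumerates the single point mutations such a scan would contain. It only produces mutation labels. The function name `scan_mutations`, the example sequence and the label format are assumptions made for illustration and do not reflect the pmx command line or webserver interface, which would still be needed to generate the hybrid structures and topologies.

```python
def scan_mutations(sequence, target, first_resid=1):
    """List single point mutations that replace each residue by `target`.

    `sequence` is a one-letter amino acid string and `first_resid` is the
    residue number of its first position (both hypothetical inputs).
    """
    mutations = []
    for offset, wild_type in enumerate(sequence):
        if wild_type == target:
            continue  # residue already is the target amino acid
        mutations.append(f"{wild_type}{first_resid + offset}{target}")
    return mutations


# Alanine scan over a short made-up peptide.
alanine_scan = scan_mutations("MKTAY", "A")
print(alanine_scan)
```

Each label in the resulting list corresponds to one independent alchemical calculation, which is why scans of this kind lend themselves to automated setup.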