IN A NUTSHELL
WHAT IS CPMD DOING?
The CPMD code (http://www.cpmd.org) is a parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics. CPMD is currently the most HPC-efficient code for performing quantum molecular dynamics simulations with the Car-Parrinello molecular dynamics scheme. CPMD simulations are usually restricted to systems of a few hundred atoms. To extend its domain of applicability to (much) larger, biologically relevant systems, a hybrid quantum mechanics/molecular mechanics (QM/MM) interface, employing routines from the GROMOS96 molecular dynamics code, has been developed.
CPMD is intended for any user who wants to study phenomena where:
- electrons are directly involved, e.g. in bond formation/breaking in chemical reactions or in performing spectroscopic predictions
- universal force-fields are too inaccurate, e.g. in modelling catalytic sites of metalloproteins
- more generally, quantum mechanical effects play a crucial role, e.g. proton transfer or Grotthuss diffusion
A CPMD calculation requires the following input:
- Atomic coordinates and species
- Pseudopotential files corresponding to each atomic species involved
- In the case of a CPMD/Gromos96 hybrid QM/MM calculation, a Gromos96 topology file for the system
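As an illustration of how the inputs listed above fit together, the following minimal sketch assembles a plain-text input for a wave function optimization from atomic coordinates, species and pseudopotential file names. The section and keyword names (&CPMD, &SYSTEM, &DFT, &ATOMS, LMAX) are written as recalled from the CPMD manual and should be checked against the manual for the version in use. The Species class, the build_cpmd_input function, the pseudopotential file name and all numerical values are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Species:
    """One atomic species (hypothetical helper, not part of CPMD)."""
    pseudopotential: str  # pseudopotential file name (illustrative)
    lmax: str             # highest angular momentum channel, e.g. "S" or "P"
    positions: List[Tuple[float, float, float]]  # coordinates in angstrom


def build_cpmd_input(species: List[Species], cell_edge: float,
                     cutoff_ry: float) -> str:
    """Return the text of a minimal input for a cubic cell.

    Keyword names are assumptions taken from memory of the CPMD manual.
    """
    lines = [
        "&CPMD",
        " OPTIMIZE WAVEFUNCTION",
        "&END",
        "",
        "&SYSTEM",
        " SYMMETRY",
        "  1",                      # simple cubic supercell
        " ANGSTROM",                # cell and coordinates given in angstrom
        " CELL",
        f"  {cell_edge:.3f} 1.0 1.0 0.0 0.0 0.0",
        " CUTOFF",
        f"  {cutoff_ry:.1f}",       # plane wave cutoff in Rydberg
        "&END",
        "",
        "&DFT",
        " FUNCTIONAL LDA",
        "&END",
        "",
        "&ATOMS",
    ]
    for sp in species:
        lines.append(f"*{sp.pseudopotential}")   # one block per species
        lines.append(f" LMAX={sp.lmax}")
        lines.append(f"  {len(sp.positions)}")   # number of atoms of this species
        for x, y, z in sp.positions:
            lines.append(f"  {x:.4f} {y:.4f} {z:.4f}")
    lines.append("&END")
    return "\n".join(lines) + "\n"


# Usage: a hydrogen molecule in an 8 angstrom cubic box (illustrative values).
h2 = Species("H_MT_LDA.psp", "S", [(4.371, 4.0, 4.0), (3.629, 4.0, 4.0)])
text = build_cpmd_input([h2], cell_edge=8.0, cutoff_ry=70.0)
print(text)
```

The pseudopotential files themselves must be obtained separately and be readable by CPMD at run time; the sketch only references them by name.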
CPMD supports many different types of calculations, for example:
- Wave function and geometry optimization
- Molecular dynamics (Car-Parrinello, Born-Oppenheimer, Ehrenfest schemes)
- Electronic properties (dipole moment, polarisability, population analysis, orbital localisation, etc.)
- Perturbation theory/Linear response (IR, EPR, NMR, Raman, etc.)
- Excited states (through Time-Dependent Density Functional Theory)
- Enhanced sampling techniques (thermodynamic integration, metadynamics, etc.)
- Path integral molecular dynamics
The currently available CPMD/Gromos96 interface substantially reduces the HPC performance of CPMD. We are developing a new, modern QM/MM interface that would allow any classical molecular dynamics code to be coupled efficiently with minimal code modifications.
BioExcel, as a centre of excellence, is the perfect environment in which to develop this new interface and subsequently promote its use in the biophysical community. In particular, GROMACS has been selected as the proof-of-concept code for coupling to CPMD, in order to exploit the expertise already available within BioExcel.
The current version of CPMD, 4.1, is copyrighted jointly by IBM Corp and by Max Planck Institute, Stuttgart, and is distributed free of charge to non-profit organisations and non-commercial users who accept the license terms and register on the CPMD website. For-profit organisations interested in the code should contact the CPMD consortium at firstname.lastname@example.org.
To run CPMD in the fully Hamiltonian hybrid QM/MM mode, additional routines are needed that are not included in the standard CPMD release. Using these routines requires a Gromos96 license; they can be obtained by contacting the CPMD developers directly at email@example.com.
The advent of exascale computation will have a tremendous impact on cell biology and pharmacology. Multi-scale simulations and enhanced sampling calculations on exascale platforms will, on the one hand, allow realistic biological systems to be described accurately on biologically relevant timescales and, on the other, deliver high-throughput molecular data, e.g. enzymatic reaction constants and association/dissociation kinetic constants in protein/protein and ligand/protein complexes (the so-called kon and koff constants). These data, often not accessible by experimental means, constrain the parameters required for the mathematical modelling of signalling pathways. Hence, the combination of molecular simulation and systems biology will offer an unprecedented predictive tool to investigate the effects of mutations and of drugs on subcellular events, testable by experiment. This will open a new avenue in understanding human biology and in counteracting the cell derangement associated with diseases. The latter most often originate from misregulated signalling pathways, due to altered protein expression levels or protein mutations.
The CPMD code (CPMD 1990-2008) is a highly efficient, parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics. It represents the state-of-the-art parallel implementation of this method. In particular, the MD implementation is based on a coarse-grained parallel algorithm, optimized for distributed-memory architectures. This algorithm represents an excellent compromise between load balancing in floating-point computation, memory occupation and parallel efficiency. Moreover, a second, fine-grained level of parallelism has been added to the code to exploit the now-ubiquitous multiprocessor boards with shared memory. The two parallelization strategies are independent of each other and can be used either individually or in combination. CPMD is the effort of many collaborating groups. Currently, the main scientist maintaining the code is Dr. Alessandro Curioni from the IBM Zürich research laboratory.
CPMD runs on a large variety of computer architectures and is optimized for both scalar and vector CPUs. The best scaling is achieved on special parallel hardware with lightweight kernels, such as the IBM BlueGene class. Indeed, some of the CPMD simulations devised and run by members of the CoE, with computing time granted by EU HPC initiatives such as PRACE, are among the most efficiently parallelized applications in biology ever performed (see http://www.prace-ri.eu/prace-1st-regular-call).
Multi-scale approaches, such as quantum mechanics/molecular mechanics (QM/MM), are among the most powerful tools for describing biological events that involve quantum mechanics, including enzymatic reactions, photochemical processes and proton transfer phenomena. In this context, Ursula Röthlisberger (EPFL, Lausanne) extended the scope of CPMD applications to the realm of biology by implementing an interface for fully Hamiltonian hybrid quantum mechanics/molecular mechanics (CPMD/MM) simulations (Laio et al. 2002). In this scheme, the relevant portions of the system (the CPMD part), where e.g. enzymatic reactions occur, are described accurately at the quantum level, while the rest of the system (the MM part) is treated at the force-field level.
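To make the QM/MM partition described above concrete, the following sketch splits a list of atoms into a quantum region and a force-field region using a simple distance cutoff around a chosen reaction centre. This is only one hypothetical selection rule, chosen here for brevity; it is not how the CPMD/Gromos96 interface defines its regions, and in practice the QM part is chosen by chemical judgement. The function name partition_qm_mm, the cutoff radius and the coordinates are all assumptions for illustration.

```python
import math
from typing import List, Tuple

Atom = Tuple[str, float, float, float]  # (element, x, y, z) in angstrom


def partition_qm_mm(atoms: List[Atom],
                    centre: Tuple[float, float, float],
                    radius: float) -> Tuple[List[int], List[int]]:
    """Return (qm_indices, mm_indices) using a distance cutoff.

    Atoms within `radius` of `centre` go to the QM part,
    all others to the MM part. Hypothetical rule for illustration.
    """
    qm: List[int] = []
    mm: List[int] = []
    for index, (_, x, y, z) in enumerate(atoms):
        distance = math.dist((x, y, z), centre)
        if distance <= radius:
            qm.append(index)
        else:
            mm.append(index)
    return qm, mm


# Usage with made-up coordinates: two atoms near the centre, one far away.
atoms = [("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("Na", 5.0, 0.0, 0.0)]
qm_part, mm_part = partition_qm_mm(atoms, centre=(0.0, 0.0, 0.0), radius=3.0)
print("QM atoms:", qm_part, "MM atoms:", mm_part)
```

A pure distance cutoff can cut through covalent bonds, which a real setup has to handle explicitly; the sketch ignores this.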
Several performance improvements have been made since the original implementation. However, the QM/MM interface is still far from matching the state-of-the-art scaling of the full-QM code on large parallel machines. Specifically, while the code has been shown to scale up to an entire IBM BlueGene/Q machine (96 racks, corresponding to 6,291,456 threads) with near-perfect parallel efficiency, a QM/MM simulation of comparable complexity cannot yet scale beyond a few hundred cores on the same architecture.
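The thread count quoted above can be reproduced with a short calculation, assuming the usual BlueGene/Q layout of 1024 nodes per rack, 16 compute cores per node and 4 hardware threads per core (these per-rack figures are not stated in the text). The sketch also includes the common definition of parallel efficiency relative to a reference run; the timings in the usage example are invented purely to show the formula.

```python
RACKS = 96
NODES_PER_RACK = 1024    # assumed BlueGene/Q layout
CORES_PER_NODE = 16      # compute cores per node (assumed)
THREADS_PER_CORE = 4     # hardware threads per core (assumed)


def total_threads(racks: int = RACKS) -> int:
    """Hardware threads available on `racks` BlueGene/Q racks."""
    return racks * NODES_PER_RACK * CORES_PER_NODE * THREADS_PER_CORE


def parallel_efficiency(t_ref: float, n_ref: int, t_n: float, n: int) -> float:
    """Efficiency of a run on n workers relative to a reference on n_ref.

    A value of 1.0 means perfect (linear) scaling.
    """
    return (t_ref * n_ref) / (t_n * n)


print(total_threads())  # 6291456

# Invented timings: 100 s on 1 worker, 12.5 s on 8 workers.
print(parallel_efficiency(100.0, 1, 12.5, 8))
```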
In collaboration with Prof. Röthlisberger and IBM, we plan to redesign the crucial QM/MM algorithms that currently prevent the code in the QM/MM configuration from running efficiently and scaling to thousands of cores as CPMD itself does. In addition, we plan to replace the current QM/MM interface, based on the Gromos96 molecular dynamics engine, with one based on a more modern, widely used and highly parallel MD code, such as GROMACS. This will allow the QM/MM interface of CPMD to be used more extensively by the community. In fact, while CPMD is free software, its QM/MM interface requires a Gromos96 license. Interfacing CPMD with GROMACS, in particular by linking GROMACS as a library, will make the entire CPMD package freely available.
This achievement can be considered a fundamental contribution towards exascale computational biology applications.