Motivation: Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment.
These applications share a need for fast and efficient software that can be deployed at massive scale in clusters, web servers, distributed computing or cloud resources. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in.
GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including Windows-based workstations.
Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations.
Contact: es. Supplementary information: Supplementary data are available at Bioinformatics online. Although molecular dynamics simulation of biomolecules is frequently classified as computational chemistry, the scientific roots of the technique trace back to polymer chemistry and structural biology in the 1960s and 1970s, where it was used to study the physics of local molecular properties (flexibility, distortion and stabilization) and to relax early X-ray structures of proteins on short time scales (Berendsen; Levitt and Lifson, 1969; Lifson and Warshel, 1968; McCammon et al., 1977).
Molecular simulation in general was pioneered even earlier in physics, applied to simplified hard-sphere systems (Alder and Wainwright, 1957). The field of molecular simulation has developed tremendously since then, and simulations are now routinely performed on the multi-microsecond scale, where it is possible to repeatedly fold small proteins (Bowman et al.).
This classical type of single long simulation continues to be important, as it provides a way to directly monitor molecular processes not easily observed by other means. Mutation studies can now easily build models and run short simulations for hundreds of mutants, and model-building web servers frequently offer automated energy minimization and refinement (Zhang et al.).
In these scenarios, classical molecular dynamics simulations based on empirical models have a significant role to play, as most properties of interest are defined by free energies, which typically require extensive sampling that traditional quantum chemistry methods cannot provide for large systems. These developments would not have been possible without significant research efforts in simulation algorithms, optimization, parallelization and, not least, ways to integrate simulations into modeling pipelines.
All these packages have complementary strengths and profiles; for the GROMACS molecular simulation toolkit, one of our primary long-term development goals has been to achieve the highest possible simulation efficiency for the small- to medium-size clusters that were present in our own research laboratories. As computational resources are typically limited in those settings, it is sometimes preferable to use throughput approaches with moderate parallelization that yield whole sets of simulations rather than maximizing performance in a single long simulation.
However, in recent years, we have combined this with optimizing parallel scaling to enable long simulations when dedicated clusters or supercomputers are available for select critical problems. Many tasks that only a decade ago required exceptionally large dedicated supercomputing resources are now universally accessible, and sometimes they can even be run efficiently on a single workstation or laptop. High-end performance in GROMACS has also been improved with new decomposition techniques in both direct and reciprocal space that push parallelization further and that have made microsecond simulation timescales reachable in a week or two even for large systems using only modest computational resources.
However, in hindsight, the decision to release the package as both open source and free software was a significant advance for the project. The codebase has become a shared infrastructure with contributions from several research laboratories worldwide, where every single patch and all code reviews are public as soon as they are committed to the repository.
We explicitly encourage extensions and re-use of the code; as examples, GROMACS is used as a module to perform energy minimization in other structural bioinformatics packages (including commercial ones); it is available as a component from many vendors that provide access to cloud computing resources; and some of its optimized mathematical functions, such as inverse square roots, have been reused in other codes.
Many Linux distributions also provide pre-compiled or contributed binaries of the package. These features per se do not necessarily say anything about scientific quality, but we believe this open development platform ensures (i) intensive code scrutiny, (ii) several state-of-the-art implementations of algorithms and (iii) immediate availability of research work to end users.
Compared with only 10 years ago, the project is now used everywhere from the smallest embedded processors to the largest supercomputers in the world, with applications ranging from genome-scale refinement of coarse-grained models to multi-microsecond simulations of membrane proteins or vesicle fusion.
Supercomputers are still important for the largest molecular simulations, but many users rely on modest systems for their computational needs. GROMACS is designed for maximum portability, with external dependencies kept to a minimum and fall-back internal libraries provided whenever possible.
One of the main challenges in the past few years has been the emergence of multicore machines. Although GROMACS runs in parallel, it was designed to use the message-passing interface (MPI) communication libraries present on supercomputers rather than automatically using multiple cores.
Release 4.5 therefore includes a built-in thread-based parallelization, so a single simulation can use all cores of a machine without requiring an external MPI library. As simulation software and computer performance have improved, biomolecular dynamics has increasingly been used for structure equilibration, sampling of models or testing what effects mutations might have on structure and dynamics by introducing many different mutations and performing comparatively short simulations on multiple structures.
Although this type of short simulation might not look as technically impressive as long trajectories, we strongly feel it is a much more powerful approach for many applications.
As simulations build on statistical mechanics, a result seen merely in one long trajectory might just as well be a statistical fluctuation that would never be accepted as significant in an experimental setting. In addition, the same toolbox can be applied to liquid simulations, where the need for sampling is limited, but where one often needs to study a range of systems under different conditions (e.g. Caleman et al.).
As discussed earlier in the text, GROMACS has always been optimized to achieve the best possible efficiency using scarce resources (which we believe is the norm for most users), and version 4.5 continues this focus. All GROMACS runs are now automatically checkpointed and can be interrupted and continued as frequently as required, and optional flags have been added to enable binary reproducibility of trajectories.
As users often work with many datasets at once, we have implemented MD5 hashes on checkpoint continuation files to guarantee their integrity and to make sure the user does not append to the wrong file by mistake. These additional checks have allowed us to enable file appending on job continuation: repeated short jobs that continue from checkpoints will yield a single set of output just as from a single long job.
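The checkpoint-integrity idea can be sketched in a few lines. This is not the actual GROMACS implementation (which is written in C and stores checksums inside its checkpoint format); it is a minimal Python illustration of hashing an output file and comparing it against the digest recorded at checkpoint time, with all function names hypothetical:

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def safe_to_append(output_path, checksum_from_checkpoint):
    """Only allow appending if the existing output file matches the
    digest that was recorded when the checkpoint was written."""
    return file_md5(output_path) == checksum_from_checkpoint
```

The key point is the comparison step: if a user accidentally points a continuation run at the wrong output file, the digests disagree and appending is refused.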
Hundreds or even thousands of smaller simulations can be started with a single GROMACS execution command to optimize use on supercomputers that favor large jobs, and each of these can itself be parallel if advantageous. GROMACS also supports simulations running in several modern cloud computing environments where virtual server instances can be started on demand. As cloud computing is billed by the hour, we believe the most instructive metric for performance and efficiency is to measure simulation performance in terms of the cost to complete a given simulation; for an example, see the performance discussion in Section 2.
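As an illustration of this cost metric, the following sketch (all names and numbers hypothetical, not part of GROMACS) converts a measured simulation rate in ns/day and an hourly instance price into a cost per simulated nanosecond, which makes different machines directly comparable:

```python
def cost_per_nanosecond(ns_per_day, price_per_hour):
    """Cost to simulate one nanosecond on a machine billed by the hour."""
    ns_per_hour = ns_per_day / 24.0
    return price_per_hour / ns_per_hour

def cheapest(instances):
    """Pick the instance type with the lowest cost per simulated ns.

    `instances` maps an instance name to (ns_per_day, price_per_hour).
    """
    return min(instances, key=lambda name: cost_per_nanosecond(*instances[name]))
```

Under this metric, a machine that is twice as fast at the same hourly price is exactly half as expensive per nanosecond, regardless of raw benchmark numbers.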
In addition to the high-throughput execution model, there are a number of new code features developed to support modeling and rapid screening of structures. In particular, implicit solvent support has been greatly extended in version 4.5. Together with manually tuned assembly kernels, implicit solvent simulations can reach performance in excess of a microsecond per day for small proteins, even on standard CPUs.
The neighbor-searching code has been updated to support grid-based algorithms even in vacuo (including support for atoms diffusing away towards infinity with maintained performance), and there are now also highly optimized kernels that compute all-versus-all interactions without cut-offs, both for standard and generalized Born interactions.
The program now also supports arbitrary knowledge-based statistical interactions through atom-group-specific tables, both for bonded and non-bonded interactions.
Restraints, such as those used in refinement, can be applied to positions, atomic distances or torsions, and there are several options for ensemble weighting of contributions from multiple restraints. Despite the rapid emergence of high-throughput computing, the usage of massively parallel resources continues to be a cornerstone of high-end molecular simulation.
Absolute performance is the goal for this usage too, but here, it is typically limited by the scalability of the software.
A subset of nodes is dedicated to the PME calculation, and at the beginning of each step, the direct-space nodes send coordinate and charge data to them. As direct space can be decomposed in all three dimensions, a number of direct-space nodes (typically 3-4) map onto a single reciprocal-space node (Fig.).
Limiting the computation of the 3D FFT to a smaller number of nodes improves parallel scaling significantly (Hess et al., 2008). The new pencil decomposition makes it much easier to automatically determine both real- and reciprocal-space decompositions of arbitrary systems to fit a given number of nodes.
The automatic load-balancing step of the domain decomposition has also been improved; domain decomposition now works without periodic boundary conditions (important for implicit solvent); and GROMACS now includes tools to automatically tune the balance between direct- and reciprocal-space work. In particular, when running in parallel over large numbers of nodes, it is advantageous to move more work to real space (which scales near-linearly) and decrease the reciprocal-space load, reducing the dimensions of the 3D FFT grid, where the number of communication messages scales with the square of the number of nodes involved.
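The 3-4:1 ratio of direct- to reciprocal-space nodes described above can be illustrated with a toy splitting heuristic. GROMACS's actual tuning of separate PME node counts is considerably more sophisticated; this sketch (all names hypothetical) only shows the basic arithmetic:

```python
def split_nodes(total_nodes, direct_per_pme=3):
    """Split a node count between direct-space and reciprocal-space (PME)
    work, assuming roughly `direct_per_pme` direct-space nodes for every
    PME node. Returns (direct_nodes, pme_nodes)."""
    pme = max(1, round(total_nodes / (direct_per_pme + 1)))
    return total_nodes - pme, pme
```

With 16 nodes and a 3:1 target ratio this yields 12 direct-space and 4 reciprocal-space nodes, keeping the 3D FFT confined to a quarter of the machine.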
The latest version of GROMACS also supports many types of multilevel parallelism. In addition to coding-level optimizations, such as single-instruction multiple-data (SIMD) instructions and multithreaded execution, GROMACS supports replica-exchange ensemble simulations in which a single simulation can use hundreds of replicas that only communicate every couple of seconds, which makes it possible to scale even fairly small systems to large machines.
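The reason replica exchange communicates so rarely is that replicas only need to meet to test a Metropolis-style swap criterion. A minimal sketch of the standard temperature-exchange acceptance test (not GROMACS code; names are hypothetical) might look like:

```python
import math
import random

def exchange_accepted(beta_i, beta_j, u_i, u_j, rng=random.random):
    """Metropolis criterion for swapping configurations between two
    temperature replicas with inverse temperatures beta_i, beta_j and
    potential energies u_i, u_j:
        p = min(1, exp((beta_i - beta_j) * (u_i - u_j)))
    """
    delta = (beta_i - beta_j) * (u_i - u_j)
    return delta >= 0 or rng() < math.exp(delta)
```

Only two floating-point numbers per replica pair need to be exchanged at each attempt, so the communication cost is negligible compared with the simulation work between attempts.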
Finally, for the largest systems, comprising hundreds of millions of particles, we now achieve true linear weak scaling for reaction-field and other non-lattice-summation methods (Schulz et al.). (Figure: the pencil grid decomposition improves reciprocal-space scaling considerably and makes it easier to use arbitrary numbers of nodes. Colors in the plot refer to a hypothetical system with four cores per node, where three are used for direct-space and one for reciprocal-space calculations.)
The automated tools to generate input files were somewhat limited in earlier releases of GROMACS; few molecules apart from single-chain proteins worked perfectly. This has been much improved for version 4.5: any number of chains and different molecule classes can be mixed, and they are automatically detected. The program provides several different options for how to handle termini and HETATM records in structures, and residue names and numbering from the input files are now maintained throughout the main simulation and analysis tools.
To the best of our knowledge, this range of force-field support is currently unique among packages and makes it straightforward to systematically compare the influence of parameter approximations in biomolecular modeling.
The code also provides name-translation files to support all the conventions used in the different force fields. Originally, the code only supported the leapfrog Verlet integrator, which keeps track of the positions at the full step, whereas the velocities are offset by half a time step. The velocity Verlet algorithm (Swope et al., 1982) is now also supported; in velocity Verlet, positions and velocities are defined at the same time point. For constant-energy simulations, both algorithms give the same trajectories, but for constant-temperature or constant-pressure simulations, velocity Verlet integration provides many additional features.
A number of pressure- and temperature-control algorithms are only possible with velocity Verlet integrators, because they require positions and velocities (and hence the potential and kinetic contributions to pressure and temperature) at the same time point. Good temperature-control algorithms already exist for the leapfrog integrator, so velocity Verlet is not strictly necessary in those cases; with leapfrog, however, slight errors arise from the time-step mismatch between the kinetic-energy and potential-energy components of the pressure calculation.
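The difference between the two schemes is easy to see in code. Below is a minimal single-particle sketch (plain Python, not GROMACS code): in velocity Verlet, positions and velocities are available at the same time point after each step, whereas leapfrog stores velocities offset by half a step:

```python
def velocity_verlet(x, v, force, dt, mass=1.0, steps=1000):
    """Velocity Verlet: half-kick, drift, half-kick.
    x and v refer to the same time point after every step."""
    f = force(x)
    for _ in range(steps):
        v += 0.5 * dt * f / mass   # half-step velocity update
        x += dt * v                # full-step position update
        f = force(x)               # force at the new position
        v += 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

def leapfrog(x, v_half, force, dt, mass=1.0, steps=1000):
    """Leapfrog: velocities are stored half a time step behind x,
    so kinetic and potential terms never coincide in time."""
    for _ in range(steps):
        v_half += dt * force(x) / mass
        x += dt * v_half
    return x, v_half
```

For a harmonic oscillator both integrators are symplectic and conserve energy to O(dt^2) over long runs, but only velocity Verlet yields the instantaneous kinetic energy at the same time as the potential energy, which is what the rigorous pressure-control algorithms need.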
The introduction of velocity Verlet allows the use of additional, more rigorous pressure-control algorithms, such as that of Martyna, Tuckerman, Tobias and Klein (Martyna et al., 1996).
Leapfrog Verlet and velocity Verlet are both implemented as specific instances of a method called Trotter factorization, a general technique for decomposing the equations of motion that makes it possible to write out different symplectic integrators based on different orderings of the integration of the different degrees of freedom.
This Trotter factorization approach will eventually make it possible to support a large range of multistep and other accelerated-sampling integrators, and it is already used for more efficient temperature and pressure scaling. Historically, GROMACS has relied on virtual interaction sites and all-bond constraints to extend the shortest time step in integration (in contrast to NAMD, which uses multiple-time-step integration), but this approach will make it possible to support both alternatives in future versions.
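In the Trotter picture, the velocity Verlet update corresponds to a symmetric splitting of the classical propagator. Schematically, writing iL_x for the position-update part of the Liouville operator and iL_v for the velocity-update part:

```latex
e^{iL\,\Delta t} \approx
  e^{iL_v\,\Delta t/2}\; e^{iL_x\,\Delta t}\; e^{iL_v\,\Delta t/2}
```

Different orderings of the factors yield different (but equally symplectic) integrators, leapfrog among them, and thermostat or barostat operators can be inserted into the product at well-defined points.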
Simulation-based free-energy calculations provide a way to accurately include the effects of both interactions and entropy, and to predict solvation and binding properties of molecules. This is one of the most direct ways that simulations can provide specific predictions of properties that can also be measured experimentally. GROMACS, like other packages, has long supported free-energy perturbation and slow-growth methods to calculate free-energy differences when gradually changing the properties of molecules.
The present release of the code provides an extensive new free-energy framework based on Bennett acceptance ratios (BAR). The two end states are mixed through a coupling parameter lambda, with the total energy (or, actually, Hamiltonian) defined as H(lambda) = (1 - lambda) H0 + lambda H1, where H0 and H1 are the Hamiltonians for the two end states. BAR uses differences in the Hamiltonian as the basis for calculating the free-energy difference; it has been shown to be both the most efficient free-energy perturbation method for extracting free-energy differences (Bennett, 1976) and a statistically unbiased estimator of this difference (Shirts and Pande, 2005). The Hamiltonian differences needed for BAR are now calculated automatically on the fly during simulations, rather than in a post-processing step using large full-precision trajectories, which makes it possible to use distributed computing or cloud resources where the available storage and bandwidth are limited.
Rather than manually defining how to modify each molecule, the user can now simply specify, as a simulation parameter, that they want to calculate the free energy of decoupling a particular molecule or group of atoms from the system. (Figure: free-energy calculations using BAR. The user specifies a sequence of lambda points and runs simulations whose phase spaces overlap; Hamiltonian differences are calculated on the fly.) In addition to the larger development concepts covered here, several additional parts of GROMACS have been improved and extended for version 4.5.
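At its core, Bennett's method reduces to solving a one-dimensional self-consistent equation for the free-energy difference between two lambda points. The sketch below (plain Python, not the GROMACS implementation; it assumes equal numbers of forward and reverse samples and energy differences already in units of kT) finds that difference by bisection:

```python
import math

def fermi(x):
    """Fermi function f(x) = 1 / (1 + exp(x)), as used by Bennett."""
    return 1.0 / (1.0 + math.exp(x))

def bar_delta_f(w_forward, w_reverse, tol=1e-9):
    """Solve Bennett's self-consistent equation for the free-energy
    difference dF (in kT) between two states, given forward and reverse
    Hamiltonian differences:
        sum_F fermi(w_F - dF) = sum_R fermi(w_R + dF)
    The left-hand side minus the right-hand side is monotonically
    increasing in dF, so bisection finds the unique root."""
    def imbalance(df):
        return (sum(fermi(w - df) for w in w_forward)
                - sum(fermi(w + df) for w in w_reverse))

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because only the scalar Hamiltonian differences enter the estimator, storing them on the fly during the run (rather than full-precision trajectories) is sufficient, which is exactly what makes the approach bandwidth-friendly.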
We now support symplectic leapfrog and velocity Verlet integrators for fully reversible temperature and pressure coupling, with several new barostats and thermostats, including Nose-Hoover chains for ergodic temperature control and Martyna-Tuckerman-Tobias-Klein (MTTK) pressure-control integrators.
These are important for calculating accurate free energies, in particular for smaller systems or cases where pressure affects the result. Membrane embedding: a new tool inserts membrane proteins into equilibrated lipid bilayers; lipids are removed based on overlap, and the tool has full support for asymmetrically shaped proteins. Non-equilibrium simulation: it is now possible to pull any number of groups in arbitrary directions and to apply torques in addition to forces. Normal-mode analysis can now be performed for extremely large systems through a new sparse-matrix diagonalization engine that also works in parallel, and even for PME simulations, it is possible to perform the traditional, computationally costly non-sparse diagonalization in parallel.
Rather than optimizing only for scaling efficiency, we have aimed to improve both absolute performance per core and scaling efficiency at the same time. Recent enhancements in this respect include the better PME parallel decomposition described earlier in the text. The choice of method for calculating long-range electrostatics can greatly affect simulation performance, and rather than simply optimizing the method that scales best (reaction field), we have worked to optimize the method currently viewed as best practice in the field (van der Spoel and van Maaren). By implementing 2D pencil node decomposition for PME and improving the dynamic load-balancing algorithms, we obtain close-to-linear scaling over large numbers of nodes for a set of benchmark systems selected as real-world applications from our own and others' recent work.
Scaling results are plotted in Figure 3 for a ligand-gated ion channel (Murail et al.).