Validation

Automated Topology Builder Validation Statistics

The Automated Topology Builder (ATB) version 2.0 has been validated against structural and thermodynamic data, namely root-mean-square deviations (RMSD) from quantum mechanically optimized structures and hydration free energies.

Root Mean Square Deviation

In vacuum

As an initial validation of the topologies and parameters generated by the ATB 2.0, each molecule was energy minimized in vacuum, and the resulting structure was compared to that obtained after quantum mechanical (QM) optimization in implicit solvent (water) using GAMESS-US [1]. The QM optimization was performed at the B3LYP/6-31G* level of theory for molecules of ≤ 50 atoms and at the HF/STO-3G level for molecules of > 50 atoms.

From the distribution of values in Fig. 1, it can be seen that 50% of molecules have an RMSD value below 0.01 nm and almost 95% have an RMSD value below 0.03 nm. Overall, the agreement between the QM optimized structures and the structures energy minimized using the ATB parameters is very good, which suggests that the parameters maintain the geometry of the molecules well.

In water

To further validate the topologies, a test set of 178 heteromolecules was simulated for 200 ps in SPC [2] water at 300 K and 1 atm using the GROMOS11 [3] molecular dynamics (MD) simulation package. The RMSD of the final structure from each simulation with respect to the QM-optimised structure was calculated after a least-squares fit on all atoms. The number of atoms in the molecules varied from 6 to 40, and the molecular weights ranged from 28 to 410 atomic mass units. The molecules considered contained carbon, hydrogen, oxygen, sulphur, phosphorus, nitrogen and halogens (chlorine, bromine, fluorine). The topologies were generated on 1-3-2013 using ATB version 2.0.
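
The least-squares fit and RMSD calculation described above can be sketched as follows. This is a minimal illustration using NumPy and the Kabsch algorithm, not the analysis code used for the ATB validation. The function name rmsd_after_fit and the example coordinates are hypothetical, and the fit is assumed to be unweighted (no mass weighting).

```python
import numpy as np


def rmsd_after_fit(coords, ref):
    """RMSD between two conformations after a least-squares fit on all atoms.

    Both inputs are (n_atoms, 3) arrays with atoms in the same order.
    The result is in the same length unit as the input (for example nm).
    Assumption: every atom carries equal weight in the fit.
    """
    coords = np.asarray(coords, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if coords.shape != ref.shape or coords.ndim != 2 or coords.shape[1] != 3:
        raise ValueError("coords and ref must both have shape (n_atoms, 3)")

    # Remove the translation by moving both centroids to the origin.
    p = coords - coords.mean(axis=0)
    q = ref - ref.mean(axis=0)

    # Kabsch algorithm: the optimal rotation follows from the SVD of the
    # 3x3 covariance matrix between the two centred coordinate sets.
    u, _, vt = np.linalg.svd(p.T @ q)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate the centred coordinates onto the reference and compute the RMSD.
    fitted = p @ rotation.T
    return float(np.sqrt(np.mean(np.sum((fitted - q) ** 2, axis=1))))


# Hypothetical coordinates (nm) of a four-atom fragment, for illustration only.
ref = np.array([
    [0.00, 0.00, 0.00],
    [0.15, 0.00, 0.00],
    [0.15, 0.15, 0.00],
    [0.00, 0.15, 0.10],
])

# A copy rotated by 90 degrees about z and shifted: a pure rigid-body move.
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = ref @ rot_z.T + np.array([1.0, -2.0, 0.5])

# The fit removes the rigid-body motion, so this RMSD is zero within rounding.
print(rmsd_after_fit(moved, ref))
```

Because the fit removes overall translation and rotation, the remaining RMSD reflects only internal changes in geometry, which is the quantity of interest when comparing a simulated structure with the QM reference.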

From the distribution of values shown in Fig. 2, it can be seen that 50% of molecules have an RMSD value ≤ 0.1 nm, with ~95% having an RMSD value ≤ 0.2 nm. Note that these values correspond to a single configuration taken at the end of each simulation. The RMSD values therefore reflect fluctuations due to thermal motion at 300 K, including the effects of dihedral transitions. The test set included a number of highly flexible and/or hydrophobic molecules, such as long aliphatic chains. This explains why a small proportion of molecules show large deviations from the QM optimised structures when simulated in water.

Fig. 2. Percentage distribution of RMSD values for 178 molecules (MD)
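
Summary figures of the kind quoted for the RMSD distributions (the value below which half of the molecules lie, and the fraction of molecules at or below a given threshold) can be obtained from a list of per-molecule RMSD values as sketched here. The function name and the RMSD values are hypothetical and serve only as an illustration; they are not the ATB validation data.

```python
import statistics


def fraction_at_or_below(values, threshold):
    """Fraction of values that are less than or equal to the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(v <= threshold for v in values) / len(values)


# Hypothetical per-molecule RMSD values in nm (illustration only).
rmsd_values = [0.04, 0.08, 0.10, 0.15, 0.30]

# Half of the molecules have an RMSD at or below the median.
print(statistics.median(rmsd_values))           # 0.1

# Fraction of molecules with an RMSD of at most 0.2 nm.
print(fraction_at_or_below(rmsd_values, 0.2))   # 0.8
```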

ATB 2.0 Validation

A subset of 75 small organic molecules consisting of alcohols, alkanes, cycloalkanes, alkenes, alkynes, alkyl benzenes, amines, amides, aldehydes, carboxylic acids, esters, ketones, thiols and sulphides was used as an initial test of the ATB parameters. The average unsigned error (AUE) for these molecules was 3.37 kJ.mol⁻¹, and 77% of the molecules lay within 5 kJ.mol⁻¹ of the experimental value. The largest deviation from experiment was 8.5 kJ.mol⁻¹. This result shows that, while the ATB parameters perform well for the majority of molecules, certain functional groups lead to systematic deviations from experiment (Fig. 4).

Fig. 4. Hydration free energies for 75 small organic molecules calculated using parameters generated by the ATB 2.0. The solid line has a slope of one and represents one-to-one agreement between the calculated and experimental values. The two dotted lines represent a deviation of 5 kJ.mol⁻¹ from the ideal line.
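
The statistics used to compare calculated and experimental hydration free energies (the AUE, the fraction of molecules within 5 kJ.mol⁻¹ of experiment and the largest deviation) can be computed as in the following minimal sketch. The function names and the free-energy values are hypothetical and are not taken from the ATB test set.

```python
def unsigned_errors(calculated, experimental):
    """Absolute deviation of each calculated value from experiment."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    return [abs(c - e) for c, e in zip(calculated, experimental)]


def average_unsigned_error(calculated, experimental):
    """Average unsigned error (AUE): mean of the absolute deviations."""
    errors = unsigned_errors(calculated, experimental)
    return sum(errors) / len(errors)


def fraction_within(calculated, experimental, tolerance=5.0):
    """Fraction of molecules whose deviation is at most `tolerance`."""
    errors = unsigned_errors(calculated, experimental)
    return sum(err <= tolerance for err in errors) / len(errors)


def largest_deviation(calculated, experimental):
    """Largest absolute deviation from experiment."""
    return max(unsigned_errors(calculated, experimental))


# Hypothetical hydration free energies in kJ/mol (illustration only).
calc = [-20.0, -10.0, 5.0, -30.0]
expt = [-22.0, -4.0, 6.0, -30.0]

print(average_unsigned_error(calc, expt))   # 2.25
print(fraction_within(calc, expt, 5.0))     # 0.75
print(largest_deviation(calc, expt))        # 6.0
```

Because the AUE averages absolute deviations, positive and negative errors do not cancel. A set of molecules with a systematic offset in one direction and a set with scattered errors of the same size therefore give the same AUE, and a plot such as Fig. 4 is needed to tell them apart.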

Of the set of 167 molecules, 92 were taken from previous SAMPL challenges. The AUEs for molecules in the SAMPL0, SAMPL1 and SAMPL2 data sets were 7.2 kJ.mol⁻¹, 9.6 kJ.mol⁻¹ and 8.5 kJ.mol⁻¹, respectively. While approximately 40% still lay within 5 kJ.mol⁻¹ of the experimental value, the largest deviation from experiment was 42 kJ.mol⁻¹. This is in part a reflection of the uncertainty in the experimental hydration free energies of molecules contained in the SAMPL challenges (which was as large as 8 kJ.mol⁻¹) and in part a reflection of the fact that these molecules contained a range of functional groups not commonly found in biomolecular systems.

Summary

The ATB 2.0 parameters were validated against structural data such as root mean square deviation and thermodynamic data such as hydration free energies:

  • Structural validation has shown good overall agreement between the QM optimized structures and the energy minimized structures using the ATB parameters.
  • The agreement between the predicted and experimental hydration free energies for the majority of molecules investigated is good.
  • The average unsigned error (AUE) using ATB 2.0 topologies for the complete test set is 6-7 kJ.mol⁻¹. However, for molecules containing functional groups that form part of the main GROMOS force field [8], the AUE is 3-4 kJ.mol⁻¹.
  • The systematic nature of the deviations suggests that it will be possible to greatly improve the overall performance of the ATB by optimizing the parameters for a small number of non-optimal atom types.

References

  1. Schmidt MW, Baldridge KK, Boatz JA, Elbert ST, Gordon MS, Jensen JH, Koseki S, Matsunaga N, Nguyen KA, Su S, Windus TL, Dupuis M, Montgomery JA (1993) General atomic and molecular electronic structure system. J Comput Chem 14 (11):1347-1363.
  2. Berendsen HJC, Postma JPM, van Gunsteren WF, Hermans J (1981) Interaction models for water in relation to protein hydration. In: Pullman B (ed) Intermolecular Forces. Springer Netherlands, The Netherlands, pp 331-342. doi:10.1007/978-94-015-7658-1_21
  3. Schmid N, Christ CD, Christen M, Eichenberger AP, van Gunsteren WF (2012) Architecture, implementation and parallelisation of the GROMOS software for biomolecular simulation. Comput Phys Commun 183 (4):890-903.
  4. Koziara KB, Stroet M, Malde AK, Mark AE (2013) Testing and validation of the Automated Topology Builder (ATB) version 2.0: Prediction of hydration free enthalpies. J Comput Aided Mol Des [in press].
  5. Geballe MT, Skillman AG, Nicholls A, Guthrie JP, Taylor PJ (2010) The SAMPL2 blind prediction challenge: introduction and overview. J Comput Aided Mol Des 24 (4):259-279.
  6. Nicholls A, Mobley DL, Guthrie JP, Chodera JD, Bayly CI, Cooper MD, Pande VS (2008) Predicting small-molecule solvation free energies: an informal blind test for computational chemistry. J Med Chem 51 (4):769-779.
  7. Guthrie JP (2009) A blind challenge for computational solvation free energies: introduction and overview. J Phys Chem B 113 (14):4501-4507.
  8. Oostenbrink C, Villa A, Mark AE, van Gunsteren WF (2004) A biomolecular force field based on the free enthalpy of hydration and solvation: The GROMOS force-field parameter sets 53A5 and 53A6. J Comput Chem 25 (13):1656-1676.
  9. Schmid N, Eichenberger AP, Choutko A, Riniker S, Winger M, Mark AE, van Gunsteren WF (2011) Definition and testing of the GROMOS force-field versions 54A7 and 54B7. Eur Biophys J 40 (7):843-856. doi:10.1007/s00249-011-0700-9