The Automated Topology Builder (ATB) version 2.0 has been validated against structural and thermodynamic data: root-mean-square deviations (RMSD) from reference structures and hydration free energies.
As an initial validation of the topologies and parameters generated by ATB 2.0, each molecule was energy minimized in vacuum, and the resulting structure was compared with that obtained after quantum mechanical (QM) optimization in implicit solvent (water) at the B3LYP/6-31G* level of theory (≤ 50 atoms) or at the HF/STO-3G level of theory (> 50 atoms) using GAMESS-US.
From the distribution of values in Fig. 1, it can be seen that 50% of molecules have an RMSD value below 0.01 nm and almost 95% have an RMSD value below 0.03 nm. Overall, the agreement between the QM-optimized structures and the structures energy minimized using the ATB parameters is very good, which suggests that the geometry of the molecules is well maintained.
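The cumulative fractions quoted above can be read off a list of per-molecule RMSD values with a few lines of code. The following is a minimal sketch only: the function name `fraction_below` and the sample values are hypothetical and are not ATB data.

```python
import numpy as np


def fraction_below(rmsd_nm, threshold_nm):
    """Return the fraction of RMSD values strictly below threshold_nm."""
    values = np.asarray(rmsd_nm, dtype=float)
    return float(np.mean(values < threshold_nm))


# Made-up RMSD values in nm, for illustration only (not ATB data).
example_rmsd = [0.004, 0.008, 0.012, 0.020, 0.035]

print(fraction_below(example_rmsd, 0.01))  # fraction below 0.01 nm
print(fraction_below(example_rmsd, 0.03))  # fraction below 0.03 nm
```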
To further validate the topologies, a test set consisting of 178 heteromolecules was simulated for 200 ps in SPC water at 300 K and 1 atm using the GROMOS11 molecular dynamics (MD) simulation package. The RMSD of the final structure from each simulation with respect to the structure optimised quantum mechanically was calculated after performing a least-squares fit on all atoms. The number of atoms in the molecules varied from 6 to 40. The molecular weights ranged from 28 to 410 atomic mass units. Again, the molecules considered contained carbon, hydrogen, oxygen, sulphur, phosphorus, nitrogen and halogens (chlorine, bromine, fluorine). The topologies were generated on the 1-3-2013 using the ATB version 2.0.
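To illustrate what an RMSD after a least-squares fit on all atoms involves, the sketch below superimposes one set of coordinates onto a reference using the Kabsch algorithm and then computes the RMSD. It is not the GROMOS11 analysis code. It assumes an unweighted fit (no mass weighting), identical atom ordering in both structures, and coordinates in nm. The function name `fitted_rmsd` is hypothetical.

```python
import numpy as np


def fitted_rmsd(coords, reference):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition.

    Assumes an unweighted least-squares fit and identical atom ordering.
    """
    p = np.asarray(coords, dtype=float)
    q = np.asarray(reference, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("both inputs must have the same shape (N, 3)")

    # Remove the translational difference by centring both structures.
    p = p - p.mean(axis=0)
    q = q - q.mean(axis=0)

    # Kabsch algorithm: SVD of the 3x3 covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate the centred coordinates onto the reference and measure the RMSD.
    p_fit = p @ rotation.T
    return float(np.sqrt(np.mean(np.sum((p_fit - q) ** 2, axis=1))))
```

Because the fit removes overall translation and rotation, a rigidly moved copy of a structure gives an RMSD of essentially zero. Only internal differences such as bond, angle and dihedral changes contribute.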
From the distribution of values shown in Fig. 2, it can be seen that 50% of molecules have an RMSD value ≤ 0.1 nm, with ~95% having an RMSD value ≤ 0.2 nm. Note that these values correspond to a single configuration taken at the end of the simulation. The RMSD values therefore reflect fluctuations due to thermal motion at 300 K, including the effects of dihedral transitions. The test set included a number of highly flexible and/or hydrophobic molecules such as long aliphatic chains. This explains why a small proportion of molecules show large deviations from the QM-optimised structures when simulated in water.
A subset of 75 small organic molecules consisting of alcohols, alkanes, cycloalkanes, alkenes, alkynes, alkyl benzenes, amines, amides, aldehydes, carboxylic acids, esters, ketones, thiols and sulphides was used as an initial test of the ATB parameters. The average unsigned error (AUE) in the hydration free energy for these molecules was 3.37 kJ mol⁻¹, and 77% of the molecules lay within 5 kJ mol⁻¹ of the experimental value. The largest deviation from experiment was 8.5 kJ mol⁻¹. This result makes clear that, while the ATB parameters perform well for the majority of molecules, certain functional groups lead to systematic deviations from experiment (Fig. 4).
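The summary statistics used here (AUE, the fraction of molecules within 5 kJ mol⁻¹ of experiment, and the largest deviation) can be computed as in the sketch below. The function name `hydration_error_summary` and the input values are hypothetical. Treating "within" as inclusive of the tolerance is also an assumption.

```python
import numpy as np


def hydration_error_summary(calculated, experimental, tolerance=5.0):
    """Summarise deviations between calculated and experimental free energies.

    All values are assumed to be in kJ/mol. "Within tolerance" is taken to
    be inclusive (|error| <= tolerance), which is an assumption.
    """
    calc = np.asarray(calculated, dtype=float)
    expt = np.asarray(experimental, dtype=float)
    if calc.shape != expt.shape:
        raise ValueError("calculated and experimental must have the same shape")

    abs_err = np.abs(calc - expt)
    return {
        "aue": float(abs_err.mean()),                 # average unsigned error
        "fraction_within": float(np.mean(abs_err <= tolerance)),
        "max_deviation": float(abs_err.max()),
    }


# Made-up values in kJ/mol, for illustration only (not ATB results).
calc = [-20.0, -10.0, 5.0, -30.0]
expt = [-22.0, -4.0, 8.0, -29.0]
print(hydration_error_summary(calc, expt))
```

Unlike a signed mean error, the AUE does not allow over- and underestimates to cancel, so it is reported alongside the largest deviation to show how far the worst cases lie from experiment.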
Of the full set of 167 molecules, 92 were taken from previous SAMPL challenges. The AUEs for molecules in the SAMPL0, SAMPL1 and SAMPL2 data sets were 7.2 kJ mol⁻¹, 9.6 kJ mol⁻¹ and 8.5 kJ mol⁻¹, respectively. While approximately 40% still lay within 5 kJ mol⁻¹ of the experimental value, the largest deviation from experiment was 42 kJ mol⁻¹. This is in part a reflection of the uncertainty in the experimental hydration free energies of molecules contained in the SAMPL challenges (uncertainties were as large as 8 kJ mol⁻¹) and in part a reflection of the fact that these molecules contained a range of functional groups not commonly found in biomolecular systems.