Datasets

A QM/ML resource

Molecules

Kristof T. Schütt, Farhad Arbabzadah, Stefan Chmiela, Klaus R. Müller, Alexandre Tkatchenko: Quantum-Chemical Insights from Deep Tensor Neural Networks, Nature Communications 8: 13890, 2017. [DOI]
Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof Schütt, Klaus-Robert Müller: Machine Learning of Accurate Energy-Conserving Molecular Force Fields, arXiv 1611.04678, 2017. [URL]
Energies and forces from molecular dynamics trajectories of eight organic molecules. Ab initio molecular dynamics trajectories (133k to 993k frames) of benzene, uracil, naphthalene, aspirin, salicylic acid, malonaldehyde, ethanol, and toluene at the DFT/PBE+vdW-TS level of theory at 500 K.
Kristof T. Schütt, Farhad Arbabzadah, Stefan Chmiela, Klaus R. Müller, Alexandre Tkatchenko: Quantum-Chemical Insights from Deep Tensor Neural Networks, Nature Communications 8: 13890, 2017. [DOI]
Energies from molecular dynamics trajectories of 113 isomers of C7H10O2. Ab initio molecular dynamics trajectories (5k frames) at the DFT/PBE level of theory at 500 K.
Name      Molecules   Properties   Elements
GDB8-15      21,786   4 (4)        H, C, N, O, F
GDB9-14     133,885   16 (2)       H, C, N, O, F
GDB7-13       7,211   8 (4)        H, C, N, O, S, Cl
GDB7-12       7,165   1 (1)        H, C, N, O, S
The number in brackets in the Properties column gives the number of levels of theory used.
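
As a minimal sketch, the overview table above can be held in a small Python structure for filtering by element. The class name `DatasetEntry`, the list `CATALOG`, and the helper `datasets_containing` are hypothetical names introduced for illustration only; the numbers are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the overview table (illustrative structure, not an official API)."""
    name: str
    molecules: int
    properties: int
    theory_levels: int  # the number given in brackets in the table
    elements: frozenset


# Values copied from the overview table.
CATALOG = [
    DatasetEntry("GDB8-15", 21_786, 4, 4, frozenset("H C N O F".split())),
    DatasetEntry("GDB9-14", 133_885, 16, 2, frozenset("H C N O F".split())),
    DatasetEntry("GDB7-13", 7_211, 8, 4, frozenset("H C N O S Cl".split())),
    DatasetEntry("GDB7-12", 7_165, 1, 1, frozenset("H C N O S".split())),
]


def datasets_containing(element):
    """Return the names of all datasets whose element set includes `element`."""
    return [entry.name for entry in CATALOG if element in entry.elements]


if __name__ == "__main__":
    print(datasets_containing("S"))
    print(sum(entry.molecules for entry in CATALOG))
```

For example, `datasets_containing("Cl")` selects only the dataset that covers chlorine.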
GDB8-15
Raghunathan Ramakrishnan, Mia Hartmann, Enrico Tapavicza, O. Anatole von Lilienfeld: Electronic Spectra from TDDFT and Machine Learning in Chemical Space, Journal of Chemical Physics 143(8): 084111, 2015. [DOI]
22k small organic molecules, in their ground states, with electronic spectra. 21,786 small organic molecules with up to 8 C, O, N, F atoms, saturated with H. Ground state and two lowest vertical electronic excited states (transition energies and oscillator strengths) at LR-TDCAM-B3LYP/def2TZVP, LR-TDPBE0/def2TZVP, LR-TDPBE0/def2SVP, and RI-CC2 levels of theory.
GDB9-14
Raghunathan Ramakrishnan, Pavlo Dral, Matthias Rupp, O. Anatole von Lilienfeld: Quantum Chemistry Structures and Properties of 134 kilo Molecules, Scientific Data 1: 140022, 2014. [DOI]
134k small organic molecules, in their ground states, with energetic, electronic and thermodynamic properties. 133,885 small organic molecules with up to 9 C, O, N, F atoms, saturated with H. Geometries, harmonic frequencies, dipole moments, polarizabilities, energies, enthalpies, and free energies of atomization at the DFT/B3LYP/6-31G(2df,p) level of theory. For a subset of 6,095 constitutional isomers of C7H10O2, energies, enthalpies, and free energies of atomization are provided at the G4MP2 level of theory.
GDB7-13 (QM7b)
Grégoire Montavon, Matthias Rupp, Vivekanand Gobre, Alvaro Vazquez-Mayagoitia, Katja Hansen, Alexandre Tkatchenko, Klaus-Robert Müller, O. Anatole von Lilienfeld: Machine learning of molecular electronic properties in chemical compound space, New Journal of Physics 15(9): 095003, 2013. [DOI]
7k small organic molecules, in their ground states, with 14 combinations of properties and theory levels. 7,211 small organic molecules composed of H, C, N, O, S, Cl, saturated with H, and up to 7 non-H atoms. Molecules relaxed using DFT with the PBE functional. Properties are atomization energy (DFT/PBE0), averaged polarizability (DFT/PBE0, SCS), HOMO and LUMO eigenvalues (GW, DFT/PBE0, ZINDO), and ionization potential, electron affinity, first excitation energy, and frequency of maximal absorption (all ZINDO).
GDB7-12 (QM7)
Matthias Rupp, Alexandre Tkatchenko, Klaus-Robert Müller, O. Anatole von Lilienfeld: Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning, Physical Review Letters 108(5): 058301, 2012. [DOI]
7k small organic molecules, close to their ground states, with DFT atomization energies. 7,165 small organic molecules composed of H, C, N, O, S, saturated with H, and up to 7 non-H atoms. Molecules relaxed with an empirical potential. Atomization energies calculated using DFT with hybrid PBE0 functional.
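
Several of the datasets above report atomization energies. As a hedged illustration of what that quantity means, the sketch below computes it as the molecular total energy minus the sum of the free-atom energies. The sign convention (negative for a bound molecule) is an assumption and should be checked against each dataset's documentation; the function name and the numbers in the usage example are hypothetical placeholders, not values from any dataset.

```python
from collections import Counter


def atomization_energy(molecule_energy, atom_symbols, atomic_energies):
    """Molecular energy minus the summed energies of its isolated atoms.

    Assumed convention: the result is negative for a bound molecule.
    All energies must be in the same unit and at the same level of theory.
    """
    counts = Counter(atom_symbols)
    free_atoms = sum(n * atomic_energies[symbol] for symbol, n in counts.items())
    return molecule_energy - free_atoms


if __name__ == "__main__":
    # Placeholder numbers in arbitrary units, chosen only to show the arithmetic.
    toy_atomic_energies = {"H": -1.0, "O": -5.0}
    print(atomization_energy(-10.0, ["H", "H", "O"], toy_atomic_energies))
```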

Solids

Wojciech J. Szlachta, Albert P. Bartók, Gábor Csányi: Accuracy and transferability of Gaussian approximation potential models for tungsten, Physical Review B 90(10): 104108, 2014. [DOI]
158k diverse atomic environments of elemental tungsten. DFT/PBE energies, forces and stresses for tungsten, periodic unit cells in the range of 1-135 atoms, including bcc primitive cell, 128-atom bcc cell, vacancies, low index surfaces, gamma-surfaces, and dislocation cores.
Lance J. Nelson, Vidvuds Ozoliņš, C. Shane Reese, Fei Zhou, Gus L.W. Hart: Cluster expansion made easy with Bayesian compressive sensing, Physical Review B 88(15): 155105, 2013. [DOI]
4k DFT calculations for solid AgPd, CuPt and AgPt FCC superstructures. DFT/PBE energies, forces and stresses for cell sizes 1-16 across all compositions, including primitive cells.

Liquids

Albert P. Bartók, Michael J. Gillan, Frederick R. Manby, Gábor Csányi: Machine-learning approach for one- and two-body corrections to density functional theory: Applications to molecular and condensed water, Physical Review B 88(5): 054104, 2013. [DOI]
Water monomer and dimer geometries, with calculations at the DFT, MP2 and CCSD(T) levels of theory. 7k water monomer geometries corresponding to a grid, with energies and forces at the DFT level (BLYP, PBE and PBE0 functionals) with the AV5Z basis set. Water dimers (O-O distances < 4.5 Å, geometries sampled from a 300 K MD using the AMOEBA force field), with interaction energies and forces with counterpoise correction using MP2 with the AVDZ, AVTZ and AVQZ basis sets (10k, 10k and 1k configurations, respectively). 1k water dimers (O-O distances between 4.5 and 6.0 Å, geometries sampled from a 300 K MD using the AMOEBA force field), with interaction energies and forces with counterpoise correction using MP2/AVTZ. 2k water dimers (O-O distances < 4.5 Å, geometries from a 4000 K DFT MD with a weak confining potential), with interaction energies and forces with counterpoise correction using MP2/AVTZ. 800 water dimers (O-O distances < 4.5 Å, geometries from a 300 K MD using the AMOEBA force field), with interaction energies with counterpoise correction using MP2/AVDZ and CCSD(T)/AVDZ.
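
The dimer entries above quote interaction energies with counterpoise correction. As a minimal sketch of the standard counterpoise scheme (not necessarily the exact workflow of the cited paper), the interaction energy is the dimer energy minus the two monomer energies, with each monomer evaluated at its dimer geometry in the full dimer basis. The function and argument names are hypothetical, and the numbers in the usage example are placeholders rather than dataset values.

```python
def counterpoise_interaction_energy(e_dimer, e_monomer_a_dimer_basis, e_monomer_b_dimer_basis):
    """Counterpoise-corrected interaction energy of a dimer AB.

    Each monomer energy is assumed to be computed at the monomer's geometry
    within the dimer, using the full dimer basis set (ghost functions on the
    partner molecule). All energies must share one unit and level of theory.
    """
    return e_dimer - e_monomer_a_dimer_basis - e_monomer_b_dimer_basis


if __name__ == "__main__":
    # Placeholder energies in arbitrary units, for illustration only.
    print(counterpoise_interaction_energy(-152.50, -76.24, -76.25))
```

Evaluating the monomers in the dimer basis removes the artificial stabilization that each monomer gains from borrowing its partner's basis functions.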