Datasets

A QM/ML resource

Molecules

Kristof T. Schütt, Farhad Arbabzadah, Stefan Chmiela, Klaus R. Müller, Alexandre Tkatchenko: Quantum-Chemical Insights from Deep Tensor Neural Networks, Nature Communications 8: 13890, 2017. [DOI]; Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof Schütt, Klaus-Robert Müller: Machine Learning of Accurate Energy-Conserving Molecular Force Fields, Science Advances 3(5): e1603015, 2017. [DOI]
Energies and forces from molecular dynamics trajectories of eight organic molecules. Ab initio molecular dynamics trajectories (133k to 993k frames) of benzene, uracil, naphthalene, aspirin, salicylic acid, malonaldehyde, ethanol, toluene at the DFT/PBE+vdW-TS level of theory at 500 K.
Kristof T. Schütt, Farhad Arbabzadah, Stefan Chmiela, Klaus R. Müller, Alexandre Tkatchenko: Quantum-Chemical Insights from Deep Tensor Neural Networks, Nature Communications 8: 13890, 2017. [DOI]
Energies from molecular dynamics trajectories of 113 isomers of C7H10O2. Ab initio molecular dynamics trajectories (5k frames) at the DFT/PBE level of theory at 500 K.
Raghunathan Ramakrishnan, Mia Hartmann, Enrico Tapavicza, O. Anatole von Lilienfeld: Electronic Spectra from TDDFT and Machine Learning in Chemical Space, Journal of Chemical Physics 143(8): 084111, 2015. [DOI]
22k small organic molecules, in their ground states, with electronic spectra. 21,786 small organic molecules with up to 8 C, O, N, F atoms, saturated with H. Ground state and two lowest vertical electronic excited states (transition energies and oscillator strengths) at LR-TDCAM-B3LYP/def2TZVP, LR-TDPBE0/def2TZVP, LR-TDPBE0/def2SVP, and RI-CC2 levels of theory.
GDB9-14 / QM9 (85 MB)
Raghunathan Ramakrishnan, Pavlo Dral, Matthias Rupp, O. Anatole von Lilienfeld: Quantum Chemistry Structures and Properties of 134 kilo Molecules, Scientific Data 1: 140022, 2014. [DOI]
134k small organic molecules, in their ground states, with energetic, electronic and thermodynamic properties. 133,885 small organic molecules with up to 9 C, O, N, F atoms, saturated with H. Geometries, harmonic frequencies, dipole moments, polarizabilities, energies, enthalpies, and free energies of atomization at the DFT/B3LYP/6-31G(2df,p) level of theory. For a subset of 6,095 constitutional isomers of C7H10O2, energies, enthalpies, and free energies of atomization are provided at the G4MP2 level of theory.
This version fixes the floating-point notation in some coordinates and property values and provides single .xyz files for isomers and molecules. The original version is still available at FigShare.
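The floating-point issue noted above matters when reading the original files with a standard parser. The following is a minimal sketch, assuming the affected values use a Mathematica-style exponent such as `1.5*^-6` (an assumption; the description above does not name the notation). The function name `parse_float` is hypothetical.

```python
def parse_float(token: str) -> float:
    """Parse a number, tolerating a '*^' exponent marker.

    Assumption: problematic values look like '1.5*^-6' instead of '1.5e-6'.
    Tokens already in standard notation pass through unchanged.
    """
    return float(token.strip().replace("*^", "e"))


# Hypothetical tokens for illustration only, not taken from the dataset.
values = [parse_float(t) for t in ["1.5*^-6", "-0.25", "2e3"]]
```

With the corrected files this workaround should not be needed, since ordinary `float()` parsing is then sufficient.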
Grégoire Montavon, Matthias Rupp, Vivekanand Gobre, Alvaro Vazquez-Mayagoitia, Katja Hansen, Alexandre Tkatchenko, Klaus-Robert Müller, O. Anatole von Lilienfeld: Machine Learning of Molecular Electronic Properties in Chemical Compound Space, New Journal of Physics 15(9): 095003, 2013. [DOI]
7k small organic molecules, in their ground states, with 14 combinations of properties and theory levels. 7,211 small organic molecules composed of H, C, N, O, S, Cl, saturated with H, and up to 7 non-H atoms. Molecules relaxed using DFT with the PBE functional. Properties are atomization energy (DFT/PBE0), averaged polarizability (DFT/PBE0, SCS), HOMO and LUMO eigenvalues (GW, DFT/PBE0, ZINDO), and ionization potential, electron affinity, first excitation energy, and frequency of maximal absorption (all ZINDO).
Matthias Rupp, Alexandre Tkatchenko, Klaus-Robert Müller, O. Anatole von Lilienfeld: Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning, Physical Review Letters 108(5): 058301, 2012. [DOI]
7k small organic molecules, close to their ground states, with DFT atomization energies. 7,165 small organic molecules composed of H, C, N, O, S, saturated with H, and up to 7 non-H atoms. Molecules relaxed with an empirical potential. Atomization energies calculated using DFT with hybrid PBE0 functional.

Solids

Christopher Sutton, Luca M. Ghiringhelli, Takenori Yamamoto, Yury Lysogorskiy, Lars Blumenthal, Thomas Hammerschmidt, Jacek R. Golebiowski, Xiangyue Liu, Angelo Ziletti, Matthias Scheffler: Crowd-sourcing materials-science challenges with the NOMAD 2018 Kaggle competition, npj Computational Materials 5:111, 2019. [DOI]
3k Al-Ga-In sesquioxides with energies and band gaps. Relaxed and Vegard's rule geometries, with formation energies and band gaps at the DFT/PBE level of theory, of 3k (Al_x Ga_y In_z)2O3 oxides, x+y+z=1. Contains all structures from the Kaggle challenge training and leaderboard data.
BA10-18 / DFT-10B
Chandramouli Nyshadham, Matthias Rupp, Brayden Bekker, Alexander V. Shapeev, Tim Mueller, Conrad W. Rosenbrock, Gábor Csányi, David W. Wingate, Gus L. W. Hart: Machine-Learned Multi-System Surrogate Models for Materials Prediction, npj Computational Materials 5:51, 2019. [DOI]
Energies of 10 binary alloys with 1,595 structures each. 10 binary alloys (AgCu, AlFe, AlMg, AlNi, AlTi, CoNi, CuFe, CuNi, FeV, NbNi) with 10 different species and all possible FCC, BCC and HCP structures up to 8 atoms in the unit cell. 15,950 structures in total. Lattice parameters from Vegard's rule. DFT/PBE total and formation energies.
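Vegard's rule, used above for the lattice parameters, estimates an alloy's lattice parameter as the composition-weighted average of the pure-element values. The sketch below shows that linear interpolation for a binary alloy. The function name and the numeric values are hypothetical, and how the rule was applied to build this dataset is not specified here.

```python
def vegard_lattice_parameter(a_a: float, a_b: float, x_b: float) -> float:
    """Linearly interpolate the lattice parameter of a binary alloy A(1-x)B(x).

    a_a, a_b: lattice parameters of pure A and pure B (same unit).
    x_b: atomic fraction of B, between 0 and 1.
    """
    if not 0.0 <= x_b <= 1.0:
        raise ValueError("x_b must lie in [0, 1]")
    return (1.0 - x_b) * a_a + x_b * a_b


# Hypothetical lattice parameters in Angstrom, for illustration only.
a_alloy = vegard_lattice_parameter(a_a=4.0, a_b=3.0, x_b=0.25)
```

Geometries obtained this way are not relaxed, so they trade some accuracy for not requiring a DFT relaxation per structure.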
Wojciech J. Szlachta, Albert P. Bartók, Gábor Csányi: Accuracy and Transferability of Gaussian Approximation Potential Models for Tungsten, Physical Review B 90(10): 104108, 2014. [DOI]
158k diverse atomic environments of elemental tungsten. DFT/PBE energies, forces and stresses for tungsten, periodic unit cells in the range of 1-135 atoms, including bcc primitive cell, 128-atom bcc cell, vacancies, low index surfaces, gamma-surfaces, and dislocation cores.
Lance J. Nelson, Vidvuds Ozoliņš, C. Shane Reese, Fei Zhou, Gus L.W. Hart: Cluster Expansion Made Easy with Bayesian Compressive Sensing, Physical Review B 88(15): 155105, 2013. [DOI]
4k DFT calculations for solid AgPd, CuPt and AgPt FCC superstructures. DFT/PBE energy, forces and stresses for cell sizes 1-16 across all compositions including primitive cells.
Felix A. Faber, Alexander Lindmaa, O. Anatole von Lilienfeld, Rickard Armiento: Machine Learning Energies of 2 Million Elpasolite (ABC2D6) Crystals, Physical Review Letters 117(13): 135502, 2016. [DOI]
11k and 12k elpasolite crystals with DFT/PBE formation energies. 11,358 (12 elements, III-VI) and 10,590 (39 elements, I-VIII) elpasolite crystals with relaxed geometries and formation energies computed at the DFT/PBE level of theory.
Linked:
ETIM-17 @ Dryad (479 GB)
Francesco Ricci, Wei Chen, Umut Aydemir, G. Jeffrey Snyder, Gian-Marco Rignanese, Anubhav Jain, Geoffroy Hautier: An Ab Initio Electronic Transport Database for Inorganic Materials, Scientific Data 4: 170085, 2017. [DOI]
Electronic transport properties of 48k inorganic materials. Electronic conductivity, electronic thermal conductivity, Seebeck coefficient, as well as conductivity effective mass and Fermi levels for different dopings, computed at density functional level of theory using Boltzmann transport theory with constant relaxation time.

Liquids

Albert P. Bartók, Michael J. Gillan, Frederick R. Manby, Gábor Csányi: Machine-Learning Approach for One- and Two-Body Corrections to Density Functional Theory: Applications to Molecular and Condensed Water, Physical Review B 88(5): 054104, 2013. [DOI]
Water monomer and dimer geometries, with calculations at the DFT, MP2 and CCSD(T) levels of theory. 7k water monomer geometries corresponding to a grid, with energies and forces at DFT (BLYP, PBE, PBE0) with the AV5Z basis set. Water dimers (O-O distances < 4.5 Å, geometries sampled from a 300 K MD using the AMOEBA force field), with interaction energies and forces with counterpoise correction, using MP2 with AVDZ, AVTZ, AVQZ (10k, 10k, 1k configurations, respectively). 1k water dimers (O-O distances between 4.5 and 6.0 Å, geometries sampled from a 300 K MD using the AMOEBA force field), with interaction energies and forces with counterpoise correction using MP2/AVTZ. 2k water dimers (O-O distances < 4.5 Å, geometries from a 4000 K DFT MD with a weak confining potential), with interaction energies and forces with counterpoise correction using MP2/AVTZ. 800 water dimers (O-O distances < 4.5 Å, geometries from a 300 K MD using the AMOEBA force field), with interaction energies with counterpoise correction using MP2/AVDZ and CCSD(T)/AVDZ.
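The counterpoise-corrected interaction energies mentioned above follow the usual Boys-Bernardi scheme: each monomer energy is evaluated in the full dimer basis set and subtracted from the dimer energy. This is a minimal sketch of that arithmetic. The function name, argument names, and numbers are hypothetical and are not values from the dataset.

```python
def counterpoise_interaction_energy(
    e_dimer: float,
    e_monomer_a_in_dimer_basis: float,
    e_monomer_b_in_dimer_basis: float,
) -> float:
    """Counterpoise-corrected interaction energy of a dimer AB.

    All three energies must come from the same method and the same
    (dimer) basis set, with ghost functions on the absent monomer.
    """
    return e_dimer - e_monomer_a_in_dimer_basis - e_monomer_b_in_dimer_basis


# Hypothetical energies in Hartree, for illustration only.
e_int = counterpoise_interaction_energy(-152.5, -76.25, -76.0)
```

A negative result indicates a bound dimer at that geometry under the chosen method and basis set.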