AdsorbML: Accelerating Adsorption Energy Calculations with Machine Learning

AdsorbML is an algorithm for calculating the minimum adsorbate binding energy (adsorption energy) for a unique adsorbate+surface combination. All ML models are obtained from ocp and are used to perform the corresponding structure relaxations.
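The "minimum over configurations" idea can be illustrated with a minimal sketch. It is not the repository's implementation: the function name `minimum_adsorption_energy`, the dictionary layout, the config ids, and the energy values are all hypothetical, and the energies are assumed to have been relaxed already.

```python
from typing import Dict, Tuple


def minimum_adsorption_energy(energies: Dict[str, float]) -> Tuple[str, float]:
    """Return the (config_id, energy) pair with the lowest adsorption energy.

    `energies` maps a configuration id to its relaxed adsorption energy.
    This layout is an assumption made for illustration.
    """
    if not energies:
        raise ValueError("no configurations supplied")
    best_config = min(energies, key=energies.get)
    return best_config, energies[best_config]


# Hypothetical energies (eV) for one adsorbate+surface system;
# these numbers are invented and do not come from OC20-Dense.
example = {"rand1": -0.42, "heur3": -0.87, "rand7": -0.65}
best = minimum_adsorption_energy(example)
print(best)
```

The most negative value is treated as the adsorption energy of the system, which is why a dense sampling of initial configurations matters: a sparse sampling can miss the lowest-energy placement.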

This repository holds the dataset, scripts, and downloads for the accompanying paper.

OC20-Dense Dataset (OC20-Dense)

OC20-Dense contains a dense sampling of adsorbate configurations on ~1,000 randomly selected adsorbate+surface materials from the OC20 dataset. The dataset comprises a total of 85,658 unique input configurations.

The dataset is stored in an LMDB file and is ready to be used in ocp upon download. Additionally, ground truth DFT relaxations are stored in ASE trajectories and provided for all converged systems used for evaluation.

NOTE - ASE trajectories exclude systems that did not converge or had invalid configurations, as defined by the constraints in the AdsorbML manuscript. This leaves 65,073 relaxations available for evaluation, which are provided here.

Splits            Compressed size  Uncompressed size  MD5 checksum (download link)
LMDB              654M             9.8G               0163b0e8c4df6d9c426b875a28d9178a
ASE Trajectories  29G              112G               ee937e5290f8f720c914dc9a56e0281f

The following files are also provided for evaluation and general information:

  • oc20dense_mapping.pkl : Mapping of the LMDB sid to general metadata:
    • system_id: Unique system identifier for an adsorbate, bulk, surface combination.
    • config_id: Unique configuration identifier, where rand and heur correspond to random and heuristic initial configurations, respectively.
    • mpid: Materials Project bulk identifier.
    • miller_idx: 3-tuple of integers indicating the Miller indices of the surface.
    • shift: Shift along the c-direction used to determine the cutoff for the surface (the c-direction follows the Pymatgen nomenclature).
    • top: Boolean indicating whether the chosen surface was at the top or bottom of the originally enumerated surface.
    • adsorbate: Chemical composition of the adsorbate.
    • adsorption_site: A tuple of 3-tuples containing the Cartesian coordinates of each binding adsorbate atom.
  • oc20dense_targets.pkl : DFT adsorption energies across different system and placement ids.
  • oc20dense_compute.pkl : DFT compute, measured as the number of ionic and SCF steps for each evaluated relaxation.
  • oc20dense_ref_energies.pkl : Reference energy used for a specified system_id. This energy includes the relaxed clean surface and the gas phase adsorbate energy to ensure consistency across calculations.
  • oc20dense_tags.pkl : Tag information used for a specified system_id, where 0 = subsurface, 1 = surface, 2 = adsorbate.
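The files listed above could be read with Python's standard `pickle` module. The sketch below is a hedged illustration only: the helper names (`load_pickle`, `count_atoms_by_tag`, `adsorption_energy`) are hypothetical, the internal layout of each .pkl file is assumed rather than documented here, and the example values are invented. The subtraction in `adsorption_energy` assumes the reference energy is the sum described above (relaxed clean surface plus gas phase adsorbate).

```python
import pickle
from collections import Counter
from pathlib import Path

# Tag convention as described for oc20dense_tags.pkl.
TAG_NAMES = {0: "subsurface", 1: "surface", 2: "adsorbate"}


def load_pickle(path):
    """Load a .pkl file. Only unpickle files obtained from a trusted source."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def count_atoms_by_tag(tags):
    """Count atoms per tag for one system (assumes one integer tag per atom)."""
    counts = Counter(int(tag) for tag in tags)
    return {name: counts.get(tag, 0) for tag, name in TAG_NAMES.items()}


def adsorption_energy(total_energy, reference_energy):
    """Relaxed total energy minus the reference energy (assumed convention)."""
    return total_energy - reference_energy


# Hypothetical values for illustration; not taken from the dataset.
example_tags = [0, 0, 1, 1, 1, 2, 2]
print(count_atoms_by_tag(example_tags))
print(adsorption_energy(total_energy=-310.5, reference_energy=-309.0))
```

Inspecting the loaded objects first (for example with `type(obj)` and a look at a few keys) is a safer way to learn the actual layout than relying on the assumptions made here.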

All mappings can be downloaded from the following link: https://dl.fbaipublicfiles.com/opencatalystproject/data/adsorbml/oc20_dense_mappings.tar.gz

MD5 checksums:

c18735c405ce6ce5761432b07287d8d9  oc20_dense_mappings.tar.gz
3e26c3bcef01ccfc9b001931065ea6e6  oc20dense_mapping.pkl
fd589b013b72e62e11a6b2a5bd1d323c  oc20dense_targets.pkl
78d25997e0aaf754df526ab37276bb89  oc20dense_compute.pkl
b07c64158e4bfa5f7b9bf6263753ecc5  oc20dense_ref_energies.pkl
1ba0bc266130f186850f5faa547b6a02  oc20dense_tags.pkl
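A downloaded file can be checked against the checksums above with Python's standard `hashlib` module. This is a minimal sketch: the helper names `md5_of_file` and `verify_md5` are hypothetical, and the loop assumes the files sit in the current working directory.

```python
import hashlib
from pathlib import Path

# Expected checksums, copied from the list above.
EXPECTED_MD5 = {
    "oc20_dense_mappings.tar.gz": "c18735c405ce6ce5761432b07287d8d9",
    "oc20dense_mapping.pkl": "3e26c3bcef01ccfc9b001931065ea6e6",
    "oc20dense_targets.pkl": "fd589b013b72e62e11a6b2a5bd1d323c",
    "oc20dense_compute.pkl": "78d25997e0aaf754df526ab37276bb89",
    "oc20dense_ref_energies.pkl": "b07c64158e4bfa5f7b9bf6263753ecc5",
    "oc20dense_tags.pkl": "1ba0bc266130f186850f5faa547b6a02",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()


# Check whichever of the listed files are present in the current directory.
for name, expected in EXPECTED_MD5.items():
    if Path(name).exists():
        status = "OK" if verify_md5(name, expected) else "MISMATCH"
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat, which matters for the larger archives such as the ASE trajectories.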

Running AdsorbML

Please see the README inside the scripts directory for instructions.

Citing AdsorbML

If you use this codebase in your work, please consider citing:

@article{lan2022adsorbml,
  title={AdsorbML: Accelerating Adsorption Energy Calculations with Machine Learning},
  author={Lan*, Janice and Palizhati*, Aini and Shuaibi*, Muhammed and Wood*, Brandon M and Wander, Brook and Das, Abhishek and Uyttendaele, Matt and Zitnick, C Lawrence and Ulissi, Zachary W},
  journal={arXiv preprint arXiv:2211.16486},
  year={2022}
}
