
m3gnet's Introduction


NOTE: A new implementation based on the Deep Graph Library and PyTorch called the Materials Graph Library (MatGL) has replaced this implementation. This repository has been archived and will no longer be maintained. It will be kept purely as a reference implementation. Users are recommended to use matgl instead.

M3GNet

M3GNet is a new materials graph neural network architecture that incorporates 3-body interactions. A key difference from prior materials graph implementations such as MEGNet is the addition of the atomic coordinates and the 3×3 lattice matrix in crystals, which are necessary for obtaining tensorial quantities such as forces and stresses via auto-differentiation.

As a framework, M3GNet has diverse applications, including:

  • Interatomic potential development. With the same training data, M3GNet performs similarly to state-of-the-art machine learning interatomic potentials (ML-IAPs). However, a key feature of a graph representation is its flexibility to scale to diverse chemical spaces. One of the key accomplishments of M3GNet is the development of a universal IAP that can work across the entire periodic table of the elements by training on relaxations performed in the Materials Project.
  • Surrogate models for property predictions. Like the previous MEGNet architecture, M3GNet can be used to develop surrogate models for property predictions, achieving in many cases accuracies that are better than or similar to those of other state-of-the-art ML models.

For detailed performance benchmarks, please refer to the publication in the Reference section. The API documentation is available via the GitHub Pages site.

System requirements

Inference using the pre-trained models can be run on any standard computer. For model training, the GPU memory needs to exceed 18 GB for a batch size of 32 with the crystal training data. In our work, we used a single RTX 3090 GPU for model training.

Installation

M3GNet can be installed via pip:

pip install m3gnet

You can also download the source directly from GitHub and install from source.
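For example (assuming the canonical repository location on GitHub):

git clone https://github.com/materialsvirtuallab/m3gnet.git
cd m3gnet
pip install .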

Apple Silicon Installation

Apple Silicon (M1, M1 Pro, M1 Max, M1 Ultra) has extremely powerful ML capabilities, but special steps are needed for the installation of tensorflow and other dependencies. Here are the recommended installation steps.

  1. Ensure that you already have Xcode and the Xcode command-line tools installed.

  2. Install Miniconda or Anaconda.

  3. Create a Python 3.9 environment.

    conda create --name m3gnet python=3.9
    conda activate m3gnet
  4. First install tensorflow and its dependencies for Apple Silicon.

    conda install -c apple tensorflow-deps
    pip install tensorflow-macos
  5. If you wish, you can install tensorflow-metal, which helps speed up training. If you encounter strange tensorflow errors, uninstall tensorflow-metal first to see whether that fixes them.

    pip install tensorflow-metal
  6. Install m3gnet but ignore dependencies (otherwise, pip will look for tensorflow).

    pip install --no-deps m3gnet
  7. Install other dependencies, such as pymatgen, manually.

    pip install protobuf==3.20.0 pymatgen ase cython
  8. Once you are done, you can try running pytest m3gnet to see if all tests pass.

Change Log

See change log

Usage

Structure relaxation

An M3GNet universal potential for the periodic table has been developed using data from Materials Project relaxations since 2012. This universal potential can be used to perform structural relaxation of an arbitrary crystal, as follows.

import warnings

from m3gnet.models import Relaxer
from pymatgen.core import Lattice, Structure

for category in (UserWarning, DeprecationWarning):
    warnings.filterwarnings("ignore", category=category, module="tensorflow")

# Init a Mo structure with stretched lattice (DFT lattice constant ~ 3.168)
mo = Structure(Lattice.cubic(3.3), ["Mo", "Mo"], [[0., 0., 0.], [0.5, 0.5, 0.5]])

relaxer = Relaxer()  # This loads the default pre-trained model

relax_results = relaxer.relax(mo, verbose=True)

final_structure = relax_results['final_structure']
final_energy_per_atom = float(relax_results['trajectory'].energies[-1] / len(mo))

print(f"Relaxed lattice parameter is {final_structure.lattice.abc[0]:.3f} Å")
print(f"Final energy is {final_energy_per_atom:.3f} eV/atom")

The output is as follows:

Relaxed lattice parameter is 3.169 Å
Final energy is -10.859 eV/atom

The initial lattice parameter of 3.3 Å was successfully relaxed to 3.169 Å, close to the DFT value of 3.168 Å. The final energy of -10.859 eV/atom is also close to the Materials Project DFT value of -10.8456 eV/atom.

The relaxation takes less than 20 seconds on a single laptop.

The table below provides more comprehensive benchmarks for cubic crystals based on experimental data from Wikipedia and MP DFT data. The Jupyter notebook is in the examples folder. This benchmark is limited to cubic crystals for ease of comparison, since there is only one lattice parameter. Of course, M3GNet is not limited to cubic systems (see the LiFePO4 example).

Material Crystal structure Expt a (Å) MP a (Å) M3GNet a (Å) % error vs Expt % error vs MP
Ac FCC 5.31 5.66226 5.6646 6.68% 0.04%
Ag FCC 4.079 4.16055 4.16702 2.16% 0.16%
Al FCC 4.046 4.03893 4.04108 -0.12% 0.05%
AlAs Zinc blende (FCC) 5.6605 5.73376 5.73027 1.23% -0.06%
AlP Zinc blende (FCC) 5.451 5.50711 5.50346 0.96% -0.07%
AlSb Zinc blende (FCC) 6.1355 6.23376 6.22817 1.51% -0.09%
Ar FCC 5.26 5.64077 5.62745 6.99% -0.24%
Au FCC 4.065 4.17129 4.17431 2.69% 0.07%
BN Zinc blende (FCC) 3.615 3.626 3.62485 0.27% -0.03%
BP Zinc blende (FCC) 4.538 4.54682 4.54711 0.20% 0.01%
Ba BCC 5.02 5.0303 5.03454 0.29% 0.08%
C (diamond) Diamond (FCC) 3.567 3.57371 3.5718 0.13% -0.05%
Ca FCC 5.58 5.50737 5.52597 -0.97% 0.34%
CaVO3 Cubic perovskite 3.767 3.83041 3.83451 1.79% 0.11%
CdS Zinc blende (FCC) 5.832 5.94083 5.9419 1.88% 0.02%
CdSe Zinc blende (FCC) 6.05 6.21283 6.20987 2.64% -0.05%
CdTe Zinc blende (FCC) 6.482 6.62905 6.62619 2.22% -0.04%
Ce FCC 5.16 4.72044 4.71921 -8.54% -0.03%
Cr BCC 2.88 2.87403 2.84993 -1.04% -0.84%
CrN Halite 4.149 - 4.16068 0.28% -
Cs BCC 6.05 6.11004 5.27123 -12.87% -13.73%
CsCl Caesium chloride 4.123 4.20906 4.20308 1.94% -0.14%
CsF Halite 6.02 6.11801 6.1265 1.77% 0.14%
CsI Caesium chloride 4.567 4.66521 4.90767 7.46% 5.20%
Cu FCC 3.597 3.62126 3.61199 0.42% -0.26%
Eu BCC 4.61 4.63903 4.34783 -5.69% -6.28%
EuTiO3 Cubic perovskite 7.81 3.96119 3.92943 -49.69% -0.80%
Fe BCC 2.856 2.84005 2.85237 -0.13% 0.43%
GaAs Zinc blende (FCC) 5.653 5.75018 5.75055 1.73% 0.01%
GaP Zinc blende (FCC) 5.4505 5.5063 5.5054 1.01% -0.02%
GaSb Zinc blende (FCC) 6.0959 6.21906 6.21939 2.03% 0.01%
Ge Diamond (FCC) 5.658 5.76286 5.7698 1.98% 0.12%
HfC0.99 Halite 4.64 4.65131 4.65023 0.22% -0.02%
HfN Halite 4.392 4.53774 4.53838 3.33% 0.01%
InAs Zinc blende (FCC) 6.0583 6.18148 6.25374 3.23% 1.17%
InP Zinc blende (FCC) 5.869 5.95673 5.9679 1.69% 0.19%
InSb Zinc blende (FCC) 6.479 6.63322 6.63863 2.46% 0.08%
Ir FCC 3.84 3.87573 3.87716 0.97% 0.04%
K BCC 5.23 5.26212 5.4993 5.15% 4.51%
KBr Halite 6.6 6.70308 6.70797 1.64% 0.07%
KCl Halite 6.29 6.38359 6.39634 1.69% 0.20%
KF Halite 5.34 5.42398 5.41971 1.49% -0.08%
KI Halite 7.07 7.18534 7.18309 1.60% -0.03%
KTaO3 Cubic perovskite 3.9885 4.03084 4.03265 1.11% 0.05%
Kr FCC 5.72 6.49646 6.25924 9.43% -3.65%
Li BCC 3.49 3.42682 3.41891 -2.04% -0.23%
LiBr Halite 5.5 5.51343 5.51076 0.20% -0.05%
LiCl Halite 5.14 5.15275 5.14745 0.15% -0.10%
LiF Halite 4.03 4.08343 4.08531 1.37% 0.05%
LiI Halite 6.01 6.0257 6.02709 0.28% 0.02%
MgO Halite (FCC) 4.212 4.25648 4.2567 1.06% 0.01%
Mo BCC 3.142 3.16762 3.16937 0.87% 0.06%
Na BCC 4.23 4.17262 4.19684 -0.78% 0.58%
NaBr Halite 5.97 6.0276 6.01922 0.82% -0.14%
NaCl Halite 5.64 5.69169 5.69497 0.97% 0.06%
NaF Halite 4.63 4.69625 4.69553 1.42% -0.02%
NaI Halite 6.47 6.532 6.52739 0.89% -0.07%
Nb BCC 3.3008 3.32052 3.32221 0.65% 0.05%
NbN Halite 4.392 4.45247 4.45474 1.43% 0.05%
Ne FCC 4.43 4.30383 6.95744 57.05% 61.66%
Ni FCC 3.499 3.5058 3.5086 0.27% 0.08%
Pb FCC 4.92 5.05053 5.02849 2.21% -0.44%
PbS Halite (FCC) 5.9362 6.00645 6.01752 1.37% 0.18%
PbTe Halite (FCC) 6.462 6.56567 6.56111 1.53% -0.07%
Pd FCC 3.859 3.95707 3.95466 2.48% -0.06%
Pt FCC 3.912 3.97677 3.97714 1.67% 0.01%
Rb BCC 5.59 5.64416 5.63235 0.76% -0.21%
RbBr Halite 6.89 7.02793 6.98219 1.34% -0.65%
RbCl Halite 6.59 6.69873 6.67994 1.36% -0.28%
RbF Halite 5.65 5.73892 5.76843 2.10% 0.51%
RbI Halite 7.35 7.48785 7.61756 3.64% 1.73%
Rh FCC 3.8 3.8439 3.84935 1.30% 0.14%
ScN Halite 4.52 4.51831 4.51797 -0.04% -0.01%
Si Diamond (FCC) 5.43102 5.46873 5.45002 0.35% -0.34%
Sr FCC 6.08 6.02253 6.04449 -0.58% 0.36%
SrTiO3 Cubic perovskite 3.98805 3.94513 3.94481 -1.08% -0.01%
SrVO3 Cubic perovskite 3.838 3.90089 3.90604 1.77% 0.13%
Ta BCC 3.3058 3.32229 3.31741 0.35% -0.15%
TaC0.99 Halite 4.456 4.48208 4.48225 0.59% 0.00%
Th FCC 5.08 5.04122 5.04483 -0.69% 0.07%
TiC Halite 4.328 4.33565 4.33493 0.16% -0.02%
TiN Halite 4.249 4.25353 4.25254 0.08% -0.02%
V BCC 3.0399 2.99254 2.99346 -1.53% 0.03%
VC0.97 Halite 4.166 4.16195 4.16476 -0.03% 0.07%
VN Halite 4.136 4.12493 4.1281 -0.19% 0.08%
W BCC 3.155 3.18741 3.18826 1.05% 0.03%
Xe FCC 6.2 6.66148 7.06991 14.03% 6.13%
Yb FCC 5.49 5.44925 5.45807 -0.58% 0.16%
ZnO Halite (FCC) 4.58 4.33888 4.33424 -5.37% -0.11%
ZnS Zinc blende (FCC) 5.42 5.45027 5.45297 0.61% 0.05%
ZrC0.97 Halite 4.698 4.72434 4.72451 0.56% 0.00%
ZrN Halite 4.577 4.61762 4.61602 0.85% -0.03%

From the table, it can be observed that almost all M3GNet-relaxed cubic lattice constants are within 1% of the DFT values. The only major errors are for EuTiO3, the iodides (RbI and CsI) and the noble gases. It is quite likely that the Wikipedia value for EuTiO3 is wrong by a factor of 2, and the lower-than-expected accuracy on iodides and noble gases may be due to the paucity of data in these chemical systems. It should be noted that M3GNet is expected to reproduce the MP DFT values, not the experimental values, which are provided only as an additional point of reference.

All relaxations take less than 1 s on an M1 Max Mac.

CLI tool

A simple CLI tool has been written. At present, it supports only structure relaxation with M3GNet, which is immediately useful for quick testing of M3GNet's capabilities. More features will be developed in the future if there is user interest. Examples:

m3g relax --infile Li2O.cif  # Outputs to stdout the relaxed structure.
m3g relax --infile Li2O.cif --outfile Li2O_relaxed.cif  # Outputs to a file the relaxed structure.

Molecular dynamics

Similarly, the universal IAP can be used to perform molecular dynamics (MD) simulations.

from pymatgen.core import Structure, Lattice
from m3gnet.models import MolecularDynamics

# Init a Mo structure with stretched lattice (DFT lattice constant ~ 3.168)
mo = Structure(Lattice.cubic(3.3),
               ["Mo", "Mo"], [[0., 0., 0.], [0.5, 0.5, 0.5]])

md = MolecularDynamics(
    atoms=mo,
    temperature=1000,  # 1000 K
    ensemble='nvt',  # NVT ensemble
    timestep=1,  # timestep of 1 fs
    trajectory="mo.traj",  # save trajectory to mo.traj
    logfile="mo.log",  # log file for MD
    loginterval=100,  # interval for recording the log
)

md.run(steps=1000)

After the run, mo.log contains thermodynamic information similar to the following:

Time[ps]      Etot[eV]     Epot[eV]     Ekin[eV]    T[K]
0.0000         -21.3307     -21.3307       0.0000     0.0
0.1000         -21.3307     -21.3307       0.0000     0.0
0.2000         -21.2441     -21.3087       0.0645   249.7
0.3000         -21.0466     -21.2358       0.1891   731.6
0.4000         -20.9702     -21.1149       0.1447   559.6
0.5000         -20.9380     -21.1093       0.1713   662.6
0.6000         -20.9176     -21.1376       0.2200   850.9
0.7000         -20.9016     -21.1789       0.2773  1072.8
0.8000         -20.8804     -21.1638       0.2835  1096.4
0.9000         -20.8770     -21.0695       0.1925   744.5
1.0000         -20.8908     -21.0772       0.1864   721.2

The MD run takes less than 1 minute.
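The trajectory written to mo.traj can then be inspected with ASE's standard I/O tools. Below is a minimal sketch (file names follow the example above; per-frame energies are available because ASE's trajectory writer stores calculator results alongside each frame):

from ase.io import read

# Read every frame saved in mo.traj
frames = read("mo.traj", index=":")
print(f"{len(frames)} frames saved")

# Potential and kinetic energy of the final frame
last = frames[-1]
print(last.get_potential_energy(), last.get_kinetic_energy())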

Model training

You can also train your own IAP using the PotentialTrainer in m3gnet.trainers. The training dataset can include:

  • structures, a list of pymatgen Structures
  • energies, a list of energy floats with unit eV.
  • forces, a list of n×3 force matrices with unit eV/Å, where n is the number of atoms in each structure. n does not need to be the same for all structures.
  • stresses, a list of 3x3 stress matrices with unit GPa (optional)

For stresses, we use the convention that compressive stress gives negative values. Stresses obtained from VASP calculations (default unit kbar) should be multiplied by -0.1 to work directly with the model.
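As a concrete sketch of that conversion (the variable names below are illustrative only):

import numpy as np

# Stress tensors as reported by VASP, in kbar
vasp_stresses_kbar = [np.eye(3) * 5.0]

# Multiply by -0.1: kbar -> GPa, plus the sign flip to the
# compression-is-negative convention expected by the model
stresses = [-0.1 * s for s in vasp_stresses_kbar]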

We use a validation dataset to select the stopping epoch. The validation dataset has the same format as the training dataset.

If you want to use the official MPF dataset (see the Datasets section below), here are some code examples you can follow to load the dataset and train your own model.

First, load the MPF dataset, which consists of block_0 and block_1:

import pickle as pk
import pandas as pd
import pymatgen  # must be importable so the pickled Structure objects can be loaded

print('loading the MPF dataset 2021')
with open('/yourpath/block_0.p', 'rb') as f:
    data = pk.load(f)

with open('/yourpath/block_1.p', 'rb') as f:
    data2 = pk.load(f)
print('MPF dataset 2021 loaded')

data.update(data2)
df = pd.DataFrame.from_dict(data)

Then, split the data by material id and normalize each total energy to energy per atom (eV/atom):

import numpy as np

# get_id_train_val_test must be supplied by the user (it is not part of m3gnet);
# it should return three disjoint index lists for the train/val/test splits.
id_train, id_val, id_test = get_id_train_val_test(
    total_size=len(data),
    split_seed=42,
    train_ratio=0.90,
    val_ratio=0.05,
    test_ratio=0.05,
    keep_data_order=False,
)

dataset_train, dataset_val, dataset_test = [], [], []
for cnt, (mp_id, item) in enumerate(df.items()):
    if cnt in id_train:
        target = dataset_train
    elif cnt in id_val:
        target = dataset_val
    elif cnt in id_test:
        target = dataset_test
    else:
        continue
    for iid in range(len(item['energy'])):
        n_atoms = len(item['force'][iid])  # number of atoms in this ionic step
        target.append({
            "atoms": item['structure'][iid],
            "energy": item['energy'][iid] / n_atoms,  # eV/atom
            "force": np.array(item['force'][iid]),
        })

print('using %d samples to train, %d samples to evaluate, and %d samples to test'
      % (len(dataset_train), len(dataset_val), len(dataset_test)))

After this, you can use dataset_train for training, dataset_val for validation, and dataset_test for testing.
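For instance, the list-of-dicts datasets built above can be unpacked into the parallel lists expected by the trainer below (a sketch; the MPF loading code above does not include stresses, which are optional):

structures = [d["atoms"] for d in dataset_train]
energies = [d["energy"] for d in dataset_train]  # eV/atom, normalized above
forces = [d["force"] for d in dataset_train]
val_structures = [d["atoms"] for d in dataset_val]
val_energies = [d["energy"] for d in dataset_val]
val_forces = [d["force"] for d in dataset_val]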

A minimal example of model training is shown below.

from m3gnet.models import M3GNet, Potential
from m3gnet.trainers import PotentialTrainer

import tensorflow as tf

m3gnet = M3GNet(is_intensive=False)
potential = Potential(model=m3gnet)

trainer = PotentialTrainer(
    potential=potential, optimizer=tf.keras.optimizers.Adam(1e-3)
)

trainer.train(
    structures,
    energies,
    forces,
    stresses,
    validation_graphs_or_structures=val_structures,
    val_energies=val_energies,
    val_forces=val_forces,
    val_stresses=val_stresses,
    epochs=100,
    fit_per_element_offset=True,
    save_checkpoint=False,
)
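To reuse a trained model later, it can be saved to a directory and reloaded. The sketch below is hedged: M3GNet.from_dir is the loader that appears elsewhere on this page, while the matching save method is an assumption about the API.

m3gnet.save("my_m3gnet_model")  # assumed counterpart of M3GNet.from_dir
reloaded_potential = Potential(model=M3GNet.from_dir("my_m3gnet_model"))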

Matterverse

As an example of the power of M3GNet for materials discovery, we have created a database of yet-to-be-synthesized materials called matterverse.ai. At the time of writing, matterverse.ai has 31 million structures, of which more than 1 million are predicted to be potentially stable. The initial candidate list was generated via combinatorial isovalent ionic substitutions based on the common oxidation states of non-noble-gas elements on 5,283 binary, ternary and quaternary structural prototypes in the 2019 version of the ICSD database.

API docs

The API docs are available here.

Datasets

The training data used to develop the universal M3GNet IAP is MPF.2021.2.8 and is hosted on figshare with DOI 10.6084/m9.figshare.19470599.

Reference

Please cite the following work:

Chen, C., Ong, S.P. A universal graph deep learning interatomic potential for the periodic table. Nat Comput Sci 2, 718–728 (2022). https://doi.org/10.1038/s43588-022-00349-3.

Acknowledgements

This work was primarily supported by the Materials Project, funded by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division under contract no. DE-AC02-05-CH11231: Materials Project program KC23MP. This work used the Expanse supercomputing cluster at the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562.

m3gnet's People

Contributors

alikhamze, chc273, dependabot[bot], dgaines2, gpetretto, janosh, ltalirz, pre-commit-ci[bot], sgbaird, shyuep, vvvlzy, ykq98


m3gnet's Issues

Inconsistent results when predicting on batches with GPU

Hi! I was trying to train m3gnet on a specific set of crystals and noticed that evaluating the trained model gave a 3x different RMSE depending on whether I ran the evaluation on GPU or CPU.

Diving deeper into this, I was able to spot that, when run on GPU with batched inputs, m3gnet predicts somewhat biased energies compared to what it gives for single-structure (batch size = 1) inputs or when running on CPU. I was able to reproduce this bias even with the pre-trained m3gnet. For the pretrained model the bias is not too large, but it is certainly larger than 32-bit floating-point precision. Whether or not tf.function is used (as controlled globally by tf.config.run_functions_eagerly(...)) also affects the result.

Here are some details about my environment:
tensorflow 2.9.2
Driver Version: 515.48.07
CUDA Version: 11.7
GPU: NVIDIA A40

I was not able to reproduce it on a different machine (with different GPU and CUDA).

Here's the code to reproduce:

import tensorflow as tf
import numpy as np
from ase import Atoms
from tqdm import tqdm
import matplotlib.pyplot as plt

from m3gnet.models import M3GNet, Potential
from m3gnet.graph import MaterialGraphBatchEnergyForceStress

for d in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(d, True)

batch_size = 128

structure = Atoms(
    "Cl4Ag4", pbc=True,
    cell=np.diag([5.5956, 5.5956, 5.5956]),
    positions=np.array([
       [0.    , 0.    , 0.    ],
       [2.7978, 2.7978, 0.    ],
       [0.    , 2.7978, 2.7978],
       [2.7978, 0.    , 2.7978],
       [0.    , 2.7978, 0.    ],
       [2.7978, 0.    , 0.    ],
       [0.    , 0.    , 2.7978],
       [5.2459, 1.3989, 4.3715],
    ])
)

def eval_structure_v1(struct):    
    m3gnet = M3GNet.load()
    potential = Potential(m3gnet)

    structure_graph = m3gnet.graph_converter(struct)
    return potential.get_energies(structure_graph.as_tf().as_list()).numpy().squeeze()

def eval_structure_v2(struct):
    m3gnet = M3GNet.load()
    potential = Potential(m3gnet)

    graph = m3gnet.graph_converter(struct)
    pred_e, _ = potential.get_ef_tensor(graph.as_tf().as_list())
    return pred_e.numpy().squeeze()

def eval_structure_v3(struct, batch_size=batch_size):
    m3gnet = M3GNet.load()
    potential = Potential(m3gnet)

    mgb = MaterialGraphBatchEnergyForceStress(
        [m3gnet.graph_converter(struct) for _ in range(batch_size)],
        energies=[0.0] * batch_size,
        forces=[np.zeros((8, 3)) for _ in range(batch_size)],
        stresses=None,
        batch_size=batch_size,
        shuffle=False,
    )

    graph, _ = next(iter(mgb))
    pred_e, _ = potential.get_ef_tensor(graph.as_tf().as_list())
    return pred_e.numpy().squeeze()

print(structure)

results_mean = {}
results_std = {}

bsizes = np.unique(np.round(np.logspace(0, 7, 30, base=2)).astype(int))
output = ""
for device in ["gpu:0", "cpu:0"]:
    for use_tf_func in [True, False]:
        key = f"{device}--useTfFunc:{use_tf_func}"
        results_mean[key] = []
        results_std[key] = []

        output += f"{device}, tf.function {use_tf_func}\n"
        tf.config.run_functions_eagerly(not use_tf_func)
        with tf.device(device):
            output += f"  v1: {eval_structure_v1(structure.copy())}\n"
            output += f"  v2: {eval_structure_v2(structure.copy())}\n"
            e_v3 = eval_structure_v3(structure.copy())
            output += f"  v3: {e_v3.min()}, {e_v3.max()}, {e_v3.mean()}\n"

            for bs in tqdm(bsizes):
                e_v3 = eval_structure_v3(structure.copy(), batch_size=bs)
                results_mean[key].append(e_v3.mean())
                results_std[key].append(e_v3.std())

        output += "\n"

print("", flush=True)
print(output)

for key in results_mean:
    plt.errorbar(x=bsizes, y=results_mean[key], yerr=results_std[key], label=key)
plt.legend()
plt.xlabel("batch size")
plt.ylabel("predicted energy")
plt.savefig("m3gnet_bug.png")

Here's what I see on the plot (predicted energy vs batch size): [plot image omitted]
Printout (note how gpu v3 differs from the rest):

gpu:0, tf.function True
  v1: -10.224565505981445
  v2: -10.224565505981445
  v3: -10.216326713562012, -10.216205596923828, -10.216231346130371

gpu:0, tf.function False
  v1: -10.224565505981445
  v2: -10.224565505981445
  v3: -10.223982810974121, -10.223982810974121, -10.223982810974121

cpu:0, tf.function True
  v1: -10.224571228027344
  v2: -10.224571228027344
  v3: -10.22457218170166, -10.22457218170166, -10.224573135375977

cpu:0, tf.function False
  v1: -10.224571228027344
  v2: -10.224571228027344
  v3: -10.22457218170166, -10.224571228027344, -10.224573135375977

When run on Google Colab (CUDA 11.6, Tesla T4 GPU), the same code gives a much more consistent result: [plot image omitted]
Printout (again, much more consistent):

gpu:0, tf.function True
  v1: -10.224571228027344
  v2: -10.224571228027344
  v3: -10.224573135375977, -10.224570274353027, -10.224571228027344

gpu:0, tf.function False
  v1: -10.224571228027344
  v2: -10.224571228027344
  v3: -10.224573135375977, -10.224569320678711, -10.224571228027344

cpu:0, tf.function True
  v1: -10.22457218170166
  v2: -10.22457218170166
  v3: -10.22457218170166, -10.22457218170166, -10.22457218170166

cpu:0, tf.function False
  v1: -10.22457218170166
  v2: -10.22457218170166
  v3: -10.22457218170166, -10.22457218170166, -10.22457218170166

default stress_weight in M3GNetCalculator 1.0 -> 0.00624?

Relaxer has a default stress_weight of 0.01. If the m3gnet model predicts stresses in GPa, shouldn't the conversion factor from GPa to eV/Å^3 (the ASE unit) be used instead, i.e. stress_weight = 0.0062415? If I use this number, I get much better agreement with numerically calculated stress tensors.

Actually, that should probably be the default stress_weight in M3GNetCalculator (where it is currently 1.0), so that it is also used by MolecularDynamics?

In the ideal case the stress_weight would be stored inside the Potential when training.
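For reference, the GPa-to-eV/Å^3 factor quoted above can be checked directly against ASE's own unit definitions:

from ase.units import GPa

# ASE expresses all quantities in eV and Å, so the constant GPa is
# exactly the conversion factor from GPa to eV/Å^3
print(GPa)  # ~0.0062415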

Energy as an array in M3GNetCalculator

While using a post-processing tool from ASE (a NEB plot) after a calculation with the M3GNetCalculator, I got an error because the M3GNetCalculator sets the energy as a one-element numpy array rather than a simple float:

energy=results[0].numpy().ravel(),

I can fix the problem in post-processing, but since other ASE calculators usually return a float there, I wanted to check whether this is a deliberate choice for the M3GNetCalculator or a bug.
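A minimal post-processing workaround (a sketch, not a library fix) is to coerce the returned value back to a plain float before handing it to the ASE tool:

import numpy as np

energy = atoms.get_potential_energy()  # atoms uses the M3GNetCalculator
energy = float(np.asarray(energy).ravel()[0])  # unwrap the one-element array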

Recovering MatBench results from weights for submission

Dear m3gnet developers,

thanks for providing the pretrained models and code for M3GNet. I was trying to recover the M3GNet predictions from the MatBench training in order to submit your results to https://github.com/materialsproject/matbench. With the code below, I could not recover the exact values reported in the paper (although they are very close).
Can you help me? Are there some differences or conversion problems that I have not found?

import os.path
import os
import requests
import zipfile
import numpy as np
import tensorflow as tf
from m3gnet.models import M3GNet
from matbench.bench import MatbenchBenchmark
from pymatgen.core import Lattice, Structure
import logging
import urllib.request
from m3gnet.layers import AtomRef

download_url = "https://figshare.com/ndownloader/files/35948966"
full_file_path = "weights.zip"
if not os.path.exists(full_file_path):
    r = requests.get(download_url, allow_redirects=True)
    with open(full_file_path, 'wb') as f:
        f.write(r.content)

file_path = "model_weights"
os.makedirs(file_path, exist_ok=True)
archive = zipfile.ZipFile(full_file_path, "r")
archive.extractall(file_path)
archive.close()

subsets_compatible = [
    "matbench_mp_e_form",
    "matbench_mp_gap",
    "matbench_mp_is_metal",
    "matbench_perovskites",
    "matbench_log_kvrh",
    "matbench_log_gvrh",
    "matbench_dielectric",
    "matbench_phonons",
    "matbench_jdft2d"
  ]
units = {"matbench_jdft2d": 1000, "matbench_phonons": 1000}
fit_per_element_offset = False
overwrite = False
mb = MatbenchBenchmark(subset=subsets_compatible, autoload=False)

for idx_task, task in enumerate(mb.tasks):
    task.load()
    for i, fold in enumerate(task.folds):

        tf.keras.backend.clear_session()
        # tf.keras.backend.set_floatx("float64")
        if task.dataset_name in units:
            scale_unit = units[task.dataset_name]
        else:
            scale_unit = 1.0

        predictions_path = "%s_predictions_%s_fold_%s.npy" % (task.dataset_name, "m3gnet", i)
        # model = M3GNet.from_dir("MP-2021.2.8-EFS")
        train_inputs, train_outputs = task.get_train_and_val_data(fold)
        test_inputs = task.get_test_data(fold, include_target=False)
        model = M3GNet.from_dir("model_weights/m3gnet_models/%s/%s/m3gnet" % (task.dataset_name, fold))
        if not os.path.exists(predictions_path) or overwrite:
            if fit_per_element_offset:
                graphs = [model.graph_converter(i) for i in train_inputs]
                ar = AtomRef(max_z=model.n_atom_types + 1)
                ar.fit(graphs, train_outputs)
                model.set_element_refs(ar.property_per_element)
            predictions = model.predict_structures(test_inputs)
            np.save(predictions_path, predictions)
        else:
            predictions = np.load(predictions_path)
            print("loaded predictions: %s" % predictions_path)

        if predictions.shape[-1] == 1:
            predictions = np.squeeze(predictions, axis=-1)

        # train_std = np.std(train_outputs)
        # train_mean = np.mean(train_outputs)
        # predictions = predictions * train_std + train_mean
        predictions = scale_unit * predictions

        # Record data!
        task.record(fold, predictions, params={})

# Save your results
mb.to_file("results.json.gz")

for key, values in mb.scores.items():
    factor = 1000.0 if key in ["matbench_mp_e_form", "matbench_mp_gap", "matbench_perovskites"] else 1.0
    if key not in ["matbench_mp_is_metal"]:
        print(key, factor*values["mae"]["mean"], factor*values["mae"]["std"])
    else:
        print(key, values["rocauc"]["mean"],  values["rocauc"]["std"])

With this script I got:

matbench_mp_e_form 19.48588313396765 0.19626422885988018
matbench_mp_gap 194.9911496262826 6.773441655230544
matbench_mp_is_metal 0.9397143291057087 0.0028206924210185786
matbench_perovskites 32.99242523617427 1.3762758094915537
matbench_log_kvrh 0.07380833213983187 0.010354237232502703
matbench_log_gvrh 0.09913490092743893 0.011390576803829782
matbench_dielectric 0.3168320033220523 0.06471518661133054
matbench_phonons 34.112623907973145 4.56153709897016
matbench_jdft2d 50.06711240004724 11.892898998285041

Only log_kvrh, log_gvrh and gap seem worse than they are supposed to be.

Thanks in advance.

m3gnet ValueError for 1 or 2 atoms with tensorflow==2.10.0

If I try to run a prediction on an H2 molecule or an H atom with the M3GNetCalculator, I get the error below. Is it possible to run m3gnet for such systems, for example to get a smooth dissociation curve for H2?

from m3gnet.models import M3GNetCalculator, Potential, M3GNet
from ase.io import read
from ase import Atoms

def main():
    potential = Potential(M3GNet.load())
    calc = M3GNetCalculator(potential=potential)
    print("H2")
    atoms = Atoms('H2', positions=[(0, 0, -0.35), (0, 0, 0.35)])
    atoms.set_calculator(calc)
    try:
        atoms.get_potential_energy()
    except Exception as e:
        print(e)

    print("H")
    atoms = Atoms('H', positions=[(0, 0, -0.35)])
    atoms.set_calculator(calc)
    try:
        atoms.get_potential_energy()
    except Exception as e:
        print(e)

if __name__ == '__main__':
    main()
    File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/m3gnet/models/_base.py", line 186, in get_efs_tensor  *
        energies = self.get_energies(graph)
    File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/m3gnet/models/_base.py", line 261, in get_energies  *
        return self.model(graph)
    File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_filex91gvqx1.py", line 13, in tf__call
        three_basis = ag__.converted_call(ag__.ld(self).basis_expansion, (ag__.ld(graph),), None, fscope)
    File "/tmp/__autograph_generated_filekf031m2n.py", line 15, in tf__call
        retval_ = ag__.converted_call(ag__.ld(combine_sbf_shf), (ag__.ld(sbf), ag__.ld(shf)), dict(max_n=ag__.ld(self).max_n, max_l=ag__.ld(self).max_l, use_phi=ag__.ld(self).use_phi), fscope)
    File "/tmp/__autograph_generated_filemd4pex2a.py", line 81, in tf__combine_sbf_shf
        ag__.if_stmt((ag__.converted_call(ag__.ld(tf).shape, (ag__.ld(sbf),), None, fscope)[0] == 0), if_body_2, else_body_2, get_state_2, set_state_2, ('do_return', 'retval_'), 2)
    File "/tmp/__autograph_generated_filemd4pex2a.py", line 50, in else_body_2
        expanded_sbf = ag__.converted_call(ag__.ld(tf).repeat, (ag__.ld(sbf),), dict(repeats=ag__.ld(repeats_sbf), axis=1), fscope)

    ValueError: Exception encountered when calling layer "m3g_net" "                 f"(type M3GNet).
    
    in user code:
    
        File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/m3gnet/models/_m3gnet.py", line 253, in call  *
            three_basis = self.basis_expansion(graph)
        File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler  **
            raise e.with_traceback(filtered_tb) from None
        File "/tmp/__autograph_generated_filekf031m2n.py", line 15, in tf__call
            retval_ = ag__.converted_call(ag__.ld(combine_sbf_shf), (ag__.ld(sbf), ag__.ld(shf)), dict(max_n=ag__.ld(self).max_n, max_l=ag__.ld(self).max_l, use_phi=ag__.ld(self).use_phi), fscope)
        File "/tmp/__autograph_generated_filemd4pex2a.py", line 81, in tf__combine_sbf_shf
            ag__.if_stmt((ag__.converted_call(ag__.ld(tf).shape, (ag__.ld(sbf),), None, fscope)[0] == 0), if_body_2, else_body_2, get_state_2, set_state_2, ('do_return', 'retval_'), 2)
        File "/tmp/__autograph_generated_filemd4pex2a.py", line 50, in else_body_2
            expanded_sbf = ag__.converted_call(ag__.ld(tf).repeat, (ag__.ld(sbf),), dict(repeats=ag__.ld(repeats_sbf), axis=1), fscope)
    
        ValueError: Exception encountered when calling layer "spherical_bessel_with_harmonics" "                 f"(type SphericalBesselWithHarmonics).
        
        in user code:
        
            File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/m3gnet/layers/_three_body.py", line 57, in call  *
                return combine_sbf_shf(sbf, shf, max_n=self.max_n, max_l=self.max_l, use_phi=self.use_phi)
            File "/home/hellstrom/.scm/python/AMS2022.2.venv/lib/python3.8/site-packages/m3gnet/utils/_math.py", line 300, in combine_sbf_shf  *
                expanded_sbf = tf.repeat(sbf, repeats=repeats_sbf, axis=1)
        
            ValueError: Dimension 1 in both shapes must be equal, but are 9 and 0. Shapes are [0,9] and [0,0].
        
        
        Call arguments received by layer "spherical_bessel_with_harmonics" "                 f"(type SphericalBesselWithHarmonics):
          • graph=['tf.Tensor(shape=(2, 1), dtype=int32)', 'tf.Tensor(shape=(2,), dtype=float32)', 'None', 'tf.Tensor(shape=(2, 3), dtype=float32)', 'tf.Tensor(shape=(2, 2), dtype=int32)', 'tf.Tensor(shape=(2, 3), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(2,), dtype=float32)', 'tf.Tensor(shape=(1, 3, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(2,), dtype=int32)', 'tf.Tensor(shape=(2,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)']
          • kwargs={'training': 'None'}
    
    
    Call arguments received by layer "m3g_net" "                 f"(type M3GNet):
      • graph=['tf.Tensor(shape=(2, 1), dtype=int32)', 'tf.Tensor(shape=(2, 1), dtype=float32)', 'None', 'tf.Tensor(shape=(2, 3), dtype=float32)', 'tf.Tensor(shape=(2, 2), dtype=int32)', 'tf.Tensor(shape=(2, 3), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(2,), dtype=float32)', 'tf.Tensor(shape=(1, 3, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'None', 'None', 'None', 'tf.Tensor(shape=(2,), dtype=int32)', 'tf.Tensor(shape=(2,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)']
      • kwargs={'training': 'None'}

Get energy above hull through the m3gnet model

Hi,
I noticed on matterverse.ai that the energy-above-hull data can be obtained through the m3gnet vmatbench_0.0.1.0 model. Could you please share example code?
Thanks

Feature request: relaxation under pressure

Dear developers,
Considering that MolecularDynamics with m3gnet supports pressure, would it be possible to add external pressure to the Relaxer? It would be of great benefit for studying systems over a range of pressures.

Model restored from checkpoint predicts wrong energies

Dear m3gnet developers,

I'm getting wrong energies when using a potential recovered from the callbacks folder. The energy MAE after training is ~0.005 eV/atom, but after restoring the model, the potential predicts completely different energies (~10 times smaller). The forces, however, are fine. Am I doing something incorrectly while loading the weights (I am not so familiar with tensorflow)? Could you suggest a solution? The example of loading the weights is below:

import os

import tensorflow as tf
from m3gnet.models import M3GNet, Potential

m3gnet = M3GNet(is_intensive=False)
folder = 'callbacks/'
latest = tf.train.latest_checkpoint(os.path.dirname(folder))
m3gnet.load_weights(latest)
potential = Potential(model=m3gnet)

Potential fails to predict structures without three-body interactions within the three-body cutoff of 4 Å

Hi, I'm reporting that the current M3GNet Potential cannot predict structures that have no three-body interactions within the three-body cutoff of 4 Å. I believe this is not the desired behavior. Part of the error is shown below:
[screenshot of the error message omitted]

The error comes from the below lines:

@tf.function(experimental_relax_shapes=True)
def get_energies(self, graph: List) -> tf.Tensor:
    """
    Get energies from a list repr of a graph.

    Args:
        graph (List): list repr of a graph

    Returns:
    """
    return self.model(graph)

As designed, self.model(graph) actually works fine for getting the energy of structures without three-body interactions, but the @tf.function(experimental_relax_shapes=True) decorator leads to the bug. When the @tf.function decorators are commented out of the get_energies and get_efs_tensor functions, the bug no longer occurs.

@chc273 When available, would you please comment on whether or not we should keep the @tf.function, and if we should keep them, how we can make the Potential behave as designed?

Python 3.11

In materialsproject/pymatgen#2714 it looks like m3gnet can't be installed under Python 3.11.

pip install error (CI run):

Downloading m3gnet-0.0.4.tar.gz (2.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 73.3 MB/s eta 0:00:00
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'error'
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [25 lines of output]
      Traceback (most recent call last):
        File "/opt/hostedtoolcache/Python/3.11.0/x64/lib/python3.11/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 351, in <module>
          main()
        File "/opt/hostedtoolcache/Python/3.11.0/x64/lib/python3.11/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 333, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/hostedtoolcache/Python/3.11.0/x64/lib/python3.11/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-opu_sowf/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 338, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-opu_sowf/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 320, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-opu_sowf/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 335, in run_setup
          exec(code, locals())
        File "<string>", line 60, in <module>
        File "/tmp/pip-build-env-opu_sowf/overlay/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 970, in cythonize
          module_list, module_metadata = create_extension_list(
                                         ^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-opu_sowf/overlay/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 816, in create_extension_list
          for file in nonempty(sorted(extended_iglob(filepattern)), "'%s' doesn't match any files" % filepattern):
        File "/tmp/pip-build-env-opu_sowf/overlay/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 114, in nonempty
          raise ValueError(error_msg)
      ValueError: 'm3gnet/graph/_threebody_indices.pyx' doesn't match any files
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

Cannot read 'structure' of the figshare training data

Hello, I am trying to use your training data from figshare, and I am reading it with your code:

import pickle

with open('block_0.p', 'rb') as f:
    data = pickle.load(f)

with open('block_1.p', 'rb') as f:
    data.update(pickle.load(f))

But when I access a structure in the resulting data dict (e.g. via data['mp-770840']['structure']), it reports the following error:

File ~/mambaforge/lib/python3.10/site-packages/pymatgen/core/structure.py:2244, in IStructure.__repr__(self)
   2243 def __repr__(self):
-> 2244     outs = ["Structure Summary", repr(self.lattice)]
   2245     if self._charge:
   2246         if self._charge >= 0:

File ~/mambaforge/lib/python3.10/site-packages/pymatgen/core/lattice.py:940, in Lattice.__repr__(self)
    931 def __repr__(self):
    932     outs = [
    933         "Lattice",
    934         "    abc : " + " ".join(map(repr, self.lengths)),
    935         " angles : " + " ".join(map(repr, self.angles)),
    936         " volume : " + repr(self.volume),
    937         "      A : " + " ".join(map(repr, self._matrix[0])),
    938         "      B : " + " ".join(map(repr, self._matrix[1])),
    939         "      C : " + " ".join(map(repr, self._matrix[2])),
--> 940         "    pbc : " + " ".join(map(repr, self._pbc)),
    941     ]
    942     return "\n".join(outs)

AttributeError: 'Lattice' object has no attribute '_pbc'

The version of my pymatgen is 2023.1.9.

Could you help me to solve this problem? Thanks.
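One possible workaround (an untested sketch; it assumes the failure is caused only by the _pbc attribute missing from lattices pickled with an older pymatgen) is to patch the attribute after loading:

for entry in data.values():
    for structure in entry['structure']:
        if not hasattr(structure.lattice, '_pbc'):
            # crystals in this dataset are periodic in all three directions
            structure.lattice._pbc = (True, True, True)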

Reason for small training data?

Quoting from the paper:

Our initial dataset comprises a sampling of the energies, forces and stresses from the first and middle ionic steps of the first relaxation and the last step of the second relaxation for calculations in the Materials Project database that contains “GGA Structure Optimization” or “GGA+U Structure Optimization” task types as of Feb 8, 2021. [...] In total, this “MPF.2021.2.8” dataset contains 187,687 ionic steps of 62,783 compounds, ...

What's the reason for not using the relaxation trajectories for all of the ~140 k MP structures?

Transfer learning with m3gnet

Is it possible to fine-tune the MP-trained model on custom datasets? My first instinct was to supply the trained m3gnet model to the potential trainer but that does not seem to work. Thanks!
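For reference, the attempted approach can be sketched using only APIs that appear elsewhere on this page (whether this fine-tuning works as intended is exactly the open question here):

import tensorflow as tf
from m3gnet.models import M3GNet, Potential
from m3gnet.trainers import PotentialTrainer

# Start from the pre-trained universal IAP instead of a fresh M3GNet
potential = Potential(M3GNet.load())
trainer = PotentialTrainer(potential=potential, optimizer=tf.keras.optimizers.Adam(1e-4))
# then trainer.train(...) with the custom dataset, as in the Model training section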

Derivatives of forces are NaN when there's a right angle in a structure

The problem occurs in the spherical harmonics calculation:

import tensorflow as tf
from m3gnet.utils import SphericalHarmonicsFunction

sph = SphericalHarmonicsFunction(3, use_phi=False)

def Ylm_2nd_der(x):
    with tf.GradientTape() as t0:
        with tf.GradientTape() as t:
            t.watch(x)
            t0.watch(x)
            y = sph([x], [0.0])

        dydx = t.gradient(y, x)
    d2ydx2 = t0.gradient(dydx, x)

    return d2ydx2

for costheta in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    print(Ylm_2nd_der(tf.convert_to_tensor(costheta, dtype="float32")))

Output:

tf.Tensor(1.8923494, shape=(), dtype=float32)
tf.Tensor(1.8923494, shape=(), dtype=float32)
tf.Tensor(nan, shape=(), dtype=float32)
tf.Tensor(1.8923494, shape=(), dtype=float32)
tf.Tensor(1.8923494, shape=(), dtype=float32)

It happens due to casting to complex here:

costheta = tf.cast(costheta, dtype=tf.dtypes.complex64)

Then, the actual harmonics functions involve a pow with a (DT_COMPLEX64, DT_COMPLEX64) -> DT_COMPLEX64 signature, which probably uses the polar form, hence the undefined derivative at 0.

I also noticed that in some cases tf.function seems to optimize it out and returns the correct value of the derivative, but it is very unpredictable when this optimization takes place (e.g., it may happen in t.gradient but not in t.jacobian).

A possible workaround would be to avoid casting to complex when self.use_phi is False.

M3GNet crashes on some materials in the Materials Project; here are the mp-ids. Any hints on how to fix them?

M3GNet does not work
on 'mp-20071', 'mp-21462', 'mp-1182832' with element Eu;
on 'mp-1012110', 'mp-949029', 'mp-1055940', 'mp-573579', 'mp-639727', 'mp-1183897', 'mp-672241', 'mp-1', 'mp-11832', 'mp-1096915', 'mp-1007976', 'mp-1184151', 'mp-1183694', 'mp-3' with element Cs;
on 'mp-867126', 'mp-1179802', 'mp-974620', 'mp-975204', 'mp-1063817', 'mp-70', 'mp-604321', 'mp-1186899', 'mp-975129', 'mp-12628', 'mp-640416', 'mp-569688', 'mp-639736', 'mp-1186853', 'mp-1018045', 'mp-975519', 'mp-656615', 'mp-1179832', 'mp-639755', 'mp-1179656' with element Rb;
on 'mp-867202', 'mp-1056418', 'mp-76', 'mp-1187073', 'mp-95', 'mp-139' with element Sr;
and so on...

Any ideas on how to solve these? Thanks in advance.

Is only the isolated-atom calculation difficult for m3gnet?

Thank you for the great repository.
Like #37, I get an error when calculating the energy of an H atom. However, after calculating the H2 molecule energy, I no longer get the error.
If possible, could you tell me why?

from m3gnet.models import M3GNetCalculator, Potential, M3GNet
from ase.io import read
from ase import Atoms

def main():
    potential = Potential(M3GNet.load())
    calc = M3GNetCalculator(potential=potential)

    print("H")
    atoms = Atoms('H', positions=[(0, 0, -0.35)])
    atoms.set_calculator(calc)
    try:
        print(atoms.get_potential_energy())
    except Exception as e:
        print(e)

if __name__ == '__main__':
    main()
H
in user code:

    File "/usr/local/lib/python3.9/site-packages/m3gnet/models/_base.py", line 186, in get_efs_tensor  *
        energies = self.get_energies(graph)
    File "/usr/local/lib/python3.9/site-packages/m3gnet/models/_base.py", line 261, in get_energies  *
        return self.model(graph)
    File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_fileot8cd98z.py", line 37, in tf__call
        ag__.for_stmt(ag__.converted_call(ag__.ld(range), (ag__.ld(self).n_blocks,), None, fscope), None, loop_body, get_state, set_state, ('g',), {'iterate_names': 'i'})
    File "/tmp/__autograph_generated_fileot8cd98z.py", line 35, in loop_body
        g = ag__.converted_call(ag__.ld(self).graph_layers[ag__.ld(i)], (ag__.ld(g),), None, fscope)
    File "/tmp/__autograph_generated_filetgzexm7i.py", line 16, in tf__call
        out = ag__.converted_call(ag__.ld(self).state_network, (ag__.converted_call(ag__.ld(self).atom_network, (ag__.converted_call(ag__.ld(self).bond_network, (ag__.ld(graph),), None, fscope),), None, fscope),), None, fscope)
    File "/tmp/__autograph_generated_file786n0fz6.py", line 18, in tf__call
        bonds = ag__.converted_call(ag__.ld(self).update_bonds, (ag__.ld(graph),), None, fscope)
    File "/tmp/__autograph_generated_filez4rbuqc7.py", line 40, in tf__update_bonds
        retval_ = ag__.converted_call(ag__.ld(self).update_func, (ag__.ld(concat),), None, fscope) * ag__.converted_call(ag__.ld(self).weight_func, (ag__.ld(graph)[ag__.ld(Index).BOND_WEIGHTS],), None, fscope) + ag__.ld(graph)[ag__.ld(Index).BONDS]
    File "/tmp/__autograph_generated_file4fv31nqr.py", line 20, in tf__call
        retval_ = ag__.converted_call(ag__.ld(self).pipe.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope) * ag__.converted_call(ag__.ld(self).gate.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope)
    File "/tmp/__autograph_generated_filehl5vv6vt.py", line 31, in tf__call
        ag__.for_stmt(ag__.ld(self).layers, None, loop_body, get_state, set_state, ('out',), {'iterate_names': 'layer'})
    File "/tmp/__autograph_generated_filehl5vv6vt.py", line 29, in loop_body
        out = ag__.converted_call(ag__.ld(layer), (ag__.ld(out),), None, fscope)

    ValueError: Exception encountered when calling layer "m3g_net" (type M3GNet).
    
    in user code:
    
        File "/usr/local/lib/python3.9/site-packages/m3gnet/models/_m3gnet.py", line 259, in call  *
            g = self.graph_layers[i](g)
        File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
            raise e.with_traceback(filtered_tb) from None
        File "/tmp/__autograph_generated_filetgzexm7i.py", line 16, in tf__call
            out = ag__.converted_call(ag__.ld(self).state_network, (ag__.converted_call(ag__.ld(self).atom_network, (ag__.converted_call(ag__.ld(self).bond_network, (ag__.ld(graph),), None, fscope),), None, fscope),), None, fscope)
        File "/tmp/__autograph_generated_file786n0fz6.py", line 18, in tf__call
            bonds = ag__.converted_call(ag__.ld(self).update_bonds, (ag__.ld(graph),), None, fscope)
        File "/tmp/__autograph_generated_filez4rbuqc7.py", line 40, in tf__update_bonds
            retval_ = ag__.converted_call(ag__.ld(self).update_func, (ag__.ld(concat),), None, fscope) * ag__.converted_call(ag__.ld(self).weight_func, (ag__.ld(graph)[ag__.ld(Index).BOND_WEIGHTS],), None, fscope) + ag__.ld(graph)[ag__.ld(Index).BONDS]
        File "/tmp/__autograph_generated_file4fv31nqr.py", line 20, in tf__call
            retval_ = ag__.converted_call(ag__.ld(self).pipe.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope) * ag__.converted_call(ag__.ld(self).gate.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope)
        File "/tmp/__autograph_generated_filehl5vv6vt.py", line 31, in tf__call
            ag__.for_stmt(ag__.ld(self).layers, None, loop_body, get_state, set_state, ('out',), {'iterate_names': 'layer'})
        File "/tmp/__autograph_generated_filehl5vv6vt.py", line 29, in loop_body
            out = ag__.converted_call(ag__.ld(layer), (ag__.ld(out),), None, fscope)
    
        ValueError: Exception encountered when calling layer "graph_network_layer" (type GraphNetworkLayer).
        
        in user code:
        
            File "/usr/local/lib/python3.9/site-packages/m3gnet/layers/_gn.py", line 52, in call  *
                out = self.state_network(self.atom_network(self.bond_network(graph)))
            File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
                raise e.with_traceback(filtered_tb) from None
            File "/tmp/__autograph_generated_file786n0fz6.py", line 18, in tf__call
                bonds = ag__.converted_call(ag__.ld(self).update_bonds, (ag__.ld(graph),), None, fscope)
            File "/tmp/__autograph_generated_filez4rbuqc7.py", line 40, in tf__update_bonds
                retval_ = ag__.converted_call(ag__.ld(self).update_func, (ag__.ld(concat),), None, fscope) * ag__.converted_call(ag__.ld(self).weight_func, (ag__.ld(graph)[ag__.ld(Index).BOND_WEIGHTS],), None, fscope) + ag__.ld(graph)[ag__.ld(Index).BONDS]
            File "/tmp/__autograph_generated_file4fv31nqr.py", line 20, in tf__call
                retval_ = ag__.converted_call(ag__.ld(self).pipe.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope) * ag__.converted_call(ag__.ld(self).gate.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope)
            File "/tmp/__autograph_generated_filehl5vv6vt.py", line 31, in tf__call
                ag__.for_stmt(ag__.ld(self).layers, None, loop_body, get_state, set_state, ('out',), {'iterate_names': 'layer'})
            File "/tmp/__autograph_generated_filehl5vv6vt.py", line 29, in loop_body
                out = ag__.converted_call(ag__.ld(layer), (ag__.ld(out),), None, fscope)
        
            ValueError: Exception encountered when calling layer "concat_atoms" (type ConcatAtoms).
            
            in user code:
            
                File "/usr/local/lib/python3.9/site-packages/m3gnet/layers/_bond.py", line 45, in call  *
                    bonds = self.update_bonds(graph)
                File "/usr/local/lib/python3.9/site-packages/m3gnet/layers/_bond.py", line 161, in update_bonds  *
                    return self.update_func(concat) * self.weight_func(graph[Index.BOND_WEIGHTS]) + graph[Index.BONDS]
                File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
                    raise e.with_traceback(filtered_tb) from None
                File "/tmp/__autograph_generated_file4fv31nqr.py", line 20, in tf__call
                    retval_ = ag__.converted_call(ag__.ld(self).pipe.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope) * ag__.converted_call(ag__.ld(self).gate.call, (ag__.ld(inputs),), dict(**ag__.ld(kwargs)), fscope)
                File "/tmp/__autograph_generated_filehl5vv6vt.py", line 31, in tf__call
                    ag__.for_stmt(ag__.ld(self).layers, None, loop_body, get_state, set_state, ('out',), {'iterate_names': 'layer'})
                File "/tmp/__autograph_generated_filehl5vv6vt.py", line 29, in loop_body
                    out = ag__.converted_call(ag__.ld(layer), (ag__.ld(out),), None, fscope)
            
                ValueError: Exception encountered when calling layer "gated_mlp_4" (type GatedMLP).
                
                in user code:
                
                    File "/usr/local/lib/python3.9/site-packages/m3gnet/layers/_core.py", line 229, in call  *
                        return self.pipe.call(inputs, **kwargs) * self.gate.call(inputs, **kwargs)
                    File "/usr/local/lib/python3.9/site-packages/m3gnet/layers/_core.py", line 38, in call  *
                        out = layer(out)
                    File "/usr/local/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
                        raise e.with_traceback(filtered_tb) from None
                    File "/usr/local/lib/python3.9/site-packages/keras/layers/core/dense.py", line 141, in build
                        raise ValueError('The last dimension of the inputs to a Dense layer '
                
                    ValueError: The last dimension of the inputs to a Dense layer should be defined. Found None. Full input shape received: (0, None)
                
                
                Call arguments received by layer "gated_mlp_4" (type GatedMLP):
                  • inputs=tf.Tensor(shape=(0, None), dtype=float32)
                  • kwargs={'training': 'None'}
            
            
            Call arguments received by layer "concat_atoms" (type ConcatAtoms):
              • graph=['tf.Tensor(shape=(1, 64), dtype=float32)', 'tf.Tensor(shape=(None, 64), dtype=float32)', 'None', 'tf.Tensor(shape=(None, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'tf.Tensor(shape=(0, 3), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(None, 3), dtype=float32)', 'tf.Tensor(shape=(1, 3, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)']
              • kwargs={'training': 'None'}
        
        
        Call arguments received by layer "graph_network_layer" (type GraphNetworkLayer):
          • graph=['tf.Tensor(shape=(1, 64), dtype=float32)', 'tf.Tensor(shape=(None, 64), dtype=float32)', 'None', 'tf.Tensor(shape=(None, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'tf.Tensor(shape=(0, 3), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(None, 3), dtype=float32)', 'tf.Tensor(shape=(1, 3, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(0,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)']
          • kwargs={'training': 'None'}
    
    
    Call arguments received by layer "m3g_net" (type M3GNet):
      • graph=['tf.Tensor(shape=(1, 1), dtype=int32)', 'tf.Tensor(shape=(0, 1), dtype=float32)', 'None', 'tf.Tensor(shape=(None, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'tf.Tensor(shape=(0, 3), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(0,), dtype=float32)', 'tf.Tensor(shape=(1, 3, 3), dtype=float32)', 'tf.Tensor(shape=(0, 2), dtype=int32)', 'None', 'None', 'None', 'tf.Tensor(shape=(0,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)', 'tf.Tensor(shape=(1,), dtype=int32)']
      • kwargs={'training': 'None'}

The following, however, runs without errors:

from m3gnet.models import M3GNetCalculator, Potential, M3GNet
from ase import Atoms

def main():
    # Load the pre-trained universal potential and wrap it in an ASE calculator.
    potential = Potential(M3GNet.load())
    calc = M3GNetCalculator(potential=potential)

    # An H2 molecule with a 0.7 Å bond.
    print("H2")
    atoms = Atoms('H2', positions=[(0, 0, -0.35), (0, 0, 0.35)])
    atoms.set_calculator(calc)
    try:
        print(atoms.get_potential_energy())
    except Exception as e:
        print(e)

    # A single, isolated H atom.
    print("H")
    atoms = Atoms('H', positions=[(0, 0, -0.35)])
    atoms.set_calculator(calc)
    try:
        print(atoms.get_potential_energy())
    except Exception as e:
        print(e)

if __name__ == '__main__':
    main()
H2
[-6.506641]
H
[-1.1176894]

Reproduce IAP training results

Hi,
I'm trying to reproduce the energy, force, and stress MAE results reported in the article (0.035 eV/atom, 0.072 eV/Å and 0.41 GPa), but I cannot reach those results so far.

Here are my questions:

The Adam optimizer was used with initial learning rate of 0.001, with a cosine decay to
1% of the original value in 100 epochs. During the optimization, the validation metric values
were used to monitor the model convergence, and training was stopped if the validation
metric did not improve for 200 epochs.

  1. The learning rate decays from 1e-3 to 1e-5 over the first 100 epochs. What is the learning rate after epoch 100: does it continue to decay, or is it held at 1e-5?

  2. Approximately how many epochs does the model take to converge?

I would appreciate it if you could provide more training details.

Thanks in advance!
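
For reference, the quoted schedule maps naturally onto tf.keras's CosineDecay. This is a sketch under the assumption that the authors used that scheduler; steps_per_epoch below is a hypothetical value that depends on dataset size and batch size. Note that CosineDecay holds the rate at its alpha floor once decay_steps is exhausted, which would mean the learning rate stays at 1e-5 after epoch 100.

import tensorflow as tf

steps_per_epoch = 1000  # hypothetical; roughly len(training_set) // batch_size
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,          # initial learning rate from the paper
    decay_steps=100 * steps_per_epoch,   # decay over 100 epochs
    alpha=0.01,                          # floor = 1% of the initial value (1e-5)
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)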

Unit of stress for training

First of all, thank you for the great repository.
We wanted to try some transfer learning from your original model and were wondering what the actual units of your stress data and ours are. VASP normally outputs stress in kbar (= 0.1 GPa), and if we understand the code correctly, you convert the stress output of the model from eV/Å³ to GPa. Consequently, the training stress data should be multiplied by -0.1, but in your documentation you write to multiply by -1. We also confirmed for some structures of mp-1168 that the stress values in your dataset are in kbar by repeating the calculation (although we were not able to reproduce the calculation for the first structure in the list; we get quite different forces and stresses for that structure).
Thank you very much for your help.
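
For reference, here is a sketch of the conversions at issue. The factors themselves are standard physical constants; whether the training data needs the extra 0.1 (kbar to GPa) on top of the sign flip is exactly the question raised above.

# Stress unit conversions (standard factors, not repository code).
EV_PER_A3_TO_GPA = 160.21766  # 1 eV/Å³ in GPa

def vasp_kbar_to_gpa(stress_kbar):
    """VASP reports stress in kbar; 1 kbar = 0.1 GPa."""
    return 0.1 * stress_kbar

def ev_a3_to_gpa(stress_ev_a3):
    """Convert a model stress from eV/Å³ to GPa."""
    return EV_PER_A3_TO_GPA * stress_ev_a3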

Can this model use multiple CPUs for relaxing?

When I use this code, the CPU load is the same whether I allocate 32 cores or 1 core. I think the code runs on a single CPU core.

  1. Please let me know how to use multi-threading in your code (see the sketch after the snippet below).
  2. Is relaxation faster on a GPU?

from m3gnet.models import Relaxer

relaxer = Relaxer()
relax_results = relaxer.relax(candidate_structure[0], verbose=True, steps=2000, interval=20)

Thank you for creating such an amazing code.
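
As a sketch of one thing to try: TensorFlow's thread-pool sizes can be configured before any ops execute. Whether this actually speeds up relaxation is an assumption on my part, since the relaxation loop may be bound by Python/ASE rather than by TensorFlow.

import tensorflow as tf

# Must run before TensorFlow executes any op.
tf.config.threading.set_intra_op_parallelism_threads(32)  # threads within one op
tf.config.threading.set_inter_op_parallelism_threads(2)   # ops run in parallel

from m3gnet.models import Relaxer
relaxer = Relaxer()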

Error in loss function during training

I am trying to train m3gnet with my own dataset of structures, energies, and forces. For some reason, training crashes immediately because the loss function raises the following error:

Traceback (most recent call last):
  File "train.py", line 41, in <module>
    trainer.train(
  File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/m3gnet/trainers/_potential.py", line 210, in train
    loss_val, grads, pred_list, emae, fmae, smae = train_one_step(
  File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_fileeztfkyaw.py", line 14, in tf__train_one_step
    (loss_val, emae, fmae, smae) = ag__.converted_call(ag__.ld(_loss), (ag__.ld(target_list), ag__.ld(pred_list), ag__.ld(graph_list)[ag__.ld(Index).N_ATOMS]), None, fscope)
  File "/tmp/__autograph_generated_fileac34350v.py", line 19, in tf___loss
    e_loss = ag__.converted_call(ag__.ld(_flat_loss), (ag__.ld(e_target), ag__.ld(e_pred)), None, fscope)
  File "/tmp/__autograph_generated_filema59lezf.py", line 13, in tf___flat_loss
    retval_ = ag__.converted_call(ag__.ld(loss), (ag__.converted_call(ag__.ld(tf).reshape, (ag__.ld(x), (-1,)), None, fscope), ag__.converted_call(ag__.ld(tf).reshape, (ag__.ld(y), (-1,)), None, fscope)), None, fscope)
  File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/keras/losses.py", line 1486, in mean_squared_error
    return backend.mean(tf.math.squared_difference(y_pred, y_true), axis=-1)
ValueError: in user code:

    File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/m3gnet/trainers/_potential.py", line 192, in train_one_step  *
        loss_val, emae, fmae, smae = _loss(target_list, pred_list, graph_list[Index.N_ATOMS])
    File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/m3gnet/trainers/_potential.py", line 139, in _loss  *
        e_loss = _flat_loss(e_target, e_pred)
    File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/m3gnet/trainers/_potential.py", line 128, in _flat_loss  *
        return loss(tf.reshape(x, (-1,)), tf.reshape(y, (-1,)))
    File "/home/rapplet/.conda/envs/cent7/2020.11-py38/my_tf_env/lib/python3.8/site-packages/keras/losses.py", line 1486, in mean_squared_error
        return backend.mean(tf.math.squared_difference(y_pred, y_true), axis=-1)

    ValueError: Dimensions must be equal, but are 32 and 1024 for '{{node SquaredDifference}} = SquaredDifference[T=DT_FLOAT](Reshape_1, Reshape)' with input shapes: [32], [1024].

I have tried different batch sizes and always get the same error; the first dimension always equals the batch size and the second is always the batch size squared.

I have provided the script I use to call the training function below:

from pymatgen.core import Structure
from m3gnet.models import M3GNet, Potential
from m3gnet.trainers import PotentialTrainer
import tensorflow as tf
import json
import warnings

warnings.filterwarnings("ignore")

# Load training and validation data (structures stored as pymatgen dicts).
with open('traindict.json') as trainf:
    traindict = json.load(trainf)
with open('valdict.json') as valf:
    valdict = json.load(valf)

energies = traindict['energies']
forces = traindict['forces']
structures = [Structure.from_dict(s) for s in traindict['structures']]

val_energies = valdict['energies']
val_forces = valdict['forces']
val_structures = [Structure.from_dict(v) for v in valdict['structures']]

# Start from the pre-trained universal model.
m3gnet = M3GNet.load()
potential = Potential(model=m3gnet)

trainer = PotentialTrainer(
    potential=potential, optimizer=tf.keras.optimizers.Adam(1e-3)
)
callbacks = [tf.keras.callbacks.CSVLogger('./training.log', separator=',', append=False)]

trainer.train(
    structures,
    energies,
    forces,
    validation_graphs_or_structures=val_structures,
    val_energies=val_energies,
    val_forces=val_forces,
    epochs=2000,
    fit_per_element_offset=False,
    batch_size=1024,
    early_stop_patience=200,
    save_checkpoint=True,
    callbacks=callbacks,
)

potential.model.save('./gstpot')
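
One thing worth checking (an assumption on my part, not a confirmed fix): the trainer flattens targets and predictions before computing the MSE, so a batch-size-vs-batch-size-squared mismatch suggests the energy targets may be nested (e.g. one single-element list per structure) rather than a flat list of scalars. A quick sanity check:

import numpy as np

# Hypothetical check: there should be exactly one scalar energy per structure.
energies_arr = np.asarray(traindict['energies'], dtype=float)
assert energies_arr.ndim == 1, f"expected flat energies, got shape {energies_arr.shape}"
assert len(energies_arr) == len(structures)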

segmentation issue with energy calculation for some structures

I used the code to predict the energy of the following CIF structure and got a fatal segmentation fault. It seems to be caused by the structure having no bonds. Is there any way for your code to detect this and raise an exception instead of crashing with a segmentation fault?
This seems to be a limitation of the model.
I cannot avoid the crash even with try/except; it kills my process. The code works fine with other, regular structures that have bonds.

# generated using pymatgen

data_CaS
_symmetry_space_group_name_H-M 'P 1'
_cell_length_a 5.77562248
_cell_length_b 5.77562248
_cell_length_c 11.50881363
_cell_angle_alpha 90.00000000
_cell_angle_beta 90.00000000
_cell_angle_gamma 90.00000000
_symmetry_Int_Tables_number 1
_chemical_formula_structural CaS
_chemical_formula_sum 'Ca4 S4'
_cell_volume 383.90887645
_cell_formula_units_Z 4
loop_
 _symmetry_equiv_pos_site_id
 _symmetry_equiv_pos_as_xyz
 1 'x, y, z'
loop_

_atom_site_type_symbol
_atom_site_label
_atom_site_symmetry_multiplicity
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
Ca Ca0 1 0.00000000 0.50000000 0.25000000 1
Ca Ca1 1 0.50000000 0.00000000 0.25000000 1
Ca Ca2 1 0.50000000 1.00000000 0.75000000 1
Ca Ca3 1 1.00000000 0.50000000 0.75000000 1
S S4 1 0.00000000 0.00000000 0.00000000 1
S S5 1 0.50000000 0.50000000 0.50000000 1
S S6 1 0.00000000 0.00000000 0.50000000 1
S S7 1 0.50000000 0.50000000 1.00000000 1

Here is my code:

from pymatgen.io.cif import CifParser
from m3gnet.models import M3GNet

# Parse the CIF above and predict its formation energy with the model
# loaded from the indicated directory.
parser = CifParser('74.cif')
struct = parser.get_structures()[0]
e_form_folder = 'm3gnet_models/MP-2021.2.8-EFS'
m3gnet_e_form = M3GNet.from_dir(e_form_folder)
e_form_predict = m3gnet_e_form.predict_structure(struct)
eform = e_form_predict.numpy()[0][0]
print('formation energy:', eform)
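
Since the crash cannot be caught from Python, a guard in user code may be the practical workaround. This is a sketch under the assumption that the segfault occurs when some atom has no neighbor within the graph cutoff (M3GNet's default bond cutoff is 5 Å); has_bonds is a hypothetical helper, not part of the m3gnet API.

def has_bonds(structure, cutoff=5.0):
    """Return True only if every site has at least one neighbor within cutoff (Å)."""
    return all(len(nbrs) > 0 for nbrs in structure.get_all_neighbors(cutoff))

if has_bonds(struct):
    print(m3gnet_e_form.predict_structure(struct))
else:
    print("isolated atom detected; skipping to avoid a segmentation fault")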

Training is slower without stress?

I have been using two datasets to train the model based on the pre-trained one. They are pretty similar in size, one without stress and the other with stress.

I notice that, using the same device configuration, the model trains much slower on the dataset without stress. It even runs out of memory after 2 epochs when using batch_size=32. I have to decrease the batch size to 16 to continue training.

The training speed for the dataset with stress is ~130 ms/step with a batch size of 32.
The training speed for the dataset without stress is ~270 ms/step with a batch size of 16.

I wonder what might be causing this roughly fourfold slowdown (~4 ms vs ~17 ms per structure)?

Feasible to have this packaged on Anaconda?

Otherwise, I'll probably make my own port of it on conda-forge, since I'm planning to package this as a dependency for xtal2png, and all dependencies of conda-forge packages must themselves be on conda-forge, IIUC.

Btw, very nice and timely package! I had been hoping for something like this for a while now. Even saw and starred m3gnet a little while ago, but didn't realize it's an easy-to-use drop-in DFT relaxation surrogate until I read:

One of the key accomplishments of M3GNet is the development of a universal IAP that can work across the entire periodic table of the elements by training on relaxations performed in the Materials Project.

Just a passing suggestion and feel free to ignore: might be worth adding something related to this to the description:

[screenshot of the repository's current GitHub description]

Maybe:

Materials graph network with 3-body interactions featuring a DFT surrogate crystal relaxer and a state-of-the-art property predictor.

or something similar.

Excited to try it out and incorporate it as a post-processing step into xtal2png!

Validation Structures of None results in termination for PotentialTrainer

It seems that when doing

trainer = PotentialTrainer(
    potential=potential, optimizer=tf.keras.optimizers.Adam(1e-3)
)

with no validation sets provided, the entire run fails. This is because of a difference from the Trainer class, which guards the validation loop:

if has_validation:
    val_predictions = []
    val_targets = []
    for val_index, batch in enumerate(mgb_val):
        graph_batch, target_batch = batch
        if isinstance(graph_batch, MaterialGraph):

whereas PotentialTrainer has no such guard and attempts:

for batch_index, batch in enumerate(mgb_val):
    graph_batch, target_batch = batch

Since "mgb_val" is never set, this fails because the "has_validation" condition is not applied.

enable non-verbose logging of relaxation (e.g. `Relax(verbose=0)`)

This isn't time-sensitive (I'm grateful it's running!), but I'm planning to loop through many structures, and the default output I get for a single relaxation is:

2022-06-16 19:38:02.863217: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2022-06-16 19:38:02.866924: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.       
Skipping registering GPU devices...
2022-06-16 19:38:02.875852: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("PartitionedCall:4", shape=(None,), dtype=int32), values=Tensor("PartitionedCall:3", shape=(None, 3, 3), dtype=float32), dense_shape=Tensor("PartitionedCall:5", shape=(3,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("PartitionedCall:1", shape=(3024,), dtype=int32), values=Tensor("Neg:0", shape=(3024, 3), dtype=float32), dense_shape=Tensor("PartitionedCall:2", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

      Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
FIRE:    0 19:38:19     -279.345825*       8.6713   
C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/three_d_interaction_2/GatherV2_1_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/three_d_interaction_2/GatherV2_1_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/three_d_interaction_2/GatherV2_1_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/graph_network_layer_2/gated_atom_update_2/GatherV2_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/graph_network_layer_2/gated_atom_update_2/GatherV2_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/graph_network_layer_2/gated_atom_update_2/GatherV2_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/graph_network_layer_2/gated_atom_update_2/GatherV2_1_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/graph_network_layer_2/gated_atom_update_2/GatherV2_1_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/graph_network_layer_2/gated_atom_update_2/GatherV2_1_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.       

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/graph_network_layer_2/concat_atoms_2/GatherV2_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/graph_network_layer_2/concat_atoms_2/GatherV2_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/graph_network_layer_2/concat_atoms_2/GatherV2_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/three_d_interaction_1/GatherV2_1_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/three_d_interaction_1/GatherV2_1_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/three_d_interaction_1/GatherV2_1_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/graph_network_layer_1/gated_atom_update_1/GatherV2_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/graph_network_layer_1/gated_atom_update_1/GatherV2_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/graph_network_layer_1/gated_atom_update_1/GatherV2_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/graph_network_layer_1/gated_atom_update_1/GatherV2_1_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/graph_network_layer_1/gated_atom_update_1/GatherV2_1_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/graph_network_layer_1/gated_atom_update_1/GatherV2_1_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.       

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/graph_network_layer_1/concat_atoms_1/GatherV2_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/graph_network_layer_1/concat_atoms_1/GatherV2_grad/Reshape:0", dtype=float32), dense_shape=Tensor("gradients/m3g_net/graph_network_layer_1/concat_atoms_1/GatherV2_grad/Cast:0", shape=(None,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/concat_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/concat:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/m3g_net/three_d_interaction_2/GatherV2_2_grad/Cast:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/GatherV2_5_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/GatherV2_5_grad/Reshape:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/m3g_net/GatherV2_5_grad/Cast:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/GatherV2_6_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/GatherV2_6_grad/Reshape:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/m3g_net/GatherV2_6_grad/Cast:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/GatherV2_3_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/GatherV2_3_grad/Reshape:0", shape=(None, 3), dtype=float32), dense_shape=Tensor("gradients/m3g_net/GatherV2_3_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/m3g_net/GatherV2_4_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/m3g_net/GatherV2_4_grad/Reshape:0", shape=(None, 3), dtype=float32), dense_shape=Tensor("gradients/m3g_net/GatherV2_4_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

C:\Users\sterg\Miniconda3\envs\xtal2png-docs\lib\site-packages\tensorflow\python\framework\indexed_slices.py:444: UserWarning:

Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("PartitionedCall:1", shape=(None,), dtype=int32), values=Tensor("Neg:0", shape=(None, 3), dtype=float32), dense_shape=Tensor("PartitionedCall:2", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.

FIRE:    1 19:38:26     -274.974579*     150.3564
FIRE:    2 19:38:26     -269.860413*     163.3046
FIRE:    3 19:38:27     -278.717163*      61.7388
FIRE:    4 19:38:27     -279.168213*      36.7923
FIRE:    5 19:38:27     -279.419586*       4.7319
FIRE:    6 19:38:27     -279.420776*       4.4985
FIRE:    7 19:38:27     -279.423065*       4.0459
FIRE:    8 19:38:27     -279.426086*       3.4048
FIRE:    9 19:38:27     -279.429474*       2.6230
FIRE:   10 19:38:27     -279.432831*       2.2452
FIRE:   11 19:38:27     -279.435852*       1.8118
FIRE:   12 19:38:27     -279.438293*       1.3240
FIRE:   13 19:38:27     -279.440247*       1.2179
FIRE:   14 19:38:28     -279.441650*       1.7180
FIRE:   15 19:38:28     -279.442657*       1.9919
FIRE:   16 19:38:28     -279.442749*       1.9554
FIRE:   17 19:38:28     -279.442963*       1.8833
FIRE:   18 19:38:28     -279.443237*       1.7773
FIRE:   19 19:38:28     -279.443573*       1.6404
FIRE:   20 19:38:28     -279.443970*       1.4761
FIRE:   21 19:38:28     -279.444427*       1.2892
FIRE:   22 19:38:28     -279.444916*       1.0854
FIRE:   23 19:38:28     -279.445496*       0.8478
FIRE:   24 19:38:28     -279.446106*       0.5811
FIRE:   25 19:38:28     -279.446777*       0.3824
FIRE:   26 19:38:28     -279.447510*       0.3687
FIRE:   27 19:38:28     -279.448303*       0.3641
FIRE:   28 19:38:28     -279.449188*       0.5134
FIRE:   29 19:38:29     -279.450256*       0.6596
FIRE:   30 19:38:29     -279.451538*       0.7646
FIRE:   31 19:38:29     -279.453125*       0.7705
FIRE:   32 19:38:29     -279.455048*       0.6054
FIRE:   33 19:38:29     -279.457245*       0.3071
FIRE:   34 19:38:29     -279.459595*       0.3015
FIRE:   35 19:38:29     -279.462067*       0.5849
FIRE:   36 19:38:29     -279.464844*       0.6786
FIRE:   37 19:38:29     -279.467957*       0.4661
FIRE:   38 19:38:29     -279.471191*       0.2043
FIRE:   39 19:38:29     -279.474182*       0.4625
FIRE:   40 19:38:29     -279.477112*       0.4599
FIRE:   41 19:38:29     -279.479858*       0.2580
FIRE:   42 19:38:29     -279.482117*       0.5493
FIRE:   43 19:38:30     -279.484131*       0.2746
FIRE:   44 19:38:30     -279.485596*       0.6642
FIRE:   45 19:38:30     -279.486938*       0.5204
FIRE:   46 19:38:30     -279.487488*       1.7465
FIRE:   47 19:38:30     -279.488220*       0.3662
FIRE:   48 19:38:30     -279.488190*       1.2142
FIRE:   49 19:38:30     -279.488373*       0.9342
FIRE:   50 19:38:30     -279.488617*       0.4422
FIRE:   51 19:38:30     -279.488739*       0.1691
FIRE:   52 19:38:30     -279.488770*       0.5966
FIRE:   53 19:38:30     -279.488770*       0.5605
FIRE:   54 19:38:30     -279.488800*       0.4907
FIRE:   55 19:38:30     -279.488861*       0.3914
FIRE:   56 19:38:30     -279.488922*       0.2696
FIRE:   57 19:38:31     -279.488953*       0.1483
FIRE:   58 19:38:31     -279.489014*       0.1461
FIRE:   59 19:38:31     -279.489044*       0.1439
FIRE:   60 19:38:31     -279.489105*       0.2442
FIRE:   61 19:38:31     -279.489166*       0.3182
FIRE:   62 19:38:31     -279.489258*       0.3417
FIRE:   63 19:38:31     -279.489380*       0.3012
FIRE:   64 19:38:31     -279.489532*       0.1923
FIRE:   65 19:38:31     -279.489716*       0.1325
FIRE:   66 19:38:31     -279.489899*       0.1463
FIRE:   67 19:38:31     -279.490112*       0.2374
FIRE:   68 19:38:31     -279.490417*       0.1938
FIRE:   69 19:38:31     -279.490723*       0.1127
FIRE:   70 19:38:31     -279.491089*       0.1970
FIRE:   71 19:38:31     -279.491486*       0.2152
FIRE:   72 19:38:31     -279.491943*       0.0969
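
Two separate knobs appear to control this output, as a sketch: the per-step FIRE table comes from the ase optimizer and is controlled by the verbose argument of Relaxer.relax() (visible in the signature quoted in the next issue), while the TensorFlow banner and the IndexedSlices warnings have to be silenced through TF's log level and the warnings module. Assuming both sources:

import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"  # hide the TF C++ INFO/WARNING banner; set before TF is imported

import warnings
warnings.filterwarnings("ignore")  # silence the IndexedSlices UserWarnings

from m3gnet.models import Relaxer

relaxer = Relaxer()
result = relaxer.relax(structure, verbose=False)  # no per-step FIRE printout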

Get M1 acceleration

I followed the exact steps in the README to set up a conda environment with tensorflow-metal, but I keep getting this error when trying to run m3gnet on my M1 Pro:

NotFoundError                             Traceback (most recent call last)
File <timed exec>:1, in <module>

File /opt/homebrew/Caskroom/miniconda/base/envs/m3gnet/lib/python3.9/site-packages/m3gnet/models/_dynamics.py:167, in Relaxer.relax(self, atoms, fmax, steps, traj_file, interval, verbose, **kwargs)
    165 if self.relax_cell:
    166     atoms = ExpCellFilter(atoms)
--> 167 optimizer = self.opt_class(atoms, **kwargs)
    168 optimizer.attach(obs, interval=interval)
    169 optimizer.run(fmax=fmax, steps=steps)

File /opt/homebrew/Caskroom/miniconda/base/envs/m3gnet/lib/python3.9/site-packages/ase/optimize/fire.py:54, in FIRE.__init__(self, atoms, restart, logfile, trajectory, dt, maxstep, maxmove, dtmax, Nmin, finc, fdec, astart, fa, a, master, downhill_check, position_reset_callback, force_consistent)
      8 def __init__(self, atoms, restart=None, logfile='-', trajectory=None,
      9              dt=0.1, maxstep=None, maxmove=None, dtmax=1.0, Nmin=5,
     10              finc=1.1, fdec=0.5,
     11              astart=0.1, fa=0.99, a=0.1, master=None, downhill_check=False,
     12              position_reset_callback=None, force_consistent=None):
     13     """Parameters:
     14 
     15     atoms: Atoms object
   (...)
     52         when downhill_check is True.
     53     """
---> 54     Optimizer.__init__(self, atoms, restart, logfile, trajectory,
     55                        master, force_consistent=force_consistent)
...
  device='CPU'; T in [DT_STRING]
  device='CPU'; T in [DT_RESOURCE]
  device='CPU'; T in [DT_VARIANT]

	 [[PartitionedCall/gradients/m3g_net/graph_network_layer/gated_atom_update/UnsortedSegmentSum_grad/and]] [Op:__inference_get_efs_tensor_7426]
$ pip list | grep tensorflow
tensorflow-estimator    2.9.0
tensorflow-macos        2.9.2
tensorflow-metal          0.5.0

The error disappears when running

pip uninstall -y tensorflow-metal

but I'd like to run m3gnet on lots of structures so GPU acceleration is important.
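
The NotFoundError indicates that tensorflow-metal has no GPU kernel for the gradient of UnsortedSegmentSum. One thing to try, as a sketch and not a verified fix: enable soft device placement so that unsupported ops fall back to the CPU while the rest of the graph stays on the Metal GPU.

import tensorflow as tf

# Let TF place ops without a Metal kernel (e.g. this segment-sum gradient)
# on the CPU instead of raising NotFoundError.
tf.config.set_soft_device_placement(True)

from m3gnet.models import Relaxer
relaxer = Relaxer()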
