TypeError when reading ASE dataset about nequip HOT 8 CLOSED

mir-group commented on July 23, 2024

TypeError when reading ASE dataset

Comments (8)

Linux-cpp-lisp commented on July 23, 2024

Hi @kkly1995 ,

This error usually comes up when a key that you try to compute statistics over (usually the energy or forces, when computing normalization constants) isn't in your dataset.

Can you run python -m pdb nequip/nequip/scripts/train.py path/to/minimal.yaml and run p field when it catches on this error?

Thanks.

kkly1995 commented on July 23, 2024

Thank you for explaining the error. It looks like the .xyz provided by MD17 was not exactly in a format that ASE could fully parse, i.e. ASE did not read the energies and forces. After fixing the format and verifying that ASE could correctly read the energies and forces, nequip ran successfully and in fact produced a result identical to that of configs/minimal.yaml (so the datasets are indeed the same).
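
A check along these lines is one way to confirm that ASE really attached energies and forces to every frame before the file is handed to nequip. It is only a sketch: the helper names frames_missing_labels and load_frames are made up for illustration, and it assumes that ASE raises a RuntimeError (or a subclass of it) from get_potential_energy() or get_forces() when a frame carries no such result.

```python
def frames_missing_labels(frames):
    """Return the indices of frames without a readable energy or forces.

    `frames` is any iterable of ase.Atoms-like objects. Assumption: a frame
    that has no stored energy/forces raises RuntimeError or a subclass
    (NotImplementedError is a subclass of RuntimeError in Python).
    """
    missing = []
    for index, atoms in enumerate(frames):
        try:
            atoms.get_potential_energy()
            atoms.get_forces()
        except RuntimeError:
            missing.append(index)
    return missing


def load_frames(path):
    """Read every frame of an extended XYZ file (requires ASE)."""
    # Imported lazily so the check above has no hard dependency on ASE.
    from ase.io import read

    return read(path, format="extxyz", index=":")
```

Calling frames_missing_labels(load_frames("subset.xyz")) should give an empty list when every structure carries both quantities; any index it returns points at a frame worth inspecting by hand.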

I actually have another ASE-related issue, if that's alright. Once again I use configs/minimal.yaml, but change the data to my own (attached below) and adjust n_train and n_val accordingly. My data contains 500 structures:

>>> from ase.io import read
>>> samples = read('subset.xyz', format='extxyz', index=':')
>>> len(samples)
500

and I can verify that ASE can parse the energies and forces of every structure here. However, with nequip I get the following error:

Successfully loaded the data set of type ASEDataset(100)...
Traceback (most recent call last):
  File "/home/kkly2/anaconda3/envs/nequip/bin/nequip-train", line 8, in <module>
    sys.exit(main())
  File "/home/kkly2/anaconda3/envs/nequip/lib/python3.8/site-packages/nequip/scripts/train.py", line 40, in main
    fresh_start(parse_command_line(args))
  File "/home/kkly2/anaconda3/envs/nequip/lib/python3.8/site-packages/nequip/scripts/train.py", line 125, in fresh_start
    trainer.set_dataset(dataset)
  File "/home/kkly2/anaconda3/envs/nequip/lib/python3.8/site-packages/nequip/train/trainer.py", line 1046, in set_dataset
    raise ValueError(
ValueError: too little data for training and validation. please reduce n_train and n_val

Am I correct in thinking that it is only reading 100 structures?

data.zip

Linux-cpp-lisp commented on July 23, 2024

Glad that helped!

TODO (to self): add this to the FAQ

What did you set n_train and n_val to?

kkly1995 commented on July 23, 2024

Here are the numbers compared to the original:

$ diff minimal.yaml ~/nequip/configs/minimal.yaml
2c2
< root: results/LaH
---
> root: results/aspirin
15,16c15,16
< dataset: ase
< dataset_file_name: subset.xyz
---
> dataset: aspirin
> dataset_file_name: benchmark_data/aspirin_ccsd-train.npz
23,25c23,25
< n_train: 400
< n_val: 100
< batch_size: 5
---
> n_train: 5
> n_val: 5
> batch_size: 1
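
With these settings, the ValueError above is what one would expect if only 100 frames were loaded. The arithmetic can be sketched as below; this is not nequip's actual code, and check_split is a hypothetical helper that only assumes the trainer needs n_train + n_val frames to be available in the loaded dataset.

```python
def check_split(n_frames, n_train, n_val):
    """Raise if the requested split needs more frames than were loaded.

    Returns the number of frames left unused by the split otherwise.
    """
    requested = n_train + n_val
    if requested > n_frames:
        raise ValueError(
            f"requested {requested} frames (n_train={n_train}, n_val={n_val}) "
            f"but only {n_frames} were loaded"
        )
    return n_frames - requested


# With the values from this thread: 100 loaded frames cannot cover 400 + 100,
# while the full 500-frame file is exactly enough.
```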

Linux-cpp-lisp commented on July 23, 2024

Did you accidentally set include_frames or something? This is strange since, yes, ASEDataset(100) indicates that it loaded only 100 frames.

Or maybe you are loading a different subset.xyz than you think you are?

I'm not entirely sure what else this could be...

kkly1995 commented on July 23, 2024

It is true that I previously had a smaller dataset with the same name, subset.xyz, which had only 100 frames and which I have since removed from this directory. The error reported above occurred after I replaced it with the larger dataset. I deleted results/LaH/minimal before rerunning, but I also noticed there is a results/LaH/processed. Just now I removed all of these directories, and it was able to read all 500 frames. The error must be an artifact left over from when I ran a similar input using the same directory but a different subset.xyz.

simonbatzner commented on July 23, 2024

Ah, that explains it. If you keep the run_name and the cutoff radius the same between two runs, the code reads the previously processed data set from file (in your case the one with 100 frames) instead of recomputing it from the new 500-frame file.
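
A toy model of that behaviour, for illustration only: the in-memory dict below stands in for the processed directory on disk, and load_processed and its parameter names are hypothetical rather than nequip's implementation. The point is that a cache keyed only on settings cannot notice that the file behind it has changed.

```python
processed_cache = {}


def load_processed(run_name, cutoff, frames):
    """Toy stand-in for a processed-dataset cache keyed only on settings.

    `frames` plays the role of the raw data file; "processing" it here just
    means counting the frames.
    """
    key = (run_name, cutoff)
    if key not in processed_cache:
        processed_cache[key] = len(frames)
    # A hit returns whatever was processed first under this key,
    # even if `frames` has changed since then.
    return processed_cache[key]
```

In this sketch, a second call with the same run name and cutoff but a larger set of frames still reports the old count, while changing either setting, or clearing the cache, forces reprocessing.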

We will make that more clear in the docs. Thanks for the notice.

Closing this.

Linux-cpp-lisp commented on July 23, 2024

Worth noting that this issue of cached processed versions of datasets getting out of sync with your dataset settings is resolved in the current beta version. However, that does not cover changes to the actual data file that is read: we don't want to waste time reading it just to check whether something has changed, so you are still responsible for making sure that you reprocess if you change the contents of a data file.
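
Since that check is left to the user, one possible way to automate it on your own side is to keep a checksum next to the data file and compare it before each run. This is a sketch under that assumption and not a nequip feature; file_fingerprint, needs_reprocessing, and the stamp file are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_reprocessing(data_file, stamp_file):
    """Compare the data file's digest with the one saved in stamp_file.

    Returns True (and updates the stamp) when the file is new or changed,
    False when the stored digest still matches.
    """
    current = file_fingerprint(data_file)
    stamp = Path(stamp_file)
    if stamp.exists() and stamp.read_text().strip() == current:
        return False
    stamp.write_text(current)
    return True
```

When needs_reprocessing returns True, removing the processed directory before rerunning (results/LaH/processed in this thread) is what made the new 500-frame file get picked up.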
