Comments (19)
@peterbjorgensen thanks for the suggestion, I was not aware that asap3
had a neighborlist as well. I have previously implemented but not merged such a use of an accelerated neighborlist for nequip
using the optimized neighborlist from freud
, which has the same limitation but I think also had certain issues with triclinic cells (I don't really remember why I didn't merge). You can see the code for doing the fallback here: https://github.com/mir-group/nequip/blob/freud-neighborlist/nequip/data/_neighbors.py. If asap3
has full support for ASE-style triclinic cells, and that restriction is the only one, you could easily adapt the code with freud
to have asap3
instead. I'd certainly be willing to consider a PR based on that existing branch--- generally this fell to a lower priority after I made the neighborlist pre-processing for training parallel but for some systems I guess it still might still take a while.
However, unfortunately none of this directly helps for OpenMM or other contexts where we need a neighborlist in TorchScript (unless someone wants to write the bindings).
from nequip.
NNPOps now has a neighbor list implementation that can be used in torchscript.
from nequip.
I implemented a TorchScript-able neighbor list recently that is as general as ASE's and it is somewhat faster. It might be useful while waiting for a general NNPOps implementation of this feature.
from nequip.
I'm happy to now be able to close this issue thanks to the great work of @sef43 over in issue #288, building on the nice work of @felixmusil in torch_nl
.
from nequip.
Hi @keano130,
This is something I've discussed with the OpenMM people without coming to a great conclusion. Obviously, the ideal scenario would be where openmm-torch
is able to collect the neighborlist data already computed by OpenMM and pass it along to the model. This is probably not going to happen.
Given that, the path forward seems to be a simple wrapper for the deployed TorchScript model that takes in positions + box, computes the neighborlist, builds the AtomicDataDict
, passes it to the wrapped model, and then extracts the energy and forces. All of that should be possible in pure TorchScript.
To implement the neighborlist, I think the quickest way would be what I suggested here: e3nn/e3nn#179 (comment). Basically, take the neighborlist from the ase package, find-and-replace np
-> torch
, and then resolve any TorchScript compatibility errors.
Also, to use NequIP with openmm-torch
, this issue/feature (openmm/openmm-torch#36) for allowing custom explicit forces from the model would need to be implemented in a PR, but that's a pretty simple change.
from nequip.
This is something I've discussed with the OpenMM people without coming to a great conclusion. Obviously, the ideal scenario would be where
openmm-torch
is able to collect the neighborlist data already computed by OpenMM and pass it along to the model. This is probably not going to happen.
I'm also very much interested in getting NequIP to work in OpenMM. Could you elaborate on why it is not possible to use its internal neighbor list? It's not part of the public API and it would definitely require some work from both parties. On the other hand, the NIH project that funds OpenMM until 2024 specifies that providing support for common machine learning potentials is one of their main goals in the next few years, so I really don't understand why they wouldn't try and expose efficient neighbor searching functionality to the pytorch/tensorflow plugins.
To implement the neighborlist, I think the quickest way would be what I suggested here: e3nn/e3nn#179 (comment). Basically, take the neighborlist from the ase package, find-and-replace
np
->torch
, and then resolve any TorchScript compatibility errors.
I feel like this will end up becoming very inefficient (but correct me if I'm wrong). OpenMM (and GROMACS, and possibly others) are able to implement more efficient neighbor searching by exploiting the fact that for cell matrices in reduced form, i.e. whereby box vectors are aligned with coordinate axes as much as possible, taking care of periodic boundary conditions becomes much easier. For example, application of the minimum image convention using a reduced cell representation is very straightforward, whereas for a general 3x3 cell matrix it is necessary to consider all 26 periodic copies. Why wouldn't we choose to exploit this as well?
As an alternative suggestion, is it possible/beneficial to imitate the TorchEnvironmentProvider
from SchNetPack? Or will that just boil down to the same thing...
Are there other long-term plans regarding inference with NequIP? The rudimentary ASE calculator is painfully slow for anything larger than say 100 atoms, and the LAMMPS plugin seems extremely messy to install/debug/maintain in comparison to e.g. a possible OpenMM plugin.
With all this said: thank you for developing/maintaining NequIP. It's solid, requires minimal effort, and is surprisingly data-efficient. I'll be using it extensively in the near future!
from nequip.
Hi @svandenhaute,
To your first point: I don't know. I'm pretty unfamiliar with OpenMM, and am mostly repeating what was sent in this issue posted by @keano130 (openmm/openmm-torch#33) and subsequently repeated to me in emails. I've been told it has something to do with different OpenMM compute platforms all only maintaining internal neighbor state (in different formats) and not presenting a common interface:
There isn't a single "native" neighbor list format for OpenMM. It depends on which platform you're using. For most platforms, the internal format is quite complicated. For CUDA, for example, it starts by sorting molecules along a space filling curve, then grouping atoms into blocks of 32 contiguous atoms. Then it forms three different neighbor-list-like structures, each with a different format, representing different subsets of interactions that are computed with different algorithms.
If you read the OpenMM developer docs it does also sound like there is no easy way to do this... all of the *Nonbonded
stuff seems to be only for handing atoms in pairs.
I feel like this will end up becoming very inefficient (but correct me if I'm wrong). OpenMM (and GROMACS, and possibly others) are able to implement more efficient neighbor searching
I agree and would prefer to get this information from the host code. In our LAMMPS plugin, for example, we always use the neighborlist precomputed by the MD code. On the other hand, it has been pointed out that compared to the cost of the ML potential itself, the cost of a neighborlist should be negligible. What was suggested to me is that if a pure PyTorch neighborlist isn't fast enough, the next step would actually be a TorchScript C++ extension operation that does the neighborlist for you (so still part of the model).
As an alternative suggestion, is it possible/beneficial to imitate the TorchEnvironmentProvider from SchNetPack? Or will that just boil down to the same thing...
It is the same thing, though if there is existing code it could be useful. It definitely isn't TorchScript compatible, since half of it runs in numpy
(https://schnetpack.readthedocs.io/en/stable/_modules/schnetpack/environment.html#TorchEnvironmentProvider). For some reason I also have the impression that it's an O(N^2) algorithm and not O(N) like ASE's, but that might be wrong.
Are there other long-term plans regarding inference with NequIP? The rudimentary ASE calculator is painfully slow for anything larger than say 100 atoms, and the LAMMPS plugin seems extremely messy to install/debug/maintain in comparison to e.g. a possible OpenMM plugin.
Yes, the ASE is meant mostly for convenience, not performance. As for LAMMPS, recompiling is the standard (and only) way to add something to the code, a process we have also recently made simpler (see the updated README at https://github.com/mir-group/pair_nequip).
At the moment there are no plans for other interfaces from us — time is, as you can imagine, limited
That would go one of two ways:
- Implement neighborlist in TorchScript / TorchScript extension, PR explicit force support into
openmm-torch
(openmm/openmm-torch#36), and PR some small script into NequIP that can wrap a nequip model to take theopenmm-torch
format and add a neighborlist - PRs into
openmm-torch
adding neighborlist and explicit force support, add wrapper to nequip
Unless the openmm-torch
people or someone else familiar with the internals of OpenMM decides to add neighborlist support (again see openmm/openmm-torch#33), I think that 1. is just a simpler option.
With all this said: thank you for developing/maintaining NequIP. It's solid, requires minimal effort, and is surprisingly data-efficient. I'll be using it extensively in the near future!
Thank you! Glad to hear you are enjoying the code and the method, and we are excited to hear about what you do with it
from nequip.
I've been told it has something to do with different OpenMM compute platforms all only maintaining internal neighbor state (in different formats) and not presenting a common interface:
That's pretty much it. There's no simple way to just "get the internal neighbor list".
Is there really no existing PyTorch neighbor list implementation that would work? I know there's one in torch_cluster, though I don't know what algorithm it uses or how the performance scales. In any case, that's the first thing I would try. Most likely you'll find that building the neighbor list takes negligible time compared to evaluating the neural network.
If there really isn't any existing implementation that's sufficiently fast and robust, I'd be willing to help with creating one as part of https://github.com/openmm/NNPOps. A self-contained, reusable neighbor list will be much more generally useful than something built into openmm-torch.
from nequip.
Thanks for the confirmation @peastman !
There is not. torch_cluster
is only without PBC, and we need full triclinic PBC. My vote again goes for trying to translate the ASE neighborlist — which is thoroughly tested as an algorithm and scales O(N) — into TorchScript compatible Python. @svandenhaute I'd happily accept a PR including that and some unit tests into NequIP to get us most of the way to OpenMM support, pending custom force support in openmm-torch
.
from nequip.
There is not.
torch_cluster
is only without PBC, and we need full triclinic PBC. My vote again goes for trying to translate the ASE neighborlist — which is thoroughly tested as an algorithm and scales O(N) — into TorchScript compatible Python. @svandenhaute I'd happily accept a PR including that and some unit tests into NequIP to get us most of the way to OpenMM support, pending custom force support inopenmm-torch
.
I would gladly contribute! I'm pretty busy with other stuff in the next few weeks but I will give it a try afterwards.
from nequip.
Great, thanks @svandenhaute ! We can continue discussion and track progress here in this issue.
from nequip.
That's pretty much it. There's no simple way to just "get the internal neighbor list".
It would also be helpful, if we can at least access the platform specific neighborList itself. Just a class to access the neighbor lists in the API with platform as its argument will do.
from nequip.
Re neighborlists in TorchScript, the OpenCatalyst people seem to have something here:
https://github.com/Open-Catalyst-Project/ocp/blob/fe9998ecbe47e7984a5fa833fd9e4684e58b005b/ocpmodels/common/utils.py#L507
I also played with this at one point translating ase.neighborlist.primitive_neighbor_list
into TorchScript (almost completely untested, use and fix at your own risk!!!) and it at least seems to run in TorchScript correctly:
https://gist.github.com/Linux-cpp-lisp/692018c74b3906b63529e60619f5a207
from nequip.
If you want a way faster replacement of ASE neighborlist, take a look at asap3, that is written in C++. I have used it in my DeepDFT package because the ASE one is too slow. The only thing it does not support is if the cutoff radius is larger than the cell height - then I fall back to the ASE neighborlist.
from nequip.
@peastman But it seems like it does not support periodic boundary conditions?
from nequip.
Not yet I guess. It should be very easy to add.
cc @raimis
from nequip.
@peterbjorgensen would the PBC for an orthorombic box be enough? Or you want a general case?
from nequip.
@raimis Ideally it should work for a general unit cell and also support that only some of the dimensions are periodic (less important). Those features are supported by asap3 already.
from nequip.
That's great! Thank you @felixmusil , very nice to have that available!
from nequip.
Related Issues (20)
- 🐛 [BUG] GPU acceleration on NequIPCalculator HOT 1
- ❓ [QUESTION] How to perform transfer training with nequip? HOT 2
- 🐛 [BUG] Segfault with float64 models HOT 4
- 🐛 [BUG] Tutorial example runs infinitely HOT 3
- 🐛 [BUG] lammps/build/lmp: No such file (tutorial) HOT 3
- wrong ValueError text HOT 1
- 🐛 [BUG] NotADirectoryError when attempting to run git HOT 2
- ❓ [QUESTION] CIF files & other target properties HOT 1
- ❓ [QUESTION] Newton pair when running lammps HOT 2
- 🌟 [FEATURE] OpenMM HOT 46
- Nequip memory requirements ❓ [QUESTION] HOT 10
- 🌟 [FEATURE] Support for newer PyTorch HOT 1
- ❓ [QUESTION] Sweeping hyperparemeters with Weights and Biases HOT 2
- 🌟 [FEATURE] Masking out some labels (e.g. constrained atoms) HOT 2
- 🐛 [BUG] Does not work with RTX 4080 GPU HOT 6
- RuntimeError when using nequip-evaluate HOT 2
- Reduce LR on plateau but not increase HOT 8
- 🐛 [BUG] NequIP running problem on A100 machine HOT 5
- 🐛 [BUG] Batchsize Problem in PerAtomMSELoss HOT 4
- The difference between two validation metrics in file log❓ [QUESTION] HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from nequip.