finetuna's People

Contributors

12chao, alchem0x2a, jmusiel, lorywangxx, mattaadams, mshuaibii, renovate-bot, ruiqic, zulissi

finetuna's Issues

_message.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN6google8protobuf2io17SafeDoubleToFloatEd

While finishing the installation of Finetuna, I ran the test from [N2H_Ag111_dissociation]. The following error appeared in my terminal:
/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/internal/api_implementation.py:110: UserWarning: Selected implementation cpp is not available.
warnings.warn(
Traceback (most recent call last):
File "/hpcfs/users/a1732812/Test/11/N2H_Ag111.py", line 2, in <module>
from finetuna.online_learner.online_learner import OnlineLearner
File "/hpcfs/users/a1732812/xxx/finetuna/finetuna/online_learner/online_learner.py", line 5, in <module>
from finetuna.logger import Logger
File "/hpcfs/users/a1732812/xxx/finetuna/finetuna/logger.py", line 10, in <module>
import wandb
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/__init__.py", line 26, in <module>
from wandb import sdk as wandb_sdk
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/sdk/__init__.py", line 5, in <module>
from .wandb_artifacts import Artifact # noqa: F401
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/sdk/wandb_artifacts.py", line 33, in <module>
from wandb.apis import InternalApi, PublicApi
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/apis/__init__.py", line 42, in <module>
from .internal import Api as InternalApi # noqa
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/apis/internal.py", line 3, in <module>
from wandb.sdk.internal.internal_api import Api as InternalApi
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/sdk/internal/internal_api.py", line 45, in <module>
from ..lib import retry
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/sdk/lib/retry.py", line 17, in <module>
from .mailbox import ContextCancelledError
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/sdk/lib/mailbox.py", line 10, in <module>
from wandb.proto import wandb_internal_pb2 as pb
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/proto/wandb_internal_pb2.py", line 8, in <module>
from wandb.proto.v4.wandb_internal_pb2 import *
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/wandb/proto/v4/wandb_internal_pb2.py", line 5, in <module>
from google.protobuf.internal import builder as _builder
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/internal/builder.py", line 42, in <module>
from google.protobuf import reflection as _reflection
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/reflection.py", line 51, in <module>
from google.protobuf import message_factory
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/message_factory.py", line 45, in <module>
from google.protobuf import descriptor_pool
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/descriptor_pool.py", line 63, in <module>
from google.protobuf import descriptor
File "/hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 51, in <module>
from google.protobuf.pyext import _message
ImportError: /hpcfs/users/a1732812/miniconda3/envs/finetuna/lib/python3.9/site-packages/google/protobuf/pyext/_message.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN6google8protobuf2io17SafeDoubleToFloatEd

Could anyone tell me how to resolve this problem? Thank you so much!
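
A workaround that is sometimes suggested for a broken protobuf C++ extension is to request the pure-Python protobuf backend through the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable, set before anything imports google.protobuf. The sketch below is an assumption, not a confirmed fix for this environment, and use_pure_python_protobuf is a hypothetical helper name. If the environment contains mixed protobuf installs, reinstalling protobuf may still be necessary.

```python
import os


def use_pure_python_protobuf() -> str:
    """Ask protobuf to use its pure-Python backend.

    Must be called before google.protobuf (or wandb) is imported,
    because the backend is chosen at import time.
    """
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
    return os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"]


use_pure_python_protobuf()
# Only after this point would the script import finetuna, e.g.:
# from finetuna.online_learner.online_learner import OnlineLearner
```

The pure-Python backend is slower than the C++ one, so this is best treated as a way to confirm the diagnosis rather than a permanent setting.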

Need tests for offline active learning and NEBs

There is now a test case for a Cu13 nanocluster relaxation with OAL.

We also need test cases for offline active learning and NEBs. The example scripts could be adapted pretty easily. Try to keep it as clean / simple / fast as possible so these tests can be run/fixed frequently.

Question about the VASP_interactive example

Dear Researchers,

I tried to follow the VASP_interactive example on online learning. However, I got stuck early on, because the line:
from al_mlp.atomistic_methods import replay_trajectory

leads to
ImportError: cannot import name 'replay_trajectory' from 'al_mlp.atomistic_methods' (.../.local/lib/python3.8/site-packages/al_mlp-0.1-py3.8.egg/al_mlp/atomistic_methods.py)

I installed al_mlp using the setup.py routine. However, I could not find where the replay_trajectory function is defined, even on GitHub.

Any help would be highly appreciated!

Alexander

There is a training process and GPU memory usage, but the GPU is not working.

Hello, here is my code:

ml_potential = FinetunerCalc(
    checkpoint_path="gemnet_t_direct_h512_all.pt",
    mlp_params={
        "tuner": {
            "unfreeze_blocks": [
                "out_blocks.3.seq_forces",
                "out_blocks.3.scale_rbf_F",
                "out_blocks.3.dense_rbf_F",
                "out_blocks.3.out_forces",
                "out_blocks.2.seq_forces",
                "out_blocks.2.scale_rbf_F",
                "out_blocks.2.dense_rbf_F",
                "out_blocks.2.out_forces",
                "out_blocks.1.seq_forces",
                "out_blocks.1.scale_rbf_F",
                "out_blocks.1.dense_rbf_F",
                "out_blocks.1.out_forces",
            ],
            "num_threads": 32
        },
        "optim": {
            "batch_size": 1,
            "num_workers": 0,
            "max_epochs": 400,
            "lr_initial": 0.0003,
            "factor": 0.9,
            "eval_every": 1,
            "patience": 3,
            "checkpoint_every": 100000,
            "scheduler_loss": "train",
            "weight_decay": 0,
            "eps": 1e-8,
            "optimizer_params": {
                "weight_decay": 0,
                "eps": 1e-8,
            },
        },
        "task": {
            "primary_metric": "loss",
        },
        "local_rank": 0
    }, 
)
ml_potential.train(parent_dataset=train_dataset[:2])

My CUDA version is 11.3. nvidia-smi shows the training process and its GPU memory usage, but the volatile GPU utilization is 0 and the power consumption has not increased. Is there a problem with my parameter settings?

VaspInteractive cannot find the MPI process

The finetuna job is running on a Cray cluster. The module I use is "module swap PrgEnv-cray PrgEnv-intel".
The default MPI is Cray MPI. The warning is:
"UserWarning: Cannot find the mpi process or you're using different ompi wrapper. Will not send stop signal to mpi."
Does the warning affect the calculation speed?

Best,
Li Yuke

Energy calculated by MLP vs Quantum ESPRESSO has a huge difference

Hello,

I ran the Quantum ESPRESSO example from 'examples/quantum_espresso/qe_gpu_online_al_example.py' (I changed nothing in the example except the paths to QE and the pseudopotential). After the calculation completed, I plotted the energies from the predicted trajectory ('online_learner_trajectory.traj') and saw that the energies predicted by the MLP are around -1 eV, while the energies predicted by QE are around -140000 eV. Is there something I am missing? How do I get the correct energy values?

I have attached the predicted energy plot.

[image: predicted energy plot]
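
One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that the two calculators use different energy references: a DFT code such as QE reports total energies, while a pretrained ML potential may predict energies on a referenced scale. Under that assumption, comparing energies relative to a common frame is more informative than comparing absolute values. The helper name and the numbers below are purely illustrative.

```python
def relative_energies(energies, reference_index=0):
    """Shift a list of energies so the chosen frame sits at zero."""
    reference = energies[reference_index]
    return [energy - reference for energy in energies]


# Illustrative numbers only; they are not taken from the issue.
qe_total = [-140000.0, -140000.5, -140001.0]
mlp_predicted = [-1.0, -1.5, -2.0]

print(relative_energies(qe_total))
print(relative_energies(mlp_predicted))
```

If the two shifted curves agree, the difference is only a constant offset; if they do not, the offset alone does not explain the discrepancy.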

VaspInteractive vs Vasp calculator

Hello!

Based on some of the previous issues, I think a problem I'm running into is related to the VaspInteractive calculator not being compatible with my custom build of VASP 5.4.4. So I simply replaced 'VaspInteractive' with 'Vasp' from ASE (~line 63 of the example wrapper script), and I get the following traceback:

Traceback (most recent call last):
File ".../finetuna_wrapper.py", line 87, in <module>
with vasp_calc as parent_calc:
AttributeError: __enter__

Is there something else I need to do that's being missed here?
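
The error indicates that the object used in the `with` statement does not implement the context-manager protocol (`__enter__` / `__exit__`). A minimal sketch of one way to keep the same `with` block for both kinds of calculator is shown here, assuming the standard library's contextlib.nullcontext is acceptable. as_context and PlainCalc are hypothetical names, and whether the plain Vasp calculator then works with the rest of the finetuna workflow is not confirmed by this thread.

```python
from contextlib import nullcontext


def as_context(calc):
    """Return calc unchanged if it supports `with`, else wrap it in nullcontext."""
    if hasattr(calc, "__enter__") and hasattr(calc, "__exit__"):
        return calc
    # nullcontext(calc) yields calc on entry and does nothing on exit.
    return nullcontext(calc)


class PlainCalc:
    """Hypothetical stand-in for a calculator without __enter__/__exit__."""

    name = "plain"


with as_context(PlainCalc()) as parent_calc:
    print(parent_calc.name)
```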

Finetuna crashes after the first DFT calculation

Issue

I tried to run FINETUNA with VASP 6.3.0 to relax H*CO on Pt(111) using the provided ASE example template (no. 1). Ten steps are performed with the MLP and then a DFT calculation is triggered. However, after the DFT calculation converges, the software crashes and reports the following error message:

Trying to close the VASP stream but encountered error:
process PID not found (pid=181196)
Will now force closing the VASP process. The OUTCAR and vasprun.xml outputs may be incomplete
Force below threshold: check with parent
OnlineLearner: Parent calculation required
Traceback (most recent call last):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 442, in wrapper
ret = self._cache[fun]
AttributeError: _cache

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1642, in wrapper
return fun(self, *args, **kwargs)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 445, in wrapper
return fun(self)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1684, in _parse_stat_file
data = bcat("%s/%s/stat" % (self._procfs_path, self.pid))
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 775, in bcat
return cat(fname, fallback=fallback, _open=open_binary)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 763, in cat
with _open(fname) as f:
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_common.py", line 727, in open_binary
return open(fname, "rb", buffering=FILE_READ_BUFFER_SIZE)
FileNotFoundError: [Errno 2] No such file or directory: '/proc/181196/stat'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 361, in _init
self.create_time()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 714, in create_time
self._create_time = self._proc.create_time()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1642, in wrapper
return fun(self, *args, **kwargs)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1852, in create_time
ctime = float(self._parse_stat_file()['create_time'])
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/_pslinux.py", line 1649, in wrapper
raise NoSuchProcess(self.pid, self._name)
psutil.NoSuchProcess: process no longer exists (pid=181196)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/gpfs/data/cfgoldsm/bkreitz1/VASP/methane-oxidation/neb/h--co-diss/IS/finetuna/example.py", line 106, in <module>
relaxer.run(
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/atomistic_methods.py", line 198, in run
dyn.run(fmax=self.fmax, steps=self.steps)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 294, in run
return Dynamics.run(self)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 181, in run
for converged in Dynamics.irun(self):
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 168, in irun
self.log()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/optimize/optimize.py", line 308, in log
forces = self.atoms.get_forces()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/atoms.py", line 790, in get_forces
forces = self._calc.get_forces(self)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/calculators/abc.py", line 23, in get_forces
return self.get_property('forces', atoms)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/ase/calculators/calculator.py", line 736, in get_property
self.calculate(atoms, [name], system_changes)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/online_learner/online_learner.py", line 189, in calculate
energy, forces, fmax = self.get_energy_and_forces(atoms)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/online_learner/online_learner.py", line 259, in get_energy_and_forces
energy, forces, constrained_forces = self.add_data_and_retrain(
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/finetuna/online_learner/online_learner.py", line 491, in add_data_and_retrain
self.parent_calc._pause_calc()
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/vasp_interactive/vasp_interactive.py", line 471, in _pause_calc
mpi_process = _find_mpi_process(pid)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/vasp_interactive/vasp_interactive.py", line 65, in _find_mpi_process
process_list = [psutil.Process(pid)]
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 332, in __init__
self._init(pid)
File "/users/bkreitz1/anaconda/finetuna/lib/python3.9/site-packages/psutil/__init__.py", line 373, in _init
raise NoSuchProcess(pid, msg='process PID not found')
psutil.NoSuchProcess: process PID not found (pid=181196)
Trying to close the VASP stream but encountered error:
'psutil'

Software

OpenMPI 4.0.5
Intel 2020.2
Python 3.9.0

Executed on 2 nodes, with 2 tasks per node and 8 CPUs per task (not sure if that's relevant)

I adjusted the vasp calculator as follows:

vasp_calc = VaspInteractive(
    ibrion=-1,
    nsw=0,
    ispin=1,
    ediff=1e-6,
    ediffg=-0.03,
    encut=450.0,
    laechg=False,
    lcharg=False,
    lwave=False,
    #ncore=4,
    xc="beef-vdw",
    kpts=(3,3,1),
)

A case where finetuna fails to reach a reasonable minimum

Hello again! I've found a couple of cases where finetuna fails to reach a reasonable minimum. From my very brief use of it, the common theme seems to be weak-binding metals with adsorbates that don't quite bind. It seems that finetuna can fail to locate a minimum with the molecule just above the surface; instead, the molecule can float up through the periodic image, potentially causing the surfaces to blow up.

I don't know how useful this is, but I've also included a zip file with one such case, *N2H on Ag (111). I'm running a regular VASP optimization to see how the cg optimizer handles this and will update when it's finished.
to_zulissi.zip

QE colab calculator does not work as intended

The traceback for the error is as follows:

/usr/local/lib/python3.6/dist-packages/al_mlp/offline_learner.py in learn(self)
     88 
     89         while not self.terminate:
---> 90             self.do_before_train()
     91             self.do_train()
     92             self.do_after_train()
/usr/local/lib/python3.6/dist-packages/al_mlp/offline_learner.py in do_before_train(self)
     98         """
     99         if self.iterations > 0:
--> 100             self.query_data()
    101         self.fn_label = f"{self.file_dir}{self.filename}_iter_{self.iterations}"
    102 
/usr/local/lib/python3.6/dist-packages/al_mlp/offline_learner.py in query_data(self)
    140         """
    141         queried_images = self.query_func()
--> 142         self.training_data += compute_with_calc(queried_images, self.delta_sub_calc)
    143 
    144     def check_terminate(self):
/usr/local/lib/python3.6/dist-packages/al_mlp/utils.py in compute_with_calc(images, calculator)
     53     for image in images:
     54         image.set_calculator(copy.deepcopy(calculator))
---> 55     return convert_to_singlepoint(images)
     56 
     57 
/usr/local/lib/python3.6/dist-packages/al_mlp/utils.py in convert_to_singlepoint(images)
     24         os.makedirs("./temp", exist_ok=True)
     25         os.chdir("./temp")
---> 26         sample_energy = image.get_potential_energy(apply_constraint=False)
     27         sample_forces = image.get_forces(apply_constraint=False)
     28         image.set_calculator(
/usr/local/lib/python3.6/dist-packages/ase/atoms.py in get_potential_energy(self, force_consistent, apply_constraint)
    731                 self, force_consistent=force_consistent)
    732         else:
--> 733             energy = self._calc.get_potential_energy(self)
    734         if apply_constraint:
    735             for constraint in self.constraints:
/usr/local/lib/python3.6/dist-packages/ase/calculators/calculator.py in get_potential_energy(self, atoms, force_consistent)
    706 
    707     def get_potential_energy(self, atoms=None, force_consistent=False):
--> 708         energy = self.get_property('energy', atoms)
    709         if force_consistent:
    710             if 'free_energy' not in self.results:
/usr/local/lib/python3.6/dist-packages/ase/calculators/calculator.py in get_property(self, name, atoms, allow_calculation)
    734             if not allow_calculation:
    735                 return None
--> 736             self.calculate(atoms, [name], system_changes)
    737 
    738         if name not in self.results:
/usr/local/lib/python3.6/dist-packages/al_mlp/calcs.py in calculate(self, atoms, properties, system_changes)
     52         self.calcs[0].results = self.parent_results
     53         self.calcs[1].results = self.base_results
---> 54         super().calculate(atoms, properties, system_changes)
     55 
     56         if "energy" in self.results:
/usr/local/lib/python3.6/dist-packages/ase/calculators/mixing.py in calculate(self, atoms, properties, system_changes)
     50 
     51         for w, calc in zip(self.weights, self.calcs):
---> 52             calc.calculate(atoms, properties, system_changes)
     53 
     54             for k in properties:
TypeError: calculate() takes from 2 to 3 positional arguments but 4 were given
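
The final TypeError suggests that the mixing calculator calls calculate(atoms, properties, system_changes) on an inner calculator whose calculate() accepts only atoms and properties. One way to bridge such a mismatch, sketched here under that assumption, is a thin adapter that accepts the extra argument and drops it. LegacyCalc and SystemChangesAdapter are hypothetical names; neither is part of al_mlp or ASE, and the real fix may instead be a matching ASE or calculator version.

```python
class LegacyCalc:
    """Hypothetical calculator whose calculate() lacks system_changes."""

    def __init__(self):
        self.results = {}

    def calculate(self, atoms, properties=None):
        # Placeholder result standing in for a real calculation.
        self.results = {"energy": 0.0}


class SystemChangesAdapter:
    """Accept the three-argument call and forward only what the inner calc takes."""

    def __init__(self, calc):
        self.calc = calc

    @property
    def results(self):
        return self.calc.results

    def calculate(self, atoms=None, properties=None, system_changes=None):
        # system_changes is intentionally ignored: the wrapped calculator
        # has no parameter for it.
        self.calc.calculate(atoms, properties)


adapter = SystemChangesAdapter(LegacyCalc())
adapter.calculate("atoms-placeholder", ["energy"], ["positions"])
print(adapter.results)
```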

Importing FinetunerCalc leads to Segmentation Fault

I tried testing the QE example code (qe_gpu_online_al_example.py). However, Python quickly exits with only the message "Segmentation Fault".

When I ran the code line by line in the interactive Python interpreter, I found that the following line causes the segmentation fault:
from finetuna.ml_potentials.finetuner_calc import FinetunerCalc

What could be causing this?

Some additional details:

  • I am on an AWS EC2 instance with GPU
  • I created my Python environment by calling conda env create -f env.gpu.yml. However, the code failed to run, and I found that calling import torch; torch.cuda.is_available() returned False. So I then called pip3 install torch torchvision torchaudio to get a GPU-enabled PyTorch installed in my environment.
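
Since the second bullet describes replacing the PyTorch build after the environment was created, it may help to record which build is actually imported. The sketch below only gathers basic facts about the installed torch package; torch_cuda_report is a hypothetical helper name. It is an assumption, not something established in this issue, that a mismatch between the pip-installed torch and packages compiled against the original build is behind the crash.

```python
def torch_cuda_report() -> dict:
    """Collect basic facts about the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the wheel was built with; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_cuda_report())
```

Comparing this output before and after the pip3 install step shows whether the torch build changed underneath the other compiled dependencies.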

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

circleci
.circleci/config.yml
  • circleci/python 3.7
github-actions
.github/workflows/black.yml
  • actions/checkout v2
.github/workflows/unittests.yml
  • actions/checkout v2

  • Check this box to trigger a request for Renovate to run again on this repository

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Location: renovate.json
Error type: The renovate configuration file contains some invalid settings
Message: Invalid configuration option: pipfile

Multiple problems when running CPU version

Hello,

I recently installed the CPU version of Finetuna according to the README on our local servers and ran into a series of issues.

The first error was in pymatgen. The error said that yaml.safe_load() had been removed. I edited the local_env.py of pymatgen to fit the new format.

Next, I got the error OSError: libc10_cuda.so, no such file or directory.
Apparently, this is a .so file from PyTorch that had not been installed. I edited the _ops.py in the ctypes library to manually load these .so files, which I took from the GPU version of Finetuna. I wonder whether these .so files are even necessary for the CPU version. Since I am not running a CUDA device, these files should be redundant, right?

After that, a Python script in the ocpmodels directory was missing. I took a look at the OCP20 repository and found that they had merged two trainer files into one, which led to the error. I edited the imports to point to the merged file.

Now I am stuck on an error that I have no idea how to solve.

File "/.../mambaforge_install/envs/finetuna/lib/python3.9/site-packages/llvmlite/binding/targets.py", line 201, in from_triple
raise RuntimeError(str(outerr))
RuntimeError: No available targets are compatible with triple "x86_64-unknown-linux-gnu"

My research on the error message has not been very helpful so far. I hope that another user has run into this problem, or that someone with more experience with llvmlite can help me understand it.

I also made an environment for the GPU version, and except for the YAML issue it started without problems (and then crashed as expected, since I don't have a CUDA device to run the program on).

Is there perhaps a more up-to-date version of the CPU environment? The YAML and OCP20 issues in particular seem to come from changes in other libraries that negatively affect Finetuna's functionality.

Thank you in advance!
