pymatnest's People

Contributors

albapa, bernstei, gabor1, jameskermode, liviabp, martinschlegel, rjnbaldock, stenczelt, tdaff, yangmr04

pymatnest's Issues

Nested sampling failed for converge_down_to_T

ERROR.tar.gz

I am running NS for a GAP model with 'converge_down_to_T', but the runs fail with an MPI abort.

As a sanity check, I took the LJ input file from the path below and reran the NS.

https://github.com/libAtoms/pymatnest/blob/master/example_inputs/inputs.test.periodic.MD.lammps.converged

The LJ run with converge_down_to_T failed with the same error. All the inputs and outputs are attached for your reference.

I was previously using n_iter_times_fraction_killed, which worked fine and ran for many iterations.

'cell' variable isn't being set since recent updates

In Fortran models the 'cell' variable isn't being passed correctly to ll_eval_energy(...). In my own tests it's either all zeros, or all zeros except for the last value. My model works with previous versions of the pymatnest code (e.g. from around May-June), and I normally set the cell using the max_volume_per_atom keyword.

Has there been a change in functionality, or is this a bug?

lammpslib.py - failing in test cases

The test case in https://svn.fysik.dtu.dk/projects/ase-extra/trunk/ase/test/testlammpslib.py
fails with the trunk versions of LAMMPS and ASE.
The test.log contains:
----------------------------------- test.log start -----------------------------------------------------------
LAMMPS (26 Jan 2017)
units metal
atom_style atomic
atom_modify map array sort 0 0
boundary p p s
region cell prism 0 5.72756492761 0 4.96021672914 0 0.0 0.0 0.0 0.0 units box
ERROR: Illegal region prism command (../region_prism.cpp:88)
Last command: region cell prism 0 5.72756492761 0 4.96021672914 0 0.0 0.0 0.0 0.0 units box
Total wall time: 0:00:00
----------------------------------- test.log end -----------------------------------------------------------
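
One detail stands out in the log: the z bounds of the prism are `0` and `0.0`, so the region has no extent along z. The sketch below is a small pre-check. It assumes that LAMMPS rejects a prism whose lower bound is not strictly below its upper bound, which the log does not confirm. `check_region_prism` is a hypothetical helper, not part of lammpslib.py.

```python
def check_region_prism(command):
    """Return the axes whose lower bound is not strictly below the upper bound."""
    tokens = command.split()
    # Expected layout: region <ID> prism xlo xhi ylo yhi zlo zhi xy xz yz [keywords]
    if len(tokens) < 12 or tokens[0] != "region" or tokens[2] != "prism":
        raise ValueError("not a complete 'region ... prism' command")
    bounds = [float(t) for t in tokens[3:9]]
    bad_axes = []
    for axis, lo, hi in zip("xyz", bounds[0::2], bounds[1::2]):
        if lo >= hi:
            bad_axes.append(axis)
    return bad_axes


# The command copied from test.log above.
cmd = ("region cell prism 0 5.72756492761 0 4.96021672914 0 0.0 "
       "0.0 0.0 0.0 units box")
print(check_region_prism(cmd))  # ['z']
```

If that assumption holds, the failing test is passing a cell with zero height along the non-periodic direction, and the region command would need a finite z range.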

Similarly this case fails:

---------------------------------------- lammpslib-example.py Start---------------------------------------------------
from ase import Atom, Atoms
from lammpslib import LAMMPSlib
cmds = ["pair_style eam/alloy",
"pair_coeff * * NiAlH_jea.eam.alloy Al H"]
a = 4.05
al = Atoms([Atom('Al')], cell=(a, a, a), pbc=True)
h = Atom([Atom('H')])
alh = al + h
lammps = LAMMPSlib(lmpcmds = cmds, logfile='test.log')
alh.set_calculator(lammps)
print "Energy ", alh.get_potential_energy()
---------------------------------lammpslib-example.py End -----------------------------------------------------------

I get this on stdout:
Traceback (most recent call last):
  File "pymatnest-example.py", line 10, in <module>
    alh = al + h
  File "/home/vama/install/local/anaconda2/lib/python2.7/site-packages/ase/atoms.py", line 866, in __add__
    atoms += other
  File "/home/vama/install/local/anaconda2/lib/python2.7/site-packages/ase/atoms.py", line 872, in extend
    other = self.__class__([other])
  File "/home/vama/install/local/anaconda2/lib/python2.7/site-packages/ase/atoms.py", line 150, in __init__
    atoms = self.__class__(None, *data)
  File "/home/vama/install/local/anaconda2/lib/python2.7/site-packages/ase/atoms.py", line 195, in __init__
    self.new_array('numbers', numbers, int)
  File "/home/vama/install/local/anaconda2/lib/python2.7/site-packages/ase/atoms.py", line 391, in new_array
    a = np.array(a, dtype)
TypeError: long() argument must be a string or a number, not 'Atom'

Thanks,

Possible bug in example_LJ_model.F90

I think there's a bug in the pymatnest Fortran example model: example_LJ_model.f90.

In ll_eval_energy there is a term:
if (i==j) E_term = E_term * 0.5
It looks like this is meant to prevent double-counting. However, I think the double-counting actually happens when the indices i and j are different, not when they are the same as the if(i==j) line above implies.

For example if I've got two atoms then the i and j loops give me:
i=0, j=0
i=0, j=1
i=1, j=0
i=1, j=1

As far as I can see the only potential for a double-count of the energy in the i == j cases is when the 0th images are considered (i.e. dj1, dj2, dj3 all equal 0) and that is correctly prevented by the line:
if (i == j .and. dj1 == 0 .and. dj2 == 0 .and. dj3 == 0) cycle
The double-counting actually happens for the two cases (i=0, j=1) and (i=1, j=0) when the 0th image of each is being computed.

I.e. in the two-atom case given above there is a double-count between:
i=0, j=1 and dj1 = dj2 = dj3 = 0
and
i=1, j=0 and dj1 = dj2 = dj3 = 0

Apologies if there's a mistake in my logic here.
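
Whether the `if (i==j)` factor is right depends on the bounds of the `j` loop, which the report does not quote. The sketch below is plain Python, not the Fortran from example_LJ_model.F90. The cubic box, the reduced units, the restriction to nearest images and all function names are assumptions made for illustration. It compares two conventions: `j` over every atom with all terms halved, and `j >= i` with only the `i == j` image terms halved.

```python
import itertools
import math


def lj(r):
    """Lennard-Jones pair energy in reduced units (epsilon = sigma = 1)."""
    inv6 = 1.0 / r**6
    return 4.0 * (inv6 * inv6 - inv6)


def image_distance(pos_i, pos_j, shift, box):
    """Distance from atom i to the image of atom j displaced by `shift` boxes."""
    return math.dist(pos_i, [pos_j[k] + shift[k] * box for k in range(3)])


SHIFTS = list(itertools.product((-1, 0, 1), repeat=3))


def energy_full_loop(pos, box, cutoff):
    """j runs over every atom: each interaction is visited twice, so halve all terms."""
    total = 0.0
    for i in range(len(pos)):
        for j in range(len(pos)):
            for shift in SHIFTS:
                if i == j and shift == (0, 0, 0):
                    continue  # no self-interaction in the home cell
                r = image_distance(pos[i], pos[j], shift, box)
                if r < cutoff:
                    total += 0.5 * lj(r)
    return total


def energy_half_loop(pos, box, cutoff):
    """j runs over j >= i: only the i == j image terms are visited twice."""
    total = 0.0
    for i in range(len(pos)):
        for j in range(i, len(pos)):
            for shift in SHIFTS:
                if i == j and shift == (0, 0, 0):
                    continue
                r = image_distance(pos[i], pos[j], shift, box)
                if r < cutoff:
                    term = lj(r)
                    if i == j:
                        term *= 0.5  # images +shift and -shift are the same pair
                    total += term
    return total


positions = [(0.0, 0.0, 0.0), (1.1, 0.3, 0.2)]
e_full = energy_full_loop(positions, box=2.5, cutoff=3.0)
e_half = energy_half_loop(positions, box=2.5, cutoff=3.0)
print(e_full, e_half)
```

The two totals agree to rounding error. If the Fortran loop starts `j` at `i`, halving only the `i == j` terms is therefore consistent. If `j` runs over all atoms, as in the four-case listing above, every term would need the factor.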

no way to disable sending communicator to lammps initializer

LAMMPS seems to have recently changed so that the serial version no longer has symbols that conflict with the actual MPI library used by mpi4py. This means that serial LAMMPS can be used, but then you have to send None for the communicator instead of COMM_SELF, as happens now. That's an easy patch, but should the default be LAMMPS_serial=T, or would that break too many existing setups that use MPI LAMMPS + COMM_SELF?
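
A minimal sketch of the switch being proposed, assuming a boolean option along the lines of the LAMMPS_serial flag mentioned above. The helper name and its arguments are hypothetical and do not come from the pymatnest source.

```python
def lammps_comm_for(lammps_serial, comm_self):
    """Return the communicator argument to use when creating a LAMMPS instance.

    lammps_serial: True if the LAMMPS library was built without real MPI.
    comm_self:     the single-process communicator to use with an MPI build
                   (for example mpi4py's MPI.COMM_SELF).
    """
    # A serial build gets no communicator at all; an MPI build gets COMM_SELF.
    return None if lammps_serial else comm_self
```

Defaulting `lammps_serial` to false would keep existing MPI LAMMPS + COMM_SELF setups working unchanged, at the cost of requiring serial users to opt in.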

Poor parallel scaling efficiency due to MPI_gather_all

Per e-mail conversation with Gabor I'm posting about this issue here. Basically the parallel scaling of pymatnest is relatively poor, and performance drops off at a relatively low number of CPU cores.

Taking Archer as an example (Archer is a Cray machine very similar to Titan in the US) with 1152 walkers I see a drop-off in parallel scaling after just 12 cores (1/2 a node), while with 11520 walkers I can only scale up to 48 cores. From looking at the code it seems very likely the problem lies with over-use of the MPI_gather_all routine, as this causes a lot of congestion between nodes (on Archer each node is 24 cores). Gabor informed me that he had trouble going beyond 96 cores (4 nodes).

I've posted my (brief) results from my tests on Archer here, with some discussion of the cause (see the pure MPI_gather_all test towards the end):
https://gist.github.com/erlendd/c236f393ed597187c612599cb472cd4b

Error in LAMMPS example

Hi, I cannot run the LAMMPS test examples; it seems a variable is missing.
I get the following error:

$>mpirun -n 2 ../ns_run < inputs.test.cluster.MC.lammps
WARNING: no quippy module loaded
WARNING: no quippy module loaded
comm <mpi4py.MPI.Intracomm object at 0x7f66f1d06900> size 2 rank 0
comm <mpi4py.MPI.Intracomm object at 0x7f00f2fcf900> size 2 rank 1
Traceback (most recent call last):
  File "../ns_run", line 6, in <module>
Traceback (most recent call last):
  File "../ns_run", line 6, in <module>
    ns_run.main()
  File "/home/vama/soft/pymatnest/ns_run.py", line 2606, in main
    ns_run.main()
  File "/home/vama/soft/pymatnest/ns_run.py", line 2606, in main
    exit_error("need either n_iter_times_fraction_killed or converge_down_to_T")
TypeError: exit_error() takes exactly 2 arguments (1 given)
    exit_error("need either n_iter_times_fraction_killed or converge_down_to_T")
TypeError: exit_error() takes exactly 2 arguments (1 given)
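
The `TypeError` hides the message the code was trying to report: the input file sets neither `n_iter_times_fraction_killed` nor `converge_down_to_T`. The sketch below reproduces the failure mode. It assumes `exit_error` takes a message plus one more argument; the traceback only says it takes two, so the parameter name `stat` and its use as an exit status are guesses.

```python
import sys


def exit_error(message, stat):
    """Hypothetical stand-in: report `message` and exit with status `stat`."""
    sys.stderr.write(message)
    sys.exit(stat)


def call_without_status():
    """Mimic the call in the traceback; the intended message is never printed."""
    try:
        exit_error("need either n_iter_times_fraction_killed or converge_down_to_T")
    except TypeError as err:
        return str(err)


def call_with_status():
    """Pass the second argument so the message is printed and the run exits."""
    try:
        exit_error("need either n_iter_times_fraction_killed or converge_down_to_T\n", 1)
    except SystemExit as exc:
        return exc.code
```

Independently of that call, adding one of the two keywords to the input file should avoid reaching this branch at all.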

License?

Is this project licensed under an open-source license?

MPI restart bug?

Restarting Nested Sampling runs by first concatenating the walkers

cat ACE_NS.snapshot.8607.*.extxyz > ACE_NS.snapshot.8607.all.extxyz

and specifying the restart input file:

restart_file=ACE_NS.snapshot.8607.all.extxyz

has always worked for me, and still seems to work fine in serial. However, with the most recent pymatnest under MPI I get something weird like this:

6 truncating traj file to start_first_iter 8608
 Uncaught Exception Type: <class 'TypeError'>
 Value: '<' not supported between instances of 'NoneType' and 'int'
 Traceback: <traceback object at 0x14775b2d5680>
 Aborting
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 6

I think it happens during the distribution of walkers over the MPI processes, right after reading them in from the restart file.

Weirdly enough, this only happens on restarts, and restarting in serial looks fine.

I'm not sure how this could be related to the latest ASE calculator functionality. Judging by the code changes it looks unlikely to have anything to do with it, but I do feel it's related somehow.

Starting an NS run from scratch using MPI works fine; only the MPI restarts raise this error.
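
The concatenation step described at the top of this report can be sketched in Python as follows. The function name and arguments are hypothetical; the file naming simply follows the pattern quoted above. It does not address the `TypeError` itself.

```python
import glob
import shutil


def concatenate_snapshots(prefix, iteration, out_label="all"):
    """Join the per-process snapshot files into one file and return its name."""
    out_name = f"{prefix}.snapshot.{iteration}.{out_label}.extxyz"
    pattern = f"{prefix}.snapshot.{iteration}.*.extxyz"
    # A leftover output file from an earlier attempt matches the pattern too,
    # so exclude it before opening the output for writing.
    parts = sorted(p for p in glob.glob(pattern) if p != out_name)
    with open(out_name, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return out_name
```

Calling `concatenate_snapshots("ACE_NS", 8607)` would write `ACE_NS.snapshot.8607.all.extxyz`, which can then be given as `restart_file`. Files are joined in lexicographic order, like a shell glob.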

Incorrect Contents of `misc_calc_lib`

@MartinSchlegel Following the instructions for the XRD, it appears that misc_calc_lib was inadvertently overwritten with the contents of make_thermal_average_xrd_rdfd_lenhisto.

Traceback (most recent call last):
  File "../pymatnest/make_thermal_average_xrd_rdfd_lenhisto.py", line 1, in <module>
    import misc_calc_lib
  File "/home/ubuntu/pymatnest/misc_calc_lib.py", line 163, in <module>
    rdfd_results = misc_calc_lib.rdfd_QUIP(QUIP_path,at,n_a,r_range)
AttributeError: 'module' object has no attribute 'rdfd_QUIP'

This should be a trivial fix. Thanks!
