aiidateam / aiida-atomistic Goto Github PK
View Code? Open in Web Editor NEWAiiDA plugin which contains data and methods for atomistic simulations.
License: MIT License
AiiDA plugin which contains data and methods for atomistic simulations.
License: MIT License
I set allow_kinds
to False, then got an error.
properties = {'cell': {'value': [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
'pbc': {'value': [True, True, True]},
'positions': {'value': [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]]},
'symbols': {'value': ['Li', 'Li']},
'mass': {'value': [6.941, 6.941]},
'charge': {'value': [1.0, 0.0]},
# 'kinds': {'value': ['Li0', 'Li0']},
}
structure = StructureData(properties = properties, allow_kinds=False)
--> 780 self.get_kinds(kind_tags=self.properties.kinds.value)
781
782 # Store the properties in the StructureData node.
[~/repos/mikibonacci/aiida-atomistic/src/aiida_atomistic/data/structure/properties/property_utils.py](https://file+.vscode-resource.vscode-cdn.net/home/xing/tests/aiida/atomistic/~/repos/mikibonacci/aiida-atomistic/src/aiida_atomistic/data/structure/properties/property_utils.py) in <lambda>(self, type_hint, attr)
105 assert issubclass(type_hint, BaseProperty)
106 cls._valid_properties.add(attr)
--> 107 func_get = lambda self, type_hint=type_hint, attr=attr: self._template_property(type_hint=type_hint, attr=attr)
108
109 # We do not allow to set any property after the creation of the instance:
[~/repos/mikibonacci/aiida-atomistic/src/aiida_atomistic/data/structure/properties/property_utils.py](https://file+.vscode-resource.vscode-cdn.net/home/xing/tests/aiida/atomistic/~/repos/mikibonacci/aiida-atomistic/src/aiida_atomistic/data/structure/properties/property_utils.py) in _template_property(self, type_hint, attr)
125 return type_hint(
126 parent=self._parent,
--> 127 **self.get_property_attribute(attr)
...
---> 98 return self._property_attributes[key]
99
100 def _inspect_properties(self,properties):
KeyError: 'kinds'
As design principle, we want to have the StructureData
as immutable even if it is not stored. This is done in order to avoid the user to attempt to modify the stored node afterwards, as one usually would do in standard python.
However, this blocks also flexibility on how we produce our StructureData
instance as ready to be used in calculation. For example, it will be no more possible to use methods like append_atom()
for the unstored node, as well as the append_hubbard_parameter
in case of the Hubbard
property.
To change/update the instance, the only possibility is to generate a new instance, and this is made easier by having the to_dict
method, which returns a python dictionary with everything we need to initialise a new object.
However, I think that then the user has to deal with a rather complex object:
properties = {'cell': {'value': [[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
'pbc': {'value': [True, True, True]},
'positions': {'value': [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]]},
'symbols': {'value': ['Li', 'Li']},
'mass': {'value': [6.941, 6.941]},
'charge': {'value': [1.0, 0.0]},
'kinds': {'value': ['Li0', 'Li1']}}
(we can always dump this as an AttributeDict
).
The main difficulty here is that, to change a property, we have to go into this nested structure every time. Moreover, there are not method to change multiple properties at the same time (e.g., a python dictionary has no remove_atom
method) - the user should implements its own function to do it.
*I am thinking that maybe returning a different object, can make the modification easier for the user. *
For example, we can think to return an instance of the Properties
class (property
attribute of the structure node), which actually contains all the information on the StructureData node. This contains the instances of the properties as attributes, so we can call methods of the properties to modify them if we need it.
In [1]: structure.properties.charge
Out[1]: Charge(parent=<StructureData: uuid: 4ccb3961-e6ff-4b32-8dcf-a638b765981c (unstored)>, domain='intra-site', default_kind_threshold=0.1, value=[1.0, 0.0])
The modified Properties
instance can be used to initialise a new StructureData
(consistency checks are done in the __init__
).
One issue that I see is that right now the properties are connected to the StructureData
instance.
We can think on making the properties mutable only if they are not attached to any StructureData
instance, i.e. if are only attributes of the Properties
instance:
In [2]: properties = structure.to_mutable()
In [3]: properties.charge
Out[3]: Charge(domain='intra-site', default_kind_threshold=0.1, value=[1.0, 0.0])
we can then even think to provide methods to append/remove atoms and properties:
In [4]: properties.append_atom(positions=[0,0,0], symbol="Li", mass=6.941, charge=-1)
In [5]: properties.charge.append_charge(charge=-1, atom_index=1)
It seems that in this way we are providing a new class, which is not AiiDA datatype. It seems that we have then to maintain two classes: actually we are somehow transferring all the modification methods (append, remove atoms) to the Property
class.
***I am really not sure this is the way to go, it is just an idea on how we can try to improve user experience. ***
In the end we can somehow achieve a similar thing (limited wrt the properties which are supported), by using ASE or Pymatgen combined with StructureData.from/to_ase/pymatgen
.
Immutability of
properties
attributeproperties.x
, where x
is a propertyI have to check how this is done in aiida-core.
I was playing around with the basic StructureData
and found some very strange behavior. Running the following:
from aiida_atomistic.data.structure import StructureData
properties_dict = {
"cell":{"value":[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
"pbc":{"value":[True,True,True]},
"positions":{"value":[[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]]},
"symbols":{"value":["Li","Li"]},
}
structure = StructureData(properties = properties_dict)
print('cell' in dir(structure))
structure.cell
prints True
, but then raises an AttributeError
when I try to access cell
. ๐
Up to know, atom can be placed in the cell or by providing ASE, Pymatgen objects, or using the appdend_atom()
method of the original orm.StructureData
.
However, as the StructureData
will be basically read-only in this new implementation, also atomic positions should be provided in the initialization phase of the instance. This means that we should provide symbols
and positions
input parameters.
On the implementation, I think a good starting point is to use the append_atom()
method during the __init__
phase.
append_atom
after the initialization.properties
namespaceProperties may be stored inside an attribute called properties
. For example in case of the pbc
property, it should be stored as StructureData.properties.pbc
: this is the class Pbc
and its actual value (which is a list of booleans) can be accessed via StructureData.properties.pbc.value
.
properties
attribute can be subject to a consistency check, to understand if there is something unknown. and then raise an error (the same check can be done via the structure.get_valid_properties() and structure.get_defined_properties() methods).pbc
a property? What about the mass of each site?The advantage of database (storing properties as attribute of StructureData node) is that it is possible to query and search. The repository, instead, is not subject to any query.
However, for example in case of Hubbard parameters (see HubbardStructureData), these are stored in the repository as JSON files, using pydantic functionalities.
Sometimes, we want to be queryable only some part of the properties. This means that we can design a property class to determine which of its parameters/values should be stored and where; we add a level of flexibility for the storage location. This flexibility however should be only in the implementation of the property, i.e. we cannot decide the storage location on-the-fly when creating a StructureData instance.
Stored in self.base.attributes._property_attributes dictionary. --> should change to self.base.attributes.property
Stored in the repository and accessed via pydantic BaseModel functionalities.
Currently we have strong bias towards periodic boundary conditions built-in the StructureData
. Moreover, we consider the dimensionality to be directly derived from the periodic boundary conditions.
However, there are certainly use cases where you have e.g. a 3D system, but different boundary conditions along a certain direction, see:
test if the new StructureData
is slower than the original one.
Currently, the StructureData
is completely defined as a list of properties (as explained here):
I understand the appeal of this, and maybe it's the way to go. Part of me is still wondering if all properties should be equal, i.e. if some properties should receive special treatment (cell
, pbc
, positions
, ...). Thinking of @mikibonacci's question here, I'd prefer to still be able to access the cell
as structure.cell
, not as:
structure.properties.cell
which then outputs a Cell
object:
Cell(parent=<StructureData: uuid: 2f9fc6f5-e3a6-4811-89d6-41015b5f50b4 (unstored)>, domain='global', value=[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]])
If I want to look at the cell, I'd just want to have structure.cell
return a list of list or numpy
array. I don't want to have to deal with the Cell
object and then figure out how to get the cell vectors from this.
That said, other tools like pymatgen
and ASE also have their own "cell" variants (Structure
-> Lattice
; Atoms
-> Cell
). And I can see how Cell.value
lets you get what you desire. So maybe I'm just ranting for no reason here. Will think about it some more as I'm dogfooding.
EDIT: I think I may have managed to derail the conversation from my original point immediately in the same post. I insta-"Mr. Scope creeped" myself. ๐
In the user guide, one print out the properties:
The whole list of currently supported properties is:
['symbols', 'cell', 'mass', 'charge', 'positions', 'pbc', 'custom', 'kinds']
Stored properties are:
['symbols', 'cell', 'mass', 'positions', 'pbc', 'kinds']
Why charge
and custom
is not stored?
Currently, we cannot define unsupported properties (some check in set_property
method against the get_valid_properties
output), or defined property with some parameters which is not recognized (extra parameters);
When a structure is passed to a plugin calcjob/workchain, there should be a check on all the properties associated to the structure. Then two behaviors can happen:
1 - the code uses a calcfunction to create a new instance of the StructureData without the not supported property
2 - the code warns about the unsupported property, but skips it and continue; this can also be done using some keyword ignore_property="hubbard"
The second option is the preferred one if we have multiple codes which support different properties, and also avoid some StructureData duplication.
-> it should be obvious how and why the property is ignored in the plugin (for example if you look at an old calculation, you should understand the decision clearly). There should be a part of the documentation which explain how to do it in a plugin, the keywords to be used, the warnings/exceptions
We may also let the user to define some custom properties, as done in optimade by means of a prefix
.
This means that the StructureData should support such custom property definition, and maybe store them under a different key in the attributes, in such a way to easily find/recognise them.
We should provide a template on how to define such properties.
Currently the constructor of the StructureData
still has e.g. the ase
input to initialize an instance from an ASE Atoms
object:
aiida-atomistic/src/aiida_atomistic/data/structure/__init__.py
Lines 717 to 727 in 1af101e
I would instead want to see a from_ase()
class method, e.g.:
from ase.build import bulk
bulk_si = bulk('Si', cubic=True)
structure_data = StructureData.from_ase(bulk_si)
The current implementation supports two ways to set a property, both via methods of the StructureData itself or of the property class:
### Store the Pbc property in a StructureData node.
##########################################################
#
#
structure = StructureData()
#
# (1) using the set_property method:
#
structure.set_property(name='pbc', value={'value':[True,False,True]})
#
# (2) using built-in method of Pbc property.
# This calls internally the set_property method.
#
structure.pbc.set_from_string(dimensionality="3D")
# ==> structure.pbc.value is then [True,True,True]
#
#The built-in methods then should work in this way:
structure.pbc.set_pbc(input_list=[True,True,True],...)
structure.pbc.from_structure(structure=other_structure,...)
#
# We cannot do the following:
#
# A property should be read-only if accessed directly:
structure.pbc.value = 54
structure.pbc = Pbc([True,False,True])
#
# Not possible to set unsupported properties or to provide random parameters
structure.set_property(name='unsupported', value={'value':5})
structure.set_property(name='pbc', value={'unsupported_params':5})
The built-in methods of the properties should call the set_property
method in order to set/update the value of a property. This ensures consistency and data validation checks and also consistency for the usage of the StructureData in different plugins.
However, in this way the immutability of a property is less robust: we can always use one of the methods above to override a property at any time (of course, only if the node is not stored). This may also lead to some unconsistency with other properties which are already defined. A way to avoid these two issues is to delegate everything (StructureData and properties setting) to the constructor of the StructureData class:
### Store the Pbc property in a StructureData node.
##########################################################
#
#
pbc_3d = Pbc(value=[True,True,True])
spin_up = Magnetization(moments=[1,1,1,...])
...
#
structure = StructureData(pbc=pbc_3d, magnetization=spin_up, ...)
some advantages are:
A disadvantage that I see it is that structure dependent property that we may want to initialize are not straightforward: for example, if we want to provide Hubbard U for a given set of atoms, we can provide the name of the atom, and then a set of operations are done in order to understand each atomic site which should have the same U. This cannot be done a priori and it is a bit counterintuitive on how to do it form the constructor (we pass the Hubbard class, or a dictionary then to be internally processed in order to set the instance of Hubbard?)
In my opinion, we can provide some method in the property classes that allows to instantiate one of them. This however may give a fake hint on the possiblity to mutate them once the StructureData is created. I don't really know about this, but if I think of the Hubbard property, one and very useful method is the from_list
one.
from aiida import orm, load_profile
load_profile()
from aiida_atomistic.data.structure import StructureData
properties_dict = {
"cell":{"value":[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
"pbc":{"value":[True,True,True]},
"positions":{"value":[[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]]},
"charge": {"value": [1.0, 1.0]},
"symbols":{"value":["Li","Li"]},
}
structure = StructureData(properties = properties_dict)
structure.get_ase()
Raise error:
1992 _kinds = self.kinds
1993
[~/miniconda3/envs/aiida/lib/python3.11/site-packages/aiida/orm/nodes/node.py](https://file+.vscode-resource.vscode-cdn.net/home/xing/tests/aiida/atomistic/~/miniconda3/envs/aiida/lib/python3.11/site-packages/aiida/orm/nodes/node.py) in __getattr__(self, name)
720 return getattr(self.base.links, new_name)
721
--> 722 raise AttributeError(name)
AttributeError: cell
Should be related to #12 .
When create the StructureData, one needs to
properties_dict = {
"cell":{"value":[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
"pbc":{"value":[True,True,True]},
"positions":{"value":[[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]]},
"symbols":{"value":["Li","Li"]},
}
structure = StructureData(properties = properties_dict)
It would be great if the user can provide the dict without specific the value
attribute.
properties_dict = {
"cell":[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]],
"pbc":[True,True,True],
"positions":[[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]],
"symbols":["Li","Li"],
}
It would also be good if it support:
structure = StructureData(cell=[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]],
pbc=[True, True, True],
positions=[[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]],
symbols=["Li", "Li"])
In this issue we comment on the description of properties in a site-based approach.
get_kinds
methodIf we adopt the sites based approach, the sites-to-kinds mapping can be provided via a get_kinds
method of the StructureData. This can support tolerances for the definition of new kinds.
However, such procedure can be also done by hands for specific cases (for example, we may want a specific order of the kinds)
Looking at the example on how to create a StructureData
instance:
from aiida_atomistic.data.structure import StructureData
properties_dict = {
"cell":{"value":[[3.5, 0.0, 0.0], [0.0, 3.5, 0.0], [0.0, 0.0, 3.5]]},
"pbc":{"value":[True,True,True]},
"positions":{"value":[[0.0, 0.0, 0.0],[1.5, 1.5, 1.5]]},
"symbols":{"value":["Li","Li"]},
}
structure = StructureData(properties = properties_dict)
And the fact the entire structure data is now captured by the properties, after #10 the constructor would effectively only have the properties
input. This immediately made me think of just having a JSON serializing method (to_json
, from_json
, or similar). But then I thought about @sphuber's AEP (aiidateam/AEP#40) - which I still have to read ๐ - and if we shouldn't be thinking about how to integrate the new StructureData
with the concepts there.
I'll have to read up some more before having a solid opinion on what to do, but I already wanted to raise the issue.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.