vivarium-collective / vivarium-core
Core Interface and Engine for Vivarium
Home Page: https://vivarium-core.readthedocs.io/
License: Apache License 2.0
The fasta library, currently still in vivarium-core, is used to load genome sequence data, which is only used in vivarium-cell. This should be moved, and all imports in vivarium-cell updated.
Python 3.10 deprecates distutils, and it will be removed in Python 3.12. So with the move to support Python 3.10 in #135, we should replace distutils in all vivarium projects. setuptools embeds a forked (and updated?) copy of distutils, but there are compatibility issues, and setuptools is moving away from being a CLI tool to being just a library. setup.py is not dead, but apparently we should not run it directly. It's a mess.
(Presumably wcEcoli will stay on Python 3.8.7 and distutils. I did try updating it to setuptools but reverted that due to CovertLab/wcEcoli#1113.)
PEP 517 (together with PEP 518) defines how to describe a project's build in a small, declarative pyproject.toml file.
Modern tools to consider: see vivarium-ecoli and numpy.distutils. It makes sense for us to learn one modern tool and use it in all vivarium projects. Some of these tools can also obviate the Makefile.
The equality comparisons here and here throw the following error when any values are Numpy arrays:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
For example, this occurs when the new and current schemas are {'_divider': {'divider': 'set_value', 'config': {'value': np.zeros(10)}}}. The above exception is raised due to the np.zeros(10) buried deep within the _divider schema. A custom recursive function is needed to check all schema values for embedded Numpy arrays and compare them separately using np.array_equal.
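A minimal sketch of such a recursive comparison (the helper name schemas_equal is hypothetical):

```python
import numpy as np

def schemas_equal(a, b):
    """Recursively compare two schema values, treating embedded
    numpy arrays specially to avoid ambiguous truth values."""
    if isinstance(a, np.ndarray) or isinstance(b, np.ndarray):
        return np.array_equal(a, b)
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            schemas_equal(a[key], b[key]) for key in a)
    return a == b

new = {'_divider': {'divider': 'set_value',
                    'config': {'value': np.zeros(10)}}}
current = {'_divider': {'divider': 'set_value',
                        'config': {'value': np.zeros(10)}}}
print(schemas_equal(new, current))  # True
```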
We lack robust support for processes with more than one port to the same store.
The multi-update workaround introduced in #28 allows ports that point to multiple leaf stores to share leaf stores. In this case, updates to the shared leaf stores are aggregated into lists and applied sequentially.
Unfortunately, this workaround is very fragile. Here is a non-exhaustive list of scenarios that break this solution:
- An initial_state method that provides initial values for the overlapping ports (patched in #205)
- A topology that specifies a path as a dict instead of a tuple (e.g. port: {'_path': ('bulk',)})

One potential solution is to generate individual updates for each port and apply them in sequence.
We should create a CONTRIBUTING.md file that lays out:
Currently, when a user specifies two different dividers for a variable (e.g. through two different processes wired to that variable), we silently pick one divider:
vivarium-core/vivarium/core/store.py
Line 578 in 10c756c
We should instead raise an exception.
In #192 it became clear that it would be nice for us to be able to create v2 releases that break v1 functionality without pushing those releases out to all users and breaking their code. Here's how we could do this:
- Create a develop-v1 branch from the current master. This is where we'll put any bug fixes needed for the v1 version of Vivarium Core.
- Do v2 development on develop-v2.
- Tag pre-releases like 2.0.0a1 for the first alpha release or 2.0.0b3 for the third beta release. pip should not install any of these pre-releases when a user runs pip install vivarium-core.

Open questions:
- Instead of 2.0.0a1, we could use 2.0.0-alpha1, 2.0.0-alpha.1, 2.0.0-alpha-1, etc.
- What should we call the v2 development branch (or should it just stay master)? There's a certain symmetry to calling it develop-v2, but then the name of our main branch would change each time we start a new major release (e.g. when we start developing v3).

Some questions to address before publishing the Vivarium paper and inviting library users we can't consult before making incompatible changes:
- Which interfaces do we consider public? We could use __all__ or _xyz names to declare that.
- What to do about setup.py.
- Do we use abc abstract base classes & methods anywhere else?
- The library has 130 isinstance() checks! isinstance() should be used sparingly since it makes for fragile code.
We use Process.parameters to serialize processes, which is not reliable when processes mutate their parameters. To prevent this, we should save off a copy of the parameters when the process is initialized and use this copy for serialization.
Can a maintainer publish v1.6.1 on PyPI? Thanks!
Steps to migrate from vivarium to vivarium-collective:
- Update the setup.py files for PyPI

There are a couple of uses of pytest in the codebase. It should be added to setup, or these uses should be removed:
vivarium-core/vivarium/core/engine.py
Line 24 in aae63ff
There are several Toy processes and compartments in the codebase for testing purposes. Some of these are in composition.py, some in the processes directory (exchange_a.py), and some are for tests within other files (ToyAgent in the meta_division process file). Maybe these can all be pulled into a separate file so that they can be more easily accessed and reused?
Currently Quantity instances with units are serialized with value.magnitude, which strips them of their units. deserialize is not being used in emitter.get_data, so we lose information about units, and this can be a problem. The serializer will need to know which units to convert a value to when serializing and deserializing.
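One direction for the fix, sketched with a stand-in Quantity class (pint-style magnitude/units attributes; the serializer helpers here are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Quantity:
    """Stand-in for a pint-style quantity (assumption for illustration)."""
    magnitude: float
    units: str

def serialize_value(value):
    # Record the units alongside the magnitude instead of stripping them.
    if isinstance(value, Quantity):
        return {'_units': value.units, '_magnitude': value.magnitude}
    return value

def deserialize_value(value):
    if isinstance(value, dict) and '_units' in value:
        return Quantity(value['_magnitude'], value['_units'])
    return value

mass = Quantity(5.0, 'femtogram')
round_trip = deserialize_value(serialize_value(mass))
print(round_trip)  # Quantity(magnitude=5.0, units='femtogram')
```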
np.float128 and np.complex256 do not exist for numpy on some platforms (including Windows and ARM-based macOS), which breaks scripts when trying to import them.
Deleting np.float128 and np.complex256 on lines 191 and 192 in serialize.py fixed the issue for me.
vivarium-core/vivarium/core/serialize.py
Line 191 in aae63ff
A couple of other examples:
mpi4jax/mpi4jax#112
winpython/winpython#613
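A hedged way to avoid hard-coding these attributes is to look them up with getattr, so platforms lacking the extended-precision types simply skip them:

```python
import numpy as np

# Only reference extended-precision types where the platform provides them;
# np.float128 / np.complex256 are absent on Windows and ARM-based macOS.
extended_types = [
    np_type
    for name in ('float128', 'complex256')
    if (np_type := getattr(np, name, None)) is not None
]
print(extended_types)  # empty on platforms without these types
```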
Before we release v1.0.0, we should document a policy for breaking API changes and versioning. I recommend we follow semantic versioning (which we mostly have been already) with the following further policies:
Our supported API consists of all public, documented interfaces that are not marked as being experimental in their docstrings.
Changes to the supported API will be reflected in the version number according to semantic versioning. Interfaces slated for removal will also be marked as deprecated in the documentation (and a warning will be raised if possible) for at least one minor release before they are removed.
Experimental interfaces may be changed or removed at any time and in any version.
The following are not considered breaking API changes:
- Returning a Composite dictionary with an extra key.
- Accepting an extra key in a config dictionary.
dictionary.Scattered throughout Vivarium, we have functions to perform operations on schemas, topologies, and hierarchies, e.g. inverse_topology
. I think the codebase would be easier to understand if we pulled these functions into a new library under vivarium/library
so that we can document and test them thoroughly.
In this library, we would have 3 kinds of entities:
The library would provide functions to do the following:
_default
. This also shouldn't be dependent on the Store
class.Possibly related: #100
When a process has Process.parallel set to true, we launch a parallel OS process that runs vivarium.core.process._run_update(). _run_update() contains a loop that repeatedly receives (interval, state) tuples from a pipe, passes those arguments to Process.next_update(), and sends the result back through the pipe. To stop the parallel process, we pass interval=-1.
In the main OS process (which contains Engine), we store ParallelProcess instances in Engine.parallel. Then when we need an update from a process, we call Engine._invoke_process, which does the following:
- For parallel processes, it asks the ParallelProcess for the computed update and returns it.
- For non-parallel processes, it uses InvokeProcess, which has an interface similar to ParallelProcess but computes the update immediately. Note that there's an extra invoke_process function that computes the process's update, but this extra level of indirection appears unnecessary.

The way we currently track parallel processes has a number of downsides:
- We store ParallelProcess instances in Engine.parallel, but we also store the original Process instances in the store hierarchy (with a reference in self.processes).
- Keeping those stale Process instances in the store hierarchy is confusing. A user can read out the internal state of those processes with no problem, but they're getting the state from a process that hasn't changed since the beginning of the simulation, which is not intuitive.

Proposed changes:
- Eliminate the extra invoke_process function that doesn't appear to do anything.
- Instead of storing ParallelProcess instances in self.parallel, put them directly into the store hierarchy with references in self.processes. Once a parallel process has been put on its own OS process, there should be no copies of it left in the main OS process.
- Systematize message-passing between the main and parallel OS processes as commands with arguments. Process will have a run_command() method that accepts a command name (string) and a tuple of arguments. The _run_update() function would handle the following commands:
  - _halt: Ignores its arguments and shuts down the parallel process (the equivalent of passing interval=-1 currently)
  - _next_update: Takes (interval, state) as arguments and passes them to Process.next_update()

Process authors can override run_command() to handle more commands, e.g. to return internal state variables.
ParallelProcess will also provide run_command(), but instead of running commands itself, it will pass those commands through the pipe to its child process's _run_update() function, which will in turn pass them to Process.run_command().
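A minimal sketch of this command protocol (the class is a stand-in and the dispatch details beyond the names above are assumptions):

```python
class Process:
    """Sketch of the proposed command protocol."""

    def next_update(self, interval, state):
        # Toy update for demonstration purposes.
        return {'elapsed': interval}

    def run_command(self, command, args=()):
        # Dispatch a named command; process authors could override
        # this to handle additional commands.
        if command == '_next_update':
            interval, state = args
            return self.next_update(interval, state)
        raise ValueError(f'unknown command: {command}')

def _run_update(connection, process):
    """Loop sketch for the parallel OS process: receive (command, args)
    tuples from the pipe until '_halt', sending back each result."""
    while True:
        command, args = connection.recv()
        if command == '_halt':
            break
        connection.send(process.run_command(command, args))

proc = Process()
print(proc.run_command('_next_update', (1.0, {})))  # {'elapsed': 1.0}
```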
I've started implementing this in #198
I think this proposal addresses all the problems described above. However, it brings some new downsides:
- Not everything in the store hierarchy will be a Process instance anymore. Some will be ParallelProcess instances. We don't want to make ParallelProcess inherit from Process because it doesn't know enough to do things like generate composites, and adding those missing capabilities is overkill.
- Relying on Process instances in the store hierarchy is very counter-intuitive. The biggest breaking change is probably removing Engine.parallel, but I doubt anyone relies on that.
, but I doubt anyone relies on that.The library has 130 isinstance() checks! isinstance() should be used sparingly since it makes for fragile code.
This issue will keep track of different bugs and enhancements in the comments as I test the Vivarium API by trying to create a set of swimming "proto-cell" agents on a lattice.
Issues to fix:
- CONTRIBUTING.md should point to the getting started guide
- Check that the vivarium-template repository is still correct (vivarium-collective/vivarium-template#1)
- super()
- is_deriver() and is_step() methods
- ** syntax: https://vivarium-core.readthedocs.io/en/latest/guides/hierarchy.html
- ConvenienceKinetics is now its own project: https://vivarium-core.readthedocs.io/en/latest/tutorials/write_process.html
- (*, **)
- (_add, _delete, _move, {'_updater': ...})
- * really belongs with schemas, not paths, as it is now in the hierarchy.rst docs
- Engine.run_for()
Pages to audit:
We currently have an initial_state in the Generator class, which is inherited by Process. This needs to be overwritten for all use cases or else it raises an exception. But initial_state should work differently for composite generators than it does for process generators. A composite generator can merge the initial states of its constituent processes' initial_state methods, perhaps with a config that resolves any conflicts between the processes' declared states.
To achieve this I propose:
1) Process has its own initial_state with the same exception approach currently used in Generator.
2) Generator has a different initial_state that calls a new function, process_initial_states. This new function uses an inverse topology to get the store names corresponding to each process's ports, calls the processes' initial_state methods, and does a merge.
3) The Generator initial_state can be overwritten for different cases, and process_initial_states accepts a config that allows it to resolve conflicts between merged states.
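The merge in step (2) could look roughly like this sketch (merge_initial_states and its overrides config are hypothetical stand-ins for the proposed process_initial_states):

```python
def merge_initial_states(states, overrides=None):
    """Deep-merge each process's initial state into one dict; an optional
    overrides config resolves conflicts by taking precedence at the end."""
    def deep_merge(base, extra):
        for key, value in extra.items():
            if isinstance(value, dict) and isinstance(base.get(key), dict):
                deep_merge(base[key], value)
            else:
                base[key] = value
        return base

    merged = {}
    for state in states:
        deep_merge(merged, state)
    if overrides:
        deep_merge(merged, overrides)
    return merged

merged = merge_initial_states(
    [{'store': {'a': 1}}, {'store': {'b': 2}}],
    overrides={'store': {'a': 0}})
print(merged)  # {'store': {'a': 0, 'b': 2}}
```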
Now that we are replacing Derivers with Steps, it's becoming increasingly clear that Steps and Processes are really quite different. It might be a good idea to reflect that distinction in our class hierarchy. We would have two top-level interfaces (abstract classes), Wirable and Composer. Process and Step would both implement each of these interfaces. Note that these interfaces would need to not have overlapping methods since that confuses Python's multiple inheritance.
As part of this shift, we should separate the code needed for wiring things together and put that with the Wirable interface. This will prepare our code to use the idea of wiring things together for purposes besides running simulations, e.g. running workflows or analyses.
#64 greatly expanded plot_topology to include several graph_formats (hierarchy, vertical, horizontal), manual coordinates, remove_nodes, and node_labels. There are several still-unimplemented features that would further expand the options for this plot function.
Additional features that might be of interest include:
- collapse_nodes to reduce multiple nodes into one, while keeping their connections.
- remove_labels to selectively remove labels. Currently only one node can have an empty string (''), since this serves as a unique identifier. It would be useful if multiple nodes could be left without labels.

Currently, Process.__init__() saves off a copy of the parameters to self._original_parameters. Then when serializing, Process.__getstate__() returns self._original_parameters, and Process.__setstate__() calls self.__init__() with the parameters.
This is problematic in class hierarchies where constructors change parameters. For example:
import pickle

class A(Process):
    def __init__(self, parameters):
        super().__init__({'2': parameters['1']})

a = A({'1': True})
a2 = pickle.loads(pickle.dumps(a))
Here, A receives a parameter under the key 1 but passes that parameter to Process under key 2.
Here's how the serialization/deserialization flow looks:
1. When A is serialized, Process.__getstate__() returns {'2': True}.
2. When deserializing to create a2, Process.__setstate__() calls self.__init__(parameters). However, it is A.__init__() that runs here, not Process.__init__(), so A.__init__() will be looking for a nonexistent key 1 in the parameters.

The fact that A.__init__() runs is correct. There could be code there that we need to run to initialize instance variables. The problem is that a.__getstate__() should be returning the original parameters passed to A.__init__(), not those passed to Process.__init__().
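With a minimal stand-in for Process (an assumption; the real class does more), the failure can be reproduced like this:

```python
import pickle

class Process:
    """Minimal stand-in for vivarium's Process class (assumption)."""
    def __init__(self, parameters):
        self._original_parameters = dict(parameters)
        self.parameters = dict(parameters)

    def __getstate__(self):
        return self._original_parameters

    def __setstate__(self, state):
        self.__init__(state)

class A(Process):
    def __init__(self, parameters):
        super().__init__({'2': parameters['1']})

a = A({'1': True})
try:
    pickle.loads(pickle.dumps(a))
except KeyError as error:
    # A.__init__() runs during __setstate__() and looks up the
    # nonexistent key '1' in the saved parameters {'2': True}.
    print('KeyError:', error)
```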
Here are some possible solutions:
1. Add a parameter to Process.__init__() for the original subclass parameters, which will then be saved in self._original_parameters. This route will require that we update all constructors in an inheritance hierarchy to handle the new parameter.
2. Add a Process.save_parameters() method that subclasses can call to save off their original parameters. With this approach, we will only have to change Process and the class at the bottom of the hierarchy. However, we will have to handle multiple calls to Process.save_parameters() since we only want to save the parameters from the child-most subclass.
3. Support a special key in the parameters dictionary for the original parameters. If provided, Process.__init__() will save this in self._original_parameters instead of parameters.

I prefer approach (3) since I think it's most in line with how we pass other special configs to Process.
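Approach (3) could be sketched as follows (the '_original_parameters' key name is a hypothetical choice):

```python
class Process:
    """Stand-in Process supporting a special key for original parameters."""
    def __init__(self, parameters):
        parameters = dict(parameters)
        # If the subclass passed along its original parameters under the
        # special key, prefer those when serializing.
        original = parameters.pop('_original_parameters', None)
        self._original_parameters = (
            original if original is not None else dict(parameters))
        self.parameters = parameters

    def __getstate__(self):
        return self._original_parameters

class A(Process):
    def __init__(self, parameters):
        super().__init__({
            '2': parameters['1'],
            '_original_parameters': dict(parameters),
        })

a = A({'1': True})
print(a.__getstate__())  # {'1': True}
```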
We currently define a MultiInvoke class in experiment.py:
vivarium-core/vivarium/core/experiment.py
Line 124 in a92f8eb
However, I can't find any usages of it in vivarium-collective or in my wcEcoli work. It seems that we're handling parallel processing using ParallelProcess instead.
Can MultiInvoke be removed?
Right now we don't specify versions in setup.py. We should add them to help pip solve environments when users install vivarium-core. For example, we could specify numpy>=0.1 to say that numpy must be at least version 0.1 to work with vivarium-core.
Note: This depends on #22.
Get all code to pass lint checks
core/composition.py
core/control.py
core/emitter.py
core/experiment.py
core/store.py
core/process.py
core/registry.py
Complete docstrings
core/composition.py
core/control.py
core/emitter.py
core/experiment.py
core/store.py
core/process.py
core/registry.py
To run pylint, just execute pylint vivarium.
The files left to fix are listed in .pylintrc after ignore=. We use an exclusion list to ensure that once we fix a file, any code added to it also passes the linter. This prevents regressions. It also ensures that any new files added have proper style.
I use pylint's default rules. You can adjust this in .pylintrc after the disable= line. You can find the list of pylint checks here.
I use the pytest-cov plugin to calculate test coverage. Whenever you run the tests, add the --cov=vivarium argument to calculate coverage. At the end of the test output, you'll get a table like this:
Name Stmts Miss Cover Missing
-----------------------------------------------------------------------------------
vivarium/__init__.py 37 0 100%
vivarium/composites/__init__.py 0 0 100%
vivarium/composites/injected_glc_phosphorylation.py 13 0 100%
vivarium/core/__init__.py 0 0 100%
vivarium/core/composition.py 403 180 55% 62, 64-68, 141, 143, 156-157, 179-193, 200-221, 247-333, 347, 364-370, 375-386, 417, 445-456, 527-540, 543-544, 548-552, 559-569, 578-580, 614-643, 686-706, 746-755, 998-1000
vivarium/core/control.py 85 13 85% 40, 42, 59-60, 63-64, 89-90, 104-106, 112, 119, 211
vivarium/core/emitter.py 156 80 49% 32-33, 55, 59, 65, 70-73, 86-94, 103, 116, 119, 128, 176-189, 192-196, 199, 203-212, 216-222, 226-227, 232-280, 284-298, 304-312, 320-324
vivarium/core/experiment.py 1018 104 90% 93, 102, 171, 173, 182-184, 191, 245, 250-251, 262, 267, 271-272, 285-286, 288, 293, 311, 327-328, 381-384, 416, 437, 442, 447, 452, 460-467, 481, 485, 487, 490, 500-502, 520-523, 665-672, 827-831, 855-872, 909, 920, 922, 948-949, 990-997, 1014, 1069, 1076, 1096, 1348-1352, 1461-1462, 1494-1495, 1517, 1546, 1548, 1570-1576, 1677, 2254
vivarium/core/process.py 277 40 86% 30, 41, 43, 51, 78-81, 100-104, 211, 227, 274, 277-279, 319, 337, 353-361, 370, 386, 392-394, 401, 457, 459, 507, 516, 564
vivarium/core/registry.py 118 57 52% 77-78, 115-122, 146-153, 164, 182-197, 202-204, 212, 227, 238, 246, 249, 253, 256, 260-264, 271-275, 284-286, 289, 294-296, 300
vivarium/experiments/__init__.py 0 0 100%
vivarium/experiments/glucose_phosphorylation.py 48 5 90% 91-96, 105
vivarium/library/__init__.py 0 0 100%
vivarium/library/datum.py 28 19 32% 2-3, 6-7, 28-41, 44, 47, 50
vivarium/library/dict_utils.py 163 75 54% 17-20, 33-36, 38, 53, 68, 70, 74, 94, 113-117, 127-132, 136-144, 150-152, 156-175, 183-202, 206, 230-236, 259-261
vivarium/library/fasta.py 10 8 20% 4-12
vivarium/library/filepath.py 35 22 37% 22-25, 31-34, 51-58, 73-77, 81-82, 87-89, 93-94
vivarium/library/make_network.py 87 79 9% 15-27, 33-40, 55-92, 96-125, 136-171
vivarium/library/path.py 4 2 50% 4-5
vivarium/library/pretty.py 24 5 79% 16, 18, 20-21, 26
vivarium/library/schema.py 39 28 28% 4, 7, 12, 18-25, 28-40, 45-52, 55, 62, 69, 77
vivarium/library/timeseries.py 32 27 16% 27-50, 61-69
vivarium/library/topology.py 148 17 89% 36-37, 58-61, 93-98, 107-116
vivarium/library/units.py 38 6 84% 62-66, 90
vivarium/plots/__init__.py 0 0 100%
vivarium/plots/agents_multigen.py 135 127 6% 13-24, 64-226
vivarium/plots/simulation_output.py 103 16 84% 70, 76, 78-79, 82, 84, 99-103, 122, 129, 132, 138, 172
vivarium/plots/topology.py 105 17 84% 40, 137-140, 151, 166-180, 187, 202, 255, 268-277
vivarium/processes/__init__.py 0 0 100%
vivarium/processes/agent_names.py 14 6 57% 12-14, 17, 29-30
vivarium/processes/burst.py 58 6 90% 208-212, 217
vivarium/processes/derive_concentrations.py 28 19 32% 20-31, 34, 53-69
vivarium/processes/derive_counts.py 24 8 67% 32, 35, 55-62
vivarium/processes/divide_condition.py 15 7 53% 10-11, 14, 17, 25-28
vivarium/processes/engulf.py 58 6 90% 183-187, 191
vivarium/processes/exchange_a.py 15 0 100%
vivarium/processes/glucose_phosphorylation.py 39 5 87% 129-139
vivarium/processes/injector.py 37 9 76% 57, 103-109, 113
vivarium/processes/meta_division.py 76 9 88% 29, 52, 65, 207-211, 215
vivarium/processes/nonspatial_environment.py 27 15 44% 22-24, 27-72, 75-87
vivarium/processes/remove.py 47 6 87% 116-120, 124
vivarium/processes/swap_processes.py 63 6 90% 175-179, 183
vivarium/processes/template_process.py 38 7 82% 138-146, 151
vivarium/processes/timeline.py 57 5 91% 43-44, 51, 56-57
vivarium/processes/tree_mass.py 50 7 86% 69, 168-172, 176
-----------------------------------------------------------------------------------
TOTAL 3752 1048 72%
Required test coverage of 72.0% reached. Total coverage: 72.07%
This lists the fraction of lines in each file that are covered by the tests. It also tells you which lines in the file weren't covered so you can add tests for them.
Notice the last line says we require a code coverage of 72%. This is just the current coverage value. If coverage dips below 72%, the tests will fail to make sure we don't regress.
- Run pytest --cov=vivarium.
- Set the required coverage in .coveragerc to your current coverage value for the entire project.
- core/composition.py
- core/control.py
- core/emitter.py
- core/experiment.py
- core/process.py
- core/registry.py
We use type annotations as specified by PEP 484 and prefer annotations to special comments like # type: str. See the documentation for typing for more information.
Mypy is configured using the mypy.ini file, which looks like this:
[mypy]
disallow_untyped_defs = True
disallow_incomplete_defs = True
# Ignore missing types from third-party libraries.
[mypy-numpy.*]
ignore_missing_imports = True
...
# Ignore missing types from Vivarium Core files we haven't typed yet.
[mypy-vivarium.core.registry.*]
disallow_untyped_defs = False
disallow_incomplete_defs = False
[mypy-vivarium.core.control.*]
disallow_untyped_defs = False
disallow_incomplete_defs = False
...
Notice how we first set disallow_untyped_defs and disallow_incomplete_defs to True. This means that by default, mypy will raise an error whenever a function is not completely typed. Then, for every file that hasn't been typed yet, we include a section like [mypy-vivarium.core.registry.*] (for vivarium/core/registry.py) that sets those options to False. As we add type hints, we can remove files from this list until everything has type annotations.
Also note that we tell mypy to ignore types in our dependencies like numpy. Many third-party libraries are poorly typed, and we want to avoid spending time trying to get mypy to work with them. We will focus on making sure the types in our code are correct instead.
- Files that haven't been typed yet are listed in mypy.ini as not being covered.
- As files gain type annotations, remove them from mypy.ini.
- Run mypy vivarium.
The following bug was encountered while trying to wire together FBA and bioscrape in this notebook.
Overview: The goal is to have a FluxDeriver connect to both a Bioscrape rate and a Cobra flux_bound.
The following absolute topology worked:
topology = {
'bioscrape': {
# all species go to a species store on the base level,
# except Biomass, which goes to the 'globals' store, with variable 'biomass'
'species': {
'_path': ('species',),
'Biomass': ('..', 'globals', 'biomass'),
},
'delta_species': ('delta_species',),
'rates': ('rates',),
'globals': ('globals',),
},
'cobra': {
'internal_counts': ('internal_counts',),
'external': ('external',),
'exchanges': ('exchanges',),
'reactions': ('reactions',),
'flux_bounds': ('flux_bounds',),
'global': ('globals',),
},
'flux_deriver': {
'deltas': ('delta_species',),
'amounts':('globals',),
# connect Bioscrape deltas 'Lactose_consumed' and 'Glucose_internal'
# to COBRA flux bounds 'EX_lac__D_e' and 'EX_glc__D_e'
# also connect biomass flux to the dilution rate
'fluxes':
{
#'_path': ('flux_bounds',),
'Lactose_consumed': ('flux_bounds','EX_lac__D_e',),
'Glucose_internal': ('flux_bounds','EX_glc__D_e',),
'biomass':('rates', 'k_dilution__',)
}
},
'mass_deriver': {
'global': ('globals',),
},
'volume_deriver': {
'global': ('globals',),
},
'biomass_adaptor': {
'input': ('globals',),
'output': ('globals',),
}
}
The following relative topology did not correctly update ('fluxes', 'biomass',) --> ('..', 'rates', 'k_dilution__',). Note that the updates from the FluxDeriver seemed fine and that the other flux ports were updated correctly.
topology = {
'bioscrape': {
# all species go to a species store on the base level,
# except Biomass, which goes to the 'globals' store, with variable 'biomass'
'species': {
'_path': ('species',),
'Biomass': ('..', 'globals', 'biomass'),
},
'delta_species': ('delta_species',),
'rates': ('rates',),
'globals': ('globals',),
},
'cobra': {
'internal_counts': ('internal_counts',),
'external': ('external',),
'exchanges': ('exchanges',),
'reactions': ('reactions',),
'flux_bounds': ('flux_bounds',),
'global': ('globals',),
},
'flux_deriver': {
'deltas': ('delta_species',),
'amounts': ('globals',),
# connect Bioscrape deltas 'Lactose_consumed' and 'Glucose_internal'
# to COBRA flux bounds 'EX_lac__D_e' and 'EX_glc__D_e'
# also connect biomass flux to the dilution rate
'fluxes':
{
'_path': ('flux_bounds',),
'Lactose_consumed': ('EX_lac__D_e',),
'Glucose_internal': ('EX_glc__D_e',),
'biomass':('..', 'rates', 'k_dilution__',)
}
},
'mass_deriver': {
'global': ('globals',),
},
'volume_deriver': {
'global': ('globals',),
},
'biomass_adaptor': {
'input': ('globals',),
'output': ('globals',),
}
}
This problem was fixed by pointing the bioscrape ('rates', 'k_dilution__') at the 'flux_bounds' store while still using a relative topology (previously it was effectively wired the other way around; one would expect both the above and the below to work). This suggests that there is an issue when using relative topologies that point to very different stores. The working relative topology is given below:
topology = {
'bioscrape': {
# all species go to a species store on the base level,
# except Biomass, which goes to the 'globals' store, with variable 'biomass'
'species': {
'_path': ('species',),
'Biomass': ('..', 'globals', 'biomass'),
},
'delta_species': ('delta_species',),
'rates': {
'_path' : ('rates',),
'k_dilution__': ('..', 'flux_bounds', 'k_dilution__'),
},
'globals': ('globals',),
},
'cobra': {
'internal_counts': ('internal_counts',),
'external': ('external',),
'exchanges': ('exchanges',),
'reactions': ('reactions',),
'flux_bounds': ('flux_bounds',),
'global': ('globals',),
},
'flux_deriver': {
'deltas': ('delta_species',),
'amounts': ('globals',),
# connect Bioscrape deltas 'Lactose_consumed' and 'Glucose_internal'
# to COBRA flux bounds 'EX_lac__D_e' and 'EX_glc__D_e'
# also connect biomass flux to the dilution rate
'fluxes':
{
'_path': ('flux_bounds',),
'Lactose_consumed': ('EX_lac__D_e',),
'Glucose_internal': ('EX_glc__D_e',),
'biomass':('k_dilution__',)
#'Lactose_consumed': ('flux_bounds','EX_lac__D_e',),
#'Glucose_internal': ('flux_bounds','EX_glc__D_e',),
#'biomass':('rates', 'k_dilution__',)
}
},
'mass_deriver': {
'global': ('globals',),
},
'volume_deriver': {
'global': ('globals',),
},
'biomass_adaptor': {
'input': ('globals',),
'output': ('globals',),
}
}
GitHub offers the following permission levels for organization repositories (docs):
- Read: Recommended for non-code contributors who want to view or discuss your project
- Triage: Recommended for contributors who need to proactively manage issues and pull requests without write access
- Write: Recommended for contributors who actively push to your project
- Maintain: Recommended for project managers who need to manage the repository without access to sensitive or destructive actions
- Admin: Recommended for people who need full access to the project, including sensitive and destructive actions like managing security or deleting a repository
I propose the following structure of roles for Vivarium and Vivarium Core (I'm ignoring the other Vivarium repos for now):
- Owners hold the owner role in the Vivarium Collective organization and the admin role in Vivarium Core. They are responsible for enforcing the code of conduct, responding to security vulnerabilities, and adjudicating disputes.
- Members hold the member role in the Vivarium Collective organization and the write role in Vivarium Core. They are empowered to manage issues, open PRs, and otherwise develop Vivarium Core.

Remaining Questions:
When vivarium-core updates, it has the potential to break many other vivarium projects. Because of this, it would be useful to implement automated testing that can check other repos before making changes to vivarium-core. Here is a link to some useful discussion about how this could be done: https://blog.marcnuri.com/triggering-github-actions-across-different-repositories/
Consider the case where there are two processes A and B which by default each update a store X by setting.
Currently, the order in which A and B are added to the bigraph determines which setter takes precedence. It would be very helpful to have a flag in the schema, e.g. {path: {X: {'_update': False}}}, which could be used to override the default schema to turn off updating from one of the processes and remove this order dependence.
Currently Engine.front tracks all processes' next updates, with a dict that has the next update times and the updates that will be applied: {'time': 14.0, 'update': {update-to-be-applied}}. The process's next update time is determined by the global time plus the timestep given by its Process.calculate_timestep().
Having interrupts would allow other processes to change a given process's next update time to be earlier. This might be necessary if, for example, an event such as chromosome replication needs to reset transcription at a given time. If the interrupt time is before the previously-determined next update time, the transcription process would need to be rerun from its previous initial state and would return a new update dict: {'time': 13.0, 'update': {new}}.
To add this feature requires that we save the initial state provided to the process, so it can be re-run from that state at a shorter interval. Something like: {'time': 14.0, 'update': {update-to-be-applied}, 'state': {saved-view}}. It would also require the interrupt times to be received and handled by Engine.run_for.
We have this code in Store.check_default():
vivarium-core/vivarium/core/store.py
Lines 132 to 135 in a92f8eb
Notice that we set new_default_comp to self.default.tolist() even when it is new_default that is a numpy array. This seems like a bug.
On a related note, we warn about a schema conflict when the new and existing defaults are the same. Shouldn't we instead warn when the defaults differ? Here's the whole function (link):
def check_default(self, new_default):
defaults_equal = False
if self.default is not None:
self_default_comp = self.default
new_default_comp = new_default
if isinstance(self_default_comp, np.ndarray):
self_default_comp = self.default.tolist()
if isinstance(new_default_comp, np.ndarray):
new_default_comp = self.default.tolist()
defaults_equal = self_default_comp == new_default_comp
if defaults_equal:
if (
not isinstance(new_default, np.ndarray)
and not isinstance(self.default, np.ndarray)
and new_default == 0
and self.default != 0
):
log.debug(
'_default schema conflict: %s and %s. selecting %s',
str(self.default), str(new_default), str(self.default))
return self.default
log.debug(
'_default schema conflict: %s and %s. selecting %s',
str(self.default), str(new_default), str(new_default))
return new_default
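For illustration, here is a possible corrected version written as a standalone function (the real code is a method on Store): it calls tolist() on new_default itself, and logs the conflict when the defaults differ rather than when they match:

```python
import logging

import numpy as np

log = logging.getLogger(__name__)

def check_default(self_default, new_default):
    # Convert numpy arrays to lists so the comparison yields a plain bool
    # instead of an element-wise array.
    self_comp = (self_default.tolist()
                 if isinstance(self_default, np.ndarray) else self_default)
    new_comp = (new_default.tolist()
                if isinstance(new_default, np.ndarray) else new_default)
    if self_default is not None and self_comp != new_comp:
        # Prefer a nonzero existing default over an incoming zero default.
        if new_comp == 0 and self_comp != 0:
            log.debug(
                '_default schema conflict: %s and %s. selecting %s',
                self_default, new_default, self_default)
            return self_default
        log.debug(
            '_default schema conflict: %s and %s. selecting %s',
            self_default, new_default, new_default)
    return new_default

print(check_default(5, 0))  # 5: keep the nonzero existing default
print(check_default(1, 2))  # 2: otherwise the new default wins
```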
Some processes register themselves upon import, for example derive_concentrations and derive_counts. Now that we have __init__ setting up registration, it would be better to just import these processes, import the registry, and manually register them. This is cleaner than self-registration and sets up a nice pattern for process registration.
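The proposed pattern could be sketched with a minimal stand-in registry (the Registry class and the key names here are assumptions):

```python
class Registry:
    """Minimal stand-in for vivarium's registry class (assumption)."""
    def __init__(self):
        self._registry = {}

    def register(self, key, item):
        self._registry[key] = item

    def access(self, key):
        return self._registry.get(key)

class DeriveConcentrations:
    """Stand-in for the real process class."""

# A central module imports the processes and the registry, then
# registers them explicitly instead of relying on import side effects:
process_registry = Registry()
process_registry.register('derive_concentrations', DeriveConcentrations)
print(process_registry.access('derive_concentrations'))
```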
Some discussed changes to the vivarium class structure include:

- A `Composite` object, which has `processes` and `topology` attributes. This would be a short-lived object, as its processes and topology would be passed into an experiment for execution. An advantage would be improved application of developer tools and type checking. It could also have its own `initial_state` and `merge` methods.
- `Composer` would return `Composites`.
- `Experiment` could also accept a `Composite` and call its `processes` and `topology` attributes.
- `MetaComposer` would be a composer of `Composers`. In its `generate_processes` and `generate_topology` it would call its component `Composers` and merge their generated `Composites`.
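A minimal sketch of what the proposed `Composite` might look like (a hypothetical API for discussion, not merged vivarium code):

```python
from dataclasses import dataclass, field

@dataclass
class Composite:
    # Short-lived bundle of processes and wiring, handed to an experiment.
    processes: dict = field(default_factory=dict)
    topology: dict = field(default_factory=dict)

    def merge(self, other):
        # Fold another Composite's processes and topology into this one,
        # as a MetaComposer would when combining component Composites.
        self.processes.update(other.processes)
        self.topology.update(other.topology)

a = Composite({'p1': 'proc1'}, {'p1': {'port': ('store',)}})
b = Composite({'p2': 'proc2'}, {'p2': {'port': ('store',)}})
a.merge(b)
# a now holds both processes and both topology entries
```

Making it a dataclass gives the type-checking and tooling benefits mentioned above essentially for free.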
Consider a process (e.g. multibody physics) with a ports schema like this:
```python
{
    'agents': {
        '*': {
            'a': {...},
            'b': {...},
        },
    },
}
```
Now imagine that we want to use this process with an agent whose store hierarchy looks like this:

```python
{
    'a': {...},
    'boundary': {
        'b': {...},
    },
}
```
We don't currently have a way to wire the environment process to this agent.
Currently, processes have a `defaults` class variable that specifies the parameters they expect. Then, when instantiating a process, you can provide a dictionary to the `parameters` argument to override those defaults. For example:
```python
class MyProcess(Process):
    defaults = {
        'flag': True,
    }

proc = MyProcess({'flag': False})
```
Our current approach for handling process parameters and defaults has a number of issues:

- Parameters are not type-checked. For example, you can call `MyProcess({'flag': 'test'})` without raising any type errors.
- Parameters are not checked against `defaults`. For example, `MyProcess` could have a `next_update` that uses `self.parameters['flag2']` even though `MyProcess.defaults` does not contain `flag2`.
- `defaults` are not handled correctly when subclassing processes. For example, consider the process class `MyProcess2(MyProcess)` with a constructor that calls `MyProcess.__init__(parameters)`. Inside `MyProcess.__init__`, `self.defaults` refers to `MyProcess2.defaults`, not `MyProcess.defaults` like you'd expect.

All these problems would be solved by using normal Python function arguments for process parameters. Here's how we could preserve the desired functionality:
Not sure how we can do this yet.
I think we can use `inspect` and `locals()` to do this. We would, in `Process.__init__()`, first use `self.__class__` to get the subclass. Then we could use code like this to get the parameters passed to the subclass constructor:
```python
>>> import inspect
>>> def foobar(foo, bar, baz):
...     sig, foobar_locals = inspect.signature(foobar), locals()
...     return [foobar_locals[param.name] for param in sig.parameters.values()]
...
>>> foobar(1, 2, 3)
[1, 2, 3]
```
Source: https://stackoverflow.com/a/10724602
Note that the `locals()` call would have to happen in the subclass, and the result would need to be passed to the superclass constructor.
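Putting those pieces together, a toy base class could capture subclass constructor arguments like this. This is a sketch of the idea under discussion, not vivarium's actual `Process`:

```python
import inspect

class Process:
    # Toy base class: just stores whatever parameters it is given.
    def __init__(self, parameters):
        self.parameters = parameters

class MyProcess(Process):
    def __init__(self, flag=True, rate=1.0):
        # locals() must run here, in the subclass constructor, before
        # any other local names are created; the snapshot is then
        # filtered by the constructor's signature and passed up.
        constructor_locals = locals()
        sig = inspect.signature(type(self).__init__)
        parameters = {
            name: constructor_locals[name]
            for name in sig.parameters
            if name != 'self'
        }
        super().__init__(parameters)

proc = MyProcess(flag=False)
# proc.parameters == {'flag': False, 'rate': 1.0}
```

The keyword arguments give type checkers and IDEs something to work with, while `self.parameters` keeps the dictionary interface existing processes rely on.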
Currently, when we emit data that has a unit in the state hierarchy and format it as a timeseries, we put the unit into the key and strip the units from the value. For example, we might get:
```python
{
    'external': {
        ('antibiotic', 'millimolar'): [1, 2, 3],
    }
}
```
This is not ideal. Instead, our plotting functions should know how to handle `Quantity` objects.
Here's where we do the unit conversion: `vivarium-core/vivarium/library/dict_utils.py`, lines 264 to 268 in 6836411
And here's a trace of the calls in `emitter.py` that lead to the code above:

1. `vivarium.library.dict_utils.value_in_embedded_dict()`
2. `vivarium.core.emitter.timeseries_from_data()`
3. `vivarium.core.emitter.Emitter.get_timeseries()`
Note that addressing this will probably be a breaking API change since doing so will change our output format.
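For reference, the current key-rewriting behavior can be sketched as follows. The `Quantity` class here is a toy stand-in (the real code uses pint), and this is not the actual `dict_utils` implementation:

```python
class Quantity:
    # Toy stand-in for pint's Quantity: a magnitude with a unit string.
    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

def convert_for_timeseries(d):
    # Recursively move units from the value into the key, mimicking
    # what the timeseries formatting currently does.
    out = {}
    for key, value in d.items():
        if isinstance(value, dict):
            out[key] = convert_for_timeseries(value)
        elif isinstance(value, Quantity):
            out[(key, str(value.units))] = value.magnitude
        else:
            out[key] = value
    return out

data = {'external': {'antibiotic': Quantity([1, 2, 3], 'millimolar')}}
converted = convert_for_timeseries(data)
# converted == {'external': {('antibiotic', 'millimolar'): [1, 2, 3]}}
```

The fix proposed above would skip this conversion and instead teach the plotting code to read `.magnitude` and `.units` directly.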
If a `'_generate'` or `'_divide'` update is triggered, it requires the expiration and re-caching of some `topology_views`. Currently, `Store.apply_update` will pass a `view_expire` flag (here) back to `Engine`, which triggers `self.state.build_topology_views()` (here). This expires and re-caches ALL of the `topology_views` across the entire `Store` hierarchy. It would be more efficient if it knew to only expire and re-cache the `topology_view` of the subset of processes that are connected to the affected stores.
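One way to make the invalidation selective would be to track which processes view which stores and expire only those entries. A toy sketch of the bookkeeping, not the actual `Store`/`Engine` code:

```python
class TopologyViewCache:
    # Toy cache mapping processes to views, with a reverse index from
    # store paths to the processes wired to them.
    def __init__(self):
        self.views = {}            # process name -> cached view
        self.store_to_procs = {}   # store path -> set of process names

    def register(self, process, store_paths, view):
        self.views[process] = view
        for path in store_paths:
            self.store_to_procs.setdefault(path, set()).add(process)

    def expire(self, store_path):
        # Expire only the views of processes connected to the affected
        # store, leaving all other cached views intact.
        for process in self.store_to_procs.get(store_path, ()):
            self.views.pop(process, None)

cache = TopologyViewCache()
cache.register('multibody', [('agents',)], 'view_a')
cache.register('metabolism', [('internal',)], 'view_b')
cache.expire(('agents',))
# only 'multibody' is expired; 'metabolism' keeps its cached view
```

The reverse index costs a little memory per store, but turns a whole-hierarchy rebuild into a lookup proportional to the number of affected processes.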