micom-dev / micom
Python package to study microbial communities using metabolic modeling.
Home Page: https://micom-dev.github.io/micom
License: Apache License 2.0
This is a consequence of opencobra/optlang#232.
Dear maintainers,
I'm encountering a strange issue with micom.media.minimal_medium() in MICOM (version 0.16.1). Apparently the computed medium contains more metabolites than the initial medium provided to the community. From what I understand, this should not be possible with the parameter open_exchanges=False (more details below).
I have a community of roughly 250 species from AGORA 1.03 (without mucins) and a custom medium derived from the WesternDiet provided in the original MICOM publication: western_diet_custom.txt. I also use CPLEX 12.9.0.0.
I apply the medium to my community in a similar way to that shown in the documentation:
# List of exchange reactions in the model
ex_ids = [r.id for r in com.exchanges]
# Find medium reactions that are recognized by the community
medium_in_ex_ids = medium.index.isin(ex_ids)
# Define medium of community
com.medium = medium[medium_in_ex_ids]
Then, I run a cooperative tradeoff to obtain the community growth rate:
# Compute growth rate on diet
sol = com.cooperative_tradeoff(fraction=0.7, fluxes=True, pfba=False)
sol
So far so good. And I'm also able to compute the minimal medium necessary to obtain 95% of this community growth rate:
# Compute minimal medium necessary to reach 95% growth rate
med = micom.media.minimal_medium(
com,
0.95 * sol.growth_rate,
exports=False,
open_exchanges=False, # !!!
)
med
EX_26dap_M_m 0.013413
EX_2dmmq8_m 0.011625
EX_acgam_m 23.994033
EX_adocbl_m 0.000054
EX_arg_L_m 3.600214
...
EX_isochol_m 0.000001
EX_HC02194_m 0.000033
EX_uchol_m 0.000032
EX_sel_m 0.001193
EX_cellul_m 0.000836
Length: 130, dtype: float64
This is where I notice a problem: when I compare the fluxes of the minimal medium with the original medium, there are a handful of extra reactions that were not initially present (reproduced below).
From what I understand, it should not be possible with open_exchanges=False. Any idea of what is happening? I hope I'm giving enough information (feel free to ask for more; I didn't share the pickle of the community because it's a 300 MB file, but I could find a way if it's necessary).
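For reference, a minimal sketch of how the extra components can be listed, using the names from the snippets above:
# exchanges present in the minimal medium but absent from the applied medium
extra = med.index.difference(medium[medium_in_ex_ids].index)
print(extra)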
Note that I did not observe this problem with AGORA 1.03 (with mucins) or AGORA 1.02, and it seems to only happen with some specific samples. My main goal was to use minimal_medium() to identify essential metabolites required by my communities, but the fact that the function adds more reactions than expected might be troublesome.
Cheers,
Nils
The warning regarding media components with missing imports is a bit too fatalistic given that it is not a real problem in most cases. See #63 for instance.
Maybe change it to an info message and only trigger a warning if a large number of metabolites cannot be consumed or if there are no consumable carbon and nitrogen sources.
Just changing the text of the warning, but it may still be too verbose.
After creating a custom database, the "build" command sometimes stalls at 99% when creating a manifest from 100+ samples and 16 taxa. If I open a new Python session and run the same build command, it occasionally creates the manifest successfully. I'm also encountering stalling at 99% complete with the fix_medium command (same dataset).
I'm using 28 cores so I don't think it's a lack of RAM, but would appreciate any ideas for debugging!
>>> build_database(tax, out_path="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/database", rank='genus', threads=28, compress=None, compresslevel=6, progress=True)
>>> soil_db = "/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/database/"
>>>
>>> manifest = build(tax,
... out_folder="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/",
... model_db=soil_db, cutoff=0.00001, threads=28)
[12:00:26] WARNING Found existing models for 102 samples. Will skip those. Delete the output folder if you would like me to rebuild them.
[normal messages ...]
Set parameter TokenServer to value "sccsvc"
Set parameter GURO_PAR_SPECIAL
Set parameter TokenServer to value "sccsvc"
Read LP format model from file /scratch/5511857.1.geo-int/tmpcafsj2wj.lp
Reading time = 0.14 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Set parameter GURO_PAR_SPECIAL
Set parameter TokenServer to value "sccsvc"
Read LP format model from file /scratch/5511857.1.geo-int/tmptp3038rs.lp
Reading time = 0.12 seconds
: 17548 rows, 40257 columns, 173487 nonzeros
Read LP format model from file /scratch/5511857.1.geo-int/tmpjz3xnx_x.lp
Reading time = 0.14 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Set parameter GURO_PAR_SPECIAL
Set parameter TokenServer to value "sccsvc"
Read LP format model from file /scratch/5511857.1.geo-int/tmpmq1q5ngz.lp
Reading time = 0.13 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Read LP format model from file /scratch/5511857.1.geo-int/tmp5vkdrvvb.lp
Reading time = 0.13 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 99% 0:00:01
(micom)[zrwerbin@scc-ul2 ~]$ python -c "import micom; micom.show_versions()"
Package Information
-------------------
micom 0.33.2
Dependency Information
----------------------
cobra 0.29.0
highspy missing
jinja2 3.1.3
osqp 0.6.3
scikit-learn 1.4.1.post1
scipy 1.12.0
symengine 0.11.0
Build Tools Information
-----------------------
pip 24.0
setuptools 69.1.1
wheel 0.42.0
Platform Information
--------------------
Linux 4.18.0-513.9.1.el8_9.x86_64-x86_64
CPython 3.12.2
Other session info:
(micom)[zrwerbin@scc-ul2 ~]$ conda list
# packages in environment at /projectnb/talbot-lab-data/zrwerbin/.conda/envs/micom:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
annotated-types 0.6.0 pyhd8ed1ab_0 conda-forge
anyio 4.3.0 pyhd8ed1ab_0 conda-forge
appdirs 1.4.4 pyh9f0ad1d_0 conda-forge
argon2-cffi 23.1.0 pyhd8ed1ab_0 conda-forge
argon2-cffi-bindings 21.2.0 py312h98912ed_4 conda-forge
arrow 1.3.0 pyhd8ed1ab_0 conda-forge
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
async-lru 2.0.4 pyhd8ed1ab_0 conda-forge
attrs 23.2.0 pyh71513ae_0 conda-forge
babel 2.14.0 pyhd8ed1ab_0 conda-forge
beautifulsoup4 4.12.3 pyha770c72_0 conda-forge
bleach 6.1.0 pyhd8ed1ab_0 conda-forge
brotli-python 1.1.0 py312h30efb56_1 conda-forge
bzip2 1.0.8 hd590300_5 conda-forge
ca-certificates 2024.2.2 hbcca054_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
certifi 2024.2.2 pyhd8ed1ab_0 conda-forge
cffi 1.16.0 py312hf06ca03_0 conda-forge
charset-normalizer 3.3.2 pyhd8ed1ab_0 conda-forge
cobra 0.29.0 pyhd8ed1ab_0 conda-forge
comm 0.2.1 pyhd8ed1ab_0 conda-forge
debugpy 1.8.1 py312h30efb56_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
depinfo 2.2.0 pyhd8ed1ab_0 conda-forge
diskcache 5.6.3 pyhd8ed1ab_0 conda-forge
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
exceptiongroup 1.2.0 pyhd8ed1ab_2 conda-forge
executing 2.0.1 pyhd8ed1ab_0 conda-forge
fqdn 1.5.1 pyhd8ed1ab_0 conda-forge
future 1.0.0 pyhd8ed1ab_0 conda-forge
gf2x 1.3.0 ha476b99_2 conda-forge
glpk 5.0 h445213a_0 conda-forge
gmp 6.3.0 h59595ed_0 conda-forge
gurobi 11.0.1 py312_0 gurobi
h11 0.14.0 pyhd8ed1ab_0 conda-forge
h2 4.1.0 pyhd8ed1ab_0 conda-forge
hpack 4.0.0 pyh9f0ad1d_0 conda-forge
httpcore 1.0.4 pyhd8ed1ab_0 conda-forge
httpx 0.27.0 pyhd8ed1ab_0 conda-forge
hyperframe 6.0.1 pyhd8ed1ab_0 conda-forge
idna 3.6 pyhd8ed1ab_0 conda-forge
importlib-metadata 7.0.1 pyha770c72_0 conda-forge
importlib_metadata 7.0.1 hd8ed1ab_0 conda-forge
importlib_resources 6.1.2 pyhd8ed1ab_0 conda-forge
ipykernel 6.29.3 pyhd33586a_0 conda-forge
ipython 8.22.2 pyh707e725_0 conda-forge
ipywidgets 8.1.2 pyhd8ed1ab_0 conda-forge
isoduration 20.11.0 pyhd8ed1ab_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jinja2 3.1.3 pyhd8ed1ab_0 conda-forge
joblib 1.3.2 pyhd8ed1ab_0 conda-forge
json5 0.9.21 pyhd8ed1ab_0 conda-forge
jsonpointer 2.4 py312h7900ff3_3 conda-forge
jsonschema 4.21.1 pyhd8ed1ab_0 conda-forge
jsonschema-specifications 2023.12.1 pyhd8ed1ab_0 conda-forge
jsonschema-with-format-nongpl 4.21.1 pyhd8ed1ab_0 conda-forge
jupyter 1.0.0 pyhd8ed1ab_10 conda-forge
jupyter-lsp 2.2.4 pyhd8ed1ab_0 conda-forge
jupyter_client 8.6.0 pyhd8ed1ab_0 conda-forge
jupyter_console 6.6.3 pyhd8ed1ab_0 conda-forge
jupyter_core 5.7.1 py312h7900ff3_0 conda-forge
jupyter_events 0.9.0 pyhd8ed1ab_0 conda-forge
jupyter_server 2.13.0 pyhd8ed1ab_0 conda-forge
jupyter_server_terminals 0.5.2 pyhd8ed1ab_0 conda-forge
jupyterlab 4.1.3 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.3.0 pyhd8ed1ab_1 conda-forge
jupyterlab_server 2.25.3 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 3.0.10 pyhd8ed1ab_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libblas 3.9.0 21_linux64_openblas conda-forge
libcblas 3.9.0 21_linux64_openblas conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libflint 2.9.0 h2f819a4_ntl_100 conda-forge
libgcc-ng 13.2.0 h807b86a_5 conda-forge
libgfortran-ng 13.2.0 h69a702a_5 conda-forge
libgfortran5 13.2.0 ha4646dd_5 conda-forge
libgomp 13.2.0 h807b86a_5 conda-forge
liblapack 3.9.0 21_linux64_openblas conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libopenblas 0.3.26 pthreads_h413a1c8_0 conda-forge
libosqp 0.6.3 h6a678d5_0
libqdldl 0.1.7 hcb278e6_0 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libsqlite 3.45.1 h2797004_0 conda-forge
libstdcxx-ng 13.2.0 h7e041cc_5 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libxcrypt 4.4.36 hd590300_1 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
markdown-it-py 3.0.0 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.5 py312h98912ed_0 conda-forge
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
mdurl 0.1.2 pyhd8ed1ab_0 conda-forge
micom 0.33.2 pyhdfd78af_0 bioconda
mistune 3.0.2 pyhd8ed1ab_0 conda-forge
mpc 1.3.1 hfe3b2da_0 conda-forge
mpfr 4.2.1 h9458935_0 conda-forge
mpmath 1.3.0 pyhd8ed1ab_0 conda-forge
nbclient 0.8.0 pyhd8ed1ab_0 conda-forge
nbconvert 7.16.2 pyhd8ed1ab_0 conda-forge
nbconvert-core 7.16.2 pyhd8ed1ab_0 conda-forge
nbconvert-pandoc 7.16.2 pyhd8ed1ab_0 conda-forge
nbformat 5.9.2 pyhd8ed1ab_0 conda-forge
ncurses 6.4 h59595ed_2 conda-forge
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
notebook 7.1.1 pyhd8ed1ab_0 conda-forge
notebook-shim 0.2.4 pyhd8ed1ab_0 conda-forge
ntl 11.4.3 hef3c4d3_1 conda-forge
numpy 1.26.4 py312heda63a1_0 conda-forge
openssl 3.2.1 hd590300_0 conda-forge
optlang 1.8.1 pyhd8ed1ab_0 conda-forge
osqp 0.6.3 py312hfb8ada1_2 conda-forge
overrides 7.7.0 pyhd8ed1ab_0 conda-forge
packaging 23.2 pyhd8ed1ab_0 conda-forge
pandas 2.2.1 py312hfb8ada1_0 conda-forge
pandoc 3.1.12.2 ha770c72_0 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pip 24.0 pyhd8ed1ab_0 conda-forge
pkgutil-resolve-name 1.3.10 pyhd8ed1ab_1 conda-forge
platformdirs 4.2.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.20.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.42 pyha770c72_0 conda-forge
prompt_toolkit 3.0.42 hd8ed1ab_0 conda-forge
psutil 5.9.8 py312h98912ed_0 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 2.6.3 pyhd8ed1ab_0 conda-forge
pydantic-core 2.16.3 py312h4b3b743_0 conda-forge
pygments 2.17.2 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
python 3.12.2 hab00c5b_0_cpython conda-forge
python-dateutil 2.9.0 pyhd8ed1ab_0 conda-forge
python-fastjsonschema 2.19.1 pyhd8ed1ab_0 conda-forge
python-json-logger 2.0.7 pyhd8ed1ab_0 conda-forge
python-libsbml 5.20.2 py312h30efb56_1 conda-forge
python-symengine 0.11.0 py312h83f29e1_1 conda-forge
python-tzdata 2024.1 pyhd8ed1ab_0 conda-forge
python_abi 3.12 4_cp312 conda-forge
pytz 2024.1 pyhd8ed1ab_0 conda-forge
pyyaml 6.0.1 py312h98912ed_1 conda-forge
pyzmq 25.1.2 py312h886d080_0 conda-forge
qdldl-python 0.1.7.post0 py312hfb8ada1_1 conda-forge
qtconsole-base 5.5.1 pyha770c72_0 conda-forge
qtpy 2.4.1 pyhd8ed1ab_0 conda-forge
readline 8.2 h8228510_1 conda-forge
referencing 0.33.0 pyhd8ed1ab_0 conda-forge
requests 2.31.0 pyhd8ed1ab_0 conda-forge
rfc3339-validator 0.1.4 pyhd8ed1ab_0 conda-forge
rfc3986-validator 0.1.1 pyh9f0ad1d_0 conda-forge
rich 13.7.1 pyhd8ed1ab_0 conda-forge
rpds-py 0.18.0 py312h4b3b743_0 conda-forge
ruamel.yaml 0.18.6 py312h98912ed_0 conda-forge
ruamel.yaml.clib 0.2.8 py312h98912ed_0 conda-forge
scikit-learn 1.4.1.post1 py312h394d371_0 conda-forge
scipy 1.12.0 py312heda63a1_2 conda-forge
send2trash 1.8.2 pyh41d4057_0 conda-forge
setuptools 69.1.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sniffio 1.3.1 pyhd8ed1ab_0 conda-forge
soupsieve 2.5 pyhd8ed1ab_1 conda-forge
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
swiglpk 5.0.10 py312h98912ed_0 conda-forge
symengine 0.11.2 hb29318e_0 conda-forge
sympy 1.12 pyh04b8f61_3 conda-forge
terminado 0.18.0 pyh0d859eb_0 conda-forge
threadpoolctl 3.3.0 pyhc1e730c_0 conda-forge
tinycss2 1.2.1 pyhd8ed1ab_0 conda-forge
tk 8.6.13 noxft_h4845f30_101 conda-forge
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
tornado 6.4 py312h98912ed_0 conda-forge
traitlets 5.14.1 pyhd8ed1ab_0 conda-forge
types-python-dateutil 2.8.19.20240106 pyhd8ed1ab_0 conda-forge
typing-extensions 4.10.0 hd8ed1ab_0 conda-forge
typing_extensions 4.10.0 pyha770c72_0 conda-forge
typing_utils 0.1.0 pyhd8ed1ab_0 conda-forge
tzdata 2024a h0c530f3_0 conda-forge
uri-template 1.3.0 pyhd8ed1ab_0 conda-forge
urllib3 2.2.1 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
webcolors 1.13 pyhd8ed1ab_0 conda-forge
webencodings 0.5.1 pyhd8ed1ab_2 conda-forge
websocket-client 1.7.0 pyhd8ed1ab_0 conda-forge
wheel 0.42.0 pyhd8ed1ab_0 conda-forge
widgetsnbextension 4.0.10 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
zeromq 4.3.5 h59595ed_1 conda-forge
zipp 3.17.0 pyhd8ed1ab_0 conda-forge
The beginning of the tax file looks like this, with 1410 rows total:
>>> tax.head()
genus id file species abundance sample_id domain kingdom phylum class order family
0 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.011088 TOOL_002-O-20210804-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
1 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.011444 TOOL_001-M-20170721-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
2 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.019511 ORNL_029-O-20170621-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
3 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.012908 HARV_001-M-20130709-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
4 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.011425 ORNL_002-O-20170619-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
Custom models and some models from AGORA have active demand reactions, for instance for biotin. This can make MICOM fail due to infeasible models in low-nutrient settings.
See micom-dev/media#2 .
Adjust those demands so that the zero flux solution is included. Raise a warning or info in the logger if that is the case.
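A minimal sketch of the adjustment, assuming DM_-prefixed demand reactions (the COBRA convention) and a loaded model object named model:
from micom.logger import logger

for rxn in model.reactions:
    if rxn.id.startswith("DM_") and rxn.lower_bound > 0:
        # a positive lower bound forces flux; relax it so zero flux is feasible
        logger.info("Relaxing forced demand %s (lb=%g)." % (rxn.id, rxn.lower_bound))
        rxn.lower_bound = 0.0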
Hi, I am trying to build a community from two models and I've already changed the metabolite IDs for each model. I renamed the metabolite IDs as KEGG ID + compartment name, such as C12145cytoplasm ('C12145' is the KEGG ID and 'cytoplasm' is the name of the compartment). Firstly, one thing I'd like to double-check: do I need to change the compartment IDs as well, even though I didn't use the compartment IDs for matching models?
Secondly, when I tried to join the two renamed models an error occurred:
File "/Users/wintermute/opt/anaconda3/lib/python3.7/urllib/parse.py", line 107, in <genexpr>
return tuple(x.decode(encoding, errors) if x else '' for x in args)
AttributeError: 'Model' object has no attribute 'decode'
Here is my code:
sc = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/yeast7.6_changeID_final.xml')
kp = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/KP_changeID_final3.xml')
community = [sc,kp]
micom.util.join_models(community)
Does anyone have any idea about that? Thanks.
There are repeated problems with processes not returning, excessive memory usage, and Jupyter weirdness.
We should test switching to spawn for all workflows. Hopefully, the overhead is not too bad.
Forkserver did not perform much better. Deactivating parallel computing would not be popular.
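A minimal sketch of forcing the spawn start method with the standard library; the worker function is a placeholder, not micom's actual workflow code:
import multiprocessing as mp

def _work(arg):
    # placeholder for a single workflow step
    return arg * arg

if __name__ == "__main__":
    # "spawn" starts fresh interpreters instead of forking, avoiding
    # inherited solver state and most Jupyter weirdness
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=4) as pool:
        print(pool.map(_work, range(8)))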
Originally posted by anubhavdas0907 March 8, 2022
Hello Christian,
I was trying to copy a community model to a different variable, but I get an error.
I want to manipulate a model, without changing the original one.
Following are the details.
from micom import load_pickle
model = load_pickle("ERR1883210.pickle")
model_1 = model.copy()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/Conda_base/envs/MyConda/lib/python3.7/site-packages/cobra/core/model.py", line 321, in copy
new = self.__class__()
TypeError: __init__() missing 1 required positional argument: 'taxonomy'
Can you please suggest what's wrong in this code, and what the solution could be?
Regards
Anubhav
I'm new to MICOM and COBRA, and I'd like to use MICOM to predict the metabolites similar to MAMBO and the Garza et al. 2018 paper's approach which, as I understand it, is to maximize the correlation of microbial growth with the known relative abundances. Is this possible with MICOM?
I'd guess this would involve changing the community objective function somehow, or are alternate objective functions already supported?
Thanks!
I'm trying to run com.cooperative_tradeoff on a community using some sbml files but cannot because I'm using the default GLPK and don't know how to use IBM Cplex instead. I downloaded it as an academic edition but do not know how to set it as the solver for micom. Thanks in advance!
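For reference, a minimal sketch of switching the solver, assuming the cplex Python bindings are installed in the same environment; the pickle filename is hypothetical:
import micom

com = micom.load_pickle("community.pickle")  # hypothetical file
com.solver = "cplex"  # the interface can be set by name
sol = com.cooperative_tradeoff(fraction=0.5)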
When setting the upper bound for the import fluxes from the gut lumen, do I need to adjust for organism abundance? I noticed that if I look at community.medium, I get
In [12]: community.medium
Out[12]:
{'EX_4abut__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
'EX_7dhcdchol__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
'EX_7ocholate__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
'EX_C02528__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
...
where the flux for a metabolite includes the organism name already. The community constraints from the MICOM preprint suggest that I would not (instead setting all keys with "7ocholate" to have the same upper bound), but since these dictionary elements already include the organism name, I'm not sure.
Is there a way to see a pretty-print-ish format for the constraints sent off to the solver (Gurobi, in my case)?
Thanks!
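Not authoritative, but one option is dumping the problem in LP format, which is reasonably readable; com.solver is the optlang model and, with the Gurobi interface, com.solver.problem should be the native Gurobi object:
# serialize the optlang model to LP format
print(com.solver.to_lp())

# or let the native Gurobi model write itself
com.solver.problem.write("community.lp")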
When using models from different sources metabolite IDs may not be matched. Mention this pitfall in the docs and implement a warning in case models have no (or little) shared metabolites.
Hi! I really like MICOM! But I am having a problem when trying to perform the Taxa Knockout Analysis.
I want to calculate the log fold change of the growth rates when performing the Taxa Knockouts. Currently, one can obtain one of three possible outputs when using com.knockout_taxa():
method='raw' -> new: the new growth rates for the taxa in the community without the knocked-out taxon.
method='change' -> new - old: the absolute change in the growth rates for the new community without the knocked-out taxon, with respect to the original community.
method='relative change' -> (new - old)/old: the relative change in the growth rates for the new community without the knocked-out taxon, with respect to the original community.
What I actually want for the log fold change is something like log(new) - log(old) (or log(new/old), which is the same). So I attempted two things to achieve this:
1. I ran com.knockout_taxa() with method equal to 'raw' and 'change', and then calculated old (the growth rates for the original community) as old = new - change (now that I know both new and old, I can easily calculate the log fold change using either of the expressions above). The problem is that, when I checked the values of old for all my samples, I consistently found some troubling values for some of the knocked-out taxa (always in one of the first three taxa in my matrix).
com = load_pickle('mouse_01.pickle')
diet = load_qiime_medium("vmh_high_fiber.qza")
com.medium = diet.flux
ko_raw = com.knockout_taxa(fraction=0.6, method='raw', diag=False)
ko_change = com.knockout_taxa(fraction=0.6, method='change', diag=False)
old = ko_raw - ko_change
print(old)
B_caccae B_cellulosilyticus_WH2 B_ovatus \
B_caccae NaN 0.045237 -0.023992
B_cellulosilyticus_WH2 0.000556 NaN 0.000966
B_ovatus 0.000556 0.037356 NaN
B_thetaiotaomicron 0.000556 0.037356 0.000986
B_uniformis 0.000556 0.037356 0.000986
B_vulgatus 0.000556 0.037356 0.000986
C_aerofaciens 0.000556 0.037356 0.000986
C_scindens 0.000556 0.037356 0.000986
C_spiroforme 0.000556 0.037356 0.000986
P_distasonis 0.000556 0.037356 0.000986
R_obeum 0.000556 0.037356 0.000986
B_thetaiotaomicron B_uniformis B_vulgatus \
B_caccae 0.066630 -0.002823 0.106982
B_cellulosilyticus_WH2 0.067358 0.000809 0.106279
B_ovatus 0.067358 0.000795 0.106279
B_thetaiotaomicron NaN 0.000795 0.106279
B_uniformis 0.067358 NaN 0.106279
B_vulgatus 0.067358 0.000795 NaN
C_aerofaciens 0.067358 0.000795 0.106279
C_scindens 0.067358 0.000795 0.106279
C_spiroforme 0.067358 0.000795 0.106279
P_distasonis 0.067358 0.000795 0.106279
R_obeum 0.067358 0.000795 0.106279
...
C_scindens 0.001470
C_spiroforme 0.001470
P_distasonis 0.001470
R_obeum NaN
As you can see, old should be a matrix where the values for all the rows in a column are the same; this is true for most of the taxa, but it fails for the first 3 taxa. This happens consistently across my 15 samples and even when using different media.
2. I modified com.knockout_taxa() so that it returns not only the new growth rates (when using method='raw') but also the old values. And, something that is also strange to me, the values I am obtaining are quite different from the previous calculation (1.):
def knockout_taxa(community, taxa, fraction, method, progress, diag=True):
"""Knockout a taxon from the community."""
with community as com:
check_modification(com)
min_growth = _format_min_growth(0.0, com.taxa)
_apply_min_growth(com, min_growth)
com.objective = com.scale * com.variables.community_objective
community_min_growth = (
optimize_with_retry(com, "could not get community growth rate.") / com.scale
)
regularize_l2_norm(com, fraction * community_min_growth)
old = com.optimize().members["growth_rate"]
results = []
iter = track(taxa, description="Knockouts") if progress else taxa
for sp in iter:
with com:
logger.info("getting growth rates for %s knockout." % sp)
[
r.knock_out()
for r in com.reactions.query(lambda ri: ri.community_id == sp)
]
sol = optimize_with_fraction(com, fraction)
new = sol.members["growth_rate"]
if "change" in method:
new = new - old
if "relative" in method:
new /= old
results.append(new)
old = old.drop("medium", axis=1)
ko = pd.DataFrame(results, index=taxa).drop("medium", axis=1)
ko = ko.loc[ko.index.sort_values(), ko.columns.sort_values()]
if not diag:
np.fill_diagonal(ko.values, np.NaN)
return ko, old
com = load_pickle('mouse_01.pickle')
diet = load_qiime_medium("vmh_high_fiber.qza")
com.medium = diet.flux
new, old = com.knockout_taxa(fraction=0.6, method='raw', diag=False)
print(old)
growth_rate
B_caccae 0.0135695707228535
B_cellulosilyticus_WH2 0.0272505030167579
B_ovatus 0.0114618034691539
B_thetaiotaomicron 0.119902654224808
B_uniformis 0.0132031895100882
B_vulgatus 0.105075269656302
C_aerofaciens 0.00680013487305695
C_scindens 0.00386220655742489
C_spiroforme 0.0250118398731469
P_distasonis 0.058633920577937
R_obeum 0.0226188024707752
Package Information
-------------------
micom 0.33.2
Dependency Information
----------------------
cobra 0.29.0
highspy missing
jinja2 3.1.3
osqp 0.6.5
scikit-learn 1.2.2
scipy 1.11.4
symengine 0.11.0
Build Tools Information
-----------------------
pip 23.3
setuptools 68.0.0
wheel 0.41.2
Platform Information
--------------------
Linux 6.5.0-26-generic-x86_64
CPython 3.10.13
Hi
I have been following the published documentation on the MICOM webpage to create a proper workflow and model manifest for performing analysis and obtaining results. However, I am getting stuck with no rendered output. Below is the script I am using to perform constraint-based modeling with one sample at first (test case). If successful, I plan to extend this to three samples for a microbial community of twelve species, each represented by a genome-scale model and its respective relative abundance. I have provided placeholders for the taxonomy for now. I also included a medium file. However, I do not obtain any results as MICOM simply stalls.
import os
import pandas as pd
from micom.workflows import grow
from micom.workflows.build import build
from micom.media import minimal_medium
from micom.util import join_models, load_pickle, _read_model
from micom import Community
from cobra.io import read_sbml_model, write_sbml_model
# Define the base directory for model files and output
base_dir = "~/Gap-filled_MM2014_GEMs"
# Define the model paths relative to the base directory
model_paths = [
"Dehalobacter_restrictus_new_3.xml",
"Dehalobacter_sp004343605_new_3.xml",
"Acetobacterium_sp003260995.xml",
"Cloacimonadaceae_g__Cloacimonas.xml",
"g__Methanothrix_s.xml",
"Methanobacterium_C_congolense.xml",
"o__Bacteroidales_f__TTA-H9_g__TTA-H9_s__TTA-H9_sp002418865.xml",
"o__Bacteroidales_f__VadinHA17_g__LD21_s__.xml",
"o__Bacteroidales_f__VadinHA17_g__SR-FBR_E99_s__.xml",
"Rectinema_sp002441395.xml",
"Rectinema_sp015657395.xml",
"Rectinema_subterraneum.xml"
]
# Define the abundances
abundances = {
"Dehalobacter_restrictus_new_3": 0.3673,
"Dehalobacter_sp004343605_new_3": 0.0034,
"Acetobacterium_sp003260995": 0.0266,
"Cloacimonadaceae_g__Cloacimonas": 0.0258,
"g__Methanothrix_s": 0.0244,
"Methanobacterium_C_congolense": 0.0389,
"o__Bacteroidales_f__TTA_H9_g__TTA_H9_s__TTA_H9_sp002418865": 0.0227,
"o__Bacteroidales_f__VadinHA17_g__LD21_s__": 0.0021,
"o__Bacteroidales_f__VadinHA17_g__SR-FBR_E99_s__": 0.0008,
"Rectinema_sp002441395": 0.0063,
"Rectinema_sp015657395": 0.0744,
"Rectinema_subterraneum": 0.081
}
# Create placeholder taxonomic data
taxonomic_data = {
"kingdom": ["Bacteria"] * len(model_paths),
"phylum": ["Placeholder_Phylum"] * len(model_paths),
"class": ["Placeholder_Class"] * len(model_paths),
"order": ["Placeholder_Order"] * len(model_paths),
"family": ["Placeholder_Family"] * len(model_paths),
"genus": ["Placeholder_Genus"] * len(model_paths),
"species": ["Placeholder_Species"] * len(model_paths),
}
# Create the full paths for the models
full_model_paths = [os.path.join(base_dir, path) for path in model_paths]
# Create a DataFrame with model IDs, file paths, abundances, and taxonomic information
taxonomy_df = pd.DataFrame({
"id": [path.split('/')[-1].replace('.xml', '') for path in model_paths],
"file": full_model_paths,
"abundance": [abundances[path.split('/')[-1].replace('.xml', '')] for path in model_paths],
"sample_id": [f"sample_{i}" for i in range(len(model_paths))],
**taxonomic_data
})
# Ensure the output directory exists
output_directory = os.path.join(base_dir, "MICOM")
os.makedirs(output_directory, exist_ok=True)
# Debugging: Print the current working directory and full paths to verify correctness
print("Current working directory:", os.getcwd())
print("Full model paths:", full_model_paths)
print("Output directory:", output_directory)
# Check if all model files exist
for path in full_model_paths:
if not os.path.isfile(path):
print(f"File not found: {path}")
else:
print(f"File exists: {path}")
# Build the community models using the build function
manifest = build(taxonomy_df, model_db=None, out_folder=output_directory, cutoff=0.0001, threads=1, solver=None)
# Read the medium file into a pandas DataFrame
medium_file_path = os.path.join(output_directory, "edlab_mineral_medium_BiGG_IDs.csv")
medium_df = pd.read_csv(medium_file_path)
# Ensure the medium DataFrame has the correct columns
if 'reaction' not in medium_df.columns or 'flux' not in medium_df.columns:
raise ValueError("Medium file must contain 'reaction' and 'flux' columns")
# Add sample_id column to medium DataFrame
medium_df['sample_id'] = taxonomy_df['sample_id'][0]
# Use the grow function with the medium DataFrame
res = grow(manifest, model_folder=output_directory, medium=medium_df, tradeoff=0.5, threads=1)
# Print results
print(res)
The grow function should complete and return results, but it currently stalls with no rendered output.
Additional Context
Operating System: macOS
Python Version: 3.9
MICOM Version: 0.35.0
Any guidance or suggestions to resolve this issue would be greatly appreciated.
Thank you!
Hello, I want to know if the community model generated by MICOM automatically takes into account the coupling of microbial reactions with their biomass reactions.
I want to enforce that flux through a microbial reaction only occurs when its biomass reaction has a non-zero value, by including coupling constraints.
But I don't see how I can loop over all microbial reactions to couple each of them with the biomass reactions.
Do you have any idea please?
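Not an official micom API, but a minimal sketch of adding coupling constraints of the form v_r <= c * v_biomass for one taxon via the optlang interface that cobra exposes; the coupling factor, taxon id, and biomass reaction id pattern are assumptions:
c = 400.0  # coupling factor, an assumption
taxon = "Escherichia_coli_1"  # hypothetical taxon id
biomass = com.reactions.get_by_id("biomass__" + taxon)  # id pattern assumed

constraints = []
for rxn in com.reactions.query(lambda r: r.community_id == taxon):
    if rxn.id == biomass.id:
        continue
    # forward coupling: v_r - c * v_biomass <= 0; reversible reactions
    # would need a symmetric constraint with a lower bound as well
    constraints.append(
        com.problem.Constraint(
            rxn.flux_expression - c * biomass.flux_expression,
            ub=0.0,
            name="coupling_" + rxn.id,
        )
    )
com.add_cons_vars(constraints)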
I'm wondering what the best approach would be to save output from the "grow" workflow in case it does not complete for a large dataset. I'm splitting my manifest file by rows into smaller chunks, which produces many results files, and still sometimes stalls at 98-99% complete, which might just be due to one problematic sample (right?)
Would you recommend modifying the internal "_growth" function to save results right before the "return" call? Ideally, the growth results could then be combined to still be used as inputs for visualizations.
Thanks!
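One hedged possibility that avoids modifying internals: run grow over manifest chunks and pickle each result as soon as it returns, so a stalled chunk only costs that chunk (paths and the chunk count are illustrative):
import pickle
import numpy as np
from micom.workflows import grow

for i, chunk in enumerate(np.array_split(manifest, 20)):
    res = grow(chunk, model_folder="models/", medium=medium, tradeoff=0.5, threads=4)
    # persist immediately so completed chunks survive a later stall
    with open("growth_chunk_%02d.pickle" % i, "wb") as fh:
        pickle.dump(res, fh)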
Hello,
I have installed the latest version of MICOM (through pip) and have been unsuccessful in importing it. Interestingly, after installing MICOM, importing Cobra by itself also yields the same error. Any help with this?
import micom
OSQP and HIGHS are available but could not load the hybrid interface.
Traceback (most recent call last):
File "/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/optlang/__init__.py", line 61, in <module>
from optlang import hybrid_interface
File "/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/optlang/hybrid_interface.py", line 26, in <module>
import optlang.matrix_interface as mi
File "/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/optlang/matrix_interface.py", line 40, in <module>
from numpy import Infinity, array, concatenate, zeros
ImportError: cannot import name 'Infinity' from 'numpy' (/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/numpy/__init__.py). Did you mean: 'isfinite'?
Process finished with exit code 0
Python 3.12.3
micom 0.35.0
cobra 0.29
The Genome Taxonomy Database (GTDB) is comprehensive (especially the new v202 release) and more robust than the NCBI microbial taxonomy, especially given that the GTDB taxonomy is based entirely on genome phylogenetic relatedness.
Although the MICOM docs are vague about the taxonomy that one must use, it appears that the NCBI taxonomy is required.
Provide direct support for the GTDB taxonomy.
I created a simple community model with two species, set the solver to gurobi, and saved the community model using to_pickle(). When I load the community model, gurobi logs several messages which makes loading the community model slow. Is this just the way gurobi works or are there some attributes or settings I should be using to make loading a model with gurobi as the solver efficient?
Here's how I created the community model:
>>> import micom
>>> import pandas as pd
>>> tax = pd.DataFrame({'id': ['a', 'b'], 'file': ['Clostridium_hathewayi_DSM_13479.xml', 'Eubacterium_siraeum_DSM_15702.xml']})
>>> com = micom.Community(tax, id='test', progress=False)
>>> com.solver = 'gurobi'
>>> com.optimize()
<CommunitySolution 31.217 at 0x7f713cad3198>
>>> com.to_pickle('gurobitest.pickle')
Then, in a new Python session, I loaded the community model with these commands:
>>> import micom
>>> com = micom.load_pickle('gurobitest.pickle')
Academic license - for non-commercial use only
Changed value of parameter Method to 0
Prev: -1 Min: -1 Max: 5 Default: -1
Changed value of parameter Presolve to 0
Prev: -1 Min: -1 Max: 2 Default: -1
>>> com.solver
<optlang.gurobi_interface.Model object at 0x7fb02f8c2470>
>>> com.optimize()
<CommunitySolution 31.217 at 0x7fb036450a20>
Hi!
I have a question about the "status" flag that is returned when cooperative_tradeoff is run. In particular, sometimes I get "optimal" as a result, but most of the time I get "numeric". What does it mean?
Sincerely,
Arianna Basile
Good morning.
I have recently used your program to analyze metabolic exchanges in a community. I have only one question: in the output I have two columns for each metabolite, one called, for example, EX_but_e and the other EX_but_m. EX_but_m appears only in one row, which is the medium one. I assumed that by summing EX_but_e over all the microbes in my community I would get the value in the medium row in the EX_but_m column. But that is not the case. Why?
Sincerely,
Arianna Basile
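If it helps to check, here is a minimal sketch of the comparison under the assumption that micom reports taxon fluxes per unit taxon biomass, so the medium flux corresponds to an abundance-weighted sum rather than a plain sum (attribute names and sign conventions may differ between versions):
# per-taxon butyrate exchange, excluding the medium row
ex = sol.fluxes["EX_but_e"].drop("medium")

# abundance-weighted sum vs. the reported medium exchange
weighted = (ex * com.abundances).sum()  # .abundances is assumed here
print(weighted, sol.fluxes.loc["medium", "EX_but_m"])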
The "Plotting consumed metabolites" section of the documentation describes the function plot_exchanges_per_sample
with default parameter direction='import'
as plotting the metabolites consumed by the microbial community.
In the source code for plot_exchanges_per_sample
, the subset with taxon='medium
' is selected. But isn't the "import" for the 'medium'
actually the export for the microbial community? This seems at odds with the description of plot_exchanges_per_sample(direction='import')
as plotting the consumption patterns of the microbes.
Perhaps I am misinterpreting the direction
of the fluxes for the 'medium
' in the output of the grow
function?
Originally posted by anubhavdas0907 April 14, 2022
Hello Christian,
I just wanted to point out that in the documentation for micom.workflows.grow.grow, the default strategy in the code is 'minimal imports', while in the explanation, pFBA is mentioned as default. Kindly edit accordingly.
micom.workflows.grow.grow(manifest, model_folder, medium, tradeoff, threads=1, weights=None, strategy='minimal imports', atol=None, rtol=None, presolve=False)
strategy : str
Computational strategy used to reduce the flux space. Default "pFBA" uses
parsimonious FBA, ...
Regards
Anubhav
Even with fluxes=True passed into community.optimize_all(), only maximal growth rates are returned. In fact, community.optimize_single() does not accept the fluxes argument.
Using version '0.22.6'.
While importing micom version 0.9.2 I got the following ImportError. Any suggestions to solve it?
Sincerely,
Arianna Basile
ImportError Traceback (most recent call last)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/swiglpk.py in swig_import_helper()
13 try:
---> 14 return importlib.import_module(mname)
15 except ImportError:
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/__init__.py in import_module(name, package)
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)
127
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _gcd_import(name, package, level)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _find_and_load(name, import_)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _load_unlocked(spec)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in module_from_spec(spec)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap_external.py in create_module(self, spec)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)
ImportError: libglpk.so.40: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
ModuleNotFoundError Traceback (most recent call last)
in <module>
----> 1 import micom
2 import cplex
3 import cobra
4 import pandas
5 import gurobi
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/__init__.py in <module>
1 """Simple init file for mico."""
2
----> 3 from micom.community import Community, load_pickle
4 from micom import (algorithms, problems, util, data, duality, media,
5 solution, workflows)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/community.py in
12 from micom.logger import logger
13 from micom.media import default_excludes
---> 14 from micom.optcom import optcom, solve
15 from micom.problems import cooperative_tradeoff, knockout_species
16
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/optcom.py in
5 check_modification)
6 from micom.logger import logger
----> 7 from micom.solution import solve
8 from optlang.symbolics import Zero
9 from functools import partial
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/solution.py in
14 from micom.logger import logger
15 from micom.util import reset_min_community_growth
---> 16 from swiglpk import glp_adv_basis
17
18
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/__init__.py in <module>
----> 1 from .swiglpk import *
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/swiglpk.py in
15 except ImportError:
16 return importlib.import_module('_swiglpk')
---> 17 _swiglpk = swig_import_helper()
18 del swig_import_helper
19 elif _swig_python_version_info >= (2, 6, 0):
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/swiglpk.py in swig_import_helper()
14 return importlib.import_module(mname)
15 except ImportError:
---> 16 return importlib.import_module('_swiglpk')
17 _swiglpk = swig_import_helper()
18 del swig_import_helper
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/__init__.py in import_module(name, package)
124 break
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)
127
128
ModuleNotFoundError: No module named '_swiglpk'
Not really a problem, but I noticed that the workflows.build_database() function does not really compress the output. This is the current state of the code in workflows.build_database():
with ZipFile(out_path, "w") as zf:
[zf.write(a[2], os.path.basename(a[2])) for a in args]
zf.write(os.path.join(tdir, "manifest.csv"), "manifest.csv")
But if we look at the ZipFile() documentation, we can see that by default the compression level is 0 (no compression).
Init signature:
zipfile.ZipFile(
file,
mode='r',
compression=0,
allowZip64=True,
compresslevel=None,
)
Docstring:
Class with methods to open, read, write, close, list zip files.
z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=True,
compresslevel=None)
file: Either the path to the file, or a file-like object.
If it is a path, the file will be opened and closed by ZipFile.
mode: The mode can be either read 'r', write 'w', exclusive create 'x',
or append 'a'.
compression: ZIP_STORED (no compression), ZIP_DEFLATED (requires zlib),
ZIP_BZIP2 (requires bz2) or ZIP_LZMA (requires lzma).
allowZip64: if True ZipFile will create files with ZIP64 extensions when
needed, otherwise it will raise an exception when this would
be necessary.
compresslevel: None (default for the given compression type) or an integer
specifying the level to pass to the compressor.
When using ZIP_STORED or ZIP_LZMA this keyword has no effect.
When using ZIP_DEFLATED integers 0 through 9 are accepted.
When using ZIP_BZIP2 integers 1 through 9 are accepted.
Init docstring:
Open the ZIP file with mode read 'r', write 'w', exclusive create 'x',
or append 'a'.
I think this cancels out the advantage of using a .zip file.
It would be nice if we had the option to choose a compression level when building the model database. This should not interfere with the reading in other parts of the code since the same module (zipfile) is used.
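For illustration, the same snippet with DEFLATE compression enabled (compresslevel 6 as an arbitrary middle ground; out_path, args, and tdir as in the snippet above):
import os
from zipfile import ZipFile, ZIP_DEFLATED

with ZipFile(out_path, "w", compression=ZIP_DEFLATED, compresslevel=6) as zf:
    for a in args:
        zf.write(a[2], os.path.basename(a[2]))
    zf.write(os.path.join(tdir, "manifest.csv"), "manifest.csv")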
This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment 😄
The current format for fluxes returned by MICOM is a table in wide format:
In [1]: from micom import Community
In [2]: from micom.data import test_taxonomy
In [3]: com = Community(test_taxonomy())
Building ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
In [7]: sol = com.cooperative_tradeoff(fluxes=True)
In [8]: sol.fluxes
Out[8]:
reaction ACALD ACALDt ACKr ACONTa ACONTb ACt2r ADK1 ... SUCDi SUCOAS TALA THD2 TKT1 TKT2 TPI
compartment ...
Escherichia_coli_1 0.049190 -0.008897 -0.004224 5.999485 5.999485 -0.004224 3.388665e-11 ... 5.017641 -5.017641 1.489184 1.924736e-10 1.489184 1.173698 7.513137
Escherichia_coli_2 -0.079989 -0.115231 0.072559 6.001066 6.001066 0.072559 4.264225e-11 ... 5.033051 -5.033051 1.491048 1.924125e-10 1.491048 1.175562 7.495742
Escherichia_coli_3 0.102350 0.197394 -0.100513 6.004985 6.004985 -0.100513 3.662292e-11 ... 5.083935 -5.083935 1.506075 1.926208e-10 1.506075 1.190589 7.460396
Escherichia_coli_4 -0.071551 -0.073266 0.032177 6.023463 6.023463 0.032177 4.133342e-11 ... 5.122875 -5.122875 1.501628 1.926284e-10 1.501628 1.186143 7.440253
medium NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN
[5 rows x 115 columns]
This has resulted in some issues: the format differs from cobra.Solution.fluxes, which breaks a lot of the cobra functionality, for instance the summary methods. CommunitySolution.fluxes will retain the cobrapy format and will be superseded by new accessors that all return fluxes in long format:
CommunitySolution.exchange_fluxes: exchange fluxes in long format, with the taxa annotated.
reaction name taxon flux direction micom_id
0 EX_ac_m ac_m medium exchange medium 1.814984e-11 export EX_ac_m
1 EX_acald_m acald_m medium exchange medium 1.328645e-11 export EX_acald_m
2 EX_akg_m akg_m medium exchange medium 3.225128e-12 export EX_akg_m
3 EX_co2_m co2_m medium exchange medium 2.280983e+01 export EX_co2_m
4 EX_etoh_m etoh_m medium exchange medium 1.515389e-11 export EX_etoh_m
.. ... ... ... ... ...
CommunitySolution.internal_fluxes
reaction name taxon flux micom_id
0 ACALD Acetaldehyde dehydrogenase (acetylating) Escherichia_coli_1 1.312146e+00 ACALD__Escherichia_coli_1
1 ACALDt Acetaldehyde reversible transport Escherichia_coli_1 3.236132e+00 ACALDt__Escherichia_coli_1
2 ACKr Acetate kinase Escherichia_coli_1 -1.304078e+00 ACKr__Escherichia_coli_1
3 ACONTa Aconitase (half-reaction A, Citrate hydro-lyase) Escherichia_coli_1 5.987675e+00 ACONTa__Escherichia_coli_1
4 ACONTb Aconitase (half-reaction B, Isocitrate hydro-l... Escherichia_coli_1 5.987675e+00 ACONTb__Escherichia_coli_1
This will consolidate GrowthResults and CommunitySolution and give a more readable format. All those properties are generated on the fly when accessed.
Additionally, we may also want to save the annotations in the solution, but they may be large, so it might be better to have a property on the model class like Community.annotations.
A similar format change is planned for Community.knockout_taxa. elasticities already uses a long format.
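For comparison, a rough sketch of deriving such a long format from the current wide fluxes with plain pandas (column and index names follow the wide output shown above):
long = (
    sol.fluxes.reset_index()
    .melt(id_vars="compartment", var_name="reaction", value_name="flux")
    .rename(columns={"compartment": "taxon"})
    .dropna(subset=["flux"])  # drop reactions absent from a taxon
)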
I want to analyse the microbial growth rates for a given microbiome profile (single sample) when run on multiple different diets (or media). The idea is to understand how individual growth rates are impacted by changes in the medium. Running multiple samples on the same medium with grow and the appropriate models has built-in multithreading to take advantage of a multi-core CPU, but growth of a single sample on multiple media doesn't have a multithreading option.
Given that grow on a single sample uses only 1 thread, I want to be able to run growth calculations on different media (100-200 types of media) using multiprocessing to use all threads simultaneously.
I currently use the pool.starmap function from the standard Python multiprocessing library. I am facing an issue of memory pile-up (likely due to workers that are not deallocated/terminated despite the completion of the calculation) that causes the program to crash due to heavy RAM usage.
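A hedged sketch of one mitigation: recycle worker processes with maxtasksperchild=1 under the spawn context, so each medium runs in a fresh process whose memory is returned on exit (file names and the worker body are illustrative):
import multiprocessing as mp
from micom import load_pickle

def grow_on(medium_path):
    com = load_pickle("sample.pickle")  # hypothetical community file
    # ... apply the medium here and run com.cooperative_tradeoff(...) ...
    return medium_path

if __name__ == "__main__":
    media = ["medium_%03d.csv" % i for i in range(150)]  # hypothetical media files
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=8, maxtasksperchild=1) as pool:
        results = pool.map(grow_on, media)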
This issue is a collection of small things that should be fixed in the docs.
Just curious, since there is additional information in the cobrapy Solution object about what did not work that could be returned. Or does it just not make sense to return a CommunitySolution object when the optimization did not work? I'm most interested in knowing what the solver status is.
The format for the taxonomy table and community model manifests can be unclear, and both are often confused for one another. Provide a validation helper and better error messages.
When trying to build a community, I always get a ValueError similar to the one below.
ValueError: ('The model already contains a reaction with the id:', 'ABC_RXN__smaAA1')
Files causing this can be found at https://gitlab.com/mostueve/micom-error-files
Hi. I have a question about cooperative_tradeoff. (micom 0.16.1, cplex 12.10.0)
I used a medium condition from the VMH database and manually added some components so that my community model (over 50 members) could grow.
What I want to look at is the exchange reactions of each member.
It worked when I ran this code with the community model:
com.cooperative_tradeoff(min_growth=0,fraction=0.4, fluxes=False, pfba=True)
But cooperative_tradeoff with fluxes=True returned this result.
com.cooperative_tradeoff(min_growth=0,fraction=0.4, fluxes=True, pfba=True)
2020-07-18 03:23:17.495 | WARNING | micom.solution:solve:222 - solver encountered an error infeasible
Traceback (most recent call last):
File "C:/Users/user/PycharmProjects/MicomEdit/trash.py", line 74, in <module>
a=com.cooperative_tradeoff(min_growth=0,fraction=0.4,fluxes=True, pfba=True)
File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\micom\community.py", line 693, in cooperative_tradeoff
return cooperative_tradeoff(self, min_growth, fraction, fluxes, pfba)
File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\micom\problems.py", line 98, in cooperative_tradeoff
sol = crossover(com, sol, fluxes=fluxes, pfba=pfba)
File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\micom\solution.py", line 282, in crossover
% community.solver.status
cobra.exceptions.OptimizationError: crossover could not converge (status = infeasible).
Process finished with exit code 1
Also, I found that if I run it with other samples (with different members in the microbial community) it sometimes works with fluxes=True.
I'm trying to find a medium condition which I can apply to multiple samples before I compare exchange fluxes.
This issue keeps occurring, and I am wondering why the "fluxes" parameter matters and how to solve it.
Thank you!
Hi,
I am experiencing instability with MICOM when solving some cooperative tradeoff problems in small communities (2-5 species).
In short: assuming that a problem solved on a certain community gives a flux solution S, the same problem solved again returns a flux solution S' (all arguments and constraints being equal). After the second run, the returned solution remains S'.
Below there is a two-model community example using the E. coli iJO1366 model from BiGG.
System information:
OS Ubuntu 16.04
CPLEX 12.8
python 3.6.12
pandas 1.1.3
cobra 0.20.0
micom 0.21.3
import pandas as pd
import micom as mc
species_names = ['ecoli1', 'ecoli2']
model_paths = ['iJO1366.xml', 'iJO1366.xml']
abundances = [0.5, 0.5]
species_df = pd.DataFrame(data={'id': species_names, 'file': model_paths, 'abundance': abundances})
com = mc.Community(species_df, solver='cplex')
sol1 = com.cooperative_tradeoff(fraction = 0.5, fluxes=True, pfba=True)
sol2 = com.cooperative_tradeoff(fraction = 0.5, fluxes=True, pfba=True)
sol3 = com.cooperative_tradeoff(fraction = 0.5, fluxes=True, pfba=True)
print('first vs second:')
print((sol1.fluxes.round(6) - sol2.fluxes.round(6)).abs().max().max())
print('first vs third:')
print((sol1.fluxes.round(6) - sol3.fluxes.round(6)).abs().max().max())
Expected output:
first vs second:
0.0
first vs third:
0.0
Obtained output:
first vs second:
0.002876000000000545
first vs third:
0.002876000000000545
The status is optimal for all the three solutions.
While in this example the maximum flux difference is not dramatic, it tends to increase with more limiting medium uptake bounds, and it grows worryingly large with communities composed of automatically generated models. Apart from individual model instabilities, this behaviour would seem to depend also on some numerical instability in cooperative tradeoff that gets adjusted by CPLEX after the first run; see the previous discussion in opencobra/cobrapy#924.
It would seem that S' is the solution to be trusted given that it remains stable even after the third run, and because the two identical models in the example above get equal growth rates only in S' (up to solver tolerance):
print(sol1.members.loc['ecoli1', 'growth_rate'].round(6) == sol1.members.loc['ecoli2', 'growth_rate'].round(6))
print(sol2.members.loc['ecoli1', 'growth_rate'].round(6) == sol2.members.loc['ecoli2', 'growth_rate'].round(6))
print(sol3.members.loc['ecoli1', 'growth_rate'].round(6) == sol3.members.loc['ecoli2', 'growth_rate'].round(6))
Expected output:
True
True
True
Obtained output:
False
True
True
Is the reasoning correct?
I was wondering if this behaviour has been observed before and if the solver version could make a difference, or else what could be done to avoid it.
Cheers,
Guido
When using the second dataframe output from complete_db_medium and summing up the fluxes per metabolite as the medium file for grow, most communities couldn't grow (the flux values seemed too low).
I also tried getting the per-sample results using the minimal_media function with summarize=False, but I'm receiving this error after 100%:
from micom.workflows import minimal_media
minimal_media_persample = minimal_media(manual_amplicon_manifest, summarize=False,
model_folder="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/manual_amplicon/",
min_growth=.01,threads=28)
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:04:52
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In[5], line 3
1 from micom.workflows import minimal_media
----> 3 minimal_media_persample = minimal_media(manual_amplicon_manifest, summarize=False,
4 model_folder="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/manual_amplicon/",
5 min_growth=.01,threads=28)
File /projectnb/talbot-lab-data/zrwerbin/.conda/envs/micom/lib/python3.12/site-packages/micom/workflows/media.py:64, in minimal_media(manifest, model_folder, summarize, min_growth, threads)
62 if summarize:
63 medium = results.groupby("reaction").flux.max().reset_index()
---> 64 medium["metabolite"] = medium.reaction.str.replace("EX_", "")
65 return medium
UnboundLocalError: cannot access local variable 'medium' where it is not associated with a value
Originally posted by @zoey-rw in #177 (comment)
I want to be able to increase the number of threads allowed in the grow function. It seems that the number of threads assigned to the growth of a single model (community model from a single sample) is fixed at 1. The threads option is used only when the model folder has multiple models.
Since the grow function can take up to 20 minutes for a solver to converge on a solution, I would like to be able to parallelise the grow process for individual models in order to speed up solve time and optimise system usage, since running a single model engages only 1 thread, leaving the others unused.
From personal communication. Since it is an allowed file extension for SBML files, we should support it.
In the documentation when running minimal_medium(), what are the units for the fluxes returned? I'm guessing something like mMol / hr. Also when returning fluxes for a community, I noticed that the fluxes are broken down by metabolite and bacteria. Do the fluxes already include a scaling factor for the abundance of the bacteria in the community?
Thanks!
Hi,
I'm having some issues with the load_pickle command:
for com_No in all_com:
    com_name = 'com_' + str(com_No) + '.pickle'
    print(com_name)
    com = load_pickle('C:/Users/rotemb/OneDrive/Ph.D/binning_90_5/communities/com_test/' + com_name)
I'm getting the error KeyError: '_gpr' when trying to load a pickle file in Python. My pickle file contains 4 species; the GEMs were built using CarveMe and validated with MEMOTE.
Following your recommendation from #31, I execute build_database as follows:
# df has columns:
# id kingdom phylum class order family genus species file
> m = build_database( manifest = df , out_path = "/data/out" , threads = 8 )
which gives a warning message (but still completes successfully):
/usr/local/lib/python3.8/dist-packages/micom/workflows/build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.
meta.index = meta[rank].str.replace("[^\\w\\_]", "_")
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━ 79% 0:02:12
System information:
$ python3 -c "import micom; micom.show_versions()"
System Information
==================
OS Linux
OS-release 5.4.0-1030-aws
Python 3.8.5
Package Versions
================
cobra 0.21.0
jinja2 2.10.1
micom 0.22.5
pip 20.0.2
scikit-learn 0.24.1
scipy 1.6.1
setuptools 45.2.0
symengine 0.7.2
wheel 0.34.2
Similar to the QIIME 2 plugin, the micom readme and docs should mention where one can ask questions.
I have a custom set of SBML files for the organisms in my samples.
The input DataFrame looks like this:
> df.head()
sample_id id abundance file
0 sample1 taxa1 0.364398 /data/model.taxa1.xml.gz
1 sample1 taxa2 0.029670 /data/model.taxa2.xml.gz
2 sample1 taxa3 0.016877 /data/model.taxa3.xml.gz
...
Following the micom documentation for building community models, I execute build as follows:
from micom.workflows import build
manifest = build( taxonomy = df , model_db = None , out_folder = '/data/out', cutoff = 1e-2 , threads = 8 , solver = 'cplex')
Note that the model_db argument is None because there is a file column in the input DataFrame, as instructed by the documentation for build.
However, the above build command generates the following error:
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% -:--:--
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.8/dist-packages/micom/workflows/build.py", line 28, in build_and_save
metrics = com.build_metrics.to_frame().T
File "/usr/local/lib/python3.8/dist-packages/micom/community.py", line 652, in build_metrics
raise ValueError(
ValueError: Metrics are only available for models build using a model database :(
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ubuntu/work/20210423_micom/src/test.py", line 13, in <module>
manifest = build( taxonomy = indf , model_db = None , out_folder = outdir , cutoff = 1e-2 , threads = 8 , solver = 'cplex')
File "/usr/local/lib/python3.8/dist-packages/micom/workflows/build.py", line 87, in build
res = workflow(build_and_save, args, threads)
File "/usr/local/lib/python3.8/dist-packages/micom/workflows/core.py", line 41, in workflow
results = list(it)
File "/usr/local/lib/python3.8/dist-packages/rich/progress.py", line 148, in track
yield from progress.track(
File "/usr/local/lib/python3.8/dist-packages/rich/progress.py", line 669, in track
for value in sequence:
File "/usr/lib/python3.8/multiprocessing/pool.py", line 868, in next
raise value
ValueError: Metrics are only available for models build using a model database :(
Here is some system information:
$ python3 -c "import micom; micom.show_versions()"
System Information
==================
OS Linux
OS-release 5.4.0-1030-aws
Python 3.8.5
Package Versions
================
cobra 0.21.0
jinja2 2.10.1
micom 0.22.5
pip 20.0.2
scikit-learn 0.24.1
scipy 1.6.1
setuptools 45.2.0
symengine 0.7.2
wheel 0.34.2
Hello,
I am running optcom on different communities with sol = com.optcom(strategy='original', min_growth=0.0, fluxes=False, pfba=True) and I would like to automatically export the growth rates of all microbial models in the community. I tried sol.growth_rate and sol.objective_value, but in both cases I get the growth rate of the community and not of each single metabolic model. How can I solve this issue?
Best,
Arianna
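For reference, a minimal sketch assuming the solution object above: the per-taxon growth rates live on the members frame of the solution rather than on growth_rate or objective_value:
rates = sol.members["growth_rate"].drop("medium", errors="ignore")
rates.to_csv("growth_rates.csv")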