micom-dev / micom
Python package to study microbial communities using metabolic modeling.
Home Page: https://micom-dev.github.io/micom
License: Apache License 2.0
This is a consequence of opencobra/optlang#232.
Dear maintainers,
I'm encountering a strange issue with micom.media.minimal_medium() in MICOM (version 0.16.1). Apparently the computed medium contains more metabolites than the initial medium provided to the community. From what I understand, this should not be possible with the parameter open_exchanges=False (more details below).
I have a community of roughly 250 species from AGORA 1.03 (without mucins) and a custom medium derived from the WesternDiet provided in the original MICOM publication: western_diet_custom.txt. I also use CPLEX 12.9.0.0.
I apply the medium to my community in a similar way to that shown in the documentation:
# List of exchange reactions in the model
ex_ids = [r.id for r in com.exchanges]
# Find medium reactions that are recognized by the community
medium_in_ex_ids = medium.index.isin(ex_ids)
# Define medium of community
com.medium = medium[medium_in_ex_ids]
Then, I run a cooperative tradeoff to obtain the community growth rate:
# Compute growth rate on diet
sol = com.cooperative_tradeoff(fraction=0.7, fluxes=True, pfba=False)
sol
So far so good. And I'm also able to compute the minimal medium necessary to obtain 95% of this community growth rate:
# Compute minimal medium necessary to reach 95% growth rate
med = micom.media.minimal_medium(
com,
0.95 * sol.growth_rate,
exports=False,
open_exchanges=False, # !!!
)
med
EX_26dap_M_m 0.013413
EX_2dmmq8_m 0.011625
EX_acgam_m 23.994033
EX_adocbl_m 0.000054
EX_arg_L_m 3.600214
...
EX_isochol_m 0.000001
EX_HC02194_m 0.000033
EX_uchol_m 0.000032
EX_sel_m 0.001193
EX_cellul_m 0.000836
Length: 130, dtype: float64
This is where I notice a problem: when I compare the fluxes of the minimal medium with the original medium, there are a handful of extra reactions that were not initially present (reproduced below).
From what I understand, it should not be possible with open_exchanges=False. Any idea of what is happening? I hope I'm giving enough information (feel free to ask for more; I didn't share the pickle of the community because it's a 300 MB file, but I could find a way if it's necessary).
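For reference, a minimal sketch of how the extra components can be listed, using the names from the snippets above:
# exchanges present in the minimal medium but absent from the applied medium
extra = med.index.difference(medium[medium_in_ex_ids].index)
print(extra)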
Note that I did not observe this problem with AGORA 1.03 (with mucins) or AGORA 1.02, and it seems to only happen with some specific samples. My main goal was to use minimal_medium() to identify essential metabolites required by my communities, but the fact that the function adds more reactions than expected might be troublesome.
Cheers,
Nils
The warning regarding media components with missing imports is a bit too fatalistic given that it is not a real problem in most cases. See #63 for instance.
Maybe change it to an info message and only trigger a warning if a large number of metabolites cannot be consumed or if there are no consumable carbon and nitrogen sources.
Just changing the text of the warning, but it may still be too verbose.
After creating a custom database, the "build" command sometimes stalls at 99% when creating a manifest from 100+ samples and 16 taxa. If I open a new Python session and run the same build command, it occasionally creates the manifest successfully. I'm also encountering stalling at 99% complete with the fix_medium command (same dataset).
I'm using 28 cores so I don't think it's a lack of RAM, but would appreciate any ideas for debugging!
>>> build_database(tax, out_path="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/database", rank='genus', threads=28, compress=None, compresslevel=6, progress=True)
>>> soil_db = "/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/database/"
>>>
>>> manifest = build(tax,
... out_folder="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/",
... model_db=soil_db, cutoff=0.00001, threads=28)
[12:00:26] WARNING Found existing models for 102 samples. Will skip those. Delete the output folder if you would like me to rebuild them.
[normal messages ...]
Set parameter TokenServer to value "sccsvc"
Set parameter GURO_PAR_SPECIAL
Set parameter TokenServer to value "sccsvc"
Read LP format model from file /scratch/5511857.1.geo-int/tmpcafsj2wj.lp
Reading time = 0.14 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Set parameter GURO_PAR_SPECIAL
Set parameter TokenServer to value "sccsvc"
Read LP format model from file /scratch/5511857.1.geo-int/tmptp3038rs.lp
Reading time = 0.12 seconds
: 17548 rows, 40257 columns, 173487 nonzeros
Read LP format model from file /scratch/5511857.1.geo-int/tmpjz3xnx_x.lp
Reading time = 0.14 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Set parameter GURO_PAR_SPECIAL
Set parameter TokenServer to value "sccsvc"
Read LP format model from file /scratch/5511857.1.geo-int/tmpmq1q5ngz.lp
Reading time = 0.13 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Read LP format model from file /scratch/5511857.1.geo-int/tmp5vkdrvvb.lp
Reading time = 0.13 seconds
: 19665 rows, 44999 columns, 193299 nonzeros
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 99% 0:00:01
(micom)[zrwerbin@scc-ul2 ~]$ python -c "import micom; micom.show_versions()"
Package Information
-------------------
micom 0.33.2
Dependency Information
----------------------
cobra 0.29.0
highspy missing
jinja2 3.1.3
osqp 0.6.3
scikit-learn 1.4.1.post1
scipy 1.12.0
symengine 0.11.0
Build Tools Information
-----------------------
pip 24.0
setuptools 69.1.1
wheel 0.42.0
Platform Information
--------------------
Linux 4.18.0-513.9.1.el8_9.x86_64-x86_64
CPython 3.12.2
Other session info:
(micom)[zrwerbin@scc-ul2 ~]$ conda list
# packages in environment at /projectnb/talbot-lab-data/zrwerbin/.conda/envs/micom:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
annotated-types 0.6.0 pyhd8ed1ab_0 conda-forge
anyio 4.3.0 pyhd8ed1ab_0 conda-forge
appdirs 1.4.4 pyh9f0ad1d_0 conda-forge
argon2-cffi 23.1.0 pyhd8ed1ab_0 conda-forge
argon2-cffi-bindings 21.2.0 py312h98912ed_4 conda-forge
arrow 1.3.0 pyhd8ed1ab_0 conda-forge
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
async-lru 2.0.4 pyhd8ed1ab_0 conda-forge
attrs 23.2.0 pyh71513ae_0 conda-forge
babel 2.14.0 pyhd8ed1ab_0 conda-forge
beautifulsoup4 4.12.3 pyha770c72_0 conda-forge
bleach 6.1.0 pyhd8ed1ab_0 conda-forge
brotli-python 1.1.0 py312h30efb56_1 conda-forge
bzip2 1.0.8 hd590300_5 conda-forge
ca-certificates 2024.2.2 hbcca054_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
certifi 2024.2.2 pyhd8ed1ab_0 conda-forge
cffi 1.16.0 py312hf06ca03_0 conda-forge
charset-normalizer 3.3.2 pyhd8ed1ab_0 conda-forge
cobra 0.29.0 pyhd8ed1ab_0 conda-forge
comm 0.2.1 pyhd8ed1ab_0 conda-forge
debugpy 1.8.1 py312h30efb56_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
depinfo 2.2.0 pyhd8ed1ab_0 conda-forge
diskcache 5.6.3 pyhd8ed1ab_0 conda-forge
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
exceptiongroup 1.2.0 pyhd8ed1ab_2 conda-forge
executing 2.0.1 pyhd8ed1ab_0 conda-forge
fqdn 1.5.1 pyhd8ed1ab_0 conda-forge
future 1.0.0 pyhd8ed1ab_0 conda-forge
gf2x 1.3.0 ha476b99_2 conda-forge
glpk 5.0 h445213a_0 conda-forge
gmp 6.3.0 h59595ed_0 conda-forge
gurobi 11.0.1 py312_0 gurobi
h11 0.14.0 pyhd8ed1ab_0 conda-forge
h2 4.1.0 pyhd8ed1ab_0 conda-forge
hpack 4.0.0 pyh9f0ad1d_0 conda-forge
httpcore 1.0.4 pyhd8ed1ab_0 conda-forge
httpx 0.27.0 pyhd8ed1ab_0 conda-forge
hyperframe 6.0.1 pyhd8ed1ab_0 conda-forge
idna 3.6 pyhd8ed1ab_0 conda-forge
importlib-metadata 7.0.1 pyha770c72_0 conda-forge
importlib_metadata 7.0.1 hd8ed1ab_0 conda-forge
importlib_resources 6.1.2 pyhd8ed1ab_0 conda-forge
ipykernel 6.29.3 pyhd33586a_0 conda-forge
ipython 8.22.2 pyh707e725_0 conda-forge
ipywidgets 8.1.2 pyhd8ed1ab_0 conda-forge
isoduration 20.11.0 pyhd8ed1ab_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jinja2 3.1.3 pyhd8ed1ab_0 conda-forge
joblib 1.3.2 pyhd8ed1ab_0 conda-forge
json5 0.9.21 pyhd8ed1ab_0 conda-forge
jsonpointer 2.4 py312h7900ff3_3 conda-forge
jsonschema 4.21.1 pyhd8ed1ab_0 conda-forge
jsonschema-specifications 2023.12.1 pyhd8ed1ab_0 conda-forge
jsonschema-with-format-nongpl 4.21.1 pyhd8ed1ab_0 conda-forge
jupyter 1.0.0 pyhd8ed1ab_10 conda-forge
jupyter-lsp 2.2.4 pyhd8ed1ab_0 conda-forge
jupyter_client 8.6.0 pyhd8ed1ab_0 conda-forge
jupyter_console 6.6.3 pyhd8ed1ab_0 conda-forge
jupyter_core 5.7.1 py312h7900ff3_0 conda-forge
jupyter_events 0.9.0 pyhd8ed1ab_0 conda-forge
jupyter_server 2.13.0 pyhd8ed1ab_0 conda-forge
jupyter_server_terminals 0.5.2 pyhd8ed1ab_0 conda-forge
jupyterlab 4.1.3 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.3.0 pyhd8ed1ab_1 conda-forge
jupyterlab_server 2.25.3 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 3.0.10 pyhd8ed1ab_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libblas 3.9.0 21_linux64_openblas conda-forge
libcblas 3.9.0 21_linux64_openblas conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libflint 2.9.0 h2f819a4_ntl_100 conda-forge
libgcc-ng 13.2.0 h807b86a_5 conda-forge
libgfortran-ng 13.2.0 h69a702a_5 conda-forge
libgfortran5 13.2.0 ha4646dd_5 conda-forge
libgomp 13.2.0 h807b86a_5 conda-forge
liblapack 3.9.0 21_linux64_openblas conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libopenblas 0.3.26 pthreads_h413a1c8_0 conda-forge
libosqp 0.6.3 h6a678d5_0
libqdldl 0.1.7 hcb278e6_0 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libsqlite 3.45.1 h2797004_0 conda-forge
libstdcxx-ng 13.2.0 h7e041cc_5 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libxcrypt 4.4.36 hd590300_1 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
markdown-it-py 3.0.0 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.5 py312h98912ed_0 conda-forge
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
mdurl 0.1.2 pyhd8ed1ab_0 conda-forge
micom 0.33.2 pyhdfd78af_0 bioconda
mistune 3.0.2 pyhd8ed1ab_0 conda-forge
mpc 1.3.1 hfe3b2da_0 conda-forge
mpfr 4.2.1 h9458935_0 conda-forge
mpmath 1.3.0 pyhd8ed1ab_0 conda-forge
nbclient 0.8.0 pyhd8ed1ab_0 conda-forge
nbconvert 7.16.2 pyhd8ed1ab_0 conda-forge
nbconvert-core 7.16.2 pyhd8ed1ab_0 conda-forge
nbconvert-pandoc 7.16.2 pyhd8ed1ab_0 conda-forge
nbformat 5.9.2 pyhd8ed1ab_0 conda-forge
ncurses 6.4 h59595ed_2 conda-forge
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
notebook 7.1.1 pyhd8ed1ab_0 conda-forge
notebook-shim 0.2.4 pyhd8ed1ab_0 conda-forge
ntl 11.4.3 hef3c4d3_1 conda-forge
numpy 1.26.4 py312heda63a1_0 conda-forge
openssl 3.2.1 hd590300_0 conda-forge
optlang 1.8.1 pyhd8ed1ab_0 conda-forge
osqp 0.6.3 py312hfb8ada1_2 conda-forge
overrides 7.7.0 pyhd8ed1ab_0 conda-forge
packaging 23.2 pyhd8ed1ab_0 conda-forge
pandas 2.2.1 py312hfb8ada1_0 conda-forge
pandoc 3.1.12.2 ha770c72_0 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pip 24.0 pyhd8ed1ab_0 conda-forge
pkgutil-resolve-name 1.3.10 pyhd8ed1ab_1 conda-forge
platformdirs 4.2.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.20.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.42 pyha770c72_0 conda-forge
prompt_toolkit 3.0.42 hd8ed1ab_0 conda-forge
psutil 5.9.8 py312h98912ed_0 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 2.6.3 pyhd8ed1ab_0 conda-forge
pydantic-core 2.16.3 py312h4b3b743_0 conda-forge
pygments 2.17.2 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
python 3.12.2 hab00c5b_0_cpython conda-forge
python-dateutil 2.9.0 pyhd8ed1ab_0 conda-forge
python-fastjsonschema 2.19.1 pyhd8ed1ab_0 conda-forge
python-json-logger 2.0.7 pyhd8ed1ab_0 conda-forge
python-libsbml 5.20.2 py312h30efb56_1 conda-forge
python-symengine 0.11.0 py312h83f29e1_1 conda-forge
python-tzdata 2024.1 pyhd8ed1ab_0 conda-forge
python_abi 3.12 4_cp312 conda-forge
pytz 2024.1 pyhd8ed1ab_0 conda-forge
pyyaml 6.0.1 py312h98912ed_1 conda-forge
pyzmq 25.1.2 py312h886d080_0 conda-forge
qdldl-python 0.1.7.post0 py312hfb8ada1_1 conda-forge
qtconsole-base 5.5.1 pyha770c72_0 conda-forge
qtpy 2.4.1 pyhd8ed1ab_0 conda-forge
readline 8.2 h8228510_1 conda-forge
referencing 0.33.0 pyhd8ed1ab_0 conda-forge
requests 2.31.0 pyhd8ed1ab_0 conda-forge
rfc3339-validator 0.1.4 pyhd8ed1ab_0 conda-forge
rfc3986-validator 0.1.1 pyh9f0ad1d_0 conda-forge
rich 13.7.1 pyhd8ed1ab_0 conda-forge
rpds-py 0.18.0 py312h4b3b743_0 conda-forge
ruamel.yaml 0.18.6 py312h98912ed_0 conda-forge
ruamel.yaml.clib 0.2.8 py312h98912ed_0 conda-forge
scikit-learn 1.4.1.post1 py312h394d371_0 conda-forge
scipy 1.12.0 py312heda63a1_2 conda-forge
send2trash 1.8.2 pyh41d4057_0 conda-forge
setuptools 69.1.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sniffio 1.3.1 pyhd8ed1ab_0 conda-forge
soupsieve 2.5 pyhd8ed1ab_1 conda-forge
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
swiglpk 5.0.10 py312h98912ed_0 conda-forge
symengine 0.11.2 hb29318e_0 conda-forge
sympy 1.12 pyh04b8f61_3 conda-forge
terminado 0.18.0 pyh0d859eb_0 conda-forge
threadpoolctl 3.3.0 pyhc1e730c_0 conda-forge
tinycss2 1.2.1 pyhd8ed1ab_0 conda-forge
tk 8.6.13 noxft_h4845f30_101 conda-forge
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
tornado 6.4 py312h98912ed_0 conda-forge
traitlets 5.14.1 pyhd8ed1ab_0 conda-forge
types-python-dateutil 2.8.19.20240106 pyhd8ed1ab_0 conda-forge
typing-extensions 4.10.0 hd8ed1ab_0 conda-forge
typing_extensions 4.10.0 pyha770c72_0 conda-forge
typing_utils 0.1.0 pyhd8ed1ab_0 conda-forge
tzdata 2024a h0c530f3_0 conda-forge
uri-template 1.3.0 pyhd8ed1ab_0 conda-forge
urllib3 2.2.1 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
webcolors 1.13 pyhd8ed1ab_0 conda-forge
webencodings 0.5.1 pyhd8ed1ab_2 conda-forge
websocket-client 1.7.0 pyhd8ed1ab_0 conda-forge
wheel 0.42.0 pyhd8ed1ab_0 conda-forge
widgetsnbextension 4.0.10 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
zeromq 4.3.5 h59595ed_1 conda-forge
zipp 3.17.0 pyhd8ed1ab_0 conda-forge
The beginning of the tax file looks like this, with 1410 rows total:
>>> tax.head()
genus id file species abundance sample_id domain kingdom phylum class order family
0 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.011088 TOOL_002-O-20210804-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
1 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.011444 TOOL_001-M-20170721-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
2 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.019511 ORNL_029-O-20170621-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
3 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.012908 HARV_001-M-20130709-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
4 Azotobacter iAA1300 /projectnb2/talbot-lab-data/metabolic_models/c... vinelandii 0.011425 ORNL_002-O-20170619-COMP Bacteria Bacteria Pseudomonadota Gammaproteobacteria Pseudomonadales Pseudomonadaceae
Custom models and some models from AGORA have active demand reactions, for instance for biotin. This can make MICOM fail due to infeasible models in low-nutrient settings.
See micom-dev/media#2 .
Adjust those demands so that the zero flux solution is included. Raise a warning or info in the logger if that is the case.
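A minimal sketch of the adjustment, assuming DM_-prefixed demand reactions (the COBRA convention) and a loaded model object named model:
from micom.logger import logger

for rxn in model.reactions:
    if rxn.id.startswith("DM_") and rxn.lower_bound > 0:
        # a positive lower bound forces flux; relax it so zero flux is feasible
        logger.info("Relaxing forced demand %s (lb=%g)." % (rxn.id, rxn.lower_bound))
        rxn.lower_bound = 0.0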
Hi, I am trying to build a community from two models and I've already changed the metabolite IDs for each model. I renamed the metabolite IDs as KEGG ID + compartment name, such as C12145cytoplasm ('C12145' is the KEGG ID and 'cytoplasm' is the name of the compartment). Firstly, one thing I'd like to double-check: do I need to change the compartment IDs as well, even though I didn't use the compartment IDs for matching models?
Secondly, when I tried to join the two renamed models an error occurred:
File "/Users/wintermute/opt/anaconda3/lib/python3.7/urllib/parse.py", line 107, in <genexpr>
return tuple(x.decode(encoding, errors) if x else '' for x in args)
AttributeError: 'Model' object has no attribute 'decode'
Here is my code:
sc = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/yeast7.6_changeID_final.xml')
kp = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/KP_changeID_final3.xml')
community = [sc,kp]
micom.util.join_models(community)
Does anyone have any idea about that? Thanks.
There are repeated problems with processes not returning, excessive memory usage, and Jupyter weirdness.
We should test switching to spawn for all workflows. Hopefully, the overhead is not too bad.
Forkserver did not perform much better. Deactivating parallel computing would not be popular.
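A minimal sketch of forcing the spawn start method with the standard library; the worker function is a placeholder, not micom's actual workflow code:
import multiprocessing as mp

def _work(arg):
    # placeholder for a single workflow step
    return arg * arg

if __name__ == "__main__":
    # "spawn" starts fresh interpreters instead of forking, avoiding
    # inherited solver state and most Jupyter weirdness
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=4) as pool:
        print(pool.map(_work, range(8)))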
Originally posted by anubhavdas0907 March 8, 2022
Hello Christian,
I was trying to copy a community model to a different variable, but I get an error.
I want to manipulate a model, without changing the original one.
Following are the details.
from micom import load_pickle
model = load_pickle("ERR1883210.pickle")
model_1 = model.copy()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/Conda_base/envs/MyConda/lib/python3.7/site-packages/cobra/core/model.py", line 321, in copy
new = self.__class__()
TypeError: __init__() missing 1 required positional argument: 'taxonomy'
Can you please suggest what's wrong in this code, and what the solution could be?
Regards
Anubhav
I'm new to MICOM and COBRA, and I'd like to use MICOM to predict the metabolites similar to MAMBO and the Garza et al. 2018 paper's approach which, as I understand it, is to maximize the correlation of microbial growth with the known relative abundances. Is this possible with MICOM?
I'd guess this would involve changing the community objective function somehow, or are alternate objective functions already supported?
Thanks!
I'm trying to run com.cooperative_tradeoff on a community using some sbml files but cannot because I'm using the default GLPK and don't know how to use IBM Cplex instead. I downloaded it as an academic edition but do not know how to set it as the solver for micom. Thanks in advance!
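For reference, a minimal sketch of switching the solver, assuming the cplex Python bindings are installed in the same environment; the pickle filename is hypothetical:
import micom

com = micom.load_pickle("community.pickle")  # hypothetical file
com.solver = "cplex"  # the interface can be set by name
sol = com.cooperative_tradeoff(fraction=0.5)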
When setting the upper bound for the import fluxes from the gut lumen, do I need to adjust for organism abundance? I noticed that if I look at community.medium, I get
In [12]: community.medium
Out[12]:
{'EX_4abut__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
'EX_7dhcdchol__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
'EX_7ocholate__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
'EX_C02528__91__e__93____Bacteroides_fragilis_YCH46_m': 1000.0,
...
where the flux for a metabolite includes the organism name already. The community constraints from the MICOM preprint suggest that I would not (instead setting all keys with "7ocholate" to have the same upper bound), but since these dictionary elements already include the organism name, I'm not sure.
Is there a way to see a pretty-print-ish format for the constraints sent off to the solver (Gurobi, in my case)?
Thanks!
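Not authoritative, but one option is dumping the problem in LP format, which is reasonably readable; com.solver is the optlang model and, with the Gurobi interface, com.solver.problem should be the native Gurobi object:
# serialize the optlang model to LP format
print(com.solver.to_lp())

# or let the native Gurobi model write itself
com.solver.problem.write("community.lp")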
When using models from different sources metabolite IDs may not be matched. Mention this pitfall in the docs and implement a warning in case models have no (or little) shared metabolites.
Hi! I really like MICOM! But I am having a problem when trying to perform the Taxa Knockout Analysis.
I want to calculate the log fold change of the growth rates when performing the Taxa Knockouts. Currently, one can obtain one of three possible outputs when using com.knockout_taxa():
method='raw' -> new: the new growth rates for the taxa in the community without the knocked-out taxon.
method='change' -> new - old: the absolute change in the growth rates for the new community without the knocked-out taxon, with respect to the original community.
method='relative change' -> (new - old)/old: the relative change in the growth rates for the new community without the knocked-out taxon, with respect to the original community.
What I actually want for the log fold change is something like log(new) - log(old) (or log(new/old), which is the same). So I attempted two things to achieve this:
1. I ran com.knockout_taxa() with method equal to 'raw' and 'change', and then calculated old (the growth rates for the original community) as old = new - change (now that I know both new and old, I can easily calculate the log fold change using either of the expressions above). The problem is that, when I checked the values of old for all my samples, I consistently found some troubling values for some of the knocked-out taxa (always in one of the first three taxa in my matrix).
com = load_pickle('mouse_01.pickle')
diet = load_qiime_medium("vmh_high_fiber.qza")
com.medium = diet.flux
ko_raw = com.knockout_taxa(fraction=0.6, method='raw', diag=False)
ko_change = com.knockout_taxa(fraction=0.6, method='change', diag=False)
old = ko_raw - ko_change
print(old)
B_caccae B_cellulosilyticus_WH2 B_ovatus \
B_caccae NaN 0.045237 -0.023992
B_cellulosilyticus_WH2 0.000556 NaN 0.000966
B_ovatus 0.000556 0.037356 NaN
B_thetaiotaomicron 0.000556 0.037356 0.000986
B_uniformis 0.000556 0.037356 0.000986
B_vulgatus 0.000556 0.037356 0.000986
C_aerofaciens 0.000556 0.037356 0.000986
C_scindens 0.000556 0.037356 0.000986
C_spiroforme 0.000556 0.037356 0.000986
P_distasonis 0.000556 0.037356 0.000986
R_obeum 0.000556 0.037356 0.000986
B_thetaiotaomicron B_uniformis B_vulgatus \
B_caccae 0.066630 -0.002823 0.106982
B_cellulosilyticus_WH2 0.067358 0.000809 0.106279
B_ovatus 0.067358 0.000795 0.106279
B_thetaiotaomicron NaN 0.000795 0.106279
B_uniformis 0.067358 NaN 0.106279
B_vulgatus 0.067358 0.000795 NaN
C_aerofaciens 0.067358 0.000795 0.106279
C_scindens 0.067358 0.000795 0.106279
C_spiroforme 0.067358 0.000795 0.106279
P_distasonis 0.067358 0.000795 0.106279
R_obeum 0.067358 0.000795 0.106279
...
C_scindens 0.001470
C_spiroforme 0.001470
P_distasonis 0.001470
R_obeum NaN
As you can see, old should be a matrix where the values for all the rows in a column are the same; this is true for most of the taxa, but it fails for the first 3 taxa. This happens consistently across my 15 samples and even when using different media.
2. I modified com.knockout_taxa() so that it returns not only the new growth rates (when using method='raw') but also the old values. And, something that is also strange to me, the values I am obtaining are quite different from the previous calculation (1.):
def knockout_taxa(community, taxa, fraction, method, progress, diag=True):
"""Knockout a taxon from the community."""
with community as com:
check_modification(com)
min_growth = _format_min_growth(0.0, com.taxa)
_apply_min_growth(com, min_growth)
com.objective = com.scale * com.variables.community_objective
community_min_growth = (
optimize_with_retry(com, "could not get community growth rate.") / com.scale
)
regularize_l2_norm(com, fraction * community_min_growth)
old = com.optimize().members["growth_rate"]
results = []
iter = track(taxa, description="Knockouts") if progress else taxa
for sp in iter:
with com:
logger.info("getting growth rates for %s knockout." % sp)
[
r.knock_out()
for r in com.reactions.query(lambda ri: ri.community_id == sp)
]
sol = optimize_with_fraction(com, fraction)
new = sol.members["growth_rate"]
if "change" in method:
new = new - old
if "relative" in method:
new /= old
results.append(new)
old = old.drop("medium", axis=1)
ko = pd.DataFrame(results, index=taxa).drop("medium", axis=1)
ko = ko.loc[ko.index.sort_values(), ko.columns.sort_values()]
if not diag:
np.fill_diagonal(ko.values, np.NaN)
return ko, old
com = load_pickle('mouse_01.pickle')
diet = load_qiime_medium("vmh_high_fiber.qza")
com.medium = diet.flux
new, old = com.knockout_taxa(fraction=0.6, method='raw', diag=False)
print(old)
growth_rate
B_caccae 0.0135695707228535
B_cellulosilyticus_WH2 0.0272505030167579
B_ovatus 0.0114618034691539
B_thetaiotaomicron 0.119902654224808
B_uniformis 0.0132031895100882
B_vulgatus 0.105075269656302
C_aerofaciens 0.00680013487305695
C_scindens 0.00386220655742489
C_spiroforme 0.0250118398731469
P_distasonis 0.058633920577937
R_obeum 0.0226188024707752
Package Information
-------------------
micom 0.33.2
Dependency Information
----------------------
cobra 0.29.0
highspy missing
jinja2 3.1.3
osqp 0.6.5
scikit-learn 1.2.2
scipy 1.11.4
symengine 0.11.0
Build Tools Information
-----------------------
pip 23.3
setuptools 68.0.0
wheel 0.41.2
Platform Information
--------------------
Linux 6.5.0-26-generic-x86_64
CPython 3.10.13
Hi
I have been following the published documentation on the MICOM webpage to create a proper workflow and model manifest for performing analysis and obtaining results. However, I am getting stuck with no rendered output. Below is the script I am using to perform constraint-based modeling with one sample at first (test case). If successful, I plan to extend this to three samples for a microbial community of twelve species, each represented by a genome-scale model and its respective relative abundance. I have provided placeholders for the taxonomy for now. I also included a medium file. However, I do not obtain any results as MICOM simply stalls.
import os
import pandas as pd
from micom.workflows import grow
from micom.workflows.build import build
from micom.media import minimal_medium
from micom.util import join_models, load_pickle, _read_model
from micom import Community
from cobra.io import read_sbml_model, write_sbml_model
# Define the base directory for model files and output
base_dir = "~/Gap-filled_MM2014_GEMs"
# Define the model paths relative to the base directory
model_paths = [
"Dehalobacter_restrictus_new_3.xml",
"Dehalobacter_sp004343605_new_3.xml",
"Acetobacterium_sp003260995.xml",
"Cloacimonadaceae_g__Cloacimonas.xml",
"g__Methanothrix_s.xml",
"Methanobacterium_C_congolense.xml",
"o__Bacteroidales_f__TTA-H9_g__TTA-H9_s__TTA-H9_sp002418865.xml",
"o__Bacteroidales_f__VadinHA17_g__LD21_s__.xml",
"o__Bacteroidales_f__VadinHA17_g__SR-FBR_E99_s__.xml",
"Rectinema_sp002441395.xml",
"Rectinema_sp015657395.xml",
"Rectinema_subterraneum.xml"
]
# Define the abundances
abundances = {
"Dehalobacter_restrictus_new_3": 0.3673,
"Dehalobacter_sp004343605_new_3": 0.0034,
"Acetobacterium_sp003260995": 0.0266,
"Cloacimonadaceae_g__Cloacimonas": 0.0258,
"g__Methanothrix_s": 0.0244,
"Methanobacterium_C_congolense": 0.0389,
"o__Bacteroidales_f__TTA_H9_g__TTA_H9_s__TTA_H9_sp002418865": 0.0227,
"o__Bacteroidales_f__VadinHA17_g__LD21_s__": 0.0021,
"o__Bacteroidales_f__VadinHA17_g__SR-FBR_E99_s__": 0.0008,
"Rectinema_sp002441395": 0.0063,
"Rectinema_sp015657395": 0.0744,
"Rectinema_subterraneum": 0.081
}
# Create placeholder taxonomic data
taxonomic_data = {
"kingdom": ["Bacteria"] * len(model_paths),
"phylum": ["Placeholder_Phylum"] * len(model_paths),
"class": ["Placeholder_Class"] * len(model_paths),
"order": ["Placeholder_Order"] * len(model_paths),
"family": ["Placeholder_Family"] * len(model_paths),
"genus": ["Placeholder_Genus"] * len(model_paths),
"species": ["Placeholder_Species"] * len(model_paths),
}
# Create the full paths for the models
full_model_paths = [os.path.join(base_dir, path) for path in model_paths]
# Create a DataFrame with model IDs, file paths, abundances, and taxonomic information
taxonomy_df = pd.DataFrame({
"id": [path.split('/')[-1].replace('.xml', '') for path in model_paths],
"file": full_model_paths,
"abundance": [abundances[path.split('/')[-1].replace('.xml', '')] for path in model_paths],
"sample_id": [f"sample_{i}" for i in range(len(model_paths))],
**taxonomic_data
})
# Ensure the output directory exists
output_directory = os.path.join(base_dir, "MICOM")
os.makedirs(output_directory, exist_ok=True)
# Debugging: Print the current working directory and full paths to verify correctness
print("Current working directory:", os.getcwd())
print("Full model paths:", full_model_paths)
print("Output directory:", output_directory)
# Check if all model files exist
for path in full_model_paths:
if not os.path.isfile(path):
print(f"File not found: {path}")
else:
print(f"File exists: {path}")
# Build the community models using the build function
manifest = build(taxonomy_df, model_db=None, out_folder=output_directory, cutoff=0.0001, threads=1, solver=None)
# Read the medium file into a pandas DataFrame
medium_file_path = os.path.join(output_directory, "edlab_mineral_medium_BiGG_IDs.csv")
medium_df = pd.read_csv(medium_file_path)
# Ensure the medium DataFrame has the correct columns
if 'reaction' not in medium_df.columns or 'flux' not in medium_df.columns:
raise ValueError("Medium file must contain 'reaction' and 'flux' columns")
# Add sample_id column to medium DataFrame
medium_df['sample_id'] = taxonomy_df['sample_id'][0]
# Use the grow function with the medium DataFrame
res = grow(manifest, model_folder=output_directory, medium=medium_df, tradeoff=0.5, threads=1)
# Print results
print(res)
The grow function should complete and return results, but it currently stalls with no rendered output.
Additional Context
Operating System: macOS
Python Version: 3.9
MICOM Version: 0.35.0
Any guidance or suggestions to resolve this issue would be greatly appreciated.
Thank you!
Hello, I want to know if the community model generated by MICOM automatically takes into account the coupling of microbial reactions with their biomass reactions.
I want to enforce that flux through a microbial reaction only occurs when its biomass reaction has a non-zero value, by including coupling constraints.
But I don't see how I can loop over all microbial reactions to couple each of them with the biomass reactions.
Do you have any idea please?
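Not an official micom API, but a minimal sketch of adding coupling constraints of the form v_r <= c * v_biomass for one taxon via the optlang interface that cobra exposes; the coupling factor, taxon id, and biomass reaction id pattern are assumptions:
c = 400.0  # coupling factor, an assumption
taxon = "Escherichia_coli_1"  # hypothetical taxon id
biomass = com.reactions.get_by_id("biomass__" + taxon)  # id pattern assumed

constraints = []
for rxn in com.reactions.query(lambda r: r.community_id == taxon):
    if rxn.id == biomass.id:
        continue
    # forward coupling: v_r - c * v_biomass <= 0; reversible reactions
    # would need a symmetric constraint with a lower bound as well
    constraints.append(
        com.problem.Constraint(
            rxn.flux_expression - c * biomass.flux_expression,
            ub=0.0,
            name="coupling_" + rxn.id,
        )
    )
com.add_cons_vars(constraints)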
I'm wondering what the best approach would be to save output from the "grow" workflow in case it does not complete for a large dataset. I'm splitting my manifest file by rows into smaller chunks, which produces many results files, and still sometimes stalls at 98-99% complete, which might just be due to one problematic sample (right?)
Would you recommend modifying the internal "_growth" function to save results right before the "return" call? Ideally, the growth results could then be combined to still be used as inputs for visualizations.
Thanks!
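One hedged possibility that avoids modifying internals: run grow over manifest chunks and pickle each result as soon as it returns, so a stalled chunk only costs that chunk (paths and the chunk count are illustrative):
import pickle
import numpy as np
from micom.workflows import grow

for i, chunk in enumerate(np.array_split(manifest, 20)):
    res = grow(chunk, model_folder="models/", medium=medium, tradeoff=0.5, threads=4)
    # persist immediately so completed chunks survive a later stall
    with open("growth_chunk_%02d.pickle" % i, "wb") as fh:
        pickle.dump(res, fh)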
Hello,
I have installed the latest version of MICOM (through pip) and have been unsuccessful in importing it. Interestingly, after installing MICOM, importing Cobra by itself also yields the same error. Any help with this?
import micom
OSQP and HIGHS are available but could not load the hybrid interface.
Traceback (most recent call last):
File "/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/optlang/__init__.py", line 61, in <module>
from optlang import hybrid_interface
File "/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/optlang/hybrid_interface.py", line 26, in <module>
import optlang.matrix_interface as mi
File "/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/optlang/matrix_interface.py", line 40, in <module>
from numpy import Infinity, array, concatenate, zeros
ImportError: cannot import name 'Infinity' from 'numpy' (/Users/shahad/PycharmProjects/pythonProject/.venv/lib/python3.12/site-packages/numpy/__init__.py). Did you mean: 'isfinite'?
Process finished with exit code 0
Python 3.12.3
micom 0.35.0
cobra 0.29
The Genome Taxonomy Database (GTDB) is comprehensive (especially the new v202 release) and more robust than the NCBI microbial taxonomy, especially given that the GTDB taxonomy is based entirely on genome phylogenetic relatedness.
Although the MICOM docs are vague about the taxonomy that one must use, it appears that the NCBI taxonomy is required.
Provide direct support for the GTDB taxonomy.
I created a simple community model with two species, set the solver to gurobi, and saved the community model using to_pickle(). When I load the community model, gurobi logs several messages which makes loading the community model slow. Is this just the way gurobi works or are there some attributes or settings I should be using to make loading a model with gurobi as the solver efficient?
Here's how I created the community model:
>>> import micom
>>> import pandas as pd
>>> tax = pd.DataFrame({'id': ['a', 'b'], 'file': ['Clostridium_hathewayi_DSM_13479.xml', 'Eubacterium_siraeum_DSM_15702.xml']})
>>> com = micom.Community(tax, id='test', progress=False)
>>> com.solver = 'gurobi'
>>> com.optimize()
<CommunitySolution 31.217 at 0x7f713cad3198>
>>> com.to_pickle('gurobitest.pickle')
Then, in a new Python session, I loaded the community model with these commands:
>>> import micom
>>> com = micom.load_pickle('gurobitest.pickle')
Academic license - for non-commercial use only
Changed value of parameter Method to 0
Prev: -1 Min: -1 Max: 5 Default: -1
Changed value of parameter Presolve to 0
Prev: -1 Min: -1 Max: 2 Default: -1
>>> com.solver
<optlang.gurobi_interface.Model object at 0x7fb02f8c2470>
>>> com.optimize()
<CommunitySolution 31.217 at 0x7fb036450a20>
Hi!
I have a question about the "status" flag that is returned when cooperative_tradeoff is run. In particular, sometimes I get "optimal" as a result, but most of the time I get "numeric". What does it mean?
Sincerely,
Arianna Basile
Good morning.
I have recently used your program to analyze metabolic exchanges in a community. I have only one question: in the output I have two columns for each metabolite, one called, for example, EX_but_e and the other EX_but_m. EX_but_m appears only in one row, which is the medium one. I assumed that by summing EX_but_e over all the microbes in my community I would get the value in the medium row in the EX_but_m column. But that is not the case. Why?
Sincerely,
Arianna Basile
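If it helps to check, here is a minimal sketch of the comparison under the assumption that micom reports taxon fluxes per unit taxon biomass, so the medium flux corresponds to an abundance-weighted sum rather than a plain sum (attribute names and sign conventions may differ between versions):
# per-taxon butyrate exchange, excluding the medium row
ex = sol.fluxes["EX_but_e"].drop("medium")

# abundance-weighted sum vs. the reported medium exchange
weighted = (ex * com.abundances).sum()  # .abundances is assumed here
print(weighted, sol.fluxes.loc["medium", "EX_but_m"])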
The "Plotting consumed metabolites" section of the documentation describes the function plot_exchanges_per_sample
with default parameter direction='import'
as plotting the metabolites consumed by the microbial community.
In the source code for plot_exchanges_per_sample
, the subset with taxon='medium
' is selected. But isn't the "import" for the 'medium'
actually the export for the microbial community? This seems at odds with the description of plot_exchanges_per_sample(direction='import')
as plotting the consumption patterns of the microbes.
Perhaps I am misinterpreting the direction
of the fluxes for the 'medium
' in the output of the grow
function?
Originally posted by anubhavdas0907 April 14, 2022
Hello Christian,
I just wanted to point out that in the documentation for micom.workflows.grow.grow, the default strategy in the code is 'minimal imports', while in the explanation, pFBA is mentioned as default. Kindly edit accordingly.
micom.workflows.grow.grow(manifest, model_folder, medium, tradeoff, threads=1, weights=None, strategy='minimal imports', atol=None, rtol=None, presolve=False)
strategy : str
Computational strategy used to reduce the flux space. Default "pFBA" uses
parsimonious FBA, ...
Regards
Anubhav
Even with fluxes=True passed into community.optimize_all(), only maximal growth rates are returned. In fact, community.optimize_single() does not accept the fluxes argument.
Using version '0.22.6'.
While importing micom version 0.9.2 I got the following ImportError. Any suggestions to solve it?
Sincerely,
Arianna Basile
ImportError Traceback (most recent call last)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/swiglpk.py in swig_import_helper()
13 try:
---> 14 return importlib.import_module(mname)
15 except ImportError:
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/__init__.py in import_module(name, package)
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)
127
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _gcd_import(name, package, level)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _find_and_load(name, import_)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _load_unlocked(spec)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in module_from_spec(spec)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap_external.py in create_module(self, spec)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)
ImportError: libglpk.so.40: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
ModuleNotFoundError Traceback (most recent call last)
in <module>
----> 1 import micom
2 import cplex
3 import cobra
4 import pandas
5 import gurobi
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/__init__.py in <module>
1 """Simple init file for mico."""
2
----> 3 from micom.community import Community, load_pickle
4 from micom import (algorithms, problems, util, data, duality, media,
5 solution, workflows)
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/community.py in
12 from micom.logger import logger
13 from micom.media import default_excludes
---> 14 from micom.optcom import optcom, solve
15 from micom.problems import cooperative_tradeoff, knockout_species
16
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/optcom.py in
5 check_modification)
6 from micom.logger import logger
----> 7 from micom.solution import solve
8 from optlang.symbolics import Zero
9 from functools import partial
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/micom/solution.py in
14 from micom.logger import logger
15 from micom.util import reset_min_community_growth
---> 16 from swiglpk import glp_adv_basis
17
18
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/__init__.py in <module>
----> 1 from .swiglpk import *
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/swiglpk.py in
15 except ImportError:
16 return importlib.import_module('_swiglpk')
---> 17 _swiglpk = swig_import_helper()
18 del swig_import_helper
19 elif _swig_python_version_info >= (2, 6, 0):
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/site-packages/swiglpk/swiglpk.py in swig_import_helper()
14 return importlib.import_module(mname)
15 except ImportError:
---> 16 return importlib.import_module('_swiglpk')
17 _swiglpk = swig_import_helper()
18 del swig_import_helper
/mnt/data_SSD/anaconda3/envs/flux_balance/lib/python3.6/importlib/__init__.py in import_module(name, package)
124 break
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)
127
128
ModuleNotFoundError: No module named '_swiglpk'
Not really a problem, but I noticed that the workflows.build_database() function does not really compress the output. This is the current state of the code in workflows.build_database():
with ZipFile(out_path, "w") as zf:
[zf.write(a[2], os.path.basename(a[2])) for a in args]
zf.write(os.path.join(tdir, "manifest.csv"), "manifest.csv")
But if we look at the ZipFile() documentation, we can see that by default the compression level is 0 (no compression).
Init signature:
zipfile.ZipFile(
file,
mode='r',
compression=0,
allowZip64=True,
compresslevel=None,
)
Docstring:
Class with methods to open, read, write, close, list zip files.
z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=True,
compresslevel=None)
file: Either the path to the file, or a file-like object.
If it is a path, the file will be opened and closed by ZipFile.
mode: The mode can be either read 'r', write 'w', exclusive create 'x',
or append 'a'.
compression: ZIP_STORED (no compression), ZIP_DEFLATED (requires zlib),
ZIP_BZIP2 (requires bz2) or ZIP_LZMA (requires lzma).
allowZip64: if True ZipFile will create files with ZIP64 extensions when
needed, otherwise it will raise an exception when this would
be necessary.
compresslevel: None (default for the given compression type) or an integer
specifying the level to pass to the compressor.
When using ZIP_STORED or ZIP_LZMA this keyword has no effect.
When using ZIP_DEFLATED integers 0 through 9 are accepted.
When using ZIP_BZIP2 integers 1 through 9 are accepted.
Init docstring:
Open the ZIP file with mode read 'r', write 'w', exclusive create 'x',
or append 'a'.
I think this cancels out the advantage of using a .zip file.
It would be nice if we had the option to choose a compression level when building the model database. This should not interfere with the reading in other parts of the code since the same module (zipfile) is used.
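For illustration, the same snippet with DEFLATE compression enabled (compresslevel 6 as an arbitrary middle ground; out_path, args, and tdir as in the snippet above):
import os
from zipfile import ZipFile, ZIP_DEFLATED

with ZipFile(out_path, "w", compression=ZIP_DEFLATED, compresslevel=6) as zf:
    for a in args:
        zf.write(a[2], os.path.basename(a[2]))
    zf.write(os.path.join(tdir, "manifest.csv"), "manifest.csv")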
This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment 😄
The current format for fluxes returned by MICOM is a table in wide format:
In [1]: from micom import Community
In [2]: from micom.data import test_taxonomy
In [3]: com = Community(test_taxonomy())
Building ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
In [7]: sol = com.cooperative_tradeoff(fluxes=True)
In [8]: sol.fluxes
Out[8]:
reaction ACALD ACALDt ACKr ACONTa ACONTb ACt2r ADK1 ... SUCDi SUCOAS TALA THD2 TKT1 TKT2 TPI
compartment ...
Escherichia_coli_1 0.049190 -0.008897 -0.004224 5.999485 5.999485 -0.004224 3.388665e-11 ... 5.017641 -5.017641 1.489184 1.924736e-10 1.489184 1.173698 7.513137
Escherichia_coli_2 -0.079989 -0.115231 0.072559 6.001066 6.001066 0.072559 4.264225e-11 ... 5.033051 -5.033051 1.491048 1.924125e-10 1.491048 1.175562 7.495742
Escherichia_coli_3 0.102350 0.197394 -0.100513 6.004985 6.004985 -0.100513 3.662292e-11 ... 5.083935 -5.083935 1.506075 1.926208e-10 1.506075 1.190589 7.460396
Escherichia_coli_4 -0.071551 -0.073266 0.032177 6.023463 6.023463 0.032177 4.133342e-11 ... 5.122875 -5.122875 1.501628 1.926284e-10 1.501628 1.186143 7.440253
medium NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN
[5 rows x 115 columns]
This has resulted in some issues: the format differs from cobra.Solution.fluxes, which breaks a lot of the cobra functionality, for instance the summary methods. CommunitySolution.fluxes will retain the cobrapy format and will be superseded by new accessors that all return fluxes in long format:
CommunitySolution.exchange_fluxes: exchange fluxes in long format, with the taxa annotated.
reaction name taxon flux direction micom_id
0 EX_ac_m ac_m medium exchange medium 1.814984e-11 export EX_ac_m
1 EX_acald_m acald_m medium exchange medium 1.328645e-11 export EX_acald_m
2 EX_akg_m akg_m medium exchange medium 3.225128e-12 export EX_akg_m
3 EX_co2_m co2_m medium exchange medium 2.280983e+01 export EX_co2_m
4 EX_etoh_m etoh_m medium exchange medium 1.515389e-11 export EX_etoh_m
.. ... ... ... ... ...
CommunitySolution.internal_fluxes
reaction name taxon flux micom_id
0 ACALD Acetaldehyde dehydrogenase (acetylating) Escherichia_coli_1 1.312146e+00 ACALD__Escherichia_coli_1
1 ACALDt Acetaldehyde reversible transport Escherichia_coli_1 3.236132e+00 ACALDt__Escherichia_coli_1
2 ACKr Acetate kinase Escherichia_coli_1 -1.304078e+00 ACKr__Escherichia_coli_1
3 ACONTa Aconitase (half-reaction A, Citrate hydro-lyase) Escherichia_coli_1 5.987675e+00 ACONTa__Escherichia_coli_1
4 ACONTb Aconitase (half-reaction B, Isocitrate hydro-l... Escherichia_coli_1 5.987675e+00 ACONTb__Escherichia_coli_1
This will consolidate GrowthResults and CommunitySolution and give a more readable format. All those properties are generated on the fly when accessed.
Additionally, we may also want to save the annotations in the solution, but they may be large, so it might be better to have a property on the model class like Community.annotations.
A similar format change is planned for Community.knockout_taxa. elasticities already uses a long format.
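For comparison, a rough sketch of deriving such a long format from the current wide fluxes with plain pandas (column and index names follow the wide output shown above):
long = (
    sol.fluxes.reset_index()
    .melt(id_vars="compartment", var_name="reaction", value_name="flux")
    .rename(columns={"compartment": "taxon"})
    .dropna(subset=["flux"])  # drop reactions absent from a taxon
)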
I want to analyse the microbial growth rates for a given microbiome profile (single sample) when run on multiple different diets (or media). The idea is to understand how individual growth rates are impacted by changes in the medium. Running multiple samples on the same medium with grow and the appropriate models has built-in multithreading to take advantage of a multi-core CPU, but growth of a single sample on multiple media doesn't have a multithreading option.
Given that grow on a single sample uses only 1 thread, I want to be able to run growth calculations on different media (100-200 types of media) using multiprocessing to use all threads simultaneously.
I currently use the pool.starmap function from the standard Python multiprocessing library. I am facing an issue of memory pile-up (likely due to workers that are not deallocated/terminated despite the completion of the calculation) that causes the program to crash due to heavy RAM usage.
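A hedged sketch of one mitigation: recycle worker processes with maxtasksperchild=1 under the spawn context, so each medium runs in a fresh process whose memory is returned on exit (file names and the worker body are illustrative):
import multiprocessing as mp
from micom import load_pickle

def grow_on(medium_path):
    com = load_pickle("sample.pickle")  # hypothetical community file
    # ... apply the medium here and run com.cooperative_tradeoff(...) ...
    return medium_path

if __name__ == "__main__":
    media = ["medium_%03d.csv" % i for i in range(150)]  # hypothetical media files
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=8, maxtasksperchild=1) as pool:
        results = pool.map(grow_on, media)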
This issue is a collection of small things that should be fixed in the docs.
Just curious, since there is additional information in the cobrapy Solution object about what did not work that could be returned. Or does it just not make sense to return a CommunitySolution object when the optimization did not work? I'm most interested in knowing what the solver status is.
The format for the taxonomy table and community model manifests can be unclear, and both are often confused for one another. Provide a validation helper and better error messages.
When trying to build a community, I always get a ValueError similar to the one below.
ValueError: ('The model already contains a reaction with the id:', 'ABC_RXN__smaAA1')
Files causing this can be found at https://gitlab.com/mostueve/micom-error-files
Hi. I have a question about cooperative_tradeoff. (micom 0.16.1, cplex 12.10.0)
I used a medium condition from the VMH database and manually added some components so that my community model (over 50 members) could grow.
What I want to look at is the exchange reactions of each member.
It worked when I ran this code with the community model:
com.cooperative_tradeoff(min_growth=0,fraction=0.4, fluxes=False, pfba=True)
But cooperative_tradeoff with fluxes=True returned this result.
com.cooperative_tradeoff(min_growth=0,fraction=0.4, fluxes=True, pfba=True)
2020-07-18 03:23:17.495 | WARNING | micom.solution:solve:222 - solver encountered an error infeasible
Traceback (most recent call last):
File "C:/Users/user/PycharmProjects/MicomEdit/trash.py", line 74, in <module>
a=com.cooperative_tradeoff(min_growth=0,fraction=0.4,fluxes=True, pfba=True)
File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\micom\community.py", line 693, in cooperative_tradeoff
return cooperative_tradeoff(self, min_growth, fraction, fluxes, pfba)
File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\micom\problems.py", line 98, in cooperative_tradeoff
sol = crossover(com, sol, fluxes=fluxes, pfba=pfba)
File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\micom\solution.py", line 282, in crossover
% community.solver.status
cobra.exceptions.OptimizationError: crossover could not converge (status = infeasible).
Process finished with exit code 1
Also, I found that if I run it with other samples (with different members in the microbial community) it sometimes works with fluxes=True.
I'm trying to find a medium condition which I can apply to multiple samples before I compare exchange fluxes.
This issue keeps occurring, and I am wondering why the "fluxes" parameter matters and how to solve it.
Thank you!
Hi,
I am experiencing instability with MICOM when solving some cooperative tradeoff problems in small communities (2-5 species).
In short: assuming that a problem solved on a certain community gives a flux solution S, the same problem solved again returns a flux solution S' (all arguments and constraints being equal). After the second run, the returned solution remains S'.
Below there is a two-model community example using the E. coli iJO1366 model from BiGG.
System information:
OS Ubuntu 16.04
CPLEX 12.8
python 3.6.12
pandas 1.1.3
cobra 0.20.0
micom 0.21.3
import pandas as pd
import micom as mc
species_names = ['ecoli1', 'ecoli2']
model_paths = ['iJO1366.xml', 'iJO1366.xml']
abundances = [0.5, 0.5]
species_df = pd.DataFrame(data={'id': species_names, 'file': model_paths, 'abundance': abundances})
com = mc.Community(species_df, solver='cplex')
sol1 = com.cooperative_tradeoff(fraction = 0.5, fluxes=True, pfba=True)
sol2 = com.cooperative_tradeoff(fraction = 0.5, fluxes=True, pfba=True)
sol3 = com.cooperative_tradeoff(fraction = 0.5, fluxes=True, pfba=True)
print('first vs second:')
print((sol1.fluxes.round(6) - sol2.fluxes.round(6)).abs().max().max())
print('first vs third:')
print((sol1.fluxes.round(6) - sol3.fluxes.round(6)).abs().max().max())
Expected output:
first vs second:
0.0
first vs third:
0.0
Obtained output:
first vs second:
0.002876000000000545
first vs third:
0.002876000000000545
The status is optimal for all the three solutions.
While in this example the maximum flux difference is not dramatic, it tends to increase with more limiting medium uptake bounds, and it grows worryingly large with communities composed of automatically generated models. Apart from individual model instabilities, this behaviour would seem to depend also on some numerical instability in cooperative tradeoff that gets adjusted by CPLEX after the first run; see the previous discussion in opencobra/cobrapy#924.
It would seem that S' is the solution to be trusted given that it remains stable even after the third run, and because the two identical models in the example above get equal growth rates only in S' (up to solver tolerance):
print(sol1.members.loc['ecoli1', 'growth_rate'].round(6) == sol1.members.loc['ecoli2', 'growth_rate'].round(6))
print(sol2.members.loc['ecoli1', 'growth_rate'].round(6) == sol2.members.loc['ecoli2', 'growth_rate'].round(6))
print(sol3.members.loc['ecoli1', 'growth_rate'].round(6) == sol3.members.loc['ecoli2', 'growth_rate'].round(6))
Expected output:
True
True
True
Obtained output:
False
True
True
Is the reasoning correct?
I was wondering if this behaviour has been observed before and if the solver version could make a difference, or else what could be done to avoid it.
Cheers,
Guido
When using the second dataframe output from complete_db_medium and summing up the fluxes per metabolite as the medium file for grow, most communities couldn't grow (the flux values seemed too low).
I also tried getting the per-sample results using the minimal_media function with summarize=False, but I'm receiving this error after 100%:
from micom.workflows import minimal_media
minimal_media_persample = minimal_media(manual_amplicon_manifest, summarize=False,
model_folder="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/manual_amplicon/",
min_growth=.01,threads=28)
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:04:52
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In[5], line 3
1 from micom.workflows import minimal_media
----> 3 minimal_media_persample = minimal_media(manual_amplicon_manifest, summarize=False,
4 model_folder="/projectnb/dietzelab/zrwerbin/N-cycle/data/MICOM/manual_amplicon/",
5 min_growth=.01,threads=28)
File /projectnb/talbot-lab-data/zrwerbin/.conda/envs/micom/lib/python3.12/site-packages/micom/workflows/media.py:64, in minimal_media(manifest, model_folder, summarize, min_growth, threads)
62 if summarize:
63 medium = results.groupby("reaction").flux.max().reset_index()
---> 64 medium["metabolite"] = medium.reaction.str.replace("EX_", "")
65 return medium
UnboundLocalError: cannot access local variable 'medium' where it is not associated with a value
Originally posted by @zoey-rw in #177 (comment)
I want to be able to increase the number of threads allowed in the grow function. It seems that the number of threads assigned to the growth of a single model (community model from a single sample) is fixed at 1. The threads option is used only when the model folder has multiple models.
Since the grow function can take up to 20 minutes for a solver to converge on a solution, I would like to be able to parallelise the grow process for individual models in order to speed up solve time and optimise system usage, since running a single model engages only 1 thread, leaving the others unused.
From personal communication. Since it is an allowed file extension for SBML files, we should support it.
In the documentation when running minimal_medium(), what are the units for the fluxes returned? I'm guessing something like mMol / hr. Also when returning fluxes for a community, I noticed that the fluxes are broken down by metabolite and bacteria. Do the fluxes already include a scaling factor for the abundance of the bacteria in the community?
Thanks!
Hi,
I'm having some issues with the load_pickle command:
for com_No in all_com:
    com_name = 'com_' + str(com_No) + '.pickle'
    print(com_name)
    com = load_pickle('C:/Users/rotemb/OneDrive/Ph.D/binning_90_5/communities/com_test/' + com_name)
I'm getting the error KeyError: '_gpr' when trying to load a pickle file in Python. My pickle file contains 4 species; the GEMs were built using CarveMe and validated with MEMOTE.
Following your recommendation from #31, I execute build_database as follows:
# df has columns:
# id kingdom phylum class order family genus species file
> m = build_database( manifest = df , out_path = "/data/out" , threads = 8 )
which gives a warning message (but still completes successfully):
/usr/local/lib/python3.8/dist-packages/micom/workflows/build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.
meta.index = meta[rank].str.replace("[^\\w\\_]", "_")
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━ 79% 0:02:12
System information:
$ python3 -c "import micom; micom.show_versions()"
System Information
==================
OS Linux
OS-release 5.4.0-1030-aws
Python 3.8.5
Package Versions
================
cobra 0.21.0
jinja2 2.10.1
micom 0.22.5
pip 20.0.2
scikit-learn 0.24.1
scipy 1.6.1
setuptools 45.2.0
symengine 0.7.2
wheel 0.34.2
Similar to the QIIME 2 plugin, the micom readme and docs should mention where one can ask questions.
I have a custom set of SBML files for the organisms in my samples.
The input DataFrame looks like this:
> df.head()
sample_id id abundance file
0 sample1 taxa1 0.364398 /data/model.taxa1.xml.gz
1 sample1 taxa2 0.029670 /data/model.taxa2.xml.gz
2 sample1 taxa3 0.016877 /data/model.taxa3.xml.gz
...
Following the micom documentation for building community models, I execute build as follows:
from micom.workflows import build
manifest = build( taxonomy = df , model_db = None , out_folder = '/data/out', cutoff = 1e-2 , threads = 8 , solver = 'cplex')
Note that the model_db argument is None because there is a file column in the input DataFrame, as instructed by the documentation for build.
However, the above build command generates the following error:
Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% -:--:--
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.8/dist-packages/micom/workflows/build.py", line 28, in build_and_save
metrics = com.build_metrics.to_frame().T
File "/usr/local/lib/python3.8/dist-packages/micom/community.py", line 652, in build_metrics
raise ValueError(
ValueError: Metrics are only available for models build using a model database :(
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ubuntu/work/20210423_micom/src/test.py", line 13, in <module>
manifest = build( taxonomy = indf , model_db = None , out_folder = outdir , cutoff = 1e-2 , threads = 8 , solver = 'cplex')
File "/usr/local/lib/python3.8/dist-packages/micom/workflows/build.py", line 87, in build
res = workflow(build_and_save, args, threads)
File "/usr/local/lib/python3.8/dist-packages/micom/workflows/core.py", line 41, in workflow
results = list(it)
File "/usr/local/lib/python3.8/dist-packages/rich/progress.py", line 148, in track
yield from progress.track(
File "/usr/local/lib/python3.8/dist-packages/rich/progress.py", line 669, in track
for value in sequence:
File "/usr/lib/python3.8/multiprocessing/pool.py", line 868, in next
raise value
ValueError: Metrics are only available for models build using a model database :(
Here is some system information:
$ python3 -c "import micom; micom.show_versions()"
System Information
==================
OS Linux
OS-release 5.4.0-1030-aws
Python 3.8.5
Package Versions
================
cobra 0.21.0
jinja2 2.10.1
micom 0.22.5
pip 20.0.2
scikit-learn 0.24.1
scipy 1.6.1
setuptools 45.2.0
symengine 0.7.2
wheel 0.34.2
Hello,
I am running optcom on different communities with sol = com.optcom(strategy='original', min_growth=0.0, fluxes=False, pfba=True) and I would like to automatically export the growth rates of all microbial models in the community. I tried sol.growth_rate and sol.objective_value, but in both cases I get the growth rate of the community and not of each single metabolic model. How can I solve this issue?
Best,
Arianna
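For reference, a minimal sketch assuming the solution object above: the per-taxon growth rates live on the members frame of the solution rather than on growth_rate or objective_value:
rates = sol.members["growth_rate"].drop("medium", errors="ignore")
rates.to_csv("growth_rates.csv")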