
q2-gneiss's Introduction

qiime2 (the QIIME 2 framework)

Source code repository for the QIIME 2 framework.

QIIME 2™ is a powerful, extensible, and decentralized microbiome bioinformatics platform that is free, open source, and community developed. With a focus on data and analysis transparency, QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.

Visit https://qiime2.org to learn more about the QIIME 2 project.

Installation

Detailed instructions are available in the documentation.

Users

Head to the user docs for help getting started, core concepts, tutorials, and other resources.

Just have a question? Please ask it in our forum.

Developers

Please visit the contributing page for more information on contributions, documentation links, and more.

Citing QIIME 2

If you use QIIME 2 for any published research, please include the following citation:

Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, and Caporaso JG. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37:852–857. https://doi.org/10.1038/s41587-019-0209-9

q2-gneiss's People

Contributors

andrewsanchez, chriskeefe, david-rod, ebolyen, jairideout, lizgehret, mortonjt, nbokulich, oddant1, q2d2, qiyunzhu, thermokarst, turanoo

q2-gneiss's Issues

Error in gneiss tutorial

Hello,
I am following the gneiss tutorial and got stuck at the beginning, when I run
qiime gneiss correlation-clustering --i-table table.qza --o-clustering hierarchy.qza

I get this error:
Plugin error from gneiss:

Argument to parameter 'table' is not a subtype of FeatureTable[Composition].

I get the same error even when I use my own files.

Are there any updates to the plugin that have not been posted on the web?

I wonder if anyone could help me.
Thanks

numerical values encoded as categories can raise error

The original problem was spotted here
https://forum.qiime2.org/t/plugin-error-from-gneiss-cannot-perform-reduce-with-flexible-type/8717

If the metadata column of interest encodes categorical values that can also be represented as numerical values, then matplotlib will convert the values to numerical values and throw a flexible type error. The fix in the meantime is to convert the categories into strings that cannot be converted into numerical values (e.g. A, B, C, ...).
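
The workaround described above could be sketched roughly as follows. This is only an illustration and not part of q2-gneiss: the helper names `looks_numeric` and `make_non_numeric`, the `group_` prefix, and the example column are all hypothetical.

```python
def looks_numeric(value):
    """Return True if the string can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def make_non_numeric(values, prefix="group_"):
    """Prefix category labels so they can no longer be parsed as numbers.

    Labels are rewritten only when every one of them looks numeric;
    otherwise the column is returned unchanged (as strings).
    """
    values = [str(v) for v in values]
    if values and all(looks_numeric(v) for v in values):
        return [prefix + v for v in values]
    return values


# Hypothetical metadata column whose categories are encoded as 0/1/2.
treatment = ["0", "1", "2", "1", "0"]
relabelled = make_non_numeric(treatment)
print(relabelled)
```

The relabelled values would then be written back to the metadata file before running the plotting command.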

An example of the error message can be found below.

Traceback (most recent call last):
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-277>", line 2, in balance_taxonomy
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 427, in _callable_executor_
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_gneiss/plot/_plot.py", line 138, in balance_taxonomy
    palette=sample_palette)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/gneiss/plot/_decompose.py", line 74, in balance_boxplot
    a = sns.boxplot(ax=ax, x=balance_name, data=data, **kwargs)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/seaborn/categorical.py", line 2237, in boxplot
    plotter.plot(ax, kwargs)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/seaborn/categorical.py", line 549, in plot
    self.draw_boxplot(ax, boxplot_kws)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/seaborn/categorical.py", line 486, in draw_boxplot
    **kws)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/matplotlib/__init__.py", line 1867, in inner
    return func(ax, *args, **kwargs)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/matplotlib/axes/_axes.py", line 3571, in boxplot
    labels=labels, autorange=autorange)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/matplotlib/cbook/__init__.py", line 1843, in boxplot_stats
    stats['mean'] = np.mean(x)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2920, in mean
    out=out, **kwargs)
  File "/Users/jmorton/miniconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/numpy/core/_methods.py", line 75, in _mean
    ret = umr_sum(arr, axis, dtype, out, keepdims)
TypeError: cannot perform reduce with flexible type

Split ilr-transform into two commands

I think it may be a good idea to have two ilr transform commands, namely

  • ilr-hierarchical
  • ilr-phylogenetic

That way we can explicitly handle phylogenetic trees. No need for the assign-ids command, especially if we can fold this inside of the ilr-transform. Any thoughts?

Explicit support for phylogenies

Improvement Description
There are quite a few problems preventing phylogenies from being used in the qiime2 interface, chiefly the lack of tree tip filtering methods available through q2cli.

This can be resolved either by creating a tip filtering command or, even better, by automatically filtering tips (via gneiss.util.match_tips) from the tree prior to performing any gneiss command.
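
To make the idea of tip filtering concrete, here is a toy sketch. It does not use gneiss.util.match_tips or scikit-bio. The nested-tuple tree representation and the `prune_tips` helper are assumptions made purely for illustration.

```python
def prune_tips(tree, keep):
    """Return a copy of `tree` restricted to the tips in `keep`.

    `tree` is a nested tuple whose leaves are tip names (strings).
    Internal nodes left with a single child are collapsed into that
    child, and nodes left with no tips are dropped (returned as None).
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [prune_tips(child, keep) for child in tree]
    children = [c for c in children if c is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]
    return tuple(children)


# Hypothetical tree with five tips, and a table containing only three of them.
tree = (("a", "b"), ("c", ("d", "e")))
table_features = {"a", "c", "d"}
pruned = prune_tips(tree, table_features)
print(pruned)  # ('a', ('c', 'd'))
```

A real implementation would also need to drop table features that are missing from the tree, so that both objects end up with the same set of tips.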

Questions
Any takers on this? cc @tanaes @wasade

Match tips when tree is larger than table in heatmap command

When trying to run the heatmap function as follows

qiime gneiss dendrogram-heatmap \
    --i-table voles_137_sortmerna_filtered_even18000_no177_filt100_composition.biom.qza \
    --i-tree phylogeny.qza \
    --m-metadata-file chernobyl_map_v2_no177.txt \
    --m-metadata-category treatment \
    --o-visualization rad_heatmap_no177 \
    --p-ndim 10 --p-method clr --p-color-map seismic

It can throw an error (see below).

Traceback (most recent call last):
  File "/Users/mortonjt/miniconda3/envs/q2-gneiss/lib/python3.5/site-packages/q2cli/commands.py", line 222, in __call__
    results = action(**arguments)
  File "<decorator-gen-261>", line 2, in dendrogram_heatmap
  File "/Users/mortonjt/miniconda3/envs/q2-gneiss/lib/python3.5/site-packages/qiime2/sdk/action.py", line 203, in callable_wrapper
    output_types, provenance)
  File "/Users/mortonjt/miniconda3/envs/q2-gneiss/lib/python3.5/site-packages/qiime2/sdk/action.py", line 363, in _callable_executor_
    ret_val = callable(output_dir=temp_dir, **view_args)
  File "/Users/mortonjt/Dropbox/UCSD/research/software/q2/q2-gneiss/q2_gneiss/plot/_plot.py", line 198, in dendrogram_heatmap
    highlight_width=0.01, figsize=(12, 8))
  File "/Users/mortonjt/miniconda3/envs/q2-gneiss/lib/python3.5/site-packages/gneiss-0.4.1-py3.5.egg/gneiss/plot/_heatmap.py", line 126, in heatmap
    _plot_highlights_dendrogram(ax_highlights, table, t, highlights)
  File "/Users/mortonjt/miniconda3/envs/q2-gneiss/lib/python3.5/site-packages/gneiss-0.4.1-py3.5.egg/gneiss/plot/_heatmap.py", line 179, in _plot_highlights_dendrogram
    node = t.find(n)
  File "/Users/mortonjt/miniconda3/envs/q2-gneiss/lib/python3.5/site-packages/skbio/tree/_tree.py", line 1562, in find
    raise MissingNodeError("Node %s is not in self" % name)
skbio.tree._exception.MissingNodeError: Node 1L-9fe19287-17b2-4dd7-ad52-60ff31dc67ad is not in self

It turns out that this happens when the tree is larger than the table, so some of the internal nodes get filtered out before the actual rendering. Thanks @cuttlefishh for catching this!

Proportion plots can be misleading with few features

Improvement Description
The proportion plots can be a little misleading at the moment.

Right now, if there is only 1 feature in the numerator or the denominator, the proportion plot will still show multiple features, even though there is only 1.

The proportion plot should only plot 1 feature if there is only 1 feature.

Current Behavior

image

Here, there is only 1 feature in the denominator, but for some reason there are 5 features plotted.

Future ideas for `balance_taxonomy`

Improvement Description
Showing the full taxonomy string from the root to the current level is usually how we display the taxonomic information. That would make the Balance Taxonomy barplot match q2-taxa a bit better.

Additionally, the proportion plot should probably have the full taxonomy string, since it is representing a given balance tip. Alternatively, it might make sense for those to be collapsed to the taxa level as well.

It would be nice if the scatter plot was also colored by the partition group when working with numeric metadata.

ILR ordination plots

One of the things that makes the ILR transform really hard to use is the difficulty of interpreting the balances.

I propose that we represent the ILR transform as an ordination object, something like the picture below.

image

For each clade, -1 indicates that the species belongs to the denominator, +1 indicates that it belongs to the numerator, and 0 indicates that it doesn't belong to that particular clade.
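
The encoding described above could look roughly like this for a small bifurcating tree. This is a sketch only: the nested-tuple tree, the helper names, and the convention that the first child is the numerator and the second the denominator are all assumptions, not the gneiss implementation.

```python
def tips(tree):
    """List the tip names below a nested-tuple tree node."""
    if isinstance(tree, str):
        return [tree]
    return [t for child in tree for t in tips(child)]


def clade_membership(tree):
    """Return (tip_names, rows), one row per internal node of a bifurcating tree.

    Each row holds +1 for tips in the numerator, -1 for tips in the
    denominator, and 0 for tips outside that clade.
    Assumption: the first child is the numerator, the second the denominator.
    """
    all_tips = tips(tree)
    rows = []

    def visit(node):
        if isinstance(node, str):
            return  # tips do not define a balance
        left, right = node
        num, den = set(tips(left)), set(tips(right))
        rows.append([1 if t in num else -1 if t in den else 0
                     for t in all_tips])
        visit(left)
        visit(right)

    visit(tree)
    return all_tips, rows


# Hypothetical four-species tree: three internal nodes, hence three balances.
tree = (("a", "b"), ("c", "d"))
names, matrix = clade_membership(tree)
for row in matrix:
    print(row)
```

With n tips a bifurcating tree has n - 1 internal nodes, so the dense matrix grows quadratically with the number of features.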

The sample loadings represent the log-ratios to be plotted in emperor, and the feature loadings represent the tips to be colored in empress.

The only minor concern is the memory requirements. If there are 100k microbes, then this object will be 100k x 100k.
So there will need to be some pruning for the really massive objects.
To select the balances of interest, I propose two possible options:

  1. Select the top k balances with the highest variance
  2. Pass in a list of balances one wishes to analyze.
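
The first option might be sketched as below. The `top_k_balances` helper, the dictionary layout, and the balance values are hypothetical, and sample variance is just one reasonable reading of "through variance".

```python
from statistics import variance


def top_k_balances(balances, k):
    """Return the names of the k balances with the largest sample variance.

    `balances` maps a balance name to its list of values across samples.
    """
    ranked = sorted(balances,
                    key=lambda name: variance(balances[name]),
                    reverse=True)
    return ranked[:k]


# Made-up balances for four samples.
balances = {
    "y0": [0.1, 0.2, 0.1, 0.2],
    "y1": [-2.0, 1.5, 3.0, -1.0],
    "y2": [0.5, -0.5, 0.6, -0.4],
}
selected = top_k_balances(balances, 2)
print(selected)  # ['y1', 'y2']
```

The second option needs no ranking: the caller's list of names would simply be used to subset the same mapping.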

The only question I have now is, are the clade-membership values best represented as feature-metadata, or feature-loadings?

CC @ElDeveloper @fedarko

Remove deprecated actions

Improvement Description
#63 marked several actions for deprecation. We should remove those now that they have been deprecated for a whole release cycle (technically two now, unless we cut a 2019.10 patch).

Adding support for FeatureTensor types

Addition Description
I have a use case where I would like to compute the ILR transform on an entire tensor of (samples) x (features) x (monte_carlo_estimates). Doing so would open the doors for enabling Bayesian phylogenetic inference for differential abundance tasks.

Current Behavior
Doesn't exist

Proposed Behavior
Introduce a new function qiime gneiss ilr-phylogenetic-tensor taking the FeatureTensor type proposed in q2-differential as input.

Comments
This is going to depend on biocore/gneiss#288

balance-taxonomy CSV header inconsistencies

Bug Description
When the artifact generated by qiime gneiss balance-taxonomy is exported with qiime tools export, the resulting files include “numerator.csv” and “denominator.csv”.

The “usual” header line in these files contains: Feature ID,0,1,2,3,4,5,6.

However, if the file contains only one feature, then the header becomes: ,0,1,2,3,4,5,6. The Feature ID is missing. This isn’t a big deal unless one tries to aggregate multiple “numerator.csv” and “denominator.csv” files together, which then produces some problems.
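
Until the header is fixed in the plugin, one possible workaround when aggregating the exported files is to restore the blank header cell while reading. This is a sketch using only the Python standard library. The `read_balance_csv` helper and the file contents are made up for illustration, and the header is shortened to three columns.

```python
import csv
import io


def read_balance_csv(handle, id_column="Feature ID"):
    """Read a numerator/denominator CSV, restoring a blank first header cell."""
    rows = list(csv.reader(handle))
    header, body = rows[0], rows[1:]
    if header and header[0] == "":
        header[0] = id_column
    return header, body


# Hypothetical exports: one with the usual header, one with a single feature.
multi = io.StringIO(
    "Feature ID,0,1\n"
    "f1,k__Bacteria,p__Firmicutes\n"
    "f2,k__Bacteria,p__Bacteroidetes\n"
)
single = io.StringIO(",0,1\nf3,k__Bacteria,p__Proteobacteria\n")

header_a, body_a = read_balance_csv(multi)
header_b, body_b = read_balance_csv(single)

# With matching headers, the rows can be concatenated safely.
combined = body_a + body_b
print(header_b, len(combined))
```

Real files would be opened with `open(path, newline="")` instead of `io.StringIO`.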

References
Ported wholesale from the forum
