galaxy-tools's Introduction

qiime2 (the QIIME 2 framework)

Source code repository for the QIIME 2 framework.

QIIME 2™ is a powerful, extensible, and decentralized microbiome bioinformatics platform that is free, open source, and community developed. With a focus on data and analysis transparency, QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.

Visit https://qiime2.org to learn more about the QIIME 2 project.

Installation

Detailed instructions are available in the documentation.

Users

Head to the user docs for help getting started, core concepts, tutorials, and other resources.

Just have a question? Please ask it in our forum.

Developers

Please visit the contributing page for more information on contributions, documentation links, and more.

Citing QIIME 2

If you use QIIME 2 for any published research, please include the following citation:

Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, and Caporaso JG. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37:852–857. https://doi.org/10.1038/s41587-019-0209-9

galaxy-tools's People

Contributors

bernt-matthias, ebolyen, lizgehret, oddant1, q2d2

galaxy-tools's Issues

Drop suite_ from tools

When a repository name uses the suite_ prefix, planemo will not upload tool XML files from auto_tool_repositories: the suite type is inferred, and it only allows repository_dependencies.xml through, which isn't very helpful for us. See bd0aeaf for the current hack.

Filter out parallel parameters

As indicated by @bernt-matthias in #32, Galaxy tools should not provide the user with parameters to control the underlying resource usage. Currently parameters such as n_jobs/n_threads/n_cores/etc are all basic Integer or String types, which means that we would have to rely on the parameter name to "guess" at the purpose.
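To illustrate why name-based guessing is fragile, here is a minimal Python sketch. The `RESOURCE_PARAMETER_NAMES` set and the `split_resource_parameters` helper are hypothetical and not part of q2galaxy; only the names `n_jobs`, `n_threads`, and `n_cores` come from the examples above.

```python
# Hypothetical heuristic: guess resource-control parameters purely by name.
RESOURCE_PARAMETER_NAMES = {"n_jobs", "n_threads", "n_cores"}


def split_resource_parameters(parameters):
    """Split a {name: type} mapping into (user_facing, resource) dicts.

    Only the name is inspected, because a basic type such as "Int" or "Str"
    says nothing about what the parameter is used for.
    """
    user_facing, resource = {}, {}
    for name, type_name in parameters.items():
        target = resource if name in RESOURCE_PARAMETER_NAMES else user_facing
        target[name] = type_name
    return user_facing, resource


# "threads" is a made-up parameter name that the heuristic does not know.
user_facing, resource = split_resource_parameters(
    {"sampling_depth": "Int", "n_jobs": "Int", "threads": "Int"}
)
print("shown to user:", user_facing)
print("filtered out:", resource)
```

In this sketch `threads` stays in the user-facing set because it is not in the name list, which is the kind of miss a dedicated primitive type would avoid.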

Ideally we update the Framework to have a Parsl-compatible notion of the above concepts as primitive types, which we can then filter out more intentionally. @Oddant1, would you be able to do some investigative work to figure out what is going to be the most compatible option here? We don't want these new primitive types to show up to the user in Galaxy (unlike other interfaces), but we may be able to use them to template out some basic job configs for administrator convenience.

Once the framework is updated, we will need to update q2galaxy, all other interfaces to recognize the new primitive, and any plugins using these resource parameters.

Additional docs:
https://docs.galaxyproject.org/en/latest/admin/jobs.html#dynamic-destination-mapping

Figure out how test toolsheds work

We have a plugin called q2-mystery-stew which produces the tests used by q2galaxy.

Ideally we could render this plugin and start a test toolshed server to install this suite and run the tests. This would also help prove that our .shed.yml files are meaningful.

There's probably a lot of overlap here with #7.

qiime2view does not work if Galaxy is behind a firewall

If Galaxy is behind a firewall, the qiime2view visualization does not work. Currently the visualization sends a link to the viz website, which cannot reach the data in such a setup.

I am wondering whether it is possible to rewrite the viz plugin so that it sends the data directly, and whether the qiime2view website would support receiving data directly.

Tool tests

I am wondering what your plans are with respect to Galaxy tool tests. Is there a possibility to auto-generate them? Or maybe to have manually curated tests that can be accumulated over time in this repo?

qiime2__diversity__core_metrics_phylogenetic error

2023.5.1+q2galaxy.2023.5.0.2

Let me know if I should provide additional info.

This plugin encountered an error:
Command '['ssu', '-i', '/tmp/qiime2/b
erntm/data/0181e66c-c1cd-4a8f-841f-b2
6e294cdf4f/data/feature-table.biom',
'-t', '/tmp/qiime2/berntm/data/095f38
0b-f423-4a9a-a922-70c3d6fc7284/data/t
ree.nwk', '-m', 'unweighted', '-o',
'/tmp/q2-LSMatFormat-jwa2_krh']'
returned non-zero exit status 1.
                                                                                                                                                                                                                                                               
:(
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-okxb46uu because the default path (/home/qiime2/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Current file:     /opt/conda/conda-bld/unifrac-binaries_1664395747097/work/src/unifrac_cmp.cpp
        function: _ZN6su_acc21UnifracUnweightedTaskIdE4_runEjPKd
        line:     558
This file was compiled: -ta=tesla:cc35,cc50,cc60,cc60,cc70,cc75,cc80,cc80
/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/sklearn/metrics/pairwise.py:1776: DataConversionWarning: Data was converted to boolean for metric jaccard
  warnings.warn(msg, DataConversionWarning)
/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/skbio/stats/ordination/_principal_coordinate_analysis.py:143: RuntimeWarning: The result contains negative eigenvalues. Please compare their magnitude with the magnitude of some of the largest positive eigenvalues. If the negative ones are smaller, it's probably safe to ignore them, but if they are large in magnitude, the results won't be useful. See the Notes section for more details. The smallest eigenvalue is -0.008521727917718724 and the largest is 5.200826700955479.
  warn(
/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/skbio/stats/ordination/_principal_coordinate_analysis.py:143: RuntimeWarning: The result contains negative eigenvalues. Please compare their magnitude with the magnitude of some of the largest positive eigenvalues. If the negative ones are smaller, it's probably safe to ignore them, but if they are large in magnitude, the results won't be useful. See the Notes section for more details. The smallest eigenvalue is -0.0934827577673413 and the largest is 4.863627869126633.
  warn(
Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2023.5/bin/q2galaxy", line 11, in <module>
    sys.exit(root())
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/__main__.py", line 98, in run
    action_runner(plugin, action, config)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/action.py", line 30, in action_runner
    results = _execute_action(action, action_kwargs,
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/stdio.py", line 38, in wrapped
    return function(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/action.py", line 115, in _execute_action
    return action(**action_kwargs)
  File "<decorator-gen-176>", line 2, in core_metrics_phylogenetic
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
    outputs = self._callable_executor_(
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 590, in _callable_executor_
    outputs = self._callable(scope.ctx, **view_args)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_diversity/_core_metrics.py", line 66, in core_metrics_phylogenetic
    dms += unweighted_unifrac(table=cr.rarefied_table, phylogeny=phylogeny,
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/context.py", line 140, in deferred_action
    return action_obj._bind(
  File "<decorator-gen-724>", line 2, in unweighted_unifrac
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
    outputs = self._callable_executor_(
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 509, in _callable_executor_
    output_views = self._callable(**view_args)
  File "<decorator-gen-133>", line 2, in unweighted_unifrac
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_diversity_lib/_util.py", line 69, in _disallow_empty_tables
    return wrapped_function(*args, **kwargs)
  File "<decorator-gen-132>", line 2, in unweighted_unifrac
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_diversity_lib/_util.py", line 112, in _validate_requested_cpus
    return wrapped_function(*bound_arguments.args, **bound_arguments.kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_diversity_lib/beta.py", line 221, in unweighted_unifrac
    _omp_cmd_wrapper(threads, cmd)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_diversity_lib/_util.py", line 128, in _omp_cmd_wrapper
    return _run_external_cmd(cmd, verbose=verbose, env=env)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_diversity_lib/_util.py", line 122, in _run_external_cmd
    return subprocess.run(cmd, check=True, env=env)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ssu', '-i', '/tmp/qiime2/berntm/data/0181e66c-c1cd-4a8f-841f-b26e294cdf4f/data/feature-table.biom', '-t', '/tmp/qiime2/berntm/data/095f380b-f423-4a9a-a922-70c3d6fc7284/data/tree.nwk', '-m', 'unweighted', '-o', '/tmp/q2-LSMatFormat-jwa2_krh']' returned non-zero exit status 1.

Reorganize/improve import tool

I have a hard time figuring out how to import data into qiime2 tools using the import tool. I guess the most frequently used data is demultiplexed fastq.gz (maybe plus a sample data tsv file), e.g. https://data.qiime2.org/2022.8/tutorials/importing/casava-18-single-end-demultiplexed.zip. I failed to find the corresponding option in the import tool.

  • I guess I start with SampleData[PairedEndSequencesWithQuality]?
  • Then there are many options that allow selecting either
    • 1 dataset (e.g. Paired End Fastq Manifest Phred33)
    • or a collection (e.g. Casava One Eight Laneless Per Sample Directory Format), but the collection type is not set (I guess it should be collection_type="list:paired")
    • or individual files via a repeat where the number of elements can be anything between 1 and infinity

To get me started with exploring downstream tools it would be nice if someone could tell me for now how I could import data like the above (is there already a Galaxy specific tutorial that I did not notice so far?).

I guess the main problem is that the mapping between Galaxy concepts and qiime2 concepts needs a bit of improvement (e.g. Galaxy data types and collection types are not used yet). But it's probably also because I'm inexperienced with qiime2. At the moment I'm just guessing that the goal of the import is to create a single qza dataset from all fastq files? I'm also missing info in the help (like the definition of what a manifest is).

Since the tool is auto-generated, I'm unsure if this is easily possible. An alternative would be to handcraft an import tool that covers the most frequently used types of input data and integrates tightly with the Galaxy concepts.

I imagine a tool that takes as input either

  • list:paired (for paired end data) or
  • list (for single end data)

with format fastq.gz (in addition, simple data inputs with multiple="true" might be useful, because some users don't seem to like collections for some reason), plus

  • optional barcodes
  • optional tabular data set for metadata.

The tool then automatically knows about the phred encoding due to the specific Galaxy fastq.gz sub-datatypes.
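As a rough sketch of that last point, the Phred offset could be looked up from the Galaxy datatype extension. The helper name `phred_offset` is hypothetical; the offsets reflect Galaxy's `fastqsanger` (Phred+33) and `fastqillumina` (Phred+64) datatypes, and how an import tool would consume the value is an assumption.

```python
# Phred offsets implied by Galaxy FASTQ datatypes; the .gz variants share them.
PHRED_OFFSET_BY_DATATYPE = {
    "fastqsanger": 33,
    "fastqsanger.gz": 33,
    "fastqillumina": 64,
    "fastqillumina.gz": 64,
}


def phred_offset(galaxy_extension):
    """Return the Phred offset implied by a Galaxy datatype, or None if unknown.

    The generic "fastq.gz" datatype carries no encoding information, so it
    maps to None and the user would still have to choose the encoding.
    """
    return PHRED_OFFSET_BY_DATATYPE.get(galaxy_extension)


for extension in ("fastqsanger.gz", "fastqillumina.gz", "fastq.gz"):
    print(extension, "->", phred_offset(extension))
```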

Extra text in tool version

Seems that the tool version command outputs some extra text:

Matplotlib created a temporary config/cache directory at /tmp/matplotlib-kaxxsc2g because the default path (/home/qiime2/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
diversity version 2022.11.1

I guess MPLCONFIGDIR should be set for the version command (and maybe even for the main tool command).

Alternatively one could use some redirection / grep / tail to get only the actual version.

Use specific data formats

I just started to explore the qiime2 Galaxy tools. Obviously starting with the import tool I noticed that often the unspecific format="data" is used, e.g.

<param name="data" type="data" format="data" help="This data should be formatted as a FastqGzFormat. See the documentation below for more information."/>

This should be avoided, in particular if there are corresponding datatypes in Galaxy. In this specific example format="fastq.gz" seems appropriate, but there are also fastqsanger.gz and fastqillumina.gz if a specific Phred encoding is required.
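One way a tool generator could pick a more specific format is a lookup from the QIIME 2 format name to a Galaxy datatype, falling back to "data" only when nothing better is known. This is a minimal sketch: the lookup table and both function names are hypothetical, and only the FastqGzFormat to fastq.gz pairing comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup; only the FastqGzFormat entry comes from the example above.
GALAXY_FORMAT_BY_QIIME2_FORMAT = {
    "FastqGzFormat": "fastq.gz",
}


def galaxy_format_for(qiime2_format):
    """Return a specific Galaxy datatype if one is known, else generic 'data'."""
    return GALAXY_FORMAT_BY_QIIME2_FORMAT.get(qiime2_format, "data")


def render_data_param(name, qiime2_format):
    """Build a <param> element for a data input using the most specific format."""
    param = ET.Element(
        "param",
        {
            "name": name,
            "type": "data",
            "format": galaxy_format_for(qiime2_format),
        },
    )
    return ET.tostring(param, encoding="unicode")


print(render_data_param("data", "FastqGzFormat"))
```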

Errors due to wrong archive version

I have users reporting the following error (with strange line breaks):

Unexpected error loading arguments in
q2galaxy: /gpfs1/data/galaxy_server/g
alaxy/database/files/000/575/dataset_
575964.dat was created by 'QIIME
2023.5.1'. The currently installed
framework cannot interpret archive
version '6'.

I guess there is some versioning in the qza files and the users used a wrong one.

I think the metadata stored in the Galaxy datatype https://github.com/galaxyproject/galaxy/blob/29290dea0cc78566947fc871fcc5634f11b4ee48/lib/galaxy/datatypes/qiime2.py#L67 could be used to catch this earlier, by adding a metadata validator https://docs.galaxyproject.org/en/master/dev/schema.html#tool-inputs-param-validator to data parameters that take qza input.
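For debugging such reports, the archive version can also be read straight from the file. The sketch below does not use the Galaxy datatype metadata or a validator; it assumes that a .qza file is a zip archive with a `<uuid>/VERSION` file containing a line such as `archive: 6`. The function name and the in-memory stand-in archive are illustrative.

```python
import io
import zipfile


def read_archive_version(qza_file):
    """Return the 'archive' value from the VERSION file of a .qza/.qzv archive.

    Assumes the layout '<uuid>/VERSION' with a line like 'archive: 6'.
    """
    with zipfile.ZipFile(qza_file) as archive:
        version_paths = [
            name
            for name in archive.namelist()
            if name.count("/") == 1 and name.endswith("/VERSION")
        ]
        if not version_paths:
            raise ValueError("no top-level VERSION file found")
        text = archive.read(version_paths[0]).decode("utf-8")
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "archive":
            return value.strip()
    raise ValueError("VERSION file has no 'archive' entry")


# Build a tiny stand-in archive in memory to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "example-uuid/VERSION",
        "QIIME 2\narchive: 6\nframework: 2023.5.1\n",
    )
buffer.seek(0)
print(read_archive_version(buffer))
```

Comparing this value against what the installed framework supports would show whether a dataset was produced by a newer QIIME 2 release than the tool can read.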

Import: set file name automatically

Would it be possible to set the name argument, e.g. over here [https://github.com/qiime2/galaxy-tools/blob/65e4952f33eb335528e8553150e9097e5ea8f556/tools/suite_qiime2_core__tools/qiime2_core__tools__import.xml#L245C78-L245C78] automatically based on the corresponding data parameter?

In Galaxy you could use data.element_identifier. This would be equal to data.name for normal datasets, but different for collection elements (where element_identifier is the name of the collection element).

Edit: Maybe use the element identifier if name is left empty (which currently leads to an error).

Edit: So, the actual problem is that we see the following error if the name is something different than the name of the dataset. When running the tool manually that is "no problem" (but still a bit inconvenient and redundant), but when running in a workflow we do not know the name of the dataset...

Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2023.5/bin/q2galaxy", line 11, in <module>
    sys.exit(root())
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/__main__.py", line 96, in run
    builtin_runner(action, config)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/builtins.py", line 24, in builtin_runner
    tool(inputs, stdio=stdio)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/builtins.py", line 43, in import_data
    artifact = _import_name_data(type_, format_, files_to_move,
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/stdio.py", line 38, in wrapped
    return function(*args, **kwargs)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/q2galaxy/core/drivers/builtins.py", line 85, in _import_name_data
    return qiime2.Artifact.import_data(type_, dir_, view_type=format_)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/result.py", line 327, in import_data
    return cls._from_view(type_, view, view_type, provenance_capture,
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/result.py", line 355, in _from_view
    result = transformation(view, validate_level)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/core/transform.py", line 68, in transformation
    self.validate(view, level=validate_level)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/core/transform.py", line 143, in validate
    view.validate(level)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 177, in validate
    getattr(self, field)._validate_members(collected_paths, level)
  File "/opt/conda/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/plugin/model/directory_format.py", line 109, in _validate_members
    raise ValidationError(
qiime2.core.exceptions.ValidationError: Missing one or more files for EMPSingleEndDirFmt: 'sequences.fastq.gz'
