This is a QIIME 2 plugin. For details on QIIME 2, see https://qiime2.org.
qiime2 / q2-fragment-insertion
License: BSD 3-Clause "New" or "Revised" License
Bug Description
Travis isn't set up correctly.
Improvement Description
I thought about the FeatureData[Taxonomy] artifact and Daniel's warnings about the quality of the assigned taxonomic labels, which depends on the quality of the placements of taxonomic labels in the reference phylogeny. Furthermore, fragment insertion is not unambiguous but results in a distribution of positions, and I remember Siavash suggesting his program TIPP for taxonomy assignment. Thus, I think we should organize creation of a FeatureData[Taxonomy] artifact as a separate function instead of integrating it into the main function ("sepp").
Proposed Behavior
Currently, I am thinking about two alternatives to generate a FeatureData[Taxonomy] artifact:
classify-paths: the current method, which collects all taxonomic labels along the path from tip to root. The single input would be the Phylogeny[Rooted] artifact.
classify-otus: For every inserted fragment, we traverse the tree from tip to root. At every step, we check whether the current subtree contains any OTU nodes. If so, we stop; otherwise we continue the same procedure with the parent node. Once we have found one (or maybe several) OTUs, we look up their assigned taxonomic lineage in the Greengenes/Silva taxonomy table for the corresponding reference tree. In the case of several OTUs, we report the longest common prefix. This would require two inputs: the Phylogeny[Rooted] artifact and the taxonomy table from Greengenes with two columns, OTU-ID and lineage-string. This is the more conservative method and should only produce results on par with current Greengenes-based taxonomy assignment algorithms.
classify-tipp: A future development could use Siavash's TIPP to generate taxonomic lineages.
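The classify-otus idea above can be sketched roughly as follows. This is only an illustration of the traversal, not the plugin's implementation: the tree is represented as a hypothetical child-to-parent dictionary, and the names classify_otus, longest_common_prefix, parent and otu_taxonomy are assumptions made for this example.

```python
def longest_common_prefix(lineages):
    """Return the longest common prefix of several lists of ranks."""
    prefix = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        prefix.append(ranks[0])
    return prefix


def classify_otus(parent, fragments, otu_taxonomy):
    """Assign each fragment the lineage shared by its nearest OTU tips.

    parent: dict mapping node name -> parent name (the root maps to None)
    fragments: names of the inserted fragment tips
    otu_taxonomy: dict mapping OTU ID -> '; '-separated lineage string
    """
    # Invert the child -> parent mapping so subtrees can be walked.
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)

    def otu_tips_below(node):
        # Collect every reference OTU tip in the subtree rooted at node.
        stack, found = [node], []
        while stack:
            current = stack.pop()
            kids = children.get(current, [])
            if not kids and current in otu_taxonomy:
                found.append(current)
            stack.extend(kids)
        return found

    result = {}
    for fragment in fragments:
        node, otus = parent[fragment], []
        # Climb towards the root until some subtree contains an OTU.
        while node is not None and not otus:
            otus = otu_tips_below(node)
            node = parent[node]
        lineages = [otu_taxonomy[otu].split('; ') for otu in otus]
        result[fragment] = '; '.join(longest_common_prefix(lineages))
    return result
```

A fragment whose nearest subtree holds a single OTU inherits that OTU's full lineage; a fragment that only finds OTUs higher up receives the (shorter) common prefix, which is what makes the method conservative.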
Questions
@wasade what are your thoughts?
Bug Description
I ran classify-otus-experimental and got an error saying that one of the entries was a float and could not be parsed. After digging into the taxonomy file, it looks like one of the entries was blank; it was read as NaN, which broke the parsing. Once I deleted the line, everything ran fine.
Questions
Any chance something could be coded to avoid this issue in the future?
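One possible way to guard against this is to read the taxonomy table as plain strings and handle blank lineages explicitly before classification. The sketch below is not taken from the plugin; it assumes a header-less, tab-separated file with two columns (ID and lineage), and load_taxonomy and the 'Unassigned' placeholder are hypothetical names.

```python
import csv
import io


def load_taxonomy(handle, placeholder='Unassigned'):
    """Read a two-column (ID, lineage) TSV, tolerating blank lineages.

    Returns the taxonomy dict and the list of IDs whose lineage was blank,
    so the caller can warn about them instead of failing later.
    """
    taxonomy, blanks = {}, []
    for row in csv.reader(handle, delimiter='\t'):
        if not row or not row[0].strip():
            continue  # skip completely empty lines
        otu_id = row[0].strip()
        lineage = row[1].strip() if len(row) > 1 else ''
        if not lineage:
            # Blank entry: keep it as a string rather than a NaN float.
            blanks.append(otu_id)
            lineage = placeholder
        taxonomy[otu_id] = lineage
    return taxonomy, blanks


example = io.StringIO('otu1\tk__Bacteria; p__Firmicutes\notu2\t\notu3\n')
taxonomy, blanks = load_taxonomy(example)
print('IDs with blank lineage:', blanks)
```

Because every value stays a string, a blank cell can no longer turn into a float further down the line; whether to substitute a placeholder or to abort with a clear message is a design choice left open here.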
Siavash merged my patch, thus we can drop that and switch to his latest tagged version
Improvement Description
Check what happens to sequence names that contain whitespace.
Those are outdated, now that this is in the core distribution.
Improvement Description
PR #66 introduced major changes to the plugin and we have some open ToDos. Let us keep track of them here with this list:
References
PR #66
I ran fragment insertion as seen in the tutorial. I used the Silva 128 provided tree and alignment. My insertion tree was created, and I filtered my feature table. However, once I get to the classify-otus-experimental step and use the Silva 128 consensus 7 level taxonomy, I get the following error:
Not all OTUs in the provided insertion tree have mappings in the provided reference taxonomy.
I am attaching my insertion tree.
Any help would be appreciated!
Hello,
I've run into an error when installing this package into a QIIME2 (2018.8) conda environment (Miniconda3-latest-Linux-x86_64) installed into my home directory on a computing cluster (i.e., barnacle).
This is the code I ran to install the package:
$ conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion
This is the error that prints to screen when running the install code:
Solving environment: failed
Traceback (most recent call last):
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/exceptions.py", line 819, in __call__
return func(*args, **kwargs)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/main.py", line 78, in _main
exit_code = do_call(args, p)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 77, in do_call
exit_code = getattr(module, func_name)(args, parser)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/main_install.py", line 11, in execute
install(args, parser, 'install')
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/cli/install.py", line 235, in install
force_reinstall=context.force,
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 518, in solve_for_transaction
force_remove, force_reinstall)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 451, in solve_for_diff
final_precs = self.solve_final_state(deps_modifier, prune, ignore_pinned, force_remove)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 180, in solve_final_state
index, r = self._prepare(prepared_specs)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/solve.py", line 592, in _prepare
self.subdirs, prepared_specs)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/index.py", line 215, in get_reduced_index
new_records = query_all(spec)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/index.py", line 184, in query_all
return tuple(concat(future.result() for future in as_completed(futures)))
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 95, in query
self.load()
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 149, in load
_internal_state = self._load()
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 246, in _load
_internal_state = self._process_raw_repodata_str(raw_repodata_str)
File "/home/jpshaffer/software/miniconda3/lib/python3.7/site-packages/conda/core/subdir_data.py", line 369, in _process_raw_repodata_str
info['fn'] = fn
TypeError: 'NoneType' object does not support item assignment
$ /home/jpshaffer/software/miniconda3/bin/conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion
environment variables:
CIO_TEST=
CONDA_DEFAULT_ENV=qiime2-2018.8
CONDA_EXE=/home/jpshaffer/software/miniconda3/bin/conda
CONDA_PREFIX=/home/jpshaffer/software/miniconda3/envs/qiime2-2018.8
CONDA_PROMPT_MODIFIER=(qiime2-2018.8)
CONDA_PYTHON_EXE=/home/jpshaffer/software/miniconda3/bin/python
CONDA_ROOT=/home/jpshaffer/software/miniconda3
CONDA_SHLVL=1
MANPATH=/opt/slurm-18.08.0/share/man:/opt/torque-4.2.8/man:
MODULEPATH=/opt/modules/Modules/versions:/opt/modules/Modules/$MODULE_VERSION/mod
ulefiles:/opt/modules/Modules/modulefiles
PATH=/home/jpshaffer/software/miniconda3/envs/qiime2-2018.8/bin:/home/jpsha
ffer/software/miniconda3/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin
:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/gold/2.2.0.5/sbin:/opt/
gold/2.2.0.5/bin:/opt/torque-4.2.8/bin:/opt/torque-4.2.8/sbin:/opt/mau
i-3.3.1/bin:/opt/slurm-18.08.0/bin:/opt/slurm-18.08.0/sbin
PYTHONNOUSERSITE=/home/jpshaffer/software/miniconda3/envs/qiime2-2018.8/lib/python*/sit
e-packages/
REQUESTS_CA_BUNDLE=
SSL_CERT_FILE=
active environment : qiime2-2018.8
active env location : /home/jpshaffer/software/miniconda3/envs/qiime2-2018.8
shell level : 1
user config file : /home/jpshaffer/.condarc
populated config files :
conda version : 4.5.11
conda-build version : not installed
python version : 3.7.0.final.0
base environment : /home/jpshaffer/software/miniconda3 (writable)
channel URLs : https://conda.anaconda.org/anaconda/linux-64
https://conda.anaconda.org/anaconda/noarch
https://repo.anaconda.com/pkgs/main/linux-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/free/linux-64
https://repo.anaconda.com/pkgs/free/noarch
https://repo.anaconda.com/pkgs/r/linux-64
https://repo.anaconda.com/pkgs/r/noarch
https://repo.anaconda.com/pkgs/pro/linux-64
https://repo.anaconda.com/pkgs/pro/noarch
https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://conda.anaconda.org/bioconda/linux-64
https://conda.anaconda.org/bioconda/noarch
https://conda.anaconda.org/biocore/linux-64
https://conda.anaconda.org/biocore/noarch
package cache : /home/jpshaffer/software/miniconda3/pkgs
/home/jpshaffer/.conda/pkgs
envs directories : /home/jpshaffer/software/miniconda3/envs
/home/jpshaffer/.conda/envs
platform : linux-64
user-agent : conda/4.5.11 requests/2.19.1 CPython/3.7.0 Linux/2.6.32-573.26.1.el6.x86_64.debug centos/6.6 glibc/2.12
UID:GID : 420084:550
netrc file : None
offline mode : False
An unexpected error has occurred. Conda has prepared the above report.
If submitted, this report will be used by core maintainers to improve
future releases of conda.
I was able to reproduce the error after uninstalling and reinstalling both Miniconda and the QIIME2 environment.
Please let me know if you need additional information to troubleshoot this error.
Thanks in advance and best wishes,
Justin
We are now able to pass in other reference trees/alignments. Thus, I think we should rename the QIIME 2 function into something independent of "16S-greengenes", if we intend to compile other references like Silva. Maybe just call it "sepp" ?
Excuse me, how can I solve this problem?
Plugin error from phylogeny:
Command '['mafft', '--preservecase', '--inputorder', '--thread', '33', '/tmp/qiime2-archive-tspmm41w/4dd87431-bf1c-465f-8f38-2d4c3a9605cf/data/dna-sequences.fasta']' returned non-zero exit status 1.
Debug info has been saved to /tmp/qiime2-q2cli-err-6e_xupmi.log
Bug Description
The tutorial (recently moved to the QIIME 2 library) cites a taxonomy_gg99.qza file link that is broken.
Steps to reproduce the behavior
See "assign taxonomy" tutorial here: https://library.qiime2.org/plugins/q2-fragment-insertion/16/
Expected behavior
File should be replaced here, or the link fixed to point elsewhere in the tutorial (probably a better solution).
References
forum xref
Is there a reason why you chose not to use the tree and placement files (.relabelled) that have the restored internal node labels? As far as I understand the code, Siavash assigns every node a unique ID and prefixes the original label with this ID. In a postprocessing step (a generated Python program), those IDs get trimmed from the labels to restore their original values.
Thus, users don't see those IDs in e.g. the taxonomy labels of the reference.
Improvement Description
It should be possible to download the QIIME compatible version of Silva and construct reference phylogeny and alignment for SEPP to enable 18S analyses.
Questions
@josenavas @wasade do you know if release 128 is the latest?
How and where would we host SEPP-compatible references? Within this plugin (which is already 130 MB in size), or in the GitHub repo?
with respect to later q2 version and the newly optional reference inputs
add a qiime2 function feature-table -> phylogeny -> feature-table that removes those features not found in phylogeny.
And maybe reports about lost read ratio?!
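The requested behaviour could look roughly like the sketch below. It is an assumption-laden illustration rather than the plugin's code: the feature table is modelled as a plain nested dictionary instead of a QIIME 2 artifact, and filter_features is a hypothetical function name.

```python
def filter_features(table, tree_tips):
    """Split a feature table by whether each feature occurs in the tree.

    table: dict mapping feature ID -> dict of sample ID -> read count
    tree_tips: iterable of tip names found in the insertion tree
    Returns (kept, removed, lost_read_ratio).
    """
    tips = set(tree_tips)
    kept = {f: counts for f, counts in table.items() if f in tips}
    removed = {f: counts for f, counts in table.items() if f not in tips}
    total = sum(sum(counts.values()) for counts in table.values())
    lost = sum(sum(counts.values()) for counts in removed.values())
    # Fraction of all reads that belong to features missing from the tree.
    ratio = lost / total if total else 0.0
    return kept, removed, ratio


table = {'f1': {'s1': 10, 's2': 5}, 'f2': {'s1': 5}}
kept, removed, ratio = filter_features(table, ['f1', 'ref_tip'])
print(f'lost read ratio: {ratio:.2f}')
```

Returning the removed features alongside the kept ones lets the user inspect what was dropped, and the ratio gives a quick sense of whether the loss matters.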
Improvement Description
There are increasing numbers of use cases where one wants to merge placements from different runs against the same reference phylogeny.
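As a rough sketch of what such a merge might involve, assume the placements are stored in the jplace JSON format with its "tree", "fields" and "placements" entries, and that both runs used exactly the same reference, so that the tree strings (including edge numbering) are identical. The function name merge_jplace is hypothetical, and handling of fragments that appear in both runs is deliberately left out.

```python
def merge_jplace(first, second):
    """Merge two parsed jplace documents placed on the same reference.

    Both arguments are dicts as obtained from json.load(). The merge is
    refused if the reference trees or the placement field layouts differ,
    because edge numbers would then not be comparable.
    """
    if first['tree'] != second['tree']:
        raise ValueError('placements refer to different reference trees')
    if first['fields'] != second['fields']:
        raise ValueError('placement field layouts differ')
    merged = dict(first)
    merged['placements'] = first['placements'] + second['placements']
    return merged
```

The strict equality checks are intentionally conservative: if two runs only differ in how the reference tree was serialized, a real implementation would need a more careful comparison.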
Questions
Hello.
I am trying the following:
conda config --add channels anaconda
conda config --add channels conda-forge
conda config --add channels defaults
conda config --add channels r
conda config --add channels bioconda
conda install -c qiime2/label/r2018.6 qiime2
conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion
qiime dev refresh-cache
But when trying to "Solve the environment", I am getting the PackagesNotFoundError:
conda install -c anaconda -c defaults -c conda-forge -c bioconda -c https://conda.anaconda.org/biocore q2-fragment-insertion
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- q2-fragment-insertion
- q2cli[version='>=2017.12.*']
- q2-fragment-insertion
- q2-feature-table[version='>=2017.12.*']
- q2-fragment-insertion
- q2-types[version='>=2017.12.*']
- q2-fragment-insertion
- q2templates[version='>=2017.12.*']
Current channels:
- https://conda.anaconda.org/anaconda/linux-64
- https://conda.anaconda.org/anaconda/noarch
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/linux-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
- https://repo.anaconda.com/pkgs/pro/linux-64
- https://repo.anaconda.com/pkgs/pro/noarch
- https://conda.anaconda.org/conda-forge/linux-64
- https://conda.anaconda.org/conda-forge/noarch
- https://conda.anaconda.org/bioconda/linux-64
- https://conda.anaconda.org/bioconda/noarch
- https://conda.anaconda.org/biocore/linux-64
- https://conda.anaconda.org/biocore/noarch
- https://conda.anaconda.org/r/linux-64
- https://conda.anaconda.org/r/noarch
I am trying to create a Singularity container with qiime2 plus your extension.
Thank you.
Anders.
Hi @wasade ,
testing is currently not very convenient, because of the long waiting times. Therefore, I think passing reference tree/alignment would be quite beneficial. I wonder how to design that.
Since both are Semantic Types (FeatureData[AlignedSequence] and Phylogeny[Rooted]), they can only be "inputs", not "parameters", right? If so, do you know if it is possible to have optional inputs?
If not, the user always needs to pass the reference alignment and reference phylogeny as q2 artifacts. Do we really want to put that burden on users, or would we be fine with two "parameters" (which can be optional) that point to filenames?
P.S. could you invite me to the slack channel for q2?
Improvement Description
I finally was able to clean up Siavash's source code and created a bioconda recipe for SEPP, producing the packages at https://anaconda.org/bioconda/sepp
Note that this package does NOT contain the default Greengenes 13.8 99% reference, which consists of three files: a) alignment, b) tree, c) info file. In the future, we also want to support alternative references like SILVA.
Proposed Behavior
I wonder how we best do this? I see the following options:
Data resources
Questions
Any thoughts @thermokarst @antgonza ?
References
double check if --p-threads is correctly passed to executable
pick a licence!
according to Siavash, SEPP might fail if fragments to be inserted have same names as tips of reference tree. Add a testing function to abort early if user provides conflicting names.
How about internal node names?
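An early check along these lines might be sketched as follows. The function name check_name_conflicts is hypothetical, and both inputs are assumed to be plain collections of names; how they are extracted from the artifacts is not shown. The same check could be applied to internal node names if those turn out to matter.

```python
def check_name_conflicts(fragment_ids, reference_tip_names):
    """Raise ValueError if a fragment shares its name with a reference tip."""
    conflicts = sorted(set(fragment_ids) & set(reference_tip_names))
    if conflicts:
        # Show only a few examples to keep the message readable.
        shown = ', '.join(conflicts[:5])
        raise ValueError(
            f'{len(conflicts)} fragment name(s) also occur in the '
            f'reference tree, e.g. {shown}')


# No exception: the name sets are disjoint.
check_name_conflicts(['frag1', 'frag2'], ['ref1', 'ref2'])
```

Running this before SEPP is started would let the command abort within seconds instead of failing late in a long run.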
Can be found here:
Here's the relevant pin. This will need to be addressed for q2-fragment-insertion to stay in the amplicon distribution when QIIME 2 transitions its Python version to 3.10 (planned for the 2024.10 release, which is currently scheduled for 2 October 2024).
Bug Description
Hello, I've been trying to use q2-fragment-insertion in order to use PICRUSt2, following the instructions from the original source. Unfortunately, I got an error from this plugin. I saw the same error reported in a forum and followed the instructions given there. First I tried it with the files of interest to me, and then with the files provided in the tutorial, using this command:
qiime fragment-insertion sepp --i-representative-sequences mammal_seqs.qza --p-threads 12 --i-reference-alignment reference.fna.qza --i-reference-phylogeny reference.tre.qza --output-dir pruebapicrust2tutorial --p-debug --verbose 2> err.txt > out.txt
there was no follow up on the error.
References
For more detailed information, here are the files:
err.txt
out.txt
Comments
I'm using an HP machine with the following hardware:
AMD® A12-9720p radeon r7, 12 compute cores 4c+8g × 4
I thought it might be a problem with the installation, so I removed qiime2 and reinstalled it. I also updated Anaconda and conda to the latest version.
Thank you
Hi there,
I stumbled on a weird behaviour with qiime fragment-insertion: when I run the following, I get an error that it cannot find the 'filter-features' action.
qiime fragment-insertion filter-features \
--i-table $path2table \
--i-tree insertion-tree.qza \
--o-filtered-table filtered_table.qza \
--o-removed-table removed_table.qza
Returns the following error:
Error: QIIME 2 plugin 'fragment-insertion' has no action 'filter-features'.
Further, if I look at the qiime fragment-insertion --help output, there are only two options: classify-otus-experimental and sepp.
I would be very grateful for any help you could provide. I'm an amateur bioinformatician and I have now exhausted my troubleshooting skills.
I am running QIIME 2 version 2018.4.0 with fragment-insertion 2018.2.0.dev0. See the attached file for my complete qiime info output.
Thank you very much for your help (and the easy-to-use software!!)
Courtney
If SEPP fails it should be more verbose, i.e. override Siavash's trap function which eliminates protocols and thus hinders debugging.
Improvement Description
"Preserve" original feature IDs by renaming with the rename-json.py script output by SEPP.
Because SEPP renames nodes, the trees it produces don't play nicely with downstream tools like Empress that can color trees using feature metadata.
Current Behavior
This tree cannot be easily colored by taxonomy, because the node IDs do not map to the original feature IDs.
Proposed Behavior
Use the rename-json.py script output by SEPP to "preserve" original feature IDs, probably by exposing a new parameter so as not to impact runtimes.
Improvement Description
Jake asked if it would be possible to compute insertion trees not only for 16S but also for 18S and ITS.
Comments
I think that would work in principle; however, we would need to create reference trees for the corresponding databases (Silva and Unite). Any comments?
Hi @antgonza @josenavas ,
I hope that we will soon have completed the q2 plugin for SEPP. I wonder how we would integrate that into Qiita? Can you wrap general qiime2 plugins, or would we have to create our own Qiita plugin?
Would you consider SEPP a tool for data processing or (meta)-analysis?
Following up on a months-old discussion regarding including this plugin in the QIIME 2 Core Distribution. Here are some options for us to proceed:
@sjanssen2, I think the easiest path is for us to go with 1 - since this will be the least friction for busywork to be wired up.
@antgonza, @sjanssen2, @ebolyen, @gregcaporaso, @nbokulich (and probably more, apologies if my list is incomplete) have discussed getting this into the "core" distribution of QIIME 2, and we would really like that to happen in time for the upcoming release of QIIME 2 (2018.11, scheduled for this Thursday). I don't expect there to be too much to get this rolled into the distro, but, it would be a lot simpler if we moved this over to @qiime2.
Thoughts, @sjanssen2?
It would be very useful if, upon completion, sepp printed the number of successfully inserted features.
So far, working on human and mouse samples, I've never had a case where any features failed to be inserted into the tree, but I still do the filtering step, and each time my table is unchanged. The filtering step can also take a little time depending on the number of features you have. It would be super convenient if sepp could just print how many features it inserted, so the user could compare that number to the number of features in their feature table and see whether a filtering step is needed or not.
Alternatively, the full insertion and filtering can be turned into a pipeline to do all in one go.
Thanks!