readytowear

Ready-made Taxonomic Weights Repository

Ready-made taxonomic weights generated by q2-clawback for use with the q2-feature-classifier taxonomy classifier.

Searchable Inventory of readytowear taxonomic weights.

If you use any materials in readytowear, please cite:

Kaehler BD, Bokulich NA, McDonald D, Knight R, Caporaso JG, Huttley GA. 2019. Species-level microbial sequence classification is improved by source-environment information. Nature Communications 10: 4643. https://doi.org/10.1038/s41467-019-12669-6

Please also cite (as this classifier was used to provide taxonomic labels for the class weights):

Bokulich NA, Kaehler BD, Rideout JR, Dillon M, Bolyen E, Knight R, Huttley GA, Caporaso JG. 2018. Optimizing taxonomic classification of marker gene sequences. Microbiome 6(1): 90. https://doi.org/10.1186/s40168-018-0470-z

Finally, remember to cite the reference database used (citations for individual reference databases are located in the appropriate data subdirectories).

How to use the readytowear collection

NOTE: The readytowear collection currently only includes taxonomic weights generated for 16S rRNA gene sequence data; weights for other marker genes are not yet available. We may accommodate these other needs in future releases, and encourage community contributions (contribution instructions coming soon). In the meantime, if you use non-16S rRNA gene data and wish to use bespoke classifiers, assemble your own custom taxonomic weights with q2-clawback as described here.

q2-feature-classifier is a plugin for QIIME 2, so QIIME 2 must be installed to use it. Before beginning this tutorial, install and activate your QIIME 2 environment.

Clone readytowear to get started:

git clone https://github.com/BenKaehler/readytowear.git

Train a non-saline soil naive Bayes taxonomy classifier using the latest readytowear fashions:

qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads readytowear/data/gg_13_8/515f-806r/ref-seqs.qza \
  --i-reference-taxonomy readytowear/data/gg_13_8/515f-806r/ref-tax.qza \
  --i-class-weight readytowear/data/gg_13_8/515f-806r/soil-non-saline.qza \
  --o-classifier gg138_v4_soil-non-saline_classifier.qza
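
To see which habitat weights are available for a given reference database and primer pair, you can list the corresponding data directory. The following is a minimal sketch, not part of readytowear or QIIME 2: `list_weights` is a hypothetical helper, and it assumes that every `.qza` file in the directory other than `ref-seqs.qza` and `ref-tax.qza` is a class-weights artifact.

```shell
# Hypothetical helper: print the names of candidate class-weight artifacts
# found in a readytowear data directory, one per line, without the extension.
# Assumption: any .qza that is not ref-seqs.qza or ref-tax.qza is a weights file.
list_weights() {
  dir="$1"
  for f in "$dir"/*.qza; do
    # Skip the unexpanded pattern when the directory holds no .qza files.
    [ -e "$f" ] || continue
    name=$(basename "$f" .qza)
    case "$name" in
      ref-seqs|ref-tax) ;;            # reference inputs, not weights
      *) printf '%s\n' "$name" ;;
    esac
  done
}
```

For example, `list_weights readytowear/data/gg_13_8/515f-806r` should print names that can be passed to `--i-class-weight` after appending `.qza`.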

Now this classifier is ready to use! Classify a set of query sequences contained in a FASTA format file as follows:

qiime tools import \
  --input-path sequences.fna \
  --output-path sequences.qza \
  --type 'FeatureData[Sequence]'

qiime feature-classifier classify-sklearn \
  --i-reads sequences.qza \
  --i-classifier gg138_v4_soil-non-saline_classifier.qza \
  --o-classification bespoke-classifier-results.qza

qiime metadata tabulate \
  --m-input-file bespoke-classifier-results.qza \
  --m-input-file sequences.qza \
  --o-visualization bespoke-classifier-results.qzv
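
If you prefer a quick text summary over the visualization, the classification can be exported and tallied on the command line. This is a sketch under stated assumptions: `count_taxa` is a hypothetical helper, the `exported` directory name is arbitrary, and the exported `taxonomy.tsv` is assumed to have a header row followed by tab-separated Feature ID, Taxon, and Confidence columns.

```shell
# Export the classification first (run inside your QIIME 2 environment):
#   qiime tools export \
#     --input-path bespoke-classifier-results.qza \
#     --output-path exported

# Hypothetical helper: count features per assigned taxon in taxonomy.tsv.
# Prints "<count><TAB><taxon>" lines, most frequent taxon first.
count_taxa() {
  awk -F '\t' '
    NR > 1 { n[$2]++ }                                  # skip the header row
    END    { for (t in n) printf "%d\t%s\n", n[t], t }
  ' "$1" | sort -rn
}
```

After exporting, `count_taxa exported/taxonomy.tsv | head` shows the most common assignments.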

Obtaining reference sequences for full-length Greengenes or SILVA

We couldn't save the full-length reference sequences for Greengenes or SILVA in this repository because they were too big. If you are using GTDB or only using V4, the reference sequences are saved in the repo and you can skip this step. If you are using full-length Greengenes or SILVA reference sequences, you need to download them before you can train any classifier.

To obtain the full-length SILVA reference sequences you can run:

wget https://data.qiime2.org/2020.11/common/silva-138-99-seqs.qza \
  -O readytowear/data/silva_138/full_length/ref-seqs.qza

To obtain the full-length Greengenes reference sequences you can run:

wget ftp://greengenes.microbio.me/greengenes_release/gg_13_5/gg_13_8_otus.tar.gz
tar xzf gg_13_8_otus.tar.gz
qiime tools import \
  --input-path gg_13_8_otus/rep_set/99_otus.fasta \
  --type 'FeatureData[Sequence]' \
  --output-path readytowear/data/gg_13_8/full_length/ref-seqs.qza
rm -r gg_13_8_otus
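
Before training with a full-length reference, it can help to confirm that all three classifier inputs are in place. The sketch below is illustrative only: `check_inputs` is a hypothetical helper, and it assumes the reference sequences, reference taxonomy, and weights sit in the same data directory under the file names used in this tutorial.

```shell
# Hypothetical helper: report which training inputs are missing.
# Arguments: data directory, weights name without the .qza extension.
# Returns 0 when all three files exist, 1 otherwise.
check_inputs() {
  dir="$1"
  weights="$2"
  missing=0
  for f in ref-seqs.qza ref-tax.qza "$weights.qza"; do
    if [ ! -f "$dir/$f" ]; then
      printf 'missing: %s\n' "$dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_inputs readytowear/data/silva_138/full_length soil-non-saline` prints any missing file to standard error; the weights name here is only an example and should match a file that exists in that directory.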
