
co1classifier's Issues

retrain Classifier for CO1 custom data

Dear Dr. Porter,

I've been trying to retrain the RDP Classifier for a custom CO1 database. Starting from your trained dataset as a baseline, I've moved towards retraining using just my custom data. I've realised that I can't find any documentation on how to construct the "gene_copynumber" file that should be used in the last steps of retraining the classifier. Is there any documentation available on the topic?

Thanks in advance,
Marco

add Homo sapiens COI seqs?

Would it be possible to add Homo sapiens COI seqs in future releases? They would be very useful for detecting human "contamination".

re-format taxonomy training data for use in R (dada2)?

Hi, I'm trying to re-format the training data to have 7 consistent taxonomy levels (Kingdom, Phylum, Class, Order, Family, Genus, Species), but I'm unsure how to parse the "mytaxon.txt" file or the FASTA headers into a format that I can easily manipulate in R.

Essentially, I want to use your database with the RDP classifier included with the DADA2 pipeline, but I need to reformat the sequence names to have this format.

Any thoughts or suggestions would be greatly appreciated, and thanks for all your work putting this reference DB together.
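
One possible starting point, sketched in Python rather than R: DADA2's assignTaxonomy reads a training FASTA whose headers are just the semicolon-separated ranks (">Kingdom;Phylum;...;"). The sketch below assumes each input header looks like ">ID rank1;rank2;...", and that the positions of the seven wanted ranks within that lineage are known. Both are assumptions about the files, not something confirmed in this thread, and the function names, the rank_positions argument, and the "NA" placeholder for missing ranks are hypothetical.

```python
# Sketch only: the input header layout and the rank positions are assumptions.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def to_dada2_header(header, rank_positions, missing="NA"):
    """Turn '>ID tax1;tax2;...' into '>Kingdom;Phylum;...;Species;'."""
    # Split off the sequence ID at the first space; taxon names may contain spaces.
    _seq_id, _, lineage = header.lstrip(">").strip().partition(" ")
    taxa = [t.strip() for t in lineage.split(";") if t.strip()]
    # Pick the wanted ranks by position; pad with a placeholder when absent.
    picked = [taxa[p] if p < len(taxa) else missing for p in rank_positions]
    return ">" + ";".join(picked) + ";"


def reformat_fasta(lines, rank_positions):
    """Yield FASTA lines with rewritten headers; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield to_dada2_header(line, rank_positions)
        elif line:
            yield line


if __name__ == "__main__":
    # Made-up record for illustration, not taken from the real database.
    example = [
        ">AB123 cellularOrganisms;Eukaryota;Metazoa;Arthropoda;Insecta;"
        "Diptera;Culicidae;Aedes;Aedes_aegypti\n",
        "ACGTACGT\n",
    ]
    positions = [2, 3, 4, 5, 6, 7, 8]  # hypothetical indices of the 7 ranks
    assert len(positions) == len(RANKS)
    for out in reformat_fasta(example, positions):
        print(out)
```

If the lineages in the real files do not all have the same depth, fixed positions will not be enough, and the rank labels in "mytaxon.txt" would have to be used to decide which entry belongs to which rank.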

Which trained files to use for classification with RDP classifier?

Hey guys,

I wanted to use your trained dataset (v4) to taxonomically assign my sequences in dada2 using the RDP classifier. I am a bit confused, though, about how to actually do this with the files provided in the trained tar.gz directory.

I have unzipped CO1v4_trained.tar.gz, but there is no FASTA file with reference sequences. Isn't that what is needed as input for the RDP classifier? The genus_wordConditionalProbList.txt file is quite large, so I am fairly sure it is the one used for the actual taxonomic classification; I just don't know how exactly.

I have the feeling I totally misunderstand how your datasets are being used with the RDP classifier, sorry for that. Your help is much appreciated.

Cheers
Nauras
