bluegenes / makemytranscriptome
assemble, annotate, and assess transcriptomes in a single step
License: Other
update kegg scripts to get (& map to) arbitrary list of KEGG maps
re-integrate into main pipeline
grab KO from annotation table, not simple conversion file
optional user-specified KO map numbers?
check memory usage of random_subset.py; reduce if necessary
Continue to annotate the full Trinity transcriptome, but also provide annotation info for the subset of contigs called "good" by transrate. This only requires grepping the relevant contig info out of the annotation table (or printing the subset as part of the annotation table script).
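A minimal sketch of the "grep the good contigs out of the annotation table" idea. It assumes a tab-delimited annotation table with a header row and the contig ID in the first column, plus a plain-text list of good contig IDs (one per line). The function name `subset_annotation` and both file layouts are hypothetical, not taken from the pipeline.

```python
import csv


def subset_annotation(annotation_path, good_contigs_path, out_path, id_column=0):
    """Write the header plus every row whose contig ID is in the good list.

    Returns the number of annotation rows kept.
    """
    # Assumed layout: one contig ID per line.
    with open(good_contigs_path) as handle:
        good = {line.strip() for line in handle if line.strip()}

    kept = 0
    with open(annotation_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)
        for row in reader:
            # Skip blank lines; keep rows whose ID column is a good contig.
            if row and row[id_column] in good:
                writer.writerow(row)
                kept += 1
    return kept
```

Holding the good IDs in a set keeps each lookup constant-time, so the full annotation table is streamed once rather than loaded into memory.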
Use a general database conversion file (ID: Name) to add the 'Name' / stitle field to the BLAST tabular output from diamond. This is not necessary when using blast+, as that output already supports the field.
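One way this could look, as a sketch only. It assumes the conversion file holds one `ID<TAB>Name` pair per line and that the subject ID sits in column index 1 of the tabular output (the standard 12-column layout). The helper names `load_id_to_name` and `append_names` are hypothetical.

```python
def load_id_to_name(conversion_lines):
    """Build an {ID: Name} dict from lines of the assumed form 'ID<TAB>Name'."""
    names = {}
    for line in conversion_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        seq_id, _, name = line.partition("\t")
        names[seq_id] = name
    return names


def append_names(blast_lines, names, subject_index=1, missing="NA"):
    """Yield each tabular hit line with the subject's name appended as a new last column."""
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        # Subjects absent from the conversion file get a placeholder value.
        fields.append(names.get(fields[subject_index], missing))
        yield "\t".join(fields)
```

With a 12-column input, the appended name ends up at index 12 of each output line.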
allow user input of pre-downloaded databases
move NR "best words" /query coverage calculation --> main annotation table script
calculate query coverage = (alignment length/ query length)
- alignment length = index [3] in the blast file
- query length = from input fasta
- "name" /words can come from index 12 - Tessa will extend the diamond blast output to contain this field
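The query coverage calculation described above can be sketched as follows. It assumes tab-separated hit lines with the query ID at index 0 and the alignment length at index 3, and takes query lengths from the input FASTA. The function names are hypothetical.

```python
def fasta_lengths(fasta_lines):
    """Return {sequence ID: length}, using the first word of each header as the ID."""
    lengths = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def query_coverage(blast_lines, lengths):
    """Yield (query ID, alignment length / query length) for each hit line."""
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        query_id = fields[0]
        alignment_length = int(fields[3])  # index [3] in the blast file
        yield query_id, alignment_length / lengths[query_id]
```

Both lengths need to be in the same units. For a translated search the reported alignment length may be in residues rather than nucleotides, which this sketch does not correct for.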
Update masterlog to include parameters for each run
(useful for quality tool b/c transrate assesses assemblies based on their input reads, if given)
integrate diamond as default in annotator.py
write up the tasks
When executing manage_tools, stdout should go to the console rather than being captured and hidden
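A generic sketch of running a child process without capturing its output. How manage_tools is actually invoked is not stated here, so the wrapper name `run_visible` and the command passed to it are assumptions.

```python
import subprocess


def run_visible(command):
    """Run a command and return its exit code, leaving its output on the console."""
    # No stdout= or stderr= arguments are passed, so the child process
    # inherits the parent's streams and its output appears as it is produced.
    return subprocess.call(command)
```

Passing `stdout=subprocess.PIPE` would capture the output instead, which is the behaviour this task wants to avoid.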
reduce size footprint
describe how to use database config files
For blastP outputs + parsing (swissprot blastP; PFAM):
Download BUSCO Metazoa by default
wget http://buscos.ezlab.org/files/metazoa_buscos.tar.gz
tar zxf metazoa_buscos.tar.gz
first, need to check that this version works on ni
change: --JM ---> --max_memory
remove: --bflyopts "-V10 stderr"
Modifications for initial install:
create folder structure if it doesn't exist
Known modules that need to be fixed:
functions_general
manage_tools
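For the "create folder structure if it doesn't exist" step, a minimal sketch. The function name `ensure_dirs` and any subdirectory names passed to it are hypothetical; the real layout is whatever the pipeline expects.

```python
import os


def ensure_dirs(base_dir, subdirs):
    """Create each subdirectory under base_dir if missing; return the full paths."""
    paths = []
    for sub in subdirs:
        path = os.path.join(base_dir, sub)
        # exist_ok=True makes this safe to call on every run, not only at install.
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths
```

Because the call is idempotent, it could run at the start of every invocation instead of only during the initial install.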
remove reliance on Trinity gene-trans map for assembly:
Add PFAM db download:
wget https://data.broadinstitute.org/Trinity/Trinotate_v2.0_RESOURCES/Pfam-A.hmm.gz
gunzip Pfam-A.hmm.gz
hmmpress Pfam-A.hmm
Refactor databases to support user configurable install locations and user defined read locations
get --email working again
current gen_sample_info (in mmt) parses the csv --> generate this sample info differently to allow excel inputs
modify handle_cv function: use pandas to allow excel input
make installation/setup for Homebrew & Linuxbrew
chmod commands don't normally make new targets, meaning they don't work well with the supervisor interface. Needs a fix.
Potential fix: just chmod using subprocess instead of the Task system
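The potential fix could be sketched like this, calling chmod directly through subprocess rather than wrapping it in a task. The helper name `make_executable` is hypothetical, and it assumes a POSIX system with `chmod` on the PATH.

```python
import os
import stat
import subprocess


def make_executable(path):
    """Add the owner-execute bit to path by calling chmod directly."""
    # check_call raises CalledProcessError if chmod exits non-zero,
    # so a failure is not silently skipped.
    subprocess.check_call(["chmod", "u+x", path])
    # Confirm the bit is set, since chmod produces no new target file to check.
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```

`os.chmod` would do the same job without spawning a process; the subprocess form is shown only because it is the fix proposed above.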
The transrate setup task installs dependencies for transrate, meaning it doesn't have any obvious targets. This results in the task getting skipped, as the declared targets are created by downloading transrate, not by running the setup command.
make the following optional:
signalP
TMHMM
RNAMMER
the license associated with these programs means they can only be used by academic users
Low priority, but important: