russel88 / maginator
MAGinator - Accurate SNV calling and profiling of MAGs
License: MIT License
We should update to GTDB release 214 and the newest GTDB-Tk version, but check that the output formats have not changed.
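One way to guard against silent format changes is to verify, before parsing, that the new GTDB-Tk summary file still contains the columns the workflow reads. A minimal sketch; the column names below are assumptions about what MAGinator parses, not taken from its source:

```python
# Hypothetical sanity check for a gtdbtk.*.summary.tsv header after a
# GTDB-Tk upgrade. EXPECTED_COLUMNS is an assumed set, not MAGinator's
# actual parsing requirements.
EXPECTED_COLUMNS = {"user_genome", "classification"}

def summary_columns_ok(header_line: str) -> bool:
    """Return True if every expected column appears in the
    tab-separated header line of the summary file."""
    columns = set(header_line.rstrip("\n").split("\t"))
    return EXPECTED_COLUMNS <= columns
```

Running this on the first line of the new summary file would flag a renamed or removed column early, instead of failing deep inside the workflow.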
We might have to calculate how much time/memory/cores the jobs need depending on the size of the input dataset, so that jobs don't get terminated when users have many samples.
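Such scaling could be as simple as a function of the sample count that the Snakefiles call when declaring resources. An illustrative sketch only; the base values and growth factors are made-up placeholders, not measured numbers:

```python
# Illustrative resource scaling: grow memory and runtime with the
# number of samples so the scheduler does not kill large runs.
# base_mem_gb / base_minutes and the growth rates are placeholders.
def scale_resources(n_samples: int, base_mem_gb: int = 8, base_minutes: int = 60):
    """Linear scaling with a hard floor; returns (mem_gb, minutes)."""
    mem_gb = max(base_mem_gb, base_mem_gb + n_samples // 10)
    minutes = max(base_minutes, base_minutes + 5 * n_samples)
    return mem_gb, minutes
```

The real growth rates would have to be fitted from benchmark runs at a few dataset sizes.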
Snakemake versions 7.20 and above try to parse the runtime resource as a human-readable string. This breaks MAGinator, since runtime here is defined as XX:XX:XX, which Snakemake cannot parse as human-readable. It works with Snakemake 7.19.
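One possible fix, sketched under the assumption that the XX:XX:XX strings are hours:minutes:seconds, is to convert them to integer minutes (the unit Snakemake's `runtime` resource expects) before handing them to newer Snakemake versions. This is not the actual MAGinator patch, just one way it could look:

```python
def hms_to_minutes(runtime: str) -> int:
    """Convert an 'HH:MM:SS' string to whole minutes (rounded up),
    the unit Snakemake's 'runtime' resource expects."""
    hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return -(-total_seconds // 60)  # ceiling division
```

Passing plain integers sidesteps the human-readable-string parser entirely, so the workflow would run on both old and new Snakemake.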
Hello,
I was wondering, is it possible to have it work with viruses to create vMAGs?
Best,
In MAGinator/maginator/workflow/pileup.Snakefile:
can this be changed so that the rule does not rerun every time?
bam=os.path.join(WD, 'mapped_reads', 'bams', 'gene_counts_{sample}.bam')
Currently it reruns every time with the message: 'Params have changed since last execution'
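A common cause of this message is a params entry that is a Python object rather than a plain string: its repr can embed a memory address that differs between invocations, so Snakemake records a "changed" value on every run. A path built once from literal components, as below, is stable across runs. This is a sketch of the principle, not the actual pileup.Snakefile code:

```python
import os

# A param that is a plain, fully-resolved string produces the same
# recorded value on every Snakemake invocation, so it cannot trigger
# the 'Params have changed' rerun by itself.
def stable_param(wd: str, sample: str) -> str:
    return os.path.join(wd, "mapped_reads", "bams", f"gene_counts_{sample}.bam")
```

As a workaround on the user side, recent Snakemake versions also accept `--rerun-triggers mtime`, which restores the old behaviour of rerunning only on file-modification times.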
Hi Jakob !
Trying to run the handy Snakemake workflow from reads to bins in MAGinator, but I soon hit this problem, which seems to be an odd Python syntax error:
snakemake --use-conda -s MAGinator/maginator/recommended_workflow/reads_to_bins.Snakefile --resources mem_gb=180 --config reads=reads.csv --cores 10 --printshellcmds
returns:
TabError in file <string>, line 308:
inconsistent use of tabs and spaces in indentation (<string>, line 308)
File "/home/projects/ku_00041/apps/jnesme/miniconda3/envs/maginator/lib/python3.12/tokenize.py", line 541, in _generate_tokens_from_c_tokenizer
File "/home/projects/ku_00041/apps/jnesme/miniconda3/envs/maginator/lib/python3.12/tokenize.py", line 537, in _generate_tokens_from_c_tokenizer
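The TabError means some line in the Snakefile indents with both tabs and spaces. A rough helper to locate such lines (Python's real consistency rule is slightly stricter, but mixing within one line's indent is the usual culprit):

```python
def find_mixed_indentation(text: str):
    """Return 1-based line numbers whose leading whitespace mixes
    tabs and spaces -- the usual cause of Python's TabError."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            bad.append(lineno)
    return bad
```

Running it over the Snakefile contents would point at the offending line so the indentation can be normalised to spaces.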
Do you have any idea if this is a simple fix, or should I just run the recommended workflow stepwise?
Best,
Joseph.
We should make a small tutorial on how to create the clusters.tsv file using other binners.
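Until the tutorial exists, the conversion can be sketched: VAMB's clusters.tsv is a headerless, two-column, tab-separated file with the cluster name first and the contig name second, so output from another binner just needs to be reshaped into that layout. The naming conventions MAGinator additionally expects (e.g. sample prefixes in contig names) are an assumption here and would need checking against the docs:

```python
# Hedged sketch: turn a generic contig -> bin mapping from another
# binner (e.g. a MetaBAT2 contig2bin table) into the two-column,
# tab-separated, headerless layout that VAMB's clusters.tsv uses.
def to_clusters_tsv(contig_to_bin: dict) -> str:
    lines = [f"{bin_name}\t{contig}" for contig, bin_name in sorted(contig_to_bin.items())]
    return "\n".join(lines) + "\n"
```

Writing the returned string to clusters.tsv would then let MAGinator consume bins from binners other than VAMB.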
If VAMB does not create any bins for a sample, then the following error occurs:
"ERROR: Sample names in read file do not match those in the VAMB clusters.tsv file"
Instead of requiring the user to remove the sample from the reads file, do this automatically.
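The automatic filtering could look like the sketch below, assuming the reads file is comma-separated with the sample name in the first field (an assumption about the format, to be checked against the parser MAGinator actually uses):

```python
# Drop reads-file rows whose sample has no bins in the VAMB clusters,
# instead of erroring out. Assumes sample name = first CSV field.
def drop_unbinned_samples(reads_rows, binned_samples):
    """Return (kept_rows, dropped_sample_names)."""
    kept, dropped = [], []
    for row in reads_rows:
        sample = row.split(",")[0]
        if sample in binned_samples:
            kept.append(row)
        else:
            dropped.append(sample)
    return kept, dropped
```

Logging the dropped sample names would keep the current error message's information without aborting the run.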
Hi, I would like to run MAGinator on a pretty large dataset. I have around 420 samples, with 60 bins per sample on average, and the preprocessed reads are around 6 GB per sample.
I have been running a subset of the samples (5) as a trial run on a cluster (40ppn and 180GB), and it has been running for more than 24 hours already.
Is there any possibility to run MAGinator in parallel to speed up the process? I am running the following command:
maginator -v trial/maginator_clusters.tsv \
  -r trial/maginator_reads.csv \
  -c trial/maginator_contigs.fasta \
  -o trial/maginator \
  -g /home/people/pablop/workdir/databases/gtdb_release207_v2
bin/run_maginator.sh
Thank you,
Pablo
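For planning a run this size, a back-of-the-envelope extrapolation from the 5-sample trial can help, under the assumption that wall-clock time scales roughly linearly with sample count (which mapping-heavy steps often follow, though cross-sample steps may scale worse):

```python
# Rough wall-clock estimate from a trial run, assuming approximately
# linear scaling in the number of samples. Only an estimate; actual
# scaling depends on which workflow steps dominate.
def extrapolate_hours(trial_samples: int, trial_hours: float, total_samples: int) -> float:
    return trial_hours * total_samples / trial_samples
```

With 5 samples taking over 24 hours, the linear estimate for 420 samples is on the order of 2000 hours on the same resources, which is why increasing cores or running on a cluster backend matters here.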