russel88 / maginator

MAGinator - Accurate SNV calling and profiling of MAGs

License: MIT License

Python 78.55% Shell 0.14% R 21.31%

maginator's Issues

New GTDB version

We should update to GTDB version 214 and the newest GTDB-Tk version, but first check that the output formats have not changed.

Adaptive resource management

We may need to calculate how much time, memory, and how many cores each job needs from the size of the input dataset, so that jobs are not terminated when a user has many samples.
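One way to picture this is a small helper that grows the request linearly with the number of samples. This is only a sketch: the function name, the per-sample increments, and the 180 GB cap are hypothetical placeholders, not values taken from MAGinator, and real scaling factors would have to be measured. Snakemake does allow resources to be given as callables, so a helper of this kind could in principle be wired into a rule.

```python
import math


def scaled_resources(n_samples, base_mem_gb=10, mem_gb_per_sample=0.5,
                     base_minutes=60, minutes_per_sample=15, max_mem_gb=180):
    """Return memory and runtime requests that grow linearly with sample count.

    All default numbers are illustrative assumptions, not MAGinator settings.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Memory grows with the dataset but is capped at what a node can offer.
    mem_gb = min(max_mem_gb, math.ceil(base_mem_gb + mem_gb_per_sample * n_samples))
    # Runtime is left uncapped here; a scheduler limit could be added the same way.
    runtime_min = math.ceil(base_minutes + minutes_per_sample * n_samples)
    return {"mem_gb": mem_gb, "runtime_min": runtime_min}
```

For example, scaled_resources(420) would request the capped 180 GB rather than an amount no node can provide.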

Snakemake 7.20.x error with runtime

Snakemake version 7.20 and above tries to parse the runtime as a human-readable string. This breaks MAGinator, because the runtime here is defined as XX:XX:XX, which Snakemake does not recognise as human-readable. It works with Snakemake 7.19.
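A possible workaround is to convert the XX:XX:XX value to a plain number of minutes before handing it to Snakemake. The sketch below assumes the installed Snakemake release accepts runtime as an integer number of minutes, which should be checked against its documentation; the function name hms_to_minutes is hypothetical and not part of MAGinator.

```python
import math


def hms_to_minutes(runtime):
    """Convert an 'HH:MM:SS' string to whole minutes, rounding up."""
    parts = runtime.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected HH:MM:SS, got {runtime!r}")
    hours, minutes, seconds = (int(part) for part in parts)
    total = hours * 60 + minutes + seconds / 60
    # Round up so the job never gets less time than requested, and never zero.
    return max(1, math.ceil(total))
```

Rounding up rather than down errs on the side of the job not being killed a few seconds early.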

Is it working with Viruses?

Hello,

I was wondering whether it is possible to make it work with viruses, to create vMAGs?

Best,

Avoid rerunning the rule that produces BAM files

In MAGinator/maginator/workflow/pileup.Snakefile:

Change this to ancient so that it does not rerun every time?
bam=os.path.join(WD, 'mapped_reads', 'bams', 'gene_counts_{sample}.bam')

Currently it reruns every time with the message: 'Params have changed since last execution'

Snakemake workflow "inconsistent use of tabs and spaces in indentation" error ?

Hi Jakob!
I am trying to run the handy Snakemake workflow from reads to bins in MAGinator, but I soon hit this problem, which seems to be an odd Python syntax error:

snakemake --use-conda -s MAGinator/maginator/recommended_workflow/reads_to_bins.Snakefile --resources mem_gb=180 --config reads=reads.csv --cores 10 --printshellcmds

returns :

TabError in file <string>, line 308:
inconsistent use of tabs and spaces in indentation (<string>, line 308)
  File "/home/projects/ku_00041/apps/jnesme/miniconda3/envs/maginator/lib/python3.12/tokenize.py", line 541, in _generate_tokens_from_c_tokenizer
  File "/home/projects/ku_00041/apps/jnesme/miniconda3/envs/maginator/lib/python3.12/tokenize.py", line 537, in _generate_tokens_from_c_tokenizer

Do you have any idea whether this is a simple fix, or should I just run the recommended workflow stepwise?
Best,
Joseph.
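To narrow down where the indentation goes wrong, a short script can list every line of the Snakefile whose leading whitespace contains a tab. This is a generic sketch, not a MAGinator tool, and the function name is made up. Note that the line number in the error refers to the code Snakemake compiles (reported as <string>), so it will probably not match the Snakefile line number directly.

```python
def find_tab_indentation(text):
    """Map 1-based line numbers to 'tabs' or 'mixed' for tab-indented lines."""
    hits = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip would remove from the front.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" not in indent:
            continue
        hits[lineno] = "mixed" if " " in indent else "tabs"
    return hits


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for lineno, kind in find_tab_indentation(handle.read()).items():
            print(f"{sys.argv[1]}:{lineno}: {kind} indentation")
```

Run it with the path to reads_to_bins.Snakefile as the only argument; if the rest of the file is indented with spaces, the reported lines are the likely culprits.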

How to use MAGinator without VAMB

We should make a small tutorial on how to create the clusters.tsv file using other binners.
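As a starting point for such a tutorial, the sketch below builds a clusters.tsv from a directory holding one FASTA file per bin. It assumes the file is a headerless, tab-separated table with the cluster name in the first column and the contig name in the second, as VAMB writes it; whether MAGinator needs anything more (for example contig names prefixed with the sample name) is not covered here. The function names and the .fa suffix are hypothetical.

```python
from pathlib import Path


def contig_names(fasta_path):
    """Return the first word of every FASTA header line in the file."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.append(fields[0])
    return names


def write_clusters_tsv(bin_dir, out_path, suffix=".fa"):
    """Write 'bin<TAB>contig' rows for every bin FASTA; return the row count."""
    n_rows = 0
    with open(out_path, "w") as out:
        # The file name without its suffix is used as the cluster name.
        for fasta in sorted(Path(bin_dir).glob(f"*{suffix}")):
            for contig in contig_names(fasta):
                out.write(f"{fasta.stem}\t{contig}\n")
                n_rows += 1
    return n_rows
```

Calling write_clusters_tsv("bins", "clusters.tsv") would then produce one row per contig, grouped by bin.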

Allow samples with no bins

If VAMB does not create any bins for a sample, then the following error occurs:
"ERROR: Sample names in read file do not match those in the VAMB clusters.tsv file"

Instead of requiring the user to remove the sample from the reads file, do this automatically.
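The automatic filtering could look roughly like the sketch below. It rests on two assumptions that would need checking against MAGinator itself: that the sample name is the first comma-separated field of each line in the reads file, and that contig names in clusters.tsv start with the sample name followed by the separator "C" (VAMB's usual convention). Both function names are hypothetical.

```python
def samples_with_bins(cluster_lines, separator="C"):
    """Collect sample names found as contig prefixes in clusters.tsv lines."""
    samples = set()
    for line in cluster_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        # Assumption: contig is '<sample><separator><contig id>'.
        # A sample name that itself contains the separator would be cut short.
        sample, found, _ = fields[1].partition(separator)
        if found:
            samples.add(sample)
    return samples


def split_reads_by_bins(read_lines, samples):
    """Split reads-file lines into (kept, dropped) by whether the sample has bins."""
    kept, dropped = [], []
    for line in read_lines:
        if not line.strip():
            continue
        sample = line.split(",")[0].strip()
        (kept if sample in samples else dropped).append(line)
    return kept, dropped
```

Returning the dropped lines as well makes it easy to warn the user about which samples were skipped instead of failing with an error.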

Parallelization

Hi, I would like to run MAGinator on a fairly large dataset. I have around 420 samples, with 60 bins per sample on average, and the preprocessed reads are around 6 GB per sample.

I have been running a subset of the samples (5) as a trial run on a cluster (40ppn and 180GB), and it has been running for more than 24 hours already.

Is there any possibility to run MAGinator in parallel to speed up the process? I am running the following command:

maginator -v trial/maginator_clusters.tsv \
    -r trial/maginator_reads.csv \
    -c trial/maginator_contigs.fasta \
    -o trial/maginator \
    -g /home/people/pablop/workdir/databases/gtdb_release207_v2

Thank you,
Pablo
