maginator's People

Contributors

russel88, trinezac

Forkers

pabloati

maginator's Issues

New GTDB version

We should update to GTDB version 214 and the newest GTDB-tk version, but first check that the output formats have not changed.

Adaptive resource management

We might have to calculate how much time/memory/cores the jobs need depending on the size of the input dataset, such that jobs don't get terminated if some users have many samples.
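One way this could look is a small helper that scales memory and runtime with the total input size. This is only a sketch: the function names and all constants (base memory, GB of RAM per GB of input, minutes per GB) are hypothetical placeholders that would need to be calibrated against real MAGinator runs, and the `attempt` multiplier assumes the job is retried with more resources after a failure.

```python
import math
import os


def estimate_from_bytes(total_bytes, attempt=1,
                        base_mem_gb=4, mem_gb_per_input_gb=2.0,
                        base_minutes=60, minutes_per_input_gb=30):
    """Scale memory and runtime linearly with input size.

    All default constants are hypothetical and must be tuned per rule.
    """
    input_gb = total_bytes / 1e9
    mem_gb = (base_mem_gb + mem_gb_per_input_gb * input_gb) * attempt
    runtime_min = (base_minutes + minutes_per_input_gb * input_gb) * attempt
    return {"mem_gb": math.ceil(mem_gb), "runtime_min": math.ceil(runtime_min)}


def estimate_from_files(paths, attempt=1):
    """Sum the on-disk size of the input files and estimate resources."""
    total_bytes = sum(os.path.getsize(p) for p in paths)
    return estimate_from_bytes(total_bytes, attempt=attempt)
```

In a Snakefile, such a helper could be called from a resource callable (for example `lambda wildcards, input, attempt: ...`), so that larger datasets and retried jobs request more resources automatically.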

Snakemake 7.20.x error with runtime

Snakemake 7.20 and above tries to parse the runtime resource as a human-readable string. This breaks MAGinator, because the runtime is defined here as XX:XX:XX, which Snakemake does not accept as human-readable. It works with Snakemake 7.19.
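A possible workaround is to convert the XX:XX:XX string to a number of minutes before handing it to Snakemake. The sketch below assumes the string is always HH:MM:SS and that a plain integer number of minutes is an acceptable runtime value; the function name `runtime_to_minutes` is made up for illustration and is not part of MAGinator.

```python
def runtime_to_minutes(runtime):
    """Convert an 'HH:MM:SS' string to whole minutes, rounding up."""
    parts = runtime.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected HH:MM:SS, got {runtime!r}")
    hours, minutes, seconds = (int(p) for p in parts)
    total_seconds = hours * 3600 + minutes * 60 + seconds
    # Ceiling division so that a job never gets less time than requested.
    return -(-total_seconds // 60)
```

Rounding up rather than down avoids requesting slightly less time than the original setting.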

Avoid rerunning the rule that produces BAM files

In MAGinator/maginator/workflow/pileup.Snakefile:

Should this be changed to ancient so that the rule does not rerun every time?
bam=os.path.join(WD, 'mapped_reads', 'bams', 'gene_counts_{sample}.bam')

Currently it reruns every time with the message: 'Params have changed since last execution'

Snakemake workflow "inconsistent use of tabs and spaces in indentation" error?

Hi Jakob!
I am trying to run the handy reads-to-bins Snakemake workflow in MAGinator, but I soon hit this problem, which looks like an odd Python syntax error:

snakemake --use-conda -s MAGinator/maginator/recommended_workflow/reads_to_bins.Snakefile --resources mem_gb=180 --config reads=reads.csv --cores 10 --printshellcmds

returns :

TabError in file <string>, line 308:
inconsistent use of tabs and spaces in indentation (<string>, line 308)
  File "/home/projects/ku_00041/apps/jnesme/miniconda3/envs/maginator/lib/python3.12/tokenize.py", line 541, in _generate_tokens_from_c_tokenizer
  File "/home/projects/ku_00041/apps/jnesme/miniconda3/envs/maginator/lib/python3.12/tokenize.py", line 537, in _generate_tokens_from_c_tokenizer

Do you have any idea whether this is a simple fix, or should I just run the recommended workflow stepwise?
Best,
Joseph.
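A quick way to narrow this down is to list the lines of the Snakefile whose indentation contains a tab. The sketch below assumes the file is otherwise indented with spaces; the helper name is hypothetical. Note that the line number in the error refers to `<string>`, so it may not match the line number in the Snakefile itself.

```python
def find_tab_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip removes from the front.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(lineno)
    return hits


def check_file(path):
    """Read a workflow file and report its tab-indented lines."""
    with open(path, encoding="utf-8") as handle:
        return find_tab_indented_lines(handle.read())
```

Calling `check_file` on the path to reads_to_bins.Snakefile would give the candidate lines to re-indent with spaces.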

Allow samples with no bins

If VAMB does not create any bins for a sample, then the following error occurs:
"ERROR: Sample names in read file do not match those in the VAMB clusters.tsv file"

Instead of requiring the user to remove the sample from the reads file, MAGinator should do this automatically.
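The automatic filtering could be as simple as the sketch below. It assumes the reads file has already been parsed into rows with the sample name in the first column, and that the set of samples with at least one bin has been collected from the VAMB clusters file; how that set is derived is left out because it depends on the contig naming scheme. The function name is hypothetical.

```python
def split_samples(read_rows, samples_with_bins):
    """Separate reads-file rows into kept rows and dropped sample names.

    read_rows: list of rows, each a list with the sample name first
               (assumed layout of the reads file).
    samples_with_bins: set of sample names that have at least one bin.
    """
    kept, dropped = [], []
    for row in read_rows:
        if not row:
            continue  # skip empty lines
        if row[0] in samples_with_bins:
            kept.append(row)
        else:
            dropped.append(row[0])
    return kept, dropped
```

Reporting the dropped sample names in a warning, rather than silently discarding them, would let users see which samples produced no bins.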

Parallelization

Hi, I would like to run MAGinator on a pretty large data set. I have around 420 samples, with 60 bins per sample on average, and the preprocessed reads are around 6 GB per sample.

I have been running a subset of the samples (5) as a trial run on a cluster (40ppn and 180GB), and it has been running for more than 24 hours already.

Is there any possibility to run MAGinator in parallel to speed up the process? I am running the following command:

maginator -v trial/maginator_clusters.tsv \
  -r trial/maginator_reads.csv \
  -c trial/maginator_contigs.fasta \
  -o trial/maginator \
  -g /home/people/pablop/workdir/databases/gtdb_release207_v2

Thank you,
Pablo
