MetAssemble

Content

  1. Overview

  2. Dependencies

  3. Installation

  4. Usage

Overview
========

MetAssemble is a pipeline that runs several metagenomic assembly strategies, combining Velvet, Meta-Velvet, Minimus2, Ray and Bambus2, on Illumina paired-end reads. The pipeline was originally developed to validate the performance of the individual strategies, but it can also run the assembly strategies without validation. It is written in GNU make and is not very user friendly, but if you are familiar with GNU make you shouldn't have too much trouble getting it to run. The only other metagenomic assembly pipeline that I am aware of is metAMOS, which seems to aim for a more user-friendly approach, if that is what you are looking for.

A reason for using MetAssemble instead is that it lets you schedule parts of the assembly pipeline with sbatch or qsub. Different steps in the pipeline require different resources: Velvet, for instance, runs on only one node, whereas Ray runs over multiple nodes. MetAssemble allows you to specify resource usage per rule with gnu-make-job-scheduler. Furthermore, GNU make ensures that intermediate output files that already exist don't have to be recomputed when the pipeline is restarted after an error.

Dependencies
============

Dependencies must be installed manually; there is no automated way to do this at the moment. You can, however, check whether the dependencies are met by running

    bash test/dependencies/test_dependencies.sh
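
If you only plan to run a subset of the assemblies, a quick manual check can be handy next to the bundled script. The sketch below only tests whether the given executables can be found on PATH. The function name check_deps and the tool names in the example call are assumptions for illustration; the actual executable names depend on how each assembler was installed, and test/dependencies/test_dependencies.sh remains the authoritative check.

```shell
# Hypothetical helper (not part of MetAssemble): report which of the given
# commands are missing from PATH. Returns 0 if all are found, 1 otherwise.
check_deps() {
    _deps_missing=0
    for _dep in "$@"; do
        if ! command -v "$_dep" >/dev/null 2>&1; then
            echo "missing: $_dep" >&2
            _deps_missing=1
        fi
    done
    return "$_deps_missing"
}

# Example call; the tool names are assumed, adjust them to your installation:
#   check_deps make velveth velvetg Ray
```

The function prints one line per missing command to stderr, so it can be used in a script as: check_deps make || exit 1.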

Note that you do not need to install every program if you only want to run a subset of the assemblies that MetAssemble covers. To perform all the different assemblies, MetAssemble requires the following:

Supported input:

  • Illumina fastq CASAVA v1.8 paired end reads

Running the MetAssemble pipeline (scripts/Makefile) requires

  • GNU make (tested on v3.81)

The Makefile features four steps of the metagenomic assembly pipeline:

  1. Read processing

  2. Assembling contigs

  3. Merging contigs

    • With cd-hit and minimus2. See Angus.
    • Cut up contigs and merge with Newbler RunAssembly 2.6
      • scripts/process-reads/cut-up-fasta.py requires Biopython
      • Newbler RunAssembly 2.6 (COMMERCIAL)
  4. Scaffolding

    • Construct linkage information by mapping reads to contigs
    • Scaffold contigs
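
To give an idea of what "cutting up contigs" in the merging step means, here is a minimal sketch that splits every sequence of a FASTA stream into consecutive fixed-length pieces. The function name cut_up_fasta, the default chunk length and the naming scheme for the pieces are all assumptions; how scripts/process-reads/cut-up-fasta.py actually cuts (including any overlap between pieces) is not described here, so consult the script itself before relying on this.

```shell
# Hypothetical illustration (not the MetAssemble script): read FASTA on stdin
# and write each sequence as consecutive pieces of at most CHUNK bases.
# Usage: cut_up_fasta [CHUNK] < contigs.fasta   (CHUNK defaults to 1000, assumed)
cut_up_fasta() {
    awk -v size="${1:-1000}" '
        # Print the sequence collected so far as numbered pieces.
        function emit(    i, n, part) {
            if (name == "") return
            n = length(seq)
            part = 0
            for (i = 1; i <= n; i += size) {
                printf(">%s.%d\n%s\n", name, part++, substr(seq, i, size))
            }
        }
        # Header line: flush the previous record, keep the first word as name.
        /^>/ { emit(); name = substr($1, 2); seq = ""; next }
        # Sequence line: join wrapped lines into one string.
        { seq = seq $0 }
        END { emit() }
    '
}
```

For example, printf '>c1\nACGTACGT\n' | cut_up_fasta 3 writes the pieces c1.0 (ACG), c1.1 (TAC) and c1.2 (GT).
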
Installation
============

After installing all the dependencies, point the METASSEMBLE_DIR environment variable to the root directory of this repository, e.g. export METASSEMBLE_DIR="$HOME/gitrepos/metassemble" (a ~ inside quotes is not expanded by the shell, so use $HOME or leave the path unquoted). You can do a test run with cd test && make test, which downloads a small data set from the HMP project and runs a subset of the assembly strategies in the MetAssemble pipeline.
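
Before starting a long run it can help to confirm that METASSEMBLE_DIR really points at the repository. The sketch below is an assumption-laden convenience, not part of MetAssemble: the function name check_metassemble_dir is made up, and it only checks for scripts/Makefile, the pipeline Makefile mentioned in the dependency list above.

```shell
# Hypothetical helper (not part of MetAssemble): verify that METASSEMBLE_DIR
# is set and points at a directory containing scripts/Makefile.
check_metassemble_dir() {
    if [ -z "${METASSEMBLE_DIR:-}" ]; then
        echo "METASSEMBLE_DIR is not set" >&2
        return 1
    fi
    if [ ! -f "$METASSEMBLE_DIR/scripts/Makefile" ]; then
        echo "no scripts/Makefile under: $METASSEMBLE_DIR" >&2
        return 1
    fi
    echo "METASSEMBLE_DIR looks OK: $METASSEMBLE_DIR"
}
```

A value containing a literal ~ (for instance one set inside single quotes) fails the second check, which makes that mistake easy to spot.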

Usage
=====

See the example in examples/chris-mock. There is a Makefile and a Makefile-sbatch, which set some input parameters and then include scripts/metassemble.mk and scripts/metassemble-scheduler.mk respectively. Hopefully that is clear enough to help you run your own subset of the available assembly strategies. If you want to change the resource usage per rule, change Makefile-sbatch accordingly. In the future I might add automatic computation of the resource usage. For assembly this is unfortunately still a problem, since resource usage depends on the complexity of your sample and not just the file size. The specified resource usage is for a library of ~1M and a mixed community of 60 bacteria and archaea.

To see which assemblies have been created:

    make echoexisting

All assemblies, created or not:

    make echoall

To create all:

    make all

Only show commands:

    make -n all

Only make velvet:

    make velvet

Schedule rules with sbatch:

    make -f Makefile-sbatch all

For more rules, see the scripts/parameters.mk file.
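
If the same project directory is used both on a cluster and on a workstation, a small wrapper can choose between the two Makefiles from the example. This is only a sketch: the function name pick_makefile is hypothetical, and it assumes that the presence of the sbatch command on PATH is a good enough signal that jobs should be scheduled.

```shell
# Hypothetical helper (not part of MetAssemble): print the Makefile to use,
# depending on whether a scheduler command (default: sbatch) is on PATH.
pick_makefile() {
    _scheduler_cmd="${1:-sbatch}"
    if command -v "$_scheduler_cmd" >/dev/null 2>&1; then
        echo "Makefile-sbatch"
    else
        echo "Makefile"
    fi
}

# Preview the commands first, then run for real once they look right:
#   make -f "$(pick_makefile)" -n all
#   make -f "$(pick_makefile)" all
```

Running the -n variant first costs nothing and shows exactly which rules would be executed or submitted.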

Contributors

  • inodb
