nf-core/neutronstar

De novo assembly pipeline for 10X linked-reads.

⚠️ Important note

Due to the discontinuation of the primary data source (10X Chromium) for this pipeline, it is now archived. This means that it will no longer be updated.

Table of Contents

  1. Introduction
  2. Important installation information
  3. Usage instructions
  4. Pipeline output
  5. Pipeline overview
  6. Credits

Introduction

nf-core/neutronstar is a bioinformatics best-practice analysis pipeline for de novo assembly and quality control of 10x Genomics Chromium data. The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.

Quick Start

i. Install Nextflow

ii. Install one of Docker, Singularity or Conda

iii. Download the pipeline and test it on a minimal dataset with a single command

nextflow run nf-core/neutronstar -profile test,<docker/singularity/conda>

iv. Start running your own analysis!

nextflow run nf-core/neutronstar -profile <docker/singularity/conda> --id assembly_id --fastqs fastq_path --genomesize 1000000

See usage docs for all of the available options when running the pipeline.
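
When the run is launched from a script rather than typed by hand, the command can be assembled programmatically. The following is a minimal sketch in Python that only uses the options shown in the Quick Start (`-profile`, `--id`, `--fastqs`, `--genomesize`). The helper name `build_neutronstar_command` is hypothetical and not part of the pipeline.

```python
import shlex


def build_neutronstar_command(profile, assembly_id, fastqs, genomesize):
    """Return the argument list for the Quick Start `nextflow run` call.

    Hypothetical helper: only the options shown in the Quick Start are used.
    """
    return [
        "nextflow", "run", "nf-core/neutronstar",
        "-profile", profile,
        "--id", assembly_id,
        "--fastqs", fastqs,
        "--genomesize", str(genomesize),
    ]


cmd = build_neutronstar_command("docker", "assembly_id", "fastq_path", 1000000)
# Print a shell-safe version of the command for inspection or logging.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual paths.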

Disclaimer

This software is in no way affiliated with nor endorsed by 10x Genomics.

Pipeline overview

nf-core/neutronstar chart

Credits

nf-core/neutronstar was originally written by Remi-Andre Olsen (@remiolsen).

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on Slack (you can join with this invite).

Citation

If you use nf-core/neutronstar for your analysis, please cite it using the following doi:

You can cite the nf-core pre-print as follows:

Ewels PA, Peltzer A, Fillinger S, Alneberg JA, Patel H, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. nf-core: Community curated bioinformatics pipelines. bioRxiv. 2019. p. 610741. doi: 10.1101/610741.

neutronstar's Issues

Tigmint integration

Use tigmint for assembly error correction using linked-read information, usually before ar{c,k}s scaffolding.

Same use cases as related issue #16.

Arcs integration

Have a think about integrating EMA mapping + Arcs (or EMA barcode whitelisting + Arks) to do additional and more aggressive scaffolding than Supernova does.

With the coming of DSL 2, it could be a good idea to write this out as a sub-workflow with its own entry point. A common use case would also be running these tools on an existing long-read assembly (e.g. PacBio).

These tools exist on bioconda! 👍

Remi's master to-do list

From @remiolsen on December 11, 2017 17:4

For transparency, here's my "design document"


Main requirements

  • Nextflow? yes
  • Supernova running in $SNIC_TMP (Irma compatible?)
    • 1.20 compatible — multiple input parameter assemblies
    • Use Nextflow's publishDir instead of rsync: couldn't make it work. Use rsync!
    • Make this optional
  • Rsync supernova assembly back to workdir
  • supernova mkoutput - pseudohap, megabubbles
    • gunzip
    • parameter of additional outputs — always output .phased.fasta
    • parameter for minimum length
  • QUAST
    • make it run on Irma
  • BUSCO
    • UPPMAX — beforeScript
  • MultiQC
    • Needs testing
  • support for --no-preflight flag
  • Documentation
    • Readme.md
  • dump software versions & commands that were run
  • Send mail when the pipeline is finished
  • Clean up and generalize the configs
    • Common HPC config
    • Common Uppmax config
    • Make a general local run config
  • Release tags

Docker / Singularity

  • Supernova (copyright issues?)
  • Quast
  • BUSCO
  • Script for automatic singularity/docker download / installation

NX script

  • input configuration:
    • id
      • fastqs
      • sample
      • maxreads
      • bcfrac
    • genomesize
  • memory parameter
  • cpu parameter
  • make Longranger / fastqc optional
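
One possible reading of the input configuration layout above, sketched as Python dataclasses. The class names `AssemblyInput` and `RunConfig` are hypothetical, and placing `genomesize`, memory and CPU at the run level (rather than per assembly) is an assumption based on the list indentation, not something the design document confirms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssemblyInput:
    """One assembly entry, keyed by its id (hypothetical structure)."""
    id: str
    fastqs: str
    sample: Optional[str] = None
    maxreads: Optional[int] = None
    bcfrac: Optional[float] = None


@dataclass
class RunConfig:
    """Run-level settings shared by all assemblies (assumed placement)."""
    genomesize: int
    assemblies: List[AssemblyInput] = field(default_factory=list)
    memory: Optional[str] = None  # units are not specified in the source
    cpus: Optional[int] = None


config = RunConfig(
    genomesize=1000000,
    assemblies=[AssemblyInput(id="assembly_id", fastqs="fastq_path")],
)
print(config)
```

Keeping the per-assembly fields in their own class makes it straightforward to hold several assemblies in one run, which matches the "multiple input parameter assemblies" requirement listed earlier.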

Input_validation

  • id — only numbers, letters, dash, and underscore allowed
  • bcfrac (0,1)
  • maxreads - num
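
The three rules above could be checked along these lines. This is a sketch, not the pipeline's actual validation code: the function name `validate_inputs` is hypothetical, `(0,1)` is read here as the open interval (adjust the comparison if the endpoints should be allowed), and "num" is taken to mean any integer or float.

```python
import re

# Only numbers, letters, dash, and underscore are allowed in the id.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def validate_inputs(assembly_id, bcfrac=None, maxreads=None):
    """Return a list of error messages; an empty list means the input passed."""
    errors = []
    if not ID_PATTERN.fullmatch(assembly_id):
        errors.append("id: only numbers, letters, dash, and underscore allowed")
    # Assumption: (0,1) is an open interval, so 0 and 1 themselves are rejected.
    if bcfrac is not None and not (0 < bcfrac < 1):
        errors.append("bcfrac: must be between 0 and 1")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if maxreads is not None and (
        isinstance(maxreads, bool) or not isinstance(maxreads, (int, float))
    ):
        errors.append("maxreads: must be a number")
    return errors


print(validate_inputs("sample_1-A", bcfrac=0.5, maxreads=1200000000))
print(validate_inputs("bad id!", bcfrac=1.5, maxreads="many"))
```

Returning all errors at once, instead of raising on the first one, lets a user fix every problem in a single pass before a long assembly run is submitted.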

MultiQC

  • Fix when having empty molecule.yaml files
  • Does having “ASSEMBLER_CS” folders break multiqc?
  • Fix QUAST module. It breaks when running with -s option

Testing

  • Test data from NA12878 run.
  • Travis-CI integration

Could haves

  • Tigmint evaluation
  • Delivery template mail / output folder structure
  • BWA align
    • picard-tools
    • remove dups
    • collectinsertsize
  • qaTools-singularity
  • FRC-singularity
  • BUSCOv2 datasets in config
    • auto-script to download datasets

Copied from original issue: SciLifeLab#3
