
quicksand's Introduction


quicksand

quick analysis of sedimentary ancient DNA

See the documentation for a comprehensive description of the pipeline.

Description

quicksand is a bioinformatic pipeline for the analysis and taxonomic binning of (target-enriched) ancient, mitochondrial, sedimentary DNA. quicksand uses krakenuniq for metagenomic classification and BWA for mapping DNA sequences, and it analyses the mapped sequences for DNA deamination patterns.

Optimized for speed and portability, quicksand is written in Nextflow and requires either Singularity or Docker.

Workflow

Graphical representation of the pipeline workflow

Quickstart

Requirements

To run the pipeline, please install Nextflow and either Singularity or Docker.
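A quick way to confirm that Nextflow and a container engine (Singularity or Docker) are available is sketched below. The helper name `have` is made up for this example, and the sketch assumes the tools are installed under their usual command names (`nextflow`, `singularity`, `docker`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given command is found on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Nextflow is required to launch the pipeline.
if have nextflow; then
  echo "nextflow: found"
else
  echo "nextflow: missing"
fi

# One container engine is enough.
if have singularity || have docker; then
  echo "container engine: found"
else
  echo "container engine: missing (install Singularity or Docker)"
fi
```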

Input

The pipeline accepts demultiplexed, adapter-trimmed and overlap-merged bam and fastq files. Put all files in one directory and name them DIR/{READGROUP}.{bam,fastq}. Provide the directory with the --split flag.
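The naming scheme can be set up with a small helper like the one below. This is only a sketch: `stage_input` is a hypothetical function, not part of quicksand, and it simply copies a file into the input directory under the name {READGROUP}.bam or {READGROUP}.fastq.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one file into the input directory, named after
# its readgroup. Only .bam and .fastq files are accepted.
# Usage: stage_input DIR READGROUP FILE
stage_input() {
  local dir="$1" readgroup="$2" file="$3"
  local ext="${file##*.}"   # text after the last dot
  case "$ext" in
    bam|fastq) ;;
    *) echo "unsupported file type: $file" >&2; return 1 ;;
  esac
  mkdir -p "$dir"
  cp "$file" "$dir/$readgroup.$ext"
}
```

For example, `stage_input split LIB001 /path/to/library.bam` (readgroup and path are placeholders) would create `split/LIB001.bam`, and `split/` can then be passed to `--split`.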

Download Datastructure

quicksand requires a kraken database for metagenomic classification, the reference genomes for mapping and a set of bed-files.

For the most recent RefSeq releases please download the quicksand-datastructure here:

latest=$(curl http://ftp.eva.mpg.de/quicksand/LATEST)
wget -r -np -nc -nH --cut-dirs=3 --reject="*index.html*" -q --show-progress -P refseq http://ftp.eva.mpg.de/quicksand/build/$latest

This step takes a while! Make yourself a coffee and relax.
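Once the download has finished, it can be worth checking that the directories used by the run command further down are present. The sketch below assumes the layout implied by that command (`kraken/Mito_db_kmer22`, `genomes` and `masked` below `refseq`); the function name `check_datastructure` is hypothetical, and the directory names may differ between releases.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the three directories referenced by
# the run command exist below the download directory (default: refseq).
# Returns 0 if all are present, 1 otherwise.
check_datastructure() {
  local root="${1:-refseq}" status=0 sub
  for sub in kraken/Mito_db_kmer22 genomes masked; do
    if [ -d "$root/$sub" ]; then
      echo "ok       $root/$sub"
    else
      echo "missing  $root/$sub"
      status=1
    fi
  done
  return "$status"
}
```

Calling `check_datastructure refseq` prints one line per directory and returns a non-zero status if any of them is missing.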

To create a custom datastructure, see the quicksand-build pipeline.

Download Test-data

To run quicksand with real data, download the Hohlenstein-Stadel mtDNA as input (please see the README for more information).

wget -P split \
http://ftp.eva.mpg.de/neandertal/Hohlenstein-Stadel/BAM/mtDNA/HST.raw_data.ALL.bam

Run quicksand

quicksand is executed directly from GitHub; no local build is required. With the databases and the test data downloaded, run the pipeline.

nextflow run mpieva/quicksand -r v2.1 \
  --db        refseq/kraken/Mito_db_kmer22/ \
  --genomes   refseq/genomes/ \
  --bedfiles  refseq/masked/ \
  --split     split/ \
  -profile    singularity

Output

Please see the documentation for a comprehensive description of the output!

References

This pipeline uses code inspired by the nf-core initiative, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

quicksand's People

Contributors

merszym


quicksand's Issues

Muridae Assignments

In the reports, Rattus norwegicus (Muridae) is always listed twice with the exact same values. The binned reads look fine - seems to be a problem with the report

Machine-readable report

Feature request:

"It would be great if all the column names could be used as R variables
like, ReadsDeam(3term) is quite a challenge"
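As a stopgap on the user side, column names can be rewritten before the report is loaded into R. The sketch below is not part of quicksand: `sanitize_name` is a hypothetical helper that replaces every run of characters other than letters, digits, "_" and "." with a single "_" and trims a trailing "_". It does not handle names that start with a digit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read names from stdin, one per line, and print a
# version that contains only letters, digits, "_" and ".".
sanitize_name() {
  sed -e 's/[^A-Za-z0-9_.]\{1,\}/_/g' -e 's/_$//'
}

printf '%s\n' 'ReadsDeam(3term)' | sanitize_name   # prints ReadsDeam_3term
```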

fixed option malfunction

Hi there,

Whenever I run quicksand with the --fixed option, the bedfilterbam workflow does not run on the fixed references, and the ReadsBedfiltered and PostBedCoveredBP output columns end up with "-".

I have tested the samples with different fixed references. When running with the "best" option, these taxa do have bed-filtered reads in the output.

best wishes and thank you in advance,
Freya

Documentation of Filters

The documentation of the filters is quite bad...

Add a section 'filtering the output' to the 'examples' or as a separate subpage in the docs
