SG variant calling assignment

All steps are implemented as a Snakemake workflow.

Dependencies

All dependencies to run the workflow can be installed using the following steps:

# Install mambaforge if no existing *conda installation
# (on Linux, adapt commands if you are on a different OS)
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
bash ./Mambaforge-Linux-x86_64.sh
# When the installer asks whether it should initialize conda by running "conda init", type "yes"

# Install workflow dependencies
conda env create -n sg-varcall-assignment -f workflow/envs/snakemake.yaml
conda activate sg-varcall-assignment

Running the workflow

To run the variant calling pipeline, place all required data files in a folder named data, next to the workflow folder. You are ready to go when your working directory looks like this:

├── data
│   ├── sample_control_R1.fastq.gz
│   ├── sample_control_R2.fastq.gz
│   └── target_regions.bed
└── workflow
    ├── config
    │   └── config.yaml
    ├── envs
    │   ├── bedtools.yaml
    │   ├── bwa.yaml
    │   ├── fastqc.yaml
    │   ├── gatk.yaml
    │   ├── picard.yaml
    │   ├── samtools.yaml
    │   └── snakemake.yaml
    ├── rules
    │   ├── map.smk
    │   ├── preproc.smk
    │   ├── qc.smk
    │   └── varcall.smk
    ├── scripts
    │   ├── eval_vars.sh
    │   ├── expand_reads.bioawk
    │   └── expand_reads.sh
    └── Snakefile
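
Before starting a run, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repository: the helper name check_inputs is hypothetical, and it only checks the input files shown in the tree plus the Snakefile.

```shell
# Hypothetical helper (not part of the workflow): verify that the input
# files for one sample exist under the project root.
# Usage: check_inputs <project_root> <sample_name>
check_inputs() {
    root="$1"
    sample="$2"
    missing=0
    for f in \
        "data/${sample}_R1.fastq.gz" \
        "data/${sample}_R2.fastq.gz" \
        "data/target_regions.bed" \
        "workflow/Snakefile"
    do
        if [ ! -e "${root}/${f}" ]; then
            # Report each missing file on stderr and remember the failure
            echo "missing: ${f}" >&2
            missing=1
        fi
    done
    # Exit status 0 if everything was found, 1 otherwise
    return "$missing"
}
```

For the example data, this could be called from the project root as: check_inputs . sample_control && echo "ready"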

Then, the whole workflow can be started:

cd workflow
snakemake -j1 --use-conda ../results/sample_control.recal.vcf

The resulting VCF must be named <sample_x>.recal.vcf, and the sequencing reads must be in data/<sample_x>_R1.fastq.gz and data/<sample_x>_R2.fastq.gz.
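
The naming rule can be illustrated with a small shell sketch that derives the target path from a forward-read file name. The helper target_for_reads is hypothetical and not part of the workflow; it assumes the reads follow the <sample_x>_R1.fastq.gz pattern and that snakemake is run from inside the workflow folder, as in the command above.

```shell
# Hypothetical helper: print the snakemake target for a sample, given
# the path to its forward-read file (<sample_x>_R1.fastq.gz).
target_for_reads() {
    reads_file=$(basename "$1")
    # Strip the _R1.fastq.gz suffix to recover the sample name
    sample="${reads_file%_R1.fastq.gz}"
    echo "../results/${sample}.recal.vcf"
}

target_for_reads data/sample_control_R1.fastq.gz
# prints: ../results/sample_control.recal.vcf
```

The printed path can then be passed to snakemake as the target. Adding the -n (dry-run) flag first shows which jobs would be executed without running them.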

Contributors

hdetering
