Git Product home page Git Product logo

gatk-workflows's Introduction

The new standard in cloud bioinformatics (WDL+Cromwell)

Prerequisites

  • A Unix-based operating system (Mac/Linux)
  • A Java 8 runtime environment (you can download java here)
  • Google Cloud SDK (you can download GCS here)
  • A MySQL database (create database cromwell)
  • On your Google project, open up the API Manager and enable the following APIs
    • Google Compute Engine
    • Google Cloud Strorage
    • Google Genomics API

Download Cromwell jar

This code tested on Cromwell v35.

# wget https://github.com/broadinstitute/cromwell/releases/download/36/cromwell-36.jar

Authorizing Google Cloud

# gcloud init

# gcloud auth login <login-id>

# gcloud auth application-default login

# gcloud config set project <google-cloud-project-id>

Download gatk-workflows for somatic

# git clone https://github.com/hongiiv/gatk-workflows.git

copy cromwell.jar

# cp cromwell-36.jar gatk-workflows/

# cd gatk-workfolws

This workflow calls another workflow, that second workflow is referred to as a sub-workflow. This workflow based on official GATK workflows

# zip -r workflowDependencies.zip ./gatk4-data-processing/ ./gatk4-somatic-snvs-indels/ ./seq-format-conversion/

Edit application options (edit google.conf)

  • MySQL database id/pw (user = your mysql id, password = your mysql p/w)
  • Google project (project = your google project name)
  • Base bucket (root = your google storage bucket name)

Run Cromwell server

# java -Dconfig.file=google.conf -jar cromwell-35.jar run hello.wdl -o generic.google-papi.options.json

Run workflow via Cromwell server

# java -Dconfig.file=google.conf -jar cromwell-35.jar \
submit FullPipeline.wdl \
-I FullPipeline.input.json \
-o generic.google-papi-options.json \
-h http://localhost:8080 \
-p workflowDependencies.zip

Run workflow via command

# java -Dconfig.file=google.conf -jar cromwell-35.jar \
run FullPipeline.wdl \
-I FullPipeline.input.json \
-o generic.google.papi-options.json \ 

Run workflow via REST API

# curl -X POST "http://localhost:8080/api/workflows/v1" \
-H "accept: application/json" -H "Content-Type: multipart/form-data" \
-F "[email protected];type=" \
-F "[email protected];type=application/json" \
-F "[email protected];type=application/zip"

Run workflow via REST API using Cromwell Server

You can access swagger web page (http://localhost:8080)

FAQ

Q: Workflow를 실행하는 방법이 여러개가 존재합니다. 어떠한 방법을 사용하는것이 좋을까요?

A: Server 모드 (REST API 또는 submit 명령)를 통해 실행하는 것이 좋습니다. 해당 workflow를 관리가 가능하기 때문입니다.

Q: 왜 GATK workflow에서는 FASTQ 파일 대신 uBAM을 기본으로 사용합니까?

A: 최근 발표된 "Standards Guidelines for Validating Next-Generation Sequencing Bioinformatics Pipelines"에 따르면 laboratory, run, patient identifiers가 파일의 메타 데이터에 기록될 것을 권고하고 있습니다. 또한 파일 이름 자체에도 이러한 식별자가 표시되어야 한다고 권고하고 있습니다. 이러한 요구사항을 충족 시키기 위해서는 uBAM이 적합합니다.

Q: WDL을 지원하는 상업적인 서비스가 존재합니까?

A: 미국의 DNAnexus와 캐나다의 DNAstack이 WDL을 지원하고 있습니다.

gatk-workflows's People

Contributors

hongiiv avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.