wf-template's Introduction

Template workflow

Nextflow workflow template repository.

Introduction

This workflow is not intended to be used by end users.

This workflow can be used for the following:

  • As a template, using GitLab's "create project from template" feature.
  • For testing scripts that are shared across workflows, such as those in the lib directory.

Compute requirements

Recommended requirements:

  • CPUs = 2
  • Memory = 2GB

Minimum requirements:

  • CPUs = 2
  • Memory = 2GB

Approximate run time: 5 minutes per sample

ARM processor support: True

Install and run

These are instructions to install and run the workflow on the command line. You can also access the workflow via the EPI2ME Desktop application.

The workflow uses Nextflow to manage compute and software resources, therefore Nextflow will need to be installed before attempting to run the workflow.

The workflow can currently be run using either [Docker](https://www.docker.com/products/docker-desktop) or Singularity to provide isolation of the required software. Both methods are automated out-of-the-box provided either Docker or Singularity is installed. This is controlled by the -profile parameter as exemplified below.

It is not required to clone or download the git repository in order to run the workflow. More information on running EPI2ME workflows can be found on our website.

The following command can be used to obtain the workflow. This will pull the repository into the assets folder of Nextflow and provide a list of all parameters available for the workflow as well as an example command:

nextflow run epi2me-labs/wf-template --help

To update a workflow to the latest version on the command line use the following command:

nextflow pull epi2me-labs/wf-template

A demo dataset is provided for testing of the workflow. It can be downloaded and unpacked using the following commands:

wget https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-template/wf-template-demo.tar.gz
tar -xzvf wf-template-demo.tar.gz

The workflow can then be run with the downloaded demo data using:

nextflow run epi2me-labs/wf-template \
	--fastq 'wf-template-demo/test_data/reads.fastq.gz' \
	-profile standard

For further information about running a workflow on the command line see https://labs.epi2me.io/wfquickstart/

Related protocols

This workflow is designed to take input sequences that have been produced from Oxford Nanopore Technologies devices.

Find related protocols in the Nanopore community.

Input example

This workflow accepts either FASTQ or BAM files as input.

The FASTQ or BAM input parameters for this workflow accept one of three cases: (i) the path to a single FASTQ or BAM file; (ii) the path to a top-level directory containing FASTQ or BAM files; (iii) the path to a directory containing one level of sub-directories which in turn contain FASTQ or BAM files. In the first and second cases (i and ii), a sample name can be supplied with --sample. In the last case (iii), the data is assumed to be multiplexed with the names of the sub-directories as barcodes. In this case, a sample sheet can be provided with --sample_sheet.

(i)                     (ii)                 (iii)    
input_reads.fastq   ─── input_directory  ─── input_directory
                        ├── reads0.fastq     ├── barcode01
                        └── reads1.fastq     │   ├── reads0.fastq
                                             │   └── reads1.fastq
                                             ├── barcode02
                                             │   ├── reads0.fastq
                                             │   ├── reads1.fastq
                                             │   └── reads2.fastq
                                             └── barcode03
                                              └── reads0.fastq
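
The three layouts above can be told apart with a few filesystem checks. The following is a minimal Python sketch of that idea, not the workflow's actual ingress logic: the helper names (`is_fastq`, `classify_input`) and the list of accepted suffixes are assumptions made for illustration, and BAM inputs are left out for brevity.

```python
from pathlib import Path

# File name endings treated as FASTQ in this sketch (an assumption).
FASTQ_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def is_fastq(path: Path) -> bool:
    """Return True if the path is a file whose name ends in a FASTQ suffix."""
    return path.is_file() and path.name.endswith(FASTQ_SUFFIXES)


def classify_input(path: Path) -> str:
    """Classify an input path as case 'i', 'ii' or 'iii' (hypothetical helper)."""
    # Case (i): a single FASTQ file.
    if is_fastq(path):
        return "i"
    if not path.is_dir():
        raise ValueError(f"{path} is neither a FASTQ file nor a directory")
    entries = list(path.iterdir())
    # Case (ii): FASTQ files directly inside the top-level directory.
    # This sketch does not check for a mix of files and sub-directories.
    if any(is_fastq(entry) for entry in entries):
        return "ii"
    # Case (iii): one level of sub-directories (barcodes) holding FASTQ files.
    subdirs = [entry for entry in entries if entry.is_dir()]
    if any(is_fastq(f) for subdir in subdirs for f in subdir.iterdir()):
        return "iii"
    raise ValueError(f"no FASTQ files found under {path}")
```

In cases (i) and (ii) the result suggests passing --sample, while in case (iii) a sample sheet can be supplied with --sample_sheet.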

Input parameters

Input Options

| Nextflow parameter name | Type | Description | Help | Default |
|--------------------------|------|-------------|------|---------|
| fastq | string | FASTQ files to use in the analysis. | This accepts one of three cases: (i) the path to a single FASTQ file; (ii) the path to a top-level directory containing FASTQ files; (iii) the path to a directory containing one level of sub-directories which in turn contain FASTQ files. In the first and second case, a sample name can be supplied with --sample. In the last case, the data is assumed to be multiplexed with the names of the sub-directories as barcodes. In this case, a sample sheet can be provided with --sample_sheet. |  |
| bam | string | BAM or unaligned BAM (uBAM) files to use in the analysis. | This accepts one of three cases: (i) the path to a single BAM file; (ii) the path to a top-level directory containing BAM files; (iii) the path to a directory containing one level of sub-directories which in turn contain BAM files. In the first and second case, a sample name can be supplied with --sample. In the last case, the data is assumed to be multiplexed with the names of the sub-directories as barcodes. In this case, a sample sheet can be provided with --sample_sheet. |  |
| analyse_unclassified | boolean | Analyse unclassified reads from input directory. | By default the workflow will not process reads in the unclassified directory. If selected and if the input is a multiplex directory the workflow will also process the unclassified directory. | False |
| watch_path | boolean | Enable to continuously watch the input directory for new input files. | This option enables the use of Nextflow’s directory watching feature to constantly monitor input directories for new files. | False |
| fastq_chunk | integer | Sets the maximum number of reads per chunk returned from the data ingress layer. | Default is to not chunk data and return a single FASTQ file. |  |
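
To illustrate what a per-chunk read limit such as fastq_chunk means, here is a small Python sketch. It is only an illustration of the concept, assuming reads arrive as an arbitrary iterable; the function name `chunk_reads` is hypothetical and does not correspond to anything in the workflow.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional


def chunk_reads(reads: Iterable, chunk_size: Optional[int]) -> Iterator[List]:
    """Yield lists of at most chunk_size reads; one list of all reads if None."""
    if chunk_size is None:
        # No chunking requested: everything is returned together.
        yield list(reads)
        return
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    iterator = iter(reads)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

With five reads and a chunk size of two, this yields chunks of two, two and one reads, so only the final chunk may be smaller than the limit.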

Sample Options

| Nextflow parameter name | Type | Description | Help | Default |
|--------------------------|------|-------------|------|---------|
| sample_sheet | string | A CSV file used to map barcodes to sample aliases. The sample sheet can be provided when the input data is a directory containing sub-directories with FASTQ files. | The sample sheet is a CSV file with, minimally, columns named barcode and alias. Extra columns are allowed. A type column is required for certain workflows and should have one of the following values: test_sample, positive_control, negative_control, no_template_control. An optional analysis_group column is used by some workflows to combine the results of multiple samples. If the analysis_group column is present, it needs to contain a value for each sample. |  |
| sample | string | A single sample name for non-multiplexed data. Permissible if passing a single .fastq(.gz) file or directory of .fastq(.gz) files. |  |  |
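
The sample sheet rules described above can be checked before launching a run. The sketch below is a hedged illustration in Python, not the validation the workflow itself performs: the function name `check_sample_sheet` and the wording of the messages are assumptions, while the column names and permitted type values are taken from the description above.

```python
import csv
import io
from typing import List

REQUIRED_COLUMNS = {"barcode", "alias"}
ALLOWED_TYPES = {
    "test_sample",
    "positive_control",
    "negative_control",
    "no_template_control",
}


def check_sample_sheet(text: str) -> List[str]:
    """Return a list of problems found in sample sheet text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    columns = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - columns
    if missing:
        return [f"missing required column(s): {', '.join(sorted(missing))}"]
    problems = []
    # Data rows start on line 2, after the header line.
    for line_no, row in enumerate(reader, start=2):
        if "type" in columns and row["type"] not in ALLOWED_TYPES:
            problems.append(f"line {line_no}: unknown type {row['type']!r}")
        if "analysis_group" in columns and not row["analysis_group"]:
            problems.append(f"line {line_no}: empty analysis_group")
    return problems


# Example sheet with the two required columns plus the optional type column.
EXAMPLE_SHEET = """barcode,alias,type
barcode01,sample_a,test_sample
barcode02,control_a,negative_control
"""
```

Calling `check_sample_sheet(EXAMPLE_SHEET)` returns an empty list, meaning no problems were found under the rules encoded here.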

Output Options

| Nextflow parameter name | Type | Description | Help | Default |
|--------------------------|------|-------------|------|---------|
| out_dir | string | Directory for output of all workflow results. |  | output |

Outputs

Output files may be aggregated including information for all samples or provided per sample. Per-sample files will be prefixed with respective aliases and represented below as {{ alias }}.

| Title | File path | Description | Per sample or aggregated |
|-------|-----------|-------------|--------------------------|
| Workflow report | ./wf-template-report.html | Report for all samples. | aggregated |
| Per-file read stats | ./fastq_ingress_results/reads/fastcat_stats/per-file-stats.tsv | A TSV with per-file read stats, including all samples. | aggregated |
| Per-read stats | ./fastq_ingress_results/reads/fastcat_stats/per-read-stats.tsv | A TSV with per-read stats, including all samples. | aggregated |
| Run IDs | ./fastq_ingress_results/reads/fastcat_stats/run_ids | List of run IDs present in reads. | aggregated |
| Meta map JSON | ./fastq_ingress_results/reads/metamap.json | Metadata used in the workflow, presented as JSON. | aggregated |
| Concatenated sequence data | ./fastq_ingress_results/reads/{{ alias }}.fastq.gz | Per-sample reads concatenated into one FASTQ file. | per-sample |

Pipeline overview

1. Concatenate input files and generate per-read stats.

The fastcat/bamstats tool is used to concatenate multi-file samples to be processed by the workflow. It also outputs per-read stats, including average read lengths and qualities.
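
As a rough illustration of what per-read stats involve, the Python sketch below computes the length and a mean quality for one read. It assumes Phred+33 encoded quality strings and averages the per-base error probabilities before converting back to a Phred score, which is one common definition of mean read quality. The function name `read_stats` is hypothetical, and this is not a description of how fastcat or bamstats are implemented.

```python
import math
from typing import Tuple


def read_stats(sequence: str, quality: str) -> Tuple[int, float]:
    """Return (read length, mean Phred quality) for a single read."""
    if not quality:
        raise ValueError("quality string must not be empty")
    # Phred+33 encoding: quality score = ASCII code minus 33 (assumed here).
    error_probs = [10 ** (-(ord(char) - 33) / 10) for char in quality]
    # Average the error probabilities, then convert back to the Phred scale.
    mean_error = sum(error_probs) / len(error_probs)
    mean_quality = -10 * math.log10(mean_error)
    return len(sequence), mean_quality
```

Averaging error probabilities rather than raw scores means a few low-quality bases pull the mean down more strongly than a simple arithmetic mean of the scores would.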

Troubleshooting

  • If the workflow fails, please run it with the demo data set to ensure the workflow itself is working. This will help us determine if the issue is related to the environment, input parameters or a bug.
  • See how to interpret some common Nextflow exit codes here.

FAQs

If your question is not answered here, please report any issues or suggestions on the GitHub issues page or start a discussion on the community.

Related blog posts

See the EPI2ME website for lots of other resources and blog posts.

wf-template's Issues

ARM64 version of base image

Thanks for the cool workflows!

Would it be possible to please post an ARM64 image to docker hub? Or alternatively post the Dockerfile used to generate the base image (base-workflow-image).

Could this be linked to from the website

Is your feature related to a problem?

I discovered this exists at ION BRU. I have recently been working on importing custom workflows of my own using the directions on this page https://labs.epi2me.io/nexflow-for-epi2melabs/ and didn't know this existed.

Describe the solution you'd like

Link to wf-template from the main how to page https://labs.epi2me.io/nexflow-for-epi2melabs/

Describe alternatives you've considered

There was enough information on the page already for me to succeed in importing my existing nextflow pipeline in a minimal way, but I might have included more useful features...

Additional context

No response
