
nanoseq's Introduction

nf-core/nanoseq

Introduction

nf-core/nanoseq is a bioinformatics analysis pipeline for Nanopore DNA/RNA sequencing data that can be used to perform basecalling, demultiplexing, QC, alignment, and downstream analysis.

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

On release, automated continuous integration tests run the pipeline on a full-sized dataset obtained from the Singapore Nanopore Expression Consortium on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the nf-core website.

Pipeline Summary

  1. Demultiplexing (qcat; optional)
  2. Raw read cleaning (NanoLyse; optional)
  3. Raw read QC (NanoPlot, FastQC)
  4. Alignment (GraphMap2 or minimap2)
    • Both aligners are capable of performing unspliced and spliced alignment. Sensible defaults will be applied automatically based on a combination of the input data and user-specified parameters
    • Each sample can be mapped to its own reference genome if multiplexed in this way
    • Convert SAM to co-ordinate sorted BAM and obtain mapping metrics (samtools)
  5. Create bigWig (BEDTools, bedGraphToBigWig) and bigBed (BEDTools, bedToBigBed) coverage tracks for visualisation
  6. DNA specific downstream analysis:
  7. RNA specific downstream analysis:
    • Transcript reconstruction and quantification (bambu or StringTie2)
      • bambu performs both transcript reconstruction and quantification
      • When StringTie2 is chosen, each sample is processed individually and the results are then combined, after which featureCounts is used for both gene and transcript quantification.
    • Differential expression analysis (DESeq2 and/or DEXSeq)
    • RNA modification detection (xpore and/or m6anet)
    • RNA fusion detection (JAFFAL)
  8. Present QC for raw read and alignment results (MultiQC)

Functionality Overview

A graphical overview of suggested routes through the pipeline depending on the desired output can be seen below.

nf-core/nanoseq metro map

Quick Start

  1. Install Nextflow (>=22.10.1)

  2. Install any of Docker, Singularity (you can follow this tutorial), Podman, Shifter or Charliecloud for full pipeline reproducibility. You can use Conda both to install Nextflow itself and to manage software within pipelines, but please use it within pipelines only as a last resort; see docs.

  3. Download the pipeline and test it on a minimal dataset with a single command:

    nextflow run nf-core/nanoseq -profile test,YOURPROFILE

    Note that some form of configuration will be needed so that Nextflow knows how to fetch the required software. This is usually done in the form of a config profile (YOURPROFILE in the example command above). You can chain multiple config profiles in a comma-separated string.

    • The pipeline comes with config profiles called docker, singularity, podman, shifter, charliecloud and conda which instruct the pipeline to use the named tool for software management. For example, -profile test,docker.
    • Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use -profile <institute> in your command. This will enable either docker or singularity and set the appropriate execution settings for your local compute environment.
    • If you are using singularity and are persistently observing issues downloading Singularity images directly due to timeout or network issues, then you can use the --singularity_pull_docker_container parameter to pull and convert the Docker image instead. Alternatively, you can use the nf-core download command to download images first, before running the pipeline. Setting the NXF_SINGULARITY_CACHEDIR or singularity.cacheDir Nextflow options enables you to store and re-use the images from a central location for future pipeline runs.
    • If you are using conda, it is highly recommended to use the NXF_CONDA_CACHEDIR or conda.cacheDir settings to store the environments in a central location for future pipeline runs.
  4. Start running your own analysis!
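
The profile chaining described in step 3 can be sketched as a small Python helper that assembles the command line. This is only an illustration: `build_test_command` is a hypothetical name, not part of Nextflow or nf-core, and it builds the argument list without running anything.

```python
import shlex
from typing import List


def build_test_command(*profiles: str, pipeline: str = "nf-core/nanoseq") -> List[str]:
    """Return the argument list for a test run with the given profiles chained.

    Hypothetical helper for illustration; it does not invoke Nextflow.
    """
    if not profiles:
        raise ValueError("at least one profile (for example 'docker') is required")
    # Several config profiles are passed as one comma-separated string.
    chained = ",".join(("test",) + profiles)
    return ["nextflow", "run", pipeline, "-profile", chained]


if __name__ == "__main__":
    # Show the command as it would be typed in a shell.
    print(shlex.join(build_test_command("docker")))
```

Passing further names, such as an institutional profile, appends them to the same comma-separated string.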

Documentation

The nf-core/nanoseq pipeline comes with documentation about the pipeline usage, parameters and output.

nextflow run nf-core/nanoseq \
    --input samplesheet.csv \
    --protocol DNA \
    --barcode_kit SQK-PBK004 \
    -profile <docker/singularity/podman/institute>

See usage docs for all of the available options when running the pipeline.

An example input samplesheet for performing both basecalling and demultiplexing can be found here.

Credits

nf-core/nanoseq was originally written by Chelsea Sawyer and Harshil Patel from The Bioinformatics & Biostatistics Group for use at The Francis Crick Institute, London. Other primary contributors include Laura Wratten, Ying Chen, Yuk Kei Wan and Jonathan Goeke from the Genome Institute of Singapore, Christopher Hakkaart from Institute of Medical Genetics and Applied Genomics, Germany, and Johannes Alneberg and Franziska Bonath from SciLifeLab, Sweden.

Many thanks to others who have helped out along the way too, including (but not limited to): @crickbabs, @AnnaSyme, @ekushele.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on Slack (you can join with this invite).

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

nanoseq's People

Contributors

alneberg, christopher-hakkaart, csawye01, cying111, drpatelh, dschreyer, ekushele, ewels, kevinmenden, lwratten, maxulysse, nf-core-bot, vsmalladi, yuukiiwa

nanoseq's Issues

Failed to pass path of testdata azure

Check Documentation

I have checked the following places for your error:

Description of the bug

Missing output file(s) test-datasets/fast5/barcoded/ expected by process GET_TEST_DATA

Steps to reproduce

Steps to reproduce the behaviour:

  1. nextflow run nf-core/nanoseq -profile test,azurebatch -r 1.1.0
  2. Missing output file(s) test-datasets/fast5/barcoded/ expected by process GET_TEST_DATA

Expected behaviour

Should be able to pass the files to the next step

Log files

Have you provided the following extra information/files:

  • The command used to run the pipeline
  • The .nextflow.log file

System

  • Hardware: Azure
  • Executor: Azure Batch

Nextflow Installation

  • Version: 21.10.5.5658

Container engine

  • Engine: Docker

Error executing process > 'NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT

  It runs well in the non-basecalling, non-demultiplexing, non-alignment pipeline. However, it does not run well in the non-basecalling, non-demultiplexing pipeline. The test dataset is from nf-core/test-datasets. The version of nanoseq is v2.01. The command is as follows:

nextflow run nf-core/nanoseq \
    --input samplesheet_nobc_nodx.csv \
    --protocol cDNA \
    --skip_basecalling \
    --skip_demultiplexing \
    --max_cpus 40 \
    --max_memory 100.GB \
    -profile docker

The error is as follows (note: subsequent processes such as alignment are fine, and I found only an HTML file, not a PNG file, in work/"hash dir"/fastq/MCF7_R1):
"
Error executing process > 'NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT (MCF7_R1)'

Caused by:
Missing output file(s) fastq/MCF7_R1/*.png expected by process NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT (MCF7_R1)
"

new guppy version

Hi all,
it is worth considering the new guppy version 4.0.14. It performs much better in basecalling accuracy.

container = 'genomicpariscentre/guppy-gpu:3.4.4'

Maybe build and maintain our own Guppy containers to keep them up to date?

error running the test example run

Description of the bug

If I try to test the pipeline with:

nextflow run nf-core/nanoseq -r 3.0.0 -profile test,docker --outdir results

I get an error in the log:

Nov-02 07:58:59.240 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_NANOSEQ:NANOSEQ:GET_NANOLYSE_FASTA'

Caused by:
Process NFCORE_NANOSEQ:NANOSEQ:GET_NANOLYSE_FASTA terminated with an error exit status (56)

Command executed:

curl \
-L https://github.com/wdecoster/nanolyse/raw/master/reference/lambda.fasta.gz \
-o lambda.fasta.gz

Command exit status:
56

Command output:
(empty)

Command error:
0 0 0 0 0 0 0 0 --:--:-- 0:02:13 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:14 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:15 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:16 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:17 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:18 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:19 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:20 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:21 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:22 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:23 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:24 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:25 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:26 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:27 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:28 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:29 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:30 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:31 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:32 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:33 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:34 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:35 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:36 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:37 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:38 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:39 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:40 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:41 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:42 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:43 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:44 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:45 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:46 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:47 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:48 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:49 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:50 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:51 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:52 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:53 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:54 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:55 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:56 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:57 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:58 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:59 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:03:00 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:03:01 --:--:-- 0
curl: (56) OpenSSL SSL_read: Connection reset by peer, errno 104

Work dir:
/home/ubuntu/dev/nanoseq/work/bd/457c9c715db4c5a237aeca8311a1df

Any idea what's wrong? If I run the command in a terminal:

curl \
    -L https://github.com/wdecoster/nanolyse/raw/master/reference/lambda.fasta.gz \
    -o lambda.fasta.gz

it can download the file, but not in the pipeline context.

Command used and terminal output

No response

Relevant files

No response

System information

No response

Add option for high accuracy basecalling with Guppy

From a little reading around it seems you can change the name of the config file you pass to Guppy using the -c parameter to tailor the basecalling to run it either in fast (-c dna_r9.4.1_450bps_fast.cfg; default) or high accuracy (dna_r9.4.1_450bps_hac.cfg) mode or for different machines (-c dna_r9.4.1_450bps_hac_prom.cfg; PromethION).

The version numbers, e.g. r9.4.1, relate to the version of the pores, as described in more detail in the paper below:
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1727-y
https://github.com/rrwick/Basecalling-comparison

It would be good to add a parameter such as --guppy_config to the pipeline to make this customisable.
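
As a rough sketch of what such a parameter might resolve to, the three config file names quoted above can be put in a lookup table. The mode labels (`fast`, `hac`, `hac_prom`) and the function name are hypothetical choices for this illustration and are not an existing pipeline option; only the `.cfg` names and the `-c` flag come from the text above.

```python
# Config file names are the three quoted in the request above.
# The dictionary keys are hypothetical labels chosen for this sketch.
GUPPY_CONFIGS = {
    "fast": "dna_r9.4.1_450bps_fast.cfg",          # fast mode (default)
    "hac": "dna_r9.4.1_450bps_hac.cfg",            # high accuracy
    "hac_prom": "dna_r9.4.1_450bps_hac_prom.cfg",  # high accuracy, PromethION
}


def guppy_config_args(mode="fast"):
    """Return the `-c <config>` arguments for the requested mode."""
    if mode not in GUPPY_CONFIGS:
        raise ValueError(
            "unknown mode %r; expected one of %s" % (mode, sorted(GUPPY_CONFIGS))
        )
    return ["-c", GUPPY_CONFIGS[mode]]


if __name__ == "__main__":
    print(guppy_config_args("hac"))
```

Accepting the config file name directly, as the request suggests, would be more flexible than a fixed table, since new pore versions would not need a code change.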

Multiqc step failure

Description of the bug

When running nanoseq pipeline with nextflow 22.08.1-edge the following error is reported

[WARNING]         multiqc : MultiQC Version v1.12 now available!
[INFO   ]         multiqc : This is MultiQC v1.9
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Report title: lethal_elion_5
[INFO   ]         multiqc : Searching   : /tmp/nxf.RMe6fJWinC
[INFO   ]         multiqc : Only using modules custom_content, pycoqc, fastqc, samtools, featureCounts
[INFO   ]  custom_content : nf-core-nanoseq-summary: Found 1 sample (html)
[INFO   ]  custom_content : software_versions: Found 1 sample (html)
[INFO   ]          pycoqc : Found 1 reports
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
  axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
  axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
[INFO   ]          fastqc : Found 3 reports
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
  axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
  axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
[INFO   ]        samtools : Found 1 stats reports
[INFO   ]        samtools : Found 1 stats reports
[INFO   ]        samtools : Found 1 flagstat reports
[INFO   ]        samtools : Found 1 idxstats reports
[INFO   ]         multiqc : Compressing plot data
[INFO   ]         multiqc : Report      : lethal_elion_5_multiqc_report.html
[INFO   ]         multiqc : Data        : lethal_elion_5_multiqc_report_data
[INFO   ]         multiqc : Plots       : lethal_elion_5_multiqc_report_plots
[INFO   ]         multiqc : MultiQC complete
ls: cannot access 'multiqc_plots': No such file or directory

The critical problem appears to be that multiqc_plots is not created. With an older version of Nextflow it seems to work fine, and I don't see how this could be related to the Nextflow runtime upgrade.

Is multiqc_plots expected to be a file or a directory?
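
In the log above, MultiQC reports its plots location as `lethal_elion_5_multiqc_report_plots`, while the failing `ls` looks for `multiqc_plots`. One possible reading, offered as an assumption and not a confirmed diagnosis, is that the report title is prefixed to the output names, so a fixed name no longer matches. The sketch below looks for a plots directory by pattern instead of by exact name; `find_plots_dirs` is a hypothetical helper, not part of MultiQC or the pipeline.

```python
from pathlib import Path


def find_plots_dirs(workdir):
    """Return directories in `workdir` whose names end in 'multiqc_plots'
    or 'multiqc_report_plots', with or without a title prefix."""
    root = Path(workdir)
    matches = set()
    for pattern in ("*multiqc_plots", "*multiqc_report_plots"):
        # Keep directories only; a plain file with a matching name is ignored.
        matches.update(p for p in root.glob(pattern) if p.is_dir())
    return sorted(matches)


if __name__ == "__main__":
    for path in find_plots_dirs("."):
        print(path)
```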

Command used and terminal output

nextflow run 'https://github.com/nf-core/nanoseq' \
    -name lethal_elion_6 \
    -params-file 'https://api.staging-tower.xyz/ephemeral/rkGBtNRmVJexasHRKLtDVg.json' \
    -with-tower 'https://api.staging-tower.xyz' \
    -r ad5b2bb7a37a6edb9f61603636bd26e43259f58b \
    -profile test


Relevant files

No response

System information

No response

GraphMap2_Align Crash

Check Documentation

I have checked the following places for your error:

Description of the bug

Pipeline crashes at the GraphMap2_Align stage

Steps to reproduce

Steps to reproduce the behaviour:

  1. Command line:

nextflow run nf-core/nanoseq --monochrome_logs --input /mnt/blah-blah-blah/samplesheet.csv --outdir /home/mike.harbourblah-blah-blah --protocol directRNA --skip_basecalling --skip_demultiplexing --aligner graphmap2 --quantification_method bambu -profile singularity --skip_bigbed --skip_bigwig --skip_pycoqc --skip_nanoplot --skip_fastqc --skip_multiqc --skip_qc

  2. See error:
    "Execution aborted due to an unexpected error
    -- Check script '.nextflow/assets/nf-core/nanoseq/./workflows/../subworkflows/local/../../modules/local/graphmap2_align.nf' at line: 8 or see '.nextflow.log' file for more details
    "

Expected behaviour

Graphmap2 pipeline stage to complete and for pipeline to continue.

Log files

Have you provided the following extra information/files:

  • The command used to run the pipeline (see point 1 above)
  • The .nextflow.log file

System

  • Hardware: internal server Kernel Linux 5.4.0-105-generic x86_64
  • Executor: local
  • OS: Ubuntu
  • Version Release 20.04.3 LTS (Focal Fossa) 64-bit

Nextflow Installation

  • Version: 1.10.6 build 5661

Container engine

  • Engine: Singularity
  • version: 3.6.3

Additional context

nextflow.log

Skip_quantification problem

Hello,

I have an issue running the pipeline. I am trying to run the analysis manually with your test dataset, using this command:

nextflow run nf-core/nanoseq \
    --input samplesheet_bc_dx.csv \
    --protocol cDNA \
    --input_path /home/jirachote/test-datasets/fast5/barcoded \
    --flowcell FLO-MIN106 \
    --kit SQK-DCS109 \
    --barcode_kit EXP-NBD103 \
    -profile docker

It gives me this error:

Quantification can only be performed if all samples in the samplesheet have the same reference fasta and GTF file." Please specify the '--skip_quantification' parameter if you wish to skip these steps.

Even when I add the reference to the samplesheet, the same error appears. So I have to add --skip_quantification, after which the pipeline runs successfully, except that the calibration sample (the third barcode) fails at the bigBed and bigWig steps.

Next I tried:

./nextflow run nf-core/nanoseq \
    --input samplesheet_bc_nodx.csv \
    --protocol cDNA \
    --input_path /home/jirachote/test-datasets/nonbarcoded/ \
    --flowcell FLO-MIN106 \
    --kit SQK-DCS108 \
    --skip_demultiplexing \
    -profile docker \
    --max_cpus 2 \
    --max_memory 6.GB \
    --max_time 3.h

and again I get:

Quantification can only be performed if all samples in the samplesheet have the same reference fasta and GTF file." Please specify the '--skip_quantification' parameter if you wish to skip these steps.

This is odd, because the test samplesheet for basecalling without demultiplexing has only one reference.
Again, I have to add --skip_quantification. However, unlike the run with basecalling and demultiplexing (barcoded), this time the pipeline skips many steps, as shown here:

executor >  local (7)
[63/8dee55] process > CHECK_SAMPLESHEET (samplesh... [100%] 1 of 1 ✔
[f0/a4bfc6] process > GUPPY (nonbarcoded)            [100%] 1 of 1 ✔
[b3/535422] process > PYCOQC (sequencing_summary.... [100%] 1 of 1 ✔
[01/8d25f2] process > NANOPLOT_SUMMARY (sequencin... [100%] 1 of 1 ✔
[-        ] process > NANOPLOT_FASTQ                 -
[-        ] process > FASTQC                         -
[-        ] process > GET_CHROM_SIZES                -
[-        ] process > GTF_TO_BED                     -
[-        ] process > MINIMAP2_INDEX                 -
[-        ] process > MINIMAP2_ALIGN                 -
[-        ] process > SAMTOOLS_SORT                  -
[-        ] process > BEDTOOLS_GENOMECOV             -
[-        ] process > UCSC_BEDGRAPHTOBIGWIG          -
[-        ] process > BEDTOOLS_BAMTOBED              -
[-        ] process > UCSC_BED12TOBIGBED             -
[8c/450417] process > OUTPUT_DOCUMENTATION           [100%] 1 of 1 ✔
[70/5874db] process > GET_SOFTWARE_VERSIONS          [100%] 1 of 1 ✔
[42/f0c6a3] process > MULTIQC (1)                    [100%] 1 of 1 ✔
-[nf-core/nanoseq] Pipeline completed successfully-

Could you please give me any suggestions?
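
The condition named in the error message (every sample sharing one reference FASTA and GTF) can be checked on a samplesheet before launching a run. The sketch below is not the pipeline's own implementation. It assumes `genome` and `transcriptome` column names, taken from a samplesheet header quoted in another issue on this page, and these may differ between pipeline versions. The sample names and file names in the usage part are made up.

```python
import csv
import io


def shared_reference(samplesheet_text):
    """Return the single (genome, transcriptome) pair used by every row,
    or None when rows disagree or the sheet has no rows."""
    rows = csv.DictReader(io.StringIO(samplesheet_text))
    pairs = set()
    for row in rows:
        # Missing or empty fields are treated as empty strings.
        genome = (row.get("genome") or "").strip()
        transcriptome = (row.get("transcriptome") or "").strip()
        pairs.add((genome, transcriptome))
    return pairs.pop() if len(pairs) == 1 else None


if __name__ == "__main__":
    # Hypothetical samplesheet content for illustration only.
    sheet = (
        "group,replicate,barcode,input_file,genome,transcriptome\n"
        "sampleA,1,1,,ref.fa,ref.gtf\n"
        "sampleB,1,2,,ref.fa,ref.gtf\n"
    )
    print(shared_reference(sheet))
```

A row with an empty reference field counts as a different pair here, so a sheet that mixes filled and empty reference columns would also fail this check.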

typo in subway map

Nanopore basecalling, demultiplexing, QC, alignment, and downstram analysis
should be:
Nanopore basecalling, demultiplexing, QC, alignment, and downstream analysis

Basecalling model as input to Guppy

I would like to add a basecalling model option for input to Guppy.
I.e. take the .json output of something like Taiyaki and pass it to Guppy using the --model flag. Seems like custom basecalling models are being used more and more so would be super nice to have this in from the first release.

Will need a basecalling model to test it works though which I am finding very difficult to locate.

Thinking something like --guppy_model or --guppy_basecalling_model as an optional command line parameter to take a .json file.

Improve documentation

Need to add more description and images to output.md. Also, usage.md could do with some more detail regarding the parameters used and how they can be used in combination.

Output demultiplexed fast5 files

Is your feature request related to a problem? Please describe

I want the pipeline to output demultiplexed fast5 files, divided into directories such as barcode01, barcode02, etc.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

I'm working on a pull request to do this using demux_fast5 from ont_fast5_api.

problem specifying local genome and transcriptome

Hi,

I am trying to specify a local genome/transcriptome in my sample sheet, but am getting an error:-


Command output:
  ERROR: Please check samplesheet -> Input file does not have extension '.fastq.gz', '.fq.gz' or '.bam'!
  Line: 'GFP_2A,1,,merged_fastq/barcode01.fastq,dmel-all-chromosome-r6.42.fasta,dmel-all-r6.42.gtf'

Command wrapper:
  ERROR: Please check samplesheet -> Input file does not have extension '.fastq.gz', '.fq.gz' or '.bam'!
  Line: 'GFP_2A,1,,merged_fastq/barcode01.fastq,dmel-all-chromosome-r6.42.fasta,dmel-all-r6.42.gtf'

It seems to think that the genome and transcriptome are an input_file?

Here's the first row of my sample sheet

group,replicate,barcode,input_file,genome,transcriptome
GFP_2A,1,,merged_fastq/barcode01.fastq,dmel-all-chromosome-r6.42.fasta,dmel-all-r6.42.gtf

and my command is:

nextflow run nf-core/nanoseq \
    -resume \
    --input sampleSheet.csv \
    --protocol cDNA \
    --skip_basecalling \
    --skip_demultiplexing \
    -profile singularity \
    --max_memory 64.GB \
    --max_cpus 4

Thanks for any advice!
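
Reading the error text literally, the check concerns the extension of the input_file column, not the genome or transcriptome columns. In the quoted row, input_file is `merged_fastq/barcode01.fastq`, which is not one of the three extensions the message lists. The sketch below reproduces only that comparison; it is an illustration of this reading, not the pipeline's actual samplesheet check.

```python
# Extensions exactly as listed in the error message above.
ALLOWED_EXTENSIONS = (".fastq.gz", ".fq.gz", ".bam")


def has_allowed_extension(input_file):
    """True when `input_file` ends with one of the listed extensions."""
    return input_file.endswith(ALLOWED_EXTENSIONS)


if __name__ == "__main__":
    for name in ("merged_fastq/barcode01.fastq", "merged_fastq/barcode01.fastq.gz"):
        print(name, has_allowed_extension(name))
```

If this reading is right, compressing the FASTQ file (for example with gzip) and updating the samplesheet entry might get past the check, although that has not been verified here.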

Samtool sort or indexing aborted due to an "unexpected error"

Description of the bug

The pipeline stops with the somewhat nondescript error "Execution aborted due to an unexpected error".
I couldn't figure out whether the issue lies within samtools sort or samtools index, but it seems a .bai file is created while .csi and .crai are missing. If this is by design or not, I am not sure.

Command used and terminal output

$ nextflow run /vulpes/ngi/staging/220908.aadcfe0/sw/nanoseq/workflow/ \
    -profile uppmax \
    -c /vulpes/ngi/staging/220908.aadcfe0/conf//nextflow_miarka_sthlm.config \
    -c /vulpes/ngi/staging/220908.aadcfe0/conf//nanoseq_sthlm.config \
    --input samplesheet_v2.csv \
    --protocol directRNA \
    --kit SQK-LSK109 \
    --skip_basecalling \
    --skip_demultiplexing \
    --outdir /proj/ngi2016003/nobackup/fran/analysis/nanopore/projects/P25605/results/nanoseq_RNA2/ \
    -resume \
    --skip_fusion_analysis \
    --skip_differential_analysis

terminal output:
Monitor the execution with Nextflow Tower using this url https://ngi-tower.scilifelab.se/orgs/NGI/workspaces/NGI-Stockholm/watch/266swhDov0hekG
[43/057fed] process > NFCORE_NANOSEQ:NANOSEQ:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet_v2.csv)                                     [100%] 1 of 1, cached: 1 ✔
[dd/3921bb] process > NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT (P25605_101_CII_R1)                                   [100%] 4 of 4, cached: 4 ✔
[cd/fb90cc] process > NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:FASTQC (P25605_102_CII_SF3B1_WT_OE_shSF3B1_1_R1)               [100%] 4 of 4, cached: 4 ✔
[96/ddf0e8] process > NFCORE_NANOSEQ:NANOSEQ:PREPARE_GENOME:GET_CHROM_SIZES (genome.fa)                                             [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:PREPARE_GENOME:GTF2BED                                                                 -
[da/c72cdc] process > NFCORE_NANOSEQ:NANOSEQ:PREPARE_GENOME:SAMTOOLS_FAIDX (genome.fa)                                              [100%] 4 of 4, cached: 4 ✔
[37/819191] process > NFCORE_NANOSEQ:NANOSEQ:ALIGN_MINIMAP2:MINIMAP2_INDEX (genome.fa)                                              [100%] 1 of 1, cached: 1 ✔
[dc/daa322] process > NFCORE_NANOSEQ:NANOSEQ:ALIGN_MINIMAP2:MINIMAP2_ALIGN (P25605_101_CII_R1)                                      [100%] 4 of 4, cached: 4 ✔
[58/2766db] process > NFCORE_NANOSEQ:NANOSEQ:BAM_SORT_INDEX_SAMTOOLS:SAMTOOLS_VIEW_BAM (P25605_104_HG3SF3B1_MUT_OEshSF3B1-1_R1)     [100%] 4 of 4, cached: 4 ✔
[5b/2ab78c] process > NFCORE_NANOSEQ:NANOSEQ:BAM_SORT_INDEX_SAMTOOLS:SAMTOOLS_SORT (P25605_104_HG3SF3B1_MUT_OEshSF3B1-1_R1)         [100%] 4 of 4, cached: 4 ✔
[e5/d81f52] process > NFCORE_NANOSEQ:NANOSEQ:BAM_SORT_INDEX_SAMTOOLS:SAMTOOLS_INDEX (P25605_101_CII_R1)                             [100%] 4 of 4, cached: 4 ✔
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:BAM_SORT_INDEX_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS                              [  0%] 0 of 4
[fa/959feb] process > NFCORE_NANOSEQ:NANOSEQ:BAM_SORT_INDEX_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT (P25605_104_HG3SF3B1_M... [ 25%] 1 of 4, cached: 1
[de/4c7ddf] process > NFCORE_NANOSEQ:NANOSEQ:BAM_SORT_INDEX_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS (P25605_104_HG3SF3B1_M... [ 25%] 1 of 4, cached: 1
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:BEDTOOLS_UCSC_BIGWIG:BEDTOOLS_GENOMECOV                                                [  0%] 0 of 4
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:BEDTOOLS_UCSC_BIGWIG:UCSC_BEDGRAPHTOBIGWIG                                             -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:BEDTOOLS_UCSC_BIGBED:BEDTOOLS_BAMBED                                                   [  0%] 0 of 4
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:BEDTOOLS_UCSC_BIGBED:UCSC_BED12TOBIGBED                                                -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:BAMBU                                                                                  [  0%] 0 of 1
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:RNA_MODIFICATION_XPORE_M6ANET:NANOPOLISH_INDEX_EVENTALIGN                              -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:RNA_MODIFICATION_XPORE_M6ANET:XPORE_DATAPREP                                           -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:RNA_MODIFICATION_XPORE_M6ANET:XPORE_DIFFMOD                                            -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:RNA_MODIFICATION_XPORE_M6ANET:M6ANET_DATAPREP                                          -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:RNA_MODIFICATION_XPORE_M6ANET:M6ANET_INFERENCE                                         -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:CUSTOM_DUMPSOFTWAREVERSIONS                                                            -
[-        ] process > NFCORE_NANOSEQ:NANOSEQ:MULTIQC                                                                                -
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/nanoseq] Pipeline completed with errors-
Execution aborted due to an unexpected error

 -- Check script '/vulpes/ngi/production/v22.09/sw/nanoseq/workflow/./workflows/../subworkflows/local/../../modules/local/nanopolish_index_eventalign.nf' at line: 2 or see '.nextflow.log' file for more details

Relevant files

nextflow.log

System information

N E X T F L O W ~ version 21.10.6
Hardware - HPC (Uppmax)
Executor - Slurm
container - Docker/Singularity
nanoseq version: nf-core/nanoseq v3.0.0

skipped alignment steps

Hi,

When I use nanoseq with .fast5 files as input, all the alignment steps are skipped. Here is my command:

nextflow run nf-core/nanoseq \
    -r 1.0.0 \
    -profile singularity \
    --input /home/genouest/cnrs_umr6290/mlorthiois/nanoseq_test/input.csv \
    --protocol cDNA \
    --input_path /home/genouest/cnrs_umr6290/mlorthiois/nanoseq_test/data \
    --flowcell FLO-MIN106 \
    --kit SQK-DCS109 \
    --skip_demultiplexing \
    --max_cpus 30 \
    --igenomes_ignore false 

At the end, I get -[nf-core/nanoseq] Pipeline completed successfully-, but only these steps are computed:

[86/2a19f8] process > CheckSampleSheet (input.csv)   [100%] 1 of 1 ✔
[5b/8e1c88] process > Guppy (data)                   [100%] 1 of 1 ✔
[42/1c9b63] process > PycoQC (sequencing_summary.... [100%] 1 of 1 ✔
[b5/a3702e] process > NanoPlotSummary (sequencing... [100%] 1 of 1 ✔
[-        ] process > NanoPlotFastQ                  -
[-        ] process > FastQC                         -
[-        ] process > GetChromSizes                  -
[-        ] process > GTFToBED                       -
[-        ] process > GraphMap2Index                 -
[-        ] process > GraphMap2Align                 -
[-        ] process > SortBAM                        -
[-        ] process > BAMToBedGraph                  -
[-        ] process > BedGraphToBigWig               -
[-        ] process > BAMToBed12                     -
[-        ] process > Bed12ToBigBed                  -
[23/a0803d] process > output_documentation           [100%] 1 of 1 ✔
[3f/323218] process > get_software_versions          [100%] 1 of 1 ✔
[99/fdd59c] process > MultiQC (1)                    [100%] 1 of 1 ✔

So Guppy does produce a fastq.gz file. The only workaround I found is to run nanoseq again, without basecalling and demultiplexing, with the fastq.gz as input. In that case, all the steps are run.

Improve test-dataset to avoid 255 error

-profile test is currently passing, but I have had to bypass a 255 error that we get when generating the bigWig and bigBed files because one of the samples doesn't have any reads that map to the reference gene sequence. It would be good to amend the test-dataset so that we don't get this error.

Transcriptome alignment

We need a way to specify which fasta files in the samplesheet are reference transcriptomes in order to implement this.
For protocols such as directRNA, the alignment call will change if the reference is a transcriptome (-ax splice changes to -ax map-ont)
Do we want to be able to perform both genome and transcriptome alignment in one run of the pipeline?

Let me know what you think
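
The preset switch described above could be sketched as a small helper. This is only an illustration: the function name `minimap2_preset` and the `genome`/`transcriptome` labels are hypothetical, and the mapping simply restates the two presets mentioned in this issue.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the minimap2 preset for a reference type.
# The mapping restates the issue text:
#   genome        -> -ax splice
#   transcriptome -> -ax map-ont
minimap2_preset() {
    local reference_type=$1
    case "$reference_type" in
        genome)        echo "-ax splice" ;;
        transcriptome) echo "-ax map-ont" ;;
        *)
            echo "unknown reference type: $reference_type" >&2
            return 1
            ;;
    esac
}

# Example call (file names are placeholders):
#   minimap2 $(minimap2_preset transcriptome) transcripts.fa reads.fastq.gz > out.sam
```

Whether the reference type comes from an extra samplesheet column or from a pipeline parameter is left open here, since that is the question this issue raises.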

ascii codec issue

Dear nf-core team,

Hello,
Two weeks ago I found Nextflow and the nf-core library, and I was so happy that I could rewrite all the pipelines I had tried on your platform.
While testing the newly written pipelines, I ran into the error below.

$ nf-core lint --release .

                                      ,--./,-.
      ___     __   __   __   ___     /,-._.--~\
|\ | |__  __ /  ` /  \ |__) |__         }  {
| \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                      `._,._,'

Running pipeline tests [####################################] 100% None
Traceback (most recent call last):
  File "/home/Program/miniconda3/bin/nf-core", line 353, in <module>
    nf_core_cli()
  File "/home/Program/miniconda3/lib/python2.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/Program/miniconda3/lib/python2.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/Program/miniconda3/lib/python2.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/Program/miniconda3/lib/python2.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/Program/miniconda3/lib/python2.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/Program/miniconda3/bin/nf-core", line 239, in lint
    lint_obj = nf_core.lint.run_linting(pipeline_dir, release)
  File "/home/Program/miniconda3/lib/python2.7/site-packages/nf_core/lint.py", line 54, in run_linting
    lint_obj.print_results()
  File "/home/Program/miniconda3/lib/python2.7/site-packages/nf_core/lint.py", line 922, in print_results
    click.style(" [{}] {:>4} tests failed".format(u'\u2717', len(self.failed)), fg='red') + rl
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2714' in position 0: ordinal not in range(128)

The only thing I could work out is that it looks like some kind of type-casting (encoding) error, but nothing more.
Please let me know how to run the lint properly and use the system as you intended.

Thanks.
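
The file paths in the traceback point at a python2.7 site-packages directory. Under Python 2, formatting a byte-string template with a unicode argument, as in the `.format(u'\u2717', ...)` call shown above, goes through the ascii codec, so the interpreter version may be worth checking first. The following is only a sketch: `python_major` is a hypothetical helper, and it assumes that the `python` on the PATH is the interpreter that runs nf-core.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the major version number from the text
# printed by `python --version`, e.g. "Python 2.7.18" -> "2".
python_major() {
    local version_string=$1
    version_string=${version_string#Python }   # drop the leading "Python "
    echo "${version_string%%.*}"               # keep everything before the first dot
}

# Example call (Python 2 prints its version on stderr, hence 2>&1):
#   python_major "$(python --version 2>&1)"
```

If this prints 2, installing nf-core in a Python 3 environment may be worth trying.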

guppy_basecaller with gpu: error while loading shared libraries: libcuda.so.1

Hello,
I am trying to run nanoseq on our HPC with Singularity. To do so, I created a conda environment and installed Singularity (3.6.3) and Nextflow (20.10.0). I changed the mksquashfs path in singularity.conf to mksquashfs path = /home/abdosa/miniconda2/envs/singularity/bin/ (otherwise I get an error that mksquashfs cannot be found).
Finally, I submitted this job to our cluster (Slurm), through the submission node, with the singularity profile and runOptions = '--nv': sbatch -J nanoseq_test --wrap="nextflow run nf-core/nanoseq -r 1.1.0 -profile test,singularity -c resources.singularity.nanoseq.slurm.short.config --guppy_gpu --guppy_gpu_runners 6 --gpu_cluster_options '--part=gpu20 --gres=gpu:1'". This worked perfectly and the pipeline finished successfully (without the --nv option it failed).

When I ran nanoseq with GPU on my own data, it failed (with and without the --nv option) at the GUPPY (barcode) step because of the following error:

Error executing process > 'GUPPY (fast5)'

Caused by:
  Process `GUPPY (fast5)` terminated with an error exit status (127)

Command executed:

  guppy_basecaller \
      --input_path fast5 \
      --save_path ./basecalling \
      --records_per_fastq 0 \
      --compress_fastq \
       \
      --device auto --num_callers 12 --cpu_threads_per_caller 2 --gpu_runners_per_device 12 \
       \
      --config dna_r9.4.1_450bps_hac.cfg \

  guppy_basecaller --version &> v_guppy.txt

  ## Concatenate fastq files
  mkdir fastq
  cd basecalling
  if [ "$(find . -type d -name "barcode*" )" != "" ]
  then
      for dir in barcode*/
      do
          dir=${dir%*/}
          cat $dir/*.fastq.gz > ../fastq/$dir.fastq.gz
      done
  else
      cat *.fastq.gz > ../fastq/HM27_0h_R1.fastq.gz
  fi

Command exit status:
  127

Command output:
  (empty)

Command error:
  INFO:    Convert SIF file to sandbox...
  guppy_basecaller: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory
  INFO:    Cleaning up image...

I logged in to the GPU node and checked the following:

abdosa@gpu20-20:~$ nvidia-smi
Mon Feb  1 10:32:09 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 8000     On   | 00000000:44:00.0 Off |                  Off |
| 33%   30C    P8    11W / 260W |      0MiB / 48601MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

abdosa@gpu20-20:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148

Then I activated the Singularity conda environment, started the genomicpariscentre-guppy-gpu-4.0.14.img image with an interactive shell, and ran guppy_basecaller --help, but I got the same error:
guppy_basecaller: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory

singularity shell genomicpariscentre-guppy-gpu-4.0.14.img
INFO:    Convert SIF file to sandbox...
Singularity> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

When I look in /usr/lib/ on the submission node, I can find the following directories: cuda/ cuda-10.0/ cuda-10.1/ cuda-9.0/ cuda-9.2/

Could you please help me debug this error?
Thank you
Abdul
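
libcuda.so.1 ships with the NVIDIA driver rather than with the CUDA toolkit, so a working nvcc does not imply that the library is visible. Singularity's --nv option is what binds the host driver libraries into the container, and the interactive singularity shell command above was run without it. A minimal sketch for comparing the host with the container follows. `library_status` is a hypothetical helper, and the example calls assume that `ldconfig` is on the PATH in both places.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read an `ldconfig -p` style listing on stdin and
# report whether the named shared library appears in it.
library_status() {
    local library_name=$1
    if grep -qF -- "$library_name"; then
        echo "found"
    else
        echo "missing"
    fi
}

# Example calls, to be run on the GPU node:
#   ldconfig -p | library_status libcuda.so.1
#   singularity exec --nv genomicpariscentre-guppy-gpu-4.0.14.img ldconfig -p \
#       | library_status libcuda.so.1
```

If the host reports "found" and the container reports "missing" even with --nv, that would suggest the driver libraries are not being bound into the container on that node.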

error of test run

Description of the bug

I tried to run the test with the following Docker command:

nextflow run nf-core/nanoseq -r 3.0.0 -profile test,docker --outdir results

Command used and terminal output

Here is the log:
Oct-31 08:09:19.271 [main] DEBUG nextflow.cli.Launcher - $> nextflow run nf-core/nanoseq -r 3.0.0 -profile test,docker --outdir results
Oct-31 08:09:19.406 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 19.01.0
Oct-31 08:09:20.577 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/.git/config; branch: master; remote: origin; url: https://github.com/nf-core/nanoseq.git
Oct-31 08:09:20.589 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/.git/config; branch: master; remote: origin; url: https://github.com/nf-core/nanoseq.git
Oct-31 08:09:21.191 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/.git/config; branch: master; remote: origin; url: https://github.com/nf-core/nanoseq.git
Oct-31 08:09:21.192 [main] INFO  nextflow.cli.CmdRun - Launching `nf-core/nanoseq` [furious_lichterman] - revision: 1e60482a2c [3.0.0]
Oct-31 08:09:22.075 [main] DEBUG nextflow.config.ConfigBuilder - Found config base: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/nextflow.config
Oct-31 08:09:22.077 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/nextflow.config
Oct-31 08:09:22.094 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `test,docker`
Oct-31 08:09:23.887 [main] DEBUG nextflow.config.ConfigBuilder - In the following config object the attribute `params.genomes.GRCh37.projectDir` is empty:
  fasta=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa
  bwa=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/version0.6.0/
  bowtie2=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/
  star=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/
  bismark=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Sequence/BismarkIndex/
  gtf=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf
  bed12=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.bed
  readme=s3://ngi-igenomes/igenomes//Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt
  mito_name='MT'
  macs_gsize='2.7e9'
  blacklist=[:]/assets/blacklists/GRCh37-blacklist.bed

Oct-31 08:09:23.895 [main] ERROR nextflow.cli.Launcher - Unknown config attribute `params.genomes.GRCh37.projectDir` -- check config file: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/nextflow.config
nextflow.exception.ConfigParseException: Unknown config attribute `params.genomes.GRCh37.projectDir` -- check config file: /home/ubuntu/.nextflow/assets/nf-core/nanoseq/nextflow.config
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:83)
	at org.codehaus.groovy.reflection.CachedConstructor.doConstructorInvoke(CachedConstructor.java:77)
	at org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrap.callConstructor(ConstructorSite.java:84)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallConstructor(CallSiteArray.java:59)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:237)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:249)
	at nextflow.config.ConfigBuilder.validate(ConfigBuilder.groovy:400)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrap.invoke(PogoMetaMethodSite.java:179)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:184)
	at nextflow.config.ConfigBuilder.validate(ConfigBuilder.groovy:402)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrap.invoke(PogoMetaMethodSite.java:179)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:63)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:184)
	at nextflow.config.ConfigBuilder.validate(ConfigBuilder.groovy:402)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:156)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:184)
	at nextflow.config.ConfigBuilder.validate(ConfigBuilder.groovy:402)
	at nextflow.config.ConfigBuilder.validate(ConfigBuilder.groovy)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:156)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:176)
	at nextflow.config.ConfigBuilder.merge0(ConfigBuilder.groovy:367)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:156)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:184)
	at nextflow.config.ConfigBuilder.buildConfig0(ConfigBuilder.groovy:316)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:156)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:176)
	at nextflow.config.ConfigBuilder.buildGivenFiles(ConfigBuilder.groovy:282)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:156)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:168)
	at nextflow.config.ConfigBuilder.buildConfigObject(ConfigBuilder.groovy:633)
	at nextflow.config.ConfigBuilder.build(ConfigBuilder.groovy:646)
	at nextflow.script.ScriptRunner.<init>(ScriptRunner.groovy:121)
	at nextflow.cli.CmdRun.run(CmdRun.groovy:223)
	at nextflow.cli.Launcher.run(Launcher.groovy:445)
	at nextflow.cli.Launcher.main(Launcher.groovy:622)

Relevant files

No response

System information

No response

Error "cat: '*.fastq.gz': No such file or directory" when adding additional Guppy parameters

Hello, I tried to get the pipeline (-r dev) to accept additional Guppy parameters so that Guppy would perform Qscore filtering of the fastq files into "pass" and "fail" groups. However, the pipeline failed with this error: Command error: cat: '*.fastq.gz': No such file or directory

The command line that I used was
nextflow run nf-core/nanoseq --input samplesheet.csv --protocol cDNA --input_path ./fast5_pass --flowcell 'FLO-MIN106 --qscore_filtering --min_qscore 7' --kit SQK-DCS109 --skip_demultiplexing --skip_quantification -config myconfig.config -profile singularity -r dev

The contents of my sample sheet and the pipeline_report.txt are listed in this GIST. When I look in the nextflow work directory used by Guppy, I see that it is properly separating the failed fastq from the passed fastq into their respective fail and pass directories. But the pipeline doesn't realize that the *fastq.gz files are in the pass directory and not in the current directory; hence the cat error.
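
One way the concatenation step could account for the pass directory is sketched below. `fastq_source_dir` is a hypothetical helper, not part of the pipeline. It assumes that Guppy writes a `pass` subdirectory only when Qscore filtering is enabled, as described above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the directory that holds the basecalled
# *.fastq.gz files: "<dir>/pass" when that subdirectory exists,
# otherwise "<dir>" itself.
fastq_source_dir() {
    local basecall_dir=$1
    if [ -d "$basecall_dir/pass" ]; then
        echo "$basecall_dir/pass"
    else
        echo "$basecall_dir"
    fi
}

# Example call (paths are placeholders):
#   cat "$(fastq_source_dir basecalling)"/*.fastq.gz > fastq/sample.fastq.gz
```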

"/bin/bash: .command.run: Permission denied" with SELinux enabled

Hi, I got the error "/bin/bash: .command.run: Permission denied" when running nf-core/nanoseq with the profile test,docker.

Any idea what the cause might be? And is there a good way to debug this?

Thank you.

My environment is below. Nextflow is the latest version, which I installed last week.

Docker was installed with sudo dnf install moby-engine on Fedora 32.

$ uname -m
x86_64

$ cat /etc/fedora-release
Fedora release 32 (Thirty Two)

$ nextflow -v
nextflow version 20.07.1.5412

$ docker --version
Docker version 19.03.11, build 42e35e6

$ rpm -qf $(which docker)
moby-engine-19.03.11-1.ce.git42e35e6.fc32.x86_6
$ docker system prune -a -f

$ docker images 
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

$ pwd
/home/jaruga/git/nf-core/nanoseq

On master: b88e1c9a77083e90dd0bc5e900e9bcff84814559.

$ nextflow run nf-core/nanoseq -profile test,docker 2>&1 | tee nextflow-run-nanoseq-test-docker.log

Here is the log file.

Caused by:
  Process `output_documentation` terminated with an error exit status (126)

Command executed:

  markdown_to_html.py output.md -o results_description.html

Command exit status:
  126

Command output:
  (empty)

Command error:
  Unable to find image 'nfcore/nanoseq:1.0.0' locally
...
  Status: Image is up to date for nfcore/nanoseq:1.0.0
  /bin/bash: .command.run: Permission denied

Work dir:
  /home/jaruga/git/nf-core/nanoseq/work/7f/48ee820291acfcbfa6fc35f1c27139

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

The .command.out is empty.

$ cat  /home/jaruga/git/nf-core/nanoseq/work/7f/48ee820291acfcbfa6fc35f1c27139/.command.out

$ cat  /home/jaruga/git/nf-core/nanoseq/work/7f/48ee820291acfcbfa6fc35f1c27139/.command.sh 
#!/bin/bash -euo pipefail
markdown_to_html.py output.md -o results_description.html

Add optional DNA cleaning option

Hello everyone !

Thank you for the work on nanoseq !

In Nanopore sequencing, a QC DNA often has to be added during library preparation: a known portion of the lambda phage genome, called DNA CS. We use an existing tool, NanoLyse (https://github.com/wdecoster/nanolyse), to remove the reads corresponding to this control DNA before handing the data to the research team.

The tool takes FASTQ as input on stdin and writes the cleaned sequences to stdout. It can also take the sequencing summary as an input and write a cleaned sequencing summary as an output, so that the short reads from the DNA CS do not distort the median and N50 metrics of usable reads in the QC tools (which often take the sequencing_summary as input).
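
The stdin/stdout usage described above might look like the following sketch. The function name `clean_fastq` and its optional third argument are hypothetical. The default filter is assumed to be the `NanoLyse` executable on the PATH, and any other command that filters FASTQ from stdin to stdout can be substituted.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: decompress a FASTQ file, pass it through a filter
# that reads stdin and writes stdout, and recompress the result.
clean_fastq() {
    local in_fastq_gz=$1
    local out_fastq_gz=$2
    local filter_cmd=${3:-NanoLyse}   # assumption: NanoLyse is installed
    gunzip -c "$in_fastq_gz" | "$filter_cmd" | gzip > "$out_fastq_gz"
}

# Example call (file names are placeholders):
#   clean_fastq reads.fastq.gz reads_without_lambda.fastq.gz
```

Keeping the filter as a parameter makes it possible to try the wrapper with a pass-through command such as `cat` before NanoLyse is installed.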

I am a complete beginner in Nextflow, and before I discovered nanoseq I was trying to do this myself. If someone feels more comfortable implementing it and adding it to nanoseq, that would be great!

I will keep working on it on my side and can lend a hand if needed but, as I said, I am very new to Nextflow...

Have a great day !

Roxane

fastq concatenation expects barcode01 directory exists

line 239: if [ -d "barcode01" ]

I get an error when running samples that don't include barcode01 (e.g. my pool contains barcodes 16-18).

I solved it by replacing line 239 with
if ls -d "barcode"*

but there might be a cleaner bash solution.
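
One possible alternative is to look for any barcode* directory with `find` rather than testing for barcode01. The sketch below is a suggestion only: `list_barcode_dirs` and `concat_fastq` are hypothetical names, and the directory layout (one barcode* subdirectory per barcode, each holding *.fastq.gz files) is assumed from the check on line 239.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the barcode* subdirectories of a directory,
# one per line, without assuming that barcode01 exists.
list_barcode_dirs() {
    local basecall_dir=$1
    find "$basecall_dir" -mindepth 1 -maxdepth 1 -type d -name 'barcode*' | sort
}

# Concatenate per barcode when any barcode directory exists; otherwise
# fall back to a single output file named after the sample.
concat_fastq() {
    local basecall_dir=$1 out_dir=$2 sample=$3
    local barcode_dirs dir
    mkdir -p "$out_dir"
    barcode_dirs=$(list_barcode_dirs "$basecall_dir")
    if [ -n "$barcode_dirs" ]; then
        while IFS= read -r dir; do
            cat "$dir"/*.fastq.gz > "$out_dir/$(basename "$dir").fastq.gz"
        done <<< "$barcode_dirs"
    else
        cat "$basecall_dir"/*.fastq.gz > "$out_dir/$sample.fastq.gz"
    fi
}

# Example call (paths and sample name are placeholders):
#   concat_fastq basecalling fastq my_sample
```

Unlike `ls -d "barcode"*`, this prints nothing and raises no error when no barcode directory is present, so the fallback branch is reached cleanly.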

MultiQC Error when running pipeline in Ubuntu WSL under win 10

When running the pipeline in Ubuntu 20.04 under WSL 2 with Docker on Windows, the pipeline errors out before finishing, while creating the MultiQC report:

Log from the command line:

jkbenotmane@DESKTOP-H68A51G:/mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline$ nextflow run nf-core/nanoseq --input '/mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/SPATCR#1_2_samplesheet_10XVDJ.csv' --protocol cDNA --skip_basecalling --skip_demultiplexing --skip_quantification --outdir '/mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/Merged_SPATCR1_2_10XVDJ' --max_memory '30.GB' --max_cpus 8 -profile docker -w "Output_logs_1"
N E X T F L O W ~ version 20.10.0
Launching nf-core/nanoseq [berserk_tesla] - revision: ad5b2bb [master]

                                      ,--./,-.
      ___     __   __   __   ___     /,-._.--~\
|\ | |__  __ /  ` /  \ |__) |__         }  {
| \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                      `._,._,'
nf-core/nanoseq v1.1.0

Pipeline Release : master
Run Name : berserk_tesla
Samplesheet : /mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/SPATCR#1_2_samplesheet_10XVDJ.csv
Protocol : cDNA
Stranded : No
Skip Basecalling : Yes
Skip Demultiplexing : Yes
Aligner : minimap2
Save Intermeds : No
Max Resources : 30.GB memory, 8 cpus, 10d time per job
Container : docker - nfcore/nanoseq:1.1.0
Output dir : /mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/Merged_SPATCR1_2_10XVDJ
Launch dir : /mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline
Working dir : /mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/Output_logs_1
Script dir : /home/jkbenotmane/.nextflow/assets/nf-core/nanoseq
User : jkbenotmane
Config Profile : docker
Config Files : /home/jkbenotmane/.nextflow/assets/nf-core/nanoseq/nextflow.config

executor > local (13)
[d7/9b11bb] process > CHECK_SAMPLESHEET (SPATCR#1_2_samplesheet_10XVDJ.csv) [100%] 1 of 1 ✔
[- ] process > PYCOQC -
[- ] process > NANOPLOT_SUMMARY -
[85/883c8f] process > NANOPLOT_FASTQ (UDI5_R1) [100%] 1 of 1 ✔
[dc/6f7c9a] process > FASTQC (UDI5_R1) [100%] 1 of 1 ✔
[f4/ab7749] process > GET_CHROM_SIZES (VDJ_regions.fa) [100%] 1 of 1 ✔
[- ] process > GTF_TO_BED -
[17/4f1020] process > MINIMAP2_INDEX (VDJ_regions.fa) [100%] 1 of 1 ✔
[92/867e4c] process > MINIMAP2_ALIGN (UDI5_R1) [100%] 1 of 1 ✔
[a6/216f2e] process > SAMTOOLS_SORT (UDI5_R1) [100%] 1 of 1 ✔
[7c/297fd3] process > BEDTOOLS_GENOMECOV (UDI5_R1) [100%] 1 of 1 ✔
[- ] process > UCSC_BEDGRAPHTOBIGWIG [ 0%] 0 of 1
[24/f45afa] process > BEDTOOLS_BAMTOBED (UDI5_R1) [100%] 1 of 1 ✔
[7e/f7a186] process > UCSC_BED12TOBIGBED (UDI5_R1) [ 0%] 0 of 1
[f9/bbaddf] process > OUTPUT_DOCUMENTATION [100%] 1 of 1 ✔
[75/bb927d] process > GET_SOFTWARE_VERSIONS [100%] 1 of 1 ✔
[a6/d5b1e1] process > MULTIQC (1) [ 0%] 0 of 1
Error executing process > 'MULTIQC (1)'

Caused by:
Process MULTIQC (1) terminated with an error exit status (1)

Command executed:

multiqc . -f

Command exit status:
1

Command output:
(empty)

Command error:
[WARNING] multiqc : MultiQC Version v1.10.1 now available!
[INFO ] multiqc : This is MultiQC v1.9
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching : /mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/Output_logs_1/a6/d5b1e186c3f88fb02836b305bcea2d
[INFO ] multiqc : Only using modules custom_content, pycoqc, fastqc, samtools, featureCounts
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-vvr3wxj8 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
[INFO ] custom_content : nf-core-nanoseq-summary: Found 1 sample (html)
[INFO ] custom_content : software_versions: Found 1 sample (html)
[INFO ] fastqc : Found 1 reports
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
[INFO ] samtools : Found 1 stats reports
[INFO ] samtools : Found 1 flagstat reports
[INFO ] samtools : Found 1 idxstats reports
[INFO ] multiqc : Compressing plot data
[INFO ] multiqc : Report : multiqc_report.html
[INFO ] multiqc : Data : multiqc_data
executor > local (13)
[d7/9b11bb] process > CHECK_SAMPLESHEET (SPATCR#1_2_samplesheet_10XVDJ.csv) [100%] 1 of 1 ✔
[- ] process > PYCOQC -
[- ] process > NANOPLOT_SUMMARY -
[85/883c8f] process > NANOPLOT_FASTQ (UDI5_R1) [100%] 1 of 1 ✔
[dc/6f7c9a] process > FASTQC (UDI5_R1) [100%] 1 of 1 ✔
[f4/ab7749] process > GET_CHROM_SIZES (VDJ_regions.fa) [100%] 1 of 1 ✔
[- ] process > GTF_TO_BED -
[17/4f1020] process > MINIMAP2_INDEX (VDJ_regions.fa) [100%] 1 of 1 ✔
[92/867e4c] process > MINIMAP2_ALIGN (UDI5_R1) [100%] 1 of 1 ✔
[a6/216f2e] process > SAMTOOLS_SORT (UDI5_R1) [100%] 1 of 1 ✔
[7c/297fd3] process > BEDTOOLS_GENOMECOV (UDI5_R1) [100%] 1 of 1 ✔
[- ] process > UCSC_BEDGRAPHTOBIGWIG [ 0%] 0 of 1
[24/f45afa] process > BEDTOOLS_BAMTOBED (UDI5_R1) [100%] 1 of 1 ✔
[7e/f7a186] process > UCSC_BED12TOBIGBED (UDI5_R1) [100%] 1 of 1 ✔
[f9/bbaddf] process > OUTPUT_DOCUMENTATION [100%] 1 of 1 ✔
[75/bb927d] process > GET_SOFTWARE_VERSIONS [100%] 1 of 1 ✔
[a6/d5b1e1] process > MULTIQC (1) [100%] 1 of 1, failed: 1 ✘

Error executing process > 'MULTIQC (1)'

Caused by:
Process MULTIQC (1) terminated with an error exit status (1)

Command executed:

multiqc . -f

Command exit status:
1

Command output:
(empty)

Command error:
[WARNING] multiqc : MultiQC Version v1.10.1 now available!
[INFO ] multiqc : This is MultiQC v1.9
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching : /mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/Output_logs_1/a6/d5b1e186c3f88fb02836b305bcea2d
[INFO ] multiqc : Only using modules custom_content, pycoqc, fastqc, samtools, featureCounts
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-vvr3wxj8 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
[INFO ] custom_content : nf-core-nanoseq-summary: Found 1 sample (html)
[INFO ] custom_content : software_versions: Found 1 sample (html)
[INFO ] fastqc : Found 1 reports
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/plots/bargraph.py:451: UserWarning: FixedFormatter should only be used together with FixedLocator
axes.set_xticklabels(['{:.0f}%'.format(x) for x in vals])
[INFO ] samtools : Found 1 stats reports
[INFO ] samtools : Found 1 flagstat reports
[INFO ] samtools : Found 1 idxstats reports
[INFO ] multiqc : Compressing plot data
[INFO ] multiqc : Report : multiqc_report.html
[INFO ] multiqc : Data : multiqc_data
Traceback (most recent call last):
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/bin/multiqc", line 6, in
from multiqc.main import multiqc
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/main.py", line 44, in
multiqc.run_cli(prog_name='multiqc')
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/click/core.py", line 829, in call
return self.main(*args, **kwargs)
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/multiqc.py", line 215, in run_cli
multiqc_run = run(
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/site-packages/multiqc/multiqc.py", line 784, in run
copy_tree(config.data_tmp_dir, config.data_dir)
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/distutils/dir_util.py", line 161, in copy_tree
copy_file(src_name, dst_name, preserve_mode,
File "/opt/conda/envs/nf-core-nanoseq-1.1.0/lib/python3.8/distutils/file_util.py", line 158, in copy_file
os.utime(dst, (st[ST_ATIME], st[ST_MTIME]))
PermissionError: [Errno 1] Operation not permitted

Work dir:
/mnt/d/Dropbox/KBJasim/Projects/Capture_Sequencing/Samples/SpaTCR/SPATCR1_2/Nextflow _pipeline/Output_logs_1/a6/d5b1e186c3f88fb02836b305bcea2d

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
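
The traceback above ends in `os.utime`, which `distutils` calls to preserve timestamps after each file copy. The following is a minimal sketch, assuming the work directory sits on a mount (here under `/mnt/d`) that refuses timestamp changes. It copies a directory tree by content only, so no metadata calls are made. The function name `copy_tree_contents_only` and the directory names in the demo are hypothetical and are not part of MultiQC.

```python
import os
import shutil
import tempfile


def copy_tree_contents_only(src, dst):
    """Copy every file under src into dst without preserving metadata.

    shutil.copyfile copies file contents only, so os.utime and os.chmod are
    never called on the destination files.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            target = os.path.join(target_dir, name)
            shutil.copyfile(os.path.join(root, name), target)
            copied.append(target)
    return sorted(copied)


if __name__ == "__main__":
    # Hypothetical stand-ins for a temporary and a final data directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "multiqc_tmp")
        os.makedirs(os.path.join(src, "nested"))
        with open(os.path.join(src, "nested", "stats.txt"), "w") as fh:
            fh.write("Sample\n")
        print(copy_tree_contents_only(src, os.path.join(tmp, "multiqc_data")))
```

If the mount is indeed the cause, pointing the Nextflow work directory at a native Linux path with the `-w` option may avoid the error without touching any code.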

Nanoseq not pulling Deseq2 and Dexseq

The pipeline runs through all of its other processes when I use the docker profile.

However, it does not perform the DESeq2 and DEXSeq analyses.

I tried every combination offered by the protocol designer, but still no luck.

Could this be caused by skipping the basecalling step? Basecalling has already been done in MinKNOW, so repeating it in the pipeline would take unnecessary time.

Please let me know what changes I can make so that the pipeline runs DESeq2 and DEXSeq on my RNA-sequenced samples.

Thank you.

sample sheet for samples without barcodes

Currently I have to specify a barcode in the sample sheet even if a sample doesn't have one, only so it can pass the sample sheet check. This happens if we run only a single sample per flowcell.
Without a barcode kit specified, demultiplexing will still be skipped in these cases (as it should be).
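
A minimal sketch of a samplesheet check that accepts an empty barcode field. The column order is an assumption inferred from the samplesheet lines quoted elsewhere on this page, and the rule that a barcode must be a positive integer is also an assumption; the pipeline's own check may differ.

```python
import csv
import io

# Assumed column layout (not confirmed by the pipeline documentation here).
COLUMNS = ["sample", "replicate", "barcode", "input_file", "genome", "transcriptome"]


def check_row(row):
    """Return a list of problems for one row; an empty barcode is allowed."""
    problems = []
    if not row["sample"]:
        problems.append("sample name is missing")
    if not row["replicate"].isdigit() or int(row["replicate"]) < 1:
        problems.append("replicate must be a positive integer")
    barcode = row["barcode"]
    if barcode and (not barcode.isdigit() or int(barcode) < 1):
        problems.append("barcode, when given, must be a positive integer")
    return problems


def check_samplesheet(text):
    """Map 1-based line numbers to the problems found on that line."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS, restval="")
    report = {}
    for line_no, row in enumerate(reader, start=1):
        problems = check_row(row)
        if problems:
            report[line_no] = problems
    return report
```

With this shape, a single-sample row such as `UDI17,1,,reads.fastq.gz,genome.fa,` passes without a placeholder barcode.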

Create Genome indices as an additional step

Index creation with both GraphMap and minimap2 takes only a few minutes, even for larger genomes, so it's not a massive overhead. It is currently done at run time during alignment, but we could add one or more processes to do it beforehand, possibly to reduce redundancy in storage. It's going to be a bit tricky because the pipeline can accept multiple genomes as input for mapping within the samplesheet. We need to figure out a way to create the index only once for any given genome and then pass it to the appropriate mapping step for each sample.
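
The "index once per genome" idea can be sketched as a grouping step. This is only an illustration of the bookkeeping, not pipeline code; the function name `plan_indices` and the index naming scheme are hypothetical.

```python
from collections import defaultdict


def plan_indices(samples, aligner):
    """Group samples by reference so each genome is indexed exactly once.

    samples maps sample name -> genome FASTA path.
    Returns (index_jobs, sample_to_index).
    """
    by_genome = defaultdict(list)
    for sample, genome in samples.items():
        by_genome[genome].append(sample)

    index_jobs = {}
    sample_to_index = {}
    for genome, names in by_genome.items():
        index_name = f"{genome}.{aligner}.idx"  # hypothetical naming scheme
        index_jobs[genome] = index_name
        for name in names:
            sample_to_index[name] = index_name
    return index_jobs, sample_to_index


if __name__ == "__main__":
    jobs, mapping = plan_indices(
        {"s1": "hg38.fa", "s2": "hg38.fa", "s3": "mm10.fa"}, "minimap2"
    )
    print(jobs)     # one indexing job per distinct genome
    print(mapping)  # every sample points at its genome's single index
```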

Is there a way to run multiple guppy_basecaller jobs in parallel?

Is your feature request related to a problem? Please describe

We have a multi-GPU server where each process can only access a single GPU. So it is very advantageous for us to run multiple guppy_basecaller instances in parallel (usually we batch 50 or so fast5 files in a single process) and merge the results. From what I can surmise from the sample sheet description, this is not a workflow that is currently supported by nanoseq. Is that correct?

Describe the solution you'd like

Is there a pattern for that from other nf-core workflows which use embarrassingly parallel processing of reads?

Describe alternatives you've considered

I guess we could hack together a solution by creating many nanoseq runs for the basecalling, and then merging those together using nanoseq with basecalling skipped. But it would be nice to have this built in.
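
The batching described above (roughly 50 fast5 files per process, one GPU per process) can be sketched as follows. The function names and the round-robin GPU assignment are assumptions made for illustration; nanoseq itself is not confirmed to work this way.

```python
def batch_files(files, batch_size=50):
    """Split a list of file names into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(files)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def assign_gpus(batches, n_gpus):
    """Pair each batch with a GPU index, round-robin."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return [(i % n_gpus, batch) for i, batch in enumerate(batches)]


if __name__ == "__main__":
    files = [f"read_{i:03d}.fast5" for i in range(120)]
    for gpu, batch in assign_gpus(batch_files(files), n_gpus=2):
        print(gpu, len(batch), batch[0])
```

Each (GPU, batch) pair would then become one basecalling task, with the per-batch outputs merged afterwards.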

Thanks!

Additional context

Consider allowing a custom output directory

Is your feature request related to a problem? Please describe

The pipeline always writes its output to the results folder, so different analyses overwrite each other.
It would be great to be able to rename the output directory.

Expose guppy parameters to the user

We found that we had to tweak guppy's --gpu_runners_per_device parameter, so it would be good to let users set it, since the right value is highly specific to the type of GPU in use. We should probably do the same for other guppy parameters, such as --cpu_threads_per_caller and --num_callers.
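
One way to expose these options is to append only the flags a user has set. This is a sketch under assumptions: the helper `guppy_command` is hypothetical, and a real guppy run needs further arguments (for example a flowcell and kit, or a config) that are omitted here.

```python
import shlex

# The three flags named in the request above; unset values are not passed.
TUNABLE_FLAGS = ("gpu_runners_per_device", "cpu_threads_per_caller", "num_callers")


def guppy_command(input_path, save_path, **overrides):
    """Build a guppy_basecaller command line with optional tuning flags."""
    unknown = set(overrides) - set(TUNABLE_FLAGS)
    if unknown:
        raise ValueError(f"unsupported guppy option(s): {sorted(unknown)}")
    args = ["guppy_basecaller", "--input_path", input_path, "--save_path", save_path]
    for flag in TUNABLE_FLAGS:
        value = overrides.get(flag)
        if value is not None:
            args += [f"--{flag}", str(value)]
    # Quote each argument so paths with unusual characters stay intact.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(guppy_command("fast5", "basecalled", gpu_runners_per_device=6))
```

Rejecting unknown keys keeps typos from being silently dropped.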

Total alignments : 0 / featureCount Results

I have a human RNA-Seq dataset with two different barcodes, each in its own folder. I aligned the reads with this command:
minimap2 -ax splice -uf -k14 ref.fa direct-rna.fq > aln.sam
I then tried to quantify read counts with the Subread featureCounts function. In the featureCounts results, there is a problem with one of the BAM files. I downloaded the reference and GTF files from GENCODE.
I checked the headers of both BAM files (first.bam and second.bam) with samtools view -H and confirmed that I followed the same steps for each one.
In IGV, I can see matches and alignments for all BAM files.

Do you have any suggestions to solve this problem? What am I doing wrong?

featureCounts -T 8 -a gencode.v38.chr_patch_hapl_scaff.annotation.gtf -g 'transcript_id' -o readcouts.txt bam/*.bam

|| Total alignments : 11214480 ||
|| Successfully assigned alignments : 4051945 (36.1%) ||
|| Running time : 2.67 minutes

|| Total alignments : 0 ||
|| Successfully assigned alignments : 0 ||
|| Running time : 2.89 minutes

I also tried Salmon's alignment-based quantification, and the results for the two BAM files differ hugely.
salmon quant --ont -t reference.fa -l A -a first.bam -o salmon_quant1
Total # of mapped reads : 5465357
# of uniquely mapped reads : 328808350000000
# ambiguously mapped reads : 2177274
salmon quant --ont -t reference.fa -l A -a second.bam -o salmon_quant2
Completed first pass through the alignment file.
Total # of mapped reads : 3843632
# of uniquely mapped reads : 2552463
# ambiguously mapped reads : 1291169

Add guppy to BioConda

I can't seem to find GraphMap2 on BioConda, which also means the associated BioContainer won't exist. It would be nice to be able to use it in the pipeline because it has some updated functionality for Nanopore protocols.

running using -profile test fails

Hello,
the execution fails when using the profile test.

Launching `nf-core/nanoseq` [romantic_dalembert] - revision: c960c586d1 [master]
No such file: https://raw.githubusercontent.com/drpatelh/test-datasets/nanoseq/samplesheet.csv


Add GPU support for Guppy

The pipeline can only use CPUs at this point. It would definitely be worth adding GPU support because the speed-up is massive. I'm not entirely sure how best to do this with NF or which parameters to use, but opinions are more than welcome 😃

NanoPlot 1.28.4 has crashed

Hi,
I encountered a problem when using nanoseq with the singularity profile.

Here is my command:

nextflow run nf-core/nanoseq -latest \
    --input /home/genouest/cnrs_umr6290/mlorthiois/nanoseq_test/input.csv \
    --protocol cDNA \
    --input_path /home/genouest/cnrs_umr6290/mlorthiois/nanoseq_test/data \
    --flowcell FLO-MIN106 \
    --kit SQK-DCS109 \
    --skip_demultiplexing \
    -profile singularity

During the NanoPlot step, I get this error:

Error executing process > 'NanoPlotSummary (sequencing_summary.txt)'

Caused by:
  Process `NanoPlotSummary (sequencing_summary.txt)` terminated with an error exit status (1)

Command executed:

  NanoPlot -t 2 --summary sequencing_summary.txt

Command exit status:
  1

Command output:

  If you read this then NanoPlot 1.28.4 has crashed :-(
  Please try updating NanoPlot and see if that helps...
  
  If not, please report this issue at https://github.com/wdecoster/NanoPlot/issues
  If you could include the log file that would be really helpful.
  Thanks!
  
Command error:
  Traceback (most recent call last):
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/bin/NanoPlot", line 10, in <module>
      sys.exit(main())
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/nanoplot/NanoPlot.py", line 96, in main
      plots = make_plots(datadf, settings)
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/nanoplot/NanoPlot.py", line 197, in make_plots
      plot_settings=plot_settings)
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/nanoplotter/timeplots.py", line 46, in time_plots
      color=color)
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/nanoplotter/timeplots.py", line 238, in cumulative_yield
      scatter_kws={"s": 3})
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/seaborn/regression.py", line 810, in regplot
      x_jitter, y_jitter, color, label)
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/seaborn/regression.py", line 114, in __init__
      self.dropna("x", "y", "units", "x_partial", "y_partial")
    File "/opt/conda/envs/nf-core-nanoseq-1.0.0/lib/python3.7/site-packages/seaborn/regression.py", line 66, in dropna
      setattr(self, var, val[not_na])
  IndexError: invalid index to scalar variable.

This error has been reported in this issue, which recommends updating Seaborn to version 0.10.1.

Will updating Seaborn be enough to fix my error? If so, how can I modify the Singularity container to include the new version of Seaborn?

Thanks.

Spaces in samplesheet Path to genome

When the samplesheet contains paths to the reference genome, transcriptome, etc., the pipeline throws execution errors if a path contains spaces:

Command exit status: 1
Command output:
ERROR: Please check samplesheet -> Genome entry contains spaces!
Line: 'UDI17,1,,../NatTCRcompressed.fastq.gz,/home/jkb/Desktop/Dropbox/Alignment\ References/Gencode/GRCh38.p13.genome.fa.gz,/home/jkb/Desktop/Dropbox/Alignment\ References/gencode.v19.annotation_1.gff3'

Command output:
ERROR: Please check samplesheet -> Genome entry contains spaces!
Line: 'UDI17,1,,../NatTCRcompressed.fastq.gz,/home/jkb/Desktop/Dropbox/Alignment' 'References/Gencode/GRCh38.p13.genome.fa.gz,/home/jkb/Desktop/Dropbox/Alignment' 'References/gencode.v19.annotation_1.gff3'

Command output: ERROR: Please check samplesheet -> Genome entry contains spaces!
Line: 'UDI17,1,,../NatTCRcompressed.fastq.gz,"/home/jkb/Desktop/Dropbox/Alignment References/Gencode/GRCh38.p13.genome.fa.gz","/home/jkb/Desktop/Dropbox/Alignment References/gencode.v19.annotation_1.gff3"'

Neither escaping the space with a backslash nor wrapping the path in quotation marks solved the issue.
I would be very happy if someone could help me with that.
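
The three attempts above fail in the same way, which is what one would expect if the check simply looks for a space character in the genome field. The sketch below mimics such a check (a hypothetical reconstruction, not the pipeline's actual code) and shows one possible workaround: a symlink whose own path is free of spaces. Whether the pipeline and its containers can follow such a link is an assumption that would need testing.

```python
import csv
import os
import tempfile


def genome_entry_has_space(line):
    """Mimic a check that rejects any space in the genome column (5th field)."""
    fields = next(csv.reader([line]))
    return " " in fields[4]


def link_without_spaces(path, link_dir):
    """Create a symlink to path whose own path contains no spaces."""
    if " " in link_dir:
        raise ValueError("link_dir itself must not contain spaces")
    link = os.path.join(link_dir, os.path.basename(path).replace(" ", "_"))
    os.symlink(os.path.abspath(path), link)
    return link


if __name__ == "__main__":
    # Escaped and quoted variants both still contain a literal space.
    print(genome_entry_has_space(r"UDI17,1,,reads.fastq.gz,/data/Alignment\ References/genome.fa,"))
    print(genome_entry_has_space('UDI17,1,,reads.fastq.gz,"/data/Alignment References/genome.fa",'))
    with tempfile.TemporaryDirectory() as tmp:
        spaced = os.path.join(tmp, "Alignment References")
        os.makedirs(spaced)
        genome = os.path.join(spaced, "genome.fa")
        open(genome, "w").close()
        links = os.path.join(tmp, "links")
        os.makedirs(links)
        print(link_without_spaces(genome, links))
```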

Cannot invoke method containsKey() on null object when "igenomes_ignore=true"

Hi,

When I use the profile provided by my institution (Genouest), which sets the parameter igenomes_ignore = true, I get this error:

Cannot invoke method containsKey() on null object

 -- Check script '/home/genouest/cnrs_umr6290/mlorthiois/.nextflow/assets/nf-core/nanoseq/main.nf' at line: 290 or see '.nextflow.log' file for more details

Solved by adding --igenomes_ignore false in my command line.

Transcriptome aware genome alignment

Looking to implement transcriptome-aware genome alignment in graphmap2 (which requires a GTF file as input) and minimap2 (the --junc-bed flag does this and requires a BED12 file as input).

The current idea is to add an optional column in the samplesheet for a GTF or BED12 file; then, if the selected aligner is minimap2 and the file is in GTF format, we can convert it to BED12.
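
To make the GTF-to-BED12 step concrete, here is a minimal sketch of the conversion. It assumes well-formed, tab-separated GTF exon records that each carry a `transcript_id` attribute, and it sets the thick region to the whole transcript. The function name `gtf_to_bed12` is hypothetical; the pipeline may well use an existing tool instead.

```python
from collections import defaultdict


def gtf_to_bed12(gtf_lines):
    """Convert GTF exon records to BED12 lines, one per transcript."""
    transcripts = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, feature, start, end, strand, attrs = (
            fields[0], fields[2], fields[3], fields[4], fields[6], fields[8]
        )
        if feature != "exon":
            continue
        tid = attrs.split('transcript_id "')[1].split('"')[0]
        # GTF is 1-based inclusive; BED is 0-based half-open.
        transcripts[(chrom, strand, tid)].append((int(start) - 1, int(end)))

    bed = []
    for (chrom, strand, tid), exons in transcripts.items():
        exons.sort()
        tx_start = exons[0][0]
        tx_end = max(e for _, e in exons)
        sizes = ",".join(str(e - s) for s, e in exons)
        starts = ",".join(str(s - tx_start) for s, _ in exons)
        bed.append("\t".join([
            chrom, str(tx_start), str(tx_end), tid, "0", strand,
            str(tx_start), str(tx_end), "0", str(len(exons)), sizes, starts,
        ]))
    return bed


if __name__ == "__main__":
    example = [
        'chr1\ttest\texon\t101\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
        'chr1\ttest\texon\t301\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    ]
    print("\n".join(gtf_to_bed12(example)))
```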

Let me know what you think.

Error executing process > 'NFCORE_NANOSEQ:NANOSEQ:RNA_FUSIONS_JAFFAL:UNTAR (null)'

I have been getting this error:


Error executing process > 'NFCORE_NANOSEQ:NANOSEQ:RNA_FUSIONS_JAFFAL:UNTAR (null)'
 Caused by:
Not a valid path value type: org.codehaus.groovy.runtime.NullObject (null)

 Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

Failed to invoke `workflow.onComplete` event handler 
 -- Check script '/root/.nextflow/assets/nf-core/nanoseq/./workflows/nanoseq.nf' at line: 524 or see '.nextflow.log' file for more details

This is the command I ran:

nextflow run nf-core/nanoseq \
    --input nanoseq_samplesheet.csv \
    --outdir nf-core_NANOSEQ/ \
    --protocol cDNA \
    --skip_basecalling \
    --skip_demultiplexing \
    -profile docker \
    --multiqc_title _LongRead_Nanopore_seq \
    --save_align_intermeds

Any help would be very much appreciated.

bamaddreadbundles failed

Every time I try to run this function, it fails.

bamaddreadbundles -I mapped_od.bam -O filtered.bam

Is there an alternative way to write this?

parameters to use for the rapid barcode 96 kit

It's not clear from the docs what to use for --kit and --barcode_kit if the library prep and barcoding were done with the rapid barcode 96 kit (SQK-RBK110.96). For instance, can one use RBK004 for --barcode_kit? Using --kit SQK-RBK110.96 with guppy 6.0.1+652ffd1 doesn't work (it fails with a "Could not find matching workflow" error).

Add Nanolyse back to remove guppy branch

Description of feature

Nanolyse should still work in the pipeline.
No need to remove it.
It was removed in the remove_guppy branch but should actually be included.

Error on running profile test

Description of the bug

$nextflow pull nf-core/nanoseq
Checking nf-core/nanoseq ...
Already-up-to-date - revision: 1e60482 [master]
(base) [ec2-user@ip-1-9-8 nf-nanoseq]$ nextflow run nf-core/nanoseq -profile test,docker --outdir test_dir
N E X T F L O W ~ version 20.01.0
Launching nf-core/nanoseq [soggy_snyder] - revision: 1e60482 [master]
Unknown config attribute params.genomes.GRCh37.projectDir -- check config file: /home/ec2-user/.nextflow/assets/nf-core/nanoseq/nextflow.config

How can I ignore 'params.genomes.GRCh37.projectDir'? Thanks.
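
The log above shows Nextflow 20.01.0. One thing worth ruling out is that the installed Nextflow predates the `projectDir` variable used in the pipeline config. The sketch below compares the version printed in the banner against a minimum; the value 20.04.0 is an assumption about when `projectDir` became available and should be checked against the pipeline's stated requirement.

```python
import re

# Assumed minimum version; verify against the pipeline's own requirement.
MINIMUM = (20, 4, 0)


def parse_version(text):
    """Extract (major, minor, patch) from text such as 'version 20.01.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_recent_enough(banner, minimum=MINIMUM):
    """True if the version found in the banner is at least the minimum."""
    return parse_version(banner) >= minimum


if __name__ == "__main__":
    print(is_recent_enough("N E X T F L O W ~ version 20.01.0"))
```

If the check fails, updating Nextflow is likely a better route than trying to ignore the attribute.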

Command used and terminal output

No response

Relevant files

No response

System information

No response

FastQC fails when fastq file is not in gzip format

I found this when I tried to run with the --skip_basecalling flag on some of our data.
This is due to the symbolic link we create at the start.
Are we making it mandatory for the FASTQ file to be gzipped, or will we add a case here so it doesn't break?
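
One way to handle both cases is to name the symbolic link according to the file's actual compression rather than assuming gzip. A minimal sketch, assuming the link name is what misleads FastQC; the helper names are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Detect gzip compression from the file's magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def link_name_for(path, sample):
    """Pick a link name whose extension matches the real compression."""
    return f"{sample}.fastq.gz" if is_gzipped(path) else f"{sample}.fastq"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "reads_a")
        packed = os.path.join(tmp, "reads_b")
        with open(plain, "w") as fh:
            fh.write("@r1\nACGT\n+\nIIII\n")
        with gzip.open(packed, "wt") as fh:
            fh.write("@r1\nACGT\n+\nIIII\n")
        print(link_name_for(plain, "sampleA"), link_name_for(packed, "sampleB"))
```

Checking the content rather than the original file name also covers inputs whose extension does not match their format.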

adding modules from nanoseq dsl2 conversion

Hi! I converted the nf-core/nanoseq pipeline to dsl2 two months ago in this pull request, and @drpatelh suggested that it would be nice to add the modules from nf-core/nanoseq to nf-core/modules. I have opened a pull request to add the ucsc_bed12tobigbed process from nf-core/nanoseq to nf-core/modules.
Here are local process modules from nf-core/nanoseq that are not in nf-core/modules :
