nf-core / tools

Python package with helper tools for the nf-core community.
Home Page: https://nf-co.re
License: MIT License
The linting tests give lots of information on pull requests, but you have to dig into the Travis results to find them. It would be super cool if we could set up a GitHub app or something which the nf-core lint service could use to post a comment with the linting results directly to the PR.
Low priority, but it would be fun.
We should really do a release on at least PyPI soon.
xref: nf-core/rnaseq#32
It could be nice to have a subcommand to help with the installation of required software. Something like:
Each should be optional and preferably work on linux and mac. These commands can also then be used in the CI tests (eg. to install nextflow and singularity).
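As a sketch, the subcommand's interface might look something like this (the subcommand name, arguments and flags are purely illustrative, not an existing CLI):

```python
import argparse

def build_parser():
    """Sketch of a hypothetical `nf-core install` subcommand
    (names and flags are illustrative, not the real CLI)."""
    parser = argparse.ArgumentParser(prog="nf-core install")
    # Each piece of software should be optional:
    parser.add_argument("tools", nargs="*",
                        help="Software to install, e.g. nextflow singularity")
    parser.add_argument("--prefix", default="~/.local",
                        help="Where to install (should work on Linux and macOS)")
    return parser

args = build_parser().parse_args(["nextflow", "singularity"])
print(args.tools)  # ['nextflow', 'singularity']
```

In CI, the same entry point could then be invoked to set up nextflow and singularity before the pipeline tests run.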
One of the most common problems for users with singularity is that they don't have overlayFS configured, so the pipeline fails with directory mounting issues.
This subcommand would generate a Singularity image script based on the main pipeline, but dynamically adding in all base directories (detected or specified) so that mounting works properly.
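As a rough illustration, the recipe rewriting could be sketched like this (the function, recipe layout and directory names are assumptions, not existing code):

```python
def add_mount_points(recipe_text, mount_dirs):
    """Append `mkdir -p` commands to a Singularity recipe's %post section
    so that bind-mounting works without overlayFS (sketch of the idea)."""
    lines = recipe_text.splitlines()
    mkdirs = ["    mkdir -p {}".format(d) for d in mount_dirs]
    if any(l.strip() == "%post" for l in lines):
        out = []
        for line in lines:
            out.append(line)
            # Insert the mkdir statements right after the %post header
            if line.strip() == "%post":
                out.extend(mkdirs)
        return "\n".join(out)
    # No %post section yet: add one at the end
    return "\n".join(lines + ["%post"] + mkdirs)

recipe = "Bootstrap: docker\nFrom: nfcore/base\n\n%post\n    apt-get update"
print(add_mount_points(recipe, ["/scratch", "/data"]))
```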
Steps:
- nf-core tools subcommand `fix-overlay`, takes name of target pipeline
- Singularity file from the target pipeline
- `From` and `%post` statements in Singularity file
- `/tmp` and run `singularity build` to make a new image with the required mount points

From @ewels on July 12, 2018 7:05
The docs are a little sparse in a few places, and some other pipelines have better, more comprehensive stuff written already (such as the methylseq pipeline). Would be good to go through all pipelines and homogenise the docs across all of them.
Copied from original issue: nf-core/cookiecutter#38
Previously, we created a conda environment with a specific name and then manually added this directory to the `PATH`. This worked fine, but felt a little hacky. After some discussion (@apeltzer @sven1103 @ewels @pditommaso and others), we removed it. Instead, we now install the environment to the base conda environment.
This all seemed fine, until we just came across a strange bug. After some investigation, it seems that the host filesystem conda installation was taking priority over the base conda environment (due to configuration files in the home directory). Combined with conflicting path mounts we ended up with a steaming mess and non-functional software. This issue creates two problems:
Either we need a fix, or we can revert to the previous method. Manually prepending to the `PATH` skips all of these problems, as we're essentially not using conda any more.
The old syntax, e.g. `process$name`, will be replaced by the new `withName:` syntax, which currently leads to an error while linting, see here:
ICGC-featureCounts> nf-core lint . (nf-core-lint)
,--./,-.
___ __ __ __ ___ /,-._.--~\
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
Running pipeline tests [##################------------------] 50% 'check_nextflow_config'
CRITICAL: Critical error: ('`nextflow config` returned non-zero error code: %s,\n %s', 1, b"ERROR ~ Unable to parse config file: '/home/alex/IDEA/nf-core/ICGC-featureCounts/nextflow.config'\n\n Compile failed for sources FixedSetSources[name='/groovy/script/ScriptACEE592A55CA6E05E4ED54DBAB544DAD']. Cause: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:\n /groovy/script/ScriptACEE592A55CA6E05E4ED54DBAB544DAD: 26: expecting '}', found ':' @ line 26, column 12.\n .withName:fetch_encrypted_s3_url {\n ^\n \n 1 error\n\n\n")
INFO: Stopping tests...
INFO: ===========
LINTING RESULTS
=================
16 tests passed 0 tests had warnings 0 tests failed
Should we support both or just the new syntax?
From @sven1103 on July 9, 2018 8:46
Dear all,
just started to port a pipeline into nf-core, and realized that it might be a cool addition to have `//TODO <description>` comment lines in the cookiecutter template, for example for the help message function you need to adapt, and so on.
Then you could easily display them in your favourite IDE and not forget to implement/change something. Moreover, we could also check on a release with nf-core/tools that all `//TODO` tags are removed from the code?
Just brainstorming here, happy for your feedback!
Have a tag `//TODO nf-core:`.
Where?:
- main.nf
- Dockerfile
- Singularity
- nextflow.config
Best, Sven
Copied from original issue: nf-core/cookiecutter#34
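The release-time check suggested above could be sketched along these lines (the tag format is taken from the suggestion; the scanning function itself is hypothetical):

```python
import re

# Hypothetical release-time check: fail the lint run if any nf-core
# TODO tags remain in the template files.
TODO_RE = re.compile(r"//\s*TODO nf-core:\s*(.*)")

def find_todos(file_contents):
    """Return (line_number, description) for every `//TODO nf-core:` tag."""
    todos = []
    for lineno, line in enumerate(file_contents.splitlines(), start=1):
        match = TODO_RE.search(line)
        if match:
            todos.append((lineno, match.group(1).strip()))
    return todos

example = "params.reads = ''\n//TODO nf-core: Update the help message\n"
print(find_todos(example))  # [(2, 'Update the help message')]
```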
Currently the create command makes it easy to create the `TEMPLATE` branch. I think we should take it a step further and at the very least recommend that the user create it and push it to GitHub.
Related to #88.
@pditommaso has added a new core nextflow feature for checking the required version of nextflow, see nextflow-io/nextflow#752
This means that we need to update the current linting test for our nextflow version check, and update all the pipelines.
A common problem that is only going to get worse is that we want to make a change in some code that is shared across all pipelines.
To facilitate this, we need a tool which uses the git history to find changes that can also be applied in other pipelines.
Ideally, this will be used by an automated GitHub robot of some kind which will automatically make the changes and open PRs in branches on all pipelines (as done in conda-forge and several other communities).
From @ewels on August 8, 2018 9:31
Normally, ignored tasks are a bad thing in nf-core pipelines. It would be good to add some template code that flags a big warning when pipelines finish if any tasks errored and were ignored.
Copied from original issue: nf-core/cookiecutter#58
Travis builds on pushes, PRs and also tags (which for us are the same as releases). It would be great if there could be a set of tests which are specific to releases.
Suggestions:
- A `--release` flag for nf-core lint
- Run with the `--release` flag when git activity happens on the `master` branch (check env var `TRAVIS_BRANCH`)
- On `--release`: check that the container tag is not `latest`

Bonus points for challenging ones:
- Need to update version on homebrew-bio
From @apeltzer on July 28, 2018 16:38
The files local.md and adding_your_own.md have similar (if not identical) sections for Docker and Singularity; we should remove the duplication from at least one location :-)
Copied from original issue: nf-core/cookiecutter#46
From @apeltzer on August 2, 2018 13:06
Hi!
I think we could have a generic AWSBatch configuration in cookiecutter.
I did so for ICGC-featureCounts, but we could use a similar approach (open for ideas if that's not optimal in your opinion):
AWSBatch config:
https://github.com/nf-core/ICGC-featureCounts/blob/master/conf/awsbatch.config
Main Nextflow.config / setting some defaults:
https://github.com/nf-core/ICGC-featureCounts/blob/master/nextflow.config
And then allowing users to specify the required params /also using that in the summary if the proper profile is used:
I guess this could easily be extrapolated to other pipelines too, and could therefore be in cookiecutter!
I'd be happy to contribute that but would like to have some more feedback/ideas on this before moving on...
Copied from original issue: nf-core/cookiecutter#48
If pipelines have a bioconda `environment.yml`, we should be able to check:
When running with `--release`:
From @ewels on May 24, 2018 15:19
Will soon be available in release `0.30.x`. See:
https://github.com/nextflow-io/nextflow/blob/master/docs/conda.rst
https://github.com/nextflow-io/nextflow/blob/master/docs/process.rst#conda
https://github.com/nextflow-io/nextflow/blob/master/docs/config.rst#scope-conda
Copied from original issue: nf-core/cookiecutter#27
`conda-forge` should now have the highest priority, see bioconda/bioconda-recipes#10924 (comment)
Since we switched, we should replace docker hub with docker cloud everywhere in the docs etc.
Would be quite a nice test case for the new automation :-)
The new checkIfExists option was added in nextflow-io/nextflow#666 and is now released. Would be great to use it!
See https://gitter.im/nextflow-io/nextflow?at=5ae09da61130fe3d361684a1 for reason:
star_index = Channel
.fromPath(params.star_index)
.ifEmpty { exit 1, "STAR index not found: ${params.star_index}" }
However, as far as I can see, this does not work if a wrong file path is given, since `fromPath()` doesn't check for file existence and therefore the channel is not empty.
It would be great if we could check that all options (`params`) are described in the markdown documentation. We'll need the ability to have a list of exceptions, but it would ensure that the docs are kept up to date.
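A minimal sketch of such a check, assuming the params can be scraped with a regex and that the docs mention them as `--flags` (both of those are assumptions about the file formats):

```python
import re

def undocumented_params(config_text, docs_text, exceptions=()):
    """Sketch: find `params.x` names used in a Nextflow config that are
    never mentioned as `--x` in the markdown docs. The parameter names
    below are illustrative, not from a real pipeline."""
    params = set(re.findall(r"params\.(\w+)", config_text))
    documented = set(re.findall(r"--(\w+)", docs_text))
    return sorted(params - documented - set(exceptions))

config = "params.reads = '*.fastq'\nparams.outdir = './results'\nparams.tracedir = './trace'"
docs = "Use `--reads` to set input files and `--outdir` for results."
print(undocumented_params(config, docs, exceptions=["tracedir"]))  # []
```

The `exceptions` argument covers the list of allowed undocumented options mentioned above.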
Related to #155 - we need to update all occurrences of the https://hub.docker.com URLs from the pipeline documentation and website. These should now point to https://cloud.docker.com
Just a thought:
Maybe we could have the Dockerfiles in a similar fashion to here, to keep things consistent.
I thought about having the following:
- A base image with `conda` (and all required packages)
- A list of `conda` packages to be installed

This way we could checksum the Dockerfile (or even provide it via cookiecutter) and then have people only add what kind of dependencies they want in a specific version.
This also gives us the possibility to create a tag that is identical to the pipeline's tag in Nextflow, keeping things consistently pulled as well!
Thoughts?
I will create another issue in cookiecutter to link to this here...
Here's how to reproduce the bugs referenced in #19
python --version
Python 3.6.3 :: Anaconda, Inc.
git clone https://github.com/SciLifeLab/NGI-NeutronStar.git
cd NGI-NeutronStar
git reset --hard 59bfe4e717419d1e3667422cd486071073b41bcd
nf-core lint .
Traceback (most recent call last):
File "/Users/remi-andreolsen/miniconda3/envs/py3/bin/nf-core", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/Users/remi-andreolsen/code/nf-core-tools/scripts/nf-core", line 10, in <module>
import nf_core.lint
File "/Users/remi-andreolsen/code/nf-core-tools/nf_core/lint.py", line 272
except AssertionError, KeyError:
^
SyntaxError: invalid syntax
nf-core lint .
Running pipeline tests [######------------------------------] 16%
Traceback (most recent call last):
File "/Users/remi-andreolsen/miniconda3/envs/py3/bin/nf-core", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/Users/remi-andreolsen/code/nf-core-tools/scripts/nf-core", line 50, in <module>
nf_core_cli()
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/Users/remi-andreolsen/code/nf-core-tools/scripts/nf-core", line 34, in lint
lint_obj.lint_pipeline()
File "/Users/remi-andreolsen/code/nf-core-tools/nf_core/lint.py", line 64, in lint_pipeline
getattr(self, fname)()
File "/Users/remi-andreolsen/code/nf-core-tools/nf_core/lint.py", line 107, in check_files_exist
self.files.append(f)
NameError: name 'f' is not defined
The fix here: `f` -> `files`
git reset --hard c92ce0d99baff39bfcbb36c64be03160dc0331f8
touch .travis.yml CHANGELOG.MD docs/README.md docs/output.md docs/usage.md
nf-core lint .
Running pipeline tests [########################------------] 66%
Traceback (most recent call last):
File "/Users/remi-andreolsen/miniconda3/envs/py3/bin/nf-core", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/Users/remi-andreolsen/code/nf-core-tools/scripts/nf-core", line 50, in <module>
nf_core_cli()
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/remi-andreolsen/miniconda3/envs/py3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/Users/remi-andreolsen/code/nf-core-tools/scripts/nf-core", line 34, in lint
lint_obj.lint_pipeline()
File "/Users/remi-andreolsen/code/nf-core-tools/nf_core/lint.py", line 64, in lint_pipeline
getattr(self, fname)()
File "/Users/remi-andreolsen/code/nf-core-tools/nf_core/lint.py", line 213, in check_config_vars
k, v = l.split(' = ', 1)
TypeError: a bytes-like object is required, not 'str'
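The TypeError above comes from splitting the bytes output of `nextflow config` with a str separator; decoding the output first avoids it. A sketch of the fix (the helper names are made up; `-flat` is the flag used to get one `key = value` pair per line):

```python
import subprocess

def nextflow_config_lines(pipeline_dir):
    """Run `nextflow config -flat` and return decoded text lines.
    check_output() returns bytes, so decode before any str operations."""
    raw = subprocess.check_output(["nextflow", "config", "-flat", pipeline_dir])
    return raw.decode("utf-8").splitlines()

def parse_line(line):
    """Split a flattened config line into (key, value) - now str, not bytes."""
    k, v = line.split(" = ", 1)
    return k.strip(), v.strip()
```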
It's good to always have a badge on the readme saying what version of Nextflow is required. We should be able to double check this version against what we have in the config file too.
We need to get all our repositories set up at Docker Cloud instead of Docker hub.
Reason is, for repositories that were e.g. renamed (RNAseq -> rnaseq), the automated builds break.
It is impossible to add a new source repository on Docker Hub, but on Docker Cloud you can. Both are interfaces for the same backend, so we don't lose anything but gain something. I only have to set up the tags/branch builds for all repositories once again.
From @apeltzer on February 23, 2018 12:17
After the discussion in the NGI-RNASeq Gitter (credits to @rfenouil ), we might want to set some standards for naming e.g. channels/processes/variables. I opened this ticket to keep track of the ideas and we could maybe integrate these soon to make them mandatory once we agreed on defaults:
Some ideas:
- Channels should have a `ch_` prefix (to avoid confusing them with variables)

Other suggestions/ideas for variable / process / ... naming conventions would be great too!
Copied from original issue: nf-core/nf-core.github.io#9
Even with GitHub protected branches, we make mistakes with pushing to `master` fairly frequently. We should only ever have commits to `master` coming as pull requests from the `dev` branch.
We should be able to test for this quite specifically using Travis environment variables, and fail the test if the commit is coming from anywhere else (eg. a PR from a fork).
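A sketch of what such a check might look like, using the documented `TRAVIS_*` environment variables (the function and the exact policy are illustrative):

```python
import os

def master_push_allowed(env=os.environ):
    """Sketch of a Travis check: only allow master to be updated by
    pull requests coming from the `dev` branch. Policy details here
    are illustrative, not an agreed rule set."""
    if env.get("TRAVIS_BRANCH") != "master":
        return True  # not targeting master: nothing to enforce
    if env.get("TRAVIS_PULL_REQUEST", "false") == "false":
        return False  # direct push to master
    return env.get("TRAVIS_PULL_REQUEST_BRANCH") == "dev"

fake_env = {"TRAVIS_BRANCH": "master", "TRAVIS_PULL_REQUEST": "42",
            "TRAVIS_PULL_REQUEST_BRANCH": "dev"}
print(master_push_allowed(fake_env))  # True
```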
It's common for us to commit to the `master` branch with development code before the first release is made. The tests always fail here, which is kind of annoying. It would be good if, before failing, the linting tool could check whether there are any pipeline releases.
Hi!
just had a look at EAGER2, and the build fails (expectedly...) using the newest nf-core linting tool:
https://travis-ci.org/nf-core/EAGER2/jobs/400150354
What concerns me is this here as a warning:
83 tests passed 2 tests had warnings 1 tests failed
Using --release mode linting tests
WARNING: Test Warnings:
http://nf-co.re/errors#8: Conda package is not latest available: gatk4=4.0.5.1, 4.0.5.2 available
http://nf-co.re/errors#8: Conda package is not latest available: multiqc=1.5, 1.6a0 available
Checking the multiqc page and the bioconda package page, there is no version 1.6a0 for MultiQC!
https://bioconda.github.io/recipes/multiqc/README.html
So I guess, we're parsing something incorrectly :-(
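One hedged explanation: 1.6a0 looks like a PEP 440-style alpha pre-release, which the conda channel listing can include even though the MultiQC README page doesn't show it. A sketch of filtering such versions before comparing against "latest":

```python
import re

def is_prerelease(version):
    """True if a version string looks like a pre-release (alpha/beta/rc/dev),
    following PEP 440-style suffixes such as 1.6a0 or 2.0rc1. A naive
    heuristic sketch - a real check should use a version-parsing library."""
    return bool(re.search(r"(a|b|rc|alpha|beta|dev)\d*$", version))

available = ["1.4", "1.5", "1.6a0"]
stable = [v for v in available if not is_prerelease(v)]
print(max(stable))  # 1.5 (naive string max; fine for this example)
```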
From @ewels on May 24, 2018 15:25
In other pipelines we are adding the `summary` variables to MultiQC as a config block. See https://github.com/wikiselev/rnaseq/blob/eaebf588e83e2f78cea0a4451db2d4eea5789493/main.nf#L1036-L1054
Add this into the cookiecutter recipe. @pditommaso thinks it "could be replaced by a few lines of groovy" so probably room for some refactoring too.
Copied from original issue: nf-core/cookiecutter#28
The `onComplete` email code was written before nextflow had native support. It adds loads of boilerplate code to every pipeline, which can now be almost entirely removed.
Needs refactoring, and quite a bit of testing. We want to try to keep as much of the email contents the same as possible, which may require some PRs to core nextflow.
From @ewels on August 2, 2018 14:19
See example of where I started doing this on the rnaseq pipeline here: https://github.com/ewels/nf-core-rnaseq/compare/config_refactor#diff-c79fe4336e72c04860afccd21f4ae1c5R17
Copied from original issue: nf-core/cookiecutter#50
We can't (as of now) store trace files on an S3 bucket, so specifying one to store data on when running pipelines will crash in many cases where we store the trace file in the `--outdir` on S3.
From @ewels on August 2, 2018 14:08
We should be using the new `withName` syntax in the configuration files, instead of the older `process$name` syntax.
Copied from original issue: nf-core/cookiecutter#49
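A rough sketch of how a migration helper could rewrite the old selectors (the regex and the example config are illustrative; a real config would need proper parsing, not a one-line substitution):

```python
import re

def modernise_process_selectors(config_text):
    """Rewrite old `$name {` process selectors to the new `withName: name {`
    syntax. Rough sketch only - does not handle every config layout."""
    return re.sub(r"\$(\w+)\s*\{", r"withName: \1 {", config_text)

old = "process {\n    $fastqc {\n        cpus = 2\n    }\n}"
print(modernise_process_selectors(old))
```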
Expected behaviour:
The command `bump-version` (earlier: `release`) adjusts the conda environment `PATH` with the correct bumped version number.
Actual behaviour:
Both `PATH` statements, in the Dockerfile and the Singularity file, remain unchanged.
It'd be cool if the Travis config could detect whether `environment.yml` or the `Dockerfile` had changed for the pipeline test, and build locally if so. Then tests would properly use the correct software.
It may still time out, of course.
Hi everyone,
an idea which was already briefly discussed here:
We could produce a generic tool in `tools` that can produce a graphical user interface for end users to generate an appropriate configuration for the pipeline:
Definition could be like Phil proposed it here:
Happy for any feedback - I thought about this for EAGER2.0, but would be happy to make it generic for all kinds of nf-core pipelines!
@ewels @andreas-wilm @sven1103 @maxulysse (and others who are interested?)
Let me know what you think!
From @ewels on July 30, 2018 6:08
Singularity and docker container addresses are currently not handled in the best way. This is made difficult by the fact that we have `nfcore` on Docker Hub and `nf-core` on Singularity Hub.
This issue continues discussion started on PR nf-core/cookiecutter#42
Copied from original issue: nf-core/cookiecutter#47
From @ewels on August 3, 2018 16:06
Docker Hub works nicely with automated builds based on GitHub releases which are tagged with the same name.
Singularity Hub doesn't support the same method and instead has nasty filename-based tagging support. However, there is a very nice CLI that interacts with the API.
We should be able to use this CLI to automatically build and tag singularity hub containers when GitHub releases are prepared, using Travis CI.
Copied from original issue: nf-core/cookiecutter#53
It's currently kind of annoying and easy to forget to update the website docs every time a new lint test is added. It should be possible to automatically generate this somehow.
Description: Although the Python environment is loaded correctly during the CI build, it seems that `nose2` is running the tests not under the loaded Python environment, but under Python 2.7.
So we are missing syntax errors during CI testing which only occur in specific Python versions.
Suggestion: Call `nose2` as a module in the loaded Python environment with `python -m nose2 ...`
Status: Only conda package dependencies are linted at the moment; sub-lists, such as those derived from additional `pip` installations, are not supported.
Suggestion: Allow a `pip` sub-list for dependencies and use the PyPI API:
http://pypi.python.org/pypi/<package_name>/<version>/json
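A sketch of using that endpoint's response, taking already parsed JSON so the HTTP fetch stays out of the example (the `info.version` field is part of the PyPI JSON API):

```python
import json

def latest_pypi_version(pypi_json):
    """Extract the latest release version from a PyPI JSON API response
    (e.g. http://pypi.python.org/pypi/<package_name>/json). Takes the
    already parsed JSON so the network fetch stays out of this sketch."""
    return pypi_json["info"]["version"]

sample = json.loads('{"info": {"name": "multiqc", "version": "1.6"}}')
print(latest_pypi_version(sample))  # 1.6
```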
From @ewels on July 11, 2018 11:12
It would be nice to add in some checks for pipeline output files when `-profile test` is used. This will need to be specific for each pipeline, but we could add an example here.
I think the simplest would be to have a bunch of processes which are downstream of the normal ones and have `when: workflow.process.contains('test')` set so that they don't run normally.
See extensive discussion on nextflow gitter from here.
Copied from original issue: nf-core/cookiecutter#36
We currently have our Dockerfile for the nf-core/base image (that we use for all pipelines) in our central nf-core/tools repository. However, that is kind of a bad practice, since we can't tag this properly without conflicting with releases of nf-core/tools.
I therefore suggest that we generate a new repository (e.g. "nest", since this is where all of our pipelines nest), to hold the Dockerfile. That would make tagging, releasing and building with Docker Cloud quite easy, avoid these conflicts, and let us rely on properly tagged base images for our project here.
Any ideas would be appreciated - also comments on this one.
We can even fix certain versions of conda in these base images and have a changelog over here.
The code in this tool to fetch the pipeline names and releases uses the GitHub APIs. It works fine, but it was from before we had the new website. Now the website does the same calls and provides everything we need in a single JSON file which we can fetch with a single API call.
URL: http://nf-co.re/pipelines.json
Pros:
Cons:
We can keep the existing code as a backup if we want? Though to be honest, it's probably easiest to strip out the GitHub stuff and keep it simple.
When `nextflow.config` contains this:
manifest {
homePage = '{{ cookiecutter.pipeline_url }}'
description = '{{ cookiecutter.pipeline_short_description }}'
mainScript = 'main.nf'
}
Unfortunately, the linting fails. This is a problem in cookiecutter; see ticket nf-core/cookiecutter#19 for a start.