ngi-methylseq's Introduction

Scilifelab modules

Installation

Installation is as simple as

python setup.py install

If you are running several virtual environments, where one (e.g. devel) is used for development, you can install a development version by running

workon devel
python setup.py develop

Documentation

Docs are located in the doc directory. To build them, cd to doc and run

make html

Documentation output is found in build.

Running the tests

The modules are shipped with a number of unit tests, located in the tests directory. To run the full test suite, issue the command

python setup.py nosetests

or if you want to run individual tests, cd to tests and run (for example)

nosetests -v -s test_db.py

ngi-methylseq's People

Contributors

ewels

ngi-methylseq's Issues

Bismark not being run in multicore mode

Hi,

I am trying to get the docker pipeline optimised and running quickly.

I assume the docker.config file is causing my CPU settings to be reset to 1. Where can I override this?
Do I need to just avoid importing docker.config? I already tried setting cpus=48 in docker.config and increasing the limits in base.config, without success.

conf/docker.config

/*
 * NOTE: Not suitable for production use, assumes
 * compute limits of only 16GB memory and 1 CPU core.
 */


params {
  igenomes_base = 's3://ngi-igenomes/igenomes/'
  cpus = 48
}

Thanks, otherwise the pipeline is looking excellent.
Colin

Create SeqMonk Projects

It would be great if we could add a step at the end of the pipeline to automatically create a SeqMonk project.

Not sure if this will be possible due to requiring a SeqMonk installation with reference genomes. Could make it optional? Or just ignore a failure?

SeqMonk importer docs:

      SeqMonk Importer - Creating SeqMonk Projects from the command line
SYNOPSIS
	seqmonk [--(un)spliced] [--mapq=20] --genome "Mus musculus/GRCm38" --outfile out.smk *.bam
DESCRIPTION
	This script allows you to run seqmonk in a non-interactive mode to read in
	a number of BAM or Bismark coverage files and save these into a single project 
	file which you can then transfer to an interactive server to do further 
	downstream analysis.
    
    The options for the program are as follows:
    --genome        The genome to use for the import.  This is specified as
                    species/assembly and must match an existing genome in your
                    seqmonk genomes folder.
	               
    --outfile       The name of the file you want to write the project to
    --spliced       Split spliced reads so you only see the exonic parts. Will
                    be added by default if any of the first 100,000 reads in
                    the first imported BAM file have a splice site in them. 
                    Adding this flag overrides the auto-detection.
	                
    --unspliced     Do not split spliced reads even if they are present.  This
                    flag overrides the default auto-detection.
	                
    --mapq          Value to use as a MAPQ cutoff for imported reads.  Defaults
                    to 20 if any of the first 100,000 reads has a value above 20.
    -h --help       Print this help file and exit
    
    -m --memory     Set the starting memory allocation in megabytes. Defaults
                    to 1300. Minimum allowed value is 500 and values above
                    1300 should only be set on systems running a 64-bit JRE
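
The importer could be wrapped as an optional final step along these lines. This is only a sketch in Python, not part of the pipeline: the function names `build_seqmonk_command` and `create_seqmonk_project` are hypothetical, the command-line options are taken from the synopsis above, and it assumes that `seqmonk` is on the PATH and exits with a non-zero status on failure.

```python
import shutil
import subprocess


def build_seqmonk_command(bam_files, genome, outfile, mapq=None):
    """Build the argument list for the SeqMonk importer (hypothetical helper).

    Only options listed in the importer synopsis are used.
    """
    cmd = ["seqmonk"]
    if mapq is not None:
        cmd.append(f"--mapq={mapq}")
    cmd += ["--genome", genome, "--outfile", outfile]
    cmd += list(bam_files)
    return cmd


def create_seqmonk_project(bam_files, genome, outfile, mapq=None):
    """Try to create a SeqMonk project; return True on success, False otherwise.

    A missing SeqMonk installation or a failed import is reported as False
    rather than raised, so the step can be treated as optional.
    """
    if shutil.which("seqmonk") is None:
        # Assumption: no SeqMonk installation means the step is skipped.
        return False
    cmd = build_seqmonk_command(bam_files, genome, outfile, mapq)
    result = subprocess.run(cmd, check=False)
    # Assumption: seqmonk signals failure through a non-zero exit status.
    return result.returncode == 0
```

Returning False instead of raising covers both suggestions in the issue: the step is optional when SeqMonk is absent, and a failed import (for example a genome missing from the SeqMonk genomes folder) does not stop the run.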
                        

No such variable: bismark_dedup_log_1 error when using --rrbs

I'm getting an error when running the pipeline with the --rrbs switch. Without --rrbs, the run starts normally (trimming etc.).

$ nextflow run $NF/bismark.nf -profile base --reads '*_R{1,2}_001.fastq' --genome mm10 --rrbs

N E X T F L O W  ~  version 0.25.4
Launching `NGI-MethylSeq/bismark.nf` [crazy_monod] - revision: 566634b95a
==================================================
 NGI-MethylSeq : Bisulfite-Seq Best Practice v0.2
==================================================
...
[warm up] executor > local
ERROR ~ No such variable: bismark_dedup_log_1

 -- Check script 'bismark.nf' at line: 413 or see '.nextflow.log' file for more details

igenomes.config

Hi @ewels,
First off, thanks once again for making these pipelines available. I'm trying to get our sequencing core to adopt these or something similar. I'm happy to contribute where I can - is it better to create an issue like this, or are you happy for others to contribute pull requests? I'm new to git, but getting the hang of it.

I'm following the RNAseq pipeline and got a bit lost in the discussion about curly brackets for igenomes.config: SciLifeLab/NGI-RNAseq#152. I found that this pipeline didn't work (java/groovy errors) with curly brackets around paths, but it does without them - see attached igenomes.config

compareTo logic fix

Similar to SciLifeLab/NGI-RNAseq#152

nextflow.config needs updating

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
  if(type == 'memory'){
    if(obj.compareTo(params.max_memory) == 1)
      return params.max_memory
    else
      return obj
  } else if(type == 'time'){
    if(obj.compareTo(params.max_time) == 1)
      return params.max_time
    else
      return obj
  } else if(type == 'cpus'){
    return Math.min( obj, params.max_cpus )
  }
}

Another fix is to invert the order of return statements, but I guess testing ==1 is more explicit?
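
To illustrate why the explicit comparison matters, here is a small Python sketch (not the pipeline's Groovy code). It assumes that `compareTo` returns -1, 0, or 1, and the helper names `compare_to`, `check_max`, and `check_max_truthy` are hypothetical. In Groovy, as in Python, any non-zero integer is truthy, so a bare `if (obj.compareTo(limit))` would also fire when the value is below the limit.

```python
def compare_to(a, b):
    """Three-way comparison mimicking compareTo: returns -1, 0, or 1."""
    return (a > b) - (a < b)


def check_max(value, limit):
    """Clamp value to limit, testing explicitly for 'greater than'."""
    if compare_to(value, limit) == 1:
        return limit
    return value


def check_max_truthy(value, limit):
    """Hypothetical variant that relies on truthiness of the result.

    Both -1 and 1 are truthy, so a value below the limit is
    replaced by the limit as well.
    """
    if compare_to(value, limit):
        return limit
    return value


print(check_max(8, 16))         # 8: below the limit, left unchanged
print(check_max(32, 16))        # 16: above the limit, clamped
print(check_max_truthy(8, 16))  # 16: below the limit, yet still replaced
```

Since the Java `Comparable` contract only guarantees the sign of the result, testing `> 0` would be a slightly more defensive alternative to `== 1`.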
