vanheeringen-lab / seq2science

Automated and customizable preprocessing of Next-Generation Sequencing data, including full (sc)ATAC-seq, ChIP-seq, and (sc)RNA-seq workflows. Works equally well with public and local data.

Home Page: https://vanheeringen-lab.github.io/seq2science

License: MIT License

Python 77.03% AngelScript 0.57% Shell 11.01% R 10.88% Perl 0.51%
snakemake bioinformatics bioinformatics-pipeline atac-seq rna-seq chip-seq reproducible-research ngs pipeline sra

seq2science's People

Contributors

bioqxu, jgasmits, jroubroeks, maarten-vd-sande, mkolmus, rebecza, siebrenf, simonvh, srinzema, tdewijs, tilschaef

seq2science's Issues

Dynamic ascp

An elegant way of incorporating the installation of ascp:
https://github.com/vanheeringen-lab/snakemake-workflows/blob/54f7e5aaca2f410d120a8a1fcace7a4d18797102/rules/get_fastq.smk#L1-L16

Currently the ascp path and the key are hardcoded.

options:

  • install ascp manually, and make sure the hard-coded paths are correct (current situation)
  • install ascp manually, do a lookup
    • lookup once per rule
    • lookup once per workflow (e.g. in onstart)
  • install ascp through a rule, so the hardcoded paths are enforced
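
The "install manually, do a lookup" option could be sketched roughly as below. This is a minimal sketch, not the workflow's actual code: the function name `find_ascp`, its `search_path` parameter, and the assumption that the key sits in an `etc` directory next to the `bin` directory holding ascp are all illustrative and would need checking against a real installation.

```python
import shutil
from pathlib import Path
from typing import Optional, Tuple

# Assumed key location relative to the install root (hypothetical, verify locally).
KEY_RELATIVE = Path("etc") / "asperaweb_id_dsa.openssh"


def find_ascp(search_path: Optional[str] = None) -> Optional[Tuple[str, Optional[str]]]:
    """Look up ascp and its key once, instead of hardcoding both paths.

    Returns (ascp_path, key_path), where key_path is None if no key was found
    at the assumed location. Returns None if ascp itself is not found.
    search_path defaults to the PATH environment variable.
    """
    ascp = shutil.which("ascp", path=search_path)
    if ascp is None:
        return None
    # Assumption: <root>/bin/ascp and <root>/etc/<key file>.
    key = Path(ascp).resolve().parent.parent / KEY_RELATIVE
    return ascp, (str(key) if key.is_file() else None)
```

Calling this once per workflow (e.g. at the top of the Snakefile or in onstart) and storing the result in the config would avoid repeating the lookup in every rule.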

MACS2: remove RNEXT flag for peak calling

In BAM mode, MACS2 throws away half of the reads of paired-end data. In BAMPE mode, each read pair is treated as one fragment, so the region in between the mates gets 'interpolated', which we do not want for ATAC-seq.

Ideally we align as paired-end data, but remove the information that the reads are paired-end before peak calling. Perhaps by clearing the RNEXT field (and the pairing bits in FLAG) in the BAM?
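
One possible reading of "removing the pairing information" is sketched below on plain SAM text lines. It assumes that clearing the pair-related FLAG bits and resetting RNEXT, PNEXT and TLEN is enough for a peak caller to treat each mate as a single-end read; that has not been verified against MACS2 here. The function name `strip_pairing` is made up for illustration.

```python
# Pair-related FLAG bits from the SAM specification:
# 0x1 paired, 0x2 proper pair, 0x8 mate unmapped,
# 0x20 mate reverse strand, 0x40 first in pair, 0x80 second in pair.
PAIR_FLAGS = 0x1 | 0x2 | 0x8 | 0x20 | 0x40 | 0x80


def strip_pairing(sam_line: str) -> str:
    """Return the SAM record with all mate information removed.

    Header lines (starting with '@') are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line
    newline = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")
    fields[1] = str(int(fields[1]) & ~PAIR_FLAGS)  # FLAG: keep only non-pair bits
    fields[6] = "*"  # RNEXT: reference name of the mate
    fields[7] = "0"  # PNEXT: position of the mate
    fields[8] = "0"  # TLEN: observed template length
    return "\t".join(fields) + newline
```

Note that both mates keep the same read name after this step; whether downstream tools mind that is an open question.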

Incremental configurations

It might be confusing to see all the parameters that do not apply to the workflow being run. Maybe use a config.schema.yaml per workflow.
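
A rough sketch of how a shared base schema could be combined with a per-workflow schema, so that each workflow only exposes the parameters that apply to it. The schemas are shown as plain dicts (as they would look after loading the YAML), and the parameter names in the usage example are hypothetical.

```python
def merge_schemas(base: dict, specific: dict) -> dict:
    """Combine a base schema with a workflow-specific one.

    Properties of the workflow-specific schema override those of the base
    schema with the same name; required keys are the union of both.
    """
    merged = dict(base)
    merged["properties"] = {
        **base.get("properties", {}),
        **specific.get("properties", {}),
    }
    merged["required"] = sorted(
        set(base.get("required", [])) | set(specific.get("required", []))
    )
    return merged


# Hypothetical example: parameters shared by all workflows ...
base_schema = {
    "properties": {"samples": {"type": "string"}, "result_dir": {"type": "string"}},
    "required": ["samples"],
}
# ... and parameters that only make sense for one workflow.
atac_schema = {
    "properties": {"peak_caller": {"type": "string"}},
    "required": ["peak_caller"],
}
atac_full = merge_schemas(base_schema, atac_schema)
```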

When working with local files, print better error messages when these can not be found

Local files may not be found because they are not in result_dir or fastq_dir, or because their fqsuffix and/or fqext is wrong. In that case the sample is treated as a public accession, and the error

Checking if samples are single-end or paired-end...
CalledProcessError in line 59 of /home/sande/Dropbox/Studie/PhD/snakemake-workflows/rules/configuration.smk:
Command 'esearch -api_key ba36a74749126e0d9558b7e19967417c3407 -db sra -query GSM12345555555 | efetch -api_key ba36a74749126e0d9558b7e19967417c3407 | grep -Po "(?<=<LIBRARY_LAYOUT><)[^/><]*"' returned non-zero exit status 1.
  File "/home/sande/Dropbox/Studie/PhD/snakemake-workflows/workflows/atac_seq/Snakefile", line 11, in <module>
  File "/home/sande/Dropbox/Studie/PhD/snakemake-workflows/rules/configuration.smk", line 78, in <module>
  File "/home/sande/Dropbox/Studie/PhD/snakemake-workflows/rules/configuration.smk", line 78, in <dictcomp>
  File "/home/sande/anaconda3/envs/snakemake-workflows/lib/python3.7/multiprocessing/pool.py", line 657, in get
  File "/home/sande/anaconda3/envs/snakemake-workflows/lib/python3.7/multiprocessing/pool.py", line 121, in worker
  File "/home/sande/Dropbox/Studie/PhD/snakemake-workflows/rules/configuration.smk", line 59, in get_layout
  File "/home/sande/anaconda3/envs/snakemake-workflows/lib/python3.7/subprocess.py", line 395, in check_output
  File "/home/sande/anaconda3/envs/snakemake-workflows/lib/python3.7/subprocess.py", line 487, in run

is completely uninformative.
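
A sketch of what a more helpful check could look like. The file naming pattern used here (`{sample}.{fqsuffix}.gz` for single-end, `{sample}_{fqext}.{fqsuffix}.gz` for paired-end) and the default values are assumptions for illustration, as is the function name `detect_local_layout`; the real workflow may build its paths differently.

```python
from pathlib import Path


def detect_local_layout(sample, fastq_dir, fqsuffix="fastq", fqext1="R1", fqext2="R2"):
    """Return 'paired' or 'single' for a local sample.

    Raises FileNotFoundError with a message listing every path that was
    checked and the config values that determined those paths.
    """
    fastq_dir = Path(fastq_dir)
    single = fastq_dir / f"{sample}.{fqsuffix}.gz"
    pair = [fastq_dir / f"{sample}_{ext}.{fqsuffix}.gz" for ext in (fqext1, fqext2)]
    if all(p.is_file() for p in pair):
        return "paired"
    if single.is_file():
        return "single"
    checked = "\n".join(f"  {p}" for p in [single, *pair])
    raise FileNotFoundError(
        f"Sample '{sample}' was not found locally. Checked:\n{checked}\n"
        f"Check fastq_dir ('{fastq_dir}'), fqsuffix ('{fqsuffix}') "
        f"and fqext ('{fqext1}', '{fqext2}') in the config, "
        "or use a valid public accession if the sample should be downloaded."
    )
```

Running such a check before the online layout lookup, and only querying the database for samples that look like accessions, would keep the esearch traceback from being the first thing a user sees.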

multiqc bug

If the multiqc output files already exist (for instance, from a previous test run), multiqc automatically creates new files with a _1 suffix instead of overwriting them. This means the workflow fails, as the "correct" files according to snakemake are not generated.
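
One way around this would be to remove any stale outputs right before multiqc runs, so it never has to fall back to the _1 suffix. A minimal sketch, where the helper name `clear_stale_outputs` is made up and the caller passes in whatever paths the rule declares as output:

```python
import shutil
from pathlib import Path


def clear_stale_outputs(paths):
    """Delete existing files or directories and return what was removed."""
    removed = []
    for p in map(Path, paths):
        if p.is_symlink() or p.is_file():
            p.unlink()  # regular file or symlink
            removed.append(str(p))
        elif p.is_dir():
            shutil.rmtree(p)  # e.g. the report's data directory
            removed.append(str(p))
    return removed
```

Alternatively, multiqc has a `--force` option to overwrite existing reports, which may be the simpler fix if it covers all the files snakemake expects.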
