openomics / genome-seek
Clinical Whole Genome and Exome Sequencing Pipeline
Home Page: https://openomics.github.io/genome-seek/
License: MIT License
This will prevent any unnecessary compute and will reduce overall runtime.
GATK filtering does not work on the strelka output. This was resolved by adding the following two commands before the normalization and splitting steps:
bcftools concat -Ov -a \
-D somatic.snvs.vcf.gz somatic.indels.vcf.gz \
-o strelka.merge.indel.snps.vcf
java -Xmx16g \
-cp /data/OpenOmics/references/genome-seek/hmftools/purple_v3.2.jar \
com.hartwig.hmftools.purple.tools.AnnotateStrelkaWithAllelicDepth \
-in strelka.merge.indel.snps.vcf \
-out strelka.merge.indel.snps.annotated.vcf.gz
The next fix is for splitting TUMOR from the strelka output. When splitting, we cannot use "-c1" for strelka because the VCF file does not have the tag needed to check --min-ac/--max-ac. We need to update the tumor splitting command for strelka to remove -c1:
bcftools view -s TUMOR -Oz -o strelka.tumor.vcf.gz \
strelka.merge.indel.snps.annotated.filtered.vcf.gz
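Before deciding whether -c1 can be passed for a given caller, it may help to check whether the VCF header declares the tag in question. The post above does not say which tag is missing, so the sketch below takes the tag as a parameter. header_has_tag is a hypothetical helper, not part of genome-seek, and INFO/AC in the usage comment is only a guess at the relevant tag.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed if the VCF header read from stdin declares
# the given tag, e.g. `header_has_tag FORMAT GT` or `header_has_tag INFO AC`.
header_has_tag() {
    local kind="$1" tag="$2"
    # Plain grep (not grep -q) reads all of stdin, so the upstream command
    # is not killed by SIGPIPE while `pipefail` is active.
    grep "^##${kind}=<ID=${tag}," > /dev/null
}

# Example use (file name taken from the command above; the tag is an assumption):
#   if bcftools view -h strelka.merge.indel.snps.annotated.filtered.vcf.gz \
#        | header_has_tag INFO AC; then
#       echo "tag present, -c1 could be used"
#   fi
```

If the check fails, the split would be run without -c1, as in the command above.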
The rule all targets need to be updated so that the strelka and muse output files are not created for tumor-only samples.
Hi,
I wanted to use your package, but I do not want to realign my data. Would it be possible to add a shortcut that skips alignment?
Or is there something special done during alignment that the rest of the pipeline builds on?
If so, I had the impression that my multi-lane FASTQ files were not accepted properly.
Could you add support for file names matching *L{X}*R{1,2}.fastq.gz?
Cheers!
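Until multi-lane input is handled by the pipeline itself, one possible workaround is to merge the lanes of each sample into a single file per read before running it. The sketch below is only an illustration: merge_lanes is a hypothetical helper, it assumes underscore-delimited names such as SAMPLE_L001_R1.fastq.gz, and the output names SAMPLE.R1.fastq.gz / SAMPLE.R2.fastq.gz are an assumption about what the pipeline accepts.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: merge all lanes of one sample into one file per read.
# Usage: merge_lanes SAMPLE INPUT_DIR OUTPUT_DIR
merge_lanes() {
    local sample="$1" indir="$2" outdir="$3" mate
    local -a files
    mkdir -p "$outdir"
    for mate in R1 R2; do
        # Glob results are sorted, so lanes are concatenated in the same
        # order for R1 and R2 and read pairs stay in sync.
        files=( "$indir/$sample"_*L[0-9]*"$mate"*.fastq.gz )
        # An unmatched glob stays literal; treat that as "no input found".
        if [ ! -e "${files[0]}" ]; then
            echo "no $mate files found for sample $sample" >&2
            return 1
        fi
        # Concatenated gzip members form a valid gzip stream.
        cat "${files[@]}" > "$outdir/$sample.$mate.fastq.gz"
    done
}
```

For example, merge_lanes sampleA raw/ merged/ would write merged/sampleA.R1.fastq.gz and merged/sampleA.R2.fastq.gz.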
Add two new CLI options to run the pipeline with rules/parameters optimized for WES datasets.
--wes-bed WES_BED
Path to exome targets BED file. This file can be obtained from the manufacturer of the target capture kit that was used. By default, a set of BED files generated from GENCODE's exon annotation for protein-coding genes is used.
--wes-mode
Run the whole exome pipeline. By default, the whole genome sequencing pipeline is run. This option allows a user to process and analyze whole exome sequencing data. Please note that when this mode is enabled, a subset of the WGS rules will run. See the --wes-bed option for more information about providing a custom exome targets BED file.
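Since --wes-bed takes a user-supplied file, a quick sanity check of a custom targets BED before launching a run may save a failed job. This is a minimal sketch, not something genome-seek is known to do: check_bed is a hypothetical helper, and it assumes a tab-separated BED with chrom, start, and end in the first three columns.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: verify that each data line of a BED file has at least
# three tab-separated fields with integer coordinates and start < end.
check_bed() {
    awk -F'\t' '
        /^(#|track|browser)/ { next }   # skip comment and header lines
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
            printf "bad BED line %d: %s\n", NR, $0 > "/dev/stderr"
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

check_bed targets.bed returns a non-zero status and reports the offending lines on stderr if any line fails the check.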
cnvkit
sequenza
--wes-mode switch/flag is provided
Hi,
I installed genome-seek through conda:
mamba create -c conda-forge -c bioconda -p /mycondaEnv/snakemake_singularity snakemake singularity
git clone https://github.com/OpenOmics/genome-seek.git
cd genome-seek
mamba activate /mycondaEnv/snakemake_singularity
./genome-seek --version
genome-seek 0.3.3-alpha
snakemake --version
7.25.0
singularity --version
singularity version 3.8.6
But when I tried to run: /genome-seek/genome-seek cache --sif-cache /sif-cache, I got the following error:
genome-seek (0.3.3-alpha)
Image will be pulled from "/data/OpenOmics/SIFs/ccbr_wes_base_v0.1.0.sif".
Image will be pulled from "/data/OpenOmics/SIFs/deepvariant_1.3.0-gpu.sif".
Image will be pulled from "/data/OpenOmics/SIFs/glnexus_v1.4.1.sif".
Image will be pulled from "/data/OpenOmics/SIFs/ncbr_opencravat_latest.sif".
Image will be pulled from "/data/OpenOmics/SIFs/ncbr_octopus_v0.1.0.sif".
Image will be pulled from "/data/OpenOmics/SIFs/ncbr_sigprofiler_v0.1.0.sif".
Image will be pulled from "/data/OpenOmics/SIFs/ncbr_vcf2maf_v0.1.0.sif".
/projectsp/foran/yc790/apps/genome-seek/src/cache.sh: line 210: SLURM_JOB_ID: unbound variable
/projectsp/foran/yc790/apps/genome-seek/src/cache.sh: line 210: SLURM_JOB_ID: unbound variable
WARNING: Failed to run 'set -euo pipefail; /genome-seek/src/cache.sh local -s '/sif-cache' -i '/data/OpenOmics/SIFs/ccbr_wes_base_v0.1.0.sif,/data/OpenOmics/SIFs/deepvariant_1.3.0-gpu.sif,/data/OpenOmics/SIFs/glnexus_v1.4.1.sif,/data/OpenOmics/SIFs/ncbr_opencravat_latest.sif,/data/OpenOmics/SIFs/ncbr_octopus_v0.1.0.sif,/data/OpenOmics/SIFs/ncbr_sigprofiler_v0.1.0.sif,/data/OpenOmics/SIFs/ncbr_vcf2maf_v0.1.0.sif' -t '/sif-cache/yc790/.singularity/' ' command!
└── Command returned a non-zero exitcode of '1'.
Fatal: Failed to pull all containers. Please try again!
It seems that the SIF image files are missing and that SLURM_JOB_ID is not defined when cache.sh runs. Is there a way to work around this?
Thanks a lot!
Ying
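The "unbound variable" message is what bash prints when "set -u" is active (the log shows "set -euo pipefail") and a script expands a variable that was never set, which is the case for SLURM_JOB_ID outside a SLURM job. A common guard is a default-value expansion. The sketch below shows the pattern only; job_tag and the "local" fallback are hypothetical, and it makes no claim about what line 210 of cache.sh actually does. It also does not address the images that are expected under /data/OpenOmics/SIFs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the SLURM job ID when running inside a job,
# otherwise print a fallback tag so the script keeps working off-cluster.
job_tag() {
    # ${VAR:-default} expands safely even when VAR is unset under `set -u`.
    echo "${SLURM_JOB_ID:-local}"
}
```

Alternatively, exporting a placeholder value for SLURM_JOB_ID before calling the cache sub-command may get past this particular error, though that has not been verified here.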