duartemolha / convading_reload

This project forked from molgenis/convading

Fork of the CoNVaDING software. CoNVaDING (Copy Number Variation Detection In NGS Gene panels) was designed for small (single-exon) copy number variation (CNV) detection in high-coverage NGS data.

License: GNU Lesser General Public License v3.0

Perl 30.24% Shell 0.25% HTML 69.51%

convading_reload's People

Forkers

gitter-badger mmterpstra emedgene

convading_reload's Issues

Get off_target complement of input bed file

As a first step in detecting copy number on off-target regions, we need to create a complement bed file of the input data.

This requires the user to supply the genome assembly; alternatively, we can try to autodetect it from the BAM file inputs.

As a possible improvement to this method, I will investigate removing from the complement any regions of homology to the targeted regions, since these will, by definition, suffer some amplification bias (in non-amplicon-based methods).
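The complement step can be sketched as follows. This is only an illustration in Python, not code from the project: the function name `complement_intervals` and the `chrom_sizes` dictionary (contig name to length, e.g. taken from the BAM header) are assumptions, and the homology filtering mentioned above is not attempted.

```python
from collections import defaultdict


def complement_intervals(targets, chrom_sizes):
    """Return the off-target complement of `targets`.

    targets: iterable of (chrom, start, end), 0-based half-open as in BED.
    chrom_sizes: dict of contig name -> length (hypothetical input,
    e.g. built from the @SQ lines of a BAM header).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in targets:
        by_chrom[chrom].append((start, end))

    complement = []
    for chrom, size in chrom_sizes.items():
        cursor = 0  # first base not yet covered by a target
        for start, end in sorted(by_chrom.get(chrom, [])):
            if start > cursor:
                complement.append((chrom, cursor, start))
            cursor = max(cursor, end)  # handles overlapping targets
        if cursor < size:
            complement.append((chrom, cursor, size))
    return complement


if __name__ == "__main__":
    targets = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 50, 60)]
    for region in complement_intervals(targets, {"chr1": 100, "chr2": 40}):
        print(*region, sep="\t")
```

Contigs with no targets at all come back as a single interval spanning the whole contig, so the caller may want to drop decoy or alt contigs first.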

Code crashes on GRCh38 reference

The code crashes on the GRCh38 reference available in the GATK bundle. This is because the reference contains 'chr'-style chromosome names and 'HLA-'-type contigs.

Ideally, the code linked below should just read the header into a hash and validate each line of input against that hash. If a line does not validate, it should report the line, its line number, and maybe the chromosomes it was matched against.

https://github.com/duartemolha/CoNVaDING_reload/blob/master/CoNVaDING.pl#L2845-L2914

Lazy fix: change
https://github.com/duartemolha/CoNVaDING_reload/blob/master/CoNVaDING.pl#L2874
to
if ($chr =~ m/^chr.+|^HLA-.+/gs) {
and test.
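The header-as-hash idea could look roughly like this. It is a Python sketch rather than a patch to CoNVaDING.pl; `read_contigs` and `validate_bed_lines` are hypothetical names, and the header text is assumed to come from something like `samtools view -H`.

```python
def read_contigs(header_text):
    """Collect contig names (and lengths) from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Split each TAG:VALUE field on the first colon only, because
        # HLA contig names such as HLA-A*01:01:01:01 contain colons.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields:
            contigs[fields["SN"]] = int(fields.get("LN", 0))
    return contigs


def validate_bed_lines(bed_lines, contigs):
    """Return (line_number, line) for every record whose contig is not in the header."""
    problems = []
    for number, line in enumerate(bed_lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and BED track lines
        chrom = stripped.split("\t")[0]
        if chrom not in contigs:
            problems.append((number, stripped))
    return problems


if __name__ == "__main__":
    header = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:248956422\n@SQ\tSN:HLA-A*01:01:01:01\tLN:3503\n"
    bed = ["chr1\t100\t200\n", "1\t100\t200\n", "HLA-A*01:01:01:01\t0\t10\n"]
    for number, line in validate_bed_lines(bed, read_contigs(header)):
        print(f"line {number}: contig not in header: {line}")
```

Because the check is a lookup against whatever the header declares, it needs no pattern for 'chr' or 'HLA-' names at all.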

CBS log ratio CNV detection

Investigate the possibility of adding CBS log ratio CNV detection using the normalised counts.

I think it would be interesting to add another orthogonal method of CNV detection that can be used to confirm or give additional support to the aberrations detected.
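To make the idea concrete, here is a rough Python sketch of segmenting log2 ratios. Note that this is plain recursive binary segmentation with a fixed score threshold, not CBS proper (which tests circular arcs and assesses significance by permutation). The function names, the threshold value and the assumption that all counts are positive are mine.

```python
import math


def log2_ratios(sample_counts, control_counts):
    """log2(sample / control) per region; assumes all counts are positive."""
    return [math.log2(s / c) for s, c in zip(sample_counts, control_counts)]


def best_split(values):
    """Return (index, score) of the split with the largest scaled mean difference."""
    n = len(values)
    total = sum(values)
    best_index, best_score = None, 0.0
    left = 0.0
    for i in range(1, n):
        left += values[i - 1]
        diff = left / i - (total - left) / (n - i)
        score = abs(diff) * math.sqrt(i * (n - i) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


def segment(values, threshold, offset=0):
    """Recursively split `values`; returns (start, end, mean), half-open indices."""
    split, score = best_split(values)
    if split is None or score < threshold:
        return [(offset, offset + len(values), sum(values) / len(values))]
    return (segment(values[:split], threshold, offset)
            + segment(values[split:], threshold, offset + split))


if __name__ == "__main__":
    ratios = [0.0] * 5 + [-1.0] * 3 + [0.0] * 4  # toy single-copy loss
    for start, end, mean in segment(ratios, threshold=0.5):
        print(start, end, mean)
```

Short interior segments are exactly where a single split point is weakest, which is the motivation for the circular variant; a real implementation would probably lean on an existing CBS library instead.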

Investigate Using mosdepth for count calculation instead of samtools depth

Currently the software uses samtools depth to calculate read depth...

mosdepth is a worthwhile alternative to consider since it is 2x faster

https://github.com/brentp/mosdepth
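As a starting point for that investigation, a small Python sketch of how per-region depth could be obtained from mosdepth. It assumes mosdepth is run with `--by` on the target bed file, so that it writes `<prefix>.regions.bed.gz` with the mean depth in the last column; the helper names are hypothetical and nothing here is wired into CoNVaDING.

```python
import gzip


def mosdepth_command(bam_path, bed_path, prefix, threads=1):
    """Build the argument list for a per-region mosdepth run (not executed here)."""
    return ["mosdepth", "--no-per-base", "--threads", str(threads),
            "--by", bed_path, prefix, bam_path]


def read_region_depths(regions_path):
    """Parse <prefix>.regions.bed.gz into (chrom, start, end, mean_depth) tuples.

    The mean depth is taken from the last column, so the parser works
    whether or not the input bed carried a name column.
    """
    depths = []
    with gzip.open(regions_path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            depths.append((fields[0], int(fields[1]), int(fields[2]), float(fields[-1])))
    return depths


if __name__ == "__main__":
    print(" ".join(mosdepth_command("sample.bam", "targets.bed", "sample")))
```

The command list can be handed to `subprocess.run` once mosdepth is installed; whether the resulting means match the current samtools depth based counts closely enough would need to be checked on real data.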

method to normalise off-target read counts

We might take the same approach that CoNVaDING already applies to the targeted regions, or use some sort of windowing approach.
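One possible reading of the windowing approach, sketched in Python: tile the off-target regions into fixed-size windows, then scale each window's count by the sample's median window count. The window size, the use of the median and the function names are all assumptions for illustration; this is not how CoNVaDING normalises its targeted regions.

```python
from statistics import median


def tile(intervals, window):
    """Split (chrom, start, end) intervals into windows of at most `window` bases."""
    tiles = []
    for chrom, start, end in intervals:
        for pos in range(start, end, window):
            tiles.append((chrom, pos, min(pos + window, end)))
    return tiles


def normalise(counts):
    """Divide each window count by the sample's median window count."""
    centre = median(counts)
    if centre == 0:
        raise ValueError("median count is zero; cannot normalise")
    return [count / centre for count in counts]


if __name__ == "__main__":
    print(tile([("chr1", 0, 250)], 100))
    print(normalise([10, 20, 30]))
```

Off-target coverage is sparse, so in practice the windows would likely need to be large enough that the median count is well above zero.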

make norm-count files include the description of how those norm counts were calculated

Currently, when creating the norm-count files from input BAM files, the user can choose to either keep or remove duplicates from the input. For amplicon-based methods we do not want to remove duplicates; for hybridization-based methods we do.

I think the norm-count files should record how those counts were obtained, especially whether or not the input was filtered for duplicates.

This will in turn enable automatic selection of only those controls that were calculated the same way as the sample we want to analyse.

For example, we could have a controls folder with 60 control samples (30 from amplicon-based inputs and 30 from hybridization-based inputs).

When the user selects a new sample to analyse and includes the -rmdups parameter, the script can then read the controls folder and select the best controls only from the subset of 30 that are compatible.
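A minimal Python sketch of how this could work, assuming the metadata is stored as a first-line comment of the form `#rmdups=true` or `#rmdups=false`. That header format and the function names are made up for illustration; the current norm-count files carry no such line.

```python
def write_norm_counts(path, rows, rmdups):
    """Write norm-count rows preceded by a header recording duplicate handling."""
    with open(path, "w") as handle:
        handle.write(f"#rmdups={'true' if rmdups else 'false'}\n")  # hypothetical header
        for row in rows:
            handle.write("\t".join(str(value) for value in row) + "\n")


def read_rmdups(path):
    """Return True/False from the header line, or None when the header is absent."""
    with open(path) as handle:
        first = handle.readline().strip()
    if first.startswith("#rmdups="):
        return first.split("=", 1)[1] == "true"
    return None


def compatible_controls(control_paths, rmdups):
    """Keep only the controls whose counts were produced with the same setting."""
    return [path for path in control_paths if read_rmdups(path) == rmdups]
```

Control files without the header return `None` and are never selected, which is the cautious choice for files written before the metadata existed; the best-control selection would then run on the filtered list only.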
