
Comments (3)

smehringer commented on May 26, 2024

Hi @prasundutta87,

thanks for your interest in SViper!
I try to maintain SViper, but I often have trouble finding the time.

  1. When I was testing the software with one set of samples, I found that most of the FAILs were FAIL5 ("The variant was polished away."). What is the reason behind this failure?

This means that after polishing, the variant is not visible anymore in the data (i.e. in the corrected long reads).

  2. What is meant by FAIL3 ("The long read regions do not fit")? Can this please be elaborated?

This means that for the given variant, no proper region could be extracted from the long read. For example, although a long read may carry the desired deletion of 200 bp, the flanking regions of this deletion are mapped very poorly, or the mapping indicates a complex variant. SViper can only polish simple deletions and insertions.
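To make these two failure modes more concrete, here is a minimal, hypothetical C++17 sketch (not SViper's actual implementation) that inspects a corrected read's CIGAR string: if no indel of roughly the expected size remains, the variant was effectively polished away (FAIL5-like), and if the remaining operations are not plain matches, the region does not fit a simple variant (FAIL3-like). The CIGAR string, expected length, and thresholds are made up for illustration.

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Parse a CIGAR string like "350M200D352M" into (length, operation) pairs.
std::vector<std::pair<int, char>> parse_cigar(std::string const & cigar)
{
    std::vector<std::pair<int, char>> ops;
    std::string number;
    for (char c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
            number += c;
        else
        {
            ops.emplace_back(std::stoi(number), c);
            number.clear();
        }
    }
    return ops;
}

int main()
{
    std::string const cigar = "350M200D352M"; // hypothetical corrected-read alignment
    int const expected_length = 200;          // size of the deletion we expect to see

    int largest_indel = 0;
    bool flanks_clean = true;
    for (auto const & [length, op] : parse_cigar(cigar))
    {
        if (op == 'D' || op == 'I')
            largest_indel = std::max(largest_indel, length);
        else if (op != 'M' && op != '=' && op != 'X')
            flanks_clean = false; // clips or other operations suggest a poorly fitting or complex region
    }

    if (largest_indel < expected_length / 2)
        std::cout << "variant no longer visible in the corrected read (FAIL5-like)\n";
    else if (!flanks_clean)
        std::cout << "long read region does not fit a simple indel (FAIL3-like)\n";
    else
        std::cout << "simple indel of ~" << largest_indel << " bp still present\n";
}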

  3. I am aware that no tags should be present, and SViper skips such variants. With bcftools, I am getting the error that SKIP is not defined. If you are still developing the tool, can that please be added? Although I have changed the SVTYPE to INS, SViper checks the variant type by tags rather than by SVTYPE.

I'll try to change this! But I can't promise that it will happen in the next few days.

  4. I observed that the SViper score is written to the QUAL field. How is the score calculated, and does it have any biological significance?

The score is computed here:

// Score computation
// -----------------
double error_rate = ((double)length(record.cigar) - 1.0)/ (config.flanking_region * 2.0);
double fuzzyness = (1.0 - error_rate/0.15) * 100.0;
variant.quality = std::max(fuzzyness, 0.0);
record.mapQ = variant.quality;

It does not have a biological significance! As far as I remember my own code, the formula was derived experimentally and proved to work well under manual inspection.
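For intuition, here is the same formula evaluated with hypothetical numbers (a flanking region of 400 bp and a final alignment whose CIGAR has 7 operations; both values are made up for illustration):

#include <algorithm>
#include <iostream>

int main()
{
    double const flanking_region = 400.0; // hypothetical config.flanking_region
    double const cigar_length = 7.0;      // hypothetical number of CIGAR operations of the final alignment

    double const error_rate = (cigar_length - 1.0) / (flanking_region * 2.0); // 6 / 800 = 0.0075
    double const fuzzyness = (1.0 - error_rate / 0.15) * 100.0;               // (1 - 0.05) * 100 = 95
    double const score = std::max(fuzzyness, 0.0);

    std::cout << score << '\n'; // 95: few extra alignment events around the variant -> high confidence
}

So a perfectly clean alignment (a single indel plus two match blocks, i.e. 3 CIGAR operations) scores close to 100, and the score drops to 0 once the extra alignment events amount to a 15% "error rate" over the two flanking regions.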

Should I filter my SVs based on the SViper score again? Is there a threshold based on which I should remove SVs?

Unfortunately this is very hard to answer and it heavily depends on your use case. In general, I can say that you should filter out variants with a FAIL tag; those are very unlikely to be true. A low score, on the other hand, might just mean that the polishing didn't work well. I would not filter by the score but only regard it as a confidence measure.
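If it helps, here is a minimal sketch of such a FAIL filter in plain C++, assuming the FAIL label ends up in the FILTER column of the polished VCF and using a made-up file name; if SViper writes the tag elsewhere, the check has to be adjusted accordingly.

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::ifstream in("polished.vcf"); // hypothetical output file name
    std::string line;
    while (std::getline(in, line))
    {
        if (!line.empty() && line[0] == '#') // keep all header lines
        {
            std::cout << line << '\n';
            continue;
        }

        std::istringstream fields(line);
        std::string column;
        for (int i = 0; i < 7 && std::getline(fields, column, '\t'); ++i)
            ; // after the loop, column holds the 7th field (FILTER)

        if (column.rfind("FAIL", 0) != 0) // keep records that are not FAIL-tagged
            std::cout << line << '\n';
    }
}

If the FAIL labels are indeed written to the FILTER column, something like bcftools view -f PASS should achieve the same result without custom code.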

I actually filter my SVs using the QUAL value of the variant caller (cuteSV). So, replacing this value with the score affects my pipeline.

Can you filter the SVs before polishing them with SViper? Otherwise I might need to see if I can add an option that does not overwrite the quality scores but adds them to the INFO field instead.

Best,
Svenja


prasundutta87 commented on May 26, 2024

Hi @smehringer ,

Thank you so much for answering my queries. Is there a document anywhere that describes the algorithm or the inner workings of SViper? I am just trying to get my head around the algorithm (not the numerical/quantification part, but the general concept of polishing). For example, when you say that after polishing the variant is not visible anymore in the corrected long reads, could this please be elaborated?

Thanks for the suggestion to use SViper after my final filtering.

Also, do we need to sort the BAMs by name? It seems to be specifically mentioned for the utilities_merge_split_alignments tool. Currently, I have coordinate-sorted my BAMs.

This may be trivial, but the master branch of SViper reports version 2.0.0 (at least the help page shows 2.0.0), while the most recent release is 2.1.0. Could you kindly clarify this?

Regards,
Prasun


smehringer commented on May 26, 2024

Is there a document anywhere that describes the algorithm or the inner workings of SViper?

I've sent you an email with my thesis, which hopefully answers most of your questions.

Also, do we need to sort the BAMs by name? It seems to be specifically mentioned for the utilities_merge_split_alignments tool. Currently, I have coordinate-sorted my BAMs.

For sviper, sorting by coordinate is fine (in fact, it is required).
Only when using the utility utilities_merge_split_alignments do you need the BAM sorted by name. But do you want to use this utility at all? It's a rather advanced utility that I used in my validation pipelines.
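As a quick way to double-check which order a given BAM declares, the @HD line's SO tag records the sort order. Below is a small sketch using htslib (an assumption on my side that htslib >= 1.10 is available and the program is linked with -lhts); note that it only reads the declared tag and does not verify the actual record order.

#include <cstring>
#include <iostream>

#include <htslib/sam.h> // requires htslib; compile with -lhts

int main(int argc, char ** argv)
{
    if (argc != 2)
    {
        std::cerr << "usage: check_sort <file.bam>\n";
        return 1;
    }

    samFile * in = sam_open(argv[1], "r");
    if (in == nullptr)
    {
        std::cerr << "could not open " << argv[1] << '\n';
        return 1;
    }

    sam_hdr_t * hdr = sam_hdr_read(in);
    if (hdr == nullptr)
    {
        std::cerr << "could not read the BAM header\n";
        sam_close(in);
        return 1;
    }

    char const * text = sam_hdr_str(hdr); // full SAM header text, including the @HD line

    if (std::strstr(text, "SO:coordinate") != nullptr)
        std::cout << "coordinate-sorted (what sviper expects)\n";
    else if (std::strstr(text, "SO:queryname") != nullptr)
        std::cout << "name-sorted (what utilities_merge_split_alignments expects)\n";
    else
        std::cout << "no sort order declared in the @HD line\n";

    sam_hdr_destroy(hdr);
    sam_close(in);
}

If you ever do need the utility, samtools sort -n produces a name-sorted BAM.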

This may be trivial, but the master branch of SViper reports version 2.0.0 (at least the help page shows 2.0.0), while the most recent release is 2.1.0. Could you kindly clarify this?

I'm very sorry for the confusion! This is a documentation bug. I forgot to change the version in the help page. I'll correct this.

