Comments (8)

nickloman commented on August 16, 2024

Thanks for the report. First, could you check that you are using the latest nanopolish, as indel behaviour has been adjusted slightly in recent versions.

So from what I can see, nanopolish is reporting:

  • BaseCalledReadsWithVariant=64
  • BaseCalledFraction=0.160804
  • SupportFraction=0.59658

This means that 16% of the basecalled reads support an insertion at this position, but the nanopolish model believes that actually 60% of reads support an insertion.
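The relationship between these three fields can be checked with a little arithmetic. The following is only a sketch: it assumes that BaseCalledFraction is BaseCalledReadsWithVariant divided by the total number of reads considered, which is not stated in this thread, and the variable names are illustrative.

```python
# Values copied from the nanopolish VCF record quoted above.
basecalled_reads_with_variant = 64
basecalled_fraction = 0.160804
support_fraction = 0.59658

# Assumption: BaseCalledFraction = BaseCalledReadsWithVariant / total reads,
# so the total read count can be recovered by dividing.
implied_total_reads = round(basecalled_reads_with_variant / basecalled_fraction)

# Difference between what the signal-level model believes and what the
# basecalls show.
fraction_gap = support_fraction - basecalled_fraction

print(f"implied total reads: {implied_total_reads}")
print(f"basecalled: {basecalled_fraction:.1%}, model: {support_fraction:.1%}")
print(f"gap: {fraction_gap:.1%}")
```

Under that assumption, the record implies roughly 398 reads, and the model's support is more than 40 percentage points above what the basecalls show.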

So it could well be that there is an insertion at this position (at some frequency).

So I think the behaviour to mask that region with Ns is intended, particularly as there is uncertainty there.

You can turn off indel calling with the --no-indels flag to artic minion if you desire.

@jts may have an opinion on this too.

from fieldbioinformatics.

jts commented on August 16, 2024

Hi @Takadonet,

The QUAL score in the VCF record is very low relative to the number of reads at the position (133.0, 364 reads) so this variant fails our filters. I think this particular variant is a calling artifact and filtering it is the correct behaviour.
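To illustrate what "low relative to the number of reads" means here, the sketch below divides QUAL by depth and compares it to a cutoff. The function names and the cutoff value of 3.0 are assumptions made for illustration only; the thread does not state the actual threshold the pipeline uses.

```python
def qual_per_read(qual: float, depth: int) -> float:
    """Return the VCF QUAL score normalised by read depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return qual / depth


def passes_qual_filter(qual: float, depth: int,
                       min_qual_per_read: float = 3.0) -> bool:
    """Hypothetical filter: keep a variant only if QUAL/depth reaches the cutoff.

    The default cutoff is an illustrative assumption, not a value
    confirmed by the pipeline authors in this thread.
    """
    return qual_per_read(qual, depth) >= min_qual_per_read


# Numbers quoted in the comment above: QUAL 133.0 at 364 reads.
ratio = qual_per_read(133.0, 364)
print(f"QUAL per read: {ratio:.3f}")
print("passes filter:", passes_qual_filter(133.0, 364))
```

With these numbers the ratio is well below one unit of QUAL per read, so any cutoff of that order would reject the variant.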

We consider positions that fail our filters as being unreliable to make a definitive call at, so we mask them with an N so they don't confuse phylogenetic analysis.

Thanks for pointing this variant out. I recently tweaked the indel calling behaviour, so I appreciate the feedback. I'll keep an eye on these spurious insertions.

Jared

Takadonet commented on August 16, 2024

Thanks for the quick reply!

We are currently on nanopolish version 0.13.0 (latest env.yml from artic-ncov2019), and I see that you updated to 0.13.1 in this repo a few days ago.

I will make a new conda env and report back the results at these positions.

Since there is such a great disparity between what the end user can review in IGV/Tablet and what nanopolish is calling, is there another method to review the bases being called?

Is there any way to address their concerns about the differences? Since there are over 10k samples in GISAID now, there is no point rushing to be first; we are more concerned with submitting quality, reviewed consensus sequences.

nickloman commented on August 16, 2024

You could try the medaka pipeline and compare with that?

I am confused, though, about the difference between the basecalled frequency (16%) and the IGV view, which suggests only 2% (8 insertions at depth 395). Would you be able to share the BAM file with me (perhaps privately)?
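The size of that discrepancy can be made concrete. This sketch uses only the numbers quoted in the thread; the variable names are illustrative, and it assumes the IGV depth and the nanopolish read count refer to a comparable set of reads.

```python
# Counts read off the IGV view, as quoted above.
igv_insertions = 8
igv_depth = 395

# BaseCalledFraction reported by nanopolish for the same position.
reported_basecalled_fraction = 0.160804

# Fraction of reads showing the insertion according to IGV.
igv_fraction = igv_insertions / igv_depth

# Number of insertion reads IGV would show if the reported fraction held
# at the IGV depth (assumes both tools see a comparable set of reads).
expected_insertions = reported_basecalled_fraction * igv_depth

print(f"IGV fraction: {igv_fraction:.2%}")
print(f"expected insertion reads at 16%: {expected_insertions:.1f}")
print(f"observed insertion reads: {igv_insertions}")
```

If the reported fraction applied at the IGV depth, around eight times as many insertion-bearing reads would be visible as were actually counted.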

Takadonet commented on August 16, 2024

I will try out medaka as well and see how they differ.

I am confused about the differences as well, which is why I am seeking help. I am just asking permission before sharing the BAM file with you privately.

Takadonet commented on August 16, 2024

Sent the BAM file link to your email, @nickloman.

nickloman commented on August 16, 2024

Thanks. From what I can see in the alignment, there is some support for that variant (judging from Tablet), but not enough for it to be called a variant. I agree you could confidently call reference there. One easy way of doing that is to add '--no-indels' to artic minion (but be cautious, you won't get any indels with that). I might add another optional argument to the pipeline to let it drop those low-confidence variants instead of masking them.

Takadonet commented on August 16, 2024

Thanks for taking a look. Using the medaka beta addressed this particular issue for these strains but failed for others. We are investigating what is going on.

For the moment we are just going to fix them 'by hand', since there are only 3 SNVs across a dozen strains, but in future we will use the optional argument when it is available.
