
Comments (5)

andrewjpage commented on August 20, 2024

The recombination regions are masked out with Ns.

On 16 Aug 2016 03:03, "cam09" wrote:

Hi,

Is there a reason behind the filtered_polymorphic_sites.phylip file
containing 'N' bases in some of the sequences? The input file does not
contain any, so I'm unsure why N's are generated.

Thanks,
Cam



from gubbins.

bforde commented on August 20, 2024

Hi Andrew, can you clarify, please? In my alignment files the number of Ns
does not correlate with the number of SNPs identified in recombinant regions.
Also, the Ns in the alignment file are skewing the topology of the final tree
and introducing branches where there should be none.

Cheers

Brian


andrewjpage commented on August 20, 2024

Hi Brian,
The whole recombination region gets masked out with Ns, not just the SNPs. You mentioned that there are branches where you expect none. Is the scale the same between your original tree and the Gubbins output? In other words, were the branches always there but just compressed due to recombination?
Unfortunately, without looking at the data I can't really give you an answer.
Regards,
Andrew
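
A minimal sketch of the point above, not Gubbins's own code: if a whole region is replaced with N in one sequence, the number of Ns equals the length of the region rather than the number of SNPs inside it. The sequences, the region coordinates, and the helper `mask_region` are hypothetical and chosen only for illustration.

```python
def mask_region(sequence: str, start: int, end: int, mask_char: str = "N") -> str:
    """Return `sequence` with the half-open interval [start, end) replaced by `mask_char`."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("region must lie within the sequence")
    return sequence[:start] + mask_char * (end - start) + sequence[end:]


# Hypothetical data: the second sequence differs from the first at
# 0-based positions 3, 7 and 10, all inside an assumed region [2, 12).
reference = "ACGGGACAGGGAGGTCTCACAATGCAA"
recombinant = "ACGTGACCGGAAGGTCTCACAATGCAA"
region_start, region_end = 2, 12

# SNPs that fall inside the assumed recombination region.
snps_in_region = sum(
    1
    for ref_base, alt_base in zip(
        reference[region_start:region_end], recombinant[region_start:region_end]
    )
    if ref_base != alt_base
)

masked = mask_region(recombinant, region_start, region_end)

print(masked)             # ACNNNNNNNNNNGGTCTCACAATGCAA
print(snps_in_region)     # 3
print(masked.count("N"))  # 10
```

Here 3 SNPs lead to 10 masked positions, so the N count and the SNP count are not expected to match.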


bforde commented on August 20, 2024

Hi Andrew,

Thanks for getting back. We are actually seeing the opposite, i.e. there
are not enough masked regions in the filtered alignment file.

With regard to the branch, it might best be explained with an example.

Based on the filtered alignment file, strainB has diverged directly from
strainA. However, in the tree we see strainA and strainB diverging from a
common ancestor. I can only conclude that the Ns in the sequence of strainB
are introducing some uncertainty.

I am happy to send you the data if it would help.

Brian


areejalsheikh commented on August 20, 2024

Hi Andrew,

I have a similar question to Brian's. I found the N's in my phy file, which I now understand are masked recombinant regions. What I don't understand is why some strains have those N's but not others. If these are core SNPs, shouldn't recombinant SNPs be masked in all strains?

E.g.

ACGGGACAGGGAGGTCTCACAATGCAA
ANNNNNNNNNNNNNNNNNNNNNNNNNA
ACGGGACAGGGAGGTCTCACAATGCAA

My concern is that the number of SNPs (shown in the header of the phy file) is larger than expected by 1000, which I think is accounted for by those N's.

Thanks for your help.
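
One way to check how much the N's contribute is to tally them per sequence and to count the columns that remain polymorphic once N is ignored. The sketch below assumes the alignment has already been loaded into a dictionary of equal-length sequences. The helper names `count_ns` and `polymorphic_columns` and the toy data are hypothetical, and this is not a description of how Gubbins computes the number in the file header.

```python
from typing import Dict, List


def count_ns(alignment: Dict[str, str]) -> Dict[str, int]:
    """Number of N characters in each sequence."""
    return {name: seq.upper().count("N") for name, seq in alignment.items()}


def polymorphic_columns(alignment: Dict[str, str], ignore: str = "N-") -> List[int]:
    """0-based columns with at least two distinct bases after dropping `ignore` characters."""
    sequences = list(alignment.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    columns = []
    for i in range(length):
        bases = {seq[i].upper() for seq in sequences} - set(ignore)
        if len(bases) > 1:
            columns.append(i)
    return columns


# Toy alignment in which only the second sequence carries masked positions.
alignment = {
    "strain1": "ACGTA",
    "strain2": "ANNNA",
    "strain3": "ACTTC",
}

print(count_ns(alignment))             # {'strain1': 0, 'strain2': 3, 'strain3': 0}
print(polymorphic_columns(alignment))  # [2, 4]
```

Comparing the length of the alignment with the number of columns returned by `polymorphic_columns` gives a rough idea of how many sites are variable only because of N's.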
