Hi, First of all, thanks for developing this tool! I have a few


DivNet on rRNA gene counts derived from metagenomes? about divnet HOT 7 OPEN

mgabriell1 commented on July 17, 2024

DivNet on rRNA gene counts derived from metagenomes?

from divnet.

Comments (7)

scubalaina commented on July 17, 2024

Hi there,

I also have a similar question and just wanted to boost this!
I'm working with RNAP (the B and B' subunit genes, separately), which are single-copy markers, so I don't have to worry about copy numbers skewing things. But I'm wondering how the diversity calculations are implemented and interpreted with metagenomic data in which the whole single-gene community accounts for only a very small portion of the reads/members of the community. In other words, their relative abundances will not sum to 1?

Thanks,
Alaina :)


mooreryan commented on July 17, 2024

@scubalaina I have used DivNet in a similar way to you. When you are running it on the subcommunity (i.e., just the RNA pol seqs), you are passing the data to DivNet as counts, right? If so, it will go through its process treating that subcommunity as the samples/community in the right way.
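To make the subcommunity point concrete, here is a minimal sketch in plain Python. It does not call DivNet; the variant names and all counts are hypothetical. It only shows that proportions computed within the marker-gene subcommunity sum to 1, even though the same counts are a tiny fraction of the whole metagenome.

```python
# Hypothetical read counts for marker-gene variants in one sample.
marker_counts = {"rpoB_variant_1": 120, "rpoB_variant_2": 45, "rpoB_variant_3": 15}

# Hypothetical total number of reads in the metagenome for that sample.
total_reads_in_sample = 2_000_000


def relative_abundances(counts, denominator):
    """Divide every count by the chosen denominator."""
    return {name: count / denominator for name, count in counts.items()}


# Denominator = marker-gene reads only: the subcommunity is treated as the community.
within_marker = relative_abundances(marker_counts, sum(marker_counts.values()))

# Denominator = all reads: the marker genes are a tiny slice of the metagenome.
within_metagenome = relative_abundances(marker_counts, total_reads_in_sample)

print(sum(within_marker.values()))      # approximately 1
print(sum(within_metagenome.values()))  # far below 1
```

Passing the raw marker-gene counts corresponds to the first case: the count table itself defines the community.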


scubalaina commented on July 17, 2024

Hi Ryan,

Ok great! I am using reads per kilobase because each gene has a different length and I need to normalize for that. It's great that it has worked for you with a subcommunity of the data, so it should work similarly for mine.

Thanks,
Alaina :)


mooreryan commented on July 17, 2024

@scubalaina something to keep in mind about normalizations: you will be changing the read counts, which could have an effect on variance estimation. Check out this tiny example. It's a contrived example where each gene has the same length, but the counts are still normalized by gene length (i.e., reducing the count equally for all samples/genes in this particular example, and so increasing the variance). Of course this is just a silly example, but the point is that normalizing could impact variance estimates. In practice, though, I'm not sure how much of an issue it will be. Someone from the Willis lab will have to comment on that.
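As a rough illustration of the same idea in plain Python (this is not the attached R script and not DivNet's estimator): the sketch below uses a simple multinomial model as a stand-in and compares the spread of the plug-in Shannon estimate at a hypothetical depth of 4000 reads against a depth of 1000, which is what the totals would look like if every count were divided by a 4 kb gene length. The proportions, depths, and replicate number are all made up.

```python
import math
import random
import statistics
from collections import Counter


def shannon(counts):
    """Plug-in Shannon index (natural log) from a collection of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simulate_shannon_sd(proportions, depth, replicates, rng):
    """Standard deviation of the plug-in Shannon estimate across simulated samples."""
    taxa = range(len(proportions))
    estimates = []
    for _ in range(replicates):
        draws = rng.choices(taxa, weights=proportions, k=depth)
        estimates.append(shannon(Counter(draws).values()))
    return statistics.stdev(estimates)


rng = random.Random(1)
proportions = [0.5, 0.3, 0.15, 0.05]  # hypothetical true composition

# Same community, but the second call sees one quarter of the total count.
sd_raw = simulate_shannon_sd(proportions, depth=4000, replicates=200, rng=rng)
sd_scaled = simulate_shannon_sd(proportions, depth=1000, replicates=200, rng=rng)

print(f"SD at depth 4000: {sd_raw:.4f}")
print(f"SD at depth 1000: {sd_scaled:.4f}")
```

The composition is identical in both runs; only the total count differs. A count-based model reads smaller totals as less information, so the estimate is noisier.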

One other thing if you're doing some normalization: think of a gene in a sample that has a low count like 2 but is 4 kb long, so its "per kilobase" count would be 0.5. Depending on your choice of pseudocount (for example, 0.5 was chosen for the analysis in the DivNet manuscript), that could be about the same as the normalized count. Another thing to keep in mind.
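The arithmetic from that paragraph, written out as a small Python sketch. The helper name `reads_per_kilobase` is made up for illustration; the numbers are the ones from the text above.

```python
def reads_per_kilobase(count, gene_length_bp):
    """Normalize a raw read count by gene length expressed in kilobases."""
    return count / (gene_length_bp / 1000)


pseudocount = 0.5  # value mentioned above for the DivNet manuscript analysis

# A gene observed twice in a sample, with a length of 4 kb.
observed_rpk = reads_per_kilobase(count=2, gene_length_bp=4000)

print(observed_rpk)                 # 0.5
print(observed_rpk == pseudocount)  # True
```

So after normalization, a gene that really was observed ends up with the same magnitude as the value used to stand in for unobserved genes.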

divnet_rpk_variance.R.txt <https://github.com/adw96/DivNet/files/11871614/divnet_rpk_variance.R.txt>

(Not relevant to this discussion, but I work in a viral ecology lab, so I know some of your papers! Just a cool coincidence 😄)


scubalaina commented on July 17, 2024

Hi Ryan,

Ah I see! That makes sense! Thank you for taking the time to demonstrate. I really appreciate your help in understanding this all. I clearly needed to take more stats classes in grad school haha.

I wonder how one could avoid compromising variance calculations without overestimating the abundance of longer genes if gene length isn't accounted for? I did notice when I ran DivNet on my normalized read counts that differences in Shannon diversity were no longer significant, or at least the DivNet output had overlapping confidence intervals. I attached the example below of DivNet vs. a Wilcoxon test of vegan's Shannon diversity calculation. Should I be interpreting this as no difference between the diversity of these groups?

Sorry to take up more of your time! I really, really appreciate the help! Awesome you're in viral ecology! I think I saw you're in Eric Wommack's lab? Super cool!

Thanks again for your time and help!
Alaina :)
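On the question of overlapping confidence intervals, a small Python sketch may help frame it. It is a generic Wald-type comparison of two independent estimates under a normal approximation, not DivNet's own testing procedure, and the estimates and standard errors are invented. It shows that two 95% intervals can overlap while a direct test of the difference still gives p < 0.05, so overlap alone may not settle the question either way.

```python
import math
from statistics import NormalDist

# Hypothetical Shannon estimates and standard errors for two groups.
estimate_a, se_a = 2.00, 0.10
estimate_b, se_b = 2.30, 0.10

z_95 = NormalDist().inv_cdf(0.975)  # about 1.96

# 95% confidence interval for each group on its own.
ci_a = (estimate_a - z_95 * se_a, estimate_a + z_95 * se_a)
ci_b = (estimate_b - z_95 * se_b, estimate_b + z_95 * se_b)
intervals_overlap = ci_a[1] > ci_b[0] and ci_b[1] > ci_a[0]

# Wald test for the difference, assuming the two estimates are independent.
se_difference = math.sqrt(se_a**2 + se_b**2)
z_statistic = (estimate_b - estimate_a) / se_difference
p_value = 2 * (1 - NormalDist().cdf(abs(z_statistic)))

print(f"CI A: ({ci_a[0]:.3f}, {ci_a[1]:.3f})")
print(f"CI B: ({ci_b[0]:.3f}, {ci_b[1]:.3f})")
print(f"Intervals overlap: {intervals_overlap}")
print(f"p-value for the difference: {p_value:.4f}")
```

Whether this applies to a given dataset depends on how the standard errors were obtained, which is the variance question discussed above.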


mooreryan commented on July 17, 2024

> I wonder how one could avoid compromising variance calculations without overestimating the abundance of longer genes if gene length isn't accounted for?

^ Yeah, that's a good question...as far as I know that is still an open research question. Someone from the Willis lab will have to weigh in here.

> I attached the example below

^ I think you may have forgotten the attachment...I'm not seeing it.

(Yep in Wommack's lab...small world haha!)


scubalaina commented on July 17, 2024

Hi Ryan,

Sorry, I was corresponding via email, so the attachment probably didn't come through on GitHub. Here it is!
