Comments (7)
Hi there,
I also have a similar question and just wanted to boost this!
I'm working with RNAP (the B and B' subunit genes, separately), which are single-copy markers, so I don't have to worry about copy numbers skewing things. But I'm wondering how the diversity calculations are implemented and interpreted for metagenomic data in which the single-gene community accounts for only a very small portion of the reads/members of the whole community. In other words, their relative abundances will not sum to 1?
Thanks,
Alaina :)
from divnet.
@scubalaina I have used DivNet in a similar way. When you run it on the subcommunity (i.e. just the RNA pol seqs), you are passing the data to DivNet as counts, right? If so, it will go through its process treating that as the samples/community in the right way.
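To make the point about counts concrete, here is a minimal sketch in plain Python (not DivNet, and not its internals). The taxon names and counts are made up for illustration. It only shows that once the marker-gene counts are treated as the whole community, the proportions are taken relative to the marker-gene total, so they sum to 1 within that subcommunity:

```python
def relative_abundances(counts):
    """Convert a {taxon: count} mapping into proportions of the total count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute proportions from an all-zero sample")
    return {taxon: count / total for taxon, count in counts.items()}


# Hypothetical read counts for one sample, restricted to a single marker gene.
marker_counts = {"taxon_A": 120, "taxon_B": 30, "taxon_C": 50}

props = relative_abundances(marker_counts)
print(props)
print(sum(props.values()))
```

The rest of the metagenome never enters the denominator, so the small share of reads that the marker gene represents does not stop the subcommunity proportions from summing to 1.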
@scubalaina something to keep in mind about normalizations: you will be changing the read counts, which could have an effect on variance estimations. Check out this tiny example. It's a silly contrived example where each gene has the same gene length, but the counts are still normalized by the gene length (i.e. reducing the count equally for all samples/genes in this particular example, and so increasing the variance). Of course this is just a silly example, but the point is that normalizing could impact variance estimations. Though, in practice, I'm not sure how much of an issue it will be. Someone from the Willis lab will have to comment on that.
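A rough sketch of why scaling counts can matter, assuming a simple multinomial plug-in variance for a proportion, p(1 - p)/n. This is an illustration only; it is not DivNet's variance estimator, and the counts and the shared gene length are made up:

```python
def proportion_variance(counts, index):
    """Plug-in multinomial variance p * (1 - p) / n for one category's proportion."""
    n = sum(counts)
    p = counts[index] / n
    return p * (1 - p) / n


raw = [40, 60]              # hypothetical raw counts for two genes in one sample
gene_length_kb = 4.0        # both genes share the same length in this toy case
normalized = [c / gene_length_kb for c in raw]

var_raw = proportion_variance(raw, 0)
var_norm = proportion_variance(normalized, 0)

print(var_raw, var_norm, var_norm / var_raw)
```

The proportions are identical before and after dividing by length, but the apparent total count drops from 100 to 25, so under this approximation the variance is four times larger.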
One other thing if you're doing some normalization: think of a gene in a sample that has a low count like 2, but it is a 4 kb gene, so its "per kilobase" count would be 0.5. Depending on your choice of pseudocount (for example, 0.5 was chosen in the DivNet manuscript for the analysis), that could be around the same as that normalized count. Another thing to keep in mind.
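The arithmetic above, as a small Python sketch. The function name `per_kilobase` is just for illustration; the count, gene length, and pseudocount are the values mentioned in the comment:

```python
def per_kilobase(count, gene_length_bp):
    """Scale a raw count by gene length expressed in kilobases."""
    return count / (gene_length_bp / 1000)


raw_count = 2
gene_length_bp = 4000       # a 4 kb gene
pseudocount = 0.5           # the value mentioned above for the DivNet manuscript analysis

normalized = per_kilobase(raw_count, gene_length_bp)
print(normalized)
print(normalized <= pseudocount)
```

In this case an observed gene with two reads ends up at the same value a pseudocount would assign to an unobserved one, so the two become indistinguishable after normalization.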
(Not relevant to this discussion, but I work in a viral ecology lab, so I know some of your papers! Just a cool coincidence 😄)
I wonder how one could avoid compromising variance calculations without overestimating the abundance of longer genes if gene length isn't accounted for?
^ Yeah, that's a good question. As far as I know, that is still an open research question. Someone from the Willis lab will have to weigh in here.
I attached the example below
^ I think you may have forgotten the attachment. I'm not seeing it.
(Yep in Wommack's lab...small world haha!)
Hi Ryan,
Sorry, I was corresponding via email, so the attachment probably didn't come through on GitHub. Here it is!