Comments (6)
Hi Jacob,
I am not a DivNet developer, so take my comments with that caveat in mind. Also, I'm not sure about network methods so I'll leave any comment on that to a developer.
That taxa count does seem quite high for 69 samples, though without knowing your study system it's hard to say whether you should be suspicious of it. You might comment on the DADA2 page about using DADA2 with your data set.
The error, "cannot allocate vector of size...", occurs when R cannot get enough memory for a vector it needs to create, in this case while running divnet(). From what you said, it sounds like the memory limit on the node you're using is 190 GB, and your data, or your data plus whatever other objects are in your R environment, exceeds that.
Some of these might help...
- Request more memory from the cluster for your calculation.
- Clean out unneeded objects in your R environment before running divnet(); some functions for this are described here: https://www.programmingr.com/r-error-messages/cannot-allocate-vector-of-size/
- Use tax_glom() to reduce the taxa count if you think that high taxa count is representative of the environment sampled.
- Subset your data into a smaller data set.
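For the second suggestion, here is a minimal base-R sketch of checking which objects take up the most space and freeing them before the call. The object names `big_intermediate` and `keep_me` are made up for illustration; substitute whatever is actually in your session.

```r
# Made-up objects standing in for whatever is in your session.
big_intermediate <- numeric(1e6)   # about 8 MB of doubles
keep_me <- 1:10

# Size in bytes of every object in the current environment, largest first.
obj_names <- ls()
sizes <- vapply(mget(obj_names),
                function(x) as.numeric(object.size(x)),
                numeric(1))
print(sort(sizes, decreasing = TRUE))

# Drop what is not needed, then trigger garbage collection so the
# memory can be reused.
rm(big_intermediate)
invisible(gc())
```

This only helps if other objects are the problem. If the divnet() fit itself needs more than the node can offer, clearing the environment will not be enough.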
Best,
-Mike
from divnet.
Yeah, I would also be suspicious of ~70,000 ASVs from 70 samples... it seems abnormally high.
But for argument's sake, let's assume you do have 70,000 good ASVs. The number of samples isn't what's causing the huge resource usage; it's the high number of taxa. Even the Rust version will take a while and use a decent amount of memory on a dataset with 70,000 taxa. If I have more than a few thousand taxa, I generally switch to the Rust version.
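To put rough numbers on why the taxa count dominates, here is a back-of-envelope sketch in base R. It assumes (this is an assumption, not something confirmed in this thread) that the fit has to hold at least one dense taxa-by-taxa matrix of 8-byte doubles, such as a covariance matrix among taxa. The helper name `dense_matrix_gb` is hypothetical.

```r
# Hypothetical helper: memory in GB (1 GB = 1e9 bytes) for one dense
# n_taxa x n_taxa matrix of doubles (8 bytes per entry).
dense_matrix_gb <- function(n_taxa) {
  n_taxa^2 * 8 / 1e9
}

taxa_counts <- c(1000, 5000, 70000)
data.frame(taxa = taxa_counts,
           gb_per_matrix = dense_matrix_gb(taxa_counts))
```

Under that assumption a single such matrix costs about 0.008 GB at 1,000 taxa, 0.2 GB at 5,000 taxa, and about 39 GB at 70,000 taxa. The number of samples does not appear in the calculation at all, and any working copies made during the fit would multiply the figure.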
Alternatively, you can try collapsing your ASVs to a higher level with tax_glom or something similar, to get the taxa count down to a more manageable level.
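A minimal sketch of that collapsing step with phyloseq's `tax_glom()`, using the `GlobalPatterns` example data that ships with phyloseq as a stand-in for your own object. The choice of `"Genus"` is only an illustration, and the `divnet()` call is left commented out because `"SampleType"` is just a placeholder covariate from the example data.

```r
library(phyloseq)

data(GlobalPatterns)            # example data bundled with phyloseq
ntaxa(GlobalPatterns)           # number of taxa before collapsing

# Merge all taxa that share the same genus-level classification.
# By default (NArm = TRUE), taxa with a missing genus are dropped.
gp_genus <- tax_glom(GlobalPatterns, taxrank = "Genus")
ntaxa(gp_genus)                 # fewer taxa than before

# The collapsed object could then be passed on, for example:
# dv <- DivNet::divnet(gp_genus, X = "SampleType", ncores = 4)
```

Keep in mind that the diversity estimate then describes the chosen rank, not the ASVs.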
from divnet.
Thanks! Yeah, the Rust version has been crashing on me too. I'll take it up with those developers next. I'm beginning to think that there may actually be 70,000 ASVs, since I found another dataset from the Chesapeake Bay that has 300k ASVs in it.
I'm not a huge fan of using tax_glom, since I'd really prefer an ASV-level Shannon index rather than a Shannon index at some other level.
from divnet.
Oh, wait, I'm talking to @mooreryan -- you are the divnet-rs developer. I'm having the same problem in divnet-rs. That again overloads the memory allocation that I give it on the cluster (~180TB) and then the job gets killed. Is it worth opening an issue over on divnet-rs, or should I just not try to calculate divnet indexes on these highly "diverse" datasets?
from divnet.
Regarding @msmcfarlin's suggestion about subsetting: is it OK to run divnet or divnet-rs on each sample (or small set of samples) separately, assuming I'm not using the network features? That might get me around the memory issues.
from divnet.
If you would like, feel free to open an issue on the divnet-rs GitHub and we can try to figure out what's going on there.
from divnet.