Comments (7)
Currently, the framework for dealing with multiple batches is to instantiate the dataset using the "get_attributes_from_list" method (giving it a gene-expression matrix for each batch), which yields a batch-specific prior. I agree that this is particularly important when dealing with different technologies, which may have very different library size distributions.
from scvi-tools.
We probably also want a "prior" on library size that depends on batch id, i.e., a conditional prior. I'm not sure whether we already have that. In the paper, library size doesn't depend on batch id, but it probably should -- especially if one batch is scRNA-seq data and the other is smFISH data.
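A batch-conditional log-normal prior could be parameterized empirically from the per-batch mean and variance of the observed log library sizes. A minimal sketch (`batch_library_prior` is a hypothetical helper for illustration, not an existing scvi-tools function):

```python
import numpy as np

def batch_library_prior(counts, batch_ids):
    """Per-batch mean and variance of the log library size.

    counts: (cells x genes) array of raw counts.
    batch_ids: length-(cells) array assigning each cell to a batch.
    Returns arrays that could parameterize a log-normal prior p(l | batch).
    """
    log_lib = np.log(counts.sum(axis=1))
    batches = np.unique(batch_ids)
    means = np.array([log_lib[batch_ids == b].mean() for b in batches])
    variances = np.array([log_lib[batch_ids == b].var() for b in batches])
    return means, variances
```

Each batch then gets its own prior parameters, so an smFISH batch and an scRNA-seq batch would no longer share a single library-size prior.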
@jeff-regier, I implemented what we discussed to try to improve the variational distribution (namely, dropping the log-normal for a standard normal), but it gives poorer and more unstable log-likelihood scores (probably because the KL term no longer roughly guides the encoder).
To check whether a better variational distribution was worth it at all, I also ran scVI without trying to learn the library size (taking the exact observed value every time rather than encoding/decoding it).
It converges faster but overall does not give a better likelihood than the one in Romain's paper.
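The "exact library size" variant described above amounts to feeding the decoder each cell's observed log total count instead of a sample from q(l | x). A sketch in PyTorch (variable names are assumptions for illustration, not the scvi-tools API):

```python
import torch

# Toy (cells x genes) count matrix
x = torch.tensor([[3., 1., 0.], [5., 2., 1.]])

# Observed log library size, used directly instead of an encoded sample
log_library = torch.log(x.sum(dim=-1, keepdim=True))

# A decoder would scale its normalized expression output (which sums to 1
# per cell) by exp(log_library); no q(l | x) encoder and no library-size
# KL term are needed in this variant.
scale = torch.softmax(torch.randn(2, 3), dim=-1)  # stand-in decoder output
rate = torch.exp(log_library) * scale             # expected counts per gene
```

By construction the expected counts per cell then sum to the observed library size exactly.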
Isn't there still a KL term for library size? It's just a KL divergence between a normal (the variational distribution) and a standard normal (the prior), rather than a KL divergence between a log normal and a log normal, right?
Yes, there is still a KL term. What I meant is that with the log-normal prior, the KL term actually gives the model information about the mean and variance of the log library size, while the KL between a normal and a standard normal doesn't.
OK, let's make sure we're calculating log p(x) correctly, as discussed. If it still isn't better, we can probably conclude that a log normal isn't that bad of a variational distribution for library size.
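For checking log p(x), a standard approach is an importance-weighted Monte Carlo estimate over samples from the variational distribution. A generic sketch (not scvi-tools' internal code; the tensor layout is an assumption):

```python
import math
import torch

def iw_log_likelihood(log_px_z, log_pz, log_qz):
    """Importance-weighted estimate of log p(x) per cell.

    Each argument has shape (n_samples, n_cells), holding log p(x|z),
    log p(z), and log q(z|x) for Monte Carlo samples z ~ q(z|x).
    Returns a length-(n_cells) tensor estimating log p(x).
    """
    log_w = log_px_z + log_pz - log_qz          # log importance weights
    n_samples = log_w.shape[0]
    # log (1/n) * sum_i w_i, computed stably in log space
    return torch.logsumexp(log_w, dim=0) - math.log(n_samples)
```

With a single sample this reduces to the usual ELBO-style term; more samples tighten the bound, which matters when comparing variational families.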
@maxime1310 should we close this and stick with the variational distribution we have? It seems like log normal may be pretty good after all.