Comments (7)
Currently, the framework for dealing with multiple batches is to instantiate the dataset using the "get_attributes_from_list" method (giving it a gene-expression matrix for each batch), which yields a batch-specific prior. I agree that this is particularly important when dealing with different technologies, which may have very different library size distributions.
from scvi-tools.
We probably also want a "prior" on library size that depends on batch id, i.e., a conditional prior. I'm not sure whether we already have that. In the paper, library size doesn't depend on batch id, but it probably should -- especially if one batch is scRNA-seq data and the other is smFISH data.
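A batch-conditional log-normal prior could be parameterized empirically from the per-batch mean and variance of the observed log library sizes. A minimal sketch (`batch_library_prior` is a hypothetical helper for illustration, not an existing scvi-tools function):

```python
import numpy as np

def batch_library_prior(counts, batch_ids):
    """Per-batch mean and variance of the log library size.

    counts: (cells x genes) array of raw counts.
    batch_ids: length-(cells) array assigning each cell to a batch.
    Returns arrays that could parameterize a log-normal prior p(l | batch).
    """
    log_lib = np.log(counts.sum(axis=1))
    batches = np.unique(batch_ids)
    means = np.array([log_lib[batch_ids == b].mean() for b in batches])
    variances = np.array([log_lib[batch_ids == b].var() for b in batches])
    return means, variances
```

Each batch then gets its own prior parameters, so an smFISH batch and an scRNA-seq batch would no longer share a single library-size prior.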
@jeff-regier, I implemented what we discussed to try to improve the variational distribution (namely, dropping the log-normal for a standard normal), but it gives poorer and more unstable log-likelihood scores (probably because the KL term no longer roughly guides the encoder).
To check whether a better variational distribution was worth it at all, I also ran scVI without trying to learn the library size (taking the exact observed value every time rather than encoding/decoding it).
It converges faster but overall does not give a better likelihood than the one in Romain's paper.
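The "exact library size" variant described above amounts to feeding the decoder each cell's observed log total count instead of a sample from q(l | x). A sketch in PyTorch (variable names are assumptions for illustration, not the scvi-tools API):

```python
import torch

# Toy (cells x genes) count matrix
x = torch.tensor([[3., 1., 0.], [5., 2., 1.]])

# Observed log library size, used directly instead of an encoded sample
log_library = torch.log(x.sum(dim=-1, keepdim=True))

# A decoder would scale its normalized expression output (which sums to 1
# per cell) by exp(log_library); no q(l | x) encoder and no library-size
# KL term are needed in this variant.
scale = torch.softmax(torch.randn(2, 3), dim=-1)  # stand-in decoder output
rate = torch.exp(log_library) * scale             # expected counts per gene
```

By construction the expected counts per cell then sum to the observed library size exactly.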
Isn't there still a KL term for library size? It's just a KL divergence between a normal (the variational distribution) and a standard normal (the prior), rather than a KL divergence between a log normal and a log normal, right?
Yes, there is still a KL term. What I meant is that with the log-normal prior, the KL term actually gives the model information about the mean and variance of the log library size, while the KL between a normal and a standard normal doesn't.
OK, let's make sure we're calculating log p(x) correctly, as discussed. If it still isn't better, we can probably conclude that a log normal isn't that bad of a variational distribution for library size.
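For checking log p(x), a standard approach is an importance-weighted Monte Carlo estimate over samples from the variational distribution. A generic sketch (not scvi-tools' internal code; the tensor layout is an assumption):

```python
import math
import torch

def iw_log_likelihood(log_px_z, log_pz, log_qz):
    """Importance-weighted estimate of log p(x) per cell.

    Each argument has shape (n_samples, n_cells), holding log p(x|z),
    log p(z), and log q(z|x) for Monte Carlo samples z ~ q(z|x).
    Returns a length-(n_cells) tensor estimating log p(x).
    """
    log_w = log_px_z + log_pz - log_qz          # log importance weights
    n_samples = log_w.shape[0]
    # log (1/n) * sum_i w_i, computed stably in log space
    return torch.logsumexp(log_w, dim=0) - math.log(n_samples)
```

With a single sample this reduces to the usual ELBO-style term; more samples tighten the bound, which matters when comparing variational families.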
@maxime1310 should we close this and stick with the variational distribution we have? It seems like log normal may be pretty good after all.