Hi, I was hoping to get the derived pseudobulk samples returned from

Return pseudobulk samples values from test_de (glmgampoi, closed)

Al-Murphy commented on August 27, 2024

Return pseudobulk samples values from test_de

from glmgampoi.

Comments (3)

const-ae commented on August 27, 2024

Hi Alan,

No, there is currently no way to extract the pseudobulk samples from test_de(). However, it is not too difficult to form the pseudobulk yourself. See for example this unit test, where I check that the manual way of forming the pseudobulk is consistent with the result from test_de() (the relevant code is here).

The (possible) issue is I am observing very large LFC values (in the thousands)

Hm, this can happen if you only have zeros for one condition. However, the large LFC shouldn't disturb the statistical test. If there actually is a problem, please let me know; I am happy to help :)

Best,
Constantin


Al-Murphy commented on August 27, 2024

Hey Constantin,

Thanks once again for this, it was very helpful!

Just one question, I noted in your unit test you aggregate your pseudobulk data by patient (sample):

splitter <- split(seq_len(ncol(se)), SummarizedExperiment::colData(se)$sample)
pseudobulk_mat <- do.call(cbind, lapply(splitter, function(idx){
  matrixStats::rowSums2(SummarizedExperiment::assay(se), cols = idx)
}))

For my analysis, I'm looking at differential expression per cell type, so should I pass both the cell type and the patient ID to splitter to derive the pseudobulk data?

I'm assuming the package does something like this in test_de with the conditions passed from glm_gp? I ask because when I ran test_de on my data, I set pseudobulk_by to the patient ID. Is this what you would expect, or should it be the combination of patient ID and cell type?

Thanks,
Alan.


const-ae commented on August 27, 2024

For my analysis, I'm looking at differential expression per cell type, so should I pass both the cell type and the patient ID to splitter to derive the pseudobulk data?

Good question and yes, I would recommend providing both the celltype and the sample label. For example pseudobulk_by = paste0(sample, "-", celltype) and splitter <- split(seq_len(ncol(se)), paste0(se$sample, "-", se$celltype)).
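The grouping suggested above can be tried out on a toy example. This is only a minimal sketch in base R: the count matrix, the `sample` and `celltype` vectors, and their values are made up for illustration, and a plain matrix with `rowSums()` stands in for the SummarizedExperiment and `matrixStats::rowSums2()` used in the unit test.

```r
# Toy count matrix: 3 genes x 6 cells (hypothetical values, not real data)
counts <- matrix(1:18, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))

# Hypothetical per-cell annotations
sample   <- c("p1", "p1", "p1", "p2", "p2", "p2")
celltype <- c("A", "A", "B", "A", "B", "B")

# One group per (sample, cell type) combination
group <- paste0(sample, "-", celltype)
splitter <- split(seq_len(ncol(counts)), group)

# Sum the counts over the cells in each group; drop = FALSE keeps
# single-cell groups as a matrix so rowSums() still works
pseudobulk_mat <- do.call(cbind, lapply(splitter, function(idx) {
  rowSums(counts[, idx, drop = FALSE])
}))

print(pseudobulk_mat)
```

Each column of `pseudobulk_mat` is then one pseudobulk sample per patient and cell type, and the total count in the matrix is unchanged by the aggregation.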

It would also be a valid test to aggregate only by sample, but in that case you have less power to answer the question you are interested in, because the celltype indicator variables in the model matrix are converted to fractions (what percentage of cells came originally from celltype A, B, etc.). I realize that the documentation has been lacking so far in this respect, so I will try to update it soon.
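To make the "converted to fractions" point concrete, here is a small base R sketch. It assumes that aggregating by sample amounts to averaging the per-cell rows of the model matrix within each sample; this illustrates the idea and is not necessarily how glmGamPoi implements it internally. The `sample` and `celltype` vectors are made up.

```r
# Hypothetical per-cell annotations (same toy layout: 2 patients, 2 cell types)
sample   <- c("p1", "p1", "p1", "p2", "p2", "p2")
celltype <- factor(c("A", "A", "B", "A", "B", "B"))

# Per-cell model matrix with an indicator column for cell type B
mm <- model.matrix(~ celltype)

# Average the rows belonging to each sample (an assumption about how
# the per-cell design is collapsed when aggregating by sample only)
frac <- do.call(rbind, lapply(split(seq_len(nrow(mm)), sample), function(idx) {
  colMeans(mm[idx, , drop = FALSE])
}))

print(frac)
```

The `celltypeB` column is no longer a 0/1 indicator but the share of cells of type B in each patient, which is why a test on that coefficient has less power than one on pseudobulks formed per patient and cell type.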

Best,
Constantin

