Comments (3)

const-ae commented on August 27, 2024

Hi Alan,

No, there is currently no way to extract the pseudobulk samples from test_de(). However, it is not too difficult to form the pseudobulk yourself. See, for example, this unit test, where I check that the manual way of forming the pseudobulk is consistent with the result from test_de() (the relevant code is here).

The (possible) issue is I am observing very large LFC values (in the thousands)

Hm, this can happen if, for one condition, you only have zeros. However, the large LFC shouldn't disturb the statistical test. But if there actually is a problem, please let me know; I am happy to help :)
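To illustrate why an all-zero condition produces an extreme log fold change, here is a minimal sketch. It uses base R's Poisson `glm()` as a stand-in rather than glmGamPoi itself, and the counts in `y` are made up for illustration, so the exact magnitude will differ from what test_de() reports.

```r
# Toy pseudobulk counts for a single gene (hypothetical values):
# condition "B" contains only zeros
y    <- c(5, 7, 6, 0, 0, 0)
cond <- factor(c("A", "A", "A", "B", "B", "B"))

# Poisson GLM with a log link; the condB coefficient is the
# fold change B vs. A on the natural-log scale
fit <- glm(y ~ cond, family = poisson())

# Convert the coefficient to the log2 scale
lfc <- unname(coef(fit)["condB"]) / log(2)
lfc
```

The maximum-likelihood estimate for this coefficient is minus infinity, so the finite value printed here mostly reflects where the optimizer stopped iterating.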

Best,
Constantin

from glmgampoi.

Al-Murphy commented on August 27, 2024

Hey Constantin,

Thanks once again for this, it was very helpful!

Just one question: I noticed that in your unit test you aggregate your pseudobulk data by patient (sample):

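# Group the column (cell) indices by sample, then sum the counts within each group
splitter <- split(seq_len(ncol(se)), SummarizedExperiment::colData(se)$sample)
pseudobulk_mat <- do.call(cbind, lapply(splitter, function(idx){
  matrixStats::rowSums2(SummarizedExperiment::assay(se), cols = idx)
}))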

For my analysis, I'm looking at differential expression per cell type, so should I pass both the cell type and the patient ID to splitter to derive the pseudobulk data?

I'm assuming the package does something like this in test_de with the conditions passed from glm_gp? I ask because when I ran test_de on my data, I set pseudobulk_by to the patient ID. Is this what you would expect, or should it be the combination of patient ID and cell type?

Thanks,
Alan.

const-ae commented on August 27, 2024

For my analysis, I'm looking at differential expression per cell type, so should I pass both the cell type and the patient ID to splitter to derive the pseudobulk data?

Good question and yes, I would recommend providing both the celltype and the sample label. For example pseudobulk_by = paste0(sample, "-", celltype) and splitter <- split(seq_len(ncol(se)), paste0(se$sample, "-", se$celltype)).

It would also be a valid test to aggregate only by sample, but in that case you have less power to answer the question you are interested in, because the celltype indicator variables in the model matrix are converted to fractions (what percentage of cells came originally from celltype A, B, etc.). I realize that the documentation has been lacking so far in this respect, so I will try to update it soon.
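A minimal sketch of the combined sample/cell-type grouping, using base R and toy data so it runs without Bioconductor. The matrix `counts` and the vectors `sample` and `celltype` are made up for illustration; with a SummarizedExperiment you would use the assay and `se$sample` / `se$celltype` instead, and base `rowSums()` stands in for `matrixStats::rowSums2()`.

```r
# Toy data: 3 genes x 6 cells (all values are hypothetical)
counts <- matrix(1:18, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))
sample   <- c("p1", "p1", "p1", "p2", "p2", "p2")
celltype <- c("A",  "A",  "B",  "A",  "B",  "B")

# One group per sample/cell-type combination
group    <- paste0(sample, "-", celltype)
splitter <- split(seq_len(ncol(counts)), group)

# Sum the counts of all cells in each group;
# drop = FALSE keeps single-cell groups as matrices so rowSums() still works
pseudobulk_mat <- do.call(cbind, lapply(splitter, function(idx) {
  rowSums(counts[, idx, drop = FALSE])
}))

pseudobulk_mat
```

Each column of `pseudobulk_mat` is one pseudobulk sample named after its sample/cell-type combination, and the total count is unchanged by the aggregation.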

Best,
Constantin
