songdongyuan1994 / scdesign3 Goto Github PK

scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics

Home Page: https://songdongyuan1994.github.io/scDesign3/docs/index.html

License: MIT License

R 100.00%

scdesign3's Issues

no applicable method for 'family' applied to an object of class "NULL"

Hello,
When I run scdesign3, the following error occurs:

Input Data Construction Start
Input Data Construction End
Start Marginal Fitting
Marginal Fitting End
Start Copula Fitting
Convert Residuals to Multivariate Gaussian
Error in UseMethod("family") : 
  no applicable method for 'family' applied to an object of class "NULL"

How can I solve this? Thank you!

GAMLSS warnings during marginal fitting

Hello,

While fitting marginal distributions for my data, I am getting a couple of warning messages (not errors) that seem to originate from the GAMLSS fitting.

1: In RS() : Algorithm RS has not yet converged
2: In digamma(y + (1/sigma)) : NaNs produced
3: In digamma(1/sigma) : NaNs produced

I assume these result from poor fits for certain genes. Should such genes be excluded from modeling, or can these warnings be disregarded?

Thank you,
Connor
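Before deciding whether to drop genes, it can help to know which genes raise the warnings. The following is a minimal base R sketch, not scDesign3 code: `fit_with_warnings` and `toy_fit` are hypothetical names, and `toy_fit` is a stand-in for whatever per-gene fit is being run. It records the warnings each gene produces so the affected genes can be inspected.

```r
# Hypothetical helper: run a fitting function and keep every warning it raises.
fit_with_warnings <- function(fit_fun, ...) {
  warnings_seen <- character(0)
  value <- withCallingHandlers(
    fit_fun(...),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")  # continue after recording the warning
    }
  )
  list(value = value, warnings = warnings_seen)
}

# Toy stand-in for a per-gene fit (assumption): warns on all-zero counts.
toy_fit <- function(counts) {
  if (all(counts == 0)) warning("Algorithm RS has not yet converged")
  mean(counts)
}

toy_counts <- list(
  geneA = c(0, 3, 5, 1),
  geneB = c(0, 0, 0, 0),
  geneC = c(2, 2, 7, 0)
)

results <- lapply(toy_counts, function(y) fit_with_warnings(toy_fit, y))
n_warnings <- vapply(results, function(r) length(r$warnings), integer(1))
flagged_genes <- names(n_warnings)[n_warnings > 0]
print(flagged_genes)
```

Genes flagged this way can then be checked individually (for example, for very low counts) before choosing to exclude them.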

camelCase Bioconductor Convention is Not Used

I see this has been submitted to Bioconductor for evaluation. One convention that differs from what Bioconductor expects is function and variable naming. See the Function Names and Variable Names sections of the developer's guide. See also the Splatter simulator, which has been accepted into Bioconductor and conforms to the guide. Using a consistent convention across all packages in the Bioconductor ecosystem makes it a bit easier for users to memorise parameters and switch between packages.

Copula model underfitting

Hello there,

I'm trying to use scDesign3 to simulate single-cell ATAC-seq data, and I like the results shown in Figures S6 and S7 of the paper. So far, however, I cannot reproduce them on another dataset, because the copula model always underfits. I have attached my code and some results here. Do you have any idea why the model is not working?

My code:
simu <- scdesign3( sce = sce, assay_use = "counts", celltype = "cell_type", pseudotime = NULL, spatial = NULL, other_covariates = NULL, mu_formula = "cell_type", sigma_formula = "1", family_use = "zip", n_cores = 2, usebam = FALSE, corr_formula = "cell_type", copula = "gaussian", DT = TRUE, pseudo_obs = FALSE, return_model = TRUE, nonzerovar = FALSE )

My results:

The BIC and AIC for the copula model are Inf
The marginal BIC and AIC are also very large (aic.marginal=4431524, bic.marginal=4819040)
The similarity of peak-peak correlation matrices between the real (training) data and the synthetic data is low (see below attachment).

selected.pdf

My training data:
They are two cell groups from a public sci-ATAC-seq atlas, from which I selected about 7000 peaks and 1000 cells.

For simulating ATAC-seq data, how many peaks do you usually use, or recommend using?

Question on setting batch effect strength

Hi Dongyuan,

Suppose I have 3 batches in total, and I call the mean of the normal distribution 'strength':

num_batch <- 3
batch_strength <- 1
BATCH_marginal_alter <- lapply(BATCH_marginal, function(x) {
  lh <- length(x$fit$coefficients)
  # Draw a separate batch coefficient for every gene
  x$fit$coefficients[lh] <- rnorm(1, mean = batch_strength, sd = 2)
  x$fit$coefficients[lh - 1] <- rnorm(1, mean = batch_strength + 1, sd = 2)
  x
})

Then I can separate all 3 batches.

I am now wondering: why don't we sample one value per batch and apply that value to all the features (genes) in the batch?

i.e.

batch_strength <- 1
co_1 <- rnorm(1, mean = batch_strength, sd = 2)
co_2 <- rnorm(1, mean = batch_strength + 1, sd = 2)
BATCH_marginal_alter <- lapply(BATCH_marginal, function(x) {
  lh <- length(x$fit$coefficients)
  # Reuse the same two draws for every gene
  x$fit$coefficients[lh] <- co_1
  x$fit$coefficients[lh - 1] <- co_2
  x
})

Looking forward to your reply!

Best,
Danyang
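The difference between the two variants can be seen on a toy object. The sketch below does not use scDesign3: `toy_marginal` is a hypothetical stand-in for `BATCH_marginal`, built as a list of genes that each hold `fit$coefficients`, and it assumes (as the snippets above do) that the last two coefficients are the batch terms.

```r
set.seed(1)

# Toy stand-in for BATCH_marginal (assumption): one entry per gene.
make_gene <- function() {
  list(fit = list(coefficients = c(intercept = 0.5, batch2 = 0, batch3 = 0)))
}
toy_marginal <- setNames(replicate(5, make_gene(), simplify = FALSE),
                         paste0("gene", 1:5))
batch_strength <- 1

# Variant 1: a fresh draw for every gene.
per_gene <- lapply(toy_marginal, function(x) {
  lh <- length(x$fit$coefficients)
  x$fit$coefficients[lh] <- rnorm(1, mean = batch_strength, sd = 2)
  x$fit$coefficients[lh - 1] <- rnorm(1, mean = batch_strength + 1, sd = 2)
  x
})

# Variant 2: one draw per batch, shared by all genes.
co_1 <- rnorm(1, mean = batch_strength, sd = 2)
co_2 <- rnorm(1, mean = batch_strength + 1, sd = 2)
shared <- lapply(toy_marginal, function(x) {
  lh <- length(x$fit$coefficients)
  x$fit$coefficients[lh] <- co_1
  x$fit$coefficients[lh - 1] <- co_2
  x
})

# Extract the last coefficient of every gene for comparison.
last_coef <- function(marginal) {
  vapply(marginal, function(x) {
    co <- x$fit$coefficients
    unname(co[length(co)])
  }, numeric(1))
}

print(last_coef(per_gene))  # differs from gene to gene
print(last_coef(shared))    # identical across genes
```

With a shared value, every gene in a batch is shifted by the same amount on the scale of the linear predictor. Assuming a log link, that is a uniform fold change across genes, which resembles a sequencing-depth difference more than a gene-specific batch effect.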

Issue with adjusting for library size

I was following the tutorial page describing how to adjust for library size:
https://songdongyuan1994.github.io/scDesign3/docs/articles/scDesign3-librarySize-vignette.html

I formatted my 'sce' object identically to the sample data from the DuoClustering2018 library, where "cell_type" was a factor and "library" was numeric (no NAs). However, when I attempted to include this information in the 'mu' formula:

mu_formula = "cell_type + offset(log(library))"

I got the following error message:

Error in log(library) : non-numeric argument to mathematical function

Is there anything you could suggest to resolve this issue? I was able to avoid the error by specifying the 'sce' variable:

mu_formula="cell_type + offset(log(colData(sce)$library))"

However, I'm concerned that might cause issues in the background that I'm not able to see.
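One possible explanation, which cannot be confirmed from the post alone, is that no column named `library` reached the data the model was fitted on, for example if it was not passed through `other_covariates`. In that case R resolves the name `library` to the base function `library()`, and taking the log of a function produces exactly this message. The base R sketch below reproduces the behaviour with a toy data frame (`toy` and its values are made up) and `glm`, which stands in for the real fitting code.

```r
toy <- data.frame(
  y = c(3, 5, 8, 13),
  cell_type = factor(c("a", "a", "b", "b"))
)

# No column called `library` in the data: the name falls back to base::library().
msg <- tryCatch(
  {
    model.frame(y ~ cell_type + offset(log(library)), data = toy)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Once the column exists in the data, the same formula evaluates normally.
toy$library <- c(1000, 1500, 2000, 2500)
fit <- glm(y ~ cell_type + offset(log(library)), family = poisson(), data = toy)
print(coef(fit))
```

If this is the cause, referring to `colData(sce)$library` in the formula avoids the error only because it bypasses the model data, so the offset might not follow the data when new covariates are supplied.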

A bug existing in the example code

Hi, I found a bug in the example scdesign3 code used in the readme.md file.

The original code is:

example_simu <- scdesign3(
    sce = example_sce,
    assay_use = "counts",
    celltype = "cell_type",
    pseudotime = "pseudotime",
    spatial = NULL,
    other_covariates = NULL,
    mu_formula = "s(pseudotime, k = 10, bs = 'cr')",
    sigma_formula = "s(pseudotime, k = 5, bs = 'cr')",
    family_use = "nb",
    n_cores = 2,
    usebam = FALSE,
    corr_formula = "1",
    copula = "gaussian",
    fastmvn = FALSE,
    DT = TRUE,
    pseudo_obs = FALSE,
    family_set = c("gauss", "indep"),
    important_feature = rep(TRUE, dim(sce)[1]),
    nonnegative = TRUE,
    return_model = FALSE,
    nonzerovar = FALSE,
    parallelization = "mcmapply",
    BPPARAM = NULL，
    trace = FALSE
  )

However, the comma after BPPARAM = NULL is incorrect (it is a full-width character, ，, rather than an ASCII comma), so it should be:

example_simu <- scdesign3(
    sce = example_sce,
    assay_use = "counts",
    celltype = "cell_type",
    pseudotime = "pseudotime",
    spatial = NULL,
    other_covariates = NULL,
    mu_formula = "s(pseudotime, k = 10, bs = 'cr')",
    sigma_formula = "s(pseudotime, k = 5, bs = 'cr')",
    family_use = "nb",
    n_cores = 2,
    usebam = FALSE,
    corr_formula = "1",
    copula = "gaussian",
    fastmvn = FALSE,
    DT = TRUE,
    pseudo_obs = FALSE,
    family_set = c("gauss", "indep"),
    important_feature = rep(TRUE, dim(sce)[1]),
    nonnegative = TRUE,
    return_model = FALSE,
    nonzerovar = FALSE,
    parallelization = "mcmapply",
    BPPARAM = NULL,
    trace = FALSE
  )

Thanks a lot.
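Characters like this are hard to spot by eye. A small base R sketch for locating them is shown below; `find_non_ascii` is a hypothetical helper, and the three code lines are a shortened copy of the example above with the full-width comma written as the escape `\uff0c`.

```r
# Hypothetical helper: report the position and code point of every non-ASCII
# character in a vector of code lines.
find_non_ascii <- function(code_lines) {
  hits <- lapply(seq_along(code_lines), function(i) {
    code_points <- utf8ToInt(code_lines[i])
    pos <- which(code_points > 127)
    if (length(pos) == 0) return(NULL)
    data.frame(
      line = i,
      column = pos,
      codepoint = sprintf("U+%04X", code_points[pos]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, hits)
}

code_lines <- c(
  'parallelization = "mcmapply",',
  "BPPARAM = NULL\uff0c",  # full-width comma
  "trace = FALSE"
)

found <- find_non_ascii(code_lines)
print(found)
```

Base R also ships `tools::showNonASCII()`, which prints the offending lines of a character vector.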

Error in as.data.frame.default(x[[i]], optional = TRUE)

Hi, thank you for your amazing work :)
I was trying to generate some in silico data from my clustered spatial transcriptomic data and ran into an error that I am not sure how to solve. This is my output and error message:

Loading required package: viridisLite
class: SingleCellExperiment
dim: 140 16094
metadata(0):
assays(2): counts logcounts
rownames(140): eGFP PTPRC-exon4-6-introns ... FRMD5 mCherry2
rowData names(0):
colnames(16094): 183816621152166943455673244778245380484
198194891178223149109664428303794678308 ...
45312782973589652226064002619926238148
62901559251622040146779944472043861759
colData names(23): fov volume ... nFeature_RNA ident
reducedDimNames(3): PCA UMAP SPATIAL
mainExpName: RNA
altExpNames(0):
Input Data Construction Start
Input Data Construction End
Start Marginal Fitting
Marginal Fitting End
Start Copula Fitting
Convert Residuals to Multivariate Gaussian
Converting End
Copula group Myofibroblast starts
Copula group vCM starts
Copula group Myeloid starts
Copula group aCM starts
Copula group Pericyte starts
Copula group Endothelial starts
Copula group Fibroblast starts
Copula Fitting End
Start Parameter Extraction
Parameter Extraction End
Start Generate New Data
Use Copula to sample a multivariate quantile matrix
Sample Copula group Myofibroblast starts
Sample Copula group vCM starts
Sample Copula group Myeloid starts
Sample Copula group aCM starts
Sample Copula group Pericyte starts
Sample Copula group Endothelial starts
Sample Copula group Fibroblast starts
New Data Generating End
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ‘structure("dgCMatrix", package = "Matrix")’ to a data.frame
Calls: %>% ... data.frame -> as.data.frame -> as.data.frame.default
Execution halted

The error seems to occur at the last step, and I do not get any output data. If you have any suggestions on how to solve it, I would be very grateful!

Also, one question out of interest: have you considered generating data from a MERFISH experiment, and not only from spot-based spatial transcriptomic data? If so, would the data generation process be similar to this tutorial: https://songdongyuan1994.github.io/scDesign3/docs/articles/scDesign3-spatial-vignette.html?

Thank you in advance and looking forward to hearing from you :)
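The message indicates that a sparse `dgCMatrix` reached a step that expects an ordinary matrix or data frame. Where exactly that happens inside scDesign3 is not visible from the log, so the following is only a sketch of the general workaround: convert the sparse counts to a dense base matrix first. The toy matrix is made up, and the only package used is Matrix.

```r
library(Matrix)

# Toy sparse count matrix (genes x cells), stored as a dgCMatrix.
sparse_counts <- Matrix(
  c(0, 2, 0,
    0, 1, 0,
    3, 0, 0),
  nrow = 3, sparse = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)
print(class(sparse_counts))

# as.matrix() returns a dense base R matrix, which data.frame code accepts.
dense_counts <- as.matrix(sparse_counts)
counts_df <- as.data.frame(t(dense_counts))  # cells as rows, genes as columns
print(counts_df)

# For a SingleCellExperiment, the analogous step would be (assumption, not
# run here): counts(sce) <- as.matrix(counts(sce))
```

With 140 genes and 16094 cells, as in the output above, a dense copy is small enough that memory should not be a concern.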

Best way to simulate noise

Dear developers,

I was wondering whether there's a way to simulate noise for multimodal (CITE, multiome) data. Basically, just a worse fit with more artifacts I guess? I could not find anything in the documentation (from what I've seen).

Thank you very much!

possibility to know the cell type if I manually set the ncell parameter

Dear Dongyuan,

If I want to simulate more or fewer cells than the reference dataset contains, can I know the cell type of each simulated cell (and also the batch information)?

Looking forward to your reply.

Best,
Danyang Xia

Simulate batch effect and condition effect simultaneously

Hi,

Thank you for presenting this excellent package. I wonder whether it is possible to simulate the batch effect and condition effect simultaneously. For example, if I use the ifnb reference to simulate the condition effect, can I also simulate some batch effect without providing the reference for batch effect simultaneously?

Best,
Larry

Error when simulating spatial transcriptomics data for benchmarking DE genes

Hi,
I want to simulate a spatial dataset based on the human DLPFC dataset to benchmark clustering and DE analysis. Here is my code:

  example_sce <- getRDS("2020_maynard_prefrontal-cortex", "151507")
  # print(example_sce)
  
  set.seed(101)
  dec <- scran::modelGeneVar(example_sce)
  top <- scran::getTopHVGs(dec, n = 2000)
  
  example_sce <- example_sce[top, !is.na(example_sce@colData$layer_guess_reordered)]
  
  mt_idx<- grep("mt-",rownames(example_sce))
  if(length(mt_idx)!=0){
    example_sce   <- example_sce[-mt_idx,]
  }
  
  set.seed(1)
  example_data <- construct_data(
    sce = example_sce,
    assay_use = "counts",
    celltype = "layer_guess_reordered",
    pseudotime = NULL,
    spatial = c("row", "col"),
    other_covariates = NULL,
    corr_by = "1"
  )
  
  example_marginal <- fit_marginal(
    data = example_data,
    predictor = "gene",
    mu_formula = "layer_guess_reordered",
    sigma_formula = "1",
    family_use = "nb",
    n_cores = 10,
    usebam = FALSE
  )
  
  set.seed(1)
  example_copula <- fit_copula(
    sce = example_sce,
    assay_use = "counts",
    marginal_list = example_marginal,
    family_use = "nb",
    copula = "gaussian",
    n_cores = 10,
    new_covariate = NULL,
    input_data = example_data$dat
  )
  
  
  example_para <- extract_para(
    sce = example_sce,
    marginal_list = example_marginal,
    n_cores = 10,
    family_use = "nb",
    new_covariate = NULL,
    data = example_data$dat
  )
  
  diff <- apply(example_para$mean_mat, 2, function(x){max(x)-min(x)})
  diff_ordered <- order(diff, decreasing = TRUE)
  diff <- diff[diff_ordered]
  num_de <- 1000
  de_idx <- names(diff[1:num_de])
  non_de_idx <- names(diff[-(1:num_de)])
  non_de_mat <- apply(example_para$mean_mat[,non_de_idx], 2, function(x){
    avg <- (max(x)+min(x))/2
    new_mean <- rep(avg, length(x))
    return(new_mean)
  })
  example_para$mean_mat[,non_de_idx] <- non_de_mat
  
  set.seed(1)
  example_newcount <- simu_new(
    sce = example_sce,
    mean_mat = example_para$mean_mat,
    sigma_mat = example_para$sigma_mat,
    zero_mat = example_para$zero_mat,
    quantile_mat = NULL,
    copula_list = example_copula$copula_list,
    n_cores = 10,
    family_use = "nb",
    input_data = example_data$dat,
    new_covariate = example_data$newCovariate,
    important_feature = example_copula$important_feature
  )

The code will generate this error:

Use Copula to sample a multivariate quantile matrix
Sample Copula group 1 starts
Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent

If I don't manually set the DE and non-DE genes, that is, if I delete the following part, the simulation works well.

  diff <- apply(example_para$mean_mat, 2, function(x){max(x)-min(x)})
  diff_ordered <- order(diff, decreasing = TRUE)
  diff <- diff[diff_ordered]
  num_de <- 1000
  de_idx <- names(diff[1:num_de])
  non_de_idx <- names(diff[-(1:num_de)])
  non_de_mat <- apply(example_para$mean_mat[,non_de_idx], 2, function(x){
    avg <- (max(x)+min(x))/2
    new_mean <- rep(avg, length(x))
    return(new_mean)
  })
  example_para$mean_mat[,non_de_idx] <- non_de_mat

Do you have any suggestions? Thanks.
Best,
Tian
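The cause of the `dimnames` error cannot be determined from the post. One thing that can be ruled in or out quickly is whether the manual edit changes the shape or names of `mean_mat`, or produces missing gene names. The sketch below repeats the same flattening step on a made-up matrix in base R and adds those checks; `spread` replaces the name `diff` used above, and every value is illustrative.

```r
set.seed(1)

# Toy mean matrix (assumption): spots in rows, genes in columns.
mean_mat <- matrix(
  rexp(20), nrow = 4,
  dimnames = list(paste0("spot", 1:4), paste0("gene", 1:5))
)
dims_before <- dim(mean_mat)
names_before <- dimnames(mean_mat)

# Rank genes by the range of their mean across spots.
spread <- apply(mean_mat, 2, function(x) max(x) - min(x))
spread <- sort(spread, decreasing = TRUE)

num_de <- 2
non_de_idx <- names(spread)[-seq_len(num_de)]

# Flatten non-DE genes to the midpoint of their range; drop = FALSE keeps a
# matrix even if only one gene is selected.
mean_mat[, non_de_idx] <- apply(
  mean_mat[, non_de_idx, drop = FALSE], 2,
  function(x) rep((max(x) + min(x)) / 2, length(x))
)

# Sanity checks to run before passing the matrix on.
stopifnot(
  !anyNA(non_de_idx),
  identical(dim(mean_mat), dims_before),
  identical(dimnames(mean_mat), names_before)
)
print(mean_mat)
```

If `num_de` exceeds the number of genes left after filtering, `names(diff[1:num_de])` in the original code returns `NA` entries, which the first check would catch.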

Simulating a 2000-cell × 150000-peak scATAC dataset

The simulation takes nearly a whole day, which is inconvenient.
I also encountered the following error when trying to simulate a 2000-cell × 150000-peak scATAC dataset. Could you tell me how to fix it?

Input Data Construction Start

Warning message in asMethod(object):
“sparse->dense coercion: allocating vector of size 2.4 GiB”
Input Data Construction End

Start Marginal Fitting

Warning message in mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule, :
“scheduled cores 1, 2 did not deliver results, all values of the jobs will be affected”

Error in names(answer) <- dots[[1L]]: attempt to set an attribute on NULL
Traceback:

1. scdesign3(sce = sce_seurat, assay_use = "counts", celltype = "cell_type", 
 .     pseudotime = NULL, spatial = NULL, other_covariates = NULL, 
 .     mu_formula = "cell_type", sigma_formula = "1", family_use = "zip", 
 .     n_cores = 2, usebam = FALSE, corr_formula = "cell_type", 
 .     copula = "gaussian", DT = TRUE, pseudo_obs = FALSE, return_model = FALSE, 
 .     nonzerovar = FALSE)
2. fit_marginal(mu_formula = mu_formula, sigma_formula = sigma_formula, 
 .     n_cores = n_cores, data = input_data, family_use = family_use, 
 .     usebam = usebam, parallelization = parallelization, BPPARAM = BPPARAM)
3. suppressMessages(paraFunc(fit_model_func, gene = feature_names, 
 .     family_gene = family_use, mc.cores = n_cores, MoreArgs = list(dat_use = dat_cov, 
 .         mgcv_formula = mgcv_formula, mu_formula = mu_formula, 
 .         sigma_formula = sigma_formula, predictor = predictor, 
 .         count_mat = count_mat), SIMPLIFY = FALSE))
4. withCallingHandlers(expr, message = function(c) if (inherits(c, 
 .     classes)) tryInvokeRestart("muffleMessage"))
5. paraFunc(fit_model_func, gene = feature_names, family_gene = family_use, 
 .     mc.cores = n_cores, MoreArgs = list(dat_use = dat_cov, mgcv_formula = mgcv_formula, 
 .         mu_formula = mu_formula, sigma_formula = sigma_formula, 
 .         predictor = predictor, count_mat = count_mat), SIMPLIFY = FALSE)
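The "did not deliver results" warning from `mclapply` often means that worker processes died, and the sparse-to-dense coercion earlier in the log suggests memory pressure as one plausible reason. A common way to reduce the problem size is to keep only peaks detected in a minimum number of cells before simulating. The sketch below uses a random sparse matrix from the Matrix package; the matrix, the threshold `min_cells`, and all variable names are illustrative assumptions.

```r
library(Matrix)
set.seed(1)

# Toy sparse peak-by-cell matrix: 1000 peaks, 50 cells, about 2% non-zero.
peaks <- rsparsematrix(
  nrow = 1000, ncol = 50, density = 0.02,
  rand.x = function(n) rpois(n, 1) + 1
)

# Number of cells in which each peak is detected (computed on the sparse form).
n_cells_detected <- Matrix::rowSums(peaks > 0)

min_cells <- 2  # illustrative threshold, not a recommendation
keep <- n_cells_detected >= min_cells
peaks_small <- peaks[keep, , drop = FALSE]

cat("peaks before:", nrow(peaks), " after:", nrow(peaks_small), "\n")
```

Running with fewer peaks and `n_cores = 1` first can also show whether the failure is tied to data size or to specific peaks.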

Error while simulating spot-resolution spatial data for cell-type deconvolution

Hi,

I am trying to run the tutorial for simulating spot-resolution spatial data for cell-type deconvolution, but I am getting an error while running this block of code:

Error: Error in xtfrm.data.frame(x) : cannot xtfrm data frames

MOBSP_data <- construct_data(
  sce = MOBSP_sce,
  assay_use = "counts",
  celltype = NULL,
  pseudotime = NULL,
  spatial = c("spatial1", "spatial2"),
  other_covariates = NULL,
  corr_by = "1"
)

Error when simulating batch effects and library size

I am trying to create a simulation including both batch effects and library size based on the tutorials. Each one individually works fine but I get an error when I try to include both. Example code (reference data is the PBMC data from the batch effects tutorial):

reference <- readRDS((url("https://figshare.com/ndownloader/files/40581965")))
sim <- scDesign3::scdesign3(
    sce = reference, 
    celltype = "cell_type",
    pseudotime = NULL,
    spatial = NULL,
    other_covariates = c("batch", "nCount_RNA"), 
    mu_formula = "cell_type + batch + offset(log(nCount_RNA))",
    corr_formula = "1"
)
#> Input Data Construction Start
#> Input Data Construction End
#> Start Marginal Fitting
#> Marginal Fitting End
#> Start Copula Fitting
#> Convert Residuals to Multivariate Gaussian
#> Error in `colnames<-`(`*tmp*`, value = rownames(sce)) : 
#>  attempt to set 'colnames' on an object with less than two dimensions
#> In addition: Warning message:
#> In mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule,  :
#>   all scheduled cores encountered errors in user code

Should this work or am I doing something wrong here? There is also a warning about parallelization so maybe that is involved somehow? Thanks!

Can we use total gene single cell data to generate the simulated spatial data

Thanks for the work. In the tutorial for cell-type deconvolution, the top marker genes are selected first. I tried to use all genes to generate the simulated data, but the result was odd and the run was very slow. So I want to know whether we can use all genes of the single-cell data for the simulation.

A question about simulating multi-omic data from single-omic data

Hi, I intend to clarify my understanding about this function, shown in this tutorial:

https://songdongyuan1994.github.io/scDesign3/docs/articles/scDesign3-multiomics-vignette.html

It seems that we still need paired multi-omic data to estimate two sets of parameters, and then we can simulate arbitrary omic data based on the joint parameters. Is my understanding correct? Is it possible to directly simulate one omic modality from the other (say, simulate scATAC-seq data based on scRNA-seq data)? Thanks.

Out-of-memory issue

Hi Dongyuan,

Thank you for your great tool!

I am using scDesign3 to simulate some scDNAm data summarized over gene bodies. The sample size is around 1,000 cells, and the feature space is the gene space, around 20K features. Here are the parameters I used:

simulated_met_mat = scdesign3(sce=sce,
                              assay_use="counts",
                              celltype="stage",
                              pseudotime=NULL,
                              spatial=NULL,
                              other_covariates=NULL,
                              mu_formula="stage",
                              sigma_formula="stage",
                              family_use='poisson',
                              usebam=FALSE,
                              corr_formula="stage",
                              n_cores=4)

Here, "stage" indicates the cell types in the sce object. However, I encounter an out-of-memory issue: "Some of your processes may have been killed by the cgroup out-of-memory handler."

I wonder whether this is because I used the wrong parameters or because my feature space is too large. Could you please give some suggestions? Thank you for your time, and I look forward to your reply.

Sincerely,
Wenjing
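A rough size estimate helps judge whether the feature space alone could explain this. The sketch below assumes, without confirmation from the scDesign3 source, that a Gaussian copula keeps a dense feature-by-feature correlation matrix of doubles for each correlation group. `dense_gib` is a hypothetical helper, and the figures are arithmetic only.

```r
# Hypothetical helper: memory of a dense numeric matrix in GiB
# (8 bytes per double).
dense_gib <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 2^30
}

n_features <- 20000
n_cells <- 1000

corr_gib <- dense_gib(n_features, n_features)  # one correlation matrix
counts_gib <- dense_gib(n_features, n_cells)   # one dense count matrix

cat(sprintf("correlation matrix: %.2f GiB per group\n", corr_gib))
cat(sprintf("dense count matrix: %.2f GiB\n", counts_gib))
```

Under that assumption, a single 20000 × 20000 matrix takes about 3 GiB, and `corr_formula = "stage"` would need one per stage, before counting intermediate copies or the copies held by parallel workers. Reducing the number of features, or using `corr_formula = "1"`, would be the first things to try.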

Long runtime for simulating spatial data

Hi scDesign3 developers,

Thank you for maintaining the amazing tool!
When I simulate the data from https://www.10xgenomics.com/resources/datasets/mouse-brain-serial-section-2-sagittal-anterior-1-standard-1-1-0, scDesign3 seems to get stuck in 'Marginal Fitting' for a long time. Even if I simulate only 10 genes, it gets stuck at Marginal Fitting. I would very much appreciate your help! Thanks :)

Best,
Chit Tong

Here is the code I used:

print("read reference data")
brain <- Load10X_Spatial(data.dir="/anterior_data/",
                        filename="filtered_feature_bc_matrix.h5",
                        assay="Spatial",
                        slice="slice1"
)
brain <- SCTransform(brain, assay = "Spatial", verbose = FALSE)
brain <- RunPCA(brain, assay = "SCT", verbose = FALSE)
brain <- FindNeighbors(brain, reduction = "pca", dims = 1:30)
brain <- FindClusters(brain, verbose = FALSE)
brain <- RunUMAP(brain, reduction = "pca", dims = 1:30)
SpatialFeaturePlot(brain, features = "nCount_Spatial") + theme(legend.position = "right")

brain[['spatial1']] <- GetTissueCoordinates(brain)$imagerow 
brain[['spatial2']] <- GetTissueCoordinates(brain)$imagecol
brain.sce <- as.SingleCellExperiment(brain)

brain.sce <- brain.sce[1:10]

print("simulation")
set.seed(123)

example_simu <- scdesign3(
    sce = brain.sce,
    assay_use = "counts",
    celltype = "ident", ##ident or cell_type
    pseudotime = NULL,
    spatial = c("spatial1", "spatial2"),
    other_covariates = NULL,
    mu_formula = "s(spatial1, spatial2, bs = 'gp', k= 400)",
    sigma_formula = "1",
    family_use = "nb",
    n_cores = 30,
    usebam = FALSE,
    corr_formula = "1",
    copula = "gaussian",
    DT = TRUE,
    pseudo_obs = FALSE,
    return_model = FALSE,
    nonzerovar = FALSE,
    parallelization = "pbmcapply"
  )

Totally SAME count matrix(generated) compared to the reference data

Hi Dongyuan,

The good news is that I ran the batch-effect simulation tutorial again (taking a new subset of my own data), and the code now works for my data. A new problem is that the generated count matrix is exactly the same as the reference matrix. I wonder whether the reason is the many zeros in the original matrix. (Code and data source below.)

example_sce <- read_h5ad('./cite_500gene.h5ad')#one modality-several cell types with batch
example_sce<-AnnData2SCE(example_sce)

BATCH_data <- construct_data(
  sce = example_sce, #SCE object, ref data
  assay_use = "counts",#'counts','celltype', 'pseudotime' or 'spatial'.
  celltype = "cell_type", 
  pseudotime = NULL,
  spatial = NULL,
  other_covariates = c("batch"),#provided in example_sce colData
  corr_by = "1"
)
BATCH_marginal <- fit_marginal(
  data = BATCH_data,
  predictor = "gene", #intermediate variable name, not matter
  mu_formula = "cell_type + batch", 
  sigma_formula = "1",
  family_use = "nb",
  n_cores = 1, #can be changed to more, some errors may appear, and large value -> save time
  usebam = FALSE
)
set.seed(123)
BATCH_copula <- fit_copula(
  sce = example_sce,
  assay_use = "counts",
  marginal_list = BATCH_marginal,
  family_use = "nb",
  copula = "vine",#determine the copula function, can be changed to gaussian. decide the dependence structure of genes
  n_cores = 1,
  new_covariate = NULL,
  input_data = BATCH_data$dat
)

BATCH_marginal_alter <- lapply(BATCH_marginal, function(x) {
  x$fit$coefficients[length(x$fit$coefficients)] <- rnorm(1, mean = 5, sd = 2) #change to 0 for batch correction
  x
})

BATCH_para_alter <- extract_para(
  sce = example_sce,
  marginal_list = BATCH_marginal_alter,
  n_cores = 1,
  family_use = "nb",
  new_covariate = NULL,
  data = BATCH_data$dat
)

set.seed(123)
BATCH_newcount_alter <- simu_new(
  sce = example_sce,
  mean_mat = BATCH_para_alter$mean_mat,
  sigma_mat = BATCH_para_alter$sigma_mat,
  zero_mat = BATCH_para_alter$zero_mat,
  quantile_mat = NULL,
  copula_list = BATCH_copula$copula_list,
  n_cores = 1,
  family_use = "nb",
  input_data = BATCH_data$dat,
  new_covariate = BATCH_data$newCovariate,
  important_feature = BATCH_copula$important_feature
)
simu_sce <- example_sce
counts(simu_sce) <- BATCH_newcount_alter
logcounts(simu_sce) <- log1p(counts(simu_sce))

I use the CITE-seq data downloaded from here, and take the subset of the first 500 GEX features and the first 2 batches, 's1d1' and 's1d2'.

Thanks a lot in advance!

Best,
Danyang
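With very sparse data, two count matrices can agree in most entries simply because both are mostly zero, so it is worth measuring how similar they really are before concluding they are the same. The base R sketch below does this on made-up Poisson matrices; `compare_counts` is a hypothetical helper, and `reference_counts` and `simulated_counts` stand in for `counts(example_sce)` and `BATCH_newcount_alter`.

```r
set.seed(123)

# Toy sparse-looking count matrices (assumption): mostly zeros.
reference_counts <- matrix(rpois(50, 0.3), nrow = 5)
simulated_counts <- matrix(rpois(50, 0.3), nrow = 5)

# Hypothetical helper: summarise how much two count matrices agree.
compare_counts <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)))
  list(
    identical = all(a == b),                   # exact match in every entry
    frac_equal = mean(a == b),                 # share of matching entries
    frac_both_zero = mean(a == 0 & b == 0)     # matches explained by zeros
  )
}

self_check <- compare_counts(reference_counts, reference_counts)
cross_check <- compare_counts(reference_counts, simulated_counts)
str(cross_check)
```

If `identical` is `TRUE` on the real data, the simulated matrix is a copy of the input and the cause lies in the pipeline. If only `frac_equal` is high and close to `frac_both_zero`, the agreement mostly reflects sparsity.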

Error when setting `ncell` with a continuous covariate

I get the following error if I set the ncell argument when there is a continuous covariate in the model.

reference <- readRDS((url("https://figshare.com/ndownloader/files/40581965")))
sim <- scDesign3::scdesign3(
    sce = reference, 
    assay_use = "counts", 
    celltype = "cell_type",
    pseudotime = NULL, 
    spatial = NULL, 
    other_covariates = "nCount_RNA", 
    mu_formula = "cell_type + offset(log(nCount_RNA))", 
    sigma_formula = "1", 
    family_use = "nb", 
    n_cores = 2, 
    usebam = FALSE, 
    corr_formula = "1", 
    copula = "gaussian", 
    DT = TRUE, 
    pseudo_obs = FALSE, 
    return_model = FALSE,
    ncell = 100
)
#> Input Data Construction Start
#> Error in dimnames(x) <- dn : 
#>   length of 'dimnames' [1] not equal to array extent
#> In addition: Warning message:
#> In mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule,  :
#>   all scheduled cores encountered errors in user code

(P.S. I thought I submitted this last week but apparently not, sorry for the delay)

attempt to set an attribute on NULL error when running scdesign3()

Hi,

I am just starting to work with the scDesign3 package and am following the tutorial on CITE-seq data. Two weeks ago everything worked, but now I get the error message "Error in names(answer) <- dots[[1L]] : attempt to set an attribute on NULL" after running:
example_simu <- scdesign3( sce = example_sce, assay_use = "counts", celltype = "cell_type", pseudotime = NULL, spatial = NULL, other_covariates = NULL, mu_formula = "cell_type", sigma_formula = "cell_type", family_use = "nb", n_cores = 2, usebam = FALSE, corr_formula = "cell_type", copula = "vine", DT = TRUE, pseudo_obs = FALSE, return_model = FALSE, nonzerovar = TRUE, nonnegative = TRUE )

This is strange. Do you have any idea what causes it? Thanks!
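This message is what R reports when names are assigned to `NULL`, which suggests that the parallel step returned nothing and that the real failure happened earlier inside a worker. The base R sketch below first reproduces the message on its own, then shows a serial loop with `tryCatch` that exposes the underlying error for each gene. `toy_fit` and `toy_counts` are made-up stand-ins, not scDesign3 internals.

```r
# 1. The reported message is a symptom: naming a NULL result fails like this.
symptom <- tryCatch(
  {
    answer <- NULL
    names(answer) <- "gene1"
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(symptom)

# 2. Running the per-gene work serially keeps the original error visible.
toy_fit <- function(counts) {
  if (var(counts) == 0) stop("no variation in counts")  # toy failure condition
  mean(counts)
}

toy_counts <- list(
  geneA = c(0, 3, 5, 1),
  geneB = c(4, 4, 4, 4),
  geneC = c(2, 2, 7, 0)
)

outcome <- lapply(toy_counts, function(y) tryCatch(toy_fit(y), error = function(e) e))
failed <- names(outcome)[vapply(outcome, inherits, logical(1), what = "error")]

for (gene in failed) {
  cat(gene, "failed with:", conditionMessage(outcome[[gene]]), "\n")
}
```

In scdesign3 itself, rerunning with `n_cores = 1` is a simple first step toward the same goal, since it avoids the forked workers.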
