Git Product home page Git Product logo

Comments (55)

WT215 avatar WT215 commented on June 9, 2024 2

Yes, I noticed that on my site too. I am trying to make sp_mat object from RcppArmadillo to handle large sparse matrix.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 2

Thanks for that. Please have a try the new version 1.5.5. It seems that now only the memory size of the machine matters.

My laptop has memory 32G. Currently it failed on matrix of dimension 32738 * 737280 (gene * cell), but seems to be ok with 32738 * 10000.

If as.matrix only correlates with the memory size of machine, then I probably change back to use matrix instead of dgCMatrix object for processing the data in the future.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

Thank you for trying bayNorm!

The following code may help you to insert any normalized data, which can be obtained outside Seurat framework, into a Seurat object. Then you can find clusters based on that normalized data:

library(bayNorm)
library(Seurat)

data('EXAMPLE_DATA_list')

library(Seurat)
bay_out<-bayNorm(EXAMPLE_DATA_list$inputdata,mean_version = TRUE)
x.seurat <- CreateSeuratObject(counts =bay_out$Bay_out,assay = 'bayNorm')
# x.seurat <- NormalizeData(x.seurat)
x.seurat <- ScaleData(x.seurat)
x.seurat <- FindVariableFeatures(x.seurat, do.plot = FALSE)
#Specifying: assay='bayNorm'
x.seurat <- RunPCA(x.seurat, pc.genes = [email protected],
                   pcs.compute = 50, do.print = FALSE,assay='bayNorm')

x.seurat <- RunUMAP(x.seurat, dims = 1:10,assay='bayNorm')
#x.seurat <- JackStraw(x.seurat, prop.freq = 0.06)
x.seurat <- FindNeighbors(x.seurat, dims = 1:10)
x.seurat <- FindClusters(x.seurat, resolution = 0.5)
head(Idents(x.seurat), 5)
#It is a toy example. Here only one cluster was found
plot(x.seurat@[email protected],pch=16,col=as.factor(Idents(x.seurat)))
#Double check that the assay we used comes from bayNorm
x.seurat@[email protected]

Hope this would be helpful.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

Hi @kvshams ,
Maybe have a try bayNorm(Data=cbind(case, control)) (pseudocode), and then look at the clustering of this normalized data? The prior parameters would be estimated across all the cells.
Is the left panel based on Harmoney?

@WT215 Thanks for the suggestion. Just tried different ways. Why not the GG. Got a good alignment . Do I need to keep anything in mind while using this method?
image

@kvshams Thanks for the feedback! Basically, if you want to minimize difference between groups of cells, you can try GG procedure. LL procedure could amplify the inter-groups noisy while minimizing the intrinsic noisy within each group.

However, I don't think there is a "once for all" procedure for dealing with the scRNA-seq data. It depends on what answers you would like to find from the data, like DE detection, noisy genes detection or other goals.

In addition to the different procedures, if you have an idea about the mean capture efficiency of your data (\bar{\beta}, a scalar ranges between 0 and 1), or even have a relevant smFISH data, then you can estimate \beta accurately. This may lead to a better performance of bayNorm.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

@kvshams I think you can directly try GG at first (for any purposes), then do cluster detection to find interesting clusters which may due to KO/treatment. Then LL for any subgroups of cells if you want to look at noisy genes in that subgroup.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

@kvshams

I did not test these two factors rigorously. My suggestion is to filter out most mitochondria related genes at the very beginning (if they are not of interest in your study).

For cell cycles, the GG procedure may mitigate the difference between cells which belong to different cell cycles. Whether this works maybe checked by visualizing the 2D umap, see if cells were clustered w.r.t to the cell cycles.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

I am looking at this discussion: https://stackoverflow.com/questions/40592054/large-matrices-in-rcpparmadillo-via-the-arma-64bit-word-define, but I got the issue posted here when I tried to build the package: https://stackoverflow.com/questions/55248883/why-does-rcpp-function-segfault-only-when-called-from-r-package-but-not-when-sou.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

@kvshams By learning from the R package liger, I am using dgCMatrix throughout the process. Please have a try the 1.5.9. I can run bayNorm on a sparse matrix of dimension 32738 genes * 737280 cells on my laptop (32G memory, 5 cores specified in bayNorm. Though that test was not completed, it arrived at 1% at time-consuming part. Hence it seems to work).

The output is still of type matrix. If it works, I may allow it to be of type dgCMatrix too. In case that the output is too large to be stored.

The error %Error in serialize(data, node$con) : ignoring SIGPIPE signal is probably due to that we used matrix. So the data which was distributed to each cores is still too large, hence the core is highly likely to be terminated. Now with dgCMatrix, it may works.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

Glad to hear that! I will also allow mean version to return sparse matrix later.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

I just updated the 1.5.13. Now it allows dgCMatrix throughout the whole process (Input which is not of dgCMatrix will be converted to dgCMatrix).

Have a try at_GG <- bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, Prior_type="GG",out.sparse = TRUE). Then you will get

> class(at_GG$Bay_out)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"

from baynorm.

kvshams avatar kvshams commented on June 9, 2024 1

The rerun with new version getting killed

at_GG <- bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, parallel = TRUE,NCores = 50, Prior_type="GG", out.sparse = TRUE)

Priors estimation based on MME method has completed.
Start optimization using spg from BB package.
This part may be time-consuming.
|======================================================================| 100%
Prior estimation has completed!
**********************************************************
***********
Killed

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

Yes, the output is no more sparse. I don't think replacing <1 with 0 can help. It seems that there is no solution, I need to think about it.

How about filtering out some cells according to the total counts at the very beginning?

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

I don't think replacing <1 with 0 is correct. Does '1.5.14' work without replacing <1?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024 1

Yes, it works without replacing it. I am saving that object and now will see how does it sync with Seurat

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

Basically, as far as I think, there are following ways:

  1. Filtering out more cells/genes.
  2. Saver the normalized data. Then restart R and load that data into Seurat. By doing so, we make sure that there is only one big data loaded in the environment (exclude the raw data).
  3. Using scanpy in python.
  4. spam package in R... still look into it https://stackoverflow.com/questions/24236426/how-to-get-a-big-sparse-matrix-in-r-231-1 .

from baynorm.

kvshams avatar kvshams commented on June 9, 2024 1

I agree that it would work in other packages such as scnapy. Even it works very well with bigmemory (verified) in R but unfortunately the Seurat is still at dgCMatrix based data structure. Thanks a lot for working on it. As we have reached now on a dead end, I am closing this issue

from baynorm.

WT215 avatar WT215 commented on June 9, 2024 1

@WT215 sorry to bug you again. How easy to return Arcsine normalized dgcMatrix as an output. That would solve the memory issue as it keep the zero value as zero.

Thanks for the suggestion! This idea sounds interesting. However the normalized data is no longer sparse, and the R itself cannot handle big matrix. So I am not very sure whether that idea works, but I will have a look.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Thanks @WT215 for the reply. That worked!.

I have a doubt regarding the case control data. If use the baynorm assay for the clustering, there is huge batch effect (baynorm is done separately for case and control to preserve the structure, is there any other way ?). Please see the Inf data from seurat-data . Harmoney is used for integration.

image

Any suggestion?

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

Hi @kvshams ,

Maybe have a try bayNorm(Data=cbind(case, control)) (pseudocode), and then look at the clustering of this normalized data? The prior parameters would be estimated across all the cells.

Is the left panel based on Harmoney?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

@WT215 Was thinking to do so, but did not do by anticipating that would create a true bias in the data structure itself (or am I missing some thing?). For eg, if the case is a KO, then we would see an imputed data for the KO gene also, right?. Thats the reason I was doing the case control baynorm separately (in this case STIM and CTRL). Did not test the KO scenario, but will test and get back to you with an appropriate test data.

Both the clustering above is done by Harmoney in same way. The only difference is that the assay fed in for the process, ie RNA and Baynorm imputed assays respectively.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

@WT215, hopefully I can use the baynorm output ( ie the bay_out$Bay_out) directly to the edgeR differential expression without any additional steps right?.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

@WT215, hopefully I can use the baynorm output ( ie the bay_out$Bay_out) directly to the edgeR differential expression without any additional steps right?.

I have not tried to applied edgeR to the bayNorm normalized data. MAST could be a good choice, see example here: Section: Downstream analysis: DE genes detection.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Hi @kvshams ,

Maybe have a try bayNorm(Data=cbind(case, control)) (pseudocode), and then look at the clustering of this normalized data? The prior parameters would be estimated across all the cells.

Is the left panel based on Harmoney?

@WT215 Thanks for the suggestion. Just tried different ways. Why not the GG. Got a good alignment . Do I need to keep anything in mind while using this method?
image

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Hi @kvshams ,
Maybe have a try bayNorm(Data=cbind(case, control)) (pseudocode), and then look at the clustering of this normalized data? The prior parameters would be estimated across all the cells.
Is the left panel based on Harmoney?

@WT215 Thanks for the suggestion. Just tried different ways. Why not the GG. Got a good alignment . Do I need to keep anything in mind while using this method?
image

@kvshams Thanks for the feedback! Basically, if you want to minimize difference between groups of cells, you can try GG procedure. LL procedure could amplify the inter-groups noisy while minimizing the intrinsic noisy within each group.

However, I don't think there is a "once for all" procedure for dealing with the scRNA-seq data. It depends on what answers you would like to find from the data, like DE detection, noisy genes detection or other goals.

In addition to the different procedures, if you have an idea about the mean capture efficiency of your data (\bar{\beta}, a scalar ranges between 0 and 1), or even have a relevant smFISH data, then you can estimate \beta accurately. This may lead to a better performance of bayNorm.

Thanks @WT215 for the detailed and quick reply.
My main goal for the analysis is to find trajectory, DE and find any excited cells due to the KO/treatment. According to my understanding,

  1. GG would be best for DE & Trajectory
  2. LL would best for finding excited cells/signatures

Is that correct?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Thank you for trying bayNorm!

The following code may help you to insert any normalized data, which can be obtained outside Seurat framework, into a Seurat object. Then you can find clusters based on that normalized data:

library(bayNorm)
library(Seurat)

data('EXAMPLE_DATA_list')

library(Seurat)
bay_out<-bayNorm(EXAMPLE_DATA_list$inputdata,mean_version = TRUE)
x.seurat <- CreateSeuratObject(counts =bay_out$Bay_out,assay = 'bayNorm')
# x.seurat <- NormalizeData(x.seurat)
x.seurat <- ScaleData(x.seurat)
x.seurat <- FindVariableFeatures(x.seurat, do.plot = FALSE)
#Specifying: assay='bayNorm'
x.seurat <- RunPCA(x.seurat, pc.genes = [email protected],
                   pcs.compute = 50, do.print = FALSE,assay='bayNorm')

x.seurat <- RunUMAP(x.seurat, dims = 1:10,assay='bayNorm')
#x.seurat <- JackStraw(x.seurat, prop.freq = 0.06)
x.seurat <- FindNeighbors(x.seurat, dims = 1:10)
x.seurat <- FindClusters(x.seurat, resolution = 0.5)
head(Idents(x.seurat), 5)
#It is a toy example. Here only one cluster was found
plot(x.seurat@[email protected],pch=16,col=as.factor(Idents(x.seurat)))
#Double check that the assay we used comes from bayNorm
x.seurat@[email protected]

Hope this would be helpful.

Do we need to regress out % mitochondria and cell cycle or is it already account the same in any of your prior estimate?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

All works good with relatively small data. When I try the actual data it exit with some error

BNorm_LL_Sample <- bayNorm(raw_data@assays$RNA@counts,mean_version = TRUE,
+                   [email protected]$orig.ident, Prior_type="LL")
Error in asMethod(object) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
BNorm_LL_Batch <- bayNorm(raw_data@assays$RNA@counts,mean_version = TRUE, [email protected]$Batch, Prior_type="
LL")
Error in asMethod(object) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105

This data has 480K cells across 4 groups and 20 samples. As this error may be related to the memory, I tried with 3.8T RAM system also (sadly, we cant get more memory), still get the error. If I split the groups, that works fine but that would be more like to be LL. Is there any work around for this issue?

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

I updated the package, try install it via:

#version 1.5.4
library(devtools)
devtools::install_github("WT215/bayNorm")

Now bayNorm support sparse matrix. Let me know if it still does not work.

BTW, how many genes do you keep?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

@WT215, thanks for the fix and update. It does seems works and will update how is the performance.

I use all the remnant genes based on the Seurat default filtering (thats ~15k). Is there any other criterion for the genes to keep?

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

@WT215, thanks for the fix and update. It does seems works and will update how is the performance.

I use all the remnant genes based on the Seurat default filtering (thats ~15k). Is there any other criterion for the genes to keep?

15k is a reasonable number. Seurat is already good to be used for this purpose. Another way is, for each gene, look at the proportion of cells which have non-zero counts. Then you can set up a threshold.

I am still updating the package. Currently accessing the row of dgCMatrix is slower than accessing the row of matrix in R https://stackoverflow.com/questions/47997184/extraction-speed-in-matrix-package-is-very-slow-compared-to-regular-matrix-class.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

LL done without any error. But GG is having an error

Error in t_sp(Data): SpMat::init(): requested size is too large
Traceback:

1. bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, Conditions = [email protected]$orig.ident, 
 .     Prior_type = "GG")
2. Prior_fun(Data = do.call(cbind, DataList_sr), BETA_vec = do.call(c, 
 .     BETAList), parallel = parallel, NCores = NCores, FIX_MU = FIX_MU, 
 .     GR = GR, BB_SIZE = BB_SIZE, verbose = verbose)
3. t_sp(t_sp(Data)/BETA_vec)
4. t_sp(Data)

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

how to fix this issue?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

I think its worth looking this thread

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

@WT215 it still persist!. I tried the highest possible memory in GCP (n1-ultramem-160 3.8TB) :(

Error in asMethod(object) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
Calls: bayNorm ... EstPrior -> <Anonymous> -> - -> - -> as -> asMethod
Execution halted

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

What is the error message?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

What is the error message?
updated the previous comment with the error

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

Maybe we need to debug in the following steps:

  1. Can rout=Matrix::rowMeans(raw_data@assays$RNA@counts) be run?
  2. Can qq=as.matrix(raw_data@assays$RNA@counts) be run successfully on your site?
  3. If the step 2 is ok, then how about rout=Matrix::rowMeans(qq)?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Step1 is working fine.
step to have the error

qq=as.matrix(raw_data@assays$RNA@counts)
Error in asMethod(object) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105

as step is not working, the step 3 cant be tested

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

If as.matrix only correlates with the memory size of machine, then I probably change back to use matrix instead of dgCMatrix object for processing the data in the future.

Falling to matrix is the option here I guess?

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

@kvshams I just updated the 1.5.7, hopefully it works.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Error still persists with rccp (even with the highest RAM system).

error: Mat::init(): requested size is too large
Error in EstPrior_rcpp(Data = Data) : 
  Mat::init(): requested size is too large
Calls: bayNorm -> Prior_fun -> EstPrior -> EstPrior_rcpp
Execution halted

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

I replaced the Rcpp function with original one in 1.5.8. You can directly try:

MME_est<-EstPrior(Data=raw_data@assays$RNA@counts,verbose=TRUE)

Can that be run?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Thank you for the update. Now I get this.

at_GG <- bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, [email protected]$orig.ident, Prior_type="GG")

 Priors estimation based on MME method has completed.
Start optimization using spg from BB package.
This part may be time-consuming.
  |                                                                      |   0%Error in serialize(data, node$con) : ignoring SIGPIPE signal
MME_est<-EstPrior(Data=raw_data@assays$RNA@counts,verbose=TRUE)
Priors estimation based on MME method has completed.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

UPDATE: now it seems working... will update the performance onece it will get complete (as it may take few hours to get the result, I keeping the log here). Thanks a lot for the quick fix!

at_GG <- bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, [email protected]$orig.ident, Prior_type="GG")
Priors estimation based on MME method has completed.
Start optimization using spg from BB package.
This part may be time-consuming.
  |=============                                                         |  18%

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Yes it works now!. Thanks a lot.

at_GG <- bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, [email protected]$orig.ident, Prior_type="GG")
Priors estimation based on MME method has completed.
Start optimization using spg from BB package.
This part may be time-consuming.

|======================================================================| 100%
Prior estimation has completed!
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

I am having very hard time in making the seurat object from the baynorm output (450k Cells across 20 samples). I think the issue is to convert all the samples to one dgCMatrix . If I am trying to have a work around like

Mtrix_from_Baynorm <- cbind(baynorm$Bay_out_lis)
my.counts <- as(Mtrix_from_Baynorm, "dgCMatrix")
Segmentation fault

Is there any way I can put all the sample into a single dgCMatrix in baynorm output?

If I am making one object per sample from the Bay_out_lis and use the Seurat merge option, then also it breaks due to memory error.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

Hi @kvshams ,

The correct way for cbind matricies in a list is Mtrix_from_Baynorm <- do.call(cbind,baynorm$Bay_out_list), maybe try this at first, since you have already had that result.

If that does not solve the problem, then another way could be:

at_GG <- bayNorm(raw_data@assays$RNA@counts, mean_version = TRUE, Prior_type="GG")

By setting Conditions=NULL, the output will be a full matrix instead stored in a list. Hence there is no need to cbind. Since Prior_type="GG", specifying Conditions is not necessary. You can call the subgroup of cells according to the cell names later. This way may save the memory.

Then, x.seurat <- CreateSeuratObject(counts =at_GG$Bay_out,assay = 'bayNorm'). No need to convert it to dgCMatrix, that code (CreateSeuratObject) will convert it inside that function.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Thanks @WT215 for reopening the issue and suggestions. I have already tried both cbind and cbind2 ie as in

 do.call(cbind,baynorm$Bay_out_list)

In both the case seurat fails (I presume it is in dgcmatrix converting step as the input from baynorm is a matrix and the error is exactly similar that I get to convert baynorm output matrix to the dgcmatrix).

Error in .cbind2Csp(x, y) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_sparse.c, line 92

Let me try the GG as you suggested and see how it goes.

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

Just updated 1.5.14, please have a try.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

@WT215 I think the issue is with the matrix. The baynorm out is no more a true sparse one and almost all the col*rows are filled with values imputed by baynorm (many of them are near to zero). Is'nt it worth putting all values <0 to 0 to make it memory efficient?. Is that statistically correct according to your model?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

In the output replacing <1 with 0 helps with the cost of time! (conversion time). And the v 1.5.14 also works now (I just re-installed the Rccp and RcppArmadillo). 1.5.14 is much faster and memory efficient than any previous version!.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

Now it gives a new error

at_GG$Bay_out
22865 x 442010 sparse Matrix of class "dgCMatrix"
Error in validityMethod(as(object, superClass)) :
  long vectors not supported yet: ../../src/include/Rinlinedfuns.h:522

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

Try not to access the whole data directly: at_GG$Bay_out.

Can x.seurat <- CreateSeuratObject(counts =at_GG$Bay_out,assay = 'bayNorm') be run?

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

from baynorm.

WT215 avatar WT215 commented on June 9, 2024

This has been mentioned here: satijalab/seurat#1029 (comment), will look into it.

How about

qq<-unname(at_GG$Bay_out)
rm(at_GG)
x.seurat <- CreateSeuratObject(counts =qq,assay =
 'bayNorm')

Or save at_GG$Bay_out as HDF5Matrix object. Then Read10X_h5 (still, not sure about this).

Another relevant discussion:
cole-trapnell-lab/monocle-release#138 (comment).

A final resort could be using scanpy in python, which is similar to Seurat in R.

from baynorm.

kvshams avatar kvshams commented on June 9, 2024

@WT215 sorry to bug you again. How easy to return Arcsine normalized dgcMatrix as an output. That would solve the memory issue as it keep the zero value as zero.

from baynorm.

Related Issues (2)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.