The result is NA about bigstatsr HOT 12 CLOSED

Knight1995 commented on June 8, 2024

The result is NA

from bigstatsr.

Comments (12)

privefl commented on June 8, 2024

You should start from the warning messages. X.sub is probably shorter than K1[, 2].
Also, ind spans the column indices by default in big_apply(), but here you're using it for the rows.
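To illustrate the point about `ind`, here is a minimal sketch on a small, made-up FBM (the data and the names `col_sums` and `row_sums` are assumptions for illustration, not taken from the issue). By default each block passed to `a.FUN` is a set of column indices; to iterate over rows instead, `ind = rows_along(X)` has to be passed and the rows indexed inside the function.

```r
library(bigstatsr)

# Small FBM with hypothetical data: 6 rows, 3 columns, filled column-wise
X <- FBM(6, 3, init = 1:18)

# Default behaviour: ind is a block of COLUMN indices
col_sums <- big_apply(X, a.FUN = function(X, ind) {
  colSums(X[, ind, drop = FALSE])
}, a.combine = 'c')

# Row-wise: pass ind = rows_along(X) and subset the ROWS inside a.FUN
row_sums <- big_apply(X, a.FUN = function(X, ind) {
  rowSums(X[ind, , drop = FALSE])
}, a.combine = 'c', ind = rows_along(X), block.size = 2)
```

Note that in both cases `ind` is a vector of several indices per block, so the function must return one value per index for the combined result to line up with the rows or columns.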


Knight1995 commented on June 8, 2024

Thanks for your quick reply! Actually, I have edited my code to calculate each row's result, but the result is still NA.
colmeans <- big_apply(X1, ind = rows_along(X),function(X, ind) {
X.sub <- X[ind,1]

K1<-map_dfr(unique(X[,1]),function(i){
S1 <-mean(Y[which(X[,1]==i),1])
data.frame(Value=S1,clu=i)
})

a<-K1[which(K1[,2]==X.sub),1]
b<-min(K1[which(K1[,2]!=X.sub),1])
si=(b-a)/max(b,a)
return(si)
}, a.combine = 'c')


privefl commented on June 8, 2024

Yes, cf. my first comment.


Knight1995 commented on June 8, 2024

Sorry to bother you again. When I test a single number (ind = 1), my code works. But when I put the code into big_apply, the results are NA. What is the problem? Does the R algorithm not work in big_apply? Thanks.


privefl commented on June 8, 2024

No, your code doesn't work when using ind <- 1.
It is just that X.sub is of length 1 and gets automatically recycled to match the size of K1[, 2].
Which is probably not what you want.
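The recycling behaviour described above can be reproduced in plain R. This is only a sketch with made-up values (`clu`, `x_one` and `x_many` are hypothetical stand-ins for `K1[, 2]` and `X.sub`): a length-1 value is silently recycled in the comparison, while a longer vector of a non-matching length triggers a warning and gives a comparison that is unlikely to be the intended one.

```r
# Hypothetical cluster labels, standing in for K1[, 2]
clu <- c(1, 2, 3)

# Length-1 value (the ind = 1 case): recycled silently, looks like it "works"
x_one <- 2
one_cmp <- clu == x_one

# Several indices at once (a typical block in big_apply): lengths do not match
x_many <- c(2, 3, 1, 2)

# Capture the warning message instead of printing it
msg <- tryCatch(clu == x_many, warning = function(w) conditionMessage(w))

# The comparison itself is done element by element, position by position
many_cmp <- suppressWarnings(clu == x_many)
```

Here `one_cmp` picks out the second cluster, whereas `many_cmp` compares positions pairwise and matches nothing, which is why the warning messages are the place to start.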


privefl commented on June 8, 2024

You need to think about what you are trying to achieve here.
If I had to guess, I would say that you need to subset K1[ind, 2].


Knight1995 commented on June 8, 2024

Thanks for your reply. To track down the problem, I tried a simple test, as follows. I think the cause may be that I didn't pass one of the two variables, Y, so there is no result. But after rewriting the code in your multivariate format (https://privefl.github.io/bigstatsr/articles/big-apply.html), there is still no output, which is very weird. Could you give me some suggestions? Thanks.


privefl commented on June 8, 2024

Properly passing other variables as arguments of the function is necessary only when using parallelism.
Doing mean(Y[-ind, ]) is very odd (especially the minus). What are you trying to achieve here (in simple English)?
What do you have for summary(Y)?
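For reference, passing a second object explicitly might look like the sketch below. The data are made up, and the weighted column sum is only an illustrative computation, not what the issue is trying to compute. Extra named arguments given to `big_apply()` are forwarded to `a.FUN` through `...`; in a sequential run the function could also find `Y` in its enclosing environment, which is why explicit passing only matters with parallelism.

```r
library(bigstatsr)

# Hypothetical data: X is 4 x 3, Y is a single-column FBM with 4 rows
X <- FBM(4, 3, init = 1:12)
Y <- FBM(4, 1, init = c(10, 20, 30, 40))

# Y is passed as an extra named argument and received by a.FUN
res <- big_apply(X, a.FUN = function(X, ind, Y) {
  # Weighted column sums for this block of columns; Y[] reads Y into memory
  drop(crossprod(X[, ind, drop = FALSE], Y[]))
}, a.combine = 'c', Y = Y)
```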


Knight1995 commented on June 8, 2024

'ind' is the row number; mean(Y[-ind, ]) removes that row from the matrix and computes the mean of the remaining matrix.
summary(Y) gives the following.


privefl commented on June 8, 2024

I didn't get that Y was also an FBM. Then summary(Y[]).
You understand that ind is usually a vector of multiple indices, not just one, right?
And you want the full mean() of the matrix? Not something like the rowMeans()?
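The difference between these options can be seen on a small in-memory matrix. This is a sketch with made-up values, using a plain `matrix` rather than an FBM: when `ind` holds several indices, `mean(Y[-ind, ])` drops all of those rows at once and returns a single number, whereas a per-row result needs either `rowMeans()` or an explicit loop over the indices.

```r
# Hypothetical 4 x 3 matrix; rows are (1,5,9), (2,6,10), (3,7,11), (4,8,12)
Y <- matrix(1:12, nrow = 4)

# A block of two row indices, as big_apply would typically pass
ind <- 1:2

# Drops rows 1 AND 2 together, then averages everything left: one scalar
full_mean <- mean(Y[-ind, ])

# One mean per selected row
row_means <- rowMeans(Y[ind, , drop = FALSE])

# Leave-one-out mean for each index separately: one value per row in ind
loo_means <- sapply(ind, function(i) mean(Y[-i, ]))
```

Only the last two forms return one value per element of `ind`, which is what `a.combine = 'c'` needs to build a per-row vector.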


Knight1995 commented on June 8, 2024

Yes, I think I understand what you mean. I tested the simple example above to learn, step by step, how to rewrite the a.FUN in big_apply. My original R code is below. Because the matrix is too big and the code runs too slowly, I want to implement this function using big_apply. cluster_info and dist are the original matrices; they have the same row names and the same number of rows.

K3<-future_map_dfr(seq(ncol(cluster_info)),function(Y){
  K2<-map_dfr(seq(nrow(cluster_info)),function(index){
    x <-cluster_info[,Y]
    dist2 <- as.data.frame(cbind(x,dist))[-index,]
    
    K1<-map_dfr(unique(x),function(i){
      d<-mean(dist2[which(dist2$x==i),index+1])  
      #d<-sum(dist2[ which(dist2$x==i),index+1])/length( which(dist2$x==i))
      
      data.frame(Value=d,clu=i) 
    })
    
    si <- (min(K1[K1$clu!=x[index],]$Value)-K1[K1$clu==x[index],]$Value)/max(min(K1[K1$clu!=x[index],]$Value),K1[K1$clu==x[index],]$Value)
    if(is.na(si)){
      data.frame(cluster=x[index],sil_width=0) 
    }else{
      data.frame(cluster=x[index],sil_width=si) 
    }
    
  })
  data.frame(Resolution=colnames(cluster_info)[Y],silhouette_score=mean(K2$sil_width))
})


privefl commented on June 8, 2024

I don't get what you're trying to achieve here; sorry I cannot help.
