It turned out that rows (genes) containing only 0 values caused the issue.
It would be nice to have a solution or recommendation for handling this problem (e.g. simply dropping genes with only 0s before the difference calculation, or something else).
Another issue related to this problem: after calculating diversity, some rows contain a few really low values (values < .Machine$double.eps) that are handled as 0s, so these rows also cause errors:
Error in if (ecdf(shuffled[i, ])(log2_fc[i]) >= 0.5) { :
missing value where TRUE/FALSE needed
Calls: calculate_difference -> label_shuffling
In addition: There were 50 or more warnings (use warnings() to see the first 50)
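For anyone trying to reproduce this, here is a minimal sketch of one way the missing value in the `if` condition can arise. The fold-change formula (log2 of a ratio of group means) and the toy numbers are my assumptions for illustration, not taken from the package source. When both group means are 0, the ratio is 0 / 0 = NaN, and evaluating an ecdf at NaN returns a missing value rather than a probability:

```r
# Toy illustration only: the fold-change formula below is an assumption,
# not necessarily what calculate_difference() does internally.
mean_a <- c(gene1 = 0.8, gene2 = 0)   # gene2 is all zeros in both groups
mean_b <- c(gene1 = 0.4, gene2 = 0)
log2_fc <- log2(mean_a / mean_b)      # gene2: 0 / 0 gives NaN

# Hypothetical shuffled fold changes for a single gene.
shuffled_row <- c(-0.5, 0.1, 0.3, 1.2)

# Evaluating the empirical CDF at NaN yields a missing value.
p <- ecdf(shuffled_row)(log2_fc)
is.na(p)

# A condition such as `if (p[2] >= 0.5)` would then stop with
# "missing value where TRUE/FALSE needed".
```

If this is indeed the mechanism, any gene whose values are all 0 (or effectively 0) in both groups will trigger the error, which matches what I see.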
library(dplyr)

# Convert really small values to 0s:
diversity_data[diversity_data < .Machine$double.eps] <- 0

# Filter out genes (rows) with only zeros in the "dataset" columns:
diversity_data_filtered <- diversity_data %>%
  mutate(rowsum = rowSums(dplyr::select(., starts_with("dataset")))) %>%
  filter(rowsum != 0) %>%
  dplyr::select(-rowsum)
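A possible base R variant of the same filtering, as a sketch only. The helper name `drop_all_zero_rows`, the `prefix` and `tol` arguments, and the toy data frame are hypothetical. It compares absolute values against the tolerance instead of testing the row sum, so it touches only the numeric value columns and does not rely on the values being non-negative:

```r
# Hypothetical helper: keep rows that have at least one value whose
# magnitude exceeds `tol` in the columns starting with `prefix`.
drop_all_zero_rows <- function(df, prefix = "dataset",
                               tol = .Machine$double.eps) {
  value_cols <- startsWith(names(df), prefix)
  values <- as.matrix(df[, value_cols, drop = FALSE])
  # NA entries are ignored, so a row of only NAs and zeros is dropped too.
  keep <- rowSums(abs(values) > tol, na.rm = TRUE) > 0
  df[keep, , drop = FALSE]
}

# Toy data: gene B is all zeros, gene C has only a value below the tolerance.
toy <- data.frame(
  genes = c("A", "B", "C"),
  dataset_1 = c(0.5, 0, 1e-20),
  dataset_2 = c(0.2, 0, 0)
)
filtered <- drop_all_zero_rows(toy)
```

Unlike the snippet above, this does not overwrite small values in the rows that are kept; it only decides which rows to drop.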