Comments (5)
It depends on which benchmark dataset you used. These cell lines have distinct gene expression patterns, so the single cell data can be easily clustered. This is why we have the mixture dataset, where the differences between clusters are more subtle. Also keep in mind that these datasets provide a baseline for method evaluation, which means that good performance on them does not guarantee similar performance on other, more complicated data. If you want to challenge your clustering methods, you can look at scRNAseq data from differentiating cells, such as hematopoietic stem cells (https://www.nature.com/articles/s41467-019-10291-0) or iPSCs (https://doi.org/10.1016/j.cell.2019.01.006). The differences between cell types/states will be more subtle, but the annotation will not be as good as in our controlled experiment, due to the limited knowledge of the biological systems.
from sc_mixology.
Hi, I am experimenting with your test data as well. (Nice paper by the way, very comprehensive comparisons). However, I am having difficulties finding out the "true label" for cellmix data.
I understand that for CellLine scRNAseq data you used "cell_line_demuxlet" as your true label, and for RNAmix data you used "mix" column to store the true label info, but for cellmix data there is no such group info in "colData".
It would be great if you could help me with this issue. Many thanks in advance.
from sc_mixology.
Yes, I agree it is not very clear. For the cellmix data, the true labels are the combinations of cell numbers from the three cell lines. The same applies to the RNA mixtures, except that there the labels are the combinations of RNA proportions rather than cell numbers. I will update the documentation.
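As a rough illustration of the idea above, here is a minimal Python sketch of how a combination label could be built once the per-sample values have been exported from `colData`. The column names (`H1975`, `H2228`, `HCC827`), the helper `combination_label`, and the toy values are assumptions for illustration only and are not taken from the repository; please check the actual `colData` columns and the updated documentation before relying on it.

```python
from collections import Counter

# Hypothetical column names: assumed to hold, for each sample, the number of
# cells taken from each of the three cell lines. Verify against colData.
CELL_LINE_COLUMNS = ("H1975", "H2228", "HCC827")


def combination_label(row, columns=CELL_LINE_COLUMNS):
    """Join the per-cell-line values into a single label such as '3_3_3'."""
    return "_".join(str(row[col]) for col in columns)


# Toy rows standing in for exported colData; the values are made up.
toy_col_data = [
    {"H1975": 9, "H2228": 0, "HCC827": 0},
    {"H1975": 3, "H2228": 3, "HCC827": 3},
    {"H1975": 3, "H2228": 3, "HCC827": 3},
    {"H1975": 0, "H2228": 0, "HCC827": 9},
]

# One label per sample: samples with the same combination share a group.
true_labels = [combination_label(row) for row in toy_col_data]

# Number of samples in each combination group.
group_sizes = Counter(true_labels)

print(true_labels)
print(group_sizes)
```

The same helper would work for the RNA mixtures if the columns held RNA proportions instead of cell numbers, since the values are only converted to strings and joined.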
from sc_mixology.
Hi @LuyiTian, I am curious about the true labels as well. Which file(s) contain the true cluster labels? Thanks in advance.
from sc_mixology.
@sgmccalla I have updated the documentation. Please check the recent commit bba1c35.
from sc_mixology.