hemberg-lab / scrna.seq.datasets
Collection of public scRNA-Seq datasets used by our group
Home Page: https://hemberg-lab.github.io/scRNA.seq.datasets/
License: GNU General Public License v3.0
Thanks for your work! I am running your code, but I found that several datasets that should be downloaded from the Amazon server are gone, for example the Pollen dataset from ./bash/pollen.sh. Is there any way to download these datasets or otherwise solve this problem? Thanks!
I've noticed that downloading the raw data and running the provided R scripts to generate the RDS files gives different results from the files on the website. Specifically, the logcounts don't match up everywhere.
I discovered this while looking through the scmap paper and going through the data sets. This problem only occurs for the data sets with CPM normalization (in Supplementary Table 2 from scmap), namely: Goolam, Li, Kolodziejczyk, Baron, Segerstolpe, Klein, Zeisel, Shekhar and Macosko.
I've gone through how the logcounts are computed in create_sce.R, but it doesn't match up with the actual results. For example, take the Li data set:
> log2(calculateCPM(sceset, use_size_factors = FALSE) + 1)[1:5, 1:2]
RHA015__A549__turquoise RHA016__A549__turquoise
TSPAN6 9.009166 9.2430517
TNMD 0.000000 0.0000000
DPM1 4.888420 6.9251494
SCYL3 0.000000 6.7888400
C1orf112 0.000000 0.6565277
while loading li.rds (available at https://hemberg-lab.github.io/scRNA.seq.datasets/human/tissues/) gives
> logcounts(li)[1:5, 1:2]
RHA015__A549__turquoise RHA016__A549__turquoise
TSPAN6 10.352333 10.586473
TNMD 0.000000 0.000000
DPM1 6.203446 8.262799
SCYL3 0.000000 8.125773
C1orf112 0.000000 1.300885
How were these logcounts computed?
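For reference, the textbook definition of log2(CPM + 1) can be sketched as follows. This is only an illustration in plain Python, not the code in create_sce.R or the scater implementation: the function name log2_cpm and the toy counts are hypothetical, and the library size is assumed to be the plain column sum with no size factors applied. Any other scaling of the library size changes every non-zero value, so a reference computation like this can help narrow down where two sets of logcounts diverge.

```python
import math


def log2_cpm(counts):
    """Return log2(CPM + 1) for a gene -> per-cell raw counts mapping.

    Assumption: library size is the plain sum of raw counts per cell,
    with no size factors applied.
    """
    n_cells = len(next(iter(counts.values())))
    # Total raw counts per cell (column sums).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_cells)]
    return {
        gene: [math.log2(c / lib_sizes[j] * 1e6 + 1) for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


# Hypothetical toy data: three genes, two cells.
toy = {"geneA": [10, 0], "geneB": [90, 50], "geneC": [0, 50]}
result = log2_cpm(toy)
for gene, values in result.items():
    print(gene, [round(v, 4) for v in values])
```

Zero counts stay at exactly zero under this transform, which matches the zero entries in both outputs above; only the non-zero entries depend on how the library size is scaled.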
When running treutlein.sh to download the data, I get the following error message:
HTTP request sent, awaiting response... 404 Not Found
2020-04-14 16:23:20 ERROR 404: Not Found.
Opening the page manually also leads to a 404 error.
I was wondering how to formally cite this collection of scRNA-seq datasets for academic publication.
Thank you for providing this comprehensive and easy-to-use collection. It saved me a lot of time.
Hi,
It appears the cell annotations for the Biase dataset were on Amazon and the file is no longer accessible. Would it be possible to share the cell annotation file?
Thanks,
Atif
Hello,
Bash links appear to be broken:
scrnaseq]$ wget 'https://s3.amazonaws.com/scrnaseq-public-datasets/manual-data/yan/nsmb.2660-S2.csv'
--2024-03-21 14:26:38-- https://s3.amazonaws.com/scrnaseq-public-datasets/manual-data/yan/nsmb.2660-S2.csv
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.217.200.232, 52.217.103.102, 52.217.117.24, ...
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.200.232|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-03-21 14:26:39 ERROR 404: Not Found.
Would it be possible to update the links?
When running goolam.R to convert the data into a SingleCellExperiment, line 17
sceset <- create_sce_from_counts(d, ann)
causes the following error:
Error in .calculate_cpm(assay(x, exprs_values), ...) : unused argument (use.size.factors = FALSE)
Traceback information:
Error in .calculate_cpm(assay(x, exprs_values), ...) :
unused argument (use.size.factors = FALSE)
9. .local(x, ...)
8. .nextMethod(x = x, size_factors = size_factors, ...)
7. eval(call, callEnv)
6. eval(call, callEnv)
5. callNextMethod(x = x, size_factors = size_factors, ...)
4. .local(x, ...)
3. calculateCPM(sceset, use.size.factors = FALSE)
2. calculateCPM(sceset, use.size.factors = FALSE) at create_sce_mirror.R#14