Comments (5)
@kstreet13 There might be something that you can do to speed it up. By default, princurve produces principal curves whose length equals the number of points (n) in your dataset. This means that at each iteration, n points from your dataset are projected onto a curve of n points, which requires n × n comparisons.
I added a parameter to princurve so that for large n, the curve is approximated by a curve of a fixed size (e.g. 100 to 1000 points): princurve:principal_curve.R#L174-L186. Check this commit to see the changes I made to achieve this result.
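To make the cost argument concrete, here is a minimal base R sketch. It is not princurve's actual implementation: it uses toy data, snaps each point to the nearest curve vertex rather than projecting onto curve segments, and the names `n`, `m`, `curve_full`, `curve_approx` and `nearest_point` are all hypothetical. It only illustrates that resampling the curve to `m` points shrinks the distance matrix from n × n to n × m.

```r
# Sketch only: base R illustration, not princurve's code.
set.seed(1)
n <- 2000   # number of cells (hypothetical)
m <- 100    # fixed number of curve points (hypothetical)

# Toy 2-D data scattered around a parabola.
u <- sort(runif(n))
X <- cbind(u, u^2) + matrix(rnorm(2 * n, sd = 0.02), ncol = 2)

# A "full" curve with one point per cell, ordered along the curve.
curve_full <- cbind(u, u^2)

# Resample the curve to m points, evenly spaced in arc length.
seg_len <- sqrt(rowSums(diff(curve_full)^2))
arc <- c(0, cumsum(seg_len))
grid <- seq(0, max(arc), length.out = m)
curve_approx <- cbind(
  approx(arc, curve_full[, 1], xout = grid)$y,
  approx(arc, curve_full[, 2], xout = grid)$y
)

# Index of the nearest curve point for every row of X.
# Squared distances use the expansion |x|^2 - 2 x.c + |c|^2.
nearest_point <- function(X, C) {
  d2 <- outer(rowSums(X^2), rowSums(C^2), "+") - 2 * X %*% t(C)
  max.col(-d2, ties.method = "first")
}

idx_full   <- nearest_point(X, curve_full)    # builds an n x n matrix
idx_approx <- nearest_point(X, curve_approx)  # builds an n x m matrix
```

With these toy sizes the second call works on a matrix 20 times smaller than the first, and that saving is repeated at every iteration of the fit.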
from slingshot.
Update: I'm still working on it, but the `approx` branch now has a version of `getCurves()` that accepts the `approx_points` argument. This provides a nice speed-up, especially for branching lineages!
Hey @pangxueyu233,
Thanks very much! I have a few ideas, but it depends on what the goals of your analysis are.
Unfortunately, I don't think there's much I can do to speed up `getCurves()`/`slingshot()` (although I'm open to suggestions!). However, if you are interested in exploring the overall branching structure in a dataset, you can do this quickly by only using `getLineages()` and examining the output of that function over many possible inputs. Then, once you have settled on a particular structure, you should only need to run `getCurves()` once.
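As a rough sketch of that workflow: the toy data, the k-means clustering, and the names `rd`, `cl`, `candidate_starts` and `chosen_start` below are all made up for illustration, and the exact arguments may differ between slingshot versions. The slingshot calls are guarded so the block still runs when the package is not installed.

```r
# Sketch only: toy data and hypothetical object names.
set.seed(7)
rd <- rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 4), ncol = 2),
  matrix(rnorm(200, mean = 8), ncol = 2)
)                                                  # reduced-dimension coordinates
cl <- as.character(kmeans(rd, centers = 3)$cluster)  # cluster labels

candidate_starts <- sort(unique(cl))

if (requireNamespace("slingshot", quietly = TRUE)) {
  # Cheap step: inspect the lineage structure for each candidate start cluster.
  for (s in candidate_starts) {
    lin <- slingshot::getLineages(rd, cl, start.clus = s)
    print(slingshot::slingLineages(lin))
  }

  # Expensive step: fit curves once, for the structure you settled on.
  chosen_start <- candidate_starts[1]   # placeholder choice
  lin <- slingshot::getLineages(rd, cl, start.clus = chosen_start)
  crv <- slingshot::getCurves(lin)
}
```

The loop only builds the cluster-level tree, so it stays fast even when many start clusters or clusterings are tried. The single `getCurves()` call at the end is where the time goes.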
As for testing the associations between genes and pseudotime, there are lots of ways this could be done and the GAM code we provide in the vignette is just one example. If you want something that will run faster, you could try fitting polynomials rather than smoothing splines, as this can be vectorized. For example, you could fit 4th degree polynomials for each gene like:
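# Assumes `t` is a numeric vector of pseudotime values (one per cell)
# and `Y` is a genes-by-cells expression matrix.
degree <- 4
pst <- t - mean(t)  # center pseudotime
x <- sapply(seq_len(degree), function(p){ pst^p })  # columns pst^1 ... pst^degree
fit <- lm(t(Y) ~ x)  # one regression per gene, fit in a single call
s <- summary(fit)
fs <- t(sapply(s, function(s){ s$fstatistic }))  # F statistic, numerator df, denominator df
# p-value of the overall F-test for each gene
poly.pval <- pf(fs[,1], df1 = fs[,2], df2 = fs[,3], lower.tail = FALSE)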
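For anyone who wants to try this without their own data, here is a self-contained version of the same polynomial fit on simulated input. The simulated `pseudotime` and `Y`, the gene names, and the Benjamini-Hochberg adjustment at the end are additions for illustration only and are not part of the snippet above.

```r
# Sketch only: simulated data, hypothetical names.
set.seed(42)
n_cells <- 200
n_genes <- 50
pseudotime <- runif(n_cells, 0, 10)

# Genes-by-cells matrix of pure noise ...
Y <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
# ... with a quadratic pseudotime trend added to the first five genes.
for (g in 1:5) {
  Y[g, ] <- Y[g, ] + 0.1 * (pseudotime - 5)^2
}

degree <- 4
pst <- pseudotime - mean(pseudotime)              # center pseudotime
x <- sapply(seq_len(degree), function(p) pst^p)   # polynomial design columns
fit <- lm(t(Y) ~ x)                               # all genes in one call
fs <- t(sapply(summary(fit), function(s) s$fstatistic))
poly.pval <- pf(fs[, 1], df1 = fs[, 2], df2 = fs[, 3], lower.tail = FALSE)

# Multiple-testing correction across genes (an added step).
poly.padj <- p.adjust(poly.pval, method = "BH")
head(sort(poly.padj))
```

Because `lm()` accepts a matrix response, all genes share one design matrix and one decomposition, which is where the speed advantage over fitting a separate GAM per gene comes from.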
Hope this helps and let me know if you have any other questions or suggestions!
Kelly
@rcannood Wow, this is great! Thanks very much!
@pangxueyu233 I'll work on adding this argument to `getCurves()` and hopefully have a new version up soon.
Hey @pangxueyu233,
The new version is up! v1.3.1 is now on the `master` branch here and the `devel` branch on Bioconductor. I did some quick benchmarking and it looks like the `approx_points` argument provides a considerable speedup for trajectories with branching lineages, along with a modest speedup for individual lineages (thanks again, @rcannood, this is great for working with larger 10X datasets!).
Also, if you're interested, I wanted to let you know about our recently released package, tradeSeq! It provides functionality for performing downstream testing for association between genes and lineages and I would definitely recommend it over the GAM code from the Slingshot vignette.
Hope this helps!
Kelly