Comments (9)
This is a very interesting question and although I don't currently have one, I'm beginning to think I should put together a vignette to summarize my thoughts on the topic. I've performed this sort of analysis before by combining the two conditions and only producing one set of trajectories (which we could do because the two conditions shared a common starting cell type). This way, we could directly compare, e.g. "Condition A along Lineage 1" to "Condition B along Lineage 1" and know that these things were comparable. This could allow you to detect certain topological differences (if Lineage 1 ends in a cell fate that cannot be reached by cells in Condition A, then we would see that all cells after a certain pseudotime were from Condition B). And if you wanted to assign a p-value to this, you could try something like a rank sum test (comparing the ranks of the Lineage 1 pseudotime values in Condition A vs. Condition B) or logistic regression (where the predictor is pseudotime and the outcome is a binary indicator of condition).
If that paradigm doesn't work for your data, you could also estimate the percentages of cells that fall along each lineage for a given dataset. The slingCurveWeights
matrix provides values for weighting cells in downstream analysis of individual lineages, but if you divide this by the rowSums
, you could interpret it as probabilistic assignment of cells to lineages and get an overall percentage of cells assigned to each lineage. However, this assumes that you can directly map the lineages in one condition to the lineages in another, in order to compare their percentages.
Let me know if that doesn't make sense or if you have any other thoughts!
from slingshot.
Thanks for the reply, this is a helpful place to start. Can you clarify how we would quantify "Condition A/B along Lineage 1" using the data stored in these various objects? It seems pretty complex. I think in our case, we likely can combine both conditions and call just one trajectory, but it will only make sense if we can do some comparisons of our conditions from there.
Thanks!
from slingshot.
Of course, sorry that was a bit vague. The idea is that you would use all the cells to construct the lineages and then compare the conditions based on their distributions of pseudotime values. For the example I described above, I'm picturing something like this, where the two conditions share a starting point and have one similar lineage and another that is only seen in Condition B:
Then for each lineage, you could do some sort of test for independence between pseudotime and condition (I think a rank sum test makes sense here, but it isn't perfect because it doesn't incorporate the weights assigned by Slingshot).
# test lineage 1
wilcox.test(slingPseudotime(sds)[condition=='A',1], slingPseudotime(sds)[condition=='B',1])
# test lineage 2
wilcox.test(slingPseudotime(sds)[condition=='A',2], slingPseudotime(sds)[condition=='B',2])
There are definitely other ways you could approach this last part, but I think this makes sense if your goal is to detect large, topological differences between the conditions.
from slingshot.
Hmm so maybe this is a feature unique to our dataset, but our trajectories seemingly make this tricky. Essentially we have 3 trajectories, and the first 4-5 points on the curves are shared among all 3 (ie it's a late fork). The one we care about most (traj2) has just one unique-ish offshoot as the endpoint. By making a boxplot of the 3 like you did above, the values in slingPseudotime(x)
don't distinguish our two conditions, and actually, show the opposite trend. We also can't just find cells unique to traj2 since that'll just give us cells from that offshoot cluster (which we already know), because the rest of the cells will be shared with traj1 and 3.
Do you have any other suggestions on how we might see this? Happy to share some plots if any of this is unclear
from slingshot.
Yeah, I'm not sure if I understand your situation, sorry. Would you mind sharing some plots?
from slingshot.
My apologies. Here is our merged trajectory. So there are 2 conditions here, and the trajectory as you can see starts on the bottom left, proceeding through the arc towards the right. Each node here is a different cluster we identified, but they fall into 3 broader cell types (blue/cyan = progenitors, green/yellow = differentiated cells, and red = possible intermediate). What I was attempting to explain above is that the red cluster itself has a significantly different abundance between our conditions, and as you can see, the path to get to the red cluster is the mostly held in common with the other 2 trajectories.
Unless we're doing something wrong, we can quantify the trajectories like you described above using boxplots:
however you can see that condition 2 (the one with more red cells) actually goes down in the 2nd trajectory (the one including the red cluster) compared to condition 1. We expected the opposite trend since again there are more cells along the trajectory (mainly in the red cluster) in condition 2.
Does that make sense?
from slingshot.
Yes, thanks that was very helpful!
The first thing that comes to mind is that, judging by the plot of the top two PCs, it's not clear that there even is a branching point (if it weren't for the colors, it would look like a single lineage). Is the distinction between the various endpoints more obvious in higher dimensions? And relatedly, how many dimensions are you using as input to Slingshot? It looks like the red lineage has a very sharp angle, which can sometimes lead to weird results (specifically, it may be bending back on itself and picking up some of the blue cells as "late" rather than "intermediate"). A similar plot of the smooth curves could potentially help to diagnose this.
from slingshot.
Oooookay well, it turns out we were incorrectly using the $rotation
rather than $x
output of prcomp
so this whole thing is different now- we only get one trajectory instead of 3 and everything looks pretty clean.
So first off I'll thank you for your help and apologize for now pivoting to the following related question: aside from looking just at the proportions of cells in each of these clusters between condition 1 vs condition 2, do the weights or any other information from Slingshot's output tell us how the differentiation trajectory may be different across conditions? Again, we know that the blue/cyan clusters are fewer in number in condition 2, whereas the red cluster has a significantly higher proportion in condition 2. So if there's something here that tells us that condition 2 are "more differentiated" than condition 1, that would be helpful.
We're also looking at temporally-regulated genes and how those may differ but I see that as a somewhat separate question.
Thanks again!
from slingshot.
Ok, no worries! I think the same kinds of ideas apply as from our conversation above, just the differences will be less obvious. The weights are probably no longer relevant, since there is only one lineage, so every cell should have a weight of one along it.
Basically, it sounds like we want to test for a relationship between the binary condition variable and the continuous pseudotime variable. Or alternatively, a difference between the distributions of pseudotime values between the two conditions. Taking this second approach (and apologies for revising my answer from above), something like a Kolmogorov-Smirnov Test might be most appropriate. This should theoretically allow you to detect any difference in the distributions of pseudotimes, not just mean/median shifts. Implementation would be similar to the Wilcox test above, but with the ks.test
function.
from slingshot.
Related Issues (20)
- Is there any way that I could fit the NB-GAM for all the gene in my large dataset? HOT 1
- getCurves() Missing values HOT 8
- Curves and lineages HOT 3
- Error in igraph::shortest_paths(tree, from = l, to = ends, mode = "out", : At core/paths/dijkstra.c:360 : Weight vector must not contain NaN values, Invalid value HOT 3
- Average curve weights and Pseudotime HOT 1
- slingshot analysis on PCA but visualization on UMAP HOT 3
- how to analyze with 3 different treatments HOT 4
- how to calclate the POS to compare other algorithm? HOT 2
- Pulling and plotting Differential pseudotime data across different conditions/metadata slots HOT 1
- `embedCurves` should also map MST into new space HOT 1
- Slingshot followed by CellRank HOT 1
- is slingshot using only the variable genes? HOT 2
- Pick one lineage from many for conditiontest HOT 3
- How do I extract feature_info in my rds file? HOT 4
- What does a curve represent biologically? HOT 7
- Overlapping curve issue - change to code needed? HOT 3
- Removing a lineage from a `PseudotimeOrdering` object? HOT 3
- Error in graph.adjacency.dense: Adjacency matrix should be symmetric to produce an undirected graph. Invalid value HOT 5
- pseudotime in two lineage? HOT 2
- Slingshot error HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from slingshot.