Git Product home page Git Product logo

Comments (9)

kstreet13 avatar kstreet13 commented on July 21, 2024

Hi @jeremymsimon

This is a very interesting question and although I don't currently have one, I'm beginning to think I should put together a vignette to summarize my thoughts on the topic. I've performed this sort of analysis before by combining the two conditions and only producing one set of trajectories (which we could do because the two conditions shared a common starting cell type). This way, we could directly compare, e.g. "Condition A along Lineage 1" to "Condition B along Lineage 1" and know that these things were comparable. This could allow you to detect certain topological differences (if Lineage 1 ends in a cell fate that cannot be reached by cells in Condition A, then we would see that all cells after a certain pseudotime were from Condition B). And if you wanted to assign a p-value to this, you could try something like a rank sum test (comparing the ranks of the Lineage 1 pseudotime values in Condition A vs. Condition B) or logistic regression (where the predictor is pseudotime and the outcome is a binary indicator of condition).

If that paradigm doesn't work for your data, you could also estimate the percentages of cells that fall along each lineage for a given dataset. The slingCurveWeights matrix provides values for weighting cells in downstream analysis of individual lineages, but if you divide this by the rowSums, you could interpret it as probabilistic assignment of cells to lineages and get an overall percentage of cells assigned to each lineage. However, this assumes that you can directly map the lineages in one condition to the lineages in another, in order to compare their percentages.

Let me know if that doesn't make sense or if you have any other thoughts!

from slingshot.

jeremymsimon avatar jeremymsimon commented on July 21, 2024

Thanks for the reply, this is a helpful place to start. Can you clarify how we would quantify "Condition A/B along Lineage 1" using the data stored in these various objects? It seems pretty complex. I think in our case, we likely can combine both conditions and call just one trajectory, but it will only make sense if we can do some comparisons of our conditions from there.

Thanks!

from slingshot.

kstreet13 avatar kstreet13 commented on July 21, 2024

Of course, sorry that was a bit vague. The idea is that you would use all the cells to construct the lineages and then compare the conditions based on their distributions of pseudotime values. For the example I described above, I'm picturing something like this, where the two conditions share a starting point and have one similar lineage and another that is only seen in Condition B:

image

image

Then for each lineage, you could do some sort of test for independence between pseudotime and condition (I think a rank sum test makes sense here, but it isn't perfect because it doesn't incorporate the weights assigned by Slingshot).

# test lineage 1
wilcox.test(slingPseudotime(sds)[condition=='A',1], slingPseudotime(sds)[condition=='B',1])
# test lineage 2
wilcox.test(slingPseudotime(sds)[condition=='A',2], slingPseudotime(sds)[condition=='B',2])

There are definitely other ways you could approach this last part, but I think this makes sense if your goal is to detect large, topological differences between the conditions.

from slingshot.

jeremymsimon avatar jeremymsimon commented on July 21, 2024

Hmm so maybe this is a feature unique to our dataset, but our trajectories seemingly make this tricky. Essentially we have 3 trajectories, and the first 4-5 points on the curves are shared among all 3 (ie it's a late fork). The one we care about most (traj2) has just one unique-ish offshoot as the endpoint. By making a boxplot of the 3 like you did above, the values in slingPseudotime(x) don't distinguish our two conditions, and actually, show the opposite trend. We also can't just find cells unique to traj2 since that'll just give us cells from that offshoot cluster (which we already know), because the rest of the cells will be shared with traj1 and 3.

Do you have any other suggestions on how we might see this? Happy to share some plots if any of this is unclear

from slingshot.

kstreet13 avatar kstreet13 commented on July 21, 2024

Yeah, I'm not sure if I understand your situation, sorry. Would you mind sharing some plots?

from slingshot.

jeremymsimon avatar jeremymsimon commented on July 21, 2024

My apologies. Here is our merged trajectory. So there are 2 conditions here, and the trajectory as you can see starts on the bottom left, proceeding through the arc towards the right. Each node here is a different cluster we identified, but they fall into 3 broader cell types (blue/cyan = progenitors, green/yellow = differentiated cells, and red = possible intermediate). What I was attempting to explain above is that the red cluster itself has a significantly different abundance between our conditions, and as you can see, the path to get to the red cluster is the mostly held in common with the other 2 trajectories.

2019-07-29

Unless we're doing something wrong, we can quantify the trajectories like you described above using boxplots:

2019-07-29-2

however you can see that condition 2 (the one with more red cells) actually goes down in the 2nd trajectory (the one including the red cluster) compared to condition 1. We expected the opposite trend since again there are more cells along the trajectory (mainly in the red cluster) in condition 2.

Does that make sense?

from slingshot.

kstreet13 avatar kstreet13 commented on July 21, 2024

Yes, thanks that was very helpful!

The first thing that comes to mind is that, judging by the plot of the top two PCs, it's not clear that there even is a branching point (if it weren't for the colors, it would look like a single lineage). Is the distinction between the various endpoints more obvious in higher dimensions? And relatedly, how many dimensions are you using as input to Slingshot? It looks like the red lineage has a very sharp angle, which can sometimes lead to weird results (specifically, it may be bending back on itself and picking up some of the blue cells as "late" rather than "intermediate"). A similar plot of the smooth curves could potentially help to diagnose this.

from slingshot.

jeremymsimon avatar jeremymsimon commented on July 21, 2024

Oooookay well, it turns out we were incorrectly using the $rotation rather than $x output of prcomp so this whole thing is different now- we only get one trajectory instead of 3 and everything looks pretty clean.

Screen Shot 2019-07-30 at 3 32 00 PM

So first off I'll thank you for your help and apologize for now pivoting to the following related question: aside from looking just at the proportions of cells in each of these clusters between condition 1 vs condition 2, do the weights or any other information from Slingshot's output tell us how the differentiation trajectory may be different across conditions? Again, we know that the blue/cyan clusters are fewer in number in condition 2, whereas the red cluster has a significantly higher proportion in condition 2. So if there's something here that tells us that condition 2 are "more differentiated" than condition 1, that would be helpful.

We're also looking at temporally-regulated genes and how those may differ but I see that as a somewhat separate question.

Thanks again!

from slingshot.

kstreet13 avatar kstreet13 commented on July 21, 2024

Ok, no worries! I think the same kinds of ideas apply as from our conversation above, just the differences will be less obvious. The weights are probably no longer relevant, since there is only one lineage, so every cell should have a weight of one along it.

Basically, it sounds like we want to test for a relationship between the binary condition variable and the continuous pseudotime variable. Or alternatively, a difference between the distributions of pseudotime values between the two conditions. Taking this second approach (and apologies for revising my answer from above), something like a Kolmogorov-Smirnov Test might be most appropriate. This should theoretically allow you to detect any difference in the distributions of pseudotimes, not just mean/median shifts. Implementation would be similar to the Wilcox test above, but with the ks.test function.

from slingshot.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.