Comments (13)

zktuong avatar zktuong commented on September 17, 2024

Hi Sara,

  1. This is an excerpt of what I wrote for the Methods section of https://www.nature.com/articles/s41591-021-01329-2#Sec8

B cell clone/clonotype network
Single-cell BCR networks were constructed using adjacency matrices computed from pairwise Levenshtein distance of the full amino acid sequence alignment for BCR(s) contained in every pair of cells. Construction of the Levenshtein distance matrices were performed separately for heavy-chain and light-chain contigs, and the sum of the total edit distance across all layers/matrices was used as the final adjacency matrix. To construct the BCR neighborhood graph, a minimum-spanning tree was constructed on the adjacency matrix for each clone/clonotype, creating a simple graph with edges indicating the shortest edit distance between a B cell and its nearest neighbor. Cells with identical BCRs, that is, cells with a total pairwise edit distance of zero, were then connected to the graph to recover edges trimmed off during the minimum-spanning-tree construction step. Fruchterman–Reingold graph layout was generated using a modified method to prevent singletons from flying out to infinity in ‘networkx’ (v2.5). Visualization of the resulting single-cell BCR network was achieved via transfer of the graph to relevant ‘anndata’ slots, allowing for access to plotting tools in scanpy.
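The distance step in the excerpt can be illustrated with a minimal sketch. This is not dandelion's own code: the function names and the toy sequences are hypothetical, and it assumes one heavy and one light chain per cell.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def total_distance_matrix(cells):
    """Sum heavy- and light-chain edit distances for every pair of cells."""
    n = len(cells)
    dist = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(
                levenshtein(cells[i][chain], cells[j][chain])
                for chain in ("heavy", "light")
            )
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical toy sequences, not real BCRs.
cells = [
    {"heavy": "CARDYW", "light": "CQQSYT"},
    {"heavy": "CARDYW", "light": "CQQSYT"},
    {"heavy": "CARDFW", "light": "CQQSYS"},
]
dist = total_distance_matrix(cells)
```

Here the first two cells end up at a total distance of zero (identical BCRs), while the third differs by one substitution in each chain.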

As this relies on cell_id in the airr.tsv (or dandelion.data.cell_id/dandelion.metadata.index) matching anndata.obs_names, a connection is made only if the contigs are present in the corresponding cells. I suggest you check through your cell barcodes (cell ids, contig ids, sequence ids etc.) and make sure they are named correctly. Perhaps try it with just one sample first, where minimal formatting of the barcodes is required, and see if it works.
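A barcode check along these lines could look like the sketch below. It uses plain Python lists so it runs on its own; in practice the two inputs would come from the cell_id column and anndata.obs_names. The helper name, the barcodes, and the "sample1_" prefix are all made up for illustration.

```python
def barcode_overlap(vdj_cell_ids, adata_obs_names):
    """Report which barcodes are shared and which appear on one side only."""
    vdj, gex = set(vdj_cell_ids), set(adata_obs_names)
    return {
        "shared": sorted(vdj & gex),
        "vdj_only": sorted(vdj - gex),
        "gex_only": sorted(gex - vdj),
    }


# Hypothetical barcodes: the expression data carries a sample prefix
# that the VDJ data lacks, so nothing matches at first.
vdj_cell_ids = ["AAACCTGAGAAACCAT-1", "AAACCTGAGAAACCGC-1"]
adata_obs_names = ["sample1_AAACCTGAGAAACCAT-1", "sample1_AAACCTGAGAAACCGC-1"]

before = barcode_overlap(vdj_cell_ids, adata_obs_names)

# Apply the same (assumed) prefix to the VDJ barcodes and check again.
fixed_ids = ["sample1_" + cell_id for cell_id in vdj_cell_ids]
after = barcode_overlap(fixed_ids, adata_obs_names)
```

An empty "shared" list, as in `before`, is a strong hint that the barcodes are formatted differently on the two sides.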

  2. There is no way to know which cell is closest to the germline from the dandelion visualization; you would have to run lineage trees separately (as per Immcantation's suite, or with some other method).

from dandelion.

saramoein372 avatar saramoein372 commented on September 17, 2024

Thanks Kelvin.
I have another question about the BCR clustering clone_id. There are four parts: {A}_{B}_{C}_{D}.

So I did the clustering and I want to know how each value is assigned to each part of the clone_id.
For example, for samples with the same {A} = 11, I have:

11_10_4_47
11_10_4_47
11_10_4_47
11_10_4_47

My question is: how is 11 calculated? And the other parts of the clone_id? How are the numbers calculated?

Thanks,
Sara

zktuong avatar zktuong commented on September 17, 2024

My question is how 11 is calculated? And for other parts of the clone_id? How the numbers are calculated?

Hi Sara, this is already described in detail in the documentation/tutorial:

https://sc-dandelion.readthedocs.io/en/latest/notebooks/3_dandelion_findingclones-10x_data.html

saramoein372 avatar saramoein372 commented on September 17, 2024

Hi Kelvin. I have already read the whole tutorial. But it is not clear to me: if a clone_id is 11_10_4_47,
then how is "11" CALCULATED? I know the meaning of each sub_id. But how is it calculated?

saramoein372 avatar saramoein372 commented on September 17, 2024

In other words, how should I interpret the "11"? Or the other sub_ids?

zktuong avatar zktuong commented on September 17, 2024

It’s just a random number; you don’t have to overinterpret it. Just know that if a cell/contig has 11, it shares the same sub-id as other contigs that have 11.

saramoein372 avatar saramoein372 commented on September 17, 2024

Oh, okay. That was good to know. Thank you so much.

saramoein372 avatar saramoein372 commented on September 17, 2024

Kelvin,

Thank you again.

I have two other questions after reading the tutorial and the other references you provided:
1- Is it correct to say that, in the process of generating the BCR network, ANY nodes whose CDR3 junction sequences have at least 85% similarity with other sequences will have an edge generated between them?
From my understanding, first all "inter cluster edges" are generated, and then "intra cluster edges" are generated IF ANY sequence in a cluster has more than 85% similarity with ANY of the sequences in other clusters.

Is this correct?

2- Also, in the visualization of the network, each node probably represents more than one cell. Is there any way to make the node size larger according to the number of cells it includes?

Thank you so much.

Sara

zktuong avatar zktuong commented on September 17, 2024

Hi Sara,

The networks are constructed only within each clone/cluster, hence there are no inter-cluster edges.
This is controlled by the ‘clone_id’ column - so for example, a single network will be constructed between cells that are tagged as clone ‘1_1_1_1’ and a separate network is constructed for clone ‘1_2_1_2’.

The construction of the edges is as described above:
A) For a given clone, a minimum spanning tree is constructed and only these edges are kept.
B) If two BCRs have 100% identity, additional edges are added to the network. The 85% similarity is only for clone definition.
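Steps A and B can be sketched on a toy distance matrix. This is only an illustration, not dandelion's implementation: it uses a plain Prim's algorithm in pure Python, and the function names and distances are hypothetical.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric matrix; returns (i, j) edges with i < j."""
    n = len(dist)
    in_tree = {0}
    edges = set()
    while len(in_tree) < n:
        # Pick the cheapest edge leaving the current tree.
        _, i, j = min(
            (dist[i][j], i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.add((min(i, j), max(i, j)))
        in_tree.add(j)
    return edges


def clone_edges(dist):
    """Step A: keep only MST edges. Step B: add back edges between identical BCRs."""
    edges = minimum_spanning_tree(dist)
    n = len(dist)
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] == 0:
                edges.add((i, j))
    return edges


# Hypothetical total edit distances for four cells in ONE clone;
# cells 0, 1 and 2 carry identical BCRs.
dist = [
    [0, 0, 0, 3],
    [0, 0, 0, 3],
    [0, 0, 0, 3],
    [3, 3, 3, 0],
]
mst = minimum_spanning_tree(dist)
edges = clone_edges(dist)
```

In this toy case the tree alone drops the edge between cells 1 and 2, and step B restores it because their distance is zero. The function would be run once per clone_id, so no edge can link two different clones.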

The only time you would see edges between ‘1_1_1_1’ and ‘1_2_1_2’ is if a cell contains more than one pair of contigs, i.e. the cell’s clone_id is ‘1_1_1_1 | 1_2_1_2’, because there are two possible combinations.
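To make the multi-clone case concrete, here is a small sketch of grouping cells by clone_id when a cell carries more than one id. It assumes the ids are joined with "|" as in the example above; the helper name and cell names are hypothetical.

```python
from collections import defaultdict


def cells_per_clone(clone_ids):
    """Map each single clone id to the cells tagged with it."""
    groups = defaultdict(list)
    for cell, clone_id in clone_ids.items():
        # A cell with several contig pairs lists every clone it belongs to.
        for single in clone_id.split("|"):
            groups[single.strip()].append(cell)
    return dict(groups)


# Hypothetical cells; cell_d has two possible heavy/light combinations.
clone_ids = {
    "cell_a": "1_1_1_1",
    "cell_b": "1_1_1_1",
    "cell_c": "1_2_1_2",
    "cell_d": "1_1_1_1 | 1_2_1_2",
}
groups = cells_per_clone(clone_ids)
```

Because cell_d appears in both groups, it is the only node through which the two per-clone networks can look connected in the plot.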

Hence, to partly answer your 2nd question, each node is a cell, not a contig. There are no immediate plans to construct a version of the plot that you described, but it is potentially possible through scirpy. However, there are a couple of things I will need to implement for it to work properly. See scverse/scirpy#286

saramoein372 avatar saramoein372 commented on September 17, 2024

zktuong avatar zktuong commented on September 17, 2024

Can you show me what the plot looks like, and the dataframe? It’s difficult for me to imagine how that is possible unless they have the same clone id.

saramoein372 avatar saramoein372 commented on September 17, 2024

zktuong avatar zktuong commented on September 17, 2024

Sure. You can also inspect it in:

import dandelion as ddl
vdj = ddl.read_h5('dandelion_results.h5')
vdj.edges

As I'm unable to see what's wrong with your plot/data and you cannot provide me with the requisite info that I asked for, I will close this issue now.
