Hi Kelvin, I have two questions about the way the network is generated:


Network generation about dandelion HOT 13 CLOSED

zktuong commented on September 17, 2024

Network generation


Comments (13)

zktuong commented on September 17, 2024

Hi Sara,

This is an excerpt of what I wrote for the Methods section of https://www.nature.com/articles/s41591-021-01329-2#Sec8

B cell clone/clonotype network
Single-cell BCR networks were constructed using adjacency matrices computed from pairwise Levenshtein distance of the full amino acid sequence alignment for BCR(s) contained in every pair of cells. Construction of the Levenshtein distance matrices was performed separately for heavy-chain and light-chain contigs, and the sum of the total edit distance across all layers/matrices was used as the final adjacency matrix. To construct the BCR neighborhood graph, a minimum-spanning tree was constructed on the adjacency matrix for each clone/clonotype, creating a simple graph with edges indicating the shortest edit distance between a B cell and its nearest neighbor. Cells with identical BCRs, that is, cells with a total pairwise edit distance of zero, were then connected to the graph to recover edges trimmed off during the minimum-spanning-tree construction step. Fruchterman–Reingold graph layout was generated using a modified method to prevent singletons from flying out to infinity in ‘networkx’ (v2.5). Visualization of the resulting single-cell BCR network was achieved via transfer of the graph to relevant ‘anndata’ slots, allowing for access to plotting tools in scanpy.
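The procedure in the excerpt can be sketched in plain Python for a single clone. This is only an illustration under stated assumptions, not dandelion's implementation: the toy sequences, the cell names, and the helpers levenshtein and clone_edges are made up here, unaligned strings stand in for the sequence alignment, and a hand-written Kruskal's algorithm stands in for whatever minimum-spanning-tree routine the package actually calls.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (classic dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def clone_edges(cells: dict) -> set:
    """Edges for one clone: MST edges plus all zero-distance pairs.

    `cells` maps a cell name to a (heavy, light) pair of amino acid strings.
    """
    # Total distance = heavy-chain distance + light-chain distance.
    dist = {
        (u, v): levenshtein(cells[u][0], cells[v][0])
        + levenshtein(cells[u][1], cells[v][1])
        for u, v in combinations(sorted(cells), 2)
    }

    # Kruskal's algorithm with a small union-find (assumed stand-in for the MST step).
    parent = {c: c for c in cells}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = set()
    for (u, v), _ in sorted(dist.items(), key=lambda kv: kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            edges.add((u, v))

    # Re-attach pairs of cells with identical BCRs (total distance of zero).
    edges |= {pair for pair, d in dist.items() if d == 0}
    return edges


# Hypothetical toy clone: cell_a, cell_b and cell_c carry identical BCRs.
toy_clone = {
    "cell_a": ("CARDYW", "CQQSYT"),
    "cell_b": ("CARDYW", "CQQSYT"),
    "cell_c": ("CARDYW", "CQQSYT"),
    "cell_d": ("CARDFW", "CQQSYT"),
}
edges = clone_edges(toy_clone)
print(sorted(edges))
```

In this toy clone the spanning tree alone needs only three edges for four cells; the zero-distance step adds back the remaining edge between the identical cells, which is the "recover edges trimmed off" part of the excerpt.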

As this is reliant on cell_id in the airr.tsv (or, equivalently, dandelion.data.cell_id/dandelion.metadata.index) matching anndata.obs_names, there would be a connection if the contigs were present in the corresponding cells. I would suggest that you check through your cell barcodes (cell ids, contig ids, sequence ids etc.) and ensure that they are named correctly. Perhaps try it for just one sample first, where minimal formatting of the barcodes is required, and see if it works.
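A quick way to run that check is to compare the two sets of barcodes directly. The sketch below uses made-up barcodes and a hypothetical helper, barcode_overlap; in practice the first list would come from the cell_id column of the airr.tsv and the second from anndata.obs_names.

```python
def barcode_overlap(vdj_cell_ids, gex_obs_names):
    """Summarise which VDJ cell barcodes are also present in the expression data."""
    vdj, gex = set(vdj_cell_ids), set(gex_obs_names)
    return {
        "shared": sorted(vdj & gex),
        "vdj_only": sorted(vdj - gex),
        "gex_only": sorted(gex - vdj),
    }


# Hypothetical barcodes: the second VDJ barcode lacks the sample prefix.
vdj_cell_ids = ["sampleA_AAACCTGAGAAACCAT", "AAACCTGAGAAACCGC"]
gex_obs_names = ["sampleA_AAACCTGAGAAACCAT", "sampleA_AAACCTGAGAAACCGC"]

report = barcode_overlap(vdj_cell_ids, gex_obs_names)
for key, values in report.items():
    print(key, len(values), values)
```

A large "vdj_only" group with barcodes that differ from "gex_only" entries only by a prefix or suffix usually points to a naming mismatch rather than missing cells.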

There is no way to know which cell is closest to the germline from the dandelion visualization; you would have to run lineage trees separately (as per Immcantation's suite, or with some other method).


saramoein372 commented on September 17, 2024

Thanks Kelvin.
I have another question about the BCR clustering clone_id. There are four parts: {A}_{B}_{C}_{D}.

I did the clustering and I want to know how each value is assigned to each part of the clone_id.
For example, I have four samples with the same {A} = 11:

11_10_4_47
11_10_4_47
11_10_4_47
11_10_4_47

My question is: how is 11 calculated? And the other parts of the clone_id? How are the numbers calculated?

Thanks,
Sara


zktuong commented on September 17, 2024

My question is: how is 11 calculated? And the other parts of the clone_id? How are the numbers calculated?

Hi Sara, this is already described in detail in the documentation/tutorial:

https://sc-dandelion.readthedocs.io/en/latest/notebooks/3_dandelion_findingclones-10x_data.html


saramoein372 commented on September 17, 2024

Hi Kelvin. I have already read the whole tutorial, but it is still not clear to me: if a clone_id is 11_10_4_47,
how is "11" CALCULATED? I know the meaning of each sub_id, but how is it calculated?


saramoein372 commented on September 17, 2024

In other words, how should I interpret the "11", or the other sub_ids?


zktuong commented on September 17, 2024

It’s just a random number - you don’t have to overinterpret it. Just know that if a cell/contig has 11, it shares the same sub-id as the other contigs that have 11.
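In other words, the sub-ids behave like group labels: only whether two of them are equal carries information. A minimal sketch with made-up contig names and clone_id values (nothing here is taken from a real dataset):

```python
from collections import defaultdict

# Hypothetical contigs and their clone_id values.
clone_ids = {
    "contig_1": "11_10_4_47",
    "contig_2": "11_10_4_47",
    "contig_3": "11_3_1_12",
    "contig_4": "5_2_1_8",
}

# Group contigs by the first sub-id; the number itself is only a label.
by_first_part = defaultdict(list)
for contig, clone_id in clone_ids.items():
    first_part = clone_id.split("_")[0]
    by_first_part[first_part].append(contig)

print(dict(by_first_part))
```

Relabelling every "11" as any other unused number would leave the grouping, and therefore the interpretation, unchanged.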


saramoein372 commented on September 17, 2024

Oh, okay. That was good to know. Thank you so much.


saramoein372 commented on September 17, 2024

Kelvin,

Thank you again.

I have two other questions after reading the tutorial, and other references you provided:
1- Is it correct to say that, in the process of generating the BCR network, ANY nodes whose CDR3 junction sequences have at least 85% similarity with other sequences will have an edge generated between them?
From my understanding, first all "inter cluster edges" are generated, and then "intra cluster edges" are generated IF ANY sequence in a cluster has more than 85% similarity with ANY of the sequences in other clusters.

Is this correct?

2- Also, in the visualization of the network, each node probably represents more than one cell. Is there any way to make the node size larger according to the number of cells it includes?

Thank you so much.

Sara


zktuong commented on September 17, 2024

Hi Sara,

The networks are constructed only within each clone/cluster, hence there are no inter-cluster edges.
This is controlled by the ‘clone_id’ column - so, for example, a single network will be constructed between cells that are tagged as clone ‘1_1_1_1’ and a separate network is constructed for clone ‘1_2_1_2’.

The construction of the edges is as described above:
A) For a given clone, a minimum spanning tree is constructed and only these edges are kept.
B) If two BCRs have 100% identity, additional edges are added to the network between them. The 85% similarity threshold is only used for clone definition.

The only time you would see edges between ‘1_1_1_1’ and ‘1_2_1_2’ is if a cell contains more than one pair of contigs, i.e. the cell’s clone_id is ‘1_1_1_1 | 1_2_1_2’, because there are two possible combinations.
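The grouping described here can be illustrated with a small sketch. The cell names and clone_id values are hypothetical, and splitting on "|" simply follows the ‘1_1_1_1 | 1_2_1_2’ form shown above rather than any documented dandelion API.

```python
from collections import defaultdict

# Hypothetical cells; cell_3 carries two contig pairs and so has two clone ids.
cell_clone_ids = {
    "cell_1": "1_1_1_1",
    "cell_2": "1_1_1_1",
    "cell_3": "1_1_1_1 | 1_2_1_2",
    "cell_4": "1_2_1_2",
}

# Collect the members of each clone; a separate network is built per clone.
members = defaultdict(list)
for cell, clone_field in cell_clone_ids.items():
    for clone_id in clone_field.split("|"):
        members[clone_id.strip()].append(cell)

# Cells that belong to more than one clone are the only possible bridges.
bridges = [cell for cell, field in cell_clone_ids.items() if "|" in field]

print(dict(members))
print(bridges)
```

Here cell_3 appears in both member lists, so it is the only node through which the two otherwise separate networks can touch.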

Hence, to partly answer your 2nd question, each node is a cell, not a contig. There are no immediate plans to construct a version of the plot that you described, but it’s potentially possible through scirpy. However, there are a couple of things I will need to implement for it to work properly. See scverse/scirpy#286


saramoein372 commented on September 17, 2024

Thank you Kelvin. Related to my first question: in the network generated by dandelion, I can see a part of the network where some of the nodes have no clone_id. How is this possible? I am confused, because I don't know how nodes and edges can be connected when there is no clone_id for them.

Thanks,
Sara



zktuong commented on September 17, 2024

Can you show me what the plot looks like, and the dataframe? It’s difficult for me to imagine how that is possible unless they have the same clone id.


saramoein372 commented on September 17, 2024

Thanks Kelvin. Unfortunately, the data is confidential and I cannot share it. But I think I can ask my question in a different way. When I read the dandelion object "dandelion_results.h5", I get the following keys:

['/data', '/edges', '/metadata', '/metadata/meta/values_block_0/meta', '/graph/graph_0', '/graph/graph_1', '/distance/VDJ_1', '/distance/VDJ_2', '/distance/VJ_1', '/distance/VJ_2']

When I read:

f = pd.read_hdf('/Users/saramoein/Documents/BCR/dandelion_results.h5', key='/edges')

is it correct to say that f defines the network edges?

Thanks again Kelvin.



zktuong commented on September 17, 2024

Sure. You can also inspect it with:

import dandelion as ddl
vdj = ddl.read_h5('dandelion_results.h5')
vdj.edges
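Once the edges are loaded, one way to chase down nodes without a clone_id is to compare each edge's endpoints against the per-cell clone_id values. The layout of vdj.edges is not shown in this thread, so the sketch below uses plain Python structures with made-up cells; mapping the real table onto pairs of cell names is left as an assumption, and cells with several clone ids joined by "|" are not handled.

```python
# Hypothetical edge list and per-cell clone_id values (None = no clone assigned).
edges = [("cell_1", "cell_2"), ("cell_2", "cell_3"), ("cell_4", "cell_5")]
clone_id = {
    "cell_1": "1_1_1_1",
    "cell_2": "1_1_1_1",
    "cell_3": "1_1_1_1",
    "cell_4": None,
    "cell_5": "2_1_1_1",
}

# Flag edges with an endpoint lacking a clone_id, or joining different clones.
suspicious = [
    (u, v)
    for u, v in edges
    if clone_id.get(u) is None
    or clone_id.get(v) is None
    or clone_id[u] != clone_id[v]
]
print(suspicious)
```

Any edge flagged this way would be worth checking against the barcodes in the metadata, since networks are meant to be built only within a clone.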

As I'm unable to see what's wrong with your plot/data and you cannot provide me with the requisite info that I asked for, I will close this issue now.
