
Comments (4)

DavidKeller commented on September 4, 2024

Hello Aleksandar,

1/ From the paper,

To join the network, a node u must have a contact to an already participating node w. u inserts w into the appropriate k-bucket. u then performs a node lookup for its own node ID. Finally, u refreshes all k-buckets further away than its closest neighbor. During the refreshes, u both populates its own k-buckets and inserts itself into other nodes' k-buckets as necessary.

My understanding of it is that messages have to be sent in order for a node to become known on the network.
Each peer should have a single entry within the routing table, because of the routing_table::push() check for uniqueness.
Have you seen duplicates when debugging?

2/ Yes, when a response is received with a random <request id> to an <old node id>, the response_router should not only check that the response's <request id> has been previously generated, but also ensure the responder matches the <old node id>, and remove the <old node id> from the routing_table if it does not (the <new node id> will already have been inserted by engine::handle_new_message()).
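The check described above could look roughly like this. All type and member names here are hypothetical stand-ins for illustration, not the library's actual API:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for the library's real types.
using node_id = std::string;
using request_id = std::uint64_t;

struct routing_table_stub {
    std::map<node_id, std::string> peers; // id -> "ip:port"
    void remove(const node_id& id) { peers.erase(id); }
    bool contains(const node_id& id) const { return peers.count(id) != 0; }
};

// Pending-request bookkeeping: request id -> node id we sent it to.
struct response_router_stub {
    std::map<request_id, node_id> pending;
    routing_table_stub* table = nullptr;

    // Returns true when the response is accepted.
    bool handle_response(request_id rid, const node_id& responder) {
        auto it = pending.find(rid);
        if (it == pending.end())
            return false; // unknown <request id>: drop the response
        bool matches = (it->second == responder);
        if (!matches)
            table->remove(it->second); // responder changed id: evict stale entry
        pending.erase(it);
        return matches;
    }
};
```

The key point is the second check: a known <request id> alone is not enough; the responder's id must match the one the request was addressed to, and a mismatch evicts the stale entry.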

3/ My understanding of the paper is that peers are lazily purged. When a lookup is performed, peers are contacted, and peers that fail to respond in a timely manner are purged at that moment. This is not implemented ATM, but that isn't a big deal.
See this TODO.

I believe the sweet spot would be lookup_task::flag_candidate_as_invalid(). A reference to the routing_table must be added as a lookup_task member, and routing_table::remove() called from there.
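A minimal sketch of that idea, with stand-in types (the real lookup_task and routing_table have more to them than this):

```cpp
#include <cassert>
#include <set>
#include <string>

using node_id = std::string;

// Stand-in for the library's routing_table.
struct routing_table {
    std::set<node_id> peers;
    void remove(const node_id& id) { peers.erase(id); }
    bool contains(const node_id& id) const { return peers.count(id) != 0; }
};

// Sketch of a lookup task holding a reference to the routing table,
// so unresponsive candidates can be purged lazily during the lookup.
struct lookup_task {
    routing_table& table_;
    explicit lookup_task(routing_table& t) : table_(t) {}

    void flag_candidate_as_invalid(const node_id& id) {
        // Lazy purge: the peer failed to answer in time, drop it now.
        table_.remove(id);
    }
};
```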

from kademlia.

aleks-f commented on September 4, 2024

Hi David, thanks for answering.

Please see my replies below.

1/ ... Have you seen duplicates when debugging?

Yes. Since the IDs sent are not the real peer IDs, and only IDs are checked for uniqueness (not IP:PORT entries), the uniqueness check finds no duplicates during the discovery process (then again, what sense would it make to send multiple real peer IDs anyway?). So bootstrap ends up with a bunch of peers with different (non-existent) IDs but the same IP:PORT. I could be missing something important, but I can't make sense of the discovery process logic.
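To make the hole concrete: a push that checks only the id lets arbitrarily many fake ids through for a single endpoint. A sketch of a stricter check that also rejects duplicate endpoints (names are illustrative, not the library's actual routing_table::push()):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using node_id = std::string;
using endpoint = std::string; // "ip:port"

struct routing_table {
    std::vector<std::pair<node_id, endpoint>> peers;

    // Hypothetical push rejecting a peer when either the id OR the
    // endpoint is already known, closing the duplicate-endpoint hole.
    bool push(const node_id& id, const endpoint& ep) {
        for (const auto& p : peers)
            if (p.first == id || p.second == ep)
                return false;
        peers.emplace_back(id, ep);
        return true;
    }
};
```

With only the id compared, the second push below would succeed and the table would accumulate one entry per fabricated id, all pointing at the same IP:PORT.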

2/ Yes, when a response is received with a random <request id> to an <old node id>, the response_router should not only check that the response's <request id> has been previously generated, but also ensure the responder matches the <old node id>, and remove the <old node id> from the routing_table if it does not (the <new node id> will already have been inserted by engine::handle_new_message()).

The only thing I see is a check for a non-existent message ID. Duplicate non-existent peer IDs remain in the bootstrap routing table and keep propagating back to the peer that faked them (and to other peers): every save/load adds duplicate peers to peers' routing tables.

3/ My understanding of the paper is that peers are lazily purged. When a lookup is performed, peers are contacted, and peers that fail to respond in a timely manner are purged at that moment. This is not implemented ATM, but that isn't a big deal.

OK, makes sense.

I believe the sweet spot would be lookup_task::flag_candidate_as_invalid()

Speaking generally, I think I understand the purpose of this, but before we go there, I'd like to first clarify and properly understand the neighbor discovery on boot.


DavidKeller commented on September 4, 2024

1/ There are two kinds of IDs involved:

  • Peer IDs.
    • Stored within the routing table.
  • Key IDs
    • Matched against stored Peer IDs to figure out which peer to contact.

Each time a peer p contacts a node n (request & response), n stores p's Peer ID within its routing table.

The purpose of refresh_id[ i ] = ! refresh_id[ i ]; is to generate, at each iteration, Key IDs further and further away from our Node ID, so as to contact nodes that belong to each different bucket (and fill all of them).
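The bit flip can be illustrated with small IDs: flipping bit i of our own ID yields a key whose XOR distance from us is exactly 2^i, i.e. a key that falls into bucket i, which is why each iteration targets a different bucket. A minimal sketch using 8-bit IDs (instead of the library's real ID width):

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Illustrative 8-bit IDs.
using kad_id = std::bitset<8>;

// Flipping bit i of our own id produces a refresh key at XOR
// distance 2^i, so a lookup for it exercises bucket i.
kad_id refresh_key(kad_id self, std::size_t i) {
    self.flip(i);
    return self;
}
```

Iterating i from 0 upward therefore walks through every bucket, filling all of them during the refresh phase of the join.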

2/ Yes, this is something I should implement. That's clearly a bug, because it effectively makes the library leak memory.

*/ On a side note, I was wondering if rewriting the library to play with C++20 coroutines would be fun and could simplify the code. Would it be a problem for you to require a C++20 compiler?


aleks-f commented on September 4, 2024

The purpose of refresh_id[ i ] = ! refresh_id[ i ]; is to generate, at each iteration, Key IDs further and further away from our Node ID, so as to contact nodes that belong to each different bucket (and fill all of them).

OK, I understand this; I was mixing up peer and key IDs.

2/ Yes, this is something I should implement. That's clearly a bug, because it effectively makes the library leak memory.

Implementing this is simple. Implementing it efficiently, not so much without <endpoint, bucket_index> caching. Proactively removing existing IP:PORT entries makes sense, I think, because if a peer with an existing IP:PORT shows up, the previous one must be gone. To do it efficiently with the existing structure, pairs of <endpoint, bucket index> should be cached; otherwise, one has to traverse the entire routing table to find duplicates (which most of the time won't exist). Which brings me to another question: why is the routing table implemented around vector<list<pair<id, endpoint>>>? Shouldn't the routing table be a binary tree?
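The proposed cache could be sketched like this (all names are illustrative, not the library's actual structure): a side map from endpoint to bucket index lets push evict a stale entry for a reappearing IP:PORT without scanning every bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>
#include <vector>

using node_id = std::string;
using endpoint = std::string; // "ip:port"

struct routing_table {
    // Existing layout: one list of <id, endpoint> pairs per bucket.
    std::vector<std::list<std::pair<node_id, endpoint>>> buckets;
    // Proposed cache: endpoint -> bucket index, for O(log n) eviction.
    std::map<endpoint, std::size_t> endpoint_to_bucket;

    explicit routing_table(std::size_t bucket_count) : buckets(bucket_count) {}

    void push(const node_id& id, const endpoint& ep, std::size_t bucket) {
        auto cached = endpoint_to_bucket.find(ep);
        if (cached != endpoint_to_bucket.end()) {
            // Same IP:PORT seen before: the previous peer must be gone,
            // so evict it from the bucket the cache points at.
            buckets[cached->second].remove_if(
                [&](const std::pair<node_id, endpoint>& p) {
                    return p.second == ep;
                });
        }
        buckets[bucket].emplace_back(id, ep);
        endpoint_to_bucket[ep] = bucket;
    }

    std::size_t size() const {
        std::size_t n = 0;
        for (const auto& b : buckets) n += b.size();
        return n;
    }
};
```

Without the cache, the eviction would have to walk every bucket looking for the endpoint; with it, only the one cached bucket is touched.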

*/ On a side note, I was wondering if rewriting the library to play with C++20 coroutines would be fun and could simplify the code. Would it be a problem for you to require a C++20 compiler?

Coroutines would definitely make sense. However, at this point I'm working on an entirely divergent port of your library, completely decoupled from Boost and Asio, using POCO and gtest instead. In case you want to join that effort, let me know, and we can explore the coroutines possibility there (but it really depends on the folks requesting this port and how they feel about coroutines, and the compiler support thereof).

