
Comments (4)

DavidKeller commented on September 4, 2024

Hello Aleksandar,

1/ From the paper,

To join the network, a node u must have a contact to an already participating node w. u inserts w into the appropriate k-bucket. u then performs a node lookup for its own node ID. Finally, u refreshes all k-buckets further away than its closest neighbor. During the refreshes, u both populates its own k-buckets and inserts itself into other nodes' k-buckets as necessary.

My understanding of it is that messages have to be sent in order for a node to become known on the network.
Each peer should have a single entry within the routing table, because of the routing_table::push() check for uniqueness.
Have you seen duplicates when debugging?

2/ Yes, when a response is received with a random <request id> to an <old node id>, the response_router should not only check that the response's <request id> has been previously generated, but also ensure the responder matches the <old node id>, and remove the <old node id> from the routing_table if it does not (the <new node id> will already have been inserted by engine::handle_new_message()).
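The check described above could look roughly like this. All type and member names here are hypothetical stand-ins for illustration, not the library's actual API:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for the library's real types.
using node_id = std::string;
using request_id = std::uint64_t;

struct routing_table_stub {
    std::map<node_id, std::string> peers; // id -> "ip:port"
    void remove(const node_id& id) { peers.erase(id); }
    bool contains(const node_id& id) const { return peers.count(id) != 0; }
};

// Pending-request bookkeeping: request id -> node id we sent it to.
struct response_router_stub {
    std::map<request_id, node_id> pending;
    routing_table_stub* table = nullptr;

    // Returns true when the response is accepted.
    bool handle_response(request_id rid, const node_id& responder) {
        auto it = pending.find(rid);
        if (it == pending.end())
            return false; // unknown <request id>: drop the response
        bool matches = (it->second == responder);
        if (!matches)
            table->remove(it->second); // responder changed id: evict stale entry
        pending.erase(it);
        return matches;
    }
};
```

The key point is the second check: a known <request id> alone is not enough; the responder's id must match the one the request was addressed to, and a mismatch evicts the stale entry.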

3/ My understanding of the paper is that peers are lazily purged. When a lookup is performed, peers are contacted, and peers that fail to respond in a timely manner are purged at that moment. This is not implemented ATM, but that isn't a big deal.
See this TODO.

I believe the sweet spot would be lookup_task::flag_candidate_as_invalid(). A reference to the routing_table must be added as a lookup_task member, and routing_table::remove() called from there.
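A minimal sketch of that idea, with stand-in types (the real lookup_task and routing_table have more to them than this):

```cpp
#include <cassert>
#include <set>
#include <string>

using node_id = std::string;

// Stand-in for the library's routing_table.
struct routing_table {
    std::set<node_id> peers;
    void remove(const node_id& id) { peers.erase(id); }
    bool contains(const node_id& id) const { return peers.count(id) != 0; }
};

// Sketch of a lookup task holding a reference to the routing table,
// so unresponsive candidates can be purged lazily during the lookup.
struct lookup_task {
    routing_table& table_;
    explicit lookup_task(routing_table& t) : table_(t) {}

    void flag_candidate_as_invalid(const node_id& id) {
        // Lazy purge: the peer failed to answer in time, drop it now.
        table_.remove(id);
    }
};
```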

from kademlia.

aleks-f commented on September 4, 2024

Hi David, thanks for answering.

Please see my replies below.

1/ ... Have you seen duplicates when debugging?

Yes. Since the IDs sent are not the real peer IDs, and only IDs are checked for uniqueness (not IP:PORT entries), the uniqueness check finds no duplicates during the discovery process (then again, what sense would it make to send multiple real peer IDs anyway?). So bootstrap ends up with a bunch of peers with different (non-existent) IDs but the same IP:PORT. I could be missing something important, but I can't make sense of the discovery process logic.
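To make the hole concrete: a push that checks only the id lets arbitrarily many fake ids through for a single endpoint. A sketch of a stricter check that also rejects duplicate endpoints (names are illustrative, not the library's actual routing_table::push()):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using node_id = std::string;
using endpoint = std::string; // "ip:port"

struct routing_table {
    std::vector<std::pair<node_id, endpoint>> peers;

    // Hypothetical push rejecting a peer when either the id OR the
    // endpoint is already known, closing the duplicate-endpoint hole.
    bool push(const node_id& id, const endpoint& ep) {
        for (const auto& p : peers)
            if (p.first == id || p.second == ep)
                return false;
        peers.emplace_back(id, ep);
        return true;
    }
};
```

With only the id compared, the second push below would succeed and the table would accumulate one entry per fabricated id, all pointing at the same IP:PORT.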

2/ Yes, when a response is received with a random <request id> to an <old node id>, the response_router should not only check that the response's <request id> has been previously generated, but also ensure the responder matches the <old node id>, and remove the <old node id> from the routing_table if it does not (the <new node id> will already have been inserted by engine::handle_new_message()).

The only thing I see is a check for a non-existent message ID. Duplicate non-existent peer IDs remain in the bootstrap routing table and keep propagating back to the peer that faked them (and to other peers): every save/load adds duplicate peers to peers' routing tables.

3/ My understanding of the paper is that peers are lazily purged. When a lookup is performed, peers are contacted, and peers that fail to respond in a timely manner are purged at that moment. This is not implemented ATM, but that isn't a big deal.

OK, makes sense.

I believe the sweet spot would be lookup_task::flag_candidate_as_invalid()

Speaking generally, I think I understand the purpose of this, but before we go there, I'd like to first clarify and properly understand the neighbor discovery on boot.


DavidKeller commented on September 4, 2024

1/ There are two kinds of IDs involved:

  • Peer IDs.
    • Stored within the routing table.
  • Key IDs
    • Matched against stored Peer IDs to figure out which peer to contact.

Each time a peer p contacts a node n (request & response), n stores p's Peer ID within its routing table.

The purpose of refresh_id[ i ] = ! refresh_id[ i ]; is to generate, at each iteration, Key IDs further and further away from our Node ID, so as to contact nodes that belong to each different bucket (and fill all of them).
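The bit flip can be illustrated with small IDs: flipping bit i of our own ID yields a key whose XOR distance from us is exactly 2^i, i.e. a key that falls into bucket i, which is why each iteration targets a different bucket. A minimal sketch using 8-bit IDs (instead of the library's real ID width):

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Illustrative 8-bit IDs.
using kad_id = std::bitset<8>;

// Flipping bit i of our own id produces a refresh key at XOR
// distance 2^i, so a lookup for it exercises bucket i.
kad_id refresh_key(kad_id self, std::size_t i) {
    self.flip(i);
    return self;
}
```

Iterating i from 0 upward therefore walks through every bucket, filling all of them during the refresh phase of the join.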

2/ Yes, this is something I should implement. That's clearly a bug, because it effectively makes the library leak memory.

*/ On a side note, I was wondering if rewriting the library to play with C++20 coroutines would be fun and could simplify the code. Would it be a problem for you to require a C++20 compiler?


aleks-f commented on September 4, 2024

The purpose of refresh_id[ i ] = ! refresh_id[ i ]; is to generate, at each iteration, Key IDs further and further away from our Node ID, so as to contact nodes that belong to each different bucket (and fill all of them).

OK, I understand this; I was mixing up peer and key IDs.

2/ Yes, this is something I should implement. That's clearly a bug, because it effectively makes the library leak memory.

Implementing this is simple. Implementing it efficiently, not so much without <endpoint, bucket_index> caching. Proactively removing existing IP:PORT entries makes sense, I think, because if a peer with an existing IP:PORT shows up, the previous one must be gone. To do it efficiently with the existing structure, pairs of <endpoint, bucket index> should be cached; otherwise, one has to traverse the entire routing table to find duplicates (which most of the time won't exist). Which brings me to another question: why is the routing table implemented around vector<list<pair<id, endpoint>>>? Shouldn't the routing table be a binary tree?
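The proposed cache could be sketched like this (all names are illustrative, not the library's actual structure): a side map from endpoint to bucket index lets push evict a stale entry for a reappearing IP:PORT without scanning every bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>
#include <vector>

using node_id = std::string;
using endpoint = std::string; // "ip:port"

struct routing_table {
    // Existing layout: one list of <id, endpoint> pairs per bucket.
    std::vector<std::list<std::pair<node_id, endpoint>>> buckets;
    // Proposed cache: endpoint -> bucket index, for O(log n) eviction.
    std::map<endpoint, std::size_t> endpoint_to_bucket;

    explicit routing_table(std::size_t bucket_count) : buckets(bucket_count) {}

    void push(const node_id& id, const endpoint& ep, std::size_t bucket) {
        auto cached = endpoint_to_bucket.find(ep);
        if (cached != endpoint_to_bucket.end()) {
            // Same IP:PORT seen before: the previous peer must be gone,
            // so evict it from the bucket the cache points at.
            buckets[cached->second].remove_if(
                [&](const std::pair<node_id, endpoint>& p) {
                    return p.second == ep;
                });
        }
        buckets[bucket].emplace_back(id, ep);
        endpoint_to_bucket[ep] = bucket;
    }

    std::size_t size() const {
        std::size_t n = 0;
        for (const auto& b : buckets) n += b.size();
        return n;
    }
};
```

Without the cache, the eviction would have to walk every bucket looking for the endpoint; with it, only the one cached bucket is touched.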

*/ On a side note, I was wondering if rewriting the library to play with C++20 coroutines would be fun and could simplify the code. Would it be a problem for you to require a C++20 compiler?

Coroutines would definitely make sense. However, at this point I'm working on an entirely divergent port of your library, completely decoupled from Boost and Asio, using POCO and gtest instead. In case you want to join that effort, let me know, and we can explore the coroutines possibility there (but it really depends on the folks requesting this port and how they feel about coroutines, and the compiler support thereof).

