Git Product home page Git Product logo

Comments (1)

Pedrobc89 avatar Pedrobc89 commented on September 17, 2024

Proposal

This is a design proposal of the Distributed AtomSpace (DAS) Node.

Initial design will focus on Messaging and Leader Election.

Motivation

Goals

  • Ensure the DAS Node is resilient to network failures.
  • Ensure the DAS Node is resilient to leader failures.
  • Ensure the DAS Node is resilient to job failures.
  • Ensure the DAS Node will satisfy the use case.
  • Ensure Nodes run inside docker containers.
  • Ensure speedup of at least 70% * N considering the
    execution in a network with N (1 < N < 6) equally resourced NODEs against the
    execution in a single NODE.

Use Case

For this use case we will assume that the nodes are already up and running,
the network topology is full mesh. All nodes can communicate with each other
and are all aware of each other. All nodes remain in a latent state until a
job is requested.

  • USER: a person using the software we provide to execute JOBs in a network of
    NODEs.

  • NODE: a server process running on a Docker container encompassing all
    components required to run a DASNode.

  • JOB: a data structure designed to contain a script in one of the script
    languages supported by DASNode. This script contains all the code required to
    execute a given task.

  • MESSAGE: a data structure to encapsulate pieces of information we want to
    transport from one NODE to another.

    1. USER submit a JOB to any of the NODEs in the network.
    2. NODE start processing the JOB. Eventually, other NODEs are contacted via
      MESSAGE exchanging to share the burden of processing the JOB.
    3. Once the JOB finishes, results are collected by the same NODE originally
      contacted by the USER and delivered to the USER.

Architecture

Node Structure

  • The initial implementation will use Python language.
  • Each node will be a Docker container.
  • Nodes will have the following components:
    • Reliable message exchanging: Manages communication between nodes.
    • Leader election: Manages the process of electing a leader.
    • Job management: Manages the process of submitting and processing jobs.
    • Atomic commit operations:
    • Mutual exclusion in the use of shared resources
    • Consensus
    • Replication
  • Each component will be a separate module, that way we can easily swap modules
    for alternatives, different communication protocols, leader election
    algorithms, etc.

Messaging Layer

  • Nodes will communicate using a messaging protocol that supports reliable
    and ordered delivery.
  • The messaging system will abstract the underlying network topology.
  • The messaging layer will be an external library. That way we can easily
    change it, without changing the code of the DAS Node.

Considered Topologies

  1. Full Mesh:

    • Protocol: gRPC or HTTP/2
    • Reason: Full mesh requires each node to communicate directly with
      every other node. gRPC supports HTTP/2, which provides multiplexing and
      efficient binary communication, making it suitable for full mesh networks.
  2. Ring:

    • Protocol: HTTP/1.1 or TCP
    • Reason: In a ring topology, each node communicates with its two
      neighbors. HTTP/1.1 or raw TCP connections are sufficient for this topology,
      providing simplicity and reliability.
  3. Tree:

    • Protocol: gRPC or MQTT
    • Reason: Tree topologies benefit from protocols that support
      hierarchical communication. gRPC provides efficient communication, while
      MQTT (Message Queuing Telemetry Transport) is designed for lightweight and
      efficient communication, especially in IoT and hierarchical networks.
  4. Torus/Grid:

    • Protocol: gRPC or ZeroMQ
    • Reason: Torus or grid topologies require efficient communication
      between nodes arranged in a grid. gRPC is suitable for its performance,
      while ZeroMQ provides high-performance asynchronous messaging, making it
      ideal for grid networks.
  5. Arbitrary/Custom Topologies:

    • Protocol: gRPC or AMQP
    • Reason: For arbitrary topologies, gRPC offers flexibility and
      performance. AMQP (Advanced Message Queuing Protocol) provides robust
      message queuing and routing capabilities, supporting complex topologies
      effectively.

Choosing a Protocol

For the initial implementation, MQTT was chosen, since it is easy
to start using, and it is very low latency, with high performance. It is also
a good match for a number of topologies. MQTT does not guarantee the ordering
of messages, so that would be required in the message itself. Note that the
default max size of a message is 256 MB.

For this initial version, the messaging layer will have to implement methods for:
leader election, job management, and telemetry.

Messaging layer API:

  1. LeaderElection:

    • Election Message: Sent by a node to all nodes with higher IDs when it
      starts an election.
    • OK Message: Sent by a node with a higher ID in response to an Election
      message to indicate that it is still alive and will start its own election.
    • Coordinator Message: Sent by the node with the highest ID (the new leader)
      to all other nodes to announce itself as the leader.
  2. Telemetry:

    • Heartbeat Messages: Heartbeat messages are sent periodically by each node
      to indicate that it is alive and functioning.
    • Status Messages: Status messages provide more detailed information about
      the node's current state, including resource usage and operational status.
    • Metrics Messages: Metrics messages provide detailed performance data for
      monitoring and analysis.
    • Alert Messages: Alert messages are sent when a node detects a condition
      that requires immediate attention.
    • Log Messages: Log messages provide detailed logging information for
      debugging and auditing purposes.
  3. JobManagement:

    • Job Submission Message: Sent by the user to any node to submit a job for
      processing.
    • Job Assignment Message: Sent by the leader node to assign job parts to
      other nodes.
    • Job Progress Message: Sent by a node to report the progress of a job
      part.
    • Job Result Message: Sent by a node to report the result of a job part.
    • Job Cancellation Message: Sent by the user or leader node to cancel a
      job.

Leader Election

Leader election is a fundamental problem in distributed systems. It involves
designating a single node as the coordinator (leader) of some task distributed
among several nodes. The leader coordinates the work among the other nodes,
ensuring efficient and reliable task completion. Below, Iโ€™ll cover some
essential concepts, common algorithms, and implementation considerations for
leader election.

Common Leader Election Algorithms

  1. Bully Algorithm:

    • Description: Nodes have unique IDs. The node with the highest ID
      becomes the leader. If a node suspects the leader has failed, it starts an
      election by sending an election message to all nodes with higher IDs. If
      none of the higher-ID nodes respond, it becomes the leader.
    • Pros: Simple to understand and implement.
    • Cons: High network traffic in large networks, especially during
      elections.
  2. Raft Algorithm:

    • Description: Raft is a consensus algorithm designed to be
      understandable. It breaks down into leader election, log replication, and
      safety.
    • Leader Election: Nodes start in a follower state. If they donโ€™t hear
      from a leader, they become candidates and request votes from other nodes.
      The candidate with the majority votes becomes the leader.
    • Pros: Robust and widely used, well-documented.
    • Cons: More complex than the bully algorithm.
  3. Paxos Algorithm:

    • Description: Paxos is a family of protocols for solving consensus in a
      network of unreliable or failing nodes. Nodes propose leaders, and through
      a series of messages, agree on a single leader.
    • Pros: Proven correctness and used in many critical systems.
    • Cons: Complex to implement and understand.
  4. ZooKeeper:

    • Description: Apache ZooKeeper is a distributed coordination service
      that includes leader election as one of its features. Nodes use ZooKeeper
      to create ephemeral nodes; the node that successfully creates the first
      ephemeral node becomes the leader.
    • Pros: High reliability, widely used in industry.
    • Cons: Requires running and maintaining a ZooKeeper ensemble.

Choosing an Algorithm

For the initial implementation, Bully was chosen for it's simplicity and
ease of development, as the project evolves, a new algorithm can be added.
Since we opted for a modular design, we can add new algorithms as needed.

The Job

Perform a single query to the remote DAS Server, process all the results and
perform some extra computation on each result in order to evaluate each
result's quality.

JOB should be defined as a script in some programming language. DASNode should
be able to support multiple programming languages here so the design must be
flexible. Initially we'll support only Python scripts doing queries to a remote
DAS Server.

This baseline test case should run with speedup of at least 70% * N considering
the execution in a network with N (1 < N < 6) equally resourced NODEs against
the execution in a single NODE.

from das.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.