node-monitor's People

Contributors

dependabot[bot], spencerjibz

node-monitor's Issues

Node monitor

Create a loop or cron/job queue that polls the Algorand node every 10 seconds. Each poll checks whether the last-round field in the response has incremented since the previous poll.
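A single poll could look like the sketch below. It assumes the node serves algod's `GET /v2/status` endpoint, that the JSON response carries the `last-round` field, and that the token is sent in the `X-Algo-API-Token` header. The function names and the `on_round` callback are hypothetical and only illustrate the loop.

```python
import json
import time
import urllib.request


def parse_last_round(body):
    """Extract the integer `last-round` field from a status response body."""
    return int(json.loads(body)["last-round"])


def fetch_last_round(node_url, api_token, timeout=5.0):
    """Request the node status and return its last round."""
    request = urllib.request.Request(
        node_url.rstrip("/") + "/v2/status",  # assumed algod status endpoint
        headers={"X-Algo-API-Token": api_token},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_last_round(response.read())


def poll_forever(node_url, api_token, on_round, interval=10.0):
    """Call on_round(last_round) every `interval` seconds (None on failure)."""
    while True:
        try:
            last_round = fetch_last_round(node_url, api_token)
        except (OSError, ValueError, KeyError):
            last_round = None  # unreachable node or malformed response
        on_round(last_round)
        time.sleep(interval)
```

Passing `None` to the callback on failure leaves the decision about how to treat an unreachable node to the caller; the issue text does not specify that case.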

If the last-round field has incremented since the last poll, we store the new value in a persistent location (file or embedded database). On the next poll, we compare this stored value with the newly polled one.
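One way to persist the value is a small JSON file, as sketched below. The file layout, the `last_round` key, and the helper names are assumptions made for illustration; an embedded database would serve equally well.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path):
    """Return the persisted state, or an empty dict on first run."""
    try:
        return json.loads(Path(path).read_text())
    except FileNotFoundError:
        return {}


def save_state(path, state):
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent))
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_name, path)


def record_poll(path, polled_round):
    """Compare polled_round with the stored value and persist it if it advanced.

    Returns True when the round incremented or no value was stored yet.
    """
    state = load_state(path)
    previous = state.get("last_round")
    advanced = previous is None or polled_round > previous
    if advanced:
        state["last_round"] = polled_round
        save_state(path, state)
    return advanced
```

A `False` return value is the signal that would start the error reporting process.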

When the last-round value has not incremented, we start the error reporting process: we persist the timestamp of the event and a flag marking the node as unhealthy.

While in the error reporting process, the health route returns a 503 error. The body contains not only the last-round value and status but also the timestamp of when the event started, for debugging purposes. It also reports the latest round value of the other nodes in the cluster to compare against.

Cluster node connections: in the configuration file, we pass in an array of the nodes in the cluster and indicate which one is the current node. While in the error process, we use the other nodes in the cluster to decide when the local node is synced again.

On each poll during the error process, we check whether the last-round value has changed. If it has, we can assume the node has started syncing again. We still report an error on the health route, but now also indicate that the node is catching up.

During catchup, we also fetch the last-round value from the other cluster nodes on each poll. We compare the local value to the other nodes' values, and if it is within a ±1 range of them, we can assume the node is fully caught up again. The health route should then report a 200 OK status again.
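The transitions described above (healthy, stalled, catching up, healthy again) can be sketched as a pure function over the monitor state. The state keys and status names are hypothetical. Two choices here are assumptions, not part of the description: the local round is compared against the highest remote round, and a node that stops advancing while catching up is reported as stalled again.

```python
import time

HEALTHY, STALLED, CATCHING_UP = "healthy", "stalled", "catching_up"


def next_state(state, polled_round, remote_rounds, valid_range=1, now=None):
    """Return the updated monitor state after one poll.

    state: dict with keys "status", "last_round", "stalled_since".
    remote_rounds: last-round values fetched from the other cluster nodes.
    """
    now = time.time() if now is None else now
    new = dict(state)
    advanced = polled_round > state["last_round"]
    new["last_round"] = max(polled_round, state["last_round"])

    if state["status"] == HEALTHY:
        if not advanced:
            # The round did not move: start the error reporting process.
            new["status"] = STALLED
            new["stalled_since"] = now
        return new

    if not advanced:
        # Assumption: a node that stops again during catchup counts as stalled.
        new["status"] = STALLED
        return new

    # In the error process and moving again: the node is catching up.
    new["status"] = CATCHING_UP
    # Assumption: compare against the highest round seen on the other nodes.
    if remote_rounds and abs(max(remote_rounds) - polled_round) <= valid_range:
        new["status"] = HEALTHY
        new["stalled_since"] = None
    return new
```

Keeping the function free of I/O makes it straightforward to unit test; the caller fetches the rounds and persists the returned state.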

Configuration

When starting the monitor, a configuration should be passed in, either as a JSON config file or as CLI arguments. We should be able to pass in:

  • Polling rate (number of seconds between each poll)
  • X-Algo-API-Token (API key for the Algorand node cluster)
  • Valid round range (± tolerance when comparing local and remote round numbers)
  • local node (URL of the local node, to exclude from the remote polling)
  • cluster nodes (the URLs of all the nodes in the cluster)
  • port (HTTP port the monitor operates on)
  • node-port (HTTP port of the local node)
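A minimal sketch of loading these options follows. It assumes a JSON file whose keys match the field names below and CLI flags that override the file. All key names, flag names, and default values are placeholders chosen for illustration.

```python
import argparse
import json
from dataclasses import dataclass, field


@dataclass
class Config:
    polling_rate: float = 10.0   # seconds between polls
    api_token: str = ""          # sent as X-Algo-API-Token
    valid_round_range: int = 1   # tolerance between local and remote rounds
    local_node: str = ""         # URL of the local node
    cluster_nodes: list = field(default_factory=list)
    port: int = 3000             # placeholder default for the monitor
    node_port: int = 8080        # placeholder default for the local node

    def remote_nodes(self):
        """Cluster nodes with the local node excluded."""
        return [url for url in self.cluster_nodes if url != self.local_node]


def load_config(argv=None):
    parser = argparse.ArgumentParser(description="Algorand node monitor")
    parser.add_argument("--config", help="path to a JSON config file")
    parser.add_argument("--polling-rate", type=float)
    parser.add_argument("--api-token")
    parser.add_argument("--valid-round-range", type=int)
    parser.add_argument("--local-node")
    parser.add_argument("--cluster-nodes", nargs="+")
    parser.add_argument("--port", type=int)
    parser.add_argument("--node-port", type=int)
    args = vars(parser.parse_args(argv))

    values = {}
    config_path = args.pop("config")
    if config_path:
        with open(config_path) as handle:
            values.update(json.load(handle))
    # Flags that were actually given on the command line override the file.
    values.update({key: value for key, value in args.items() if value is not None})
    return Config(**values)
```

For example, `load_config(["--config", "monitor.json", "--port", "9000"])` would read the file and then replace only the port.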

HTTP /health route

Add an HTTP route to the service that reports a 200 status if the Algorand node is syncing correctly and a 503 status if the node is out of sync.

The body of the response should contain additional information about the event. For example, the difference between a full stop and the catchup process after a stop.
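The route could be served with Python's standard `http.server` module, as in the sketch below. The state dictionary, its keys (`status`, `last_round`, `stalled_since`, `cluster_rounds`), and the status names are hypothetical; the description only fixes the 200 and 503 codes and the kind of information the body should carry.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def health_response(state):
    """Map the monitor state to a (status_code, body) pair."""
    healthy = state["status"] == "healthy"
    body = {"status": state["status"], "last_round": state["last_round"]}
    if not healthy:
        # Extra debugging context while stalled or catching up.
        body["stalled_since"] = state.get("stalled_since")
        body["cluster_rounds"] = state.get("cluster_rounds", {})
    return (200 if healthy else 503), body


def make_handler(get_state):
    """Build a request handler that reads the current state via get_state()."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            code, body = health_response(get_state())
            payload = json.dumps(body).encode()
            self.send_response(code)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return HealthHandler


def serve(get_state, port):
    """Serve /health on the given port until interrupted."""
    ThreadingHTTPServer(("", port), make_handler(get_state)).serve_forever()
```

Because the `status` field distinguishes a full stop from catchup, a load balancer can act on the status code alone while an operator reads the body.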
