Expected/Wanted Behavior Allow worker container to update to newer

Self update worker container on deployed remote devices (nebula-orchestrator/worker)

Comments (11)

naorlivne commented on May 24, 2024

Hi @johnmay1, I'm not sure what you mean by this, so just to make sure: is this a feature request for some sort of auto update of the worker containers to the latest version, an issue you have with updating the containers managed by the workers, or something else entirely?

johnmay1 commented on May 24, 2024

Hi @naorlivne, yes, this is a feature request for some sort of auto update of the worker containers to the latest version.

naorlivne commented on May 24, 2024

This should be possible for patch & minor versions that don't introduce breaking changes to the way workers communicate with the manager.

I'm thinking of something like a cron script that checks whether a container tagged with a higher version than the currently running worker exists and, if so, kills the worker container and starts a new one with that higher version in its place. This will likely need to be another component that runs on the worker device rather than something that runs inside the worker container (because part of the update process involves killing the worker container), but I don't see any reason why this optional cron script can't be a Nebula managed app in its own right, so it could be updated like a normal Nebula app & in turn update the Nebula worker container. The end result is a system which updates itself fully.

This script will need some way to ensure it doesn't update to anything which introduces potentially breaking changes to the system without the user's express permission, so I'm thinking it should have a config\envvar option named something like "ALLOW_MAJOR_VERSION_UPDATES" that is set to False by default; only if it is set to True will it update major versions as well.
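The version check described above could be sketched roughly as follows. This is only an illustration, assuming worker tags look like plain "MAJOR.MINOR.PATCH" strings; the function names are hypothetical and not part of Nebula, and "ALLOW_MAJOR_VERSION_UPDATES" is just the option name floated in this comment.

```python
def parse_version(tag):
    """Turn a tag such as "2.5.0" into a tuple of three ints, e.g. (2, 5, 0)."""
    parts = [int(part) for part in tag.split(".")]
    parts += [0] * (3 - len(parts))  # pad short tags such as "2.5" with zeros
    return tuple(parts[:3])


def allow_major_from_env(env):
    """Read the proposed ALLOW_MAJOR_VERSION_UPDATES option; defaults to False."""
    return env.get("ALLOW_MAJOR_VERSION_UPDATES", "False").strip().lower() == "true"


def should_update(current, candidate, allow_major=False):
    """Return True if the worker may move from `current` to `candidate`."""
    cur, new = parse_version(current), parse_version(candidate)
    if new <= cur:
        return False  # not a newer version, nothing to do
    if new[0] > cur[0]:
        return allow_major  # major bumps need explicit permission
    return True  # minor and patch bumps are always allowed


if __name__ == "__main__":
    import os

    allow_major = allow_major_from_env(os.environ)
    print(should_update("2.4.1", "2.5.0", allow_major))
```

Keeping the permission check separate from the comparison means the default (no major updates) holds even when the option is missing entirely.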

Needless to say, this update service will be an entirely optional add-on rather than a core part of Nebula, as some people value stability over updates & don't mind being a few versions behind.

Those are my initial thoughts on the subject; if anyone has another idea, feel free to speak up.

This also gave me the idea of #42, which might be used to simplify this service's cron usage.

naorlivne commented on May 24, 2024

Another idea is to have the worker container detect that a new version is out (this option will be off by default and will need to be turned on via an "AUTO_UPDATE" parameter in the worker configuration) using the same "is a newer version out" logic as in my comment above & if that's true, follow this update flow:

Download the image of the newer version
Spawn 2 new containers
- The first container sleeps X seconds, kills the original worker container & starts a new worker with the same configuration options as the original worker (this container will be spawned with all the configuration values given to it by the original), then exits
- The 2nd container sleeps Y seconds (where Y>X with a long enough safety margin), then kills the new worker and restarts the original worker should it still exist when Y passes, then exits (providing failback in case of issues with the upgrade)
After starting up successfully, the new worker container removes the 2nd container.
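To make the ordering of that flow easier to follow, here is a small simulation of it. It does not talk to Docker at all; it only lists the events in order. The function name and the `new_worker_starts_ok` flag are hypothetical, and the failure branch assumes the original worker container still exists when Y passes.

```python
def simulate_update(x_seconds, y_seconds, new_worker_starts_ok):
    """Return (events, surviving_worker) for the two-helper update flow.

    events is a list of (seconds_after_spawn, description) tuples.
    """
    if y_seconds <= x_seconds:
        # The rollback helper must fire after the updater has done its work.
        raise ValueError("Y must be greater than X")

    events = [
        (0, "original worker downloads the newer image"),
        (0, "original worker spawns the updater and rollback helpers"),
        (x_seconds, "updater kills the original worker and starts the new worker"),
    ]

    if new_worker_starts_ok:
        # A healthy new worker cancels the rollback before Y passes.
        events.append((x_seconds, "new worker removes the rollback helper"))
        return events, "new"

    # The new worker never came up, so the rollback helper is still around at Y.
    events.append(
        (y_seconds, "rollback helper kills the new worker and restarts the original")
    )
    return events, "original"


if __name__ == "__main__":
    for ok in (True, False):
        events, survivor = simulate_update(30, 300, new_worker_starts_ok=ok)
        print(f"new worker starts ok: {ok} -> surviving worker: {survivor}")
        for at, what in events:
            print(f"  t+{at}s: {what}")
```

The gap between X and Y is the window the new worker has to prove itself, so Y should leave enough time for a slow image start on the weakest device.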

johnmay1 commented on May 24, 2024

I like the above suggestion. I would also like to add: report the currently running version of the worker to the manager and make it available through the API.

naorlivne commented on May 24, 2024

Currently the workers only query the manager to get their state. The manager has a memoized cache which speeds things up considerably (and as a direct result allows a single manager to take care of considerably more workers than it could without the cache, thus lowering costs). Unfortunately, said cache also means that we can't have the workers report their state to the manager; they can only pull data from it, and each one of them then has its own internal logic to match its state to the one it pulled.

It's possible to have something like Kafka take care of data ingestion from the workers to a central backend DB, which could then be queried from the managers and presented to the end user\admin. As much as I would like to have that option (you're not the first to ask for something similar, which tells me this is a needed feature), Nebula is unfortunately still just a pet project of mine with very few other contributors, so I don't really have the time to add an entirely new optional "workers status reporting" component to the mix. If you feel like assisting in that regard I will gladly agree to that suggestion, but otherwise I think we'll stick to just updating the workers as a first priority and worry about reporting their current version in a later ticket.
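As a rough illustration of why the cached, pull-only read path mentioned above scales so well, here is a minimal time-based memoization sketch. It is not the manager's actual code: the class name, the `load_state` callback and the TTL-based expiry are all assumptions made for the example.

```python
import time


class MemoizedState:
    """Caches the desired state per device_group for ttl_seconds (hypothetical)."""

    def __init__(self, load_state, ttl_seconds, clock=time.monotonic):
        self._load_state = load_state  # hypothetical backend DB lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # device_group -> (fetched_at, state)

    def get(self, device_group):
        now = self._clock()
        hit = self._cache.get(device_group)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # served from the cache, no backend call
        state = self._load_state(device_group)
        self._cache[device_group] = (now, state)
        return state


if __name__ == "__main__":
    backend_calls = []

    def load_from_db(device_group):
        backend_calls.append(device_group)
        return {"device_group": device_group, "apps": ["example"]}

    manager = MemoizedState(load_from_db, ttl_seconds=10)
    for _ in range(1000):  # many workers polling the same device_group
        manager.get("edge-devices")
    print("worker polls: 1000, backend lookups:", len(backend_calls))
```

Every worker in a device_group gets the same answer, so identical reads collapse into one backend lookup per TTL window. Per-worker writes have no such shared answer to cache, which is why reporting would need a separate ingestion path.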

Even the auto update is something that I honestly doubt will happen in the next few months, as it will rely on #42, which is itself a major update that will take a rather large chunk of time to get production ready.

TL;DR:
Let's start with just the auto update for this ticket. Please open a separate ticket about getting data from the workers in a centralized fashion through the manager, including the current worker version, and we'll worry about it there; that part is too big to handle in the same ticket as this request.

johnmay1 commented on May 24, 2024

Yes, agree with you @naorlivne. We can do the reporting stuff later.

naorlivne commented on May 24, 2024

https://github.com/v2tec/watchtower or https://github.com/pyouroboros/ouroboros seem like they would help simplify this task. I'm currently leaning more towards ouroboros, as watchtower seems to be no longer maintained. My initial thought is that, if the "AUTO_UPDATE" param is set to True, a container of it that is configured to only update the worker gets started/restarted at the end of the worker boot process.

naorlivne commented on May 24, 2024

Going to go with https://github.com/pyouroboros/ouroboros (guide at https://github.com/pyouroboros/ouroboros/wiki/Usage#core). It will likely need the following flags set on it:

Docker Sockets
Run Once
Self Update
Monitor or Labels Only (so it only updates the worker)
Cleanup
Repository User & Repository Password or mount the docker config file as described in https://github.com/pyouroboros/ouroboros/wiki/Private-Registries

It seems to me that providing a template\script\guide for running it as a cron with the assistance of #42 should be sufficient, rather than building it into the worker in a custom way.

naorlivne commented on May 24, 2024

I'll create a full guide as part of the documentation when I have more time, but for now you can simply have a cron_job (released in 2.5.0) with the following config:

{
  "env_vars": {"RUN_ONCE": "true", "MONITOR":"worker", "CLEANUP":"true"},
  "docker_image" : "pyouroboros/ouroboros",
  "running": true,
  "volumes": ["/var/run/docker.sock:/var/run/docker.sock"],
  "networks": ["nebula", "bridge"],
  "devices": [],
  "privileged": false,
  "schedule": "0 * * * *"
}

Each device_group that has this cron_job defined as part of it will then have its worker auto updated to the latest version, with the following 3 caveats:

The worker container is named "worker"; if not, you will need to change the "MONITOR" envvar to match the worker container name.
You're using the "latest" or "arm64v8" tags of the workers.
The version is checked every hour with the schedule above; this may be too often\not often enough depending on your use case, so you might want to change the cron schedule to match your needs.
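If the first or third caveat applies to you, the two values to adjust can be templated. The helper below is a hypothetical convenience, not part of Nebula; it only reproduces the cron_job config from this comment with the "MONITOR" value and the schedule swapped in.

```python
import json


def ouroboros_cron_job(worker_container="worker", schedule="0 * * * *"):
    """Build the cron_job config from this thread for a given container name and schedule."""
    return {
        "env_vars": {
            "RUN_ONCE": "true",
            "MONITOR": worker_container,  # must match the worker container name
            "CLEANUP": "true",
        },
        "docker_image": "pyouroboros/ouroboros",
        "running": True,
        "volumes": ["/var/run/docker.sock:/var/run/docker.sock"],
        "networks": ["nebula", "bridge"],
        "devices": [],
        "privileged": False,
        "schedule": schedule,  # standard 5-field cron expression
    }


if __name__ == "__main__":
    # Example: a worker container named "nebula-worker", checked every 6 hours.
    print(json.dumps(ouroboros_cron_job("nebula-worker", "0 */6 * * *"), indent=2))
```

The container name "nebula-worker" and the six-hour schedule are example values only; use whatever matches your own devices.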

naorlivne commented on May 24, 2024

Guide added at https://nebula.readthedocs.io/en/latest/auto-update-workers/
