nebula-orchestrator / worker
The worker node manager container which manages nebula nodes
Home Page: https://nebula-orchestrator.github.io/
License: GNU General Public License v3.0
There should be unit tests to ensure that nothing fails.
The framework for automatic unit tests is in place, but the worker tests are so far run manually.
Manually exiting the worker-manager should stop the running containers. It's important to note that this should only happen if a user manually stops the Nebula worker-manager (Ctrl-D) and not if the container crashes/exits for any other reason (even docker stop/kill <container-name>), as in production use the worker-manager is expected to come back online, reconnect to Rabbit, pull the latest config from Mongo & download the previously used image before rolling the containers, in order to minimize impact on the application containers.
It would be amazing to have a wildcard APP_NAME; for instance APP_NAME=site_*, so that any new apps created with the site_ prefix are installed on the workers (see the sketch below).
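A minimal sketch of the matching logic, assuming the worker can fetch the full app list from the manager; fnmatch gives shell-style wildcard semantics:

from fnmatch import fnmatch

# hypothetical sketch: resolve a wildcard APP_NAME against the manager's app list
def matching_apps(app_name_pattern, all_apps):
    return [app for app in all_apps if fnmatch(app, app_name_pattern)]

print(matching_apps("site_*", ["site_web", "site_api", "billing"]))
# prints ['site_web', 'site_api']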
If a valid Nebula app is somehow missing its fanout exchange, the exchange should be recreated at the connect-to-Rabbit step, after the database is revalidated to confirm that the app is a preexisting Nebula app. This protects against boot crash loops in case the exchange is somehow deleted while the Nebula app still has all of its config valid in the Mongo backend DB.
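A rough sketch of the idea, assuming pika is used for the Rabbit connection; exchange_declare is idempotent, so declaring it after the app is revalidated recreates a missing exchange and is a no-op otherwise:

import pika

# hypothetical sketch: recreate the app's fanout exchange at the connect-to-rabbit
# step, after the app was revalidated against the Mongo backend DB
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit_host"))
channel = connection.channel()
channel.exchange_declare(exchange="app_name_fanout", exchange_type="fanout",
                         durable=True)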
With Python 2.x nearing its EOL, Nebula should migrate to Python 3.x (with the current minor version target being the latest released version).
Nebula is currently Python 2.7.x based.
support for mounts will help create new options for the worker-manager
The same image should be used by both ARMv8 & x64 architectures (and possibly other ARM types as CI/CD for those permits).
Currently x64 has its own image, ARMv8 has its own image, "latest" points to x64 alone & no Docker manifest file exists.
After #20 is complete it might be a good idea to add a default "nebula" user network that's basically a bridge network, so users can route traffic between containers on the same server via container names. The network should be checked for existence at worker-manager boot and created if missing, so it can later be used by the apps like any other user network (see the sketch below).
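A minimal sketch of the boot-time check, assuming the docker SDK for Python:

import docker

# hypothetical sketch: ensure the default "nebula" bridge network exists at boot
client = docker.from_env()
if not client.networks.list(names=["nebula"]):
    client.networks.create("nebula", driver="bridge")
    print("created a bridge type network named nebula")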
The worker should work even without any config file existing if it has all the needed params given as envvars.
Currently this is worked around by having an empty config file in the root of the repo, but a conf.json file still needs to exist or the start will fail.
We need to check if conf.json exists before reading from it, and if it doesn't exist set the auth_file var to an empty dict (see the sketch below).
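A minimal sketch of that check, assuming conf.json sits in the working directory:

import json
import os

# hypothetical sketch: only read conf.json if it actually exists
if os.path.isfile("conf.json"):
    with open("conf.json") as conf_file:
        auth_file = json.load(conf_file)
else:
    auth_file = {}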
Having both api-manager and worker-manager is confusing; the worker-manager should be renamed "worker" to better state what it actually does.
This should be reflected in the documentation, the codebase, the git repo, the Docker containers, and CI/CD (Docker Hub & Shippable).
Relates to nebula-orchestrator/manager#21
Hello,
I have configured a Nebula worker on a Raspberry Pi. I am using AWS ECR as a registry to store the images. AWS ECR dynamically updates the auth password every 12 hours, and I can't update this password on the worker every time, so I have configured the AWS credential helper, which automatically updates the auth password every 12 hours on the edge device.
Whenever I push an update, the worker pulls the new image from AWS ECR. This works perfectly when I add REGISTRY_AUTH_USER and REGISTRY_AUTH_PASSWORD manually every 12 hours; the worker is then able to pull the update from the AWS ECR registry. But now that I have configured the AWS ECR credential helper, the Nebula worker is unable to pull the image. To test that my AWS ECR credential helper is working properly I tried the command docker pull <my_registry_url>/<image_name> and it worked. Note: I also tried this command after 12 hours, when my auth password became invalid, and it still worked.
I have added the worker docker-compose.yml and the worker logs for reference:
docker-compose.yml:
version: '3'
services:
  worker:
    container_name: worker
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    restart: unless-stopped
    hostname: worker
    environment:
      REGISTRY_HOST: <my_registry_url>
      MAX_RESTART_WAIT_IN_SECONDS: 0
      NEBULA_MANAGER_AUTH_USER: nebula
      NEBULA_MANAGER_AUTH_PASSWORD: nebula
      NEBULA_MANAGER_HOST: <my_manager_url>
      NEBULA_MANAGER_PORT: 80
      NEBULA_MANAGER_PROTOCOL: http
      NEBULA_MANAGER_CHECK_IN_TIME: 30
      DEVICE_GROUP: test
      KAFKA_BOOTSTRAP_SERVERS: <my_manager_url>:9092
      KAFKA_TOPIC: nebula-reports
worker logs:
Creating network "nebula_worker_default" with the default driver
Creating worker ... done
Attaching to worker
worker | reading config variables
worker | /usr/local/lib/python3.7/site-packages/parse_it/file/file_reader.py:55: UserWarning: config_folder_location does not exist, only envvars & cli args will be used
worker | warnings.warn("config_folder_location does not exist, only envvars & cli args will be used")
worker | reading config variables
worker | created a bridge type network named nebula
worker | no registry user pass combo defined, skipping registry login
worker | checking nebula manager connection
worker | nebula manager connection ok
worker | stopping all preexisting nebula managed app containers in order to ensure a clean slate on boot
worker | initial start of <my_image> app
worker | pulling image <my_registry_url>/<my_image>:latest
worker | <my_registry_url>/<my_image>
worker | 500 Server Error: Internal Server Error ("Get https://<my_registry_url>/v2/<my_image>/manifests/latest: no basic auth credentials")
worker | problem pulling image <my_registry_url>/<my_image>:latest
I built the worker image using the make docker command with the flag TARGET_GOARCH=arm.
My ~/.docker/config.json is as follows:
{
  "credHelpers": { "<my_registry_url>": "ecr-login" }
}
Might require a bit of an overhaul to have Rabbit also create a non-fanout exchange per app, but it should be possible to have a single container run a one-time "exec" command and return the result to the user through Nebula.
As a lot of IoT devices are ARM based it could be wise to have an ARM version of the worker-manager to allow managing them as well.
Nebula should integrate with Dockerfile-based healthchecks so that if a container is reported as unhealthy it is restarted; this should default to on but have a per-app flag that allows disabling the feature.
It should be noted that if/when this becomes included in the Docker engine itself it should really be used through that, either via Nebula passing the Docker engine the needed settings or (hopefully) that becoming the Docker engine default, meaning nothing will need to be done on the Nebula side.
Currently Nebula/Docker engine only restarts crashed containers & ignores the Docker engine health check results.
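A minimal sketch of the check, assuming the docker SDK for Python; the engine already tracks the health state, so the worker would only act on it:

import docker

# hypothetical sketch: restart any container the Docker engine reports as unhealthy
client = docker.from_env()
for container in client.containers.list():
    health = container.attrs.get("State", {}).get("Health", {}).get("Status")
    if health == "unhealthy":
        container.restart()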
In addition to being able to describe APP_NAME with a list of apps Nebula manages on the server, an optional param of APP_PODS (instead of and/or in addition to APP_NAME) could help ease management. Each APP_POD is basically a group of apps; on the worker-manager side there will need to be support for reading the APP_PODS at startup from Mongo, opening a Rabbit APP_PODS queue per pod for that server instance, listening to any changes to apps in the relevant pods, and updating the apps to match.
Currently the registry auth is coded into the worker-manager via its config file or envvar; there should also be support for the standard Docker auth file located at <home_folder>/.docker/config.json (see the sketch below).
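A minimal sketch of reading that file, assuming the common "auths" layout where credentials are stored as base64 "user:password" (the registry URL key below is a placeholder):

import base64
import json
import os

# hypothetical sketch: fall back to ~/.docker/config.json when no
# REGISTRY_AUTH_USER / REGISTRY_AUTH_PASSWORD are configured
config_path = os.path.expanduser("~/.docker/config.json")
if os.path.isfile(config_path):
    with open(config_path) as config_file:
        docker_config = json.load(config_file)
    auth = docker_config.get("auths", {}).get("<my_registry_url>", {}).get("auth")
    if auth is not None:
        registry_user, registry_password = base64.b64decode(auth).decode().split(":", 1)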
The following params should be optional/have a default value (a sketch follows the list):
RABBIT_HEARTBEAT - default to 3600
RABBIT_VHOST - default to nebula
REGISTRY_HOST - default to Docker Hub
REGISTRY_AUTH_USER - should be optional for those who use only public images with no login - requires a code change to skip the registry login step if not set
REGISTRY_AUTH_PASSWORD - should be optional for those who use only public images with no login - requires a code change to skip the registry login step if not set
max_restart_wait_in_seconds - default to 0
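A minimal sketch of those defaults with plain envvar reads (the Docker Hub default URL is an assumption):

import os

# hypothetical sketch: optional params with defaults, skipping the registry
# login step when no credentials are set
rabbit_heartbeat = int(os.environ.get("RABBIT_HEARTBEAT", 3600))
rabbit_vhost = os.environ.get("RABBIT_VHOST", "nebula")
registry_host = os.environ.get("REGISTRY_HOST", "https://index.docker.io/v1/")
registry_user = os.environ.get("REGISTRY_AUTH_USER")
registry_password = os.environ.get("REGISTRY_AUTH_PASSWORD")
max_restart_wait_in_seconds = int(os.environ.get("MAX_RESTART_WAIT_IN_SECONDS", 0))

if registry_user is None or registry_password is None:
    print("no registry user pass combo defined, skipping registry login")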
Currently Nebula only supports one Docker registry auth; not yet sure how, but support for multiple authenticated registries might be needed in some cases.
Currently all apps use the container's default CMD command; there should be an ability to optionally change that to something else.
Hello,
I have configured the Nebula worker on the Raspberry Pi.
I am using an Ubuntu 18.04 VPS on which I have the following containers:
- Nebula Manager
- Mongo
- Nebula Reporter
- Kafka
- Zookeeper
The worker sends the current state to a Kafka cluster after every sync with the manager. The reporter component pulls from Kafka and populates the state data into the backend DB. The manager can then query the new state data from the backend DB to let the admin know the state of the managed devices.
When the Nebula worker downloads and updates the application while reporting the state using Kafka, I get the following error:
Recreating 3cc087462a4c_worker ... done
Attaching to worker
worker | reading config variables
worker | reading config variables
worker | /usr/local/lib/python3.7/site-packages/parse_it/file/file_reader.py:55: UserWarning: config_folder_location does not exist, only envvars & cli args will be used
worker | warnings.warn("config_folder_location does not exist, only envvars & cli args will be used")
worker | logging in to registry
worker | {'IdentityToken': '', 'Status': 'Login Succeeded'}
worker | checking nebula manager connection
worker | nebula manager connection ok
worker | stopping all preexisting nebula managed app containers in order to ensure a clean slate on boot
worker | stopping container e02f34d03c880a47cc33cb51b5e84578f7e387f305e618843a9c8e229ccd93cb
worker | removing container e02f34d03c880a47cc33cb51b5e84578f7e387f305e618843a9c8e229ccd93cb
worker | initial start of example app
worker | pulling image <my_registry_url>/flask:latest
worker | <my_registry_url>/flask
worker | {
worker | "status": "Pulling from flask",
worker | "id": "latest"
worker | }
worker | {
worker | "status": "Digest: sha256:6f51939e6d3dff3fdfebdeb639ddad00c3671d5f0b241666c9e140d1bfa7883c"
worker | }
worker | {
worker | "status": "Status: Image is up to date for <my_registry_url>/flask:latest"
worker | }
worker | creating container example-1
worker | successfully created container example-1
worker | starting container example-1
worker | completed initial start of example app
worker | starting work container health checking thread
worker | creating reporting kafka connection object
worker | failed creating reporting kafka connection object - exiting
worker | NoBrokersAvailable
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
opened MongoDB connection
starting to digest messages from kafka
Note: as the Kafka logs are too big I haven't added them, but if you need them for debugging I can attach the log file.
Configured the worker on Raspberry Pi using the docker-compose.yml and custom Docker build mentioned in the Specifications section.
Configured Manager, Reporter, Mongo, Kafka, and Zookeeper on Ubuntu 18.04 using the docker-compose.yml mentioned in the Specifications section.
Configured a private Docker registry for maintaining the update releases and images.
On the worker side, as I am using a Raspberry Pi, I had to build the image on the Pi and start the container. To achieve this I did the following steps.
Dir structure:
- Nebula worker
  - Dockerfile
  - docker-compose.yml
  - worker/ (directory where all the source code is)
Dockerfile:
# it's official so I'm using it + alpine so damn small
FROM python:3.7.2-alpine3.9

# copy the codebase
COPY . /worker

# install required packages - requires build-base due to psutil GCC compiler requirements
RUN apk add --no-cache build-base python3-dev linux-headers
RUN pip install -r /worker/worker/requirements.txt

# set python to be unbuffered
ENV PYTHONUNBUFFERED=1

# run the worker-manager
WORKDIR /worker
CMD [ "python", "worker/worker.py" ]
docker-compose.yml for the worker:
version: '3'
services:
  worker:
    container_name: worker
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    restart: unless-stopped
    hostname: worker
    environment:
      REGISTRY_HOST: <my_registry_url>
      REGISTRY_AUTH_USER: <my_registry_user>
      REGISTRY_AUTH_PASSWORD: <my_registry_password>
      MAX_RESTART_WAIT_IN_SECONDS: 0
      NEBULA_MANAGER_AUTH_USER: nebula
      NEBULA_MANAGER_AUTH_PASSWORD: nebula
      NEBULA_MANAGER_HOST: <my_vps_url>
      NEBULA_MANAGER_PORT: 80
      NEBULA_MANAGER_PROTOCOL: http
      NEBULA_MANAGER_CHECK_IN_TIME: 5
      DEVICE_GROUP: example
      KAFKA_BOOTSTRAP_SERVERS: <my_vps_url>:9092
      KAFKA_TOPIC: nebula-reports
docker-compose.yml for the VPS:
version: '3'
services:
  mongo:
    container_name: mongo
    hostname: mongo
    image: mongo:4.0.1
    ports:
      - "27017:27017"
    restart: unless-stopped
    environment:
      MONGO_INITDB_ROOT_USERNAME: nebula
      MONGO_INITDB_ROOT_PASSWORD: nebula
  manager:
    container_name: manager
    hostname: manager
    depends_on:
      - mongo
    image: nebulaorchestrator/manager
    ports:
      - "80:80"
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: nebula
      BASIC_AUTH_USER: nebula
      AUTH_TOKEN: nebula
  zookeeper:
    container_name: zookeeper
    hostname: zookeeper
    image: zookeeper:3.4.13
    ports:
      - 2181:2181
    restart: unless-stopped
    environment:
      ZOO_MY_ID: 1
  kafka:
    container_name: kafka
    hostname: kafka
    image: confluentinc/cp-kafka:5.1.2
    ports:
      - 9092:9092
    restart: unless-stopped
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  reporter:
    container_name: reporter
    hostname: reporter
    depends_on:
      - mongo
      - kafka
    image: nebulaorchestrator/reporter
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: nebula
      BASIC_AUTH_USER: nebula
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_TOPIC: nebula-reports
Expected/Wanted Behaviour
The version of the worker should be auto-set to match the branches.
The $TRAVIS_BRANCH envvar will likely be part of the solution, but as this should be a change in the codebase itself (rather than just assigning the variable value) a more complex solution than just os.getenv("TRAVIS_BRANCH") will be needed.
Once there is a version as part of the codebase it should be added to the reports that are sent to the optional reporting system.
Possible solutions include:
Related to #41
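One hedged sketch of the direction: bake the branch name into the codebase at CI time so the worker can later include it in reports (the worker/version.py path is an assumption):

import os

# hypothetical sketch: run during the Travis build to write the version into
# the codebase, defaulting when built outside CI
with open("worker/version.py", "w") as version_file:
    version_file.write('VERSION = "{}"\n'.format(os.getenv("TRAVIS_BRANCH", "dev")))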
Actual Behaviour
Version is manually set before deployments.
I have configured the Nebula worker on the Raspberry Pi, and Mongo, Nebula Manager, Reporter, Kafka and Zookeeper on the VPS (which is an Ubuntu 18.04 machine).
I want the status of whether the remote device updated or the update failed, using an API call.
I am referring to the Nebula documentation https://nebula.readthedocs.io/en/latest/api/general/ and have tried the "List a filtered paginated view of the optional reports system" section from it.
I get the following information when I try the API http://<my_vps_url>/api/v2/reports?page_size=1:
{
"data": [
{
"_id": {
"$oid": "5d20785900bb37cdd5352c5c"
},
"memory_usage": {
"total": 926,
"used": 159,
"free": 91,
"available": 680
},
"root_disk_usage": {
"total": 14890,
"used": 2140,
"free": 12115
},
"cpu_usage": {
"cores": 4,
"used_percent": 0.6
},
"cron_jobs_containers": [],
"apps_containers": [
{
"read": "0001-01-01T00:00:00Z",
"preread": "0001-01-01T00:00:00Z",
"pids_stats": {},
"blkio_stats": {
"io_service_bytes_recursive": null,
"io_serviced_recursive": null,
"io_queue_recursive": null,
"io_service_time_recursive": null,
"io_wait_time_recursive": null,
"io_merged_recursive": null,
"io_time_recursive": null,
"sectors_recursive": null
},
"num_procs": 0,
"storage_stats": {},
"cpu_stats": {
"cpu_usage": {
"total_usage": 0,
"usage_in_kernelmode": 0,
"usage_in_usermode": 0
},
"throttling_data": {
"periods": 0,
"throttled_periods": 0,
"throttled_time": 0
}
},
"precpu_stats": {
"cpu_usage": {
"total_usage": 0,
"usage_in_kernelmode": 0,
"usage_in_usermode": 0
},
"throttling_data": {
"periods": 0,
"throttled_periods": 0,
"throttled_time": 0
}
},
"memory_stats": {},
"name": "/example-1",
"id": "dafc6f075726d61a6b2bc3feffe0cecb738bd43d04eca89c6f3fa72dd9d50193"
}
],
"current_device_group_config": {
"status_code": 200,
"reply": {
"apps": [
{
"app_id": 1,
"app_name": "example",
"starting_ports": [
8080
],
"containers_per": {
"server": 1
},
"env_vars": {},
"docker_image": "<my_registry_url>/flask",
"running": true,
"networks": [
"nebula"
],
"volumes": [
"/tmp:/tmp/1",
"/var/tmp/:/var/tmp/1:ro"
],
"devices": [],
"privileged": false,
"rolling_restart": false
}
],
"apps_list": [
"example"
],
"prune_id": 1,
"cron_jobs": [],
"cron_jobs_list": [],
"device_group_id": 1
}
},
"device_group": "example",
"report_creation_time": 1562409049,
"hostname": "worker",
"report_insert_date": {
"$date": 1562409049716
}
}
],
"last_id": {
"$oid": "5d20785900bb37cdd5352c5c"
}
}
I am unable to find which key from the above API response can tell me if the device updated or failed, or whether there is another API for this purpose (I am unable to find any other API for it).
I also checked the database and got the following results:
# mongo
> use nebula
switched to db nebula
> show collections
nebula_apps
nebula_cron_jobs
nebula_device_groups
nebula_reports
nebula_user_groups
nebula_users
I checked the nebula_reports collection and got the same output as with the above API call.
What am I doing wrong here?
Hello,
I have a Raspberry Pi (this is my edge device, where I have configured a worker) and a server (where I have a Docker registry, Nebula Manager, and MongoDB).
The image should be downloaded on the edge device from the remote registry once a new image becomes available, and a container should then start.
I am facing an issue when starting the container on the edge device:
creating container example-1
successfully created container example-1
starting container example-1
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/docker/api/client.py", line 261, in _raise_for_status
response.raise_for_status()
File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.39/containers/example-1/start
completed initial start of example app
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/worker/worker/functions/docker_engine/docker_engine.py", line 168, in start_container
return self.cli.start(container_name)
File "/usr/local/lib/python3.7/site-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/docker/api/container.py", line 1093, in start
self._raise_for_status(res)
File "/usr/local/lib/python3.7/site-packages/docker/api/client.py", line 263, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/usr/local/lib/python3.7/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found ("network nebula not found")
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/threading.py", line 917, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.7/threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "/worker/worker/functions/docker_engine/docker_engine.py", line 271, in run_container
self.start_container(container_name)
File "/worker/worker/functions/docker_engine/docker_engine.py", line 169, in start_container
except "APIError" as e:
TypeError: catching classes that do not inherit from BaseException is not allowed
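The final TypeError comes from the worker's start_container catching the string "APIError" instead of the exception class, which also masks the real failure ("network nebula not found"). A hedged sketch of what the corrected handler might look like, assuming the low-level docker-py APIClient:

import docker
import docker.errors

client = docker.APIClient()

def start_container(container_name):
    # catch the real exception class - except "APIError" raises TypeError
    try:
        return client.start(container_name)
    except docker.errors.APIError as error:
        print("problem starting container " + container_name)
        print(error)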
Raspberry Pi ARMv7 arch.
docker-compose for Manager and MongoDB:
version: '3'
services:
  mongo:
    container_name: mongo
    hostname: mongo
    image: mongo:4.0.1
    ports:
      - "27017:27017"
    restart: unless-stopped
    environment:
      MONGO_INITDB_ROOT_USERNAME: <my-password>
      MONGO_INITDB_ROOT_PASSWORD: <my-password>
  manager:
    container_name: manager
    hostname: manager
    depends_on:
      - mongo
    image: nebulaorchestrator/manager
    ports:
      - "80:80"
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: <my-password>
      BASIC_AUTH_USER: <my-password>
      AUTH_TOKEN: <my-password>
For the Raspberry Pi, I have cloned the worker repo from GitHub and I am using it to build the worker.
My dir structure:
- Nebula worker
  - Dockerfile
  - docker-compose.yml
  - worker/ (directory where all the source code is)
Dockerfile:
# it's official so I'm using it + alpine so damn small
FROM python:3.7.2-alpine3.9

# copy the codebase
COPY . /worker

# install required packages - requires build-base due to psutil GCC compiler requirements
RUN apk add --no-cache build-base python3-dev linux-headers
RUN pip install -r /worker/worker/requirements.txt

# set python to be unbuffered
ENV PYTHONUNBUFFERED=1

# run the worker-manager
WORKDIR /worker
CMD [ "python", "worker/worker.py" ]
docker-compose.yml:
version: '3'
services:
  worker:
    container_name: worker
    image: nebulaorchestrator/worker-arm64v8:latest
    #build:
    #  context: .
    #  dockerfile: Dockerfile
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    restart: unless-stopped
    hostname: worker
    environment:
      REGISTRY_HOST: <My_Registry_URL>
      REGISTRY_AUTH_USER: <Registry_User>
      REGISTRY_AUTH_PASSWORD: <Registry_Password>
      MAX_RESTART_WAIT_IN_SECONDS: 0
      NEBULA_MANAGER_AUTH_USER: <my-password>
      NEBULA_MANAGER_AUTH_PASSWORD: <my-password>
      NEBULA_MANAGER_HOST: <Manager-URL>
      NEBULA_MANAGER_PORT: 80
      NEBULA_MANAGER_PROTOCOL: http
      NEBULA_MANAGER_CHECK_IN_TIME: 5
      DEVICE_GROUP: example
      #KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      #KAFKA_TOPIC: nebula-reports
Note: to check that my configurations are right, I configured Nebula Manager, MongoDB and the Nebula worker on my local machine and tested that everything works as per the expected behaviour. It works properly in that case, but on the Raspberry Pi I am facing the above-mentioned issue.
Similar to nebula-orchestrator/manager#16
RabbitMQ connections should be closed whenever their usage ends and their attached channel is closed.
Currently the channel is closed but the connection is kept until it times out.
The rabbit_login function should also return the "rabbit_connection" & not just the "rabbit_connection_channel", and whenever the channel is closed explicitly it needs to close the connection as well (see the sketch below).
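A minimal sketch of that shape, assuming pika (parameter names follow current pika releases):

import pika

# hypothetical sketch: return the connection alongside the channel so callers
# can close both when done
def rabbit_login(host, vhost, user, password, heartbeat=3600):
    credentials = pika.PlainCredentials(user, password)
    parameters = pika.ConnectionParameters(host=host, virtual_host=vhost,
                                           credentials=credentials,
                                           heartbeat=heartbeat)
    rabbit_connection = pika.BlockingConnection(parameters)
    return rabbit_connection, rabbit_connection.channel()

# caller side - close the connection, not just the channel
connection, channel = rabbit_login("rabbit_host", "nebula", "user", "pass")
channel.close()
connection.close()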
After Travis CI runs the unit tests successfully it should build the Docker image & push it to Docker Hub.
If the branch is master the image tag should be "latest", otherwise it should be the same as the branch name (can look at the TRAVIS_BRANCH envvar in Travis CI to get the name of the branch being built).
Docker Hub should not build the images anymore, as it ignores unit test failures; the Docker Hub build tag should also be removed from the README.md file.
Currently Docker Hub builds the images while Travis CI runs the unit tests, each ignoring the results of the other.
As the registry login is kept for as long as the docker_socket connection is alive, there is really no need to keep re-logging in to the registry every time the worker-manager accesses it; it should happen only once, after the docker_socket is created. This will simplify things, avoid unneeded API calls to the registry, and allow removal of a lot of registry user/pass/host variables from a bunch of modules.
Should make everything much less ugly & easier to maintain.
drone.io is CI/CD with Docker in mind, which meshes well with Nebula; the ARM step should first be moved to build on it &, depending on the results, maybe the x64 build as well (as Travis-CI is also a great tool).
Builds should work/fail on their own merit & not on the build system's idiosyncrasies; the current Shippable-based system has an open bug which affects multiple builds on & off, without any acknowledgement from the Shippable support team, much less work on resolving it.
The rolling restart function is currently just a placeholder; this really needs fixing so users will have that option as well as a hard restart of all containers.
Upon boot of the worker-manager, MongoDB should provide the data for all the apps said worker is configured to manage, and the worker should then disconnect (see the sketch below).
Currently the worker-manager uses a different thread per app to connect to MongoDB, which results in multiple simultaneous connections to Mongo (one per app the worker manages) rather than just one.
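A minimal sketch of the single boot-time connection, assuming pymongo and the nebula_apps collection shown earlier:

from pymongo import MongoClient

# hypothetical sketch: one connection fetches every managed app's config at
# boot, then disconnects
client = MongoClient("mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin")
managed_apps = ["app1", "app2"]  # whatever APP_NAME resolves to
app_configs = {app["app_name"]: app for app in
               client["nebula"]["nebula_apps"].find({"app_name": {"$in": managed_apps}})}
client.close()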
As mentioned in #19, having user network options might be a good idea, allowing inner-pod communication between containers via container-hostname DNS resolution.
There are 2 options to go about it and I'm still undecided which will be better:
It's really a question of customizability vs sane defaults. Thoughts?
Allow the worker container to update itself to a newer version deployed to remote devices.
Currently it is not possible to update the worker container.
Upon starting, a new worker should request the newest app config via RabbitMQ; this should be done via a queue that all the api-managers listen to and reply on in a new thread.
Currently, upon starting, a new worker connects directly to MongoDB & gets the current app config from it; this is not ideal, as it requires a read-only MongoDB connection reachable from every worker for the initial sync.
Currently the requirements.txt is a mess; each repo should only include the requirements it actually needs to function, not garbage that was either once needed but no longer is, or that is needed by another repo in the Nebula project but not this one.
Because security is important.
The container should be on the host network, but no network is set.
I used the following config:
{
  "starting_ports": [],
  "containers_per": {"server": 1},
  "env_vars": {"ENV": "dev"},
  "docker_image": "mine/sensu-client",
  "running": true,
  "networks": ["host"],
  "privileged": true,
  "devices": [],
  "volumes": []
}
But inside the container, ifconfig just shows the lo NIC.
Not automatically removing images & re-pulling them might have some uses in cases where you want to reuse locally built images. When this is added there should also be a way to force/order GC of older unused images (one of the original reasons why Nebula currently deletes all images so aggressively); otherwise I can imagine IoT devices getting filled with old images quickly.
All prereqs in requirements.txt need to be up to date.
Currently some packages are outdated.
Some apps (like log aggregation) might require running as privileged containers; support for that will help.
Relating to nebula-orchestrator/manager#2, the worker should be changed to allow connecting to the manager with a Bearer UUID token as another option in addition to using a basic auth user/pass (see the sketch below).
Currently the worker connects to the manager using basic auth user/pass as the only option.
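A minimal sketch of the dual-auth request path, assuming requests is used for manager calls:

import requests

# hypothetical sketch: prefer a Bearer token when configured, otherwise fall
# back to the existing basic auth user/pass
def manager_request(url, token=None, user=None, password=None):
    if token is not None:
        return requests.get(url, headers={"Authorization": "Bearer " + token})
    return requests.get(url, auth=(user, password))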
How is the networking handled when the containers are distributed across hosts/regions?
Say you have an app consisting of two microservices that need to communicate; how is this handled? Like subnet or IP assignment...
This issue is opened with ref #63.
I want the complete device update history, storing only those reports which contain the status of updates (i.e. fail or success). This data could be purged after 11 or 12 months if required.
The reason for this mechanism is to avoid a large volume of data accumulating in MongoDB.
To elaborate on the above two points: I want the current behavior, where I get the device state (for example whether the container is running, RAM, CPU, etc.) continuously according to NEBULA_MANAGER_CHECK_IN_TIME, which I can purge after some time (for example six months); but I also want an update report (telling me when the device was updated and which release it has) only when the end device updated successfully or the update failed. I want to retain the data for the second behavior comparatively longer than for the first.
Currently the worker continuously sends data to the reporter as per the time defined in NEBULA_MANAGER_CHECK_IN_TIME.
A large volume of data accumulates, which makes it difficult to maintain the data for 11 to 12 months. It also becomes difficult to keep track of the updates for multiple devices with such a large volume of data.
Expected: create new branch -> trigger Travis build -> new version of branch deployed.
Actual: create new branch -> [skip travis] included so it doesn't trigger a Travis build -> new version of branch not deployed -> enter the new branch -> make a push in the new branch -> Travis now runs.
required for some of the newer features of Docker
The Dockerfile should have all the required pip module dependencies version-locked; this avoids containers built down the line failing due to breaking changes in updated dependencies.
Adding support for devices being used from inside containers will allow simpler usability for externally mounted devices (such as USB devices), which would help in easing IoT implementations.
MariaDB Galera replication is a great fit for Nebula's huge reads/low writes; it should be added as another option alongside MongoDB.
I can see some cases where inheriting envvars from the worker-manager host node would be a good idea; that would allow devices to have some customizability that might be useful for distributed systems that are still managed centrally (same IoT sensor everywhere, but a tag is manually set on each sensor with its location name, etc.). Thinking it could be any combination of the following:
Not sure yet how to handle multiple apps each getting different envvars; anyone got any ideas in that regard?
Support for Docker storage & network plugins will allow using Docker to its full capabilities.
Following nebula-orchestrator/manager#29, the worker will need to be changed to support the new cron_jobs option; this will require a few changes.
(Thinking of using https://pypi.org/project/croniter/ for parsing cron expressions to datetimes; from there the logic is very simple, see the sketch below.)
The current workaround is to either have a cron service managed as a Nebula app that in turn starts containers based on its cron definitions, or to have every task that needs to run on a schedule be its own Nebula app with internal logic that waits for the right time to run.
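A minimal sketch of the croniter usage:

from datetime import datetime
from croniter import croniter

# hypothetical sketch: compute how long until a cron_job should next run
cron_expression = "*/5 * * * *"  # every 5 minutes
next_run = croniter(cron_expression, datetime.now()).get_next(datetime)
seconds_until_run = (next_run - datetime.now()).total_seconds()
print("next run of cron_job in {:.0f} seconds".format(seconds_until_run))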