Comments (6)
Issue-Label Bot is automatically applying the label bug
to this issue, with a confidence of 0.58. Please mark this comment with
from worker.
Hi @Sharvin26
I first want to confirm that I understand the issue correctly:
- Your DB & manager work correctly
- You have created an example device group & app
- When you run a worker on a Linux x64 based host everything works
- When you run a worker on a Raspberry Pi, the worker itself starts, but you get the above error when it tries running the example app as part of the example device group it's connected to
- You're using a customized worker that you're building yourself on the Raspberry Pi because (I assume) it's an ARMv7
If I got anything wrong, let me know. Otherwise, I think it's safe to assume that the issue lies with the ARMv7 implementation (as you mention, it works on an x64 Linux host with the same config), so here are a few things that come to mind as possible causes:
- Silly question, but are you running docker-compose without root permissions? Please try running it both with sudo and as the root user, and/or grant the user you're running it as permissions to the Docker engine.
- The docker.errors.NotFound: 404 Client Error: Not Found ("network nebula not found") line in the logs leads me to believe that the nebula network was either never created on the worker or was deleted at some point. Both cases are very odd, as the Nebula worker ensures that the default network is always created as part of its boot process. Can you provide the logs of the worker boot as well, rather than just the part where the issue occurs?
- Can you run docker network ls on the worker Pi and share the results? I want to see whether it created the nebula network and just has issues connecting to it, or whether it failed to create it at all.
- Can you try running the worker with docker run (no docker-compose) to test? If it works that way, we know the issue is in how it interacts with docker-compose. A Google search for requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.39/containers/example-1/start brought up a lot of past tickets that revolved around docker-compose and root permissions, so it's worth checking to confirm. docker-compose creating its own networks might also be related somehow.
P.S. In the docker-compose.yml you use an image named nebulaorchestrator/worker-arm64v8:latest. I think you meant it to be arm64v7?
Hello @naorlivne
Thanks for the response.
- Your DB & manager work correctly
Yes, the DB and manager are working correctly.
- You have created an example device group & app
Yes, I have created an example device group and app.
- When you run a worker on a Linux x64 based host everything works
Yes, on an x64 based host everything works correctly.
- When you run a worker on a Raspberry Pi, the worker itself starts, but you get the above error when it tries running the example app as part of the example device group it's connected to
The Nebula worker starts on the Raspberry Pi without any issue. But when a new image is pushed to the Docker registry and the manager is notified of it, the worker pulls the image from the registry, and when it starts the container I get the above-mentioned issue.
- You're using a customized worker that you're building yourself on the Raspberry Pi because (I assume) it's an ARMv7
Yes, I cloned this repository and then changed the directory structure as described in my first comment.
Note: The docker-compose file for the worker in my first comment had one mistake, which I have corrected here. I am not pulling the image from Docker Hub; I am building it by cloning the source code from this repo. When pulling the image from Docker Hub I get this error:
Pulling worker (nebulaorchestrator/worker:arm64v7)...
ERROR: manifest for nebulaorchestrator/worker:arm64v7 not found
and for arm64v8 I get this error:
Pulling worker (nebulaorchestrator/worker:arm64v8)...
arm64v8: Pulling from nebulaorchestrator/worker
--- Downloading and Extracting ---
Digest: sha256:0f37da08ec05f420a3cc286bef716f98e99442e392e171bd4bdb2848161240da
Status: Downloaded newer image for nebulaorchestrator/worker:arm64v8
Creating worker ... done
Attaching to worker
worker | standard_init_linux.go:211: exec user process caused "exec format error"
I tried both options and neither works. When I searched for the standard_init_linux.go:211 error, I found that it is an architecture-related issue, so I am building the image on the Raspberry Pi by cloning this repo.
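An "exec format error" generally means the image's binaries were built for a different CPU architecture than the host. As a rough sketch (the helper name `docker_arch` and the mapping table are illustrative assumptions, not part of Nebula), the host architecture can be checked from Python and compared against the image tag before pulling:

```python
import platform

# Illustrative mapping from `platform.machine()` values to the architecture
# names commonly used for Docker images; extend it for other hosts as needed.
_MACHINE_TO_DOCKER_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm/v7",
    "armv6l": "arm/v6",
}


def docker_arch(machine=None):
    """Return the Docker-style architecture name for a machine string.

    Falls back to the current host when `machine` is not given and returns
    None for values that are not in the table.
    """
    if machine is None:
        machine = platform.machine()
    return _MACHINE_TO_DOCKER_ARCH.get(machine.lower())


if __name__ == "__main__":
    print(platform.machine(), "->", docker_arch())
```

On a Raspberry Pi running a 32-bit OS, `platform.machine()` typically reports `armv7l`, in which case an arm64v8 image would not be expected to run there.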
version: '3'
services:
  worker:
    container_name: worker
    #image: nebulaorchestrator/worker-arm64v8:latest
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    restart: unless-stopped
    hostname: worker
    environment:
      REGISTRY_HOST: < My_Registry_URL >
      REGISTRY_AUTH_USER: < Registry_User >
      REGISTRY_AUTH_PASSWORD: < Registry_Password >
      MAX_RESTART_WAIT_IN_SECONDS: 0
      NEBULA_MANAGER_AUTH_USER: <my-password>
      NEBULA_MANAGER_AUTH_PASSWORD: <my-password>
      NEBULA_MANAGER_HOST: < Manager-URL >
      NEBULA_MANAGER_PORT: 80
      NEBULA_MANAGER_PROTOCOL: http
      NEBULA_MANAGER_CHECK_IN_TIME: 5
      DEVICE_GROUP: example
      #KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      #KAFKA_TOPIC: nebula-reports
Results for the steps that you advised me to perform in the above comment:
- Yes, I am running Docker with sudo and as the root user.
- Boot log of the Nebula worker:
=> docker-compose up
Creating worker ... done
Attaching to worker
worker | reading config variables
worker | /usr/local/lib/python3.7/site-packages/parse_it/file/file_reader.py:55: UserWarning: config_folder_location does not exist, only envvars & cli args will be used
worker | warnings.warn("config_folder_location does not exist, only envvars & cli args will be used")
worker | reading config variables
worker | logging in to registry
worker | {'IdentityToken': '', 'Status': 'Login Succeeded'}
worker | checking nebula manager connection
worker | nebula manager connection ok
worker | stopping all preexisting nebula managed app containers in order to ensure a clean slate on boot
worker | initial start of example app
worker | pulling image <my_registry_url>/ubuntu:latest # Note: my actual registry URL is printed here; I have replaced it with my_registry_url as an example
worker | <my_registry_url>/ubuntu
worker | {
worker | "status": "Pulling from ubuntu",
worker | "id": "latest"
worker | }
worker | {
worker | "status": "Digest: sha256:8ee703cfd6d7d4d2c69971989bd4d20221ff7f0e7fa459c4de14e814394757b0"
worker | }
worker | {
worker | "status": "Status: Image is up to date for <my_registry_url>/ubuntu:latest"
worker | }
worker | creating container example-1
worker | successfully created container example-1
worker | starting container example-1
worker | Exception in thread Thread-1:
worker | Traceback (most recent call last):
worker | File "/usr/local/lib/python3.7/site-packages/docker/api/client.py", line 261, in _raise_for_status
worker | response.raise_for_status()
worker | File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
worker | raise HTTPError(http_error_msg, response=self)
worker | requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.39/containers/example-1/start
worker |
worker | During handling of the above exception, another exception occurred:
worker |
worker | Traceback (most recent call last):
worker | File "/worker/worker/functions/docker_engine/docker_engine.py", line 168, in start_container
worker | return self.cli.start(container_name)
worker | File "/usr/local/lib/python3.7/site-packages/docker/utils/decorators.py", line 19, in wrapped
worker | return f(self, resource_id, *args, **kwargs)
worker | File "/usr/local/lib/python3.7/site-packages/docker/api/container.py", line 1093, in start
worker | self._raise_for_status(res)
worker | File "/usr/local/lib/python3.7/site-packages/docker/api/client.py", line 263, in _raise_for_status
worker | raise create_api_error_from_http_exception(e)
worker | File "/usr/local/lib/python3.7/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
worker | raise cls(e, response=response, explanation=explanation)
worker | docker.errors.NotFound: 404 Client Error: Not Found ("network nebula not found")
worker |
worker | During handling of the above exception, another exception occurred:
worker |
worker | Traceback (most recent call last):
worker | File "/usr/local/lib/python3.7/threading.py", line 917, in _bootstrap_inner
worker | self.run()
worker | File "/usr/local/lib/python3.7/threading.py", line 865, in run
worker | self._target(*self._args, **self._kwargs)
worker | File "/worker/worker/functions/docker_engine/docker_engine.py", line 271, in run_container
worker | self.start_container(container_name)
worker | File "/worker/worker/functions/docker_engine/docker_engine.py", line 169, in start_container
worker | except "APIError" as e:
worker | TypeError: catching classes that do not inherit from BaseException is not allowed
worker |
worker | completed initial start of example app
worker | starting work container health checking thread
worker | starting device_group example /info check loop, configured to check for changes every 5 seconds
- Results of the docker network ls command:
=> docker network ls
NETWORK ID          NAME                    DRIVER              SCOPE
f259bbd96621        bridge                  bridge              local
2d1d68f8ba8f        host                    host                local
408f63c676f6        nebula_default          bridge              local
1138713daa73        nebula_worker_default   bridge              local
354b0e702495        none                    null                local
- Running the image with the docker run command:
=> docker build -t nebula-worker .
=> docker run --restart=always -e DEVICE_GROUP="example" -e REGISTRY_HOST="<my_registry_url>" -e REGISTRY_AUTH_USER="<my_registry_user>" -e REGISTRY_AUTH_PASSWORD="<my_registry_password>" -e NEBULA_MANAGER_AUTH_USER="<nebula_user>" -e NEBULA_MANAGER_AUTH_PASSWORD="<nebula_password>" -e NEBULA_MANAGER_HOST="<my_nebula_url>" --name nebula-worker -v /var/run/docker.sock:/var/run/docker.sock nebula-worker
reading config variables
/usr/local/lib/python3.7/site-packages/parse_it/file/file_reader.py:55: UserWarning: config_folder_location does not exist, only envvars & cli args will be used
warnings.warn("config_folder_location does not exist, only envvars & cli args will be used")
reading config variables
logging in to registry
{'IdentityToken': '', 'Status': 'Login Succeeded'}
checking nebula manager connection
nebula manager connection ok
stopping all preexisting nebula managed app containers in order to ensure a clean slate on boot
initial start of example app
pulling image <my_registry_url>/ubuntu:latest
<my_registry_url>/ubuntu
{
"status": "Pulling from ubuntu",
"id": "latest"
}
{
"status": "Pulling fs layer",
"progressDetail": {},
"id": "890bdf70a444"
}
{
"status": "Pull complete",
"progressDetail": {},
"id": "42962dab4cbd"
}
--- Downloading and Extracting the Image
{
"status": "Digest: sha256:8ee703cfd6d7d4d2c69971989bd4d20221ff7f0e7fa459c4de14e814394757b0"
}
{
"status": "Status: Downloaded newer image for <my_registry_url>/ubuntu:latest"
}
creating container example-1
successfully created container example-1
starting container example-1
completed initial start of example app
starting work container health checking thread
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/docker/api/client.py", line 261, in _raise_for_status
response.raise_for_status()
File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.39/containers/example-1/start
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/worker/worker/functions/docker_engine/docker_engine.py", line 168, in start_container
return self.cli.start(container_name)
File "/usr/local/lib/python3.7/site-packages/docker/utils/decorators.py", line 19, in wrapped
return f(self, resource_id, *args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/docker/api/container.py", line 1093, in start
self._raise_for_status(res)
File "/usr/local/lib/python3.7/site-packages/docker/api/client.py", line 263, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/usr/local/lib/python3.7/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found ("network nebula not found")
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/threading.py", line 917, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.7/threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "/worker/worker/functions/docker_engine/docker_engine.py", line 271, in run_container
self.start_container(container_name)
File "/worker/worker/functions/docker_engine/docker_engine.py", line 169, in start_container
except "APIError" as e:
TypeError: catching classes that do not inherit from BaseException is not allowed
Results for docker --version and docker-compose --version:
# docker --version
Docker version 18.09.0, build 4d60db4
# docker-compose --version
docker-compose version 1.24.1, build 4667896
Note: Both Docker and docker-compose were installed using the method described in the Docker documentation.
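The final traceback in both runs points at a second, separate problem: `except "APIError" as e:` names a string rather than an exception class, so Python raises a TypeError while handling the original Docker error. A minimal sketch of the difference, using a stand-in `APIError` class (an assumption made so the example runs without Docker; in the worker the class would presumably come from `docker.errors`):

```python
class APIError(Exception):
    """Stand-in for docker.errors.APIError so the sketch runs without Docker."""


def start_container_buggy():
    try:
        raise APIError("network nebula not found")
    except "APIError" as e:  # a string is not an exception class
        return f"handled: {e}"


def start_container_fixed():
    try:
        raise APIError("network nebula not found")
    except APIError as e:  # catch the class itself
        return f"handled: {e}"


if __name__ == "__main__":
    try:
        start_container_buggy()
    except TypeError as err:
        print("buggy version:", err)
    print("fixed version:", start_container_fixed())
```

The buggy variant hides the real error behind the TypeError, which is why the "network nebula not found" message only appears in the middle of the chained traceback.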
Good news, I've found the root cause.
Nebula has a default network, creatively named nebula. On boot, the worker checks whether this network exists and creates it if it doesn't, or at least that is how it should work. Apparently there's a bug in the network check code: it returns true even if the nebula network doesn't exist, as long as another network name starts with "nebula" (basically a nebula* wildcard).
Because your docker-compose run created a network named nebula_worker_default, the check (wrongly) reports that the nebula network exists, so the worker doesn't try to create it. Then, when the time comes to actually use it (by running a container attached to it), it fails.
I'll push a fix to the worker master branch in the next few hours (and by extension to the next numbered version), but if you don't feel like waiting, just create a bridge network named nebula on your Pi until then.
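The faulty check described above can be illustrated with a small sketch. The function names are hypothetical and the worker's real code is not shown in this thread; the point is only the difference between a prefix match and an exact match on the network name:

```python
def network_exists_prefix(network_names, wanted):
    """Buggy variant: treats any name starting with `wanted` as a match."""
    return any(name.startswith(wanted) for name in network_names)


def network_exists_exact(network_names, wanted):
    """Fixed variant: only an identical name counts as a match."""
    return wanted in set(network_names)


if __name__ == "__main__":
    # Names taken from the `docker network ls` output earlier in the thread.
    networks = ["bridge", "host", "nebula_default", "nebula_worker_default", "none"]
    print(network_exists_prefix(networks, "nebula"))  # True
    print(network_exists_exact(networks, "nebula"))   # False
```

With the Docker SDK for Python, the list of names could be gathered with something like `[n.name for n in docker.from_env().networks.list()]`. Docker's own name filter also matches partial names, so an exact comparison on the client side is likely the safer check.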
Fix pushed to master. Can you do the following:
- Pull the latest codebase
- Rebuild your image
- Remove the manually created nebula network on your Pi (I want to confirm Nebula is able to create it on its own)
- Try rerunning the docker-compose based worker on your Pi
- Confirm everything works & close this ticket
As for the optional reporter system, it was added in 2.2.0, and the documentation you're looking at is for a rather old version (1.5.0), which is why you can't find anything on it. Please look at https://nebula.readthedocs.io/en/latest/ for the latest version of the documentation to read more about it.
If you have any more issues with the optional reporting or need a hand with it, please open another ticket about it; I'm trying to keep things orderly.
Thanks, it's working now.