Comments (7)
Hi Sean,
The two main reasons are:
- Size of the final image
- Simplicity in creating deb/rpm/tar packages that work the same across all distros (portability)
The Docker image serves as a canonical bundle of the agent. We bundle a Python runtime and Java runtime (OpenJDK 1.8) as well as quite a few shared libraries for collectd. For simplicity and consistency, we just gather dependencies on Ubuntu 16.04 and then just dump them into the scratch image, and then use that as the basis for the RPM and deb packages (basically the filesystem of the Docker image gets installed to `/usr/lib/signalfx-agent` if you use the RPM or deb). While deploying the agent in a container is not uncommon (esp in environments like K8s), it is still very common to deploy it directly on hosts as well, so this gives us a single bundle to maintain and gives us library consistency across all distros. There are also some users that require running the agent in highly restrictive environments where they do not have package managers or access to traditional install paths, so we can very easily make a tarball of the Docker image that they can extract to anywhere in the host filesystem and run in a self-contained way.
Install tools into the image at runtime to diagnose issues regarding networking, performance, utilisation
complete linux environment for when debugging issues
We do include several useful utilities in the image that allow at least some basic debugging. One thing that I am thinking about doing is just include a statically linked busybox instance in there to get more utilities for possibly less space usage. Were there any tools in particular you have wanted to use in an agent container that weren't there?
Support for dumb-init to avoid the zombie process problem
If you want to use dumb-init, you can just download the static binary of it with curl (which is included in the current agent image) and make it the entrypoint in your own Dockerfile that `FROM`s the agent image. I'm not really that concerned about zombies though since there is generally only a single level of subprocesses under the agent (i.e. `collectd` and `python`) and the agent should be `wait`ing for them whenever it makes them (or else it is an agent bug). If you are using custom collectd plugins or your own Python-based monitors that make a bunch of child procs, it could be a concern -- but you are going to want to make a custom image anyway to bundle those with the agent, so you can just throw in a formal init process like dumb-init there. It is going to create a new layer regardless of how you install it since it isn't part of distros by default.
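As a rough illustration of what "`wait`ing for them" means, here is a minimal Python sketch of a parent that starts a child and collects its exit status. This is not the agent's code; the function name `run_child_and_reap` and the stand-in child command are made up for the example.

```python
import subprocess
import sys
from typing import List


def run_child_and_reap(args: List[str]) -> int:
    """Start a child process and wait for it so it does not linger as a zombie."""
    proc = subprocess.Popen(args)
    # wait() collects the child's exit status. A child that has exited but
    # has not been waited on stays in the process table as a zombie.
    return proc.wait()


# Stand-in child for the example: a Python interpreter that exits right away.
exit_code = run_child_and_reap([sys.executable, "-c", "pass"])
```

A parent that does this for every direct child has nothing left for an init process to clean up. dumb-init matters when grandchildren get orphaned and re-parented to PID 1.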
We are currently in the process of making an image using a RHEL base image in order to get Red Hat container certification. For that we are using a minimalistic RHEL distro called RHEL-atomic, and even then the overhead of having the full distro environment is around 113MB (or ~20% greater than the normal agent image). When I was first doing it I used plain RHEL7 and it was about 300MB larger (~70%) than the current image.
packages installed can be updated easily
It is generally not a standard practice to upgrade packages in containers at runtime. Even if you could upgrade them, the agent processes using them would have to restart to get the updated libraries, which would make the container die and all the upgrades would be lost. We currently do a `docker build --pull` on every build to ensure we have the latest Ubuntu base image (which gets updated pretty regularly), which also causes all of the latest packages to be installed, so they should be fairly up to date already.
I'm all for making the agent image more convenient for users, but I would rather do it by directly putting a larger suite of tools in the agent image and make it a little bit bigger vs. putting a full blown distro in the image and making it a lot bigger.
from signalfx-agent.
I really appreciate your response @keitwb,
To be honest, I never thought of it within the context listed.
The reason I brought it up was mainly for the `dumb-init` part and the frustration I have had dealing with some networking issues in the past.
The reason I wanted something like a package manager is that I extend your agent image to include some additional config and scripts, so being able to avoid bumping versions to resolve potential security issues would be amazing, but I understand if this can't be provided by you guys.
Is there any consideration for using an alpine image?
I can't think offhand of any tools that aren't part of the final Docker image; I think ping was one I was trying to use that wasn't there.
I was more interested in bringing in dumb-init just as a baseline of protection for our teams that have to run the final image we provide. We are also considering piping log data from the smart agent into a filter for known things we don't care about, such as known missing values that generate a log line on each poll, which is another reason for adding dumb-init to the image.
I get where you are coming from; this is more about the 1% of times where it would be nice not to be based on scratch.
To make sure I understand your use case properly, are you talking about `exec`ing into the container at runtime and installing stuff in a shell or are you talking more about making a custom image that extends the image we release on quay.io?
If you find the missing value warning annoying, see #728. Yeah, I would agree that if you are extending the agent image with additional processes like a log filter in the same container, a proper init process is a good idea. Hopefully that is easy enough for you to add with curl and a statically compiled dumb-init though (I understand other tools are easier to get with a package manager, however).
Alpine is probably the only distro right now we would consider. I'll look into it when I get some time and see what the size implications of it are and how hard it would be to dump the image contents with only slight modifications to create portable deb/rpm/tar packages (the second main reason we use scratch right now).
Mainly the second part of the first statement. I have only ever needed to get into the container once in order to debug something. The only thing that is custom about our image is that we add config into it.
The issue that we are seeing with logs is that errors from collectd plugins are repeated with every flush / poll, which causes rather noisy logs that make it into our logging solution. We wanted to pipe the logs from the smart agent through a filter (basically a rather lengthy `grep ...`) for errors we know are present but simply don't care about, or that are known issues due to an outdated agent version. This is where my interest in `dumb-init` comes in.
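A filter like that could be sketched in Python roughly as follows. This is only an illustration of the idea: the patterns in `IGNORED_PATTERNS` are hypothetical placeholders rather than real agent log messages, and the names `filter_lines` and `main` are made up for the example.

```python
import re
import sys
from typing import Iterable, Iterator

# Hypothetical patterns for log lines that are already known and unwanted.
# Replace these placeholders with the messages you actually want to drop.
IGNORED_PATTERNS = [
    re.compile(r"known missing value"),
    re.compile(r"some known plugin error"),
]


def filter_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that match none of the ignored patterns."""
    for line in lines:
        if not any(pattern.search(line) for pattern in IGNORED_PATTERNS):
            yield line


def main() -> None:
    """Copy stdin to stdout, dropping ignored lines and flushing per line."""
    for kept in filter_lines(sys.stdin):
        sys.stdout.write(kept)
        sys.stdout.flush()
```

To use it in a pipe, call `main()` at the bottom of the script and send the agent's output through it. Because that puts a second process in the container, this is the setup where a real init process such as dumb-init starts to pay off.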
I can take a stab at the alpine version of the agent if you like, or at least offer a PR with my suggestions for it?
I take that last part back... I had a look at that Dockerfile, and it is unwieldy.
Yeah the Dockerfile is pretty convoluted due to the bundling of all the runtimes and collectd. I'll look into the Alpine base soon.
Hey everyone -- there are no plans to change this. As we look to OpenTelemetry Collector, it also leverages a scratch image. Closing this issue as wontfix. Thanks for the dialog!