
Comments (7)

keitwb avatar keitwb commented on August 13, 2024

Hi Sean,
The two main reasons are:

  • Size of the final image
  • Simplicity in creating deb/rpm/tar packages that work the same across all distros (portability)

The Docker image serves as a canonical bundle of the agent. We bundle a Python runtime and Java runtime (OpenJDK 1.8) as well as quite a few shared libraries for collectd. For simplicity and consistency, we just gather dependencies on Ubuntu 16.04 and then just dump them into the scratch image, and then use that as the basis for the RPM and deb packages (basically the filesystem of the Docker image gets installed to /usr/lib/signalfx-agent if you use the RPM or deb). While deploying the agent in a container is not uncommon (esp in environments like K8s), it is still very common to deploy it directly on hosts as well, so this gives us a single bundle to maintain and gives us library consistency across all distros. There are also some users that require running the agent in highly restrictive environments where they do not have package managers or access to traditional install paths, so we can very easily make a tarball of the Docker image that they can extract to anywhere in the host filesystem and run in a self-contained way.

Install tools into the image at runtime to diagnose issues regarding networking, performance, utilisation
complete linux environment for when debugging issues

We do include several useful utilities in the image that allow at least some basic debugging. One thing that I am thinking about doing is just including a statically linked busybox instance in there to get more utilities for possibly less space usage. Were there any tools in particular you have wanted to use in an agent container that weren't there?

Support for dumb-init to avoid the zombie process problem

If you want to use dumb-init, you can just download the static binary of it with curl (which is included in the current agent image) and make it the entrypoint in your own Dockerfile that FROMs the agent image. I'm not really that concerned about zombies though since there is generally only a single level of subprocesses under the agent (i.e. collectd and python) and the agent should be waiting for them whenever it makes them (or else it is an agent bug). If you are using custom collectd plugins or your own Python-based monitors that make a bunch of child procs, it could be a concern -- but you are going to want to make a custom image anyway to bundle those with the agent so you can just throw in a formal init process like dumb-init there. It is going to create a new layer regardless of how you install it since it isn't part of distros by default.
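For readers unfamiliar with what a "formal init process" does, here is a minimal Python sketch of the idea: start one main child, forward termination signals to it, and reap every child that exits so none linger as zombies. This is not dumb-init's actual implementation; the function name `run_as_init` is hypothetical, and orphaned grandchildren are only re-parented to this process when it runs as PID 1 in the container.

```python
import os
import signal
import sys
from typing import List


def run_as_init(argv: List[str]) -> int:
    """Run argv as the main child, reap all exited children, return its exit code."""
    main_pid = os.fork()
    if main_pid == 0:
        # Child: replace ourselves with the requested command.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)

    def forward(signum, _frame):
        # Pass the signal on to the main child if it is still around.
        try:
            os.kill(main_pid, signum)
        except ProcessLookupError:
            pass

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    while True:
        try:
            # Blocks until any child exits; this also collects orphans
            # that were re-parented to us, which is the zombie-reaping part.
            pid, status = os.waitpid(-1, 0)
        except ChildProcessError:
            # No children left at all.
            return 1
        if pid == main_pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            # Killed by a signal: follow the common 128 + signal convention.
            return 128 + os.WTERMSIG(status)


if __name__ == "__main__":
    sys.exit(run_as_init(sys.argv[1:]))
```

In practice the static dumb-init binary mentioned above is the better choice; the sketch only shows why having something at PID 1 that calls `waitpid` matters when monitors spawn their own child processes.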

We are currently in the process of making an image using a RHEL base image in order to get Red Hat container certification. For that we are using a minimalistic RHEL distro called RHEL-atomic and even then the overhead of having the full distro environment is around 113MB (or ~20% greater than the normal agent image). When I was first doing it I used plain RHEL7 and it was about 300MB larger (~70%) than the current image.

packages installed can be updated easily

It is generally not a standard practice to upgrade packages in containers at runtime. Even if you could upgrade them, the agent processes using them would have to restart to get the updated libraries, which would make the container die and all the upgrades would be lost. We currently do a docker build --pull on every build to ensure we have the latest Ubuntu base image (which gets updated pretty regularly), which also causes all of the latest packages to be installed, so they should be fairly up to date already.

I'm all for making the agent image more convenient for users, but I would rather do it by directly putting a larger suite of tools in the agent image and make it a little bit bigger vs. putting a full blown distro in the image and making it a lot bigger.

from signalfx-agent.

MovieStoreGuy avatar MovieStoreGuy commented on August 13, 2024

I really appreciate your response @keitwb,

To be honest, I never thought of it within the context listed.
The reason I brought it up was mainly for the dumb-init part and the frustration I have had dealing with some networking issues in the past.

The reason I wanted to have something like a package manager is that I extend your agent images to include some additional config and scripts, so being able to avoid bumping versions to resolve potential security issues would be amazing, but I understand if this can't be provided by you guys.

Is there any consideration for using an alpine image?
There aren't any tools I can think of offhand that are missing from the final Docker image; I think ping was one I was trying to use that wasn't there.

I was more interested in bringing in dumb-init just as a baseline of protection for our teams that have to run the final image we provide. We are also considering piping log data from the smart agent into a filter for known things we don't care about, such as known missing values which generate a log for each poll, which is another reason for adding dumb-init into the image.

I get where you are coming from, and this is more coming from the 1% of times where it would be nice not to be based off scratch.


keitwb avatar keitwb commented on August 13, 2024

To make sure I understand your use case properly, are you talking about execing into the container at runtime and installing stuff in a shell or are you talking more about making a custom image that extends the image we release on quay.io?

If you find the missing value warning annoying, see #728. Yeah, I would agree that if you are extending the agent image with additional processes like a log filter in the same container, a proper init process is a good idea. Hopefully that is easy enough for you to add with curl and a statically compiled dumb-init though (I understand other tools are easier to get with a package manager however).

Alpine is probably the only distro right now we would consider. I'll look into it when I get some time and see what the size implications of it are and how hard it would be to dump the image contents with only slight modifications to create portable deb/rpm/tar packages (the second main reason we use scratch right now).


MovieStoreGuy avatar MovieStoreGuy commented on August 13, 2024

Mainly the second part of the first statement. I have only ever needed to get into the container once in order to debug something. The only thing that is custom about our image is that we add config into it.

The issue that we are seeing with logs is that errors from collectd plugins are repeated with every flush / poll, which causes rather noisy logs that make it into our logging solution. We wanted to pipe the logs from the smart agent through a filter (basically a rather lengthy grep ...) for errors we know are present but simply don't care about, or that are known issues due to an outdated agent version. This is where my interest in dumb-init comes in.
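A filter like the one described could be sketched in Python roughly as follows. It reads log lines on stdin and drops any that match a list of known-noise patterns. The patterns and the `filter_lines` name are hypothetical placeholders, not actual agent log messages, so substitute the errors you really want to suppress.

```python
import re
import sys
from typing import Iterable, Iterator, Pattern, Sequence

# Hypothetical examples of messages to drop; replace with your own.
IGNORED: Sequence[Pattern[str]] = (
    re.compile(r"missing value"),
    re.compile(r"known-noisy-plugin"),
)


def filter_lines(
    lines: Iterable[str], ignored: Sequence[Pattern[str]] = IGNORED
) -> Iterator[str]:
    """Yield only the lines that match none of the ignored patterns."""
    for line in lines:
        if not any(pattern.search(line) for pattern in ignored):
            yield line


def main() -> None:
    for line in filter_lines(sys.stdin):
        sys.stdout.write(line)
        # Flush per line so logs are not delayed by pipe buffering.
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```

The agent's combined stdout and stderr would be piped into this script from the container entrypoint. Since that means two processes in one container, this is the case where an init process in front of both makes sense.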

I can take a stab at the alpine version of the agent if you like, or at least offer a PR with my suggestions for it?


MovieStoreGuy avatar MovieStoreGuy commented on August 13, 2024

I take that last part back... I had a look at that Dockerfile, and it is unwieldy.


keitwb avatar keitwb commented on August 13, 2024

Yeah the Dockerfile is pretty convoluted due to the bundling of all the runtimes and collectd. I'll look into the Alpine base soon.


flands avatar flands commented on August 13, 2024

Hey everyone -- there are no plans to change this. As we look to OpenTelemetry Collector, it also leverages a scratch image. Closing this issue as wontfix. Thanks for the dialog!
