bobrik / collectd-docker

Collect docker container resource usage

License: MIT License

Go 80.29% Smarty 4.21% Shell 15.50%

collectd-docker's Issues

Feature request: an easy way to send test messages, e.g. using collectd-tg

I am running this in a Docker container and can't figure out a way to send or generate dummy messages. Also, is there any way to get metrics from a running Docker container, e.g. via the Stats API?
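
For test traffic without collectd-tg, one option is a small script run through collectd's exec plugin that prints PUTVAL lines in the same shape the collector emits (compare the sample output in the "Network metrics not collected" issue further down). This is a minimal sketch, not part of collectd-docker; the helper name `dummy_putval` and every host, app and task value passed to it are made up for illustration.

```python
import time


def dummy_putval(host, app, task, metric, value, now=None):
    """Format one PUTVAL line in the collector's docker_stats layout.

    `now` lets a caller pin the timestamp; by default the current
    Unix time is used.
    """
    timestamp = int(time.time() if now is None else now)
    return "PUTVAL %s/docker_stats-%s.%s/gauge-%s %d:%d" % (
        host, app, task, metric, timestamp, value)
```

A script that prints `dummy_putval("testhost", "dummyapp", "task1", "memory.usage", 12345)` once per interval, flushing stdout after each line, should be enough for collectd to pick the values up.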

Can't make ENV variables work

I'm trying to set this up to run on top of Mesos and Marathon. The idea is to be able to collect from most of the containers, if not all.

I'm using these ENV variables:
COLLECTD_DOCKER_TASK_ENV=MESOS_TASK_ID
COLLECTD_DOCKER_APP_ENV=MARATHON_APP_ID

I'm also setting these variables to the name of the application deployed:
COLLECTD_DOCKER_APP_ENV_TRIM_PREFIX=app_name
COLLECTD_DOCKER_TASK_ENV_TRIM_PREFIX=app_name

When using COLLECTD_DOCKER_TASK and COLLECTD_DOCKER_APP it works, but it's not practical with a lot of Marathon apps.

Any help is appreciated.
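
The variable names suggest a lookup order: a direct name first, then an indirection through another environment variable, with an optional prefix stripped from the result. The sketch below models that reading only. It is an assumption about the collector's behaviour, not its actual code, and `resolve_name` is a hypothetical helper.

```python
def resolve_name(container_env, kind="APP"):
    """Return the app or task name for one container, or None.

    container_env is a dict of the monitored container's environment;
    kind is "APP" or "TASK".
    """
    # A directly set name wins.
    direct = container_env.get("COLLECTD_DOCKER_%s" % kind)
    if direct:
        return direct
    # Otherwise follow the indirection, e.g. to MARATHON_APP_ID.
    source_var = container_env.get("COLLECTD_DOCKER_%s_ENV" % kind)
    if not source_var or source_var not in container_env:
        return None
    value = container_env[source_var]
    # Strip the prefix only when the value really starts with it.
    prefix = container_env.get("COLLECTD_DOCKER_%s_ENV_TRIM_PREFIX" % kind, "")
    if prefix and value.startswith(prefix):
        value = value[len(prefix):]
    return value
```

Under this reading, a trim prefix removes leading text only, so a prefix that does not match the start of the value leaves it unchanged.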

Hi,
It would be great if collectd could be launched as a global service using the new swarm mode in 1.12.
Right now, the only thing preventing me from doing so is I don't know how to set the hostname in this mode.

When launching node by node, I can do something like:
docker run --restart always --name collectd -d -v /var/run/docker.sock:/var/run/docker.sock -e GRAPHITE_HOST=xx.xx.xx.xx -e COLLECTD_HOST=hostname -e APP_LABEL_KEY=collectd_docker_app bobrik/collectd-docker

Now, when trying to run this as a global service from a swarm master, I tried to do this:
docker service create --mode global --name collectd --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock -e GRAPHITE_HOST=xx.xx.xx.xx -e COLLECTD_HOST=hostname -e APP_LABEL_KEY=collectd_docker_app --restart-max-attempts 10 --restart-condition any bobrik/collectd-docker

...which indeed launches a container on each of my swarm nodes, BUT the hostname is set to the swarm leader's name on every host.

I'll try to find a workaround, but if you see an obvious way of doing this, it would be great!

collectd exits after running

I'm confused by the README page; it is not clear which steps are necessary.
So I just ran the minimal command:
docker run -d -v /var/run/docker.sock:/var/run/docker.sock \
    -e GRAPHITE_HOST= -e COLLECTD_HOST= \
    bobrik/collectd-docker

The container starts and then exits with exit code 6.
Searching the existing issues, I found that environment variables should be set.

According to the README, these variables basically all have default values, except COLLECTD_DOCKER_APP and COLLECTD_DOCKER_TASK.

Environment variables
COLLECTD_HOST - host to use in metric name, defaults to MESOS_HOST if defined.
COLLECTD_INTERVAL - metric update interval in seconds, defaults to 10.
GRAPHITE_HOST - host where carbon is listening for data.
GRAPHITE_PORT - port where carbon is listening for data, 2003 by default.
GRAPHITE_PREFIX - prefix for metrics in graphite, collectd. by default.
APP_LABEL_KEY - container label to use for app name, collectd_docker_app by default.
APP_ENV_KEY - container environment variable to use for app name, COLLECTD_DOCKER_APP by default.
TASK_LABEL_KEY - container label to use for task name, collectd_docker_task by default.
TASK_ENV_KEY - container environment variable to use for task name, COLLECTD_DOCKER_TASK by default.

So I added the COLLECTD_DOCKER_APP and COLLECTD_DOCKER_TASK environment variables,
but the container still exits after starting.
Any idea about this?
Thanks in advance.
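
In the command as shown in this issue, `GRAPHITE_HOST=` and `COLLECTD_HOST=` carry no values, and the variable list above gives no default for `GRAPHITE_HOST`. Whether that is what produces exit code 6 is not confirmed here; a pre-flight check along these lines would at least rule it out. This is a sketch, and `missing_settings` is a hypothetical helper, not something the image ships.

```python
def missing_settings(env):
    """List required collectd-docker settings that are unset or empty."""
    missing = []
    # GRAPHITE_HOST has no documented default.
    if not env.get("GRAPHITE_HOST"):
        missing.append("GRAPHITE_HOST")
    # COLLECTD_HOST falls back to MESOS_HOST according to the variable list.
    if not (env.get("COLLECTD_HOST") or env.get("MESOS_HOST")):
        missing.append("COLLECTD_HOST")
    return missing
```

Passing `os.environ` (or a dict built from the `-e` flags) returns an empty list when both settings are usable.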

Multi container stats

Is there an option to monitor all containers on a specific host? Maybe this already exists and I do not understand your documentation; if so, please let me know and I will test. I may also be able to get some help adding this feature if it does not exist; if so, I will let you know.

Possibly this could work by setting COLLECTD_DOCKER_APP="*" or something similar.

Thanks

Binary Release?

Curious if you would be willing to also do a binary release of just the collector binary?

I'd like to use the collector and just drop it into our current collectd environment (which is a Docker image) without having to build the Go code myself.

kubernetes?

Hi,

Have you had much luck getting this working with Kubernetes? I've got it mostly working, but I get some weird metrics where, for example, the memory usage of a container constantly jumps between 1 GB and 40 MB.

I've manually streamed the data from the Docker stats API, and there it stays consistent at 1 GB used.

Any idea how I can start to troubleshoot? I'm new to Go, so I was going to try to dig in. Are you able to point me in the right direction? My first thought was to just dump the raw metrics to the console.
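
If the raw PUTVAL lines are dumped to a file, one thing worth checking is whether two different values arrive for the same identifier at the same timestamp, which is what would happen if several containers reported under one app/task name. That cause is a guess, not a diagnosis. The sketch below only does the bookkeeping, and `conflicting_series` is a hypothetical helper.

```python
def conflicting_series(lines):
    """Map identifier -> sorted timestamps carrying more than one value."""
    seen = {}
    for line in lines:
        parts = line.split()
        # Expect: PUTVAL <identifier> <timestamp>:<value>
        if len(parts) != 3 or parts[0] != "PUTVAL":
            continue
        identifier, sample = parts[1], parts[2]
        timestamp, _, value = sample.partition(":")
        seen.setdefault(identifier, {}).setdefault(timestamp, set()).add(value)

    conflicts = {}
    for identifier, by_time in seen.items():
        stamps = sorted(ts for ts, values in by_time.items() if len(values) > 1)
        if stamps:
            conflicts[identifier] = stamps
    return conflicts
```

An empty result means every identifier had at most one value per timestamp in the dump.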

read-function of plugin `snmp-graphite-collectd' failed. Will suspend it for 600.000 seconds

I am trying to use snmp plugin of collectd.
Here is my collectd.conf file

Hostname "graphite-collectd"

FQDNLookup true
Interval 10
Timeout 2
ReadThreads 5

LoadPlugin syslog
LoadPlugin logfile

<Plugin logfile>
  LogLevel info
  File "/var/log/collectd.log"
  Timestamp true
  PrintSeverity false
</Plugin>

LoadPlugin cpu
LoadPlugin write_graphite
<Plugin "write_graphite">
  Host "172.16.222.3"
  Port "2003"
  Protocol "tcp"
  Prefix "collectd."
  StoreRates true
  EscapeCharacter "."
  AlwaysAppendDS false
  SeparateInstances true
</Plugin>

LoadPlugin snmp

<Plugin snmp>
  <Data "total_auth_req_list">
    Type "Total_Auth_Requests_List"
    Table false
    Instance ""
    #Values "RADIUS-AUTH-SERVER-MIB::radiusAuthServTotalAccessRequests.0"
    Values ".1.3.6.1.2.1.67.1.1.1.1.5.0"
  </Data>

  <Host "graphite-collectd">
    Address "172.16.111.78"
    Version 2
    Community "aaa"
    Collect "total_auth_req_list"
    Interval 300
  </Host>
</Plugin>

LoadPlugin exec

<Plugin exec>
  Exec "collectd-docker-collector" "/usr/bin/collectd-docker-collector" "-endpoint" "unix:///var/run/docker.sock" "-host" "graphite-collectd" "-interval" "10"
</Plugin>

Thanks in advance for help

types.db for usage with logstash or influxdb

Hi,

if you could add a types.db with the custom types it would be possible to use logstash or influxdb as backend. Plus the possibility of adding custom conf files. That would be great.

Fails to collect any data on Ubuntu / docker-machine virtual box

Hi,
I have been trying to get this running for a few hours without any luck.

I start the container as follows:

docker run -d -v /var/run/docker.sock:/var/run/docker.sock \
    -e GRAPHITE_HOST=192.168.99.100 -e COLLECTD_HOST=monitoring \
    bobrik/collectd-docker

The collector program doesn't seem to do anything. It is running and I can see its process.

If I run it from the command line it does nothing. Should I expect to see standard output?

I have ruled out an issue with collectd in the container: I SSH'd into the container, enabled CPU stats, and wrote these fine to CSV and to my Graphite server. I've also used Wireshark to establish that collectd polls Graphite but sends no data.

I don't think the IP address is the issue, as collectd works fine with it if I configure other sources, e.g. cpu.

I am running Ubuntu 14.04 and Docker 1.10.3. I have run your container both directly on Ubuntu and in the docker-machine VirtualBox VM (boot2docker). I see the same issue either way, i.e. no data in Graphite.

Would you be able to add a diagnostic option, maybe?

Do I need to run this as a privileged container?

It's a shame, as I'm about to give a monitoring talk to a local DevOps group and this seemed ideal for a demo on monitoring Docker container stats.

metrics identifier question

The docs say the format should be collectd.<host>.docker_stats.<app>.<task>.<type>.<metric>, but the PUTVAL is PUTVAL %s/docker_stats-%s.%s/gauge-%s %d:%d\n.

From what I gather from https://collectd.org/documentation/manpages/collectd-exec.5.shtml, it looks like the PUTVAL should be PUTVAL %s/docker_stats/%s/%s/gauge-%s %d:%d\n.

If this is the case, I can send a PR for you.
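
For reference, the collectd-exec man page linked above describes an identifier as `host/plugin-instance/type-instance`: three slash-separated parts, with both instance parts optional. The sketch below splits an identifier on that basis; `parse_identifier` is a hypothetical helper written for this page, not collectd or collector code.

```python
def parse_identifier(identifier):
    """Split host/plugin[-instance]/type[-instance] into five fields."""
    parts = identifier.split("/")
    if len(parts) != 3:
        raise ValueError("expected host/plugin/type, got %r" % identifier)
    host, plugin_part, type_part = parts
    # Everything after the first "-" is the instance, dots included.
    plugin, _, plugin_instance = plugin_part.partition("-")
    type_name, _, type_instance = type_part.partition("-")
    return {
        "host": host,
        "plugin": plugin,
        "plugin_instance": plugin_instance,
        "type": type_name,
        "type_instance": type_instance,
    }
```

The current format string yields a three-part identifier with `<app>.<task>` as the plugin instance, while the five-part form proposed in this issue is rejected by this parser with a `ValueError`.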

Can't build collector

Hi,

I'm getting the following error when trying to build the collector myself:

github.com/bobrik/collectd-docker/collector
go/src/github.com/bobrik/collectd-docker/collector/writer.go:55: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:56: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:57: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:58: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:59: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:60: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:61: s.Stats.Network undefined (type docker.Stats has no field or method Network)
go/src/github.com/bobrik/collectd-docker/collector/writer.go:62: s.Stats.Network undefined (type docker.Stats has no field or method Network)

SEC: read-only Docker socket (w/ haproxy)

From "ENH,SEC: Create additional sockets with limited permissions" moby/moby#38879 ::

An example use case: securing the Traefik docker driver:

"Docker integration: Exposing Docker socket to Traefik container is a serious security risk" traefik/traefik#4174 (comment)

It seems it only requires (read) operations: ServerVersion, ContainerList, ContainerInspect, ServiceList, NetworkList, TaskList & Events.

https://github.com/liquidat/ansible-role-traefik

This role does exactly that: it launches two containers, a traefik one and another to securely provide limited access to the docker socket. It also provides the necessary configuration.

Tecnativa/docker-socket-proxy#13

Creates an HAProxy container that proxies limited access to the Docker socket

Can't restart a container

Doing:

start a container, everything is fine
stop the container
start the container again

expectation:

container starts like the first time

what happens:

the container can't start because of useradd: user 'collectd-docker-collector' already exists

It seems the useradd line in the start script is the problem.
I guess it would be fine if the script checked whether the user already exists.
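
The check-before-create idea can be sketched as follows. This is an illustration in Python rather than a patch to the start script, whose actual `useradd` flags are not shown on this page. `user_exists` and `ensure_user` are hypothetical helpers, and the bare `useradd NAME` call is an assumption.

```python
import pwd
import subprocess


def user_exists(name):
    """Return True if the account is present in the passwd database."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def ensure_user(name):
    """Create the user only when missing; return True if it was created."""
    if user_exists(name):
        return False
    # Assumed invocation; the real script may pass extra flags.
    subprocess.run(["useradd", name], check=True)
    return True
```

With a guard like this, starting the same container a second time skips the creation step instead of failing on the existing account.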

Network metrics not collected

Unable to get network metrics.

Tried also with privileged mode, network mode=host

CentOS 7.1.1503 - selinux disabled
Docker version

[root@myhost ~]# docker version
Client:
Version: 1.8.1
API version: 1.20
Go version: go1.4.2
Git commit: d12ea79
Built: Thu Aug 13 02:19:43 UTC 2015
OS/Arch: linux/amd64

[root@myhost ~]# docker run -d -e GRAPHITE_HOST=mygraphite --net=host --name collectd-mesos --privileged=true -e COLLECTD_HOST=myhost -v /var/run/docker.sock:/var/run/docker.sock bobrik/collectd-docker
7aa44ca3760945ab5fade20949b437d8ea8dba67f82cc88fd0c53142901de4fc

When I run it manually I get all metrics but network.

[root@myhost ~]# docker exec -ti collectd-mesos /bin/bash
root@myhost:/#
root@myhost:/# /collector -endpoint="unix:///var/run/docker.sock" --host=mygraphite --interval=100
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-cpu.user 1451263065:20780000000
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.active_anon 1451263065:54046720
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.pg_fault 1451263065:13080
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.rss 1451263065:54046720
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.active_file 1451263065:225280
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.pg_in 1451263065:13825
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.pg_out 1451263065:12483
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.unevictable 1451263065:0
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.inactive_file 1451263065:11923456
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-cpu.system 1451263065:17780000000
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-cpu.total 1451263065:46558770625
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.limit 1451263065:268435456
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.max 1451263065:91095040
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.usage 1451263065:66195456
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.cache 1451263065:12148736
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.inactive_anon 1451263065:0
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.mapped_file 1451263065:4476928
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.rss_huge 1451263065:50331648
PUTVAL mygraphite/docker_stats-myapp01.default/gauge-memory.writeback 1451263065:0
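
The Docker stats API has reported network counters in two shapes: a single `network` object in older API versions and a per-interface `networks` map in newer ones. A reader that looks at only one shape sees no network data from a daemon that returns the other. Whether that explains this report is an assumption. The sketch below accepts either shape from an already decoded stats document; `network_totals` is a hypothetical helper.

```python
def network_totals(stats):
    """Sum rx/tx byte counters from a decoded Docker stats document."""
    networks = stats.get("networks")
    if networks:
        # Newer shape: {"eth0": {...}, "eth1": {...}}
        interfaces = list(networks.values())
    elif stats.get("network"):
        # Older shape: one aggregate object
        interfaces = [stats["network"]]
    else:
        interfaces = []
    return {
        "rx_bytes": sum(i.get("rx_bytes", 0) for i in interfaces),
        "tx_bytes": sum(i.get("tx_bytes", 0) for i in interfaces),
    }
```

A result of all zeros for a container that is clearly sending traffic would suggest the stats document carries neither key.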

lxc-docker >= 1.7.1 data not collecting

Hi,
The last Docker daemon version with which collectd-docker actually collects and sends data to Graphite is lxc-docker-1.6.2.
Can I expect a fix in collectd-docker for lxc-docker-1.7.1 and later?

Integrate the plugin in collectd

Hi,

I am looking for a collectd plugin which can provide resource monitoring data about the individual containers running on the host. In this process, I came across your plugin.

I have a few questions, listed below:
a) What steps do I have to follow to build the source code?
b) Is there a way to integrate the plugin with an existing version of collectd?
c) Where is the output of collectd-docker stored?

Regards,
Krishnaprasad

COLLECTD_DOCKER_APP_ENV_TRIM_PREFIX COLLECTD_DOCKER_TASK_ENV_TRIM_PREFIX don't seem to work

I am trying to use this image in a Rancher environment. Outside the orchestration perspective, Rancher is just another Docker launching service except some special docker labels which greatly help identify resources running on the cluster.

I've been trying to use these resources (e.g: stack-name and/or container-name) to setup APP_LABEL_KEY & TASK_LABEL_KEY ENV variables but there seems to be an issue with the 63 chars limitation. Although I've raised the respective ENV variables to more than 63 chars, containers falling under the category of more than 63 chars cannot be monitored.

For example I have a container with the following ENV variables:
COLLECTD_DOCKER_APP_ENV_TRIM_PREFIX=128
COLLECTD_DOCKER_TASK_ENV_TRIM_PREFIX=128

And the following values for APP and TASK:
APP_LABEL_KEY="aaaaaaaa-aaaaaaaaa-aaaaaaaa"
TASK_LABEL_KEY="aaaaaaaa-aaaaaaaaa-aaaaaaaa-aaaaaaaa-aaaaaa-1"

The character length of the string "<APP_LABEL_KEY>.<TASK_LABEL_KEY>" equals 74, so I do not see a reason for this not to work as described in the README.

Could you help provide some info or any suggestion?
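
Going by their names, the `*_TRIM_PREFIX` variables hold a string to strip from the front of a value rather than a maximum length, so a value of 128 would not raise any limit. That reading is an assumption. The sketch below just checks a combined `<app>.<task>` string against the 63-character limit mentioned in this issue; `fits_limit` is a hypothetical helper and the limit constant is taken from the issue text, not from collectd's source.

```python
# Limit quoted in the issue; treated here as an assumption.
COLLECTD_FIELD_LIMIT = 63


def plugin_instance(app, task):
    """Combine app and task the way the PUTVAL lines on this page do."""
    return "%s.%s" % (app, task)


def fits_limit(app, task, limit=COLLECTD_FIELD_LIMIT):
    """Return True if <app>.<task> is no longer than the limit."""
    return len(plugin_instance(app, task)) <= limit
```

Running the real label values through `fits_limit` shows which containers would need shorter names or a trimmed prefix.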

SSL Error?

Hi,

I get the output below when starting the container, and I'm not sure why:

exec plugin: exec_read_one: error = -cert string
exec plugin: exec_read_one: error = cert path for tls
exec plugin: exec_read_one: error = -endpoint string
exec plugin: exec_read_one: error = docker endpoint (default "unix:///var/run/docker.sock")
exec plugin: exec_read_one: error = -host string
exec plugin: exec_read_one: error = host to report
exec plugin: exec_read_one: error = -interval int
exec plugin: exec_read_one: error = interval to report (default 1)
exec plugin: Program `/usr/bin/collectd-docker-collector' has closed STDERR.

How to configure this container so that it detects other containers' services automatically

I am trying to send metrics to hopsoft/graphite-statsd.
I have a collectd container running on one node alongside other application containers.
When other containers come up, I want collectd to detect them automatically and send their metrics to Graphite.
I have SNMP running in the application containers, so I have added the snmp plugin to collectd.conf. But the host in the snmp plugin is the application container's IP, so it would need to be updated whenever another container comes up.
Is this possible?

Feature Request: use the Docker TCP daemon endpoint

Hi,
Could you add the possibility to get stats from the TCP Docker endpoint, please?

This would allow your plugin to collect all data from a swarm cluster, in one instance.
Without that, we need to add a container on each swarm node.
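
The collector's `-endpoint` flag, as seen in the usage output quoted in the "SSL Error?" issue, defaults to a `unix://` socket. Whether it also accepts `tcp://` values is not stated on this page. The sketch below only shows how the two endpoint forms could be told apart; `parse_endpoint` is a hypothetical helper.

```python
from urllib.parse import urlparse


def parse_endpoint(endpoint):
    """Return ("unix", path) or ("tcp", (host, port)) for an endpoint URL."""
    parsed = urlparse(endpoint)
    if parsed.scheme == "unix":
        return "unix", parsed.path
    if parsed.scheme == "tcp":
        if parsed.hostname is None or parsed.port is None:
            raise ValueError("tcp endpoint needs host and port: %r" % endpoint)
        return "tcp", (parsed.hostname, parsed.port)
    raise ValueError("unsupported endpoint scheme: %r" % endpoint)
```

For example, `parse_endpoint("tcp://10.0.0.5:2375")` separates the address from the port; both values here are placeholders.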

Regards,

Why not put the build commands in the Dockerfile?

Hi there,

While fixing a bug I saw that all build commands are in an external build.sh. That's not the "Docker way".

Even though this is a dedicated Docker repository, I can't see a reason why those commands should be collected in a bash script. If you're interested, I would migrate the commands from the script to the Dockerfile. That would (partially) speed up the build process.

Regards
