
snap-integration-kubernetes's Introduction

DISCONTINUATION OF PROJECT.

This project will no longer be maintained by Intel.

This project has been identified as having known security escapes.

Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.

Intel no longer accepts patches to this project.

Running Snap in various environments

Snap can be deployed to collect metrics in various environments, including Docker containers and Kubernetes. Running in a Docker container, it can gather metrics from, for example, the host and other containers. Deploying Snap in a Kubernetes cluster makes it possible to monitor the pods in the cluster. This repo explains how to run Snap in those environments.

  1. Running Snap
  2. Customization and configuration
  3. Contributing
  4. License

1. Running Snap

The first step is to clone this repo. All of the needed files are in the snap-integration-kubernetes directory.

$ git clone https://github.com/intelsdi-x/snap-integration-kubernetes/
$ cd ./snap-integration-kubernetes

Snap in Docker container

To learn about running Snap in a Docker container, see the example Running Snap in Docker container.

Snap in Kubernetes

To learn about running Snap in Kubernetes, see the example Running Snap in Kubernetes pod.

Here you'll find an example of running Snap with Kubernetes on Google Compute Engine.

2. Customization and configuration

Most Snap plugins can be loaded inside a Docker container. The full list of Snap plugins is available in the plugin catalog. Once you have chosen a plugin, click its name to go to the plugin repository.

To use a plugin inside the container you need to download its binary. To get the binary URL, go to the release section of the plugin repository and copy the link for the latest plugin release.
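
As a rough sketch of this step, the helpers below download a plugin binary from a copied release link and make it executable. The function names are hypothetical and not part of Snap. The destination defaults to /opt/snap/autoload, the default autoload directory described under Reconfiguration, and any URL you pass is whatever link you copied from the plugin's release page.

```shell
# Hypothetical helpers for illustration; not part of Snap itself.

# Print the path that a binary downloaded from URL ($1) would get inside
# DIR ($2, defaulting to the default autoload directory).
plugin_dest_path() {
    url=$1
    dir=${2:-/opt/snap/autoload}
    # ${dir%/} drops one trailing slash; ${url##*/} keeps the last URL segment.
    printf '%s/%s\n' "${dir%/}" "${url##*/}"
}

# Download the binary at URL ($1) into DIR ($2) and mark it executable.
fetch_plugin() {
    dest=$(plugin_dest_path "$1" "$2") || return 1
    # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
    curl -fsSL -o "$dest" "$1" && chmod +x "$dest"
}
```

Inside the container this would be called as fetch_plugin <plugin-release-url>, where <plugin-release-url> stands for the copied link.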

Many plugins require prior configuration and adjustment of the container or Kubernetes manifest. One example is the Snap Docker collector plugin, which collects runtime metrics from Docker containers and their host machine. It gathers information about resource usage and performance characteristics. More information about the Docker collector can be found here.

Each plugin's requirements are listed in its documentation. The documentation of the Snap Docker collector plugin can be found here. The Docker collector plugin needs access to the following files on the host machine:

  • /var/run/docker.sock
  • /proc
  • /usr/bin/docker
  • /var/lib/docker
  • /sys/fs/cgroup

This means that the original host files have to be available inside the container, so running this plugin in a container requires mapping those files into it. In addition, the Docker collector plugin requires the environment variable PROCFS_MOUNT to be set. It should point to the directory inside the container where the host's /proc directory is mounted. This applies in both cases: Docker container and Kubernetes pod.
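
For the Docker case, the mapping might look like the sketch below. It only prints the docker run options for the host paths listed above. The function name is hypothetical, and /proc_host is an arbitrary in-container mount point chosen for illustration, not a value required by the plugin.

```shell
# Hypothetical helper: print the `docker run` options that expose the host
# paths needed by the Docker collector and set PROCFS_MOUNT accordingly.
# $1 is the directory inside the container where the host /proc is mounted
# (assumed default: /proc_host).
docker_collector_opts() {
    host_proc_dir=${1:-/proc_host}
    opts="-v /var/run/docker.sock:/var/run/docker.sock"
    opts="$opts -v /usr/bin/docker:/usr/bin/docker"
    opts="$opts -v /var/lib/docker:/var/lib/docker"
    opts="$opts -v /sys/fs/cgroup:/sys/fs/cgroup"
    # The host /proc goes to a separate directory so that it does not
    # shadow the container's own /proc.
    opts="$opts -v /proc:${host_proc_dir}"
    # PROCFS_MOUNT must point at that same in-container directory.
    opts="$opts -e PROCFS_MOUNT=${host_proc_dir}"
    printf '%s\n' "$opts"
}
```

The output can be spliced into a command such as docker run -d $(docker_collector_opts) <snap-image>, where <snap-image> stands for whichever Snap image you use. In a Kubernetes manifest the same paths would typically be exposed as hostPath volumes, with PROCFS_MOUNT set in the container's env section.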

Reconfiguration

The default Snap images use the autoload feature to simplify reconfiguration of a running Snap instance. The default autoload directory is /opt/snap/autoload; it can be changed in the snapteld.conf file (please refer to the Snap configuration documentation for details). It is recommended to store plugins and tasks in the autoload directory, so that plugins are automatically loaded and tasks are automatically started after snapteld restarts.

To change the configuration of a running Snap instance, follow these steps (inside the Snap container):

  • edit config file /etc/snap/snapteld.conf
  • restart snapteld:
$ kill -HUP `pidof snapteld`
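
If snapteld is not running, pidof prints nothing and the kill command above fails with a usage error. A slightly more defensive variant could look like the sketch below. The function name is hypothetical, and the process name is a parameter only so that the helper can be tried against other daemons.

```shell
# Hypothetical helper: send SIGHUP to a daemon only if it is running.
# $1 is the process name (defaults to snapteld).
reload_daemon() {
    name=${1:-snapteld}
    # pidof exits non-zero when nothing matches; treat that as "no PIDs".
    pids=$(pidof "$name") || pids=
    if [ -z "$pids" ]; then
        echo "$name is not running" >&2
        return 1
    fi
    # $pids is intentionally unquoted: pidof may print several PIDs.
    kill -HUP $pids
}
```

Calling reload_daemon with no arguments after editing /etc/snap/snapteld.conf has the same effect as the command above when snapteld is running.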

3. Contributing

We love contributions!

There's more than one way to give back, from examples to blogs to code updates. See our recommended process in CONTRIBUTING.md.

4. License

Snap, along with this plugin, is open source software released under the Apache 2.0 License.

snap-integration-kubernetes's People

Contributors

andrzej-k, rdower

snap-integration-kubernetes's Issues

Snap on GCE entrypoint needs refactoring

Source file start_snap/start_snap.go needs refactoring:

  1. Clear separation of steps.
  2. Replace sleep command when waiting for plugins to be loaded on all members (see TODO comment).

Broken "snaptel task list"

Hey,
I have a Kubernetes cluster (1 node), Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.7", GitCommit:"095136c3078ccf887b9034b7ce598a0a1faff769", GitTreeState:"clean", BuildDate:"2017-07-05T16:40:42Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}. I tried to use Snap's clean helm chart from here. When I executed it, all pods were working fine, but when I connect to the snap pod using kubectl exec -it snap-xxxxx bash -n kube-system and run snaptel task list inside the pod, I get:

root@snap-1sxnc:/snap# snaptel task list
Error getting tasks:
Unknown API response: invalid character '&' in string escape code

 Received: {
  "meta": {
    "code": 200,
    "message": "Scheduled tasks retrieved",
    "type": "scheduled_task_list_returned",
    "version": 1
  },
  "body": {
    "ScheduledTasks": [
      {
        "id": "ce27b3f3-08a4-4cd0-920d-c8aaeac370c5",
        "name": "docker",
        "deadline": "5s",
        "creation_timestamp": 1501594381,
        "last_run_timestamp": 1501595552,
        "hit_count": 40,
        "failed_count": 40,
        "last_failure_message": "{\"error\":\"partial write:\\nunable to parse 'intel/docker/stats/cgroups/cpu_stats/cpu_usage/per_cpu/value,annotation.kubernetes.io/config.seen=2017-08-01T11:00:37.14721355+02:00,annotation.kubernetes.io/config.source=api,annotation.kubernetes.io/created-by={\\\"kind\\\":\\\"SerializedReference\\\"\\\\,\\\"apiVersion\\\":\\\"v1\\\"\\\\,\\\"reference\\\":{\\\"kind\\\":\\\"ReplicaSet\\\"\\\\,\\\"namespace\\\":\\\"default\\\"\\\\,\\\"name\\\":\\\"elasticsearch-master-3692376762\\\"\\\\,\\\"uid\\\":\\\"e28eb0ac-7697-11e7-8f9b-525400c9bf

root@snap-1sxnc:/snap#

Liveness probe for snap

It would be nice to have a simple liveness probe for Snap (something like snaptel task list | grep Disabled) so that it can be automatically restarted (using RestartPolicy: OnFailure).
Currently, when a publisher fails 10 times the task is disabled, but the user can check this only by using the Snap logs, so from the Kubernetes perspective the pod is healthy.

[centos@k8s-master-1 ~]$ kubectl get pods --all-namespaces | grep snap-zk02d
kube-system   snap-zk02d                             1/1       Running   0          8h
[centos@k8s-master-1 ~]$ kubectl --namespace=kube-system exec -it  snap-zk02d -- snaptel task list | grep Disabled
438a9445-2102-4c66-b6f9-c9feb30cd5dc 	 Task-438a9445-2102-4c66-b6f9-c9feb30cd5dc 	 Disabled 	 10 	 0 	 10 	 11:12AM 1-23-2017 	 could not select a plugin
[centos@k8s-master-1 ~]$ kubectl --namespace=kube-system exec -it  snap-zk02d -- tail -n 30 snapteld.log
time="2017-01-23T11:14:21Z" level=debug msg="Batch submission complete" _block=work-jobs _module=scheduler-workflow count-process-nodes=0 count-publish-nodes=1 parent-node-type=collector task-id=438a9445-2102-4c66-b6f9-c9feb30cd5dc task-name=Task-438a9445-2102-4c66-b6f9-c9feb30cd5dc 
time="2017-01-23T11:14:21Z" level=warning msg="Task failed" _block=spin _module=scheduler-task consecutive failure limit=10 consecutive failures=10 error="could not select a plugin" task-id=438a9445-2102-4c66-b6f9-c9feb30cd5dc task-name=Task-438a9445-2102-4c66-b6f9-c9feb30cd5dc 
time="2017-01-23T11:14:21Z" level=error msg="Task disabled due to consecutive failures" _block=spin _module=scheduler-task consecutive failures=10 error="could not select a plugin" task-id=438a9445-2102-4c66-b6f9-c9feb30cd5dc task-name=Task-438a9445-2102-4c66-b6f9-c9feb30cd5dc 
time="2017-01-23T11:14:21Z" level=debug msg="event received" _block=handle-events _module=scheduler-events disabled-reason="Task disabled with error: could not select a plugin" event-namespace=Scheduler.TaskDisabled task-id=438a9445-2102-4c66-b6f9-c9feb30cd5dc 
time="2017-01-23T11:14:21Z" level=debug msg="plugin unsubscription" _block=subscriptionGroup.unsubscribePlugins _module=control name=cpu type=collector version=6 
time="2017-01-23T11:14:21Z" level=debug msg="plugin unsubscription" _block=subscriptionGroup.unsubscribePlugins _module=control name=docker type=collector version=6 
time="2017-01-23T11:14:21Z" level=debug msg="plugin unsubscription" _block=subscriptionGroup.unsubscribePlugins _module=control name=kairos type=publisher version=-1 
time="2017-01-23T11:14:21Z" level=info msg="Nothing to do for this event" _block=handle-events _module=control-runner event=Control.PluginUnsubscribed 
time="2017-01-23T11:14:21Z" level=debug msg="event received" _block=handle-events _module=scheduler-events event-namespace=Scheduler.MetricsCollected metric-count=4371 task-id=438a9445-2102-4c66-b6f9-c9feb30cd5dc 
time="2017-01-23T11:14:21Z" level=info msg="Nothing to do for this event" _block=handle-events _module=control-runner event=Control.PluginUnsubscribed 
time="2017-01-23T11:14:21Z" level=debug msg="handling plugin unsubscription event" _block=subscribe-pool _module=control-runner event=Control.PluginUnsubscribed plugin-name=cpu plugin-type=collector plugin-version=6 
time="2017-01-23T11:14:21Z" level=debug msg="killing an available plugin in pool  collector:cpu:6" _block=handle-unsubscription _module=control-runner pool-count=1 pool-subscription-count=0 
time="2017-01-23T11:14:21Z" level=debug msg="plugin selected" _module=control-routing block=select hitcount=10 index="collector:cpu:v6:id1" pool size=1 strategy=least-recently-used 
time="2017-01-23T11:14:21Z" level=info msg="stopping available plugin" _module=control-aplugin block=stop plugin_name=collector:cpu:v6:id1 
time="2017-01-23T11:14:21Z" level=info msg="Nothing to do for this event" _block=handle-events _module=control-runner event=Control.PluginUnsubscribed 
time="2017-01-23T11:14:21Z" level=debug msg="handling plugin unsubscription event" _block=subscribe-pool _module=control-runner event=Control.PluginUnsubscribed plugin-name=docker plugin-type=collector plugin-version=6 
time="2017-01-23T11:14:21Z" level=debug msg="killing an available plugin in pool  collector:docker:6" _block=handle-unsubscription _module=control-runner pool-count=1 pool-subscription-count=0 
time="2017-01-23T11:14:21Z" level=debug msg="plugin selected" _module=control-routing block=select hitcount=10 index="collector:docker:v6:id1" pool size=1 strategy=least-recently-used 
time="2017-01-23T11:14:21Z" level=info msg="stopping available plugin" _module=control-aplugin block=stop plugin_name=collector:docker:v6:id1 
time="2017-01-23T11:14:21Z" level=debug msg="handling plugin unsubscription event" _block=subscribe-pool _module=control-runner event=Control.PluginUnsubscribed plugin-name=kairos plugin-type=publisher plugin-version=-1 
time="2017-01-23T11:14:21Z" level=info msg="hard killing available plugin" _module=control-aplugin block=kill plugin_name=collector:docker:v6:id1 
time="2017-01-23T11:14:21Z" level=info msg="hard killing available plugin" _module=control-aplugin block=kill plugin_name=collector:cpu:v6:id1 
time="2017-01-23T19:17:17Z" level=debug msg="API request" _module="_mgmt-rest" index=3 method=GET url="/v1/tasks" 
time="2017-01-23T19:17:17Z" level=debug msg="API response" _module="_mgmt-rest" index=3 method=GET status=OK status-code=200 url="/v1/tasks"
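
A minimal sketch of the check suggested in this issue, assuming it runs inside the Snap container where snaptel is available. The function names are hypothetical. The text match is deliberately naive and would also fire on a task whose name contains "Disabled".

```shell
# Hypothetical liveness check built around `snaptel task list`.

# Read task-list output on stdin; succeed if any line mentions Disabled.
has_disabled_task() {
    grep -q 'Disabled'
}

# Return non-zero when snaptel itself fails or any task is Disabled.
snap_liveness() {
    out=$(snaptel task list) || return 1
    if printf '%s\n' "$out" | has_disabled_task; then
        return 1
    fi
    return 0
}
```

Used as the command of an exec liveness probe, a non-zero exit from snap_liveness would count as a failed probe, so a pod whose task has been disabled would no longer be reported as healthy.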
