telemetry_collector's Introduction

telemetry_collector

Automatically builds a telemetry collector with Telegraf, InfluxDB, and Grafana. The example sensor paths use the native YANG model and the OpenConfig YANG model of NX-OS. The build.sh script creates self-signed certificates for TLS transport and uses the Docker images of Telegraf, InfluxDB, and Grafana to create containers with docker-compose. Tested with telegraf >= 1.12.1, influxdb >= 2.0, and grafana >= 8.1.

NOTE:

This project has upgraded InfluxDB to 2.0, which is no longer supported by Chronograf, so the dashboard has moved to Grafana with a new set of sensor paths. The original code has been moved to the branch chronograf_influxdb_1_x.

Screenshot

gnmi dashboard

Requirements:

docker-ce, OpenSSL, docker-compose, and any Linux distribution. See the known issues below if you are trying it on macOS.

How to use

  1. To get started quickly, set the environment variables GNMI_USER and GNMI_PASSWORD. This user needs to be configured on NX-OS with at least the network-operator role. Then use sudo ./build.sh start to start the containers:

    export GNMI_USER=telemetry
    export GNMI_PASSWORD=SuperSecretPassword
    ./build.sh start
    2020-07-30T22:49:02--LOG--influxdb database folder does not exist, creating one
    2020-07-30T22:49:02--LOG--change permission of config and data folder of influxdb
    2020-07-30T22:49:02--LOG--generating self-signed certificates for telegraf plugins
    2020-07-30T22:49:02--LOG--telegraf certificate does not exist, generating
    2020-07-30T22:49:02--LOG--gernerating private key for CN telegraf
    ...<omitted>
    

    By default, telegraf listens on tcp:57000 for gRPC dial-out. To change the port, edit the config file etc/telegraf/telegraf.conf.example in the project folder.

    gNMI dial-in is also enabled by default. Set the switches array in build.sh to the management address and gRPC port of each switch:

    # switches accept gNMI dial-in
    switches=( "172.25.74.70:50051" "172.25.74.61:50051" )

    When the service first starts, the script checks whether the certificates exist. If they do not, it creates them for the mdt and gnmi plugins with a validity of 10 years. Open the Grafana GUI at http://<ip_address_of_host>:3000. The login is grafana/cisco123.
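
The prerequisites in this step can be checked before running build.sh. The following is a minimal sketch, not part of the project: the helper names require_var and valid_target are hypothetical, and the checks only cover what the text above states (two exported variables, and switch entries written as address:port).

```shell
# Hypothetical pre-flight helpers; these are NOT part of build.sh.

# Return 0 if the named variable is exported and non-empty.
# printenv only sees exported variables, which is what a child
# process such as build.sh would inherit.
require_var() {
    if [ -z "$(printenv "$1")" ]; then
        echo "missing environment variable: $1" >&2
        return 1
    fi
}

# Return 0 if the argument looks like "<address>:<port>"
# with a numeric port between 1 and 65535.
valid_target() {
    case "$1" in
        *:*) ;;
        *) return 1 ;;
    esac
    host="${1%:*}"      # everything before the last colon
    port="${1##*:}"     # everything after the last colon
    [ -n "$host" ] || return 1
    case "$port" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}
```

For example, `require_var GNMI_USER && require_var GNMI_PASSWORD` could be run before `./build.sh start`, and `valid_target "172.25.74.70:50051"` could be called on each entry of the switches array.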

  2. TLS is enabled on the cisco_telemetry_mdt plugin. To disable it, comment out the lines below in etc/telegraf/telegraf.conf:

    # uncomment below to enable tls for dial-out plugin
    tls_cert = "/etc/telegraf/cert/telegraf.crt"
    tls_key = "/etc/telegraf/cert/telegraf.key"

    The certificate ./etc/telegraf/cert/telegraf.crt needs to be copied to NX-OS so that the switch can verify the collector's identity. Then use the commands below to enable TLS transport for the destination group. The <certificate name> needs to match the common name of telegraf.crt, which is set to telegraf in build.sh:

    switch(config)# telemetry
    switch(config-telemetry)# destination-group 1
    switch(conf-tm-dest)# ip address <collector address> port 57000 protocol gRPC encoding GPB
    switch(conf-tm-dest)# certificate /bootflash/telegraf.crt <certificate name>
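
Because the <certificate name> has to match the certificate's common name, it can help to inspect the generated file on the collector host before copying it to the switch. This is a minimal sketch using the standard openssl x509 command. The helper names show_cert and cert_has_cn are hypothetical, and the pattern assumes the subject is printed in one of the usual OpenSSL formats (for example `CN = telegraf` or `/CN=telegraf`).

```shell
# Hypothetical helpers for inspecting a PEM certificate; NOT part of build.sh.

# Print the subject and the expiry date (notAfter) of a certificate.
show_cert() {
    openssl x509 -in "$1" -noout -subject -enddate
}

# Return 0 if the certificate subject contains the expected common name.
# $1 = path to the certificate, $2 = expected common name.
cert_has_cn() {
    openssl x509 -in "$1" -noout -subject |
        grep -Eq "CN ?= ?$2(,|/|\$)"
}
```

For example, `cert_has_cn etc/telegraf/cert/telegraf.crt telegraf` should succeed if the certificate was generated with the common name described above.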
    
    
  3. TLS needs to be enabled for the gNMI plugin as well as on NX-OS. When the gRPC feature is configured on a switch, a default certificate valid for 1 day is auto-generated. To configure the certificate for gRPC on NX-OS, copy etc/telegraf/cert/gnmi.pfx to bootflash, then use the commands below to import it. The <export password> is set to cisco123 by default and can be changed in build.sh. These steps are optional because the gnmi plugin in telegraf is set to skip certificate verification.

    switch(config)# crypto ca trustpoint gnmi_trustpoint
    switch(config-trustpoint)# crypto ca import gnmi_trustpoint pkcs12 bootflash:gnmi.pfx <export password>
    switch(config)# grpc certificate gnmi_trustpoint
    
  4. This tool will import a couple of pre-built dashboards:

    • The fabric dashboard dialout is an example of querying data from telemetry dial-out. The switch telemetry config used for this dashboard is in telemetry.cfg.
    • The fabric dashboard gnmi is an example of querying data from gNMI dial-in.
    • The Endpoints dashboard shows the ARP tables and MAC address tables of all the switches.
    • The Interface Counters dashboard shows the interface counters that are collected using the OpenConfig model.
    • The System Capacity dashboard shows the current system utilization of NX-OS using metrics collected from iCAM.
  5. Example of telegraf configuration can be found below:

Known issues

  1. Before NX-OS 10.1(1), a single gNMI dial-in subscription can only be SAMPLE or ON_CHANGE, not both. To configure both types of subscription, start two telegraf instances to separate the SAMPLE and ON_CHANGE sensor paths. See enhancement CSCvu58102 for details.
  2. macOS uses the BSD version of sed by default, which doesn't work with this script. Use brew install gnu-sed to install the GNU version of sed if you are running this script on macOS.
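
To check which sed a macOS host would use, something like the following sketch may help. It is not part of build.sh, and the helper names pick_sed and is_gnu_sed are hypothetical. It assumes that Homebrew's gnu-sed package installs the binary under the name gsed, so the script may still need gsed to be reachable as sed on the PATH.

```shell
# Hypothetical helpers; NOT part of build.sh.

# Print "gsed" on macOS when it is installed, otherwise "sed".
pick_sed() {
    if [ "$(uname -s)" = "Darwin" ] && command -v gsed >/dev/null 2>&1; then
        echo gsed
    else
        echo sed
    fi
}

# Return 0 if the given sed binary identifies itself as GNU sed.
# BSD sed does not accept --version, so it fails this check.
is_gnu_sed() {
    "$1" --version 2>/dev/null | grep -qF "(GNU sed)"
}
```

For example, `is_gnu_sed "$(pick_sed)" || echo "GNU sed not found"` gives a quick warning before running the script.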

Reference

  1. Cisco Nexus 9000 Series NX-OS Programmability Guide, Release 10.3(x)

telemetry_collector's People

Contributors

dsx1123, vaneuk

telemetry_collector's Issues

telegraf container restart loop

Hi,
First of all amazing job, this looks great.
We have an issue with a clean build.
After stopping and starting via build.sh, the telegraf container is stuck in a reload loop.
Is this a known issue? Could it be caused by faulty config? (only changes were switch IPs and uncomment for tls)

cisco nexus 93180 fx3 not working

Hi,
I have an issue integrating a Nexus 93180FX3 running NX-OS 10.2(5) with Telegraf (your repository).
My configuration is below:

telemetry
  certificate /bootflash/cert.pem telegraf
  destination-group 1
    ip address x.x.x.1 port 57000 protocol gRPC encoding GPB 
    certificate /bootflash/cert.pem telegraf
  sensor-group 1
    data-source NATIVE
    path microburst
  subscription 1
    dst-grp 1
    snsr-grp 1 sample-interval 0

then

show telemetry transport 

Session Id  Dst Grp  IP Address      Port       Encoding     Transport  Status    
--------------------------------------------------------------------------------
0           1        x.x.x.1    57000      GPB          gRPC       Transmit Error
--------------------------------------------------------------------------------

I tried telnet from the Nexus switch to x.x.x.1 port 57000; the port is listening and telnet works fine. Any idea how to resolve the issue?

Questions

Hi ,
I saw the interesting video on influxdata's website and am wondering how to get started with your repository? Do I need to download the telegraf, chronograf and influxdb from influxdata's site to use with your configuration files?
thanks for any pointers.

Rgds
Srikanth
