aci-exporter's People

Contributors

camrossi, dependabot[bot], minefuto, thenodon, thushjandan

aci-exporter's Issues

Regex in value_transform

Hello,

First of all, thank you for this great exporter !

I'm currently trying to get usage visibility on dynamic pool VLANs. To do so, I need to use the 'to' and 'from' attributes of class fvnsEncapBlk as metric values.
The problem is that these attribute values have the form 'vlan-xxxx', so I need to strip the 'vlan-' prefix to get a float value.

I did not find a way to use a regex in value_transform. Is it possible?

BR
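
To illustrate the conversion being asked for, here is a minimal Go sketch. It is not aci-exporter code: `vlanToFloat` and `vlanRe` are hypothetical names, and the sketch assumes the attribute always has the form `vlan-<digits>`.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// vlanRe captures the numeric part of an encap string such as "vlan-1234".
var vlanRe = regexp.MustCompile(`^vlan-(\d+)$`)

// vlanToFloat is a hypothetical helper (not part of aci-exporter) that
// strips the "vlan-" prefix and parses the remainder as a float64.
func vlanToFloat(s string) (float64, error) {
	m := vlanRe.FindStringSubmatch(s)
	if m == nil {
		return 0, fmt.Errorf("unexpected encap format: %q", s)
	}
	return strconv.ParseFloat(m[1], 64)
}

func main() {
	for _, s := range []string{"vlan-100", "vlan-3967"} {
		v, err := vlanToFloat(s)
		fmt.Println(s, v, err)
	}
}
```

Returning an error for unmatched input, rather than 0, keeps a malformed value from being mistaken for a real VLAN ID.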

Support configuration directory

Is your feature request related to a problem? Please describe.
The config.yml can become large with all the query definitions.

Describe the solution you'd like
Support a configuration directory that can hold multiple files containing a mix of queries and query types.

A way to convert the systemUpTime to seconds

Is your feature request related to a problem? Please describe.
We are trying to retrieve the property topSystem.attributes.systemUpTime, which works (as a label), but unfortunately we didn't find a way to convert the format (e.g. 210:20:16:20.000) to a usable value (seconds).

Describe the solution you'd like
Ideally we would want a metric like this:
aci_node_uptime_duration_seconds{aci="GVA_DC_FABRIC",nodeid="1",podid="1"} 4320000
with the value being the number of seconds since the last reboot.

Describe alternatives you've considered
Unfortunately I don’t see any other alternative as we didn’t find another property which would store the value in a raw format.
From what I understand if a calculation does not already exist it would require a custom one, correct ?
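
A rough Go sketch of such a custom calculation follows. It assumes (this is not confirmed in the issue) that systemUpTime is laid out as days:hours:minutes:seconds.milliseconds; `uptimeSeconds` is a hypothetical helper name, not an existing aci-exporter function.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// uptimeSeconds converts a string such as "210:20:16:20.000" to seconds.
// Assumption: the four fields are days, hours, minutes and seconds.
func uptimeSeconds(s string) (float64, error) {
	parts := strings.Split(s, ":")
	if len(parts) != 4 {
		return 0, fmt.Errorf("expected 4 colon-separated fields, got %d in %q", len(parts), s)
	}
	// Seconds per unit, in the same order as the fields.
	multipliers := []float64{86400, 3600, 60, 1}
	var total float64
	for i, p := range parts {
		v, err := strconv.ParseFloat(p, 64)
		if err != nil {
			return 0, fmt.Errorf("field %d of %q: %w", i, s, err)
		}
		total += v * multipliers[i]
	}
	return total, nil
}

func main() {
	v, err := uptimeSeconds("210:20:16:20.000")
	fmt.Println(v, err)
}
```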

Scrape timestamp metrics as value

Is your feature request related to a problem? Please describe.
I want to scrape timestamp attributes from ACI, such as modTs, as a metric value (not a label).

{
   "totalCount":"1",
   "imdata":[
      {
         "infraWiNode":{
            "attributes":{
               -snip-
               "modTs":"2020-04-18T05:24:07.722+00:00",

Describe the solution you'd like
I think aci-exporter needs to translate "2020-04-18T05:24:07.722+00:00" to Unix time.

Do you have any plans to add the above function?
Or is it already implemented?
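
For reference, the translation itself is small in Go. The sketch below uses only the standard library; `tsToUnix` is a hypothetical name, and how it would be wired into the exporter configuration is left open. Note that modTs can also hold the literal "never" (as in other responses on this page), which this sketch reports as an error.

```go
package main

import (
	"fmt"
	"time"
)

// tsToUnix parses an ACI timestamp such as "2020-04-18T05:24:07.722+00:00"
// and returns seconds since the Unix epoch, keeping millisecond precision.
// time.Parse accepts the fractional seconds even though the RFC3339 layout
// does not spell them out.
func tsToUnix(s string) (float64, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return 0, fmt.Errorf("not a timestamp: %q: %w", s, err)
	}
	return float64(t.UnixMilli()) / 1000, nil
}

func main() {
	for _, s := range []string{"2020-04-18T05:24:07.722+00:00", "never"} {
		v, err := tsToUnix(s)
		fmt.Printf("%s -> %.3f (err: %v)\n", s, v, err)
	}
}
```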

Issue with query including an array of children

Describe the bug
A query like /api/class/fvAEPg.json?rsp-subtree-include=health,required will return the health for the object and often the health for the node, e.g:

 "children": [
          {
            "healthNodeInst": {
              "attributes": {
                "childAction": "deleteNonPresent",
                "chng": "400",
                "cur": "100",
                "isExisting": "no",
                "lcOwn": "local",
                "maxSev": "cleared",
                "modTs": "never",
                "nodeId": "101",
                "podId": "1",
                "prev": "20",
                "rn": "nodehealth-101",
                "status": "",
                "twScore": "100",
                "updTs": "2020-08-11T17:41:24.154+02:00",
                "weight": "1"
              }
            }
          },
          {
            "healthNodeInst": {
              "attributes": {
                "childAction": "deleteNonPresent",
                "chng": "400",
                "cur": "100",
                "isExisting": "no",
                "lcOwn": "local",
                "maxSev": "cleared",
                "modTs": "never",
                "nodeId": "102",
                "podId": "1",
                "prev": "20",
                "rn": "nodehealth-102",
                "status": "",
                "twScore": "100",
                "updTs": "2020-08-11T17:41:31.400+02:00",
                "weight": "1"
              }
            }
          },
          {
            "healthInst": {
              "attributes": {
                "childAction": "",
                "chng": "400",
                "cur": "100",
                "maxSev": "cleared",
                "modTs": "never",
                "prev": "20",
                "rn": "health",
                "status": "",
                "twScore": "100",
                "updTs": "2020-08-11T17:41:32.306+02:00"
              }
            }
          }
        ]

In the query I am only interested in healthInst, but I have not found a way to filter on just healthInst when querying the APIC API.
With Go gjson I have not found a way to express the parsing string for an array of different objects. From the documentation it looks like code is needed. The only way I know of right now is to sort that string, and since healthInst will be "before" healthNodeInst, we can apply the .0. index, like this:

This is not in any way a solid solution.
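
One code-based alternative that does not depend on ordering is to walk the children array and pick the entry by class name. The sketch below uses only encoding/json rather than gjson; `findChildAttr` and the `child` type are hypothetical, and the sample data is a trimmed version of the response above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// child maps a class name (e.g. "healthInst") to its object payload.
type child map[string]struct {
	Attributes map[string]string `json:"attributes"`
}

// findChildAttr returns the given attribute of the first child whose class
// name matches, regardless of its position in the array.
func findChildAttr(childrenJSON []byte, class, attr string) (string, bool) {
	var children []child
	if err := json.Unmarshal(childrenJSON, &children); err != nil {
		return "", false
	}
	for _, c := range children {
		if obj, ok := c[class]; ok {
			v, ok := obj.Attributes[attr]
			return v, ok
		}
	}
	return "", false
}

func main() {
	data := []byte(`[
	  {"healthNodeInst": {"attributes": {"cur": "100", "rn": "nodehealth-101"}}},
	  {"healthNodeInst": {"attributes": {"cur": "100", "rn": "nodehealth-102"}}},
	  {"healthInst": {"attributes": {"cur": "100", "rn": "health"}}}
	]`)
	rn, ok := findChildAttr(data, "healthInst", "rn")
	fmt.Println(rn, ok)
}
```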

ACI 5.2: Update login/logout API endpoints

These endpoints are not correct

urlMap["login"] = "/api/mo/aaaLogin.xml"
urlMap["logout"] = "/api/mo/aaaLogout.xml"

and are not valid in ACI 5.2 and should be updated to

urlMap["login"] = "/api/aaaLogin.xml"
urlMap["logout"] = "/api/aaaLogout.xml"

The new syntax should work across previous ACI versions as well.

Add static labels

Describe the solution you'd like
Be able to add static labels to queries that are not parsed from the query response. Like:

static_labels:
  - key: xyz
    value: XYZ

Implement Pagination Support

Currently aci-exporter works fine for most configurations, but on a large-scale fabric, a query that returns too many objects might hit the maximum response size the APIC can handle, and the query will fail.

aci-exporter should implement pagination.
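
The general shape of a paginated fetch could look like the Go sketch below. It is only an outline: the APIC REST API documents `page` and `page-size` query parameters, but here the HTTP call is abstracted behind a hypothetical `fetchFunc`, and `fakeFetch` merely simulates an APIC so the loop can be run. A real implementation would probably also need a stable ordering (for example via `order-by`) so that pages do not overlap.

```go
package main

import "fmt"

// page is a simplified stand-in for one APIC response.
type page struct {
	TotalCount int      // value of "totalCount" in the response
	Items      []string // stand-in for the "imdata" entries
}

// fetchFunc is hypothetical: it returns page number pageNo with at most
// pageSize objects (think "?page=<pageNo>&page-size=<pageSize>").
type fetchFunc func(pageNo, pageSize int) (page, error)

// fetchAll requests pages until all objects have been collected.
func fetchAll(fetch fetchFunc, pageSize int) ([]string, error) {
	var all []string
	for pageNo := 0; ; pageNo++ {
		p, err := fetch(pageNo, pageSize)
		if err != nil {
			return nil, fmt.Errorf("page %d: %w", pageNo, err)
		}
		all = append(all, p.Items...)
		// Stop on an empty page as well, to avoid looping forever.
		if len(p.Items) == 0 || len(all) >= p.TotalCount {
			return all, nil
		}
	}
}

// fakeFetch simulates an APIC holding `total` objects.
func fakeFetch(total int) fetchFunc {
	return func(pageNo, pageSize int) (page, error) {
		p := page{TotalCount: total}
		for i := pageNo * pageSize; i < total && i < (pageNo+1)*pageSize; i++ {
			p.Items = append(p.Items, fmt.Sprintf("obj-%d", i))
		}
		return p, nil
	}
}

func main() {
	objs, err := fetchAll(fakeFetch(25), 10)
	fmt.Println(len(objs), err)
}
```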

value transformation does not work

Describe the bug
value_transform does not work. It always displays 0 as id, no matter the value of the operstatus.

    class_name: eqptPsu
    metrics:
      - name: infra_node_psu
        value_name: X
        type: "gauge"
        help: "Returns the info of the node psu states"
        unit: "info"
        value_calculation: "1"
    value_transform:
        'unknown':            0
        'ok':                 1
        'fail':               2
        'absent':             3
        'shut':               4
        'mismatch':           5


It is possible to generate a metric file with empty # HELP and / or # TYPE lines

Describe the bug
If you don't set the type and/or help field in the YAML, the metric file is generated successfully, but Prometheus will fail to scrape it.

To Reproduce
If you leave the type and/or help lines commented out in the following metric, the file will be generated with empty # HELP and/or # TYPE lines:

    metrics:
      - name: test_metric
        value_name: eqptCh.attributes.model
#        type: "gauge"
        unit: "ratio"
#        help: "Returns the kernel space cpu load of a fabric node"

Expected behavior
Option 1: The program should detect the YAML file as invalid.
Option 2: The program should replace the missing fields with "default" values: type: "gauge" / help: "Missing".
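
The two options can be combined, as in this Go sketch: reject what cannot be defaulted and fill in the rest. The `metric` struct and `applyDefaults` are hypothetical and only mirror the YAML fields shown above; the default values are the ones proposed in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// metric mirrors the name/type/help fields of a metric entry in the YAML.
type metric struct {
	Name string
	Type string
	Help string
}

// applyDefaults rejects a metric without a name and fills in the defaults
// suggested in this issue when type or help is missing.
func applyDefaults(m metric) (metric, error) {
	if m.Name == "" {
		return m, errors.New("metric name is required")
	}
	if m.Type == "" {
		m.Type = "gauge"
	}
	if m.Help == "" {
		m.Help = "Missing"
	}
	return m, nil
}

func main() {
	m, err := applyDefaults(metric{Name: "test_metric"})
	fmt.Printf("%+v %v\n", m, err)
}
```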

auth fails for some passwords containing special characters

Auth failed for a user with a longish password that has several special characters in it. Putting the user and password between double quotes in the XML fixed the issue for me.

diff --git a/aci-connection.go b/aci-connection.go
index ccfc6dd..d4c9110 100644
--- a/aci-connection.go
+++ b/aci-connection.go
@@ -84,7 +84,7 @@ func newAciConnction(ctx context.Context, fabricConfig *Fabric) *AciConnection {
 func (c AciConnection) login() error {
        for i, controller := range c.fabricConfig.Apic {
                _, status, err := c.doPostXML("login", fmt.Sprintf("%s%s", controller, c.URLMap["login"]),
-                       []byte(fmt.Sprintf("<aaaUser name=%s pwd=%s/>", c.fabricConfig.Username, c.fabricConfig.Password)))
+                       []byte(fmt.Sprintf("<aaaUser name=\"%s\" pwd=\"%s\"/>", c.fabricConfig.Username, c.fabricConfig.Password)))
                if err != nil || status != 200 {

                        err = fmt.Errorf("failed to login to %s, try next apic", controller)

Add a aci_up metric

Is your feature request related to a problem? Please describe.
Currently no metrics are returned if there is an issue with the login to the APIC.

Describe the solution you'd like
An additional aci_up metric that follows the exporter pattern: return 1 if the target can be scraped and 0 if it fails.

Describe alternatives you've considered
N/A

Additional context
N/A

Convert string to float for operSpeed

Describe the bug
The exporter is unable to convert operSpeed string values to float.

To Reproduce

class_queries:
  interface_info:
    class_name: l1PhysIf
    query_parameter: "?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, \"down\"))"
    metrics:
# It works here
      - name: interface_oper_state
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSt
        type: gauge
        help: The current operational state of the interface. (0=unknown, 1=down, 2=up, 3=link-up)
        # A string to float64 transform table of the value
        value_transform:
          'down':               0       ## ~ disabled interfaces
          'up':                 1       ## ~ enabled interfaces     

# But not here
      - name: interface_oper_speed
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        type: gauge
        help: The current operational speed of the interface, in bits per second.
        value_transform:
          'unknown':            0
          '100M':       100000000
          '1G':        1000000000
          '10G':      10000000000
          '25G':      25000000000
          '40G':      40000000000
          '100G':    100000000000
          '400G':    400000000000

Additional context
{"level":"info","msg":"could not convert value to float, will return 0.0 ","time":"2023-08-11T09:56:17Z","value":"100G"}
I get this log entry for all values (1G, 10G, etc.).

It works for all other attributes of this class, e.g. operSt, but not for operSpeed.
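
Independent of why the transform table is not applied here, the mapping itself can also be computed instead of enumerated. The Go sketch below is one hypothetical way to do that; `speedToBits` is not an aci-exporter function, and it only knows the "unknown" value and the M/G suffixes that appear in the table above.

```go
package main

import (
	"fmt"
	"strconv"
)

// speedToBits converts an operSpeed string such as "100G" to bits per second.
func speedToBits(s string) (float64, error) {
	if s == "unknown" {
		return 0, nil
	}
	if len(s) < 2 {
		return 0, fmt.Errorf("unexpected speed: %q", s)
	}
	// Multiplier per suffix; extend if other suffixes show up.
	units := map[byte]float64{'M': 1e6, 'G': 1e9}
	mult, ok := units[s[len(s)-1]]
	if !ok {
		return 0, fmt.Errorf("unknown unit in %q", s)
	}
	n, err := strconv.ParseFloat(s[:len(s)-1], 64)
	if err != nil {
		return 0, fmt.Errorf("bad number in %q: %w", s, err)
	}
	return n * mult, nil
}

func main() {
	for _, s := range []string{"unknown", "100M", "1G", "100G", "400G"} {
		v, err := speedToBits(s)
		fmt.Printf("%s -> %.0f (err: %v)\n", s, v, err)
	}
}
```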

label regex not executed on child class

I am trying to extract the EPG to Port mapping with this code

  epg_to_port:
    class_name: vlanCktEp
    query_parameter: '?rsp-subtree-include=required&rsp-subtree-class=l2RsPathDomAtt&rsp-subtree=children'
    metrics:
      - name: dynamic_binding
        value_name: vlanCktEp.attributes.pcTag
        type: gauge
    labels:
      - property_name: vlanCktEp.attributes.epgDn
        regex: "^uni/tn-(?P<tenant>.*)/ap-(?P<app>.*)/epg-(?P<epg>.*)"
      - property_name: vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn
        regex: "^topology/pod-(?P<podid>[1-9][0-9]*)/node-(?P<nodeid>[1-9][0-9]+)/sys/conng/path-\\[(?P<interface>[^\\]]+)\\]"

But it seems the vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn label is not processed.

I found a way to make it work by passing the child in the metric.
For example if I use this in the metrics value_name vlanCktEp.children.[l2RsPathDomAtt].attributes.parentSKey then the vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn label is added as well as vlanCktEp.attributes.epgDn

Is this expected?

Add static label in metrics

Hi @thenodon :

Describe the bug
Is it possible to add a static label in a class query:

BEFORE
aci_node_system_psu_status{aci="MY_FABRIC",fabric="MY_FABRIC",node_id="2",pod_id="1",psu_slot_id="6"} 1

AFTER
aci_node_system_psu_status{aci="MY_FABRIC",fabric="MY_FABRIC",node_id="2",pod_id="1",psu_slot_id="6",sensor_name="PDU"} 1

By static I mean that this label doesn't correspond to any class attribute. It is like a custom label that is added to the config.

More realistic ethpmDOMCurrentStats example

Is your feature request related to a problem? Please describe.
The example used under "Parsing and metrics" is not realistic as it doesn't extract a useful metric.

Describe the solution you'd like
It would be great if the example would show how to extract all the relevant metrics.

Examples of json-result from ACI

{
	"totalCount": "1",
	"imdata": [
		{
			"ethpmDOMCurrentStats": {
				"attributes": {
					"alert": "none",
					"childAction": "",
					"dn": "topology/pod-1/node-101/sys/phys-[eth1/1]/phys/domstats/current",
					"hiAlarm": "12.000001",
					"hiAlarm2": "0.000000",
					"hiAlarm3": "0.000000",
					"hiAlarm4": "0.000000",
					"hiAlarm5": "0.000000",
					"hiAlarm6": "0.000000",
					"hiAlarm7": "0.000000",
					"hiAlarm8": "0.000000",
					"hiWarn": "10.500001",
					"hiWarn2": "0.000000",
					"hiWarn3": "0.000000",
					"hiWarn4": "0.000000",
					"hiWarn5": "0.000000",
					"hiWarn6": "0.000000",
					"hiWarn7": "0.000000",
					"hiWarn8": "0.000000",
					"lanes": "1",
					"loAlarm": "1.000000",
					"loAlarm2": "0.000000",
					"loAlarm3": "0.000000",
					"loAlarm4": "0.000000",
					"loAlarm5": "0.000000",
					"loAlarm6": "0.000000",
					"loAlarm7": "0.000000",
					"loAlarm8": "0.000000",
					"loWarn": "2.500000",
					"loWarn2": "0.000000",
					"loWarn3": "0.000000",
					"loWarn4": "0.000000",
					"loWarn5": "0.000000",
					"loWarn6": "0.000000",
					"loWarn7": "0.000000",
					"loWarn8": "0.000000",
					"modTs": "never",
					"status": "",
					"value": "5.640000",
					"value2": "0.000000",
					"value3": "0.000000",
					"value4": "0.000000",
					"value5": "0.000000",
					"value6": "0.000000",
					"value7": "0.000000",
					"value8": "0.000000"
				}
			}
		}
	]
}
{
	"totalCount": "1",
	"imdata": [
		{
			"ethpmDOMCurrentStats": {
				"attributes": {
					"alert": "none",
					"childAction": "",
					"dn": "topology/pod-1/node-101/sys/phys-[eth1/36]/phys/domstats/current",
					"hiAlarm": "14.996000",
					"hiAlarm2": "14.996000",
					"hiAlarm3": "14.996000",
					"hiAlarm4": "14.996000",
					"hiAlarm5": "14.996000",
					"hiAlarm6": "14.996000",
					"hiAlarm7": "14.996000",
					"hiAlarm8": "14.996000",
					"hiWarn": "12.998000",
					"hiWarn2": "12.998000",
					"hiWarn3": "12.998000",
					"hiWarn4": "12.998000",
					"hiWarn5": "12.998000",
					"hiWarn6": "12.998000",
					"hiWarn7": "12.998000",
					"hiWarn8": "12.998000",
					"lanes": "8",
					"loAlarm": "4.496000",
					"loAlarm2": "4.496000",
					"loAlarm3": "4.496000",
					"loAlarm4": "4.496000",
					"loAlarm5": "4.496000",
					"loAlarm6": "4.496000",
					"loAlarm7": "4.496000",
					"loAlarm8": "4.496000",
					"loWarn": "5.000000",
					"loWarn2": "5.000000",
					"loWarn3": "5.000000",
					"loWarn4": "5.000000",
					"loWarn5": "5.000000",
					"loWarn6": "5.000000",
					"loWarn7": "5.000000",
					"loWarn8": "5.000000",
					"modTs": "never",
					"status": "",
					"value": "28.256001",
					"value2": "28.256001",
					"value3": "24.744001",
					"value4": "33.856003",
					"value5": "16.448000",
					"value6": "16.448000",
					"value7": "25.702002",
					"value8": "24.676001"
				}
			}
		}
	]
}

The relevant parts of the data:

value: actual value for lane 1
value2: actual value for lane 2 and so on

hiAlarm: high threshold value for alarm for lane 1, alarm will be triggered above this value
hiAlarm2: high threshold value for alarm for lane 2, alarm will be triggered above this value

loAlarm: low threshold value for alarm for lane 1, alarm will be triggered below this value

Similar for hiWarn and loWarn.

The most important metrics are "value" for all the lanes as this indicates the optical quality of the link.
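
To make the per-lane layout concrete, here is a small Go sketch that reads the "lanes" attribute and collects value, value2, ... up to that count. `laneValues` is a hypothetical helper working on an already decoded attributes map; it is not how aci-exporter is configured today.

```go
package main

import (
	"fmt"
	"strconv"
)

// laneValues returns the measured value for each active lane, in lane order.
// Lane 1 is stored as "value", lane N (N > 1) as "valueN".
func laneValues(attrs map[string]string) ([]float64, error) {
	lanes, err := strconv.Atoi(attrs["lanes"])
	if err != nil {
		return nil, fmt.Errorf("bad lanes attribute: %w", err)
	}
	values := make([]float64, 0, lanes)
	for lane := 1; lane <= lanes; lane++ {
		key := "value"
		if lane > 1 {
			key = fmt.Sprintf("value%d", lane)
		}
		raw, ok := attrs[key]
		if !ok {
			return nil, fmt.Errorf("missing attribute %q", key)
		}
		v, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			return nil, fmt.Errorf("attribute %q: %w", key, err)
		}
		values = append(values, v)
	}
	return values, nil
}

func main() {
	// Trimmed version of the first response above (one lane).
	attrs := map[string]string{"lanes": "1", "value": "5.640000", "value2": "0.000000"}
	fmt.Println(laneValues(attrs))
}
```

Each returned value could then be exported as one sample with a lane label, which avoids emitting the unused zero-valued lanes.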

static_binding_info: value_regex_transformation not working

I am trying to extract the static binding info with this class query:

      static_binding_info:
        class_name: fvAEPg
        query_parameter: '?rsp-subtree-include=required&rsp-subtree-class=fvRsPathAtt&rsp-subtree=children'
        metrics:
          - name: static_binding
            value_name: fvAEPg.children.[fvRsPathAtt].attributes.encap
            type: gauge
            value_regex_transformation: "vlan-(.*)"
            help: "Static Binding Infos"
        labels:
          - property_name: fvAEPg.attributes.dn
            regex: "^uni/tn-(?P<tenant>.*)/ap-(?P<app>.*)/epg-(?P<epg>.*)"
          - property_name: fvAEPg.attributes.[.*].attributes.tDn
            regex: "^topology/pod-(?P<podid>[1-9][0-9]*)/(protpaths|paths)-(?P<nodeid>[1-9][0-9].*)/pathep-\\[(?P<port>.+)\\]"
          - property_name: fvAEPg.attributes.[.*].attributes.encap
            regex: "^(?P<encap>.*)"

The query works, but it seems the value_regex_transformation is not triggered.

I get this log message:
{"level":"info","msg":"could not convert value to float, will return 0.0 ","time":"2023-07-12T03:02:43Z","value":"vlan-110"}
It matches the info message of the toFloat function, so it seems toFloatTransform isn't called?

the returned data is
aci_static_binding{aci="ACI Fabric2",app="Trex",encap="vlan-110",epg="Trex-2",fabric="fab2",nodeid="203-204",podid="1",port="bm-01-40G_PolGrp",tenant="Trex"} 0

There are no other error messages so I am not sure why this is happening.

Implement concurrent requests call to the apic

Describe the solution you'd like
In the current implementation all requests to the ACI APIC are done sequentially, so the scrape time is the sum of all the request times. Instead, a goroutine should be used for each HTTP request, with the response returned on one or more channels. Depending on the parallelism of the machine, the scrape time would then be close to the longest response time of all the requests made to the APIC.

This will have little effect on scrape time if the number of APIC queries is low, but it will have a major effect when the number increases, and it keeps the scrape time more "constant" as the number of queries grows.
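
The pattern described above can be sketched in Go as follows. This is an illustration, not the exporter's implementation: `runQueries` and `result` are hypothetical names, and the `do` callback stands in for the real HTTP request to the APIC.

```go
package main

import (
	"fmt"
	"time"
)

// result carries the outcome of one query back to the collector.
type result struct {
	Query string
	Body  string
	Err   error
}

// runQueries starts one goroutine per query and gathers the responses from
// a channel. Results arrive in completion order, not in query order.
func runQueries(queries []string, do func(string) (string, error)) []result {
	ch := make(chan result, len(queries)) // buffered so no sender blocks
	for _, q := range queries {
		go func(q string) {
			body, err := do(q)
			ch <- result{Query: q, Body: body, Err: err}
		}(q)
	}
	results := make([]result, 0, len(queries))
	for range queries {
		results = append(results, <-ch)
	}
	return results
}

func main() {
	// slowQuery simulates an APIC call that takes 50ms.
	slowQuery := func(q string) (string, error) {
		time.Sleep(50 * time.Millisecond)
		return "response for " + q, nil
	}
	start := time.Now()
	results := runQueries([]string{"fvAEPg", "l1PhysIf", "eqptPsu"}, slowQuery)
	fmt.Println(len(results), "results in", time.Since(start))
}
```

A production version would likely cap the number of in-flight requests so that a long query list does not overload the APIC.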

Reduce scraping time

Describe the bug
As my fabric and my exporter configuration are getting bigger, I would like to know if there is a way to reduce scraping time, for example by running multiple class queries at once.

The prefix is not reflected in the # HELP and # TYPE lines

Describe the bug
When you use a prefix, the # HELP and # TYPE lines in the generated metric file do not reflect it,
effectively rendering those lines ineffective.

To Reproduce
Steps to reproduce the behavior:

  1. Use prefix: aci_
  2. Look at the generated files
# HELP health_ratio Returns health score
# TYPE health_ratio gauge
aci_health_ratio

Expected behavior
It should be:

# HELP aci_health_ratio Returns health score
# TYPE aci_health_ratio gauge
aci_health_ratio
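
In code terms, the expected behavior amounts to building the full metric name once and using it on all three lines. A minimal Go sketch, with `formatMetric` as a hypothetical function and the sample value chosen arbitrarily:

```go
package main

import "fmt"

// formatMetric renders one metric in the Prometheus text format, applying
// the prefix to the HELP and TYPE lines as well as to the sample line.
func formatMetric(prefix, name, help, typ string, value float64) string {
	full := prefix + name
	return fmt.Sprintf("# HELP %s %s\n# TYPE %s %s\n%s %g\n",
		full, help, full, typ, full, value)
}

func main() {
	fmt.Print(formatMetric("aci_", "health_ratio", "Returns health score", "gauge", 0.95))
}
```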

Labels should always be lowercase

Describe the bug
The labels set by the user in the YAML files can contain uppercase characters.

Expected behavior
As a best practice, labels should always be lowercase.

You can find regexes that validate label names and metric names in this issue, for example: prometheus/client_java#28

For label names, maybe a .ToLower() could be an option and would minimise rework for existing files.
... But it might break existing queries
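
A sketch of the .ToLower() idea in Go, combined with the label name pattern from the Prometheus data model (`[a-zA-Z_][a-zA-Z0-9_]*`). `normalizeLabel` is a hypothetical helper; whether the exporter should lowercase silently or reject such names is exactly the open question above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// labelRe is the label name pattern from the Prometheus data model.
var labelRe = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// normalizeLabel lowercases a label name and rejects names that are not
// valid Prometheus label names.
func normalizeLabel(name string) (string, error) {
	lower := strings.ToLower(name)
	if !labelRe.MatchString(lower) {
		return "", fmt.Errorf("invalid label name: %q", name)
	}
	return lower, nil
}

func main() {
	for _, n := range []string{"NodeId", "pod_id", "pod-id"} {
		l, err := normalizeLabel(n)
		fmt.Printf("%q -> %q (err: %v)\n", n, l, err)
	}
}
```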

Reuse client login cookie data between api calls

Is your feature request related to a problem? Please describe.
Every query api call is doing a new login adding additional latency.

Describe the solution you'd like
After a valid login reuse the token until it fails
