Comments (8)

thenodon commented on September 23, 2024

@ahmedaall The first thing to do is to run each query as a separate Prometheus job. Even though each query is executed in its own goroutine, the scrape takes as long as the slowest query; see README.md for more info.
There is also work going on in branch #37. The idea is to use paging, with each "page" executed in its own goroutine.
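
To illustrate the point about concurrent queries: when queries overlap, the scrape lasts roughly as long as the slowest one rather than the sum of all of them. A minimal Python sketch of that arithmetic follows; the function name, the `node_health` query name, and the millisecond values are made up for illustration and are not measurements from the exporter.

```python
def scrape_time_ms(query_times_ms, concurrent=True):
    """Estimate scrape duration from per-query durations (milliseconds)."""
    if not query_times_ms:
        return 0
    times = query_times_ms.values()
    # Concurrent queries overlap, so the slowest one sets the scrape time;
    # sequential queries add up.
    return max(times) if concurrent else sum(times)


# Hypothetical per-query durations, for illustration only.
example = {"interface_info": 2000, "uptime_topsystem": 300, "node_health": 450}
print(scrape_time_ms(example, concurrent=True))   # 2000
print(scrape_time_ms(example, concurrent=False))  # 2750
```

This is why moving a slow query into its own Prometheus job helps the other jobs, but does nothing for the slow query itself.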

from aci-exporter.

ahmedaall commented on September 23, 2024

@thenodon In my config I only have one class query that contains all my queries, plus a group class query for the health ratio. I think I first have to implement multiple class queries.

thenodon commented on September 23, 2024

@ahmedaall You can call a single query by name, as described in README.md:

curl -s 'http://localhost:9643/probe?target=XYZ&queries=ethpmdomstats'

where ethpmdomstats is a named query in the config.
I think you can request multiple named queries by separating them with commas. If that does not work, it's a bug.
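
For anyone scripting these calls, here is a small Python sketch that builds the same /probe URL for one or several named queries. The helper name `build_probe_url` is hypothetical, and the comma-separated form assumes the exporter accepts it as described above.

```python
from urllib.parse import urlencode


def build_probe_url(base_url, target, queries):
    """Return a /probe URL requesting the given named queries for one target."""
    params = {"target": target, "queries": ",".join(queries)}
    # safe="," keeps the comma separator readable instead of encoding it as %2C.
    return f"{base_url.rstrip('/')}/probe?{urlencode(params, safe=',')}"


print(build_probe_url("http://localhost:9643", "XYZ", ["ethpmdomstats"]))
print(build_probe_url("http://localhost:9643", "XYZ",
                      ["interface_info", "uptime_topsystem"]))
```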

ahmedaall commented on September 23, 2024

@thenodon OK, so I have to split up my queries via curl and create a Prometheus job for each of them. I know how to point to multiple endpoints in the Prometheus config file, but I don't know how to split my config YAML file into several endpoints.

thenodon commented on September 23, 2024

@ahmedaall Your queries are already decomposed. If you look at https://github.com/opsdis/aci-exporter/blob/master/example-config.yaml, interface_info is one query, uptime_topsystem is another, etc. So in your prometheus.yml you can have a config like the one below to execute just the interface_info query:

          - job_name: 'aci_interface_info'
            scrape_interval: 1m
            scrape_timeout: 30s
            metrics_path: /probe
            params:
              queries: [ "interface_info" ]
            file_sd_configs:
              - files:
                  - 'file_sd/aci_file_discovery.yml'

            relabel_configs:
              - source_labels: [ __address__ ]
                target_label: __param_target
              - source_labels: [ __param_target ]
                target_label: instance
              - target_label: __address__
                replacement: localhost:9643
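
If you end up with many named queries, the job definitions can be generated rather than copied by hand. The following Python sketch builds one job per query as a plain dict mirroring the fields above; the helper name `scrape_job` is hypothetical, and writing the result out as YAML is left out because it would need a third-party library.

```python
def scrape_job(query, exporter="localhost:9643",
               sd_file="file_sd/aci_file_discovery.yml"):
    """Build one Prometheus scrape job (as a dict) for a single named query."""
    return {
        "job_name": f"aci_{query}",
        "scrape_interval": "1m",
        "scrape_timeout": "30s",
        "metrics_path": "/probe",
        "params": {"queries": [query]},
        "file_sd_configs": [{"files": [sd_file]}],
        "relabel_configs": [
            {"source_labels": ["__address__"], "target_label": "__param_target"},
            {"source_labels": ["__param_target"], "target_label": "instance"},
            {"target_label": "__address__", "replacement": exporter},
        ],
    }


# One job per named query from the exporter config.
jobs = [scrape_job(q) for q in ("interface_info", "uptime_topsystem")]
for job in jobs:
    print(job["job_name"], job["params"]["queries"])
```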

ahmedaall commented on September 23, 2024

@thenodon Got it.
My scraping time is long because I have all the interface_info metrics in a single query (config below). A curl with only these metrics takes approximately 2 seconds :). So I was trying to split the metrics so that each one has its own query. It doesn't seem to work when I have several queries using the same parent class_name (e.g. l1PhysIf). How can I split them and still have the parent's attributes?

PS: I am using child classes of l1PhysIf so that all interface metrics carry the "descr" attribute, which is mandatory in my infrastructure.

class_queries:
  interface_info:
    class_name: l1PhysIf
    query_parameter: "?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, \"down\"))"
    metrics:
      - name: interface_speed_temp
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        type: gauge
        help: The current operational speed of the interface, in bits per second.
#        value_transform:
#          'unknown': 0
#          '100M': 100000000
#          '1G': 1000000000
#          '10G': 10000000000
#          '25G': 25000000000
#          '40G': 40000000000
#          '100G': 100000000000
#          '400G': 400000000000    

      - name: interface_admin_state
        # The field in the json that is used as the metric value, qualified path (gjson) under imdata
        value_name: l1PhysIf.attributes.adminSt
        # Type
        type: gauge
        # Help text without prefix of metrics name
        help: The current admin state of the interface.
        value_transform:
          'down':               0       ## ~ disabled interfaces
          'up':                 1       ## ~ enabled interfaces

      - name: interface_oper_state
        # The field in the json that is used as the metric value, qualified path (gjson) under imdata
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSt
        # Type
        type: gauge
        # Help text without prefix of metrics name
        help: The current operational state of the interface. (0=down, 1=up)
        # A string to float64 transform table of the value
        value_transform:
          "down":               0       ## ~ disabled interfaces
          "up":                 1       ## ~ enabled interfaces     

      - name: interface_rx_unicast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.unicastCum
        type: counter
        help: The number of unicast bytes received on the interface since it was integrated into the fabric.
        unit: bytes

      - name: interface_rx_multicast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.multicastCum
        type: counter
        unit: bytes
        help: The number of multicast bytes received on the interface since it was integrated into the fabric.

      - name: interface_rx_broadcast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.floodCum
        type: counter
        unit: bytes
        help: The number of broadcast bytes received on the interface since it was integrated into the fabric.

      - name: interface_rx_buffer_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.bufferCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          buffer overrun while receiving since it was integrated into the
          fabric.
      - name: interface_rx_error_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.errorCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          packet error while receiving since it was integrated into the
          fabric.
      - name: interface_rx_forwarding_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.forwardingCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          forwarding issue while receiving since it was integrated into the
          fabric.
      - name: interface_rx_loadbal_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.lbCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          load balancing issue while receiving since it was integrated into
          the fabric.

      - name: interface_tx_unicast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.unicastCum
        type: counter
        help: The number of unicast bytes transmitted on the interface since it was integrated into the fabric.
        unit: bytes

      - name: interface_tx_multicast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.multicastCum
        type: counter
        unit: bytes
        help: The number of multicast bytes transmitted on the interface since it was integrated into the fabric.

      - name: interface_tx_broadcast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.floodCum
        type: counter
        unit: bytes
        help: The number of broadcast bytes transmitted on the interface since it was integrated into the fabric.

      - name: interface_tx_queue_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.afdWredCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface during queue
          management while transmitting since it was integrated into the
          fabric.

      - name: interface_tx_buffer_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.bufferCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          buffer overrun while transmitting since it was integrated into the
          fabric.

      - name: interface_tx_error_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.errorCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          packet error while transmitting since it was integrated into the
          fabric.

    # The labels to extract as regex
    labels:
      # The field in the json used to parse the labels from
      - property_name: l1PhysIf.attributes.dn
        # The regex where the string enclosed in the P<xyz> is the label name
        regex: "^topology/pod-(?P<pod_id>[1-9][0-9]*)/node-(?P<node_id>[1-9][0-9]*)/sys/phys-\\[(?P<interface_name>[^\\]]+)\\]"
        # Add the descr attribute to the fields
      - property_name: l1PhysIf.attributes.descr
        regex: "^(?P<interface_description>.*)"
      - property_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        regex: "^(?P<speed_temp>.*)"

thenodon commented on September 23, 2024

@ahmedaall Yes, you can split your single query into multiple queries even if they use the same class_name (see below), but it will not help you, since the majority of the time is spent waiting for the APIC response. If you split it up, you still have to make the same API call multiple times. In your example:

https://173.36.219.190/api/class/l1PhysIf.json?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, "down"))

It will still take the same time, but you will do it twice. Processing the result of the API call takes next to nothing compared to the API response time; in my test it is about 2 ms.
In my test environment the above API call takes approx. 500 ms. In addition, the exporter does a /login (~700 ms) and a /logout (~100 ms) call for every call. So when I run your example, the majority of the time is spent in the login API. Caching and reusing the login instead of doing it every time is on my list.
To understand the different time components of the processing, look for exec_time in the aci-exporter log output.
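
To make the breakdown concrete, here is a rough Python calculation using the approximate timings quoted above. It assumes the four steps simply add up one after another, and that a reused session would skip both /login and /logout; both are simplifications of mine, not something taken from the exporter code.

```python
# Approximate timings quoted in this comment, in milliseconds.
LOGIN_MS, QUERY_MS, LOGOUT_MS, PROCESS_MS = 700, 500, 100, 2


def probe_time_ms(with_session_calls=True):
    """Rough duration of one probe, with or without /login and /logout."""
    session = LOGIN_MS + LOGOUT_MS if with_session_calls else 0
    return session + QUERY_MS + PROCESS_MS


total = probe_time_ms()
print(total)                                     # 1302
print(round(100 * LOGIN_MS / total))             # 54 (percent spent in /login)
print(probe_time_ms(with_session_calls=False))   # 502
```

Under these assumptions, more than half of each probe is spent logging in, which is why reusing the session matters more than splitting the query.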

So in your case splitting the query will not decrease the scrape time; it will just increase the load on the APIC, since you will make the same call twice. It is possible, though, as you can see from my example below. With the configuration below you can run each query like this:

curl -s 'localhost:9643/probe?target=xyz&queries=interface_info'
curl -s 'localhost:9643/probe?target=xyz&queries=interface_info_more'
curl -s 'localhost:9643/probe?target=xyz&queries=interface_info,interface_info_more'

My recommendation is that you should not split up a query when the parts use the same class_name and query_parameter. Keep it as you already have it.
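
A quick way to check a config against this recommendation is to group the named queries by class_name and query_parameter and flag any group with more than one member. The Python sketch below works on an already parsed config; the helper name `duplicate_api_calls` is hypothetical, and the sample dict is a trimmed stand-in with shortened parameters, not a real configuration.

```python
from collections import defaultdict


def duplicate_api_calls(class_queries):
    """Group query names that share the same class_name and query_parameter."""
    groups = defaultdict(list)
    for name, query in class_queries.items():
        key = (query["class_name"], query.get("query_parameter", ""))
        groups[key].append(name)
    # Only groups with more than one query repeat an identical APIC call.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Trimmed-down stand-in for the parsed YAML config; parameters are shortened.
config = {
    "interface_info": {"class_name": "l1PhysIf",
                       "query_parameter": "?rsp-subtree=children"},
    "interface_info_more": {"class_name": "l1PhysIf",
                            "query_parameter": "?rsp-subtree=children"},
    "uptime_topsystem": {"class_name": "topSystem"},
}
print(duplicate_api_calls(config))
```

Any group it reports is a candidate for merging back into a single query.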

What can be improved is removing the overhead of doing the login api call every time.

class_queries:
  interface_info:
    class_name: l1PhysIf
    query_parameter: "?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, \"down\"))"
    metrics:
      - name: interface_speed_temp
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        type: gauge
        help: The current operational speed of the interface, in bits per second.
      #        value_transform:
      #          'unknown': 0
      #          '100M': 100000000
      #          '1G': 1000000000
      #          '10G': 10000000000
      #          '25G': 25000000000
      #          '40G': 40000000000
      #          '100G': 100000000000
      #          '400G': 400000000000

      - name: interface_admin_state
        # The field in the json that is used as the metric value, qualified path (gjson) under imdata
        value_name: l1PhysIf.attributes.adminSt
        # Type
        type: gauge
        # Help text without prefix of metrics name
        help: The current admin state of the interface.
        value_transform:
          'down':               0       ## ~ disabled interfaces
          'up':                 1       ## ~ enabled interfaces

      - name: interface_oper_state
        # The field in the json that is used as the metric value, qualified path (gjson) under imdata
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSt
        # Type
        type: gauge
        # Help text without prefix of metrics name
        help: The current operational state of the interface. (0=down, 1=up)
        # A string to float64 transform table of the value
        value_transform:
          "down":               0       ## ~ disabled interfaces
          "up":                 1       ## ~ enabled interfaces

    # The labels to extract as regex
    labels:
      # The field in the json used to parse the labels from
      - property_name: l1PhysIf.attributes.dn
        # The regex where the string enclosed in the P<xyz> is the label name
        regex: "^topology/pod-(?P<pod_id>[1-9][0-9]*)/node-(?P<node_id>[1-9][0-9]*)/sys/phys-\\[(?P<interface_name>[^\\]]+)\\]"
        # Add the descr attribute to the fields
      - property_name: l1PhysIf.attributes.descr
        regex: "^(?P<interface_description>.*)"
      - property_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        regex: "^(?P<speed_temp>.*)"

  interface_info_more:
    class_name: l1PhysIf
    query_parameter: "?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, \"down\"))"
    metrics:
      - name: interface_rx_unicast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.unicastCum
        type: counter
        help: The number of unicast bytes received on the interface since it was integrated into the fabric.
        unit: bytes

      - name: interface_rx_multicast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.multicastCum
        type: counter
        unit: bytes
        help: The number of multicast bytes received on the interface since it was integrated into the fabric.

      - name: interface_rx_broadcast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.floodCum
        type: counter
        unit: bytes
        help: The number of broadcast bytes received on the interface since it was integrated into the fabric.

      - name: interface_rx_buffer_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.bufferCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          buffer overrun while receiving since it was integrated into the
          fabric.
      - name: interface_rx_error_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.errorCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          packet error while receiving since it was integrated into the
          fabric.
      - name: interface_rx_forwarding_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.forwardingCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          forwarding issue while receiving since it was integrated into the
          fabric.
      - name: interface_rx_loadbal_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.lbCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          load balancing issue while receiving since it was integrated into
          the fabric.

      - name: interface_tx_unicast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.unicastCum
        type: counter
        help: The number of unicast bytes transmitted on the interface since it was integrated into the fabric.
        unit: bytes

      - name: interface_tx_multicast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.multicastCum
        type: counter
        unit: bytes
        help: The number of multicast bytes transmitted on the interface since it was integrated into the fabric.

      - name: interface_tx_broadcast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.floodCum
        type: counter
        unit: bytes
        help: The number of broadcast bytes transmitted on the interface since it was integrated into the fabric.

      - name: interface_tx_queue_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.afdWredCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface during queue
          management while transmitting since it was integrated into the
          fabric.

      - name: interface_tx_buffer_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.bufferCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          buffer overrun while transmitting since it was integrated into the
          fabric.

      - name: interface_tx_error_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.errorCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          packet error while transmitting since it was integrated into the
          fabric.

    # The labels to extract as regex
    labels:
      # The field in the json used to parse the labels from
      - property_name: l1PhysIf.attributes.dn
        # The regex where the string enclosed in the P<xyz> is the label name
        regex: "^topology/pod-(?P<pod_id>[1-9][0-9]*)/node-(?P<node_id>[1-9][0-9]*)/sys/phys-\\[(?P<interface_name>[^\\]]+)\\]"
        # Add the descr attribute to the fields
      - property_name: l1PhysIf.attributes.descr
        regex: "^(?P<interface_description>.*)"
      - property_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        regex: "^(?P<speed_temp>.*)"

thenodon commented on September 23, 2024

This is partially fixed in commit 810e4a4 on branch issue_42, which uses the refresh API instead of repeating the login for every scrape request.
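
For readers curious what "refresh instead of login" can look like in general, here is a minimal Python sketch of a session cache. It is not the code from that commit: the `login` and `refresh` callables are hypothetical placeholders for the real API calls, and the 600-second lifetime and the refresh-at-half-lifetime rule are illustrative assumptions.

```python
import time


class SessionCache:
    """Reuse a login token and refresh it before it expires."""

    def __init__(self, login, refresh, lifetime_s, clock=time.monotonic):
        self._login = login        # callable() -> token
        self._refresh = refresh    # callable(token) -> token
        self._lifetime_s = lifetime_s
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def token(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # No session yet, or it has expired: a full login is needed.
            self._token = self._login()
        elif now >= self._expires_at - self._lifetime_s / 2:
            # Session still valid but past half its lifetime: refresh it.
            self._token = self._refresh(self._token)
        else:
            return self._token
        self._expires_at = now + self._lifetime_s
        return self._token


# Simulated scrapes at t = 0, 100, 400 and 2000 seconds using a fake clock.
calls = []
fake_now = [0.0]
cache = SessionCache(
    login=lambda: calls.append("login") or "t1",
    refresh=lambda tok: calls.append("refresh") or tok + "r",
    lifetime_s=600,
    clock=lambda: fake_now[0],
)
for t in (0, 100, 400, 2000):
    fake_now[0] = t
    cache.token()
print(calls)  # ['login', 'refresh', 'login']
```

Four scrapes trigger only two logins here: the second reuses the token, the third refreshes it, and the fourth logs in again because the session has expired.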
