
Comments (65)

camrossi avatar camrossi commented on June 22, 2024 2

I tested against a real ACI 5.2(8) with my code and it worked just fine using infraCont; however, I did manage to make it crash...
It's the name you use: if it starts with a capital letter, it crashes.
Not sure if it's the same issue, but I see that @ahmedaall's fabric name is MY_FABRIC, so I would say probably?

data:
  config.yaml: |
    fabrics:
      Fab1:

But this works just fine. Can you check if you see the same behaviour?

data:
  config.yaml: |
    fabrics:
      fab1:

@thenodon let me see if I can get you read-only access to one of our DMZ fabrics so you can actually test on something real. It might not offer much flexibility in terms of versions, but it would be better than what you have now :D

Here is my crash trace when the name is uppercase:

2023/09/14 23:04:00 http: panic serving 10.32.0.11:36052: runtime error: invalid memory address or nil pointer dereference
goroutine 2288 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1854 +0xbf
panic({0x92ec20, 0xda7d70})
	/usr/local/go/src/runtime/panic.go:890 +0x263
main.AciConnection.login({{0xa7bb70, 0xc000b443f0}, 0x0, 0xc000940518, 0xc000b44870, 0xc000b44840, {{0xa78320, 0xc00013dcc0}, 0x0, {0xa7a688, ...}, ...}, ...})
	/build/aci-connection.go:85 +0x31
main.aciAPI.CollectMetrics({{0xa7bb70, 0xc000b443f0}, {{0xa7bb70, 0xc000b443f0}, 0x0, 0xc000940518, 0xc000b44870, 0xc000b44840, {{0xa78320, 0xc00013dcc0}, ...}, ...}, ...})
	/build/aci-api.go:109 +0xb8
main.HandlerInit.getMonitorMetrics({{0xc00032ef30?, 0xc00032f770?, 0xc00032f830?}, 0xc00032fd10?}, {0xa7b400, 0xc0002b0120}, 0xc000296200)
	/build/aci-exporter.go:272 +0x345
net/http.HandlerFunc.ServeHTTP(0x40d90a?, {0xa7b400?, 0xc0002b0120?}, 0x30?)
	/usr/local/go/src/net/http/server.go:2122 +0x2f
main.promMonitor.func1({0xa7b400?, 0xc0002b00e0}, 0xa76801?)
	/build/aci-exporter.go:345 +0xf8
net/http.HandlerFunc.ServeHTTP(0xa7bac8?, {0xa7b400?, 0xc0002b00e0?}, 0xa768c0?)
	/usr/local/go/src/net/http/server.go:2122 +0x2f
main.logcall.func1({0xa7b640?, 0xc00009e000}, 0xc000296100)
	/build/aci-exporter.go:322 +0x263
net/http.HandlerFunc.ServeHTTP(0xc000362000?, {0xa7b640?, 0xc00009e000?}, 0x40d90a?)
	/usr/local/go/src/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc0003e600b?, {0xa7b640, 0xc00009e000}, 0xc000296100)
	/usr/local/go/src/net/http/server.go:2500 +0x149
net/http.serverHandler.ServeHTTP({0xc000b44090?}, {0xa7b640, 0xc00009e000}, 0xc000296100)
	/usr/local/go/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc00027e240, {0xa7bb70, 0xc0003965d0})
	/usr/local/go/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3089 +0x5ed

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024 2

@ahmedaall @camrossi I have added documentation and a check on the /probe endpoint that the fabric name is in lower case. You can pull the issue_1 branch. For more info see commit 6924973.
I changed the query to @camrossi's suggestion for detecting the ACI name, commit acb7428.
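
For readers who want to see what such a lower-case check could look like, here is a minimal sketch. It is not the code from commit 6924973; `validateFabricName` is a hypothetical helper name, and the only assumption is that the exporter should reject a `target` value containing upper-case characters before it tries to look up the fabric.

```go
package main

import (
	"fmt"
	"strings"
)

// validateFabricName is a hypothetical helper (not the exporter's actual code).
// It rejects empty names and names that contain upper-case characters.
func validateFabricName(name string) error {
	if name == "" {
		return fmt.Errorf("fabric name is empty")
	}
	if name != strings.ToLower(name) {
		return fmt.Errorf("fabric name %q must be lower case", name)
	}
	return nil
}

func main() {
	// Names taken from the thread: fab1 works, Fab1 and MY_FABRIC crashed.
	for _, name := range []string{"fab1", "Fab1", "MY_FABRIC"} {
		if err := validateFabricName(name); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", name)
	}
}
```

Returning an error from the handler instead of continuing with an unknown fabric turns the crash into an ordinary failed request.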

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024 1

Hi @ahmedaall. I have not done any development on this issue, but if you would like to contribute, that would be great. Another option is to use an external LB that takes care of this and supports more options, such as health checks.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024 1

Exactly

from aci-exporter.

camrossi avatar camrossi commented on June 22, 2024 1

Hi folks,
Sorry, a bit late to the party... Just wanted to say that you can use the infraCont class, which returns the av for all the APICs. It's the same as what you do, but perhaps more elegant.

I also tested your code and it is not crashing for me.

I implemented this as a test, BUT I do not know Go. I am not joking: this is the first time I have written anything in Go, so it is probably horrible and I am amazed it even works.

func (p aciAPI) getAciName() (string, error) {
	// Return the configured or previously cached name if one is set.
	if p.connection.fabricConfig.AciName != "" {
		return p.connection.fabricConfig.AciName, nil
	}

	data, err := p.connection.getByClassQuery("infraCont", "query-target=self")
	if err != nil {
		return "", err
	}
	// Guard against an empty result so that indexing cannot panic.
	names := gjson.Get(data, "imdata.#.infraCont.attributes.fbDmNm").Array()
	if len(names) > 0 {
		p.connection.fabricConfig.AciName = names[0].Str
	}

	if p.connection.fabricConfig.AciName != "" {
		return p.connection.fabricConfig.AciName, nil
	}
	return "", fmt.Errorf("could not determine ACI name")
}
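
As a standalone illustration of the same lookup, the sketch below extracts `fbDmNm` from an infraCont response using only the standard library. The JSON shape is an assumption inferred from the gjson path `imdata.#.infraCont.attributes.fbDmNm` used above, the sample payload is made up, and `fabricNameFromInfraCont` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// infraContResponse models only the fields needed here. The layout is an
// assumption based on the gjson path imdata.#.infraCont.attributes.fbDmNm.
type infraContResponse struct {
	Imdata []struct {
		InfraCont struct {
			Attributes struct {
				FbDmNm string `json:"fbDmNm"`
			} `json:"attributes"`
		} `json:"infraCont"`
	} `json:"imdata"`
}

// fabricNameFromInfraCont (hypothetical helper) returns the first non-empty
// fabric domain name found in the response body.
func fabricNameFromInfraCont(data []byte) (string, error) {
	var resp infraContResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return "", err
	}
	for _, item := range resp.Imdata {
		if name := item.InfraCont.Attributes.FbDmNm; name != "" {
			return name, nil
		}
	}
	return "", errors.New("could not determine ACI name")
}

func main() {
	// Made-up sample payload for illustration only.
	sample := []byte(`{"imdata":[{"infraCont":{"attributes":{"fbDmNm":"fab1"}}}]}`)
	name, err := fabricNameFromInfraCont(sample)
	fmt.Println(name, err)
}
```

Iterating over `imdata` instead of indexing element 0 means an empty result produces an error rather than a panic.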

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024 1

@thenodon I just launched the sandbox. I am a newbie with it. I'll do the test as fast as possible.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024 1

The cisco sandbox is on the internet, https://devnetsandbox.cisco.com/RM/Topology, so just configure aci-exporter with:

fabrics:
  cisco_sandbox:
    username: admin
    password: "!v3G@!4@Y"
    apic:
      - https://sandboxapicdc.cisco.com

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024 1

@ahmedaall can you run the same config with the same query against your own APIC? Just add your config in the fabrics section of the config file.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024 1

@camrossi Yes, I didn't mention it, but I did update my target in the Prometheus file :)
The problem was that my browser automatically uppercased my target... So I tried with another browser and the exporter works!
Now I will restart my failover test by shutting down my APIC1. I'll keep you informed.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024 1

@thenodon not yet... I'll do it today or tomorrow. Sorry for the time it takes.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024 1

@thenodon OK, I just finished my failover test. Everything works perfectly. Even if I shut down 2 of my 3 APICs, it works well!

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024 1

Great @ahmedaall. If you have the time to share your setup I can add it to the README. Hope you are now happy with aci-exporter and give it a star.
I will make a release in the coming days.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Hi,
I was about to make the same issue request but you read my mind.
Have you been able to make progress on this point ?
Thx

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Hi @thenodon. During my failover test I saw that after shutting down my APIC1 the exporter became completely unreachable (whereas when I cut APIC2 or APIC3 the exporter still works perfectly). So instead of having my 3 APIC addresses in my config, I put the address of the load balancer, which round-robins between them with session persistence:

# Profiles for different fabrics
fabrics:
  # This is the Cisco provided sandbox that is open for testing
  # cisco_sandbox:
  #   username: admin
  #   password: <check the cisco sandbox to get the password>
  #   apic:
  #     - https://sandboxapicdc.cisco.com

  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://LB
#      - https://APIC1
#      - https://APIC2
#      - https://APIC3

So I ran my APIC redundancy failover test again by shutting down APIC1. At that moment the aci_up metric goes to DOWN:

# HELP aci_up The connection state 1=UP, 0=DOWN
# TYPE aci_up gauge
aci_up{fabric="MY_FABRIC"} 0

PS: The load balancer round-robins between the APICs properly with browser access during the failover, but the exporter still goes down.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Hi @ahmedaall, great that you are testing this. I do not have an environment where I can test this myself. I think there should be something in the logs that could help. It would be great if you could attach them. You could also do some debugging. I think a breakpoint in aci-connection.go in the login function is a good starting point, since it's called on every request.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Hi @thenodon. Yes, here are the logs after noticing that the exporter goes down only when apic1 is down.
After restarting apic1, I turned off apic2 and then apic3, and the exporter kept working properly and switched to the next available APIC:

BEFORE SHUTTING DOWN APIC1 : 
{"class":"topSystem","exec_time":1194174,"fabric":"MY_FABRIC","length":54981,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/topSystem.json?rsp-subtree-include=health"}
{"class":"fvTenant","exec_time":1194421,"fabric":"MY_FABRIC","length":2164,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fvTenant.json?rsp-subtree-include=health,required"}
{"class":"faults","exec_time":1195618,"fabric":"MY_FABRIC","length":2879,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/faultCountsWithDetails.json"}
{"class":"fvCtx","exec_time":1196088,"fabric":"MY_FABRIC","length":7894,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fvCtx.json?rsp-subtree-include=health,required"}
{"class":"eqptTemp5min","exec_time":1206707,"fabric":"MY_FABRIC","length":21586,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/eqptTemp5min.json?rsp-subtree-include=stats\u0026rsp-subtree-class=eqptTemp5min\u0026query-target-filter=wcard(eqptTemp5min.dn,\".*sup/sensor-2/CDeqptTemp5min\")"}
{"class":"fvAp","exec_time":1210511,"fabric":"MY_FABRIC","length":3642,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fvAp.json?rsp-subtree-include=health,required"}
{"class":"fabricHealthTotal","exec_time":1211007,"fabric":"MY_FABRIC","length":544,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fabricHealthTotal.json?query-target-filter=wcard(fabricHealthTotal.dn,\"topology/.*/health\")"}
{"class":"procSysMem5min","exec_time":1211795,"fabric":"MY_FABRIC","length":26934,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/procSysMem5min.json"}
{"exec_time":2271430,"fabric":"MY_FABRIC","level":"info","msg":"total scrape time ","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","time":"2023-08-21T06:58:44Z"}
{"exec_time":35451,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:45Z","uri":"https://APIC2/api/aaaLogout.xml"}
{"exec_time":4286804,"fabric":"MY_FABRIC","length":1064980,"level":"info","method":"GET","msg":"api call","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:46Z","uri":"/probe?target=MY_FABRIC"}

AFTER SHUTTING DOWN APIC1 : 
{"exec_time":90839,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":200,"time":"2023-08-21T07:04:31Z","uri":"https://APIC2/api/aaaLogin.xml"}
{"fabric":"MY_FABRIC","level":"info","msg":"Using apic https://APIC2","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","time":"2023-08-21T07:04:31Z"}
{"class":"aci_name","exec_time":90723882,"fabric":"MY_FABRIC","length":0,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":503,"time":"2023-08-21T07:06:01Z","uri":"https://APIC2/api/mo/topology/pod-1/node-1/av.json"}
{"fabric":"MY_FABRIC","level":"error","msg":"Request /api/mo/topology/pod-1/node-1/av.json failed - ACI api returned 503.","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","time":"2023-08-21T07:06:01Z"}
{"exec_time":25825,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":200,"time":"2023-08-21T07:06:01Z","uri":"https://APIC2/api/aaaLogout.xml"}
{"exec_time":90841072,"fabric":"MY_FABRIC","length":101,"level":"info","method":"GET","msg":"api call","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":503,"time":"2023-08-21T07:06:01Z","uri":"/probe?target=MY_FABRIC"}

Looks like the MO (managed object) is reachable only through apic1.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall Why not test running against apic2 without the LB and verify whether that APIC node works for queries? 503 is an interesting response. Running directly against apic2 from aci-exporter will hopefully reveal whether the problem is on the APIC or the exporter side.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon Yes, you have a point. I tried running against apic2 and apic3 as targets. It works perfectly. But when I shut down apic1 the exporter goes down:

{"class":"aci_name","exec_time":90295767,"fabric":"MY_FABRIC","length":0,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","status":503,"time":"2023-08-21T08:39:18Z","uri":"https://APIC2/api/mo/topology/pod-1/node-1/av.json"}

{"fabric":"MY_FABRIC","level":"error","msg":"Request /api/mo/topology/pod-1/node-1/av.json failed - ACI api returned 503.","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","time":"2023-08-21T08:39:18Z"}

{"exec_time":26211,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","status":200,"time":"2023-08-21T08:39:18Z","uri":"https://APIC2/api/aaaLogout.xml"}

{"exec_time":90442276,"fabric":"MY_FABRIC","length":101,"level":"info","method":"GET","msg":"api call","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","status":503,"time":"2023-08-21T08:39:18Z","uri":"/probe?target=MY_FABRIC"}

It looks like an APIC issue.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Do you have the same behavior just using curl? I'd like to take aci-exporter out of the equation :)

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

The curl works from the exporter to the APICs :)

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall I meant: does curl work directly against apic2 for queries when you take apic1 down?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon yes, I confirm that curl works from the exporter to the apic2 when apic1 is down

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall as I interpret your answer, the problem is related to the LB or aci-exporter, since a curl request to any APIC continues to work even if one or more APICs are down. A couple more questions.
Is the LB set up to do round robin? If so, when all APICs are up, will the requests from aci-exporter hit all APICs in a round-robin way? Does curl against the LB behave differently than aci-exporter?
Looking at the above logs, it looks like aci-exporter is requesting https://APIC2 and not https://LB - I thought that was what you were testing.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon all the tests were done by directly specifying the target apics as below :

  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
#      - https://LB
      - https://APIC2
      - https://APIC3
#      - https://APIC1

I did not debug with the load balancer in place because I assumed the problem would not be on that side. But I can confirm that curl from the exporter to the APICs and the LB works properly.
The load balancer does round robin with session persistence.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Sounds really strange - aci-exporter connects to https://APIC2 and that works until APIC1 is shut down - correct? But using curl against https://APIC2 works even after APIC1 is shut down. Is this correct?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

The only explanation would be that there is a dependency on apic1. In the logs we see that it manages to log in but can't access https://APIC/api/mo/.../.../... It's unlikely, but maybe access to the MO is only possible through APIC1. To verify this hypothesis I will try to run a terraform plan from one of my VMs after first turning off APIC1.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

But how can it work for curl ?

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@camrossi do you have any ideas what this problem could be related to?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Maybe there is a policy or something else that blocks access to the MO, but not to the rest.
Note that in my curl I only used the URL of the APICs as the destination, without specifying a precise path:

curl https://APIC2

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

The only explanation would be that there is a dependency on apic1. In the logs we see that it manages to log in but can't access https://APIC/api/mo/.../.../... It's unlikely, but maybe access to the MO is only possible through APIC1. To verify this hypothesis I will try to run a terraform plan from one of my VMs after first turning off APIC1.

I ran a terraform plan while apic1 was shut down to see if it is a SPOF for API access. It works... So maybe the problem is on the exporter side.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Can you tell me more about the "terraform plan" you did?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

I wanted to verify the hypothesis that only apic1 can communicate with the API. I therefore launched a terraform plan against the address of the load balancer while apic1 was off. And it worked.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Hi @thenodon, I found the issue !

urlMap := make(map[string]string)
urlMap["login"] = "/api/aaaLogin.xml"
urlMap["logout"] = "/api/aaaLogout.xml"
urlMap["faults"] = "/api/class/faultCountsWithDetails.json"
urlMap["aci_name"] = "/api/mo/topology/pod-1/node-1/av.json"

The urlMap entry aci_name contains only the path to "pod-1/node-1", which corresponds to... APIC1. It is therefore expected that with each reboot of APIC1 the exporter becomes unavailable. I think aci-connection.go should be updated so that it can contain the URL maps of all the APIC targets entered in the config file.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall - thanks for the troubleshooting. So if I have 2 apic you have node-1 and node-2? Is it possible to get the number of apic nodes? How is node numbering defined in the apic? node-1 will be the first to boot or is it a configuration?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon
So if I have 2 apic you have node-1 and node-2?
--> Yes, but the nodes can be on different pods, and therefore not necessarily on pod-1. Because the mapping between node and pod is specific to each infrastructure, you can use /api/node/class/fabricNode.json?query-target-filter=eq(fabricNode.role,"controller") and read the "dn" attribute (e.g. "dn":"topology/pod-2/node-3"), then add it to the urlMap.

How is node numbering defined in the apic?
--> It depends on each infrastructure. The order is defined by the administrator.

node-1 will be the first to boot or is it a configuration?
--> No, it's fully redundant by design across all APICs.
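
The suggestion above, reading each controller's "dn" and deriving an av.json path from it, could be sketched as follows. This is an illustration rather than the exporter's code: the response layout (`imdata[].fabricNode.attributes.dn`) is assumed by analogy with other APIC class queries, the sample payload is made up, and `controllerAvPaths` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fabricNodeResponse models only the dn attribute. The layout is an
// assumption about the fabricNode class query response.
type fabricNodeResponse struct {
	Imdata []struct {
		FabricNode struct {
			Attributes struct {
				Dn string `json:"dn"`
			} `json:"attributes"`
		} `json:"fabricNode"`
	} `json:"imdata"`
}

// controllerAvPaths (hypothetical helper) turns each controller dn, such as
// topology/pod-2/node-3, into a path of the form /api/mo/<dn>/av.json.
func controllerAvPaths(data []byte) ([]string, error) {
	var resp fabricNodeResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, err
	}
	paths := make([]string, 0, len(resp.Imdata))
	for _, item := range resp.Imdata {
		dn := item.FabricNode.Attributes.Dn
		if dn == "" {
			continue // skip entries without a dn
		}
		paths = append(paths, "/api/mo/"+dn+"/av.json")
	}
	return paths, nil
}

func main() {
	// Made-up sample with controllers on two different pods.
	sample := []byte(`{"imdata":[
		{"fabricNode":{"attributes":{"dn":"topology/pod-1/node-1"}}},
		{"fabricNode":{"attributes":{"dn":"topology/pod-2/node-3"}}}
	]}`)
	paths, err := controllerAvPaths(sample)
	fmt.Println(paths, err)
}
```

Building the paths from the returned dn values avoids hard-coding pod-1/node-1, which was the single point of failure found in the urlMap.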

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall I have tried to solve this issue. Would be great if you can test it. You can build aci-exporter from branch https://github.com/opsdis/aci-exporter/tree/issue_1. The fix is in commit da40506.
The name of the ACI can now be set in the configuration file; if it is not set, the exporter uses /api/node/class/fabricNode.json?query-target-filter=eq(fabricNode.role,"controller") as you suggested. It loops over the returned controllers until a name can be determined. The name is cached until aci-exporter is stopped. Hopefully this will solve the issue. Looking forward to your feedback.
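
The behaviour described here (a configured name wins, otherwise try each controller in turn and cache the first answer) can be illustrated with a small sketch. It is not the code from commit da40506: `nameResolver`, its fields, and the injected `fetch` function are hypothetical names, and the network call is replaced by a stub.

```go
package main

import (
	"errors"
	"fmt"
)

// nameResolver is a hypothetical type illustrating the described behaviour.
// It is not safe for concurrent use; a real handler would need locking.
type nameResolver struct {
	configured string                             // name set in the config file, if any
	cached     string                             // name learned from a controller
	fetch      func(path string) (string, error) // stand-in for the APIC request
}

// resolve returns the configured name, then the cached name, and otherwise
// tries each path until one controller returns a non-empty name.
func (r *nameResolver) resolve(paths []string) (string, error) {
	if r.configured != "" {
		return r.configured, nil
	}
	if r.cached != "" {
		return r.cached, nil
	}
	for _, p := range paths {
		name, err := r.fetch(p)
		if err != nil || name == "" {
			continue // this controller could not answer, try the next one
		}
		r.cached = name
		return name, nil
	}
	return "", errors.New("could not determine ACI name")
}

func main() {
	calls := 0
	r := &nameResolver{fetch: func(path string) (string, error) {
		calls++
		// Simulate node-1 being down, as in the failover test.
		if path == "/api/mo/topology/pod-1/node-1/av.json" {
			return "", errors.New("ACI api returned 503")
		}
		return "fab1", nil
	}}
	paths := []string{
		"/api/mo/topology/pod-1/node-1/av.json",
		"/api/mo/topology/pod-1/node-2/av.json",
	}
	name, err := r.resolve(paths)
	fmt.Println(name, err, calls)
	name, err = r.resolve(paths) // served from the cache, no further calls
	fmt.Println(name, err, calls)
}
```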

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Hi @thenodon. Thank you for the update! I'll keep you informed once I update my exporter.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon I don't understand what value I have to enter for aci_name.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall it's optional, but if you set it, the name from the ACI will not be used. This is the value of the aci label. There is an example in example-config.yaml.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall any update on this?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon I am on it. I'll keep you informed soon.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall I would like to merge the branch to main, but it would be great if you could test it first.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

Hi @thenodon. Sorry for taking so much time to do the upgrade.

So after the upgrade the exporter goes down with these logs:

{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-13T11:59:58Z"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-13T11:59:58Z"}
2023/09/13 12:00:39 http: panic serving "PRIVATE_IP":35536: runtime error: invalid memory address or nil pointer dereference
goroutine 10 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1868 +0xb9
panic({0x9479a0?, 0xe2d440?})
/usr/local/go/src/runtime/panic.go:920 +0x270
main.AciConnection.login({{0xab0cd0, 0xc0001fc4b0}, 0x0, 0xc00046f078, 0xc00020f200, 0xc00020f1d0, {{0xaac400, 0xc000197900}, 0x0, {0xaae5e8, ...}, ...}, ...})
/build/aci-connection.go:85 +0x25
main.aciAPI.CollectMetrics({{0xab0cd0, 0xc0001fc4b0}, {{0xab0cd0, 0xc0001fc4b0}, 0x0, 0xc00046f078, 0xc00020f200, 0xc00020f1d0, {{0xaac400, 0xc000197900}, ...}, ...}, ...})
/build/aci-api.go:109 +0xa5
main.HandlerInit.getMonitorMetrics({{0xc0001b8900?, 0xc0001b92f0?, 0xc0001b9320?}, 0xc0001b9980?}, {0xaaffb8, 0xc0001f81e0}, 0xc0001edf00)
/build/aci-exporter.go:272 +0x345
net/http.HandlerFunc.ServeHTTP(0x410225?, {0xaaffb8?, 0xc0001f81e0?}, 0xf8?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.promMonitor.func2({0xaaffb8?, 0xc0001f81c0}, 0xaaa4b0?)
/build/aci-exporter.go:345 +0xe3
net/http.HandlerFunc.ServeHTTP(0xc0001ede00?, {0xaaffb8?, 0xc0001f81c0?}, 0xaaa4b0?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.logcall.func3({0xab0258?, 0xc00009e0e0}, 0xc0001ede00)
/build/aci-exporter.go:322 +0x156
net/http.HandlerFunc.ServeHTTP(0x445220?, {0xab0258?, 0xc00009e0e0?}, 0x70f85a?)
/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.(*ServeMux).ServeHTTP(0xe70140?, {0xab0258, 0xc00009e0e0}, 0xc0001ede00)
/usr/local/go/src/net/http/server.go:2514 +0x142
net/http.serverHandler.ServeHTTP({0xc0001fc240?}, {0xab0258?, 0xc00009e0e0?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc00010acf0, {0xab0cd0, 0xc0001fc150})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 1
/usr/local/go/src/net/http/server.go:3086 +0x5cb

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Hi @ahmedaall, I did not expect that - sorry. Can you attach your config (without any secrets)? Did it crash immediately on startup, or do you know whether it happened on a call to /probe?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon It crashed immediately on startup. My config file is very long (900 lines), so I am sharing the beginning of the file:

# Exporter port
port: 9643
# Configuration file name default without postfix
config: config
# The prefix of the metrics
prefix: 

# Profiles for different fabrics
fabrics:
  # This is the Cisco provided sandbox that is open for testing
  # cisco_sandbox:
  #   username: admin
  #   password: <check the cisco sandbox to get the password>
  #   apic:
  #     - https://sandboxapicdc.cisco.com

  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://my-load-balancer
  # profile-fabric-01:
  #   # Apic username
  #   username: foo
  #   # Apic password.
  #   password: bar
  #   # The available apic controllers
  #   # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
  #   apic:
  #     - https://apic1
  #     - https://apic2

# Http client settings used to access apic
# Below is the default values, where 0 is no timeout
httpclient:
#  insecurehttps: true
#  keepalive: 15
#  timeout: 10

# Http server settings - this is for the web server aci-exporter expose
# Below is the default values, where 0 is no timeout
httpserver:
#  read_timeout: 0
#  write_timeout: 0

# Define the output format should be in openmetrics format - deprecated from future version after 0.4.0, use below metric_format
#openmetrics: true
metric_format:
  # Output in openmetrics format, default false
  openmetrics: false
  # Transform all label keys to lower case format, default false. E.g. oobMgmtAddr will be oobmgmtaddr
  label_key_to_lower_case: true
  # Transform all label keys to snake case format, default false. E.g. oobMgmtAddr will be oob_mgmt_addr
  label_key_to_snake_case: false


# The query sections define queries that should be ran by all profiles

#
# ATTENTION
# All queries might not work in your environment depending on the permission your API user is granted or if you
# running against a real or simulated API environment.
#

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Thanks @ahmedaall - Looking at the trace, it looks like something must be doing a /probe against the exporter to get the error. Just starting the exporter looks like this:

{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-13T17:42:30+02:00"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-13T17:42:30+02:00"}

The main.AciConnection.login function is first executed when a call to the /probe endpoint is made. Are you sure you do not have a Prometheus trying to scrape?
The panic seems to happen at line 85 of aci-connection.go, in the login function. This line is just:

for i, controller := range c.fabricConfig.Apic {
 ....
 }

What I have tested is that the panic you get will happen if c.fabricConfig is nil; that gives the same stack trace. Trying to force that to happen with a faulty config file or a rogue URL request has been unsuccessful. It just works for me.
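
To make the nil case concrete, here is a minimal sketch of how ranging over a field of a nil config pointer can be guarded. The types are simplified stand-ins, not the exporter's real `AciConnection`, and the map lookup with a differently cased key is only one assumed way such a nil pointer could arise.

```go
package main

import (
	"errors"
	"fmt"
)

// fabricConfig and aciConnection are simplified stand-ins for illustration.
type fabricConfig struct {
	Apic []string
}

type aciConnection struct {
	fabricConfig *fabricConfig
}

// login returns the first non-empty controller URL. Without the nil check,
// c.fabricConfig.Apic would dereference a nil pointer and panic.
func (c aciConnection) login() (string, error) {
	if c.fabricConfig == nil {
		return "", errors.New("fabric configuration is missing")
	}
	for _, controller := range c.fabricConfig.Apic {
		if controller != "" {
			return controller, nil
		}
	}
	return "", errors.New("no APIC configured")
}

func main() {
	// Assumption: fabrics are stored under lower-case keys.
	fabrics := map[string]*fabricConfig{
		"my_fabric": {Apic: []string{"https://apic1"}},
	}
	for _, target := range []string{"my_fabric", "MY_FABRIC"} {
		// A missing key yields a nil *fabricConfig.
		conn := aciConnection{fabricConfig: fabrics[target]}
		controller, err := conn.login()
		fmt.Println(target, controller, err)
	}
}
```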

Can you try to use a minimal config, test against cisco sandbox and make sure you do not have a prometheus trying to scrape. You can start it on some random port just for testing and run a curl against the exporter.
Some additional questions:

  • What version of golang are you using?
  • What OS and version do you build and run on?

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@camrossi - glad to hear that the branch worked for you. And your Go works fine. The problem is that I cannot get your alternative query using the infraCont class to work. I have tried it against the Cisco sandbox and also against a customer's ACI. I'm just getting 400. Can it be an ACI version issue? Or does infraCont only work if there is a cluster of APIC controllers? I think the Cisco sandbox is a single APIC. I checked https://pubhub.devnetcloud.com/media/apic-mim-ref-311/docs/MO-infraCont.html and it states:
infra:Cont An APIC cluster is comprised of multiple APIC controllers that provide operators with a unified real time monitoring, diagnostic, and configuration management capability for the ACI fabric.

@camrossi the query you suggested worked fine :)

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024
  • What version of golang are you using?
  • What OS and version do you build and run on?

I build from golang:latest. I am using version go1.21.1 on Debian 12.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall I have the same golang version and running on ubuntu 22.04. I would suggest that you test what I recommended in #1 (comment)

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon OK, I'll take those parameters, and for the minimal config I'll directly import example-config.yaml and deactivate my current config YAML file.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon OK, I'll take those parameters, and for the minimal config I'll directly import example-config.yaml and deactivate my current config YAML file.

Done. But what URL do I need to use to query the exporter with the sandbox target?
I used to call http://my_exporter/probe?target=MY_FABRIC with my local APIC.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall if you used the above config for the Cisco sandbox, the name is cisco_sandbox. So the curl is:

 http://my_exporter/probe?target=cisco_sandbox&queries=my_query

Where my_query is some basic query in the config file.
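
If you prefer to build the probe URL in code rather than typing it in a browser (which, as noted earlier in the thread, may change the case of the target), a small sketch follows. `probeURL` is a hypothetical helper; the `target` and `queries` parameter names come from the URL above, and the base address with port 9643 is just an example value.

```go
package main

import (
	"fmt"
	"net/url"
)

// probeURL (hypothetical helper) builds a /probe URL for the exporter.
// The query parameter is optional and omitted when empty.
func probeURL(base, target, query string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/probe"
	q := url.Values{}
	q.Set("target", target)
	if query != "" {
		q.Set("queries", query)
	}
	// Encode sorts the parameters by key and escapes the values.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Example base address; adjust to wherever the exporter listens.
	u, err := probeURL("http://localhost:9643", "cisco_sandbox", "my_query")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```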

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon Everything works fine with the sandbox :

{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-14T14:04:14Z"}

{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-14T14:04:14Z"}

{"exec_time":1015415,"fabric":"cisco_sandbox","level":"info","method":"POST","msg":"api call fabric","requestid":"2VOI1DmZOBkw9N0AUieRthm53KK","status":200,"time":"2023-09-14T14:08:03Z","uri":"https://sandboxapicdc.cisco.com/api/aaaLogin.xml"}

{"fabric":"cisco_sandbox","level":"info","msg":"Using apic https://sandboxapicdc.cisco.com","requestid":"2VOI1DmZOBkw9N0AUieRthm53KK","time":"2023-09-14T14:08:03Z"}

{"class":"fabricNode","exec_time":189166,"fabric":"cisco_sandbox","length":580,"level":"info","method":"GET","msg":"api call fabric","requestid":"2VOI1DmZOBkw9N0AUieRthm53KK","status":200,"time":"2023-09-14T14:08:03Z","uri":"https://sandboxapicdc.cisco.com/api/class/fabricNode.json?query-target-filter=eq(fabricNode.role,\"controller\")"}

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon So in the same file I changed the target from the sandbox to my APIC and I get this:

{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-14T14:35:03Z"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-14T14:35:03Z"}
2023/09/14 14:35:30 http: panic serving 100.64.6.20:35536: runtime error: invalid memory address or nil pointer dereference
goroutine 36 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1868 +0xb9
panic({0x9479a0?, 0xe2d440?})
/usr/local/go/src/runtime/panic.go:920 +0x270
main.AciConnection.login({{0xab0cd0, 0xc0002f17d0}, 0x0, 0xc0002d72c8, 0xc00031a930, 0xc00031a900, {{0xaac400, 0xc0002977c0}, 0x0, {0xaae5e8, ...}, ...}, ...})
/build/aci-connection.go:85 +0x25
main.aciAPI.CollectMetrics({{0xab0cd0, 0xc0002f17d0}, {{0xab0cd0, 0xc0002f17d0}, 0x0, 0xc0002d72c8, 0xc00031a930, 0xc00031a900, {{0xaac400, 0xc0002977c0}, ...}, ...}, ...})
/build/aci-api.go:109 +0xa5
main.HandlerInit.getMonitorMetrics({{0xc0002f0030?, 0xc0002f06c0?, 0xc0002f0750?}, 0xc0002f0c60?}, {0xaaffb8, 0xc000321820}, 0xc000327b00)
/build/aci-exporter.go:272 +0x345
net/http.HandlerFunc.ServeHTTP(0x410225?, {0xaaffb8?, 0xc000321820?}, 0xf8?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.promMonitor.func2({0xaaffb8?, 0xc000321800}, 0xaaa4b0?)
/build/aci-exporter.go:345 +0xe3
net/http.HandlerFunc.ServeHTTP(0xc000327a00?, {0xaaffb8?, 0xc000321800?}, 0xaaa4b0?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.logcall.func3({0xab0258?, 0xc000348000}, 0xc000327a00)
/build/aci-exporter.go:322 +0x156
net/http.HandlerFunc.ServeHTTP(0x445220?, {0xab0258?, 0xc000348000?}, 0x70f85a?)
/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.(*ServeMux).ServeHTTP(0xe70140?, {0xab0258, 0xc000348000}, 0xc000327a00)
/usr/local/go/src/net/http/server.go:2514 +0x142
net/http.serverHandler.ServeHTTP({0xc0002f1560?}, {0xab0258?, 0xc000348000?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc000230900, {0xab0cd0, 0xc0002f1470})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 1
/usr/local/go/src/net/http/server.go:3086 +0x5cb

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall So this is now against the LB?

fabrics:
  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://my-load-balancer

First try to add the aci_name like this:

fabrics:
  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://my-load-balancer
    aci_name: foobar

Do you get the same panic?

You should also verify by running directly against one of the APIC endpoints, without the LB.
My thought is that it is something related to the LB and how it is configured. So test with the same config against a different fabric section. You can have multiple fabric settings in the same config file, as described in the example config.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon
I have already tested without the LB. Same issue.
OK, I'll test. aci_name refers to the name of the fabric, I guess?

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall So everything works with the released version, both through the LB and against the APIC endpoint.
The branch compiled version works against cisco sandbox, but not against any setup you have onprem, with or without lb.
Is this correct?
It's very strange. What version of APIC do you run?

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon I have version 5.2(6e)
"The branch compiled version works against cisco sandbox, but not against any setup you have onprem, with or without lb.
Is this correct?" --> Exactly

from aci-exporter.

camrossi avatar camrossi commented on June 22, 2024

If I lowercase r.URL.Query().Get("target") with strings.ToLower(), the crash is gone.
But it is really odd, as this is not a case of using a fabric name that doesn't exist... Very confusing. Anyway, I hope this sheds some light!
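
The workaround described above can be sketched roughly as follows. This is a minimal illustration, not aci-exporter's actual code: `fabricConfig`, `fabrics`, `lookupFabric`, and `probeHandler` are hypothetical names, and the sketch assumes the loaded fabrics end up keyed by lowercase name. It lowercases the `target` query parameter before the lookup and answers with a 404 when no fabric matches, instead of dereferencing a nil entry.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fabricConfig is a hypothetical stand-in for one entry of the fabrics section.
type fabricConfig struct {
	Username string
	Apic     []string
}

// fabrics is keyed by lowercase fabric name (an assumption made for this sketch).
var fabrics = map[string]*fabricConfig{
	"fab1": {Username: "admin", Apic: []string{"https://apic1.example.com"}},
}

// lookupFabric lowercases the requested target before the lookup and reports
// whether a matching fabric exists, so callers never dereference a nil entry.
func lookupFabric(target string) (*fabricConfig, bool) {
	f, ok := fabrics[strings.ToLower(target)]
	return f, ok
}

// probeHandler answers 404 for an unknown target instead of panicking.
func probeHandler(w http.ResponseWriter, r *http.Request) {
	target := r.URL.Query().Get("target")
	f, ok := lookupFabric(target)
	if !ok {
		http.Error(w, "unknown fabric: "+target, http.StatusNotFound)
		return
	}
	fmt.Fprintf(w, "fabric %s uses %s\n", strings.ToLower(target), f.Apic[0])
}

func main() {
	// Exercise the handler in-process with an uppercase and an unknown target.
	for _, target := range []string{"Fab1", "nope"} {
		req := httptest.NewRequest(http.MethodGet, "/probe?target="+target, nil)
		rec := httptest.NewRecorder()
		probeHandler(rec, req)
		fmt.Println(target, rec.Code)
	}
}
```

Returning an explicit error for an unknown name also makes a misconfigured scrape target visible in the HTTP response rather than in a stack trace.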

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@camrossi thanks for your findings. I can confirm that I can reproduce the same behavior with an uppercase fabric name. I am currently not sure why, but I think it is a combination of how the yaml package deserializes the YAML file and how I match the target query parameter against the deserialized fabrics structures. I will investigate further, including what changed in the dependencies between the released version and the branch version, since @ahmedaall got it to work in the released version.
@ahmedaall and everyone else: the named fabric sections must be lowercase! Change to lowercase and let us know if it works.
@camrossi it would be great to be given access to your DMZ fabric. The question is whether infraCont differs between ACI 5.2 and 6 (Cisco sandbox), since it did not work for me against the Cisco sandbox.
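
One way to make the lowercase requirement explicit is to normalize the fabric names once, when the config is loaded. The sketch below is only an illustration under assumptions: `fabric` and `normalizeFabricNames` are hypothetical names, and it is not how aci-exporter loads its config. It lowercases every key and rejects configs where two names collide after lowercasing.

```go
package main

import (
	"fmt"
	"strings"
)

// fabric is a hypothetical, trimmed-down fabric entry.
type fabric struct {
	Username string
}

// normalizeFabricNames returns a copy of the map with all keys lowercased.
// It fails if two names differ only by case, since they would overwrite
// each other after normalization.
func normalizeFabricNames(in map[string]fabric) (map[string]fabric, error) {
	out := make(map[string]fabric, len(in))
	for name, f := range in {
		key := strings.ToLower(name)
		if _, dup := out[key]; dup {
			return nil, fmt.Errorf("fabric names collide after lowercasing: %q", key)
		}
		out[key] = f
	}
	return out, nil
}

func main() {
	cfg := map[string]fabric{"MY_FABRIC": {Username: "ACI_USERNAME"}}
	norm, err := normalizeFabricNames(cfg)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	_, ok := norm["my_fabric"]
	fmt.Println("my_fabric present:", ok)
}
```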

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@camrossi @thenodon thank you for the debugging. So I tried changing the fabric name from uppercase to lowercase:

# Exporter port
port: 9643
# Configuration file name default without postfix
config: config
# The prefix of the metrics
prefix: 

fabrics:
#  cisco_sandbox:
#    username: admin
#    password: ""
#    apic:
#      - https://sandboxapicdc.cisco.com

  my_fabric:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://APIC1
      - https://APIC2
      - https://APIC3

But... I have the same issue. I tried with and without the LB. Did I forget something?

from aci-exporter.

camrossi avatar camrossi commented on June 22, 2024

Yes @ahmedaall, you need to update the Prometheus config as well (assuming you use it) so that it uses the lowercase name.
I have this for my Prometheus lab config; the relevant section is the targets one:

prometheus:
  prometheusSpec:
    scrapeInterval: 30s
    evaluationInterval: 30s
    additionalScrapeConfigs:
    - job_name: 'aci'
      scrape_interval: 1m
      scrape_timeout: 30s
      metrics_path: /probe
      static_configs:
      - targets: ['fab1','fab2']

and then in aci-exporter:

    fabrics:
      # This is the Cisco provided sandbox that is open for testing
      fab1:
        # Apic username
        username: admin
        # Apic password
        password: <>
        # The available apic controllers
        # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
        apic:
          - https://<>
      fab2:
        # Apic username
        username: admin
        # Apic password
        password: <>
        # The available apic controllers
        # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
        apic:
          - https://<>
          - https://<>
          - https://<>
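
Since the Prometheus targets and the aci-exporter fabric names have to agree exactly, a small pre-flight check can catch a mismatch before scraping. This is a hedged sketch only: `unknownTargets` is a hypothetical helper, and the two lists are typed in by hand here rather than parsed from the real config files.

```go
package main

import (
	"fmt"
	"sort"
)

// unknownTargets returns the scrape targets that have no exactly matching
// fabric name, sorted for stable output. The comparison is case-sensitive
// on purpose, so "Fab2" does not match "fab2".
func unknownTargets(targets []string, fabricNames []string) []string {
	known := make(map[string]bool, len(fabricNames))
	for _, n := range fabricNames {
		known[n] = true
	}
	var missing []string
	for _, t := range targets {
		if !known[t] {
			missing = append(missing, t)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	targets := []string{"fab1", "Fab2"} // as listed under static_configs
	fabrics := []string{"fab1", "fab2"} // as named under fabrics
	fmt.Println(unknownTargets(targets, fabrics))
}
```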

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

@ahmedaall have you had a chance to verify your LB solution yet? Would like to close this and make a release.

from aci-exporter.

ahmedaall avatar ahmedaall commented on June 22, 2024

@thenodon I confirm that I am very happy with the aci-exporter. It is really good work. Thanks a lot for your responsiveness on this issue.

from aci-exporter.

thenodon avatar thenodon commented on June 22, 2024

Thanks @ahmedaall and @camrossi for your support. I will close this with pull request #38.

from aci-exporter.
