Comments (65)
I tested my code against a real ACI 5.2(8) and it worked just fine using infraCont.
However, I did manage to make it crash...
It's the name you use: if it starts with a capital letter, it crashes.
Not sure if it's the same issue, but I see that @ahmedaall's fabric name is MY_FABRIC,
so I would say probably? This crashes:
data:
  config.yaml: |
    fabrics:
      Fab1:
But this works just fine. Can you check whether you see the same behaviour?
data:
  config.yaml: |
    fabrics:
      fab1:
@thenodon let me see if I can get you read-only access to one of our DMZ fabrics so you can actually test on something real. There might not be much flexibility in terms of versions, but it would be better than what you have now :D
Here is my crash trace when the fabric name is uppercase:
2023/09/14 23:04:00 http: panic serving 10.32.0.11:36052: runtime error: invalid memory address or nil pointer dereference
goroutine 2288 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1854 +0xbf
panic({0x92ec20, 0xda7d70})
/usr/local/go/src/runtime/panic.go:890 +0x263
main.AciConnection.login({{0xa7bb70, 0xc000b443f0}, 0x0, 0xc000940518, 0xc000b44870, 0xc000b44840, {{0xa78320, 0xc00013dcc0}, 0x0, {0xa7a688, ...}, ...}, ...})
/build/aci-connection.go:85 +0x31
main.aciAPI.CollectMetrics({{0xa7bb70, 0xc000b443f0}, {{0xa7bb70, 0xc000b443f0}, 0x0, 0xc000940518, 0xc000b44870, 0xc000b44840, {{0xa78320, 0xc00013dcc0}, ...}, ...}, ...})
/build/aci-api.go:109 +0xb8
main.HandlerInit.getMonitorMetrics({{0xc00032ef30?, 0xc00032f770?, 0xc00032f830?}, 0xc00032fd10?}, {0xa7b400, 0xc0002b0120}, 0xc000296200)
/build/aci-exporter.go:272 +0x345
net/http.HandlerFunc.ServeHTTP(0x40d90a?, {0xa7b400?, 0xc0002b0120?}, 0x30?)
/usr/local/go/src/net/http/server.go:2122 +0x2f
main.promMonitor.func1({0xa7b400?, 0xc0002b00e0}, 0xa76801?)
/build/aci-exporter.go:345 +0xf8
net/http.HandlerFunc.ServeHTTP(0xa7bac8?, {0xa7b400?, 0xc0002b00e0?}, 0xa768c0?)
/usr/local/go/src/net/http/server.go:2122 +0x2f
main.logcall.func1({0xa7b640?, 0xc00009e000}, 0xc000296100)
/build/aci-exporter.go:322 +0x263
net/http.HandlerFunc.ServeHTTP(0xc000362000?, {0xa7b640?, 0xc00009e000?}, 0x40d90a?)
/usr/local/go/src/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc0003e600b?, {0xa7b640, 0xc00009e000}, 0xc000296100)
/usr/local/go/src/net/http/server.go:2500 +0x149
net/http.serverHandler.ServeHTTP({0xc000b44090?}, {0xa7b640, 0xc00009e000}, 0xc000296100)
/usr/local/go/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc00027e240, {0xa7bb70, 0xc0003965d0})
/usr/local/go/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:3089 +0x5ed
from aci-exporter.
@ahmedaall @camrossi I have added documentation and a check on the /probe endpoint that the fabric name is in lower case. You can do a pull on the issue_1 branch. For more info, see commit 6924973.
I changed the query to @camrossi's suggestion for detecting the ACI name, commit acb7428.
from aci-exporter.
Hi @ahmedaall. I have not done any development on this issue, but if you would like to contribute, that would be great. Another option is to use an external LB that takes care of this and supports more options, such as health checks.
from aci-exporter.
Exactly
from aci-exporter.
Hi Folks,
Sorry, a bit late to the party... Just wanted to say that you can use the infraCont class, which returns the av for all the APICs. Same as what you do, but perhaps more elegant.
I also tested your code and it is not crashing for me.
I implemented this as a test, BUT I do not know Go. I am not joking, this is the first time I have written anything in Go, so it is probably horrible and I am amazed it even works:
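// getAciName returns the fabric name, preferring the configured value and
// falling back to the fbDmNm attribute of the first infraCont object.
// Needs the fmt and gjson imports already used elsewhere in the file.
func (p aciAPI) getAciName() (string, error) {
	if p.connection.fabricConfig.AciName != "" {
		return p.connection.fabricConfig.AciName, nil
	}

	data, err := p.connection.getByClassQuery("infraCont", "query-target=self")
	if err != nil {
		return "", err
	}

	// Guard against an empty result: indexing [0] on an empty slice would panic.
	names := gjson.Get(data, "imdata.#.infraCont.attributes.fbDmNm").Array()
	if len(names) > 0 && names[0].Str != "" {
		p.connection.fabricConfig.AciName = names[0].Str
		return p.connection.fabricConfig.AciName, nil
	}
	return "", fmt.Errorf("could not determine ACI name")
}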
from aci-exporter.
@thenodon I just launched the sandbox. I am a newbie with it. I'll do the test as fast as possible.
from aci-exporter.
The cisco sandbox is on the internet, https://devnetsandbox.cisco.com/RM/Topology, so just configure aci-exporter with:
fabrics:
  cisco_sandbox:
    username: admin
    password: "!v3G@!4@Y"
    apic:
      - https://sandboxapicdc.cisco.com
from aci-exporter.
@ahmedaall can you run the same config with the same query against your own APIC? Just add your config in the fabrics section of the config file.
from aci-exporter.
@camrossi Yes, I didn't mention it, but I did update my target in the Prometheus file :)
The problem was that my browser automatically uppercased my target... So I tried with another browser and the exporter works!
Now I will restart my failover test by shutting down my APIC1. I'll keep you informed.
from aci-exporter.
@thenodon not yet... I'll do it today or tomorrow. Sorry for the time it takes.
from aci-exporter.
@thenodon OK, I just finished my failover test. Everything works perfectly. Even if I shut down 2 of my 3 APICs, it works well!
from aci-exporter.
Great @ahmedaall. If you have the time to share your setup I can add it to the README. Hope you are now happy with aci-exporter and give it a star.
I will make a release in the coming days.
from aci-exporter.
Hi,
I was about to make the same issue request but you read my mind.
Have you been able to make progress on this point ?
Thx
from aci-exporter.
Hi @thenodon. During my failover test I saw that, after shutting down my APIC1, the exporter became completely unreachable (while when I cut APIC2 or APIC3 the exporter still works perfectly). So instead of having my 3 APIC addresses in my config, I put in the address of the load balancer, which round-robins between them with session persistence:
# Profiles for different fabrics
fabrics:
  # This is the Cisco provided sandbox that is open for testing
  # cisco_sandbox:
  #   username: admin
  #   password: <check the cisco sandbox to get the password>
  #   apic:
  #     - https://sandboxapicdc.cisco.com
  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://LB
      # - https://APIC1
      # - https://APIC2
      # - https://APIC3
So I ran my APIC redundancy failover test again by shutting down APIC1. At that moment the aci_up metric went to DOWN:
# HELP aci_up The connection state 1=UP, 0=DOWN
# TYPE aci_up gauge
aci_up{fabric="MY_FABRIC"} 0
PS: The load balancer round-robins between the APICs properly for browser access during the failover, but the exporter still goes down.
from aci-exporter.
Hi @ahmedaall, great that you are testing this. I do not have an environment where I can test this myself. I think there should be something in the logs that could help, so it would be great if you could attach them. You could also do some debugging. I think a breakpoint in the login function in aci-connection.go is a good starting point, since it's called on every request.
from aci-exporter.
Hi @thenodon. Yes, here are the logs after noticing that the exporter goes down only when apic1 is down.
After restarting apic1, I turned off apic2 then apic3 and the exporter manages to work properly and switch to the next available apic :
BEFORE SHUTTING DOWN APIC1 :
{"class":"topSystem","exec_time":1194174,"fabric":"MY_FABRIC","length":54981,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/topSystem.json?rsp-subtree-include=health"}
{"class":"fvTenant","exec_time":1194421,"fabric":"MY_FABRIC","length":2164,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fvTenant.json?rsp-subtree-include=health,required"}
{"class":"faults","exec_time":1195618,"fabric":"MY_FABRIC","length":2879,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/faultCountsWithDetails.json"}
{"class":"fvCtx","exec_time":1196088,"fabric":"MY_FABRIC","length":7894,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fvCtx.json?rsp-subtree-include=health,required"}
{"class":"eqptTemp5min","exec_time":1206707,"fabric":"MY_FABRIC","length":21586,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/eqptTemp5min.json?rsp-subtree-include=stats\u0026rsp-subtree-class=eqptTemp5min\u0026query-target-filter=wcard(eqptTemp5min.dn,\".*sup/sensor-2/CDeqptTemp5min\")"}
{"class":"fvAp","exec_time":1210511,"fabric":"MY_FABRIC","length":3642,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fvAp.json?rsp-subtree-include=health,required"}
{"class":"fabricHealthTotal","exec_time":1211007,"fabric":"MY_FABRIC","length":544,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/fabricHealthTotal.json?query-target-filter=wcard(fabricHealthTotal.dn,\"topology/.*/health\")"}
{"class":"procSysMem5min","exec_time":1211795,"fabric":"MY_FABRIC","length":26934,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:44Z","uri":"https://APIC2/api/class/procSysMem5min.json"}
{"exec_time":2271430,"fabric":"MY_FABRIC","level":"info","msg":"total scrape time ","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","time":"2023-08-21T06:58:44Z"}
{"exec_time":35451,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:45Z","uri":"https://APIC2/api/aaaLogout.xml"}
{"exec_time":4286804,"fabric":"MY_FABRIC","length":1064980,"level":"info","method":"GET","msg":"api call","requestid":"2UHeqJJd39oafNMVwa2mHNJU3AN","status":200,"time":"2023-08-21T06:58:46Z","uri":"/probe?target=MY_FABRIC"}
AFTER SHUTTING DOWN APIC1 :
{"exec_time":90839,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":200,"time":"2023-08-21T07:04:31Z","uri":"https://APIC2/api/aaaLogin.xml"}
{"fabric":"MY_FABRIC","level":"info","msg":"Using apic https://APIC2","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","time":"2023-08-21T07:04:31Z"}
{"class":"aci_name","exec_time":90723882,"fabric":"MY_FABRIC","length":0,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":503,"time":"2023-08-21T07:06:01Z","uri":"https://APIC2/api/mo/topology/pod-1/node-1/av.json"}
{"fabric":"MY_FABRIC","level":"error","msg":"Request /api/mo/topology/pod-1/node-1/av.json failed - ACI api returned 503.","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","time":"2023-08-21T07:06:01Z"}
{"exec_time":25825,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":200,"time":"2023-08-21T07:06:01Z","uri":"https://APIC2/api/aaaLogout.xml"}
{"exec_time":90841072,"fabric":"MY_FABRIC","length":101,"level":"info","method":"GET","msg":"api call","requestid":"2UHfYFFTFXy9vN7M48szPfQ8cUD","status":503,"time":"2023-08-21T07:06:01Z","uri":"/probe?target=MY_FABRIC"}
Looks like the MO (Managed object) is reachable only through the apic1
from aci-exporter.
@ahmedaall Why don't you test running against apic2 without the LB and verify whether that APIC node works for queries? 503 is an interesting response. Running directly against apic2 from aci-exporter will hopefully reveal whether the problem is on the APIC side or the exporter side.
from aci-exporter.
@thenodon Yes, you have a point. I tried to run against apic2 and 3 as targets. It works perfectly. But when I shutdown the apic1 the exporter goes down :
{"class":"aci_name","exec_time":90295767,"fabric":"MY_FABRIC","length":0,"level":"info","method":"GET","msg":"api call fabric","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","status":503,"time":"2023-08-21T08:39:18Z","uri":"https://APIC2/api/mo/topology/pod-1/node-1/av.json"}
{"fabric":"MY_FABRIC","level":"error","msg":"Request /api/mo/topology/pod-1/node-1/av.json failed - ACI api returned 503.","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","time":"2023-08-21T08:39:18Z"}
{"exec_time":26211,"fabric":"MY_FABRIC","level":"info","method":"POST","msg":"api call fabric","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","status":200,"time":"2023-08-21T08:39:18Z","uri":"https://APIC2/api/aaaLogout.xml"}
{"exec_time":90442276,"fabric":"MY_FABRIC","length":101,"level":"info","method":"GET","msg":"api call","requestid":"2UHqtP9LRXobqtpFKzcGsaTZIrK","status":503,"time":"2023-08-21T08:39:18Z","uri":"/probe?target=MY_FABRIC"}
It looks like an APIC issue.
from aci-exporter.
Do you have the same behavior just using curl? Like to take aci-exporter out of the equation :)
from aci-exporter.
The curl works from the exporter to the APICs :)
from aci-exporter.
@ahmedaall I meant: does curl work directly against apic2 for queries when you take apic1 down?
from aci-exporter.
@thenodon yes, I confirm that curl works from the exporter to the apic2 when apic1 is down
from aci-exporter.
@ahmedaall as I interpret your answer, the problem is related to the LB or aci-exporter, since curl requests to any APIC continue to work even if one or more APICs are down. A couple more questions.
Is the LB set up to do round robin? If so, when all APICs are up, will the requests from aci-exporter hit all APICs in a round-robin way? Does curl against the LB behave differently than aci-exporter?
Looking at the above logs, it looks like aci-exporter is requesting https://APIC2 and not https://LB - I thought that was what you were testing.
from aci-exporter.
@thenodon all the tests were done by directly specifying the target apics as below :
MY_FABRIC:
  username: ACI_USERNAME
  password: ACI_PASSWORD
  apic:
    # - https://LB
    - https://APIC2
    - https://APIC3
    # - https://APIC1
I did not debug with the load balancer in place because I assumed that the problem would not be on that side. But I can confirm that curl from the exporter to the APICs and to the LB works properly.
The load balancer does round robin with session persistence.
from aci-exporter.
Sounds really strange - aci-exporter connects to https://APIC2 and that works until APIC1 is shut down - correct? But using curl against https://APIC2 works even after APIC1 is shut down. Is this correct?
from aci-exporter.
The only explanation would be that there is a dependency on apic1. In the logs we see that the exporter manages to log in but can't access https://APIC/api/mo/.../.../... It's unlikely, but maybe access to the MO is only possible through APIC1. To verify this hypothesis I will try to run a terraform plan from one of my VMs after first turning off APIC1.
from aci-exporter.
But how can it work for curl ?
from aci-exporter.
@camrossi do you have any idea what this problem could be related to?
from aci-exporter.
Maybe there is a policy or other things that block the access against the MO, but not against the rest.
Note that in my curl I only put the url of the apics in destination without specifying a precise path :
curl https://APIC2
from aci-exporter.
"The only explanation would be that there is a dependency on apic1. In the logs we see that the exporter manages to log in but can't access https://APIC/api/mo/.../.../... It's unlikely, but maybe access to the MO is only possible through APIC1. To verify this hypothesis I will try to run a terraform plan from one of my VMs after first turning off APIC1."
I ran terraform plan while apic1 was shut down to see whether it is a SPOF for API access. It works... So maybe the problem is on the exporter side.
from aci-exporter.
Can you tell me more about the "terraform plan" you did?
from aci-exporter.
I wanted to verify the hypothesis that only apic1 can communicate with the api. I therefore launched a terraform plan against the address of the load balancer while apic1 was off. And it worked..
from aci-exporter.
Hi @thenodon, I found the issue !
urlMap := make(map[string]string)
urlMap["login"] = "/api/aaaLogin.xml"
urlMap["logout"] = "/api/aaaLogout.xml"
urlMap["faults"] = "/api/class/faultCountsWithDetails.json"
urlMap["aci_name"] = "/api/mo/topology/pod-1/node-1/av.json"
The urlMap entry aci_name contains only the path to "pod-1/node-1", which corresponds to... APIC1. It is therefore expected that the exporter becomes unavailable every time APIC1 is rebooted. I think aci-connection.go should be updated so that it holds the URL paths for all the target APICs entered in the config file.
from aci-exporter.
@ahmedaall - thanks for the troubleshooting. So if I have 2 APICs, do I have node-1 and node-2? Is it possible to get the number of APIC nodes? How is node numbering defined in the APIC? Will node-1 be the first to boot, or is it a configuration?
from aci-exporter.
@thenodon
"So if I have 2 APICs, do I have node-1 and node-2?"
--> Yes, but the nodes can be on different pods, and therefore not necessarily on pod-1. Because the mapping between node and pod is specific to each infrastructure, you can use /api/node/class/fabricNode.json?query-target-filter=eq(fabricNode.role,"controller") and read the "dn" attribute (e.g. "dn":"topology/pod-2/node-3"), then add it to the urlMap.
"How is node numbering defined in the APIC?"
--> It depends on each infrastructure. The order is defined by the administrator.
"Will node-1 be the first to boot, or is it a configuration?"
--> No, it's fully redundant by design across all APICs.
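The suggestion above (read each controller's "dn" and derive the av.json path from it) could be sketched roughly as follows. This is only an illustration, not the exporter's actual code: the function name controllerAvPaths, the struct, and the sample JSON are assumptions made for the example, and the response is decoded with the standard library rather than whatever the exporter uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fabricNodeResponse mirrors only the fields of a fabricNode class-query
// response that this sketch needs (hypothetical helper type).
type fabricNodeResponse struct {
	Imdata []struct {
		FabricNode struct {
			Attributes struct {
				Dn   string `json:"dn"`
				Role string `json:"role"`
			} `json:"attributes"`
		} `json:"fabricNode"`
	} `json:"imdata"`
}

// controllerAvPaths returns one av.json path per controller found in body,
// instead of the single hard-coded pod-1/node-1 path.
func controllerAvPaths(body []byte) ([]string, error) {
	var resp fabricNodeResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode fabricNode response: %w", err)
	}
	var paths []string
	for _, item := range resp.Imdata {
		attrs := item.FabricNode.Attributes
		// Skip anything that is not a controller or has no dn.
		if attrs.Role != "controller" || attrs.Dn == "" {
			continue
		}
		paths = append(paths, "/api/mo/"+attrs.Dn+"/av.json")
	}
	return paths, nil
}

func main() {
	// Made-up sample payload, shaped like an APIC class-query response.
	sample := []byte(`{"imdata":[
		{"fabricNode":{"attributes":{"dn":"topology/pod-1/node-1","role":"controller"}}},
		{"fabricNode":{"attributes":{"dn":"topology/pod-2/node-3","role":"controller"}}}]}`)
	paths, err := controllerAvPaths(sample)
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

With a list like this, the exporter would no longer depend on pod-1/node-1 being up; it could try each path in turn.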
from aci-exporter.
@ahmedaall I have tried to solve this issue. It would be great if you could test it. You can build aci-exporter from the branch https://github.com/opsdis/aci-exporter/tree/issue_1. The fix is in commit da40506.
The name of the ACI can now be set in the configuration file; if it is not set, the exporter uses /api/node/class/fabricNode.json?query-target-filter=eq(fabricNode.role,"controller") as you suggested. It loops over the returned controllers until a name can be determined. The name is then cached until aci-exporter is stopped. Hopefully this will solve the issue. Looking forward to your feedback.
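As a rough sketch of the behaviour described here (configured name first, otherwise loop over the controllers until one yields a name, then cache it), something like the following could work. It is not the code from commit da40506: the nameResolver type and the fetch hook are hypothetical stand-ins for the real APIC query.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// nameResolver caches the fabric name after the first successful lookup.
type nameResolver struct {
	mu     sync.Mutex
	cached string
	// fetch is a hypothetical hook that asks one controller (by dn) for the
	// fabric name; an empty name or an error means "try the next one".
	fetch func(controllerDn string) (string, error)
}

// aciName prefers the configured name, then the cache, then each controller.
func (r *nameResolver) aciName(configured string, controllerDns []string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.cached != "" {
		return r.cached, nil
	}
	for _, dn := range controllerDns {
		name, err := r.fetch(dn)
		if err != nil || name == "" {
			continue // this controller is down or gave no name
		}
		r.cached = name
		return name, nil
	}
	return "", errors.New("could not determine ACI name from any controller")
}

func main() {
	calls := 0
	r := &nameResolver{fetch: func(dn string) (string, error) {
		calls++
		if dn == "topology/pod-1/node-1" {
			return "", errors.New("503 from APIC") // simulate APIC1 being down
		}
		return "fabric-a", nil
	}}
	dns := []string{"topology/pod-1/node-1", "topology/pod-1/node-2"}
	name, err := r.aciName("", dns)
	fmt.Println(name, err)
	name, err = r.aciName("", dns) // second call is served from the cache
	fmt.Println(name, err, "fetch calls:", calls)
}
```

The mutex is there because /probe requests may arrive concurrently; whether the real exporter needs it depends on how the connection object is shared.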
from aci-exporter.
Hi @thenodon. Thank you for the update! I'll keep you informed when I update my exporter.
from aci-exporter.
@thenodon I don't understand what value I have to enter in aci_name value
from aci-exporter.
@ahmedaall it's optional, but if you set it, the name from the ACI will not be used. This is the value of the aci label. There is an example in example-config.yaml.
from aci-exporter.
@ahmedaall any update on this?
from aci-exporter.
@thenodon I am on it. I'll keep you informed soon.
from aci-exporter.
@ahmedaall I would like to merge the branch to main, but it would be great if you could test it first.
from aci-exporter.
Hi @thenodon. Sorry for taking so much time to do the upgrade.
So after the upgrade the exporter goes down with those logs :
{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-13T11:59:58Z"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-13T11:59:58Z"}
2023/09/13 12:00:39 http: panic serving "PRIVATE_IP":35536: runtime error: invalid memory address or nil pointer dereference
goroutine 10 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1868 +0xb9
panic({0x9479a0?, 0xe2d440?})
/usr/local/go/src/runtime/panic.go:920 +0x270
main.AciConnection.login({{0xab0cd0, 0xc0001fc4b0}, 0x0, 0xc00046f078, 0xc00020f200, 0xc00020f1d0, {{0xaac400, 0xc000197900}, 0x0, {0xaae5e8, ...}, ...}, ...})
/build/aci-connection.go:85 +0x25
main.aciAPI.CollectMetrics({{0xab0cd0, 0xc0001fc4b0}, {{0xab0cd0, 0xc0001fc4b0}, 0x0, 0xc00046f078, 0xc00020f200, 0xc00020f1d0, {{0xaac400, 0xc000197900}, ...}, ...}, ...})
/build/aci-api.go:109 +0xa5
main.HandlerInit.getMonitorMetrics({{0xc0001b8900?, 0xc0001b92f0?, 0xc0001b9320?}, 0xc0001b9980?}, {0xaaffb8, 0xc0001f81e0}, 0xc0001edf00)
/build/aci-exporter.go:272 +0x345
net/http.HandlerFunc.ServeHTTP(0x410225?, {0xaaffb8?, 0xc0001f81e0?}, 0xf8?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.promMonitor.func2({0xaaffb8?, 0xc0001f81c0}, 0xaaa4b0?)
/build/aci-exporter.go:345 +0xe3
net/http.HandlerFunc.ServeHTTP(0xc0001ede00?, {0xaaffb8?, 0xc0001f81c0?}, 0xaaa4b0?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.logcall.func3({0xab0258?, 0xc00009e0e0}, 0xc0001ede00)
/build/aci-exporter.go:322 +0x156
net/http.HandlerFunc.ServeHTTP(0x445220?, {0xab0258?, 0xc00009e0e0?}, 0x70f85a?)
/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.(*ServeMux).ServeHTTP(0xe70140?, {0xab0258, 0xc00009e0e0}, 0xc0001ede00)
/usr/local/go/src/net/http/server.go:2514 +0x142
net/http.serverHandler.ServeHTTP({0xc0001fc240?}, {0xab0258?, 0xc00009e0e0?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc00010acf0, {0xab0cd0, 0xc0001fc150})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 1
/usr/local/go/src/net/http/server.go:3086 +0x5cb
from aci-exporter.
Hi @ahmedaall, I did not expect that - sorry. Can you attach your config (without any secrets)? Did it crash immediately on startup, or do you know whether it happened on a call to /probe?
from aci-exporter.
@thenodon It crashed immediately on startup. My config file is very long (900 lines), I share the beginning of the file :
# Exporter port
port: 9643
# Configuration file name default without postfix
config: config
# The prefix of the metrics
prefix:
# Profiles for different fabrics
fabrics:
  # This is the Cisco provided sandbox that is open for testing
  # cisco_sandbox:
  #   username: admin
  #   password: <check the cisco sandbox to get the password>
  #   apic:
  #     - https://sandboxapicdc.cisco.com
  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://my-load-balancer
  # profile-fabric-01:
  #   # Apic username
  #   username: foo
  #   # Apic password.
  #   password: bar
  #   # The available apic controllers
  #   # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
  #   apic:
  #     - https://apic1
  #     - https://apic2
# Http client settings used to access apic
# Below is the default values, where 0 is no timeout
httpclient:
  # insecurehttps: true
  # keepalive: 15
  # timeout: 10
# Http server settings - this is for the web server aci-exporter expose
# Below is the default values, where 0 is no timeout
httpserver:
  # read_timeout: 0
  # write_timeout: 0
# Define the output format should be in openmetrics format - deprecated from future version after 0.4.0, use below metric_format
#openmetrics: true
metric_format:
  # Output in openmetrics format, default false
  openmetrics: false
  # Transform all label keys to lower case format, default false. E.g. oobMgmtAddr will be oobmgmtaddr
  label_key_to_lower_case: true
  # Transform all label keys to snake case format, default false. E.g. oobMgmtAddr will be oob_mgmt_addr
  label_key_to_snake_case: false
# The query sections define queries that should be ran by all profiles
#
# ATTENTION
# All queries might not work in your environment depending on the permission your API user is granted or if you
# running against a real or simulated API environment.
#
from aci-exporter.
Thanks @ahmedaall - Looking at the trace, something must be doing a /probe against the exporter to trigger the error. Just starting the exporter looks like this:
{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-13T17:42:30+02:00"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-13T17:42:30+02:00"}
main.AciConnection.login is first executed when a call to the /probe endpoint is made. Are you sure you do not have a Prometheus instance trying to scrape?
The panic seems to happen at line 85 of aci-connection.go, in the login function. This line is just:
for i, controller := range c.fabricConfig.Apic {
....
}
What I have tested is that the panic you get will happen if c.fabricConfig is nil; that gives the same stack trace. My attempts to force that, by using a faulty config file or a rogue URL request, have been unsuccessful. It just works for me.
Can you try a minimal config, test against the Cisco sandbox, and make sure you do not have a Prometheus instance trying to scrape? You can start the exporter on some random port just for testing and run a curl against it.
Some additional questions:
- What version of golang are you using?
- What OS and version do you build and run on?
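To illustrate the nil case discussed above: if the fabric configuration is held through a pointer and no profile was found for the target, ranging over c.fabricConfig.Apic dereferences nil and panics. A guard like the one below would turn that into an ordinary error. This is a minimal sketch with simplified, assumed types (fabric, aciConnection, and the try callback are stand-ins, not the exporter's real definitions).

```go
package main

import (
	"errors"
	"fmt"
)

// fabric is a simplified stand-in for the exporter's fabric profile.
type fabric struct {
	Username string
	Apic     []string
}

// aciConnection holds the fabric config through a pointer, so it can be nil.
type aciConnection struct {
	fabricConfig *fabric
}

var errNoFabric = errors.New("no fabric configuration for the requested target")

// login returns the first controller for which try succeeds. The try callback
// is a hypothetical hook standing in for the real aaaLogin request.
func (c aciConnection) login(try func(controller string) error) (string, error) {
	if c.fabricConfig == nil {
		// Without this guard, the range below panics with a nil pointer dereference.
		return "", errNoFabric
	}
	for _, controller := range c.fabricConfig.Apic {
		if err := try(controller); err == nil {
			return controller, nil
		}
	}
	return "", errors.New("no apic controller accepted the login")
}

func main() {
	var missing aciConnection // fabricConfig is nil, as when the target is unknown
	_, err := missing.login(func(string) error { return nil })
	fmt.Println("missing fabric:", err)

	ok := aciConnection{fabricConfig: &fabric{Apic: []string{"https://apic1", "https://apic2"}}}
	used, err := ok.login(func(u string) error {
		if u == "https://apic1" {
			return errors.New("connection refused") // simulate apic1 being down
		}
		return nil
	})
	fmt.Println("using:", used, err)
}
```

The handler could then answer /probe with a 4xx response for an unknown target instead of crashing the request goroutine.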
from aci-exporter.
@camrossi - glad to hear that the branch worked for you. And your Go code works fine. The problem is that I cannot get your alternative query using the class infraCont to work. I have tried it against the Cisco sandbox and also against a customer's ACI. I'm just getting a 400. Can it be an ACI version issue? Or does infraCont only work if there is a cluster of APIC controllers? I think the Cisco sandbox is a single APIC. I checked https://pubhub.devnetcloud.com/media/apic-mim-ref-311/docs/MO-infraCont.html and it states:
infra:Cont An APIC cluster is comprised of multiple APIC controllers that provide operators with a unified real time monitoring, diagnostic, and configuration management capability for the ACI fabric.
@camrossi the query you suggested worked fine :)
from aci-exporter.
- What version of golang are you using?
- What OS and version do you build and run on?
I build from golang:latest. I am using version go1.21.1 on Debian 12.
from aci-exporter.
@ahmedaall I have the same golang version and running on ubuntu 22.04. I would suggest that you test what I recommended in #1 (comment)
from aci-exporter.
@thenodon OK, I'll take those parameters, and for the minimal config I'll directly import example-config.yaml and deactivate my current config YAML file.
from aci-exporter.
"@thenodon OK, I'll take those parameters, and for the minimal config I'll directly import example-config.yaml and deactivate my current config YAML file."
Done. But what URL do I need to call to run the exporter against the sandbox target?
I used to call http://my_exporter/probe?target=MY_FABRIC with my local APIC.
from aci-exporter.
@ahmedaall if you used the above config for the Cisco sandbox, the name is cisco_sandbox. So the curl is:
http://my_exporter/probe?target=cisco_sandbox&queries=my_query
where my_query is some basic query in the config file.
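For anyone scripting this check, the probe URL above can be assembled with the Go standard library so the parameters are always escaped correctly. This is just a small sketch; probeURL is a name made up for the example, and the host and query names are the placeholders used in this thread.

```go
package main

import (
	"fmt"
	"net/url"
)

// probeURL builds the exporter's /probe URL for a fabric target and an
// optional queries parameter (hypothetical helper, not part of aci-exporter).
func probeURL(exporterHost, target, queries string) string {
	q := url.Values{}
	q.Set("target", target)
	if queries != "" {
		q.Set("queries", queries)
	}
	u := url.URL{
		Scheme:   "http",
		Host:     exporterHost,
		Path:     "/probe",
		RawQuery: q.Encode(), // Encode sorts parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(probeURL("my_exporter", "cisco_sandbox", "my_query"))
}
```

Note that the parameters come out sorted by key (queries before target), which is equivalent for the exporter since query parameter order does not matter.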
from aci-exporter.
@thenodon Everything works fine with the sandbox :
{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-14T14:04:14Z"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-14T14:04:14Z"}
{"exec_time":1015415,"fabric":"cisco_sandbox","level":"info","method":"POST","msg":"api call fabric","requestid":"2VOI1DmZOBkw9N0AUieRthm53KK","status":200,"time":"2023-09-14T14:08:03Z","uri":"https://sandboxapicdc.cisco.com/api/aaaLogin.xml"}
{"fabric":"cisco_sandbox","level":"info","msg":"Using apic https://sandboxapicdc.cisco.com","requestid":"2VOI1DmZOBkw9N0AUieRthm53KK","time":"2023-09-14T14:08:03Z"}
{"class":"fabricNode","exec_time":189166,"fabric":"cisco_sandbox","length":580,"level":"info","method":"GET","msg":"api call fabric","requestid":"2VOI1DmZOBkw9N0AUieRthm53KK","status":200,"time":"2023-09-14T14:08:03Z","uri":"https://sandboxapicdc.cisco.com/api/class/fabricNode.json?query-target-filter=eq(fabricNode.role,\"controller\")"}
from aci-exporter.
@thenodon So in the same file I changed the target from the sandbox to my APIC, and I get this:
{"level":"info","msg":"aci-exporter starting on port 9643","time":"2023-09-14T14:35:03Z"}
{"level":"info","msg":"Read timeout 0s, Write timeout 0s","time":"2023-09-14T14:35:03Z"}
2023/09/14 14:35:30 http: panic serving 100.64.6.20:35536: runtime error: invalid memory address or nil pointer dereference
goroutine 36 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1868 +0xb9
panic({0x9479a0?, 0xe2d440?})
/usr/local/go/src/runtime/panic.go:920 +0x270
main.AciConnection.login({{0xab0cd0, 0xc0002f17d0}, 0x0, 0xc0002d72c8, 0xc00031a930, 0xc00031a900, {{0xaac400, 0xc0002977c0}, 0x0, {0xaae5e8, ...}, ...}, ...})
/build/aci-connection.go:85 +0x25
main.aciAPI.CollectMetrics({{0xab0cd0, 0xc0002f17d0}, {{0xab0cd0, 0xc0002f17d0}, 0x0, 0xc0002d72c8, 0xc00031a930, 0xc00031a900, {{0xaac400, 0xc0002977c0}, ...}, ...}, ...})
/build/aci-api.go:109 +0xa5
main.HandlerInit.getMonitorMetrics({{0xc0002f0030?, 0xc0002f06c0?, 0xc0002f0750?}, 0xc0002f0c60?}, {0xaaffb8, 0xc000321820}, 0xc000327b00)
/build/aci-exporter.go:272 +0x345
net/http.HandlerFunc.ServeHTTP(0x410225?, {0xaaffb8?, 0xc000321820?}, 0xf8?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.promMonitor.func2({0xaaffb8?, 0xc000321800}, 0xaaa4b0?)
/build/aci-exporter.go:345 +0xe3
net/http.HandlerFunc.ServeHTTP(0xc000327a00?, {0xaaffb8?, 0xc000321800?}, 0xaaa4b0?)
/usr/local/go/src/net/http/server.go:2136 +0x29
main.main.logcall.func3({0xab0258?, 0xc000348000}, 0xc000327a00)
/build/aci-exporter.go:322 +0x156
net/http.HandlerFunc.ServeHTTP(0x445220?, {0xab0258?, 0xc000348000?}, 0x70f85a?)
/usr/local/go/src/net/http/server.go:2136 +0x29
net/http.(*ServeMux).ServeHTTP(0xe70140?, {0xab0258, 0xc000348000}, 0xc000327a00)
/usr/local/go/src/net/http/server.go:2514 +0x142
net/http.serverHandler.ServeHTTP({0xc0002f1560?}, {0xab0258?, 0xc000348000?}, 0x6?)
/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc000230900, {0xab0cd0, 0xc0002f1470})
/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 1
/usr/local/go/src/net/http/server.go:3086 +0x5cb
from aci-exporter.
@ahmedaall so this is now against the LB?
fabrics:
  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://my-load-balancer
First try to add the aci_name like this:
fabrics:
  MY_FABRIC:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://my-load-balancer
    aci_name: foobar
Do you get the same panic?
You should also verify by running against one of the APIC endpoints, with no LB.
My thought is that it's something related to the LB and how it's configured. So test with the same config against different fabric sections. You can have multiple fabric settings in the same config file, as described in the example config.
from aci-exporter.
@thenodon
I have already tested without the LB. Same issue.
OK, I'll test. aci_name refers to the name of the fabric, I guess?
from aci-exporter.
@ahmedaall so everything works with the released version, both through the LB and against the APIC endpoints.
The version compiled from the branch works against the Cisco sandbox, but not against any setup you have on-prem, with or without the LB.
Is this correct?
It's very strange. What version of APIC do you run?
from aci-exporter.
@thenodon I have version 5.2(6e)
"The branch compiled version works against cisco sandbox, but not against any setup you have onprem, with or without lb.
Is this correct?" --> Exactly
from aci-exporter.
If I make r.URL.Query().Get("target") lower case with strings.ToLower(), the crash is gone.
But it is really odd, as this is not a case of using a fabric name that doesn't exist... Very confusing. Anyway, I hope this sheds some light!
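The workaround described above can be sketched as a small lookup helper. It assumes (and this is only an assumption; the thread merely suspects the YAML deserialization) that the fabric keys end up lowercased when the config is loaded, so an uppercase target misses the map and yields a nil profile. The names lookupFabric and fabric are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// fabric is a simplified stand-in for the exporter's fabric profile.
type fabric struct {
	Apic []string
}

// lookupFabric resolves a /probe target against the loaded fabrics.
// Assumption: the keys of fabrics are lowercase after loading the config.
func lookupFabric(fabrics map[string]*fabric, target string) (*fabric, error) {
	key := strings.ToLower(strings.TrimSpace(target))
	f, ok := fabrics[key]
	if !ok || f == nil {
		// Return an error instead of handing a nil profile to the login code.
		return nil, fmt.Errorf("unknown fabric %q", target)
	}
	return f, nil
}

func main() {
	fabrics := map[string]*fabric{
		"my_fabric": {Apic: []string{"https://APIC1", "https://APIC2"}},
	}
	f, err := lookupFabric(fabrics, "MY_FABRIC") // uppercase target still resolves
	fmt.Println(f.Apic, err)

	_, err = lookupFabric(fabrics, "other")
	fmt.Println(err)
}
```

Normalizing on lookup and rejecting unknown targets are two separate choices; the thread later settles on requiring lowercase names and checking them on /probe.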
from aci-exporter.
@camrossi thanks for your findings. I can confirm that I can reproduce the same behavior with the uppercase fabric name. I am currently not sure why, but I think it is a combination of how the yaml package deserializes the YAML file and how I match the target query parameter against the deserialized fabric structures. I will investigate it further, including what changed in the dependencies between the released version and the branch version, since @ahmedaall got it to work in the released version.
@ahmedaall and everyone else: the names in the fabrics section must be lowercase! Change to lowercase and let us know if it works.
@camrossi - it would be great to be given access to your DMZ fabric. The question is whether infraCont is different between ACI 5.2 and 6 (Cisco sandbox), since it did not work for me against the Cisco sandbox.
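To make the lowercase requirement easier to spot, a startup check along these lines could flag offending fabric names before the first scrape. This is only a sketch under assumptions: `nonLowercaseFabrics` and the map layout are hypothetical and do not reflect how aci-exporter stores its configuration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nonLowercaseFabrics returns, in sorted order, the fabric names that
// contain uppercase characters. Hypothetical helper for illustration only.
func nonLowercaseFabrics(fabrics map[string][]string) []string {
	var bad []string
	for name := range fabrics {
		if name != strings.ToLower(name) {
			bad = append(bad, name)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	// Assumed shape: fabric name -> list of APIC URLs.
	fabrics := map[string][]string{
		"MY_FABRIC": {"https://APIC1"},
		"fab1":      {"https://APIC2"},
	}
	if bad := nonLowercaseFabrics(fabrics); len(bad) > 0 {
		fmt.Printf("fabric names must be lowercase, found: %s\n", strings.Join(bad, ", "))
	}
}
```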
from aci-exporter.
@camrossi @thenodon thank you for the debugging. I tried changing the fabric name from uppercase to lowercase:
# Exporter port
port: 9643
# Configuration file name default without postfix
config: config
# The prefix of the metrics
prefix:
fabrics:
#  cisco_sandbox:
#    username: admin
#    password: ""
#    apic:
#      - https://sandboxapicdc.cisco.com
  my_fabric:
    username: ACI_USERNAME
    password: ACI_PASSWORD
    apic:
      - https://APIC1
      - https://APIC2
      - https://APIC3
but... I have the same issue. I tried with and without the LB. Did I forget something?
from aci-exporter.
Yes @ahmedaall, you need to update the Prometheus config as well (assuming you use it) so that it uses the lowercase name.
I have this for my Prometheus lab config; the relevant section is the targets one:
prometheus:
  prometheusSpec:
    scrapeInterval: 30s
    evaluationInterval: 30s
    additionalScrapeConfigs:
      - job_name: 'aci'
        scrape_interval: 1m
        scrape_timeout: 30s
        metrics_path: /probe
        static_configs:
          - targets: ['fab1','fab2']
and then in aci-exporter:
fabrics:
  # This is the Cisco provided sandbox that is open for testing
  fab1:
    # Apic username
    username: admin
    # Apic password
    password: <>
    # The available apic controllers
    # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
    apic:
      - https://<>
  fab2:
    # Apic username
    username: admin
    # Apic password
    password: <>
    # The available apic controllers
    # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
    apic:
      - https://<>
      - https://<>
      - https://<>
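To check by hand that the names in the Prometheus targets match the fabric names in the exporter config, one option is to build the probe URL for each fabric and request it directly. The sketch below only constructs the URLs. It assumes the exporter listens on localhost at port 9643 (the port from the config shown earlier in the thread) and that the fabric name is passed as the `target` query parameter on `/probe`. `probeURL` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// probeURL builds the /probe URL for one fabric on the given exporter base URL.
// Hypothetical helper; the base URL is an assumption about the deployment.
func probeURL(exporter, fabric string) (string, error) {
	u, err := url.Parse(exporter)
	if err != nil {
		return "", err
	}
	u.Path = "/probe"
	q := url.Values{}
	q.Set("target", fabric) // must match the fabric name in the exporter config
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	for _, fabric := range []string{"fab1", "fab2"} {
		u, err := probeURL("http://localhost:9643", fabric)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u)
	}
}
```

Fetching each printed URL with a browser or curl shows whether the exporter recognizes the fabric name, independently of Prometheus.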
from aci-exporter.
@ahmedaall have you had a chance to verify your LB solution yet? Would like to close this and make a release.
from aci-exporter.
@thenodon I can confirm that I am very happy with the aci-exporter. It is a really good job. Thanks a lot for your responsiveness on this issue.
from aci-exporter.
Thanks @ahmedaall and @camrossi for your support. Will close this with pull request #38
from aci-exporter.