marcinbudny / eventstore_exporter
EventStoreDB (https://eventstore.com/eventstoredb/) metrics Prometheus exporter.
License: MIT License
Is there a metric with the number of the last event of a particular stream?
Hi,
we want to use the eventstore_exporter with a config file to inject credentials from a vault into the container.
I can't quite figure out how to do this from the linked namsral/flag.
How do I configure the eventstore_exporter to use a config file? Or is there a default path and filename?
Any help is appreciated
regards,
Frank
Hi,
Great service, great dashboard! :)
Found what must be a memory leak. If the eventstore_exporter gets bad credentials, memory usage just keeps growing, up to several gigabytes at least. It seems to be around 800 MB per day for us.
When the exporter has the correct credentials, memory usage is still high, but it doesn't seem to grow without bounds.
The memory usage of eventstore_exporter where credentials are correct (newly restarted process):
pmap -d $(pgrep eventstore_expo) | tail -n1 | awk '{print $4}'
139720K
eventstore_exporter without creds, running for 4 hours:
pmap -d $(pgrep eventstore_ex) | tail -n1 | awk '{print $4}'
238356K
Is there anything special to do to get data for this metric?
We do not seem to get any subscription lag information for our catch up subscriptions.
Information about persistent subscriptions was added in December, is it possible to include catch up ones?
In an EventStoreDB setup where ExtIpAdvertiseAs is not set, the gossip endpoint returns IP addresses. When httpClient hits a Slave node, the Slave responds with a 307 redirect to the Master. Since the redirect does not go to the same domain or a subdomain (see https://pkg.go.dev/net/http#Client), the Authorization header is not forwarded.
This results in a failure to retrieve subscription stats:
Atom Pub is disabled and ES version is < 21.2, there is no way to retrieve subscription stats
A quick solution is to add CheckRedirect and inject basic auth:
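esClient.httpClient = http.Client{
	Timeout: config.Timeout,
	// Re-attach basic auth on every redirect, because net/http drops the
	// Authorization header when the redirect goes to a different domain.
	CheckRedirect: func(req *http.Request, via []*http.Request) error {
		req.SetBasicAuth(config.EventStoreUser, config.EventStorePassword)
		return nil
	},
}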
A more elaborate solution would be to add a config flag that enables CheckRedirect (similar to config.InsecureSkipVerify), or to check if the address is an IP, or to use gossip to find the Master node.
Currently the projection status is returned as eventstore_projection_running, which is either 1 or 0.
The problem with this is that it can be difficult to set up alerting for a failed projection. For example, we have projections that are legitimately stopped, so their value is 0, but a failed projection is also 0.
I am trying to find a good example of how this could be implemented. I think I have seen other exporters break up the gauge like this:
eventstore_projection_status {projection="$by_category", status="Running"} 1
eventstore_projection_status {projection="$by_category", status="Stopped"} 0
eventstore_projection_status {projection="$by_category", status="Error"} 0
eventstore_projection_status {projection="my_broken_projection", status="Running"} 0
eventstore_projection_status {projection="my_broken_projection", status="Stopped"} 0
eventstore_projection_status {projection="my_broken_projection", status="Error"} 1
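The one-series-per-status layout above could be produced roughly as follows. This is only a sketch using the standard library: statusSamples is a hypothetical helper, and the three status values are taken from the example rather than from what EventStoreDB actually reports.

```go
package main

import "fmt"

// statuses lists the label values used in the example above. The states
// EventStoreDB really reports may differ (assumption).
var statuses = []string{"Running", "Stopped", "Error"}

// statusSamples renders one sample line per known status for a projection:
// the line matching the current status gets 1, all others get 0.
func statusSamples(projection, current string) []string {
	lines := make([]string, 0, len(statuses))
	for _, s := range statuses {
		value := 0
		if s == current {
			value = 1
		}
		lines = append(lines, fmt.Sprintf(
			"eventstore_projection_status{projection=%q,status=%q} %d",
			projection, s, value))
	}
	return lines
}

func main() {
	for _, line := range statusSamples("my_broken_projection", "Error") {
		fmt.Println(line)
	}
}
```

With this shape, an alert can target eventstore_projection_status{status="Error"} == 1 without firing for projections that are stopped on purpose.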
--
The eventstore_tcp_received_bytes metric is named as bytes but shows the data in bits.
There are several changes in diagnostic endpoints in 20.6 that affect the exporter.
/stats
/info
/projections/all-non-transient
/gossip
Hi
I'm trying to add your exporter to my cluster, but I can't find how to monitor the field "Status # of msgs / estimated time to catchup in seconds" in the subscription panel.
Is it possible with this exporter to monitor this field of the subscription panel? If not, do you intend to add it in a future update, please?
Hi,
First of all - thanks for a very useful exporter!
After using it for a while I've noticed that when projections are deleted they still show up in the metrics:
# HELP eventstore_projection_running If 1, projection is in 'Running' state
# TYPE eventstore_projection_running gauge
eventstore_projection_running{projection="$by_category"} 1
eventstore_projection_running{projection="$by_correlation_id"} 1
eventstore_projection_running{projection="$by_event_type"} 1
eventstore_projection_running{projection="$stream_by_category"} 1
eventstore_projection_running{projection="$streams"} 1
eventstore_projection_running{projection="Test_Fail"} 0
I think this is because the Gauge/label is added to projectionRunning, but if the projection no longer exists in stats it should be removed.
I think you will need to compare the projections available in stats with the projections being collected and delete the labels for the removed projections: GaugeVec.DeleteLabelValues
I would make the changes myself but it will take me some time to figure out the Go code :-)
Hello,
The exporter doesn't detect every case where EventStore is down:
time="2024-05-15T13:39:40Z" level=info msg="Running scrape"
time="2024-05-15T13:39:50Z" level=error msg="Error while getting data from EventStore" error="Get \"http://eventstore.svc.cluster.local:2113/subscriptions\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-05-15T13:39:55Z" level=info msg="Running scrape"
In this case, this is what I have in my Prometheus:
This is a problem because no alert is fired (I configured an alert for up == 0).
Best regards
Hi, I'm looking to use this exporter for our eventstore since it looks really good, but unfortunately it doesn't seem to export disk usage stats (as seen under the /stats/sys endpoint).
These stats would be useful for me because I'm using a dedicated EBS disk for the eventstore storage, and EBS has no monitoring by default.
If you could add the 'drive' stats, I'd be very appreciative.
Thanks!
Hi! We have trouble running the exporter (v. 0.9.0) in OpenShift (Kubernetes) against single-node eventstore (docker tag: release-5.0.8) as it keeps segfaulting on scrape.
Log:
`
time="2020-09-30T13:34:28Z" level=info msg="EventStore exporter configured" clusterMode=cluster enableParkedMessagesStats=true eventStoreURL="http://eventstore-server:2113 --cluster-mode=single" eventStoreUser=admin insecureSkipVerify=false port=9448 timeout=10s verbose=true
time="2020-09-30T13:34:36Z" level=info msg="Running scrape"
time="2020-09-30T13:34:36Z" level=debug msg="GET request to EventStore" url="http://eventstore-server:2113 --cluster-mode=single/subscriptions"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x72a037]
goroutine 28 [running]:
net/http.(*Request).SetBasicAuth(0x0, 0xc00002a0b0, 0x5, 0xc00002a074, 0x8)
/usr/local/go/src/net/http/request.go:963 +0x37
main.get.func1(0xc000118280, 0x46, 0xc0002e6240, 0xc000118200)
/go/src/github.com/marcinbudny/eventstore_exporter/eventstore_client.go:260 +0x645
created by main.get
/go/src/github.com/marcinbudny/eventstore_exporter/eventstore_client.go:255 +0xcb
`
Is this a known issue?
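In the trace above, SetBasicAuth is called on a nil request (first argument 0x0), and the logged eventStoreURL contains the text " --cluster-mode=single", so one plausible reading is that http.NewRequest rejected the URL and its error was not checked. A minimal sketch of the guard, assuming that reading is right; newAuthedRequest is a hypothetical helper and not the exporter's actual code:

```go
package main

import (
	"fmt"
	"net/http"
)

// newAuthedRequest builds a GET request with basic auth and returns an
// error instead of a nil request when the URL cannot be parsed.
func newAuthedRequest(rawURL, user, password string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		// Without this check, req is nil and SetBasicAuth panics.
		return nil, fmt.Errorf("building request for %q: %w", rawURL, err)
	}
	req.SetBasicAuth(user, password)
	return req, nil
}

func main() {
	// The URL from the log above, with the flag text fused into the host.
	_, err := newAuthedRequest(
		"http://eventstore-server:2113 --cluster-mode=single/subscriptions",
		"admin", "changeit")
	fmt.Println(err)
}
```

If that is the cause, passing the URL and --cluster-mode=single as two separate arguments to the container should avoid the malformed URL in the first place.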
Firstly, thanks for your work on this exporter, it's been a great help.
One thing that would be useful is a metric for the number of messages in the parked queue of a subscription.
It does not look like that metric is included with the extra statistics on a subscription, but I believe it could be calculated by using the parked stream metadata to get the "truncate before" value ($tb) and subtracting that from the latest event number in the parked stream.
/streams/$persistentsubscription-$ce-CategoryStream::Subscription-Group-parked/metadata
{
"$tb": 1185,
"$acl": {
}
}
Hello ,
It would be nice to see the nodes in clone state. Currently, as far as I can see, the value mapping is as follows:
1 - master
0 - slave
For the clone nodes it would be nice to have something like this:
1 - master
0 - slave
2 - clone
Thank you
Hello,
Last week we updated our eventstore_exporter pods to version v0.11.0. After the upgrade we noticed that the containers use a lot more memory than before, reach the memory limit of 30MB, and get OOM killed. This week we updated to v0.12.0 and had the same issues. We have now downgraded to v0.10.4 and everything is fine again.
Memory usage of v0.11.0
Memory usage of v0.10.4
regards,
Frank
Hi, I've deployed the exporter against a v5 cluster, running in cluster mode. However, I'm only seeing stats for a single member rather than all members, for example:
eventstore_process_cpu{instance="xxx:43644",job="evs-gwc",service="gateway-eventstore-exporter-qa"}
Shouldn't I see a metric for each member of the cluster or is the idea that we have to deploy an instance of the exporter pointing to each node?