
puppet-prometheus_reporter's Introduction

Puppet Prometheus Reports Processor

This module contains a Puppet reports processor that writes report metrics in a format accepted by the Prometheus node_exporter Textfile Collector.

How to

Puppet setup

Include this module in your path, and create a file named prometheus.yaml in your Puppet configuration directory. Example:

---
textfile_directory: /var/lib/prometheus-dropzone

Configuration options include:

  • textfile_directory - [String] Location of the node_exporter collector.textfile.directory (Required)
  • report_filename - [String] If specified, saves all reports to a single file (must end with .prom)
  • environments - [Array] If specified, only creates metrics on reports from these environments
  • reports - [Array] If specified, only creates metrics from reports of this type (changes, events, resources, time)
  • stale_time - [Integer] If specified, delete metric files for nodes that haven't sent reports in X days
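
As a rough illustration of how these options fit together, the sketch below parses a prometheus.yaml document and checks the constraints listed above. The helper name load_reporter_config and the exact validation rules are assumptions made for this example; they are not taken from the module's source.

```ruby
require 'yaml'

# Hypothetical helper: parse and validate the text of prometheus.yaml.
# Option names come from the list above; the checks are assumptions.
def load_reporter_config(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  raise ArgumentError, 'config must be a mapping' unless config.is_a?(Hash)

  dir = config['textfile_directory']
  raise ArgumentError, 'textfile_directory is required' unless dir.is_a?(String) && !dir.empty?

  filename = config['report_filename']
  if filename && !filename.to_s.end_with?('.prom')
    raise ArgumentError, 'report_filename must end with .prom'
  end

  # environments and reports are optional arrays.
  %w[environments reports].each do |key|
    value = config[key]
    raise ArgumentError, "#{key} must be an array" unless value.nil? || value.is_a?(Array)
  end

  stale = config['stale_time']
  raise ArgumentError, 'stale_time must be an integer' unless stale.nil? || stale.is_a?(Integer)

  config
end

example = <<~YAML
  ---
  textfile_directory: /var/lib/prometheus-dropzone
  report_filename: puppet.prom
  environments:
    - production
  reports:
    - resources
    - events
  stale_time: 14
YAML

puts load_reporter_config(example)['textfile_directory']
```

Only textfile_directory is mandatory here; every other key may be omitted.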

Include prometheus in your Puppet reports configuration; enable pluginsync:

[agent]
report = true
pluginsync = true

[master]
report = true
reports = prometheus
pluginsync = true

Note: you can use a comma-separated list of reports processors:

reports = puppetdb,prometheus

Prometheus

Call the Prometheus node_exporter with the --collector.textfile.directory flag.

node_exporter --collector.textfile.directory=/var/lib/prometheus-dropzone

Note: The directory can be anywhere, but it must match the textfile_directory set in prometheus.yaml above.

Sample

puppet_report_resources{name="Changed",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Failed",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Failed to restart",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Out of sync",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Restarted",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Scheduled",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Skipped",environment="production",host="node.example.com"} 0
puppet_report_resources{name="Total",environment="production",host="node.example.com"} 519
puppet_report_time{name="Acl",environment="production",host="node.example.com"} 3.8629975709999984
puppet_report_time{name="Anchor",environment="production",host="node.example.com"} 0.002442332
puppet_report_time{name="Augeas",environment="production",host="node.example.com"} 10.629003954
puppet_report_time{name="Concat file",environment="production",host="node.example.com"} 0.0026740609999999997
puppet_report_time{name="Concat fragment",environment="production",host="node.example.com"} 0.012010700000000003
puppet_report_time{name="Config retrieval",environment="production",host="node.example.com"} 20.471957786
puppet_report_time{name="Cron",environment="production",host="node.example.com"} 0.000874118
puppet_report_time{name="Exec",environment="production",host="node.example.com"} 0.4114313850000001
puppet_report_time{name="File",environment="production",host="node.example.com"} 0.32955574000000015
puppet_report_time{name="File line",environment="production",host="node.example.com"} 0.002702939
puppet_report_time{name="Filebucket",environment="production",host="node.example.com"} 0.0003994
puppet_report_time{name="Grafana datasource",environment="production",host="node.example.com"} 0.187452552
puppet_report_time{name="Group",environment="production",host="node.example.com"} 0.0031514940000000003
puppet_report_time{name="Mysql datadir",environment="production",host="node.example.com"} 0.000422795
puppet_report_time{name="Package",environment="production",host="node.example.com"} 1.670733222
puppet_report_time{name="Service",environment="production",host="node.example.com"} 0.8740041969999999
puppet_report_time{name="Total",environment="production",host="node.example.com"} 38.468031933999995
puppet_report_time{name="User",environment="production",host="node.example.com"} 0.005163427
puppet_report_time{name="Yumrepo",environment="production",host="node.example.com"} 0.0010542610000000001
puppet_report_changes{name="Total",environment="production",host="node.example.com"} 0
puppet_report_events{name="Failure",environment="production",host="node.example.com"} 0
puppet_report_events{name="Success",environment="production",host="node.example.com"} 0
puppet_report_events{name="Total",environment="production",host="node.example.com"} 0
puppet_report{environment="production",host="node.example.com"} 1477054915347
puppet_transaction_completed{environment="production",host="node.example.com"} 1
puppet_cache_catalog_status{state="not_used",environment="production",host="node.example.com"} 0
puppet_cache_catalog_status{state="explicitly_requested",environment="production",host="node.example.com"} 1
puppet_cache_catalog_status{state="on_failure",environment="production",host="node.example.com"} 0
puppet_status{state="failed",environment="production",host="node.example.com"} 0
puppet_status{state="changed",environment="production",host="node.example.com"} 0
puppet_status{state="unchanged",environment="production",host="node.example.com"} 1
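
Each line in the sample follows the Prometheus text exposition format: a metric name, a set of labels in braces, and a value. A minimal sketch of rendering such a line is shown below; metric_line is a hypothetical helper written for this example, not a function of the module.

```ruby
# Hypothetical helper: render one sample in the Prometheus text format.
# Backslashes, double quotes and newlines in label values are escaped.
def metric_line(name, labels, value)
  rendered = labels.map do |key, val|
    escaped = val.to_s
                 .gsub('\\') { '\\\\' }
                 .gsub('"') { '\\"' }
                 .gsub("\n") { '\\n' }
    %(#{key}="#{escaped}")
  end
  "#{name}{#{rendered.join(',')}} #{value}"
end

puts metric_line(
  'puppet_report_resources',
  { name: 'Total', environment: 'production', host: 'node.example.com' },
  519
)
```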

Contributors

See GitHub.

Special thanks to Puppet, Inc for Puppet and its store reports processor, to EvenUp for their graphite reports processor, and to Vox Pupuli for providing a platform that allows us to develop this module.

Copyright and License

Copyright © 2016 Puppet Inc

Copyright © 2016 EvenUp

Copyright © 2016 Multiple contributors

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

puppet-prometheus_reporter's People

Contributors

anarcat, bastelfreak, ckluente, corporate-gadfly, dhollinger, dhoppe, juniorsysadmin, ldaneliukas, matejzero, maxadamo, oleg-glushak, roidelapluie, sandra-thieme, smortex, tragiccode, vinzent, wyardley, yastupin, zilchms, zipkid


puppet-prometheus_reporter's Issues

Removing deactivated node

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 5.4.2
  • Ruby: bundled
  • Distribution: CentOS 7

What are you seeing

When I remove a node via puppet node deactivate and puppet cert clean <nodename>, it is still reported by prometheus since this script doesn't remove orphaned nodes from textfile_directory.

What behaviour did you expect instead

I was expecting the script to remove nodes that are no longer registered with Puppet, so that they are no longer reported.
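
One way to handle this by hand, sketched under the assumption that each node's metrics live in a file named <host>.prom inside textfile_directory (the naming is an assumption for this example), is to delete that file when the node is deactivated. remove_node_metrics is a hypothetical helper, not part of the module.

```ruby
# Hypothetical helper: delete the metrics file of a deactivated node.
# Assumes one file per node named "<host>.prom" in the textfile directory.
def remove_node_metrics(textfile_directory, host)
  # File.basename guards against path separators in the host name.
  path = File.join(textfile_directory, "#{File.basename(host)}.prom")
  return false unless File.file?(path)

  File.delete(path)
  true
end
```

It returns true when a file was removed and false when there was nothing to delete, so it can be called safely from a decommissioning script.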

debian packaging

Many prometheus exporters are packaged in Debian. This one could favorably be packaged similarly, so I filed request for package #940520 to track that work in Debian.

I am opening this issue to let the maintainers know about it and to bring it to the attention of potential packagers, to avoid duplicated work.

Configurable metric types

Description

Similarly to #40, it would be great if we could configure which types of metrics are created, since in some cases things such as per-resource time metrics might not be desired.

Proposal

A configuration option which would allow specifying reports from which to create metrics. If unspecified, all reports should be processed as they are now.
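
A minimal sketch of the proposed behaviour, assuming metrics are held in a hash keyed by report type (changes, events, resources, time). Both the data shape and the helper name select_report_types are assumptions for illustration.

```ruby
# Hypothetical helper: keep only the report types listed in `wanted`.
# A nil or empty list means "process everything", as proposed above.
def select_report_types(metrics_by_type, wanted)
  return metrics_by_type if wanted.nil? || wanted.empty?

  metrics_by_type.select { |type, _metrics| wanted.include?(type) }
end

metrics = { 'resources' => { 'Total' => 519 }, 'time' => { 'Total' => 38.4 } }
p select_report_types(metrics, ['resources'])
```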

remove transaction_uuid

The current exporter uses Prometheus in the wrong way: it abuses Prometheus labels as if they were metadata.

To avoid that, here is what would be needed:

  • Create one extra file with the list of metrics. When a new report comes in, take all the metrics from that file and complete the report's metrics with the ones from the file (with value -1).
  • Add the report's metrics to that "staging" file.
  • Update the dashboards.

config stale_time doesn't work

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 5.5.10
  • Ruby: 2.4.5p335
  • Distribution: 4.15.0-72-generic #81~16.04.1-Ubuntu
  • Module version: 1.0.0

How to reproduce (e.g Puppet code you use)

Install puppet-prometheus_reporter (v1.0.0) and configure it as described in the instructions.

What are you seeing

The old .prom files are not deleted even when their mtime is more than <stale_time> days in the past.
However, after I restart Puppet, the files do get deleted.

What behaviour did you expect instead

The old .prom files should be deleted automatically without restarting puppet.

Output log

Any additional information you'd like to impart

Although I am not an expert in Puppet or Ruby, I suspect there is a bug in prometheus.rb.
The following code should be placed inside the process method (def process), so that it is executed at run time:

unless STALE_TIME.nil? || STALE_TIME < 1
  Dir.chdir(TEXTFILE_DIRECTORY)
  Dir.glob('*.prom').each { |filename| File.delete(filename) if (Time.now - File.mtime(filename)) / (24 * 3600) > STALE_TIME }
end

Can anyone confirm this, or does anyone know more?

Does this reporter support the newest node version?

I'm getting a lot of errors scraping the .prom files using the new version of the node_exporter.

Example:

time="2017-03-23T12:57:35+01:00" level=error msg="error gathering metrics: 20 error(s) occurred:

  • gathered metric family puppet_report_resources has help "Metric read from /etc/puppet/metrics/foo.prom" but should have "Metric read from /etc/puppet/metrics/bar.prom"

Note that the error involves two different .prom files. It seems the first .prom file (alphabetically) actually succeeds, while everything else fails.
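
The message suggests that node_exporter fills in a per-file help text ("Metric read from ...") when a file carries no HELP line, so the same metric family ends up with different help strings in different files. One possible remedy, sketched here as an assumption rather than as the module's actual fix, is to write identical HELP and TYPE headers into every file. The help strings are the ones that appear in the output quoted in the report_filename issue on this page; metric_headers is a hypothetical helper.

```ruby
# Help texts as they appear in the output quoted elsewhere on this page.
METRIC_HELP = {
  'puppet_report_resources' => 'Resources broken down by their state',
  'puppet_report_time' => 'Resource apply times',
  'puppet_report_changes' => 'Changed resources in the last puppet run',
  'puppet_report_events' => 'Resource application events',
  'puppet_report' => 'Unix timestamp of the last puppet run'
}.freeze

# Hypothetical helper: build the header block to put at the top of each file.
def metric_headers(help = METRIC_HELP)
  lines = help.flat_map do |name, text|
    ["# HELP #{name} #{text}", "# TYPE #{name} gauge"]
  end
  lines.join("\n") + "\n"
end

puts metric_headers
```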

Configurable environments

Description

Metrics are created for all environments in a multi-environment setup. With thousands of machines and thousands of short-lived environments that exist only until they are merged into master, this results in a large number of unnecessary metrics.

Proposal

A configuration option which would allow specifying environments from which to create metrics. If unspecified, all environments should be processed as they are now.
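
The proposed rule reduces to a single predicate. The sketch below uses a hypothetical helper name and treats an empty list the same as an unspecified one, which is an assumption.

```ruby
# Hypothetical predicate: should a report from `environment` produce metrics?
# With no list configured, every environment is processed.
def process_environment?(environment, allowed)
  allowed.nil? || allowed.empty? || allowed.include?(environment)
end

puts process_environment?('production', %w[production staging])
puts process_environment?('feature_branch', %w[production staging])
```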

Old metrics replaced with '-1' when used with report_filename option

When the report_filename option in prometheus.yaml is enabled, every new Puppet agent run rewrites the "old" metrics with -1 values.
This is not acceptable, because we lose the data of all other hosts that ran Puppet before the most recent host. What is the purpose of this replacement?

# HELP puppet_report_resources Resources broken down by their state
# TYPE puppet_report_resources gauge
# HELP puppet_report_time Resource apply times
# TYPE puppet_report_time gauge
# HELP puppet_report_changes Changed resources in the last puppet run
# TYPE puppet_report_changes gauge
# HELP puppet_report_events Resource application events
# TYPE puppet_report_events gauge
# HELP puppet_report Unix timestamp of the last puppet run
# TYPE puppet_report gauge
# Old metrics
puppet_report_resources{name="Changed",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Corrective change",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Failed",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Failed to restart",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Out of sync",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Restarted",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Scheduled",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Skipped",environment="prod",host="previous_host"} -1
puppet_report_resources{name="Total",environment="prod",host="previous_host"} -1
puppet_report_time{name="Anchor",environment="prod",host="previous_host"} -1
puppet_report_time{name="Apt key",environment="prod",host="previous_host"} -1
puppet_report_time{name="Archive",environment="prod",host="previous_host"} -1
puppet_report_time{name="Catalog application",environment="prod",host="previous_host"} -1
puppet_report_time{name="Concat file",environment="prod",host="previous_host"} -1
puppet_report_time{name="Concat fragment",environment="prod",host="previous_host"} -1
puppet_report_time{name="Config retrieval",environment="prod",host="previous_host"} -1
puppet_report_time{name="Convert catalog",environment="prod",host="previous_host"} -1
puppet_report_time{name="Cron",environment="prod",host="previous_host"} -1
puppet_report_time{name="Exec",environment="prod",host="previous_host"} -1
puppet_report_time{name="Fact generation",environment="prod",host="previous_host"} -1
puppet_report_time{name="File",environment="prod",host="previous_host"} -1
puppet_report_time{name="Filebucket",environment="prod",host="previous_host"} -1
puppet_report_time{name="Group",environment="prod",host="previous_host"} -1
puppet_report_time{name="Httpauth",environment="prod",host="previous_host"} -1
puppet_report_time{name="Node retrieval",environment="prod",host="previous_host"} -1
puppet_report_time{name="Package",environment="prod",host="previous_host"} -1
puppet_report_time{name="Plugin sync",environment="prod",host="previous_host"} -1
puppet_report_time{name="Resources",environment="prod",host="previous_host"} -1
puppet_report_time{name="Schedule",environment="prod",host="previous_host"} -1
puppet_report_time{name="Service",environment="prod",host="previous_host"} -1
puppet_report_time{name="Total",environment="prod",host="previous_host"} -1
puppet_report_time{name="Transaction evaluation",environment="prod",host="previous_host"} -1
puppet_report_time{name="User",environment="prod",host="previous_host"} -1
puppet_report_changes{name="Total",environment="prod",host="previous_host"} -1
puppet_report_events{name="Failure",environment="prod",host="previous_host"} -1
puppet_report_events{name="Success",environment="prod",host="previous_host"} -1
puppet_report_events{name="Total",environment="prod",host="previous_host"} -1
puppet_report{environment="prod",host="previous_host"} -1
# New metrics
puppet_report_resources{name="Changed",environment="prod",host="latest_host"} 1
puppet_report_resources{name="Corrective change",environment="prod",host="latest_host"} 1
puppet_report_resources{name="Failed",environment="prod",host="latest_host"} 0
puppet_report_resources{name="Failed to restart",environment="prod",host="latest_host"} 0
puppet_report_resources{name="Out of sync",environment="prod",host="latest_host"} 1
puppet_report_resources{name="Restarted",environment="prod",host="latest_host"} 0
puppet_report_resources{name="Scheduled",environment="prod",host="latest_host"} 0
puppet_report_resources{name="Skipped",environment="prod",host="latest_host"} 0
puppet_report_resources{name="Total",environment="prod",host="latest_host"} 1419
puppet_report_time{name="Anchor",environment="prod",host="latest_host"} 0.00020055399999999998
puppet_report_time{name="Apt key",environment="prod",host="latest_host"} 0.0010214339999999999
puppet_report_time{name="Archive",environment="prod",host="latest_host"} 0.0013130680000000001
puppet_report_time{name="Catalog application",environment="prod",host="latest_host"} 12.731362691149116
puppet_report_time{name="Concat file",environment="prod",host="latest_host"} 0.000569588
puppet_report_time{name="Concat fragment",environment="prod",host="latest_host"} 0.0018003480000000002
puppet_report_time{name="Config retrieval",environment="prod",host="latest_host"} 7.667529137805104
puppet_report_time{name="Convert catalog",environment="prod",host="latest_host"} 1.1087825242429972
puppet_report_time{name="Cron",environment="prod",host="latest_host"} 0.021340405
puppet_report_time{name="Exec",environment="prod",host="latest_host"} 2.460908792999999
puppet_report_time{name="Fact generation",environment="prod",host="latest_host"} 6.378860469907522
puppet_report_time{name="File",environment="prod",host="latest_host"} 4.624337396000001
puppet_report_time{name="Filebucket",environment="prod",host="latest_host"} 6.5871e-05
puppet_report_time{name="Group",environment="prod",host="latest_host"} 0.13283111100000008
puppet_report_time{name="Httpauth",environment="prod",host="latest_host"} 0.000573358
puppet_report_time{name="Node retrieval",environment="prod",host="latest_host"} 0.3272962998598814
puppet_report_time{name="Package",environment="prod",host="latest_host"} 0.4034370560000001
puppet_report_time{name="Plugin sync",environment="prod",host="latest_host"} 1.0822257678955793
puppet_report_time{name="Resources",environment="prod",host="latest_host"} 8.5261e-05
puppet_report_time{name="Schedule",environment="prod",host="latest_host"} 0.00038971599999999997
puppet_report_time{name="Service",environment="prod",host="latest_host"} 0.54392664
puppet_report_time{name="Total",environment="prod",host="latest_host"} 29.313116721
puppet_report_time{name="Transaction evaluation",environment="prod",host="latest_host"} 12.170660078525543
puppet_report_time{name="User",environment="prod",host="latest_host"} 0.19735986999999997
puppet_report_changes{name="Total",environment="prod",host="latest_host"} 1
puppet_report_events{name="Failure",environment="prod",host="latest_host"} 0
puppet_report_events{name="Success",environment="prod",host="latest_host"} 1
puppet_report_events{name="Total",environment="prod",host="latest_host"} 1
puppet_report{environment="prod",host="latest_host"} 1585743103.709
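
A sketch of the behaviour the reporter of this issue seems to expect: when a host reports, only that host's samples in the shared file are replaced, and the samples of other hosts keep their last values. This illustrates the idea only and is not the module's implementation; merge_host_metrics is a hypothetical helper, and comment lines (HELP, TYPE) are left out for brevity.

```ruby
# Hypothetical helper: merge a host's new samples into the shared file's
# content without touching the samples of other hosts.
# Returns the merged sample lines; comment lines are dropped in this sketch.
def merge_host_metrics(existing_text, host, new_lines)
  host_label = %(host="#{host}")
  kept = existing_text.each_line.map(&:chomp).reject do |line|
    line.empty? || line.start_with?('#') || line.include?(host_label)
  end
  # Keep samples of the same metric name together, preserving input order.
  (kept + new_lines).group_by { |line| line[/\A[^{\s]+/] }.values.flatten
end
```

Grouping by metric name keeps all samples of one metric family adjacent, which is what the text format expects.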

Question about how to properly use this

I recently set this up in a test environment, but it doesn't appear to be working correctly and I think I may have set it up improperly, but the docs don't really explain setup.

puppetserver version: 6.12.0
puppet agent version : 6.16.0
Linux puppet-stage 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Steps I took to set this up:

  1. On Puppet Master install module:
     puppet module install --target-dir /etc/puppetlabs/code/modules puppet-prometheus_reporter
  2. On Puppet Master add prometheus.yaml file under: /etc/puppetlabs/puppet/
     ---
     textfile_directory: /etc/node_exporter_text_files # This is the location where all of my other Prometheus files go
     report_filename: puppet_report.prom
  3. On Puppet Master update puppet.conf file under: /etc/puppetlabs/puppet/
     [agent]
     report = true
     pluginsync = true

     [master]
     vardir = /opt/puppetlabs/server/data/puppetserver
     logdir = /var/log/puppetlabs/puppetserver
     rundir = /var/run/puppetlabs/puppetserver
     pidfile = /var/run/puppetlabs/puppetserver/puppetserver.pid
     codedir = /etc/puppetlabs/code

     report = true
     reports = prometheus
     pluginsync = true

     storeconfigs = true
     storeconfigs_backend = puppetdb
     reports = store,puppetdb,prometheus

     [main]
     server = puppet-stage.sub.domain.com
     dns_alt_names = puppet,puppet.-stage.sub.domain.com
     autosign = true
     pluginsync = true
     environment = stage
  4. On Puppet Master restart puppetserver and puppet
     systemctl restart puppetserver puppet
  5. On Puppet Master run puppet agent test
     puppet agent -t

I would expect after this to have the output file available for scraping, however, I don't see anything show up.
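
One detail that stands out in the puppet.conf quoted above is that reports is set twice in the [master] section. Whether that explains the missing output is not established here. The sketch below is a small diagnostic that lists settings appearing more than once within a section; duplicate_settings is a hypothetical helper and is not part of Puppet or of this module.

```ruby
# Hypothetical diagnostic: find settings repeated within one section of an
# INI-style file such as puppet.conf.
# Returns a hash of [section, key] => [matching lines].
def duplicate_settings(conf_text)
  section = nil
  seen = Hash.new { |hash, key| hash[key] = [] }
  conf_text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?('#')

    if (m = line.match(/\A\[(.+)\]\z/))
      section = m[1]
    elsif (m = line.match(/\A([^=\s]+)\s*=/))
      seen[[section, m[1]]] << line
    end
  end
  seen.select { |_key, lines| lines.size > 1 }
end
```

Run against the file's contents, for example duplicate_settings(File.read('/etc/puppetlabs/puppet/puppet.conf')), it returns an empty hash when no setting is repeated.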

Delete only prometheus reporter's files

At the moment, clean_stale_reports checks every .prom file and deletes it if it is older than STALE_TIME. This is a problem, because the textfile collector directory can contain .prom files that do not come from Puppet.

In my case, I'm getting the following error:

2023-04-07T10:36:15.765+02:00 ERROR [qtp219856377-1491] [puppetserver] Puppet Report processor failed: Permission denied - node_info_metrics.prom
org/jruby/RubyFile.java:1291:in `delete'
/etc/puppetlabs/code/modules/prometheus_reporter/lib/puppet/reports/prometheus.rb:167:in `block in clean_stale_reports'
org/jruby/RubyArray.java:1865:in `each'
/etc/puppetlabs/code/modules/prometheus_reporter/lib/puppet/reports/prometheus.rb:167:in `clean_stale_reports'
...

A quick fix was to add the prefix puppet_ to namevar and then use Dir.glob('puppet_*.prom').each... so that only Puppet reports are cleaned.

Not the best solution I guess, but it works and it has an extra benefit that now I know where the files come from.
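
The workaround can be sketched as follows. The function below shares only its name with the module's clean_stale_reports and is written from scratch for illustration; the prefix and now parameters and the error handling are assumptions.

```ruby
# Sketch: delete only files matching "<prefix>*.prom" that are older than
# stale_days, and skip files that cannot be removed instead of failing.
# Returns the list of deleted paths.
def clean_stale_reports(directory, stale_days, now: Time.now, prefix: 'puppet_')
  return [] if stale_days.nil? || stale_days < 1

  Dir.glob(File.join(directory, "#{prefix}*.prom")).select do |path|
    age_days = (now - File.mtime(path)) / (24 * 3600)
    next false unless age_days > stale_days

    begin
      File.delete(path)
      true
    rescue SystemCallError => e
      # For example "Permission denied": report it and carry on.
      warn "could not delete #{path}: #{e.message}"
      false
    end
  end
end
```

Because the glob is restricted to the prefix, files such as node_info_metrics.prom are never considered for deletion.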

How does one deal with permissions?

I've installed the Prometheus reporter and it's successfully producing reports in the designated directory. However, the files are written with permissions 0640 and owner/group puppet:puppet, so the node_exporter process, which runs as user node-exporter, is unable to read them.

It doesn't seem right to add the node-exporter user to the puppet group.

Would it make sense to make the owner/group of the .prom files configurable?
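
An alternative to changing owner or group would be a configurable file mode. The sketch below assumes a hypothetical mode setting and a hypothetical helper write_metrics_file; neither exists in the module as far as this page shows. It writes to a temporary file first and renames it, so node_exporter never reads a half-written file.

```ruby
# Hypothetical helper: write a metrics file with an explicit mode.
# The temporary name does not end in .prom, so the collector ignores it.
def write_metrics_file(path, content, mode: 0o644)
  tmp = "#{path}.tmp"
  File.write(tmp, content)
  File.chmod(mode, tmp)
  File.rename(tmp, path) # atomic when both paths are on one filesystem
end
```

With mode 0644 the file is world-readable, which lets the node-exporter user read it without joining the puppet group.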
