sensu's Introduction

sensu

Join the chat at https://slack.sensu.io/

⚠️ ANNOUNCEMENT - Sensu 1.x has reached End-Of-Life (December 31st, 2019)

The Sensu 1.x project reached end-of-life on December 31st, 2019. The existing package repositories became unreachable on January 6th, 2020. Please see our blog post for more details: https://blog.sensu.io/announcing-the-sensu-archives

Sensu 1.x has been superseded by Sensu Go.

As always, we want to hear from the community. Please reach out on Slack or Discourse if you have any questions or concerns.

Sensu

A simple, malleable, and scalable framework for composing the monitoring system you need.

Sensu is offered in two flavors:

  • Sensu Core - this open source project
  • Sensu Enterprise - a full-featured commercial implementation, built on Sensu Core

Installation

Sensu supports a number of Unix-like platforms, as well as Windows. Please see the list of supported platforms for installation instructions.

Documentation

Please refer to the online documentation for details on configuring and operating Sensu.

Getting Help

If you have questions not covered by the documentation, the Sensu community is here to help. Please check out our chat on Slack, or the sensu-users discussion list.

Commercial support is also available. See the support section of our website for more detail.

Contributing

Please observe these guidelines on contributing.

License

Sensu Core is released under the MIT license.

sensu's People

Contributors

amdprophet, apaskulin, blast-hardcheese, chrisroberts, cwjohnston, darix, decklin, grantr, gsalisbury, igorshp, jamtur01, jlambert121, joemiller, jtschelling, kcrayon, lum, miyakawataku, moises-silva, nstielau, piavlo, portertech, rmc3, roobert, runningman84, seifer44, smaftoul, solarkennedy, tmonk42, tomdz, webframp

sensu's Issues

SSL Client certificates

The documentation doesn't seem to highlight the risks of using the same SSL certificate on all clients.

Should the documentation be revised to cover this? I'm going to add a notice about it to the demo README for sensu-puppet, but I don't know whether it should also be highlighted in the demo docs.

I noticed sensu-chef might also be facilitating this issue, but I haven't verified this.

server and client hang on interrupt

In my tests with a single sensu-server and client on the same machine, both processes always hang when killed. The only thing that will make them exit is kill -9.

I investigated and found that the unsubscribe blocks hang indefinitely. Removing them fixes the issue but that's not a real solution.

Sensu-client exits when subscription is [""]

Snippet of config.json:

"client": {
  "address": "hostname",
  "name": "hostname",
  "subscriptions": [""]
}

Error on client side:
/usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/channel.rb:833:in `send': The channel 1 was closed, you can't use it anymore! (AMQP::ChannelClosedError)
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/channel.rb:826:in `each'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/channel.rb:826:in `send'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/channel.rb:825:in `synchronize'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/channel.rb:825:in `send'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/em/deferrable.rb:48:in `call'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/em/deferrable.rb:48:in `callback'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/channel.rb:824:in `send'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/exchange.rb:327:in `publish'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/em/deferrable.rb:48:in `call'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/em/deferrable.rb:48:in `callback'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/exchange.rb:312:in `publish'
from /usr/lib/ruby/gems/1.8/gems/amqp-0.7.4/lib/amqp/queue.rb:388:in `publish'
from /usr/lib/ruby/gems/1.8/gems/sensu-0.9.4/lib/sensu/client.rb:47:in `publish_keepalive'
from /usr/lib/ruby/gems/1.8/gems/sensu-0.9.4/lib/sensu/client.rb:54:in `setup_keepalives'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/em/timers.rb:56:in `call'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/em/timers.rb:56:in `fire'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:179:in `call'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:179:in `run_machine'
from /usr/lib/ruby/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:179:in `run'
from /usr/lib/ruby/gems/1.8/gems/sensu-0.9.4/lib/sensu/client.rb:14:in `run'
from /usr/lib/ruby/gems/1.8/gems/sensu-0.9.4/bin/sensu-client:7
from /usr/bin/sensu-client:19:in `load'
from /usr/bin/sensu-client:19
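
A pre-flight check on the configuration might catch this kind of entry before the client connects. The following is only a minimal sketch, not Sensu's own validation: `invalid_subscriptions` is a hypothetical helper, and the config shape is assumed from the snippet above.

```ruby
require 'json'

# Hypothetical helper (not part of Sensu): returns the subscription entries
# that are not non-empty strings, so a caller can refuse to start.
def invalid_subscriptions(config)
  subscriptions = config.fetch('client', {}).fetch('subscriptions', [])
  subscriptions.reject { |s| s.is_a?(String) && !s.strip.empty? }
end

# Same shape as the reported snippet, wrapped in braces to make it valid JSON.
config = JSON.parse(<<~CONFIG)
  {
    "client": {
      "address": "hostname",
      "name": "hostname",
      "subscriptions": [""]
    }
  }
CONFIG

bad = invalid_subscriptions(config)
warn "invalid subscriptions: #{bad.inspect}" unless bad.empty?
```

Rejecting the empty string up front would turn the channel error into a readable configuration message, though whether the client should exit or simply skip the entry is a design decision left open here.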

sensu-client becomes completely unresponsive after a few hours

I have several hundred boxes deployed with a sensu-client running on each one. They are all publishing keepalives and consuming check requests from rabbitmq. After a few hours I can see by the individual rabbit queues that some clients stop consuming check requests. When I go to those boxes the sensu-client process is unresponsive.

  • The process is still running.
  • Rabbit says that the channel is still connected no matter how long the client stays in this state.
  • lsof shows the same lines as for an unstuck client, with the rabbitmq connection listed as ESTABLISHED.
  • I can telnet to port 3030 but it does not respond to any input. Normal clients respond with ping/pong and complain if I type random text. Stuck clients do not respond at all. lsof does not show an ESTABLISHED connection when telnet is connected on stuck clients. On normal clients a new line for this process shows up in lsof output for the ESTABLISHED local connection.
  • To debug this I added a custom signal handler to output the client state but once the client is stuck it does not respond to the signal at all.
  • If I restart the process all is well again.

Does this sound at all familiar and/or is there anything I can do to debug this?

We are using a fork of 0.9.8 only modified with code like the signal handler to debug this issue.

Thanks.
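
The telnet test described above could be scripted to spot stuck clients across many boxes. This is a rough sketch, assuming the client socket on port 3030 answers `ping` with `pong` as reported; `client_responsive?` is a hypothetical helper and not part of Sensu.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper: returns true if the socket at host:port answers
# "ping" with "pong" within `timeout` seconds, and false on a timeout,
# a refused connection, or any other reply.
def client_responsive?(host, port = 3030, timeout = 5)
  Timeout.timeout(timeout) do
    TCPSocket.open(host, port) do |socket|
      socket.write("ping\n")
      socket.readpartial(64).strip == 'pong'
    end
  end
rescue Timeout::Error, SystemCallError, SocketError, IOError
  false
end
```

A client in the stuck state described above accepts the TCP connection but never replies, so `client_responsive?('127.0.0.1')` would be expected to return false once the timeout expires.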

Puppet module does not update apt

The Puppet module does not update apt, causing a dependency chain failure.

Side notes:

  • Vagrantfile should use the same vagrant box name as Chef's, perhaps "ubuntu-1104-server-i386"

sensu-client crashes immediately

Hi, I just installed Sensu as per the installation docs on CentOS 6.2 64-bit, using the omnibus package.
After installing, I followed https://github.com/sensu/sensu/wiki/HOWTO:-Add-a-check to add my first check.
I installed the plugin and modified the config as prescribed, but when restarting the client I get this:

[root@dfvimeonms1 ~]# /etc/init.d/sensu-client restart
Stopping sensu-client                                      [  OK  ]
Starting sensu-client/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb:67: [BUG] unknown type 0x22 (0xc given)
ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0028 p:---- s:0099 b:0099 l:000098 d:000098 CFUNC  :initialize
c:0027 p:---- s:0097 b:0097 l:000096 d:000096 CFUNC  :new
c:0026 p:0099 s:0094 b:0092 l:000091 d:000091 METHOD /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb:67
c:0025 p:0090 s:0083 b:0082 l:000081 d:000081 CLASS  /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext.rb:17
c:0024 p:0011 s:0080 b:0080 l:000079 d:000079 CLASS  /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext.rb:12
c:0023 p:0074 s:0078 b:0078 l:000077 d:000077 TOP    /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext.rb:9
c:0022 p:---- s:0076 b:0076 l:000075 d:000075 FINISH
c:0021 p:---- s:0074 b:0074 l:000073 d:000073 CFUNC  :require
c:0020 p:0053 s:0070 b:0070 l:000069 d:000069 METHOD /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:36
c:0019 p:0027 s:0063 b:0063 l:000062 d:000062 CLASS  /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json.rb:58
c:0018 p:0021 s:0061 b:0061 l:000060 d:000060 TOP    /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json.rb:54
c:0017 p:---- s:0059 b:0059 l:000058 d:000058 FINISH
c:0016 p:---- s:0057 b:0057 l:000056 d:000056 CFUNC  :require
c:0015 p:0174 s:0053 b:0053 l:000052 d:000052 METHOD /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55
c:0014 p:0037 s:0046 b:0046 l:000045 d:000045 TOP    /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/base.rb:5
c:0013 p:---- s:0044 b:0044 l:000043 d:000043 FINISH
c:0012 p:---- s:0042 b:0042 l:000041 d:000041 CFUNC  :require
c:0011 p:0174 s:0038 b:0038 l:000037 d:000037 METHOD /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55
c:0010 p:0039 s:0031 b:0031 l:000030 d:000030 TOP    /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/client.rb:1
c:0009 p:---- s:0029 b:0029 l:000028 d:000028 FINISH
c:0008 p:---- s:0027 b:0027 l:000026 d:000026 CFUNC  :require
c:0007 p:0174 s:0023 b:0023 l:000022 d:000022 METHOD /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55
c:0006 p:0068 s:0016 b:0016 l:000015 d:000015 TOP    /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/bin/sensu-client:7
c:0005 p:---- s:0013 b:0013 l:000012 d:000012 FINISH
c:0004 p:---- s:0011 b:0011 l:000010 d:000010 CFUNC  :load
c:0003 p:0127 s:0007 b:0007 l:001928 d:001050 EVAL   /opt/sensu/bin/sensu-client:19
c:0002 p:---- s:0004 b:0004 l:000003 d:000003 FINISH
c:0001 p:0000 s:0002 b:0002 l:001928 d:001928 TOP   

-- Ruby level backtrace information ----------------------------------------
/opt/sensu/bin/sensu-client:19:in `<main>'
/opt/sensu/bin/sensu-client:19:in `load'
/opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/bin/sensu-client:7:in `<top (required)>'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
/opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/client.rb:1:in `<top (required)>'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
/opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/base.rb:5:in `<top (required)>'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json.rb:54:in `<top (required)>'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json.rb:58:in `<module:JSON>'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
/opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext.rb:9:in `<top (required)>'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext.rb:12:in `<module:JSON>'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext.rb:17:in `<module:Ext>'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb:67:in `generator='
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb:67:in `new'
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb:67:in `initialize'

-- C level backtrace information -------------------------------------------
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19f808) [0x7f4a193c3808]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x5c9fd) [0x7f4a192809fd]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_bug+0x127) [0x7f4a19280b40]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_check_type+0x1ab) [0x7f4a19281019]
/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/generator.so(+0x452f) [0x7f4a12a6f52f] generator.c:910
/opt/sensu/embedded/lib/libruby.so.1.9(+0x1928ea) [0x7f4a193b68ea]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x195c4c) [0x7f4a193b9c4c]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x196ff7) [0x7f4a193baff7]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x196f4a) [0x7f4a193baf4a]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_funcall2+0x31) [0x7f4a193bb238]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_obj_call_init+0x6f) [0x7f4a19287356]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_class_new_instance+0x30) [0x7f4a192d7042]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x1928ea) [0x7f4a193b68ea]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19279e) [0x7f4a193b679e]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x191b23) [0x7f4a193b5b23]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x18c732) [0x7f4a193b0732]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19c1b1) [0x7f4a193c01b1]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_iseq_eval+0x30) [0x7f4a193c0b99]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x64d57) [0x7f4a19288d57]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_require_safe+0x18e) [0x7f4a19289b29]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_f_require+0x20) [0x7f4a19289137]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x192921) [0x7f4a193b6921]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19279e) [0x7f4a193b679e]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x191b23) [0x7f4a193b5b23]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x18c732) [0x7f4a193b0732]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19c1b1) [0x7f4a193c01b1]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_iseq_eval+0x30) [0x7f4a193c0b99]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x64d57) [0x7f4a19288d57]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_require_safe+0x18e) [0x7f4a19289b29]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_f_require+0x20) [0x7f4a19289137]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x192921) [0x7f4a193b6921]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19279e) [0x7f4a193b679e]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x191b23) [0x7f4a193b5b23]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x18c732) [0x7f4a193b0732]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19c1b1) [0x7f4a193c01b1]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_iseq_eval+0x30) [0x7f4a193c0b99]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x64d57) [0x7f4a19288d57]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_require_safe+0x18e) [0x7f4a19289b29]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_f_require+0x20) [0x7f4a19289137]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x192921) [0x7f4a193b6921]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19279e) [0x7f4a193b679e]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x191b23) [0x7f4a193b5b23]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x18c732) [0x7f4a193b0732]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19c1b1) [0x7f4a193c01b1]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_iseq_eval+0x30) [0x7f4a193c0b99]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x64d57) [0x7f4a19288d57]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_require_safe+0x18e) [0x7f4a19289b29]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_f_require+0x20) [0x7f4a19289137]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x192921) [0x7f4a193b6921]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19279e) [0x7f4a193b679e]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x191b23) [0x7f4a193b5b23]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x18c732) [0x7f4a193b0732]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19c1b1) [0x7f4a193c01b1]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_iseq_eval+0x30) [0x7f4a193c0b99]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x64d57) [0x7f4a19288d57]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x64faf) [0x7f4a19288faf]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x1928ea) [0x7f4a193b68ea]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19279e) [0x7f4a193b679e]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x191b23) [0x7f4a193b5b23]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x18c732) [0x7f4a193b0732]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x19c1b1) [0x7f4a193c01b1]
/opt/sensu/embedded/lib/libruby.so.1.9(rb_iseq_eval_main+0x30) [0x7f4a193c0bdb]
/opt/sensu/embedded/lib/libruby.so.1.9(+0x61b8e) [0x7f4a19285b8e]
/opt/sensu/embedded/lib/libruby.so.1.9(ruby_exec_node+0x1e) [0x7f4a19285cb2]
/opt/sensu/embedded/lib/libruby.so.1.9(ruby_run_node+0x38) [0x7f4a19285c85]
/opt/sensu/embedded/bin/ruby() [0x4008a7]
/lib64/libc.so.6(__libc_start_main+0xfd) [0x3fa0a1ecdd]
/opt/sensu/embedded/bin/ruby() [0x4007a9]

-- Other runtime information -----------------------------------------------

* Loaded script: /opt/sensu/bin/sensu-client

* Loaded features:

    0 enumerator.so
    1 /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/encdb.so
    2 /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/trans/transdb.so
    3 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/defaults.rb
    4 /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/rbconfig.rb
    5 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/deprecate.rb
    6 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/exceptions.rb
    7 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/custom_require.rb
    8 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems.rb
    9 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/version.rb
   10 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/requirement.rb
   11 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/dependency.rb
   12 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/platform.rb
   13 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/specification.rb
   14 /opt/sensu/embedded/lib/ruby/1.9.1/rubygems/path_support.rb
   15 /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/version.rb
   16 /opt/sensu/embedded/lib/ruby/1.9.1/ostruct.rb
   17 /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/generic_object.rb
   18 /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb
   19 /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/parser.so
   20 /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/generator.so

* Process memory map:

00400000-00401000 r-xp 00000000 fd:01 557513                             /opt/sensu/embedded/bin/ruby
00600000-00601000 rw-p 00000000 fd:01 557513                             /opt/sensu/embedded/bin/ruby
01503000-019b9000 rw-p 00000000 00:00 0                                  [heap]
32cac00000-32cace3000 r-xp 00000000 fd:01 345277                         /usr/lib64/libruby.so.1.8.7
32cace3000-32caee2000 ---p 000e3000 fd:01 345277                         /usr/lib64/libruby.so.1.8.7
32caee2000-32caee7000 rw-p 000e2000 fd:01 345277                         /usr/lib64/libruby.so.1.8.7
32caee7000-32caf05000 rw-p 00000000 00:00 0 
32cc400000-32cc416000 r-xp 00000000 fd:01 453642                         /lib64/libgcc_s-4.4.6-20120305.so.1
32cc416000-32cc615000 ---p 00016000 fd:01 453642                         /lib64/libgcc_s-4.4.6-20120305.so.1
32cc615000-32cc616000 rw-p 00015000 fd:01 453642                         /lib64/libgcc_s-4.4.6-20120305.so.1
3876e00000-3876e5d000 r-xp 00000000 fd:01 450583                         /lib64/libfreebl3.so
3876e5d000-387705c000 ---p 0005d000 fd:01 450583                         /lib64/libfreebl3.so
387705c000-387705d000 r--p 0005c000 fd:01 450583                         /lib64/libfreebl3.so
387705d000-387705e000 rw-p 0005d000 fd:01 450583                         /lib64/libfreebl3.so
387705e000-3877062000 rw-p 00000000 00:00 0 
3878e00000-3878e07000 r-xp 00000000 fd:01 450597                         /lib64/libcrypt-2.12.so
3878e07000-3879007000 ---p 00007000 fd:01 450597                         /lib64/libcrypt-2.12.so
3879007000-3879008000 r--p 00007000 fd:01 450597                         /lib64/libcrypt-2.12.so
3879008000-3879009000 rw-p 00008000 fd:01 450597                         /lib64/libcrypt-2.12.so
3879009000-3879037000 rw-p 00000000 00:00 0 
3fa0200000-3fa0220000 r-xp 00000000 fd:01 450567                         /lib64/ld-2.12.so
3fa041f000-3fa0420000 r--p 0001f000 fd:01 450567                         /lib64/ld-2.12.so
3fa0420000-3fa0421000 rw-p 00020000 fd:01 450567                         /lib64/ld-2.12.so
3fa0421000-3fa0422000 rw-p 00000000 00:00 0 
3fa0600000-3fa0602000 r-xp 00000000 fd:01 450590                         /lib64/libdl-2.12.so
3fa0602000-3fa0802000 ---p 00002000 fd:01 450590                         /lib64/libdl-2.12.so
3fa0802000-3fa0803000 r--p 00002000 fd:01 450590                         /lib64/libdl-2.12.so
3fa0803000-3fa0804000 rw-p 00003000 fd:01 450590                         /lib64/libdl-2.12.so
3fa0a00000-3fa0b86000 r-xp 00000000 fd:01 450574                         /lib64/libc-2.12.so
3fa0b86000-3fa0d86000 ---p 00186000 fd:01 450574                         /lib64/libc-2.12.so
3fa0d86000-3fa0d8a000 r--p 00186000 fd:01 450574                         /lib64/libc-2.12.so
3fa0d8a000-3fa0d8b000 rw-p 0018a000 fd:01 450574                         /lib64/libc-2.12.so
3fa0d8b000-3fa0d90000 rw-p 00000000 00:00 0 
3fa0e00000-3fa0e17000 r-xp 00000000 fd:01 450917                         /lib64/libpthread-2.12.so
3fa0e17000-3fa1016000 ---p 00017000 fd:01 450917                         /lib64/libpthread-2.12.so
3fa1016000-3fa1017000 r--p 00016000 fd:01 450917                         /lib64/libpthread-2.12.so
3fa1017000-3fa1018000 rw-p 00017000 fd:01 450917                         /lib64/libpthread-2.12.so
3fa1018000-3fa101c000 rw-p 00000000 00:00 0 
3fa1600000-3fa1607000 r-xp 00000000 fd:01 450656                         /lib64/librt-2.12.so
3fa1607000-3fa1806000 ---p 00007000 fd:01 450656                         /lib64/librt-2.12.so
3fa1806000-3fa1807000 r--p 00006000 fd:01 450656                         /lib64/librt-2.12.so
3fa1807000-3fa1808000 rw-p 00007000 fd:01 450656                         /lib64/librt-2.12.so
3fa1a00000-3fa1a83000 r-xp 00000000 fd:01 450884                         /lib64/libm-2.12.so
3fa1a83000-3fa1c82000 ---p 00083000 fd:01 450884                         /lib64/libm-2.12.so
3fa1c82000-3fa1c83000 r--p 00082000 fd:01 450884                         /lib64/libm-2.12.so
3fa1c83000-3fa1c84000 rw-p 00083000 fd:01 450884                         /lib64/libm-2.12.so
7f4a12a6a000-7f4a12a6b000 rw-p 00000000 00:00 0 
7f4a12a6b000-7f4a12a72000 r-xp 00000000 fd:01 581722                     /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/generator.so
7f4a12a72000-7f4a12c72000 ---p 00007000 fd:01 581722                     /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/generator.so
7f4a12c72000-7f4a12c73000 rw-p 00007000 fd:01 581722                     /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/generator.so
7f4a12c73000-7f4a12c78000 r-xp 00000000 fd:01 581723                     /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/parser.so
7f4a12c78000-7f4a12e77000 ---p 00005000 fd:01 581723                     /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/parser.so
7f4a12e77000-7f4a12e78000 rw-p 00004000 fd:01 581723                     /usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/ext/parser.so
7f4a12e78000-7f4a12e7a000 r-xp 00000000 fd:01 558342                     /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/trans/transdb.so
7f4a12e7a000-7f4a1307a000 ---p 00002000 fd:01 558342                     /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/trans/transdb.so
7f4a1307a000-7f4a1307b000 rw-p 00002000 fd:01 558342                     /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/trans/transdb.so
7f4a1307b000-7f4a1307d000 r-xp 00000000 fd:01 558301                     /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/encdb.so
7f4a1307d000-7f4a1327c000 ---p 00002000 fd:01 558301                     /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/encdb.so
7f4a1327c000-7f4a1327d000 rw-p 00001000 fd:01 558301                     /opt/sensu/embedded/lib/ruby/1.9.1/x86_64-linux/enc/encdb.so
7f4a1327d000-7f4a1327e000 ---p 00000000 00:00 0 
7f4a1327e000-7f4a13382000 rw-p 00000000 00:00 0 
7f4a13382000-7f4a19213000 r--p 00000000 fd:01 344634                     /usr/lib/locale/locale-archive
7f4a19213000-7f4a19219000 rw-p 00000000 00:00 0 
7f4a19224000-7f4a19466000 r-xp 00000000 fd:01 557645                     /opt/sensu/embedded/lib/libruby.so.1.9.1
7f4a19466000-7f4a19665000 ---p 00242000 fd:01 557645                     /opt/sensu/embedded/lib/libruby.so.1.9.1
7f4a19665000-7f4a1966d000 rw-p 00241000 fd:01 557645                     /opt/sensu/embedded/lib/libruby.so.1.9.1
7f4a1966d000-7f4a1968c000 rw-p 00000000 00:00 0 
7fffe66ba000-7fffe66cf000 rw-p 00000000 00:00 0                          [stack]
7fffe67ff000-7fffe6800000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]


[NOTE]
You may have encountered a bug in the Ruby interpreter or extension libraries.
Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html

bash: line 1:  6493 Aborted                 (core dumped) /opt/sensu/bin/sensu-client -b -c /etc/sensu/config.json -p /var/run/sensu/sensu-client.pid -l /var/log/sensu/sensu-client.log
                                                           [FAILED]

Here are my configs:

[root@dfvimeonms1 ~]# cat /etc/sensu/config.json
{
  "rabbitmq": {
    "host": "localhost",
    "port": 5672,
    "user": "sensu",
    "password": "sensu",
    "vhost": "/sensu"
  },
  "redis": {
    "host": "localhost",
    "port": 6379
  },
  "api": {
    "host": "localhost",
    "port": 4567
  },
  "dashboard": {
    "port": 8080,
    "user": "admin",
    "password": "secret"
  },
  "handlers": {
    "default": {
      "type": "set",
      "handlers": [
        "stdout"
      ]
    },
    "stdout": {
      "type": "pipe",
      "command": "cat"
    }
  },
  "checks": {
    "test": {
      "command": "echo -n OK",
      "subscribers": [
        "test"
      ],
      "interval": 60
    }
  },
}
[root@dfvimeonms1 ~]# cat /etc/sensu/conf.d/check_cron.json 
{
    "checks": {
      "cron_check": {
        "handler": "default",
        "command": "/etc/sensu/plugins/check-procs.rb -p crond -C 1 ",
        "interval": 60,
        "subscribers": [ "webservers" ]
      }
    }
  }
[root@dfvimeonms1 ~]# cat /etc/sensu/conf.d/client.json 
{
    "client": {
      "name": "sensu-client.domain.tld",
      "address": "127.0.0.1",
      "subscriptions": [ "test", "webservers" ]
    }
  }

client log file (contains some old entries):

[root@dfvimeonms1 ~]# cat /var/log/sensu/sensu-client.log
{"timestamp":"2012-10-15T15:27:34.884699-0400","message":"cannot connect to rabbitmq","settings":{"host":"localhost","port":5672,"user":"sensu","password":"sensu","vhost":"/sensu"},"level":"fatal"}
{"timestamp":"2012-10-15T15:27:34.884969-0400","message":"SENSU NOT RUNNING!","level":"fatal"}
{"timestamp":"2012-10-15T15:27:41.554979-0400","message":"cannot connect to rabbitmq","settings":{"host":"localhost","port":5672,"user":"sensu","password":"sensu","vhost":"/sensu"},"level":"fatal"}
{"timestamp":"2012-10-15T15:27:41.555214-0400","message":"SENSU NOT RUNNING!","level":"fatal"}
{"timestamp":"2012-10-15T18:19:48.053362-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:19:51.055399-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:19:54.057375-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:19:57.059194-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:00.061377-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:03.063376-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:06.065228-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:09.067387-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:12.069253-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:15.117280-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:18.121207-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:22.444358-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:25.446375-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:28.448812-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:31.452445-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:34.455247-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:37.576583-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:40.738159-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:43.740166-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:46.742277-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:49.744220-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:52.969314-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:56.130301-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:20:59.135644-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:21:02.140347-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:21:05.142320-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:21:08.144370-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:21:11.147451-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:21:14.149376-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T18:21:17.151251-0400","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2012-10-15T20:05:07.578058-0400","message":"received signal","signal":"TERM","level":"warn"}
{"timestamp":"2012-10-15T20:05:07.578211-0400","message":"stopping","level":"warn"}
{"timestamp":"2012-10-15T20:05:07.578302-0400","message":"unsubscribing from client subscriptions","level":"warn"}
{"timestamp":"2012-10-15T20:05:07.578393-0400","message":"completing checks in progress","checks_in_progress":[],"level":"info"}
{"timestamp":"2012-10-15T20:05:08.079268-0400","message":"stopping reactor","level":"warn"}
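
In the crash report above, the embedded Ruby 1.9.3 under /opt/sensu/embedded loads the json gem from /usr/lib64/ruby/gems/1.8, which hints that a system gem path is leaking into the embedded interpreter (for example through GEM_PATH or RUBYLIB; that is an assumption, the report does not say). A small diagnostic sketch, with `foreign_paths` as a hypothetical helper:

```ruby
# Hypothetical helper: given a list of file paths and the prefix of the
# Ruby installation that should own them, return the absolute paths that
# live somewhere else.
def foreign_paths(paths, prefix)
  paths.select { |p| p.start_with?('/') && !p.start_with?(prefix) }
end

# Two entries taken from the "Loaded features" list in the report.
loaded = [
  '/opt/sensu/embedded/lib/ruby/1.9.1/rubygems.rb',
  '/usr/lib64/ruby/gems/1.8/gems/gems/json-1.7.5/lib/json/common.rb'
]
puts foreign_paths(loaded, '/opt/sensu/embedded')
```

Run under the embedded interpreter, the same helper could be pointed at `$LOADED_FEATURES` or `Gem.path` to see which directories sit outside /opt/sensu/embedded.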

Client with empty name causes sensu-server to fail

If you've somehow created a client with the name of "" (empty string), sensu-server fails to start. I was unable to delete this client using the API or the sensu-dashboard. I tried deleting the client's key in Redis, but the server failed to start when reading the keep-alive queue in RabbitMQ.

I'm using Sensu 0.9.7-1 via the Omnibus installer on Ubuntu 12.04.

Here are the relevant log entries from sensu-server:

{"timestamp":"2012-09-25T19:43:39.568788+0000","message":"resigning as master","level":"warn"}
/opt/sensu/embedded/lib/ruby/1.9.1/json/common.rb:148:in `initialize': can't convert nil into String (TypeError)
      from /opt/sensu/embedded/lib/ruby/1.9.1/json/common.rb:148:in `new'
      from /opt/sensu/embedded/lib/ruby/1.9.1/json/common.rb:148:in `parse'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/server.rb:475:in `block (4 levels) in setup_keepalive_monitor'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/em/deferrable.rb:151:in `call'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/em/deferrable.rb:151:in `set_deferred_status'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/em/deferrable.rb:191:in `succeed'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/ruby-redis-0.0.2/lib/redis/client.rb:63:in `receive_data'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/eventmachine.rb:187:in `run_machine'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/eventmachine.rb:187:in `run'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/server.rb:10:in `run'
      from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/bin/sensu-server:10:in `<top (required)>'
      from /opt/sensu/bin/sensu-server:19:in `load'
      from /opt/sensu/bin/sensu-server:19:in `<main>' 
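
The trace above shows JSON.parse receiving nil inside the keepalive monitor. A minimal sketch of a defensive parse follows; parse_client is a hypothetical helper, not part of Sensu, and a plain Hash stands in for Redis.

```ruby
require 'json'

# Hypothetical helper (not part of Sensu): parse a client record fetched
# from Redis, returning nil instead of raising when the value is missing
# or is not valid JSON.
def parse_client(raw)
  return nil if raw.nil? || raw.empty?
  JSON.parse(raw)
rescue JSON::ParserError
  nil
end

# Stand-in for Redis: the empty-named client has no stored value.
store = {
  'client:web01' => '{"name":"web01","timestamp":1348601019}',
  'client:'      => nil
}

# Skip records that could not be parsed rather than crashing.
clients = store.values.map { |raw| parse_client(raw) }.compact
puts clients.map { |client| client['name'] }.inspect
```

With a guard like this the server could log and skip the broken record instead of failing at startup.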

Support for mirrored queues (x-ha-policy)

In trying to use sensu with a rabbitmq cluster, one of the missing pieces is support for mirrored queues.

http://www.rabbitmq.com/ha.html outlines using x-ha-policy.

The ruby AMQP module takes the following option when defining a queue:

:arguments => { "x-ha-policy" => "all" }

Implementation should be pretty trivial. Just curious if anything is planned, or whether there is some direction on how the config should be injected.
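
One possible shape for the config injection, as a sketch only: the "ha_policy" settings key and the queue_options helper are assumptions made for illustration, and the resulting hash is what would be passed along when declaring the queue.

```ruby
# Hypothetical helper: build the options hash used when declaring a queue.
# The "ha_policy" settings key is an assumption made for this sketch.
def queue_options(rabbitmq_settings)
  options = {}
  policy = rabbitmq_settings['ha_policy']
  # Only add the argument when a policy is configured, so existing
  # non-clustered setups keep declaring queues exactly as before.
  options[:arguments] = { 'x-ha-policy' => policy } if policy
  options
end

puts queue_options('host' => 'localhost', 'ha_policy' => 'all').inspect
puts queue_options('host' => 'localhost').inspect
```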

Sensu should tell you what it is doing

Sensu as an application is very quiet. It should have the ability to:

  1. Log to a file|syslog
  2. Have a --verbose or --debug mode with more logging
  3. Capture error/trace output and deliver it cleanly as output.

Get list of all checks

Hi,

We're in the process of building a dashboard of service health using Sensu. Is there a way to get a list of all checks across all clients and their statuses with the API? There is /checks but that doesn't list the hosts that make each check. What is the best way to approach this?

Thanks,
Alex
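
One way to approximate the mapping, sketched under the assumption that client and check definitions are already available as parsed JSON: match each client's "subscriptions" against each check's "subscribers". The clients_per_check helper is hypothetical, and current statuses would still have to be looked up separately.

```ruby
# Hypothetical helper: list which clients would run each check by
# intersecting client "subscriptions" with check "subscribers".
def clients_per_check(clients, checks)
  checks.each_with_object({}) do |(name, check), result|
    subscribers = check['subscribers'] || []
    matching = clients.select do |client|
      (Array(client['subscriptions']) & subscribers).any?
    end
    result[name] = matching.map { |client| client['name'] }
  end
end

clients = [
  { 'name' => 'alpha', 'subscriptions' => %w[base redis] },
  { 'name' => 'beta',  'subscriptions' => %w[base] }
]
checks = {
  'redis_process' => { 'subscribers' => %w[redis] },
  'disk_usage'    => { 'subscribers' => %w[base] }
}

puts clients_per_check(clients, checks).inspect
```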

Missing client keys

My two PRs skirt the root of the issue, so I feel like I should log it. So far I can't find a simple test case to reproduce it, though. Here's the problem:

Background

We're using sensu to run checks and report events for ~ 600 servers. We have a handler to check keepalive events for servers that no longer exist. When appropriate, the handler makes a call to the API to delete the client.

Problem - Missing redis keys

Client keys are disappearing in redis. E.g. client:8eef4f7a-45f7-11e2-b603-12313d23e203 is nil when it should contain client data. This causes all sorts of problems down the pipe: anything that iterates through the 'clients' set and parses the JSON value expected at each key breaks.

Symptoms

  • Broken /clients resource in the API and all reliant services [https://github.com//pull/430]
  • Remnant data: history, events, etc. with that client's key that should have been, but have not been, deleted by the API call [https://github.com//pull/431]

Those PRs are just fixing symptoms. I have not found a good solution for ensuring the db's integrity.
I believe the API and the server are the only ones to touch redis. Is there anywhere else I should be looking?
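
To narrow down when keys go missing, a periodic integrity check could compare the 'clients' set against the stored values. This is a sketch only: orphaned_clients is a hypothetical helper and a Hash stands in for Redis.

```ruby
# Hypothetical integrity check. names is the content of the 'clients' set,
# store maps "client:<name>" keys to their JSON strings (nil when missing).
def orphaned_clients(names, store)
  names.select { |name| store["client:#{name}"].nil? }
end

names = %w[web01 8eef4f7a-45f7-11e2-b603-12313d23e203]
store = { 'client:web01' => '{"name":"web01"}' }

# Names that are still in the set but no longer have client data.
puts orphaned_clients(names, store).inspect
```

Logging the result with a timestamp would at least show whether the keys vanish around API delete calls or keepalive processing.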

Systems with non-UTF-8 default locale can't install gem

On systems where US-ASCII or C is the default system locale (UTF-8 is installed and usable but not the default), doing a "gem install sensu" generates:

root@:~# gem install sensu
Building native extensions. This could take a while...
[Version 0.9.0 to 0.9.4] [BUG] A couple of concurrency issues (race conditions) fixed for apps that actively close and/or reuse channels
[Version 0.9.0 to 0.9.4] [BUG] AMQP::Queue#initialize with :nowait => true no longer fails with NoMethodError
[Version 0.9.0 to 0.9.4] [FEATURE] Automatic recovery mode now works for publishers
Building native extensions. This could take a while...
ERROR: While executing gem ... (ArgumentError)
    invalid byte sequence in US-ASCII

This appears to be correctable by adding this to the top of the Gemfile:

if RUBY_VERSION =~ /1.9/
  Encoding.default_external = Encoding::UTF_8
  Encoding.default_internal = Encoding::UTF_8
end

Handler that can process "OK" checks

I was hoping to implement a handler that logs all OK check events by configuring it using severities:['ok']

I looked at server.rb code and only "non OK" check results are passed to handle_event.

Is there any way you can change it so that, if a handler declares that it supports the OK severity, it is called for each check result?
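
A sketch of the filtering being asked for, assuming the usual exit status mapping (0 ok, 1 warning, 2 critical, anything else unknown). handler_accepts? is a hypothetical helper, and the fallback for handlers without severities simply mirrors the non-OK behaviour described above.

```ruby
SEVERITIES = %w[ok warning critical unknown].freeze

# Map a check exit status to a severity name.
def severity_name(status)
  status.between?(0, 2) ? SEVERITIES[status] : 'unknown'
end

# Hypothetical filter: a handler that lists severities is called only for
# those; a handler without the key keeps the non-OK-only behaviour.
def handler_accepts?(handler, status)
  severities = handler['severities']
  return status != 0 if severities.nil?
  severities.include?(severity_name(status))
end

ok_logger = { 'type' => 'pipe', 'severities' => ['ok'] }
mailer    = { 'type' => 'pipe' }

puts handler_accepts?(ok_logger, 0)
puts handler_accepts?(mailer, 0)
```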

Bad configuration values should not give tracebacks

When I leave out keys from my config I get stuff like this:

/var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:38:in `invalid_config': configuration invalid, missing the following key: rabbitmq (RuntimeError)
    from /var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:144:in `has_keys'
    from /var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:142:in `each'
    from /var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:142:in `has_keys'
    from /var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:151:in `validate_settings'
    from /var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:197:in `setup_settings'
    from /var/lib/gems/1.8/gems/sensu-0.9.4/lib/sensu/config.rb:34:in `initialize'
    from /var/lib/gems/1.8/gems/sensu-dashboard-0.9.6/lib/sensu-dashboard/app.rb:25:in `new'
    from /var/lib/gems/1.8/gems/sensu-dashboard-0.9.6/lib/sensu-dashboard/app.rb:25:in `setup'
    from /var/lib/gems/1.8/gems/sensu-dashboard-0.9.6/lib/sensu-dashboard/app.rb:12:in `run'
    from /var/lib/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:179:in `call'
    from /var/lib/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:179:in `run_machine'
    from /var/lib/gems/1.8/gems/eventmachine-1.0.0.beta.4/lib/eventmachine.rb:179:in `run'
    from /var/lib/gems/1.8/gems/sensu-dashboard-0.9.6/lib/sensu-dashboard/app.rb:11:in `run'
    from /var/lib/gems/1.8/gems/sensu-dashboard-0.9.6/lib/sensu-dashboard/app.rb:464
    from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
    from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `require'
    from /var/lib/gems/1.8/gems/sensu-dashboard-0.9.6/bin/sensu-dashboard:3
    from /var/lib/gems/1.8/bin/sensu-dashboard:19:in `load'
    from /var/lib/gems/1.8/bin/sensu-dashboard:19

when I would expect to see stuff like this:

ConfigurationError: Unable to find required configuration section "rabbitmq" in "/etc/sensu/config.json"

This is true of all components, and it would go a long way to improving user experience.
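
A minimal sketch of what such validation could look like. The ConfigurationError class, the validate_settings signature, and the list of required sections are assumptions for illustration, not Sensu's actual code.

```ruby
# Hypothetical error type for problems the user can fix in the config file.
class ConfigurationError < StandardError; end

# Assumed list of required top-level sections.
REQUIRED_SECTIONS = %w[rabbitmq redis].freeze

def validate_settings(settings, path)
  missing = REQUIRED_SECTIONS.find { |key| !settings.key?(key) }
  return settings if missing.nil?
  raise ConfigurationError,
        "Unable to find required configuration section \"#{missing}\" in \"#{path}\""
end

begin
  validate_settings({ 'redis' => {} }, '/etc/sensu/config.json')
rescue ConfigurationError => error
  # Print one readable line instead of a traceback; a real executable
  # would also exit with a non-zero status here.
  warn "ConfigurationError: #{error.message}"
end
```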

sensu-server will not properly stop if redis disappears

  1. if sensu-server is running OK,
  2. then redis server disappears
  3. and then you try to kill sensu-server (eg: /etc/init.d/sensu-server stop), it will disconnect from rabbit and give up master duties but it will never properly stop/die

example:

/etc/init.d/sensu-server stop

sensu-server.log:

W, [2012-04-24T23:47:51.104202 #4429]  WARN -- : [stop] -- stopping sensu server -- TERM {"timestamp":"2012-04-24T23:47:51.104075+0000","message":"[stop] -- stopping sensu server -- TERM","level":"warn"}
W, [2012-04-24T23:47:51.109766 #4429]  WARN -- : [stop] -- unsubscribing from keepalives {"timestamp":"2012-04-24T23:47:51.104336+0000","message":"[stop] -- unsubscribing from keepalives","level":"warn"}

sensu-server process never stops.

Strace shows it trying to connect to redis repeatedly:

Process 4429 attached - interrupt to quit
select(9, [5 8], [], [], {0, 712840})   = 0 (Timeout)
stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=62, ...}) = 0
stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=62, ...}) = 0
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=264, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f571e482000
read(7, "127.0.0.1\tlocalhost\n127.0.1.1\tub"..., 4096) = 264
read(7, "", 4096)                       = 0
close(7)                                = 0
munmap(0x7f571e482000, 4096)            = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
fcntl(7, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
connect(7, {sa_family=AF_INET, sin_port=htons(6379), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
select(9, [5 7 8], [], [], {0, 90000})  = 1 (in [7], left {0, 89997})
read(7, "", 16384)                      = 0
shutdown(7, 1 /* send */)               = -1 ENOTCONN (Transport endpoint is not connected)
close(7)                                = 0
select(9, [5 8], [], [], {0, 999932})   = 0 (Timeout)
stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=62, ...}) = 0
stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=62, ...}) = 0
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=264, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f571e482000
read(7, "127.0.0.1\tlocalhost\n127.0.1.1\tub"..., 4096) = 264
read(7, "", 4096)                       = 0
close(7)                                = 0
munmap(0x7f571e482000, 4096)            = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 7
fcntl(7, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(7, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
connect(7, {sa_family=AF_INET, sin_port=htons(6379), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(7, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
select(9, [5 7 8], [], [], {0, 90000})  = 1 (in [7], left {0, 89997})
read(7, "", 16384)                      = 0
shutdown(7, 1 /* send */)               = -1 ENOTCONN (Transport endpoint is not connected)
close(7)                                = 0
select(9, [5 8], [], [], {0, 999929}^C <unfinished ...>

Handler repeating notifications

I'm using a simple handler that just pipes the result to an email address.

On a long-running sensu server I'm starting to see multiple notifications of the same issue (e.g. multiple "resolved" notifications).

I'm struggling to isolate the cause here - anyone else seen the same thing?

The validate_config method should be refactored

Currently, if the config.json file is malformed, you get some ugly and confusing errors. For example, not including a checks section returns:

vagrant@sensu:/etc/sensu$ sudo sensu-server -c /etc/sensu/config.json 
/var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/config.rb:24:in `validate_config': undefined method `each' for nil:NilClass (#!/usr/bin/env ruby NoMethodError)
    from /var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/config.rb:20:in `initialize'
    from /var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/server.rb:34:in `new'
    from /var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/server.rb:34:in `initialize'
    from /var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/server.rb:11:in `new'
    from /var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/server.rb:11:in `run'
    from /var/lib/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `call'
    from /var/lib/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
    from /var/lib/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
    from /var/lib/gems/1.8/gems/sensu-0.8.0/lib/sensu/server.rb:10:in `run'
    from /var/lib/gems/1.8/gems/sensu-0.8.0/bin/sensu-server:9
    from /usr/local/bin/sensu-server:19:in `load'
    from /usr/local/bin/sensu-server:19

Sending some keyboard sequences causes the client to exit

Hi -

Found the following by accident. Way to reproduce:

On a client:

telnet localhost 3030

then entering either CTRL+Z or CTRL+C terminates sensu-client with the following trace:

/opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/outputs/io.rb:53:in `encode': "\xFF" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/outputs/io.rb:53:in `to_json'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/outputs/io.rb:53:in `block in <<'
    from :10:in `synchronize'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/outputs/io.rb:51:in `<<'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/channel.rb:171:in `block (2 levels) in publish'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/channel.rb:170:in `each'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/channel.rb:170:in `block in publish'
    from :10:in `synchronize'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/channel.rb:169:in `publish'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/mixins/logger.rb:102:in `_log'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/mixins/logger.rb:79:in `log_with_level'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/cabin-0.4.4/lib/cabin/mixins/logger.rb:64:in `block (2 levels) in '
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.9/lib/sensu/socket.rb:43:in `rescue in receive_data'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.9/lib/sensu/socket.rb:19:in `receive_data'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/eventmachine.rb:187:in `run_machine'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/eventmachine.rb:187:in `run'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.9/lib/sensu/client.rb:12:in `run'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.9/bin/sensu-client:10:in `'
    from /opt/sensu/bin/sensu-client:19:in `load'
    from /opt/sensu/bin/sensu-client:19:in `'

Thanks.

Ruby-based handlers are expensive to spawn, causing high CPU usage/load average

Hi, so...

Bottom-line

Seems that when I have Ruby handlers for metrics (or if many non-ok events are gonna fire at some point), it's very easy for the server to become a resource hog.

Sorry for the scroll, I want to share the data...

The details

On our first productive deployment of Sensu, there are about ~15 machines, each with a set of about 3-5 standalone metrics checks running every 30 seconds (that means about 2 checks arriving per second to the server, a small VM of 2 cores & 2GB RAM. Grep'ing the actual log yielded a count of 132 metric-type results per minute).
I do not count here all the other non-metric checks, which do not cause handlers to run since they are almost all in an OK state.

Metrics are coming in in Graphite format, and I'm using a custom handler to transform these into StatsD format (this is a server-side optimization, so all metrics in the landscape are only flushed periodically)

The resulting load is sometimes fairly low (0.2-0.3), but might climb to 0.5, 1.0 or even more...

The investigation :-)

It's actually hard to see the culprit via top, so first suspect was actually Graphite (currently running on the same machine), especially since it's a VM. Stopping both StatsD & Graphite had no impact for the better.

When I directed the metrics to a kind of very cheap "dummy" handler (sending the results via UDP to a random port), load dropped to practically zero.

I patched the server code to do the same thing the handler does, just to see what the cost would be. Again, very low load, even when changing metric checks to 1-second intervals on a few nodes.

What can be done?

  • For the specific case of metrics for StatsD, I think of: Writing a Sensu::Plugin::Metric::CLI::StatsD class to directly emit results in the needed format, then have a simple UDP handler on the server to dump this into the central StatsD. This handler would only handle OK severity.
  • This would work for this specific case (and arguably maybe we don't need the central StatsD in the current setup, and Graphite would be efficient enough on its own - need to check). Still, what do we do about all the event handlers?
    Usually, they don't fire much. However, imagine we have an extensive set of checks, and then the DB falls over and everything starts screaming... so the monitor slows down just when you need it.

I think we may need to allow in-process plugins or something similar, in some form. I know, I know...but it's so much more efficient, and the problem is real. The server itself seems very efficient, but the handler part...
When you start the server after being down for some time, then you feel a REAL crunch, with load-average through the roof (this is also somewhat related to #398 I guess...but there's a different story there)

Maybe we can think of this in the context of a plugin architecture inside the server, allowing to register custom mutators, handlers or whatever without a huge penalty. For example, I want to store the last output of any metric received, so I can fetch this data, show it and alert on it (server-side) without going to Graphite. Another example - I want to have a "positive confirmation" for each check, ensuring that it actually ran ok. These may be features, or plugins...but they need to run very efficiently.

/scroll ends

Init script start does not sleep long enough to ensure successful start

In the init script start function, there's a section:

sleep 1

# make sure it's still running. some errors occur only after startup
status &> /dev/null
if [ $? -ne 0 ]; then
    echo_fail
    exit 1
fi

However, errors like not being able to connect to RabbitMQ occur on my low-power system only after about 4 seconds.
On the other hand, increasing the sleep time would be an inconvenience when everything works fine, so I'm not sure what a good solution would be here.
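
One option is to poll instead of sleeping for a fixed time: fail as soon as the process is gone, succeed as soon as some readiness signal appears, and give up at a deadline. The sketch below shows the idea in Ruby rather than shell; wait_for_start is hypothetical, and what counts as "ready" (for example a log line after the RabbitMQ connection succeeds) is an assumption.

```ruby
# Hypothetical helper: running and ready are callables supplied by the caller.
# Returns :failed if the process died, :started once ready, :timeout otherwise.
def wait_for_start(timeout, interval, running, ready)
  deadline = Time.now + timeout
  loop do
    return :failed unless running.call
    return :started if ready.call
    return :timeout if Time.now >= deadline
    sleep interval
  end
end

# Simulated service that stays up and becomes ready on the third poll.
attempts = 0
result = wait_for_start(5, 0.1, lambda { true }, lambda { (attempts += 1) >= 3 })
puts result
```

A healthy start then returns as soon as the service reports ready, while a slow failure is still caught within the timeout.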

AMQP handler stopped working

Graphite is now receiving the whole message. Here's what I see in the carbon logs:

24/09/2012 18:59:43 :: [listener] invalid message line: {"client":{"name":"alpha","address":"54.245.118.207","subscriptions":["base","ejabberd","memcached","nginx","pgbouncer","postgresql-master","rabbitmq","redis","statsd","supervisor","webapp","webapp-api","webapp-crons","webapp-jobs"],"timestamp":1348513181},"check":{"subscribers":["postgresql-master","postgresql-standby"],"handlers":["graphite"],"type":"metric","command":"/etc/sensu/plugins/postgres/postgres-dbsize-metric.rb -u XXXX -p XXXX -d XXXX --scheme kwarter.:::name:::.postgres","interval":60,"name":"postgres_dbsize_metric","issued":1348513183,"output":"kwarter.alpha.postgres.size.kwarter_alpha\t277787768\t1348513183\n","status":0,"duration":0.345,"history":["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"]},"occurrences":1}

Resolved email not sent after Warning alert is cleared

We have seen a few cases where the RESOLVED email is not sent by sensu after a Warning alert is cleared.
We are using the community mailer.rb plugin for this. This is one of the examples I managed to capture...Here are the server logs if it helps:

sensu version = 0.9.7-1
Ubuntu 12.04LTS

this is the check definition:

{
    "checks": {
        "iowait_check": {
            "notification": "IOwait too high",
            "handler": "default",
            "command": "/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40",
            "interval": 60,
            "subscribers": [ "all" ],
            "low_flap_threshold": 5,
            "occurrences": 2,
            "refresh": 60,
            "high_flap_threshold": 20
        }
    }
}

{"timestamp":"2012-10-24T10:32:03.637417+0000","message":"handling event","event":{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074710},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074720,"output":"CPU STATISTICS WARNING : user=5.05% system=5.05% iowait=34.86% idle=50.24% nice=0.00% steal=4.81% | CpuUser=5.05;CpuSystem=5.05;CpuIoWait=34.86;CpuIdle=50.24;CpuNice=0.00;CpuSteal=4.81;25;40\n","status":1,"duration":2.701,"history":["0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1"],"flapping":false},"occurrences":1,"action":"create"},"handler":{"command":"/etc/sensu/handlers/debug.rb","type":"pipe","name":"debug"},"level":"info"}
{"timestamp":"2012-10-24T10:32:03.637861+0000","message":"handling event","event":{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074710},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074720,"output":"CPU STATISTICS WARNING : user=5.05% system=5.05% iowait=34.86% idle=50.24% nice=0.00% steal=4.81% | CpuUser=5.05;CpuSystem=5.05;CpuIoWait=34.86;CpuIdle=50.24;CpuNice=0.00;CpuSteal=4.81;25;40\n","status":1,"duration":2.701,"history":["0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1"],"flapping":false},"occurrences":1,"action":"create"},"handler":{"command":"/etc/sensu/handlers/mailer.rb","type":"pipe","name":"email"},"level":"info"}
{"timestamp":"2012-10-24T10:32:03.692054+0000","message":"{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074710},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074720,"output":"CPU STATISTICS WARNING : user=5.05% system=5.05% iowait=34.86% idle=50.24% nice=0.00% steal=4.81% | CpuUser=5.05;CpuSystem=5.05;CpuIoWait=34.86;CpuIdle=50.24;CpuNice=0.00;CpuSteal=4.81;25;40\n","status":1,"duration":2.701,"history":["0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1"],"flapping":false},"occurrences":1,"action":"create"}","level":"info"}
{"timestamp":"2012-10-24T10:32:04.431209+0000","message":"not enough occurrences: app02.live.production/iowait_check","level":"info"}
{"timestamp":"2012-10-24T10:33:03.355071+0000","message":"handling event","event":{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074770},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074780,"output":"CPU STATISTICS WARNING : user=9.14% system=2.96% iowait=29.14% idle=56.54% nice=0.00% steal=2.22% | CpuUser=9.14;CpuSystem=2.96;CpuIoWait=29.14;CpuIdle=56.54;CpuNice=0.00;CpuSteal=2.22;25;40\n","status":1,"duration":2.406,"history":["0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1","1"],"flapping":false},"occurrences":2,"action":"create"},"handler":{"command":"/etc/sensu/handlers/debug.rb","type":"pipe","name":"debug"},"level":"info"}
{"timestamp":"2012-10-24T10:33:03.355478+0000","message":"handling event","event":{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074770},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074780,"output":"CPU STATISTICS WARNING : user=9.14% system=2.96% iowait=29.14% idle=56.54% nice=0.00% steal=2.22% | CpuUser=9.14;CpuSystem=2.96;CpuIoWait=29.14;CpuIdle=56.54;CpuNice=0.00;CpuSteal=2.22;25;40\n","status":1,"duration":2.406,"history":["0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1","1"],"flapping":false},"occurrences":2,"action":"create"},"handler":{"command":"/etc/sensu/handlers/mailer.rb","type":"pipe","name":"email"},"level":"info"}
{"timestamp":"2012-10-24T10:33:03.417633+0000","message":"{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074770},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074780,"output":"CPU STATISTICS WARNING : user=9.14% system=2.96% iowait=29.14% idle=56.54% nice=0.00% steal=2.22% | CpuUser=9.14;CpuSystem=2.96;CpuIoWait=29.14;CpuIdle=56.54;CpuNice=0.00;CpuSteal=2.22;25;40\n","status":1,"duration":2.406,"history":["0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1","1"],"flapping":false},"occurrences":2,"action":"create"}","level":"info"}
{"timestamp":"2012-10-24T10:34:03.065661+0000","message":"handling event","event":{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074830},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074840,"output":"CPU STATISTICS OK : user=0.51% system=0.76% iowait=0.00% idle=98.74% nice=0.00% steal=0.00% | CpuUser=0.51;CpuSystem=0.76;CpuIoWait=0.00;CpuIdle=98.74;CpuNice=0.00;CpuSteal=0.00;25;40\n","status":0,"duration":2.112,"history":["0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1","1","0"]},"occurrences":2,"action":"resolve"},"handler":{"command":"/etc/sensu/handlers/debug.rb","type":"pipe","name":"debug"},"level":"info"}
{"timestamp":"2012-10-24T10:34:03.066358+0000","message":"handling event","event":{"client":{"name":"app02.live.production","address":"94.236.40.237","subscriptions":["all","app_db","nginx","unicorn"],"environment":"production","timestamp":1351074830},"check":{"notification":"IOwait too high","handler":"default","command":"/etc/sensu/nagios-extra/check_cpu_stats.sh -w 25 -c 40","interval":60,"subscribers":["all"],"low_flap_threshold":5,"occurrences":2,"refresh":60,"high_flap_threshold":20,"name":"iowait_check","issued":1351074840,"output":"CPU STATISTICS OK : user=0.51% system=0.76% iowait=0.00% idle=98.74% nice=0.00% steal=0.00% | CpuUser=0.51;CpuSystem=0.76;CpuIoWait=0.00;CpuIdle=98.74;CpuNice=0.00;CpuSteal=0.00;25;40\n","status":0,"duration":2.112,"history":["0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","1","1","0"]},"occurrences":2,"action":"resolve"},"handler":{"command":"/etc/sensu/handlers/mailer.rb","type":"pipe","name":"email"},"level":"info"}

Suggestion - allow plugins to read check event JSON from STDIN

I have a check plugin that I wish to provide a complex configuration to, and the easiest way to achieve this is to have it read the check event JSON from STDIN

This would let me do something like this in the config :

"metric-jmx-prod": {
    "type" : "metric",
    "handlers": ["graphite"],
    "command": "/etc/sensu/plugins/metric-jmx-prod.rb",
    "config": [
        {
            "query" : "java.lang:type=Memory",
            "scheme" : "Memory",
            "attributes" : ["HeapMemoryUsage", "NonHeapMemoryUsage"]
        },{
            "query" : "java.lang:type=Threading",
            "scheme" : "Threading",
            "attributes": ["ThreadCount"]
        }
    ],
    "interval": 60,
    "subscribers": [ "hna", "beast", "xenon" ]
},

If this check json was written to STDIN when executing the check command, I could parse it, and access the config hash directly from my plugin.

Support for this could also be added to sensu-plugin.rb in a similar vein to how sensu-handler.rb does it
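
A sketch of the plugin side, assuming Sensu wrote the check definition as JSON to the command's STDIN (which is the behaviour being requested, not something it does today). read_check_config is a hypothetical helper; a StringIO stands in for STDIN so the example runs on its own.

```ruby
require 'json'
require 'stringio'

# Hypothetical helper: read the check definition from an IO object and
# return its "config" list, or an empty list when nothing was provided.
def read_check_config(io)
  raw = io.read
  return [] if raw.nil? || raw.strip.empty?
  JSON.parse(raw).fetch('config', [])
end

# In a real plugin this would be STDIN.
input = StringIO.new(<<-JSON)
{
  "type": "metric",
  "command": "/etc/sensu/plugins/metric-jmx-prod.rb",
  "config": [
    { "query": "java.lang:type=Memory", "scheme": "Memory",
      "attributes": ["HeapMemoryUsage", "NonHeapMemoryUsage"] }
  ]
}
JSON

read_check_config(input).each do |entry|
  puts "#{entry['scheme']}: #{entry['attributes'].join(', ')}"
end
```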

Support HTTP monitoring with a Query language

While reviewing a few bookmarks posted in another IRC channel, I came across an open source project by Heroku called Umpire.

A few times in my life I have been asked the unfortunate question, "Can you add x to nagios and have him get alerts for y?" I think adding this to Sensu would let any pointy-haired boss have fun adding alerts on his own time.

A simple route with a basic query language is all it would take; then your boss can point Pingdom or Siteuptime at a Sensu URI that reports health.

sensu client should still run if a check is invalid

In 0.9.6.beta.7 (and possibly more versions) a broken check config will cause sensu-client to exit. In my case I had a check with a missing interval.

This is way too brittle for a monitoring system. All clients could go down at once if you have a system that pushes out configuration changes, and you have to grep the logs to find out what happened.

The server ignores invalid conf.d files and the client should too. Invalid checks should return an error message like the "Missing client attributes" response.
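
A sketch of the suggested behaviour: validate each check on its own and keep running with the valid ones. The helpers and the set of required attributes are assumptions for illustration.

```ruby
# Hypothetical validation: which attributes are required is an assumption.
def check_errors(check)
  errors = []
  errors << 'missing command' unless check['command'].is_a?(String)
  errors << 'missing interval' unless check['interval'].is_a?(Integer)
  errors
end

# Split checks into usable ones and a name => errors map for the rest,
# instead of aborting on the first invalid definition.
def partition_checks(checks)
  valid = {}
  invalid = {}
  checks.each do |name, check|
    errors = check_errors(check)
    if errors.empty?
      valid[name] = check
    else
      invalid[name] = errors
    end
  end
  [valid, invalid]
end

checks = {
  'disk' => { 'command' => 'check-disk.rb', 'interval' => 60 },
  'load' => { 'command' => 'check-load.rb' }
}

valid, invalid = partition_checks(checks)
invalid.each { |name, errors| warn "ignoring check #{name}: #{errors.join(', ')}" }
puts valid.keys.inspect
```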

runaway rabbitmq queues

On 9/26 I upgraded sensu from 9.6-4 to 9.7-1. On 10/10 I received an alert from my external monitoring service saying that there was no disk space left on my sensu server. /var/lib/rabbitmq was filling up because one of the queues had no consumers. Stopping sensu-server, purging the queue and restarting sensu-server resolved the issue.

On 10/11 I upgraded rabbitmq to the latest stable, 2.8.7, and made sure all of my sensu clients were running 9.7-1, but on 10/12 and every day since we've had one or two instances of /results, or /results and /keepalives, filling up because nothing is consuming them, and have had to repeat the stop, purge, restart procedure to get our alerts back. There is nothing useful or unusual in the rabbitmq log or the sensu-server log.

Per @portertech, going to try this patch this afternoon to see if we can get some better info in the logs: https://github.com/sensu/sensu/blob/master/lib/sensu/server.rb#L79-87

pidfile is not deleted when api setup fails

Using 0.9.7.beta.2-1, when setup fails in lib/sensu/api.rb (in my case because I can't connect to rabbitmq), the pidfile is already written and will not be deleted.

Unless this is intended, I suggest not simply using exit 2, but exiting to a cleanup function (could refactor the logger messages there as well).
This could also be handled in the init script, assuming #353 is resolved.
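One possible shape for such a cleanup, sketched under assumptions: `write_pidfile` is a hypothetical helper, not a function in lib/sensu/api.rb. It relies on Ruby's `at_exit`, whose handlers run on a normal `exit 2` (but not on `exit!` or SIGKILL).

```ruby
require 'tmpdir'

# Hypothetical helper: write the pidfile and register its removal so that
# every ordinary exit path, including `exit 2` after a failed setup,
# cleans it up.
def write_pidfile(path)
  File.open(path, 'w') { |file| file.puts(Process.pid) }
  owner = Process.pid
  at_exit do
    # Only the process that wrote the file removes it (not forked children).
    File.delete(path) if Process.pid == owner && File.exist?(path)
  end
end

# Example path for illustration only.
pidfile = File.join(Dir.tmpdir, "sensu-api-example-#{Process.pid}.pid")
write_pidfile(pidfile)
puts File.read(pidfile)
```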

Warn or fail if no Redis server is available for testing

I had this problem when running unit tests. While the suite fails when no AMQP is available, it loops forever when no Redis server is available. It's fairly clear that sensu needs Redis, but I think it would be a good idea to at least warn that this can occur when running the tests. Or better, have the tests time out and fail if no Redis server is running.
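A minimal sketch of such a pre-flight check, using only the Ruby standard library. The helper name `service_reachable?` and the default Redis address 127.0.0.1:6379 are assumptions for illustration; the real test suite may configure Redis differently.

```ruby
require 'socket'
require 'timeout'

# Hypothetical pre-flight check: true if something accepts TCP connections
# on host:port within `seconds`, false otherwise.
def service_reachable?(host, port, seconds = 2)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue Timeout::Error, SystemCallError, SocketError
  false
end

unless service_reachable?('127.0.0.1', 6379)
  warn 'No Redis server reachable on 127.0.0.1:6379; tests that need Redis will not complete.'
end
```

A test helper could call this once before the suite starts and abort with a clear message instead of looping.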

Suppression windows only act on checks not handlers

It would be great if suppression configuration could be set on a handler defined to a check (as well as / instead of) being placed against the check itself. There are a couple of use cases where this behaviour would be great.

  1. When production alerting is pushed to a different handler than basic notification. In this model we'd want to alert for production issues on a check in certain time windows (let's say 9-5 weekdays), but we'd also want to record all of these alerts via basic email as a secondary record of the incident, and re-use the same basic check for staging servers, which wouldn't include the PagerDuty handler. Here we'd want a suppression rule on the PagerDuty handler and allow the notification to be pushed to email 24/7.

  2. When combining metric extraction with alerting. This is something I'm working to achieve so we can minimise the number of checks made on client instances. Here we're collecting metrics from the client and ALWAYS want to record it to graphite via a handler (the metric collector is modified to output exit codes relevant to alerting thresholds). In certain time periods (as with example 1) we'd want to push anything critical to PagerDuty. By having suppression rules on the PagerDuty handler, we'd ensure that we don't send alerts to PagerDuty outside of the alerting windows, but still continue to collect metrics for Graphite.
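Both use cases above could be sketched roughly as follows. The `window` key on a handler is an assumption made for this example, not an existing Sensu setting, and the handler names simply mirror the ones mentioned in the request.

```ruby
# Hypothetical handler definitions; the "window" key is an assumption,
# not an existing Sensu setting.
HANDLERS = {
  'pagerduty' => { 'window' => { 'days' => [1, 2, 3, 4, 5], 'start' => 9, 'end' => 17 } },
  'email'     => {},
  'graphite'  => {}
}

# A handler without a window always runs; otherwise it runs only on the
# listed weekdays (0 = Sunday) between the start and end hours.
def handler_active?(handler, now)
  window = handler['window']
  return true if window.nil?
  window['days'].include?(now.wday) &&
    now.hour >= window['start'] && now.hour < window['end']
end

def active_handlers(names, now)
  names.select { |name| handler_active?(HANDLERS.fetch(name), now) }
end

saturday_night = Time.local(2012, 10, 13, 3, 0)
tuesday_morning = Time.local(2012, 10, 16, 10, 0)
p active_handlers(%w[pagerduty email graphite], saturday_night)
p active_handlers(%w[pagerduty email graphite], tuesday_morning)
```

The check itself stays unsuppressed, so email and Graphite receive every result while PagerDuty is only paged inside its window.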

Cannot install

bundle install followed by rake:

~/src/sensu@arianna α:
bundle install
Invalid gemspec in [/var/lib/gems/1.8/specifications/rack-protection-1.1.4.gemspec]: invalid date format in specification: "2011-10-04 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/hashie-1.2.0.gemspec]: invalid date format in specification: "2011-10-15 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/em-http-request-1.0.0.gemspec]: invalid date format in specification: "2011-08-27 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/cabin-0.1.7.gemspec]: invalid date format in specification: "2011-11-08 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/tilt-1.3.3.gemspec]: invalid date format in specification: "2011-08-25 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/sinatra-1.3.1.gemspec]: invalid date format in specification: "2011-10-05 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/rack-protection-1.1.4.gemspec]: invalid date format in specification: "2011-10-04 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/hashie-1.2.0.gemspec]: invalid date format in specification: "2011-10-15 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/em-http-request-1.0.0.gemspec]: invalid date format in specification: "2011-08-27 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/cabin-0.1.7.gemspec]: invalid date format in specification: "2011-11-08 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/tilt-1.3.3.gemspec]: invalid date format in specification: "2011-08-25 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/sinatra-1.3.1.gemspec]: invalid date format in specification: "2011-10-05 00:00:00.000000000Z"
Fetching source index for http://rubygems.org/
Using rake (0.9.2.2)
Using addressable (2.2.6)
Using eventmachine (1.0.0.beta.4)
Using amqp (0.7.4)
Using rack (1.3.5)
Installing rack-protection (1.1.4)
Installing tilt (1.3.3)
Installing sinatra (1.3.1)
Using async_sinatra (0.5.0)
Using bacon (1.1.0)
Using bundler (1.0.15)
Using json (1.6.3)
Installing cabin (0.1.7)
Using daemons (1.1.4)
Using diff-lcs (1.1.3)
Using em-socksify (0.1.0)
Using http_parser.rb (0.5.3)
Installing em-http-request (1.0.0)
Using rspec-core (2.6.4)
Using rspec-expectations (2.6.0)
Using rspec-mocks (2.6.0)
Using rspec (2.6.0)
Using test-unit (2.4.2)
Using em-spec (0.2.5)
Installing hashie (1.2.0)
Using ruby-redis (0.0.2)
Using thin (1.3.1)
Your bundle is complete! Use bundle show [gemname] to see where a bundled gem is installed.

~/src/sensu@arianna α:
bundle show em-http-request
Invalid gemspec in [/var/lib/gems/1.8/specifications/rack-protection-1.1.4.gemspec]: invalid date format in specification: "2011-10-04 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/hashie-1.2.0.gemspec]: invalid date format in specification: "2011-10-15 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/em-http-request-1.0.0.gemspec]: invalid date format in specification: "2011-08-27 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/cabin-0.1.7.gemspec]: invalid date format in specification: "2011-11-08 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/tilt-1.3.3.gemspec]: invalid date format in specification: "2011-08-25 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/sinatra-1.3.1.gemspec]: invalid date format in specification: "2011-10-05 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/rack-protection-1.1.4.gemspec]: invalid date format in specification: "2011-10-04 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/hashie-1.2.0.gemspec]: invalid date format in specification: "2011-10-15 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/em-http-request-1.0.0.gemspec]: invalid date format in specification: "2011-08-27 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/cabin-0.1.7.gemspec]: invalid date format in specification: "2011-11-08 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/tilt-1.3.3.gemspec]: invalid date format in specification: "2011-08-25 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/sinatra-1.3.1.gemspec]: invalid date format in specification: "2011-10-05 00:00:00.000000000Z"
Could not find rack-protection-1.1.4 in any of the sources

~/src/sensu@arianna α:
rake
(in /home/rudd-o/src/sensu)
Invalid gemspec in [/var/lib/gems/1.8/specifications/rack-protection-1.1.4.gemspec]: invalid date format in specification: "2011-10-04 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/hashie-1.2.0.gemspec]: invalid date format in specification: "2011-10-15 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/em-http-request-1.0.0.gemspec]: invalid date format in specification: "2011-08-27 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/cabin-0.1.7.gemspec]: invalid date format in specification: "2011-11-08 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/tilt-1.3.3.gemspec]: invalid date format in specification: "2011-08-25 00:00:00.000000000Z"
Invalid gemspec in [/var/lib/gems/1.8/specifications/sinatra-1.3.1.gemspec]: invalid date format in specification: "2011-10-05 00:00:00.000000000Z"
NOTE: Gem::Specification#has_rdoc= is deprecated with no replacement. It will be removed on or after 2011-10-01.
Gem::Specification#has_rdoc= called from /home/rudd-o/src/sensu/sensu.gemspec:13
.
rake aborted!
no such file to load -- em-http-request
/home/rudd-o/src/sensu/Rakefile:7
(See full trace by running task with --trace)

Client can't start if interval is specified as string

I mistakenly set up a check with "interval": "180". When starting sensu client, I received the error "invalid settings" with the reason "check is missing interval". The interval setting wasn't missing, but was entered as a string rather than an integer. Sensu should either attempt to convert the string to an integer or at least provide a more specific error message.
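Both suggestions can be combined in a small validation step, sketched here under assumptions: `normalize_interval` is a hypothetical helper and the error wording is illustrative, not Sensu's.

```ruby
# Hypothetical helper: accept 180 or "180", and reject anything else with a
# message that says what was actually wrong.
def normalize_interval(name, value)
  raise ArgumentError, "check #{name} is missing interval" if value.nil?
  interval = value
  if value.is_a?(String)
    begin
      interval = Integer(value, 10) # base 10 only, so "010" is 10, not octal 8
    rescue ArgumentError
      interval = nil
    end
  end
  unless interval.is_a?(Integer) && interval > 0
    raise ArgumentError, "check #{name} interval must be a positive integer, got #{value.inspect}"
  end
  interval
end

puts normalize_interval('disk', '180')
begin
  normalize_interval('disk', '3m')
rescue ArgumentError => error
  puts error.message
end
```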

feature request: ability to set hostname at client side

Since sensu-client listens on TCP 3030 or UDP 3030, I can use it as a proxy to query other devices or to relay events into sensu via the local sockets. But currently the hostname cannot be set on the client side; it is only read from client.json. I think this would be an interesting feature when you need to extend sensu-client.

client can stop making checks if a check hangs

While running sensu-client on a node last night, I spotted that a check didn't happen.

I'm investigating, but one theory is that the check (an external script) never returned; it hung. This caused sensu-client to never check again, and no alert was raised.

Now it's a separate (and valid) question as to why my check hung forever, but the default behaviour here seems wrong. It should be possible to specify a "timeout" option for a check, after which the check is automatically failed (if the default is "never" then the API doesn't change from current).

Anyone else seen something similar?
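To make the proposed "timeout" option concrete, here is a blocking sketch using the Ruby standard library. `run_check` is a hypothetical helper, and reporting a timed-out check with exit status 2 is an assumption borrowed from the usual critical status. The real client is built on EventMachine, so an actual implementation would likely look different.

```ruby
require 'timeout'

# Hypothetical runner: returns [output, status]. A check that exceeds
# `seconds` is killed and reported as failed rather than blocking forever.
def run_check(command, seconds)
  reader, writer = IO.pipe
  pid = Process.spawn(command, :out => writer, :err => writer)
  writer.close
  begin
    Timeout.timeout(seconds) do
      output = reader.read
      Process.wait(pid)
      [output, $?.exitstatus]
    end
  rescue Timeout::Error
    Process.kill('KILL', pid) # stop the hung check
    Process.wait(pid)         # reap it so no zombie is left behind
    ["check timed out after #{seconds}s", 2]
  ensure
    reader.close
  end
end

p run_check('echo ok', 5)
p run_check('sleep 5', 1)
```

Note that a command run through a shell can leave grandchild processes behind; a fuller version would start the check in its own process group and kill the whole group.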

sensu-client dies with "undefined method `method_class' for AMQ::Protocol::HeartbeatFrame:Class"

Sensu-client died on a couple of our nodes today with the following exception in sensu-client.log:

{"timestamp":"2012-10-05T11:43:46.982026-0400","message":"reconnecting to rabbitmq","level":"warn"}
/opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.4/lib/amq/client/exceptions.rb:65:in `initialize': undefined method `method_class' for AMQ::Protocol::HeartbeatFrame:Class (NoMethodError)
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.4/lib/amq/client/async/adapter.rb:247:in `new'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.4/lib/amq/client/async/adapter.rb:247:in `send_frame'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.4/lib/amq/client/async/adapter.rb:575:in `send_heartbeat'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/em/timers.rb:56:in `call'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/em/timers.rb:56:in `fire'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/eventmachine.rb:187:in `call'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/eventmachine.rb:187:in `run_machine'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0.rc.4/lib/eventmachine.rb:187:in `run'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/lib/sensu/client.rb:8:in `run'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.7/bin/sensu-client:10:in `<top (required)>'
    from /opt/sensu/bin/sensu-client:19:in `load'
    from /opt/sensu/bin/sensu-client:19:in `<main>'

@portertech asked me to remove the chef attribute node.sensu.rabbitmq.heartbeat, which I've done. I'll update this ticket if it happens again.

Synchronizing time needed?

Because each client uses its own local clock, the timestamps attached to keepalive messages differ from one client to another, and the server makes wrong event decisions!
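The effect can be shown with a little arithmetic. In this sketch the 120-second threshold, the `stale?` helper, and the sample timestamps are all assumptions chosen for illustration, not values taken from the server code.

```ruby
# Hypothetical threshold, chosen only for illustration.
KEEPALIVE_THRESHOLD = 120 # seconds

# The server compares its own clock with the timestamp the client attached.
def stale?(keepalive_timestamp, server_now)
  server_now - keepalive_timestamp >= KEEPALIVE_THRESHOLD
end

server_now = 1_349_450_000
sent_at = server_now - 5 # keepalive really sent 5 seconds ago
skew = -150              # client clock runs 150 seconds behind the server

puts stale?(sent_at, server_now)        # client clock in sync
puts stale?(sent_at + skew, server_now) # client clock behind
```

The same keepalive looks fresh or stale depending only on the client's clock, which is why running NTP on every node, or having the server stamp keepalives on receipt, avoids the problem.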

Sensu API server frequently crashes

The Sensu API server often crashes with the following message:

{"timestamp":"2013-01-22T18:35:15.909051-0500","message":"reconnecting to rabbitmq","level":"warn"}
{"timestamp":"2013-01-22T18:35:21.011405-0500","message":"reconnecting to redis","level":"warn"}
{"timestamp":"2013-01-22T18:35:21.011700-0500","message":"reconnecting to rabbitmq","level":"warn"}
/opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.10/lib/amq/client/exceptions.rb:65:in `initialize': undefined method `method_class' for AMQ::Protocol::HeartbeatFrame:Class (NoMethodError)
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.10/lib/amq/client/async/adapter.rb:247:in `new'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.10/lib/amq/client/async/adapter.rb:247:in `send_frame'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/amq-client-0.9.10/lib/amq/client/async/adapter.rb:575:in `send_heartbeat'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/em/timers.rb:56:in `call'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/em/timers.rb:56:in `fire'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/eventmachine.rb:187:in `call'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/eventmachine.rb:187:in `run_machine'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/eventmachine-1.0.0/lib/eventmachine.rb:187:in `run'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.9/lib/sensu/api.rb:16:in `run'
    from /opt/sensu/embedded/lib/ruby/gems/1.9.1/gems/sensu-0.9.9/bin/sensu-api:10:in `<top (required)>'
    from /opt/sensu/bin/sensu-api:19:in `load'
    from /opt/sensu/bin/sensu-api:19:in `<main>'

Any insight into why this might be happening would be much appreciated.
