
check_rhev3's People

Contributors

administratoor, jhernand, lorenzbischof, scrat14, xorpaul

check_rhev3's Issues

check_rhev3 default port

The default port for REST-API is 8443 in check_rhev3, but most installations of RHEV and oVirt are running on port 443 now.

IMHO it would make sense to change the default port, but this could break existing RHEV/oVirt checks where no port is specified and the REST-API listens on port 8443. So the change must be noted in the ChangeLog and ReleaseNotes.

Memory usage fix

When memcached runs in a virtualized guest on RHEV 3.2, rhev-guest-agent reports a negative memory usage value:
memory.used
Memory used (agent)
-1975684957
GAUGE
BYTES

The fix for this (as reported by Jonathan) is to multiply the value by -1.

Display more detailed information of host and vm status

When checking e.g. the status of all hosts in a cluster, only the number of hosts in a certain state is displayed, but not which hosts are not ok. The same applies to vms.
When running the check with -v, this information should be displayed.

Interface traffic statistics

When check_rhev3 was created, network interface traffic from the RHEV 3.0 REST-API was reported in MBytes/s and not in Bytes/s as documented in the RHEV developer guide. It seems this has changed now.
Verify traffic against oVirt 3.3 and RHEV 3.2 and change the interface traffic calculation accordingly.
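If the API does report bytes per second as documented, the conversion for the plugin output would look roughly like the sketch below. It is written in Python purely for illustration (the plugin itself is Perl); the function name is hypothetical, and whether a given RHEV/oVirt version really returns bytes/s still has to be verified as described above.

```python
def bytes_per_s_to_mbit_per_s(bytes_per_second: float) -> float:
    """Convert a rate in bytes/s to megabits/s (1 Mbit = 1,000,000 bits)."""
    return bytes_per_second * 8 / 1_000_000


# A fully used 1 Gbit/s link moves 125,000,000 bytes per second.
print(bytes_per_s_to_mbit_per_s(125_000_000))
```

If the raw value were instead treated as already being in MBytes/s, the reported traffic would be off by a factor of one million.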

Monitor Load/Mem/CPU of Individual Cluster node.

Is it possible to monitor Load/Mem/CPU of individual cluster nodes?

For example, I have 2 hosts in one cluster. How do I monitor the Load/Mem/CPU of each host?

RHEV OK: Hosts ok - 2/2 Hosts with state UP |Hosts_up=2;2;4;0;

perl dependency when installing

Hello
I have tried to install the rpm on my system.
[root@nagios ~]# uname -a
Linux nagios.ft.uam.es 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 15:13:20 CDT 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@nagios ~]# cat /etc/redhat-release
Scientific Linux release 6.8 (Carbon)
I have perl installed:
[root@nagios ~]# perl -v

This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi
...

And the modules:
[root@nagios ~]# instmodsh
Available commands are:
l - List all installed modules
m - Select a module
q - Quit the program
cmd? l
Installed modules are:
Encode::Locale
Getopt::Long
HTTP::Cookies
HTTP::Date
HTTP::Negotiate
IO::HTML
LWP::MediaTypes
Perl
Test::Simple
WWW::RobotRules
XML::NamespaceSupport
XML::SAX
XML::SAX::Base
XML::SAX::Expat
XML::Simple
cmd? q

I get this error message while installing the rpm:
[root@nagios check_rhev3-1.5]# yum install /root/rpmbuild/RPMS/x86_64/nagios-plugins-rhev3-1.5-1.el6.x86_64.rpm
Loaded plugins: priorities, security, versionlock
Setting up Install Process
Examining /root/rpmbuild/RPMS/x86_64/nagios-plugins-rhev3-1.5-1.el6.x86_64.rpm: nagios-plugins-rhev3-1.5-1.el6.x86_64
Marking /root/rpmbuild/RPMS/x86_64/nagios-plugins-rhev3-1.5-1.el6.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package nagios-plugins-rhev3.x86_64 0:1.5-1.el6 will be installed
--> Processing Dependency: /bin/perl for package: nagios-plugins-rhev3-1.5-1.el6.x86_64
--> Processing Dependency: /bin/perl for package: nagios-plugins-rhev3-1.5-1.el6.x86_64
--> Finished Dependency Resolution
--> Finding unneeded leftover dependencies
Found and removing 0 unneeded dependencies
Error: Package: nagios-plugins-rhev3-1.5-1.el6.x86_64 (/nagios-plugins-rhev3-1.5-1.el6.x86_64)
Requires: /bin/perl
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

I appreciate any help. Thanks and regards,
almudena

Optimize error message

The plugin exits with "RHEV CRITICAL: Can't connect to RHEVM-API." if the RHEV REST-API doesn't return the expected results.

As the reason for this is not necessarily an unsuccessful connect (it can also be that e.g. statistics for a nic aren't available), provide better error messages.

Furthermore:
Traffic check for unknown hosts raises a Perl exception:
./check_rhev3.pl -H engine -a admin@internal:password -R 'bla' -l network -s traffic
Use of uninitialized value $output in concatenation (.) or string at ./check_rhev3.pl line 1036.
RHEV UNKNOWN: traffic unknown - |

plugin not working after change password on RHEV-M 3.5

Hello
I changed the password on admin@internal, after that plugin does not work:
[V] Starting the main script.
[V] This is check_rhev3 version 1.4.0.
[V] Checking which component to monitor.
[D] check_dc: Called function check_dc
[V] Datacenter: Checking datacenter default.
[D] check_istatus: Called function check_istatus.
[V] Status: Checking status of data_centers.
[D] check_istatus: Input parameter $component: data_centers
[D] check_istatus: Input parameter $search: default
[D] check_istatus: Input parameter $subcheck: storagedomains
[D] check_istatus: Converting variables.
[D] check_istatus: Converted variable $url: datacenters
[D] get_result: Called function get_result.
[D] get_result: Input parameter $_[0]: /datacenters?search=name%3Ddefault
[D] get_result: Input parameter $xml: data_centers
[D] get_result: Input parameter $search: id
[D] rhev_connect: Called function rhev_connect.
[V] REST-API: Connecting to REST-API.
[D] rhev_connect: Input parameter:
[V] REST-API: RHEVM-API URL:
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: cookie filename: czAxdnN5c21ndC1hZG1pbkBpbnRlcm5hbAo=
[D] rhev_connect: Trying cookie authentication.
[D] rhev_connect: No cookie file found - using username and password
[D] rest_api_connect: Called function rest_api_connect.
[V] REST-API: Connecting to REST-API.
[D] rest_api_connect: Input parameter: HTTP::Request=HASH(0x20315190).
[D] rest_api_connect: Input parameter: LWP::UserAgent=HASH(0x2036cdd0).
[D] rest_api_connect: Input parameter: /var/tmp/czAxdnN5c21ndC1hZG1pbkBpbnRlcm5hbAo=.
[V] REST-API: Client-Date: Wed, 12 Oct 2016 07:14:22 GMT
[D] rest_api_connect: RHEV UNKNOWN: Failed to connect to RHEVM-API or received invalid response.

Connect to API works via curl

Is there any cache where the old password could be stored?
Thank you

Default CPU load warning/critical values based on CPU cores

At the moment the default thresholds for CPU load (5min avg) are 2 (warning) and 4 (critical) if users don't specify other values.
A better solution would be a calculation based on the number of physical CPU cores:
warning = #cores
critical = #cores * 2
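A minimal sketch of that calculation, in Python for illustration only (the plugin is Perl). It assumes the core count is sockets times cores per socket, as in a CPU topology; the function and parameter names are made up for this example.

```python
from typing import Tuple


def default_load_thresholds(sockets: int, cores_per_socket: int) -> Tuple[int, int]:
    """Return (warning, critical) load thresholds derived from the core count."""
    total_cores = sockets * cores_per_socket
    warning = total_cores        # warning = #cores
    critical = total_cores * 2   # critical = #cores * 2
    return warning, critical


# Example: a host with 2 sockets and 4 cores per socket.
print(default_load_thresholds(2, 4))
```

Thresholds given explicitly on the command line would still take precedence; the derived values are only defaults.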

Source code formatting

Improve formatting of source code by replacing tabs with spaces.
Some of the formatting done with vim got lost after migrating to eclipse.

Data truncated

Hello,

I have a problem with the check on oVirt 3.6.
I have 2 clusters.
One is OK, but for the second one the data in the XML is truncated:

The options:
-H host -a login@domain:pass -l hosts -C cl1 -o
(I have the same problem with the network )

        <summary>
            <active>12</active>
            <migrating>0</migrating>
            <total>12</total>
        </summary>
        <protocol>stomp</protocol>
        <os type="oVirt Node">
            <version full_version="3.6 - 0.999.201606071021.el7.centos"RHEV CRITICAL: Error in XML returned from RHEVM - enable debug mode for details.
Can't use an undefined value as a HASH reference at test.pl line 700.

Do you have any idea what causes this issue?
I don't know if the problem is in the script or a limitation in oVirt.

Best regards

Set cluster to warning if host is in maintenance mode instead of critical

The cluster status is based on UP-status of hosts only.
But for example state Maintenance shouldn't result in a critical state, as someone is working on this host and did put it into maintenance mode.

Host status for clusters should have the same logic as vms per cluster. I suggest to use critical for:

  • error
  • install_failed
  • non_responsive
  • unassigned
  • down

All other states except up (ok) should be warning.
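The proposed mapping could be sketched like this (Python for illustration, not the plugin's Perl code). The state names are taken from the list above; the function names and the UNKNOWN result for a cluster without hosts are assumptions of this example.

```python
# States that should make a host count as critical, per the proposal above.
CRITICAL_STATES = {"error", "install_failed", "non_responsive", "unassigned", "down"}

# Order used to pick the worst severity in a cluster.
SEVERITY_ORDER = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def host_state_severity(state: str) -> str:
    """Map a single host state to OK, WARNING or CRITICAL."""
    state = state.lower()
    if state == "up":
        return "OK"
    if state in CRITICAL_STATES:
        return "CRITICAL"
    return "WARNING"  # e.g. maintenance


def cluster_severity(states: list) -> str:
    """Return the worst severity over all host states in a cluster."""
    if not states:
        return "UNKNOWN"  # assumption: no hosts found
    return max((host_state_severity(s) for s in states), key=SEVERITY_ORDER.get)


print(cluster_severity(["up", "maintenance"]))
```

With this logic a host in maintenance only degrades the cluster to WARNING, while a down or non_responsive host still results in CRITICAL.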

tmp file not found in rhev3 1.3?

Upgraded from 1.2 to 1.3 and now I get :

cat: /var/tmp/YXVzMDJnbW92aXJ0MDEuY29ycC52b2x1c2lvbi5jb20tc3ZjX25hZ2lvc0Bjb3JwLnZvbHVzaW9u: No such file or directory

network traffic

Not sure if this has been already discussed but with my rhev 3.2 environment, the network traffic status is somewhat confusing. Please see for p1p1 interface.

Status Information: RHEV CRITICAL: traffic critical - em3: 0 Mbit/s p5p1: 0 Mbit/s em1: 0 Mbit/s p1p1: 104000000 Mbit/s em2: 0 Mbit/s (ovprd2)

Performance Data: traffic_em3=0MB;62.5;87.5;0; traffic_p5p1=0MB;62.5;87.5;0; traffic_em1=0MB;62.5;87.5;0; traffic_p1p1=104000000MB;62.5;87.5;0; traffic_em2=0MB;62.5;87.5;0;-

How much of data is that in MBps?

Thanks!
Paras.

element has non-unique value in 'id' key attribute

I would like to check the storage:

./check_rhev3.pl -D datacenter -H manager.domain.com -a user@ldap:password -l storage -s usage

But the following warning appears multiple times above the results:

Warning: <logical_unit> element has non-unique value in 'id' key attribute: 2...31 at /usr/lib64/nagios/plugins/custom/check_rhev3.pl line 1715

There are duplicate key attributes. What am I doing wrong?

Unsuccessful stat on filename containing newline

Example:

$ sudo -u nagios ./plugin-dir/check_rhev3.pl -H my-very-long-rhev-manager-hostname -f /etc/check_rhev/check_rhev3.cfg -D foobar
Unsuccessful stat on filename containing newline at ./plugin-dir/check_rhev3.pl line 1678.
RHEV UNKNOWN: Datacenters foobar not found.
$ 

It appears to be related to the way the cookie filename is built. It may include a newline when the hostname combined with username@domain is too long.

$ echo my-very-long-rhev-manager-hostname-myusername@my.domaine.com | base64
bXktdmVyeS1sb25nLXJoZXYtbWFuYWdlci1ob3N0bmFtZS1teXVzZXJuYW1lQG15LmRvbWFpbmUu
Y29tCg==
$ 
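The wrapping can be reproduced outside the plugin. The sketch below uses Python for illustration (the plugin is Perl); the input string is the one obtained by decoding the base64 output shown above. `base64.encodebytes` wraps its output after 76 characters, much like the `base64` command does by default, while `base64.b64encode` never inserts line breaks.

```python
import base64

# Hostname plus user, with the trailing newline that echo appends.
identity = b"my-very-long-rhev-manager-hostname-myusername@my.domaine.com\n"

# MIME-style encoding: a newline after every 76 output characters.
wrapped = base64.encodebytes(identity).decode("ascii")

# Plain encoding: a single line, safe to use as a filename component.
unwrapped = base64.b64encode(identity).decode("ascii")

print(repr(wrapped))
print(repr(unwrapped))
```

On the Perl side, one possible fix (an assumption, since the report does not show how the filename is built) would be to strip line breaks from the encoded string, or, if MIME::Base64 is used, to pass an empty string as the second argument of encode_base64 so that no line breaks are added.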

Regards,
Eric

Discrepancy in DataCenter Storage usage Output.

Found that the storage usage for multiple storage domains attached to a datacenter does not give the exact output.

/opt/plugins/check_rhev3 -H xxx -f /etc/.rhevpw -D xxx -l storage -s usage
RHEV WARNING: storage warning - 77.92% used (xxxx) |storage_XXX=77.92%;60;80;0;

Attached storage details:

    <type>nfs</type>
    <path>/vol/01_kvm/iso</path>
    </storage>
    <available>289910292480</available>
    <used>998579896320</used>
    <committed>0</committed>
    <storage_format>v1</storage_format>

    <type>nfs</type>
    <path>/vol/01_kvm2/data1</path>
    </storage>
    <available>1714765692928</available>
    <used>1486058684416</used>
    <committed>1471026298880</committed>
    <storage_format>v3</storage_format>
    </storage_domain>

    <type>nfs</type>
    </storage>
    <available>289910292480</available>
    <used>998579896320</used>
    <committed>8256000884736</committed>
    <storage_format>v3</storage_format>
    </storage_domain>

Critical/Warning thresholds for -l vms/hosts are ignored

The check against $tmp_state on line 1630 and 1633 means that any thresholds specified will be ignored.

Example:
/usr/local/bin/check_rhev3.pl -H rhevm.somesite.net -f /etc/nagios/rhevm_credentials -C mycluster -l vms -w 50 -c 20
RHEV CRITICAL: Vms critical - 62/66 Vms with state UP |Vms_up=62;50;20;0; restoring_state=0;;;0; image_illegal=0;;;0; powering_down=0;;;0; reboot_in_progress=0;;;0; image_locked=0;;;0; unassigned=0;;;0; unknown=0;;;0; down=4;;;0; wait_for_launch=0;;;0; not_responding=0;;;0; powering_up=0;;;0; saving_state=0;;;0; migrating_to=0;;;0; paused=0;;;0; migrating_from=0;;;0; suspended=0;;;0;

When using '*' and option '-l usage', output is incomplete

When checking storage domains with the "*" option and the "-l usage" option, the output is not correct. Only the output for the last volume is written to STDOUT. I was expecting information about all volumes.
Example (I currently have 15 volumes in ovirt-engine):

$ ./check_rhev3 -H ovirt-engine -A /ovirt-engine/api -a username:password -S \* -l usage
RHEV OK: storage ok - 29.17% used (Export) |storage_Export=29.17%;60;80;0;

"Export" is the last storage, 14 other storages are missing from the output.

When using the "-vvv" option, we clearly see that all volumes are analysed, but only the last one is printed, so maybe it is a "concatenation" bug?

FYI: I have seen this problem with other options using '*', but have not memorised them all.

VM Pool Check broken in oVirt 4.1

When testing vm pool usage in oVirt 4.1 (maybe in older versions as well, but not yet verified) the following error occurs:

$ ./check_rhev3.pl -H engine.lab.rk-it.at -f /tmp/auth -A /ovirt-engine/api -P test-centos -l usage
Not an ARRAY reference at ./check_rhev3.pl line 610.

oVirt: ovirt-engine-4.1.0.4-1.el7.centos.noarch

RHEV host memory usage shown incorrectly

Plugin version: check_rhev3 1.2.0
RHEV OK: Version ok - Default: 3.0

The RHEV-H host memory usage is shown incorrectly: the plugin reports 61.80% used, whereas the actual host usage is only 15% (checked using the free -m command).
Debug output is attached; kindly help me fix the usage.

check_rhev3.pl -H RHEVM -a admin@internal:ctrls.123 -R RHEV -l memory -s mem

[V] Starting the main script.
[V] Checking which component to monitor.
[V] Host: Checking host RHEV.
[V] Statistics: Checking statistics of hosts.
[V] REST-API: Connecting to REST-API.
[V] REST-API: RHEVM-API URL: https://RHEV-M:8443/api/hosts?search=RHEV
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: RHEVM-API Password: ctrls.123
[V] REST-API: cookie filename: MTA1sTXzLjI0OC44sMC4xOTQtaSdfuYWwK
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 12 Nov 2013 05:40:29 GMT
Pragma: No-cache
Server: Apache-Coyote/1.1
Content-Length: 2631
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 05:30:00 IST
Client-Date: Tue, 12 Nov 2013 05:41:38 GMT
Client-Peer: RHEV-M:8443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=LT/CN=CA-10.101.3.106.73692
Client-SSL-Cert-Subject: /C=US/O=LT/CN=10.101.3.106
Client-SSL-Cipher: DHE-RSA-AES128-SHA
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
X-Powered-By: Servlet 2.5; JBoss-5.0/JBossWeb-2.1
RHEV: f16b5c86-13e5-11e3-aa8e-3440b5a4784a
[V] Statistics: RHEV: f16b5c86-13e5-11e3-aa8e-3440b5a4784a.
[V] Stats: Checking statistics of hosts.
[V] REST-API: Connecting to REST-API.
[V] REST-API: RHEVM-API URL: https://RHEV-M:8443/api/hosts/f16b5c86-13e5-11e3-aa8e-3440b5a4784a/statistics
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: RHEVM-API Password: password
[V] REST-API: cookie filename: MTA1sTXzLjI0OC44sMC4xOTQtaSdfuYWwK
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 12 Nov 2013 05:40:29 GMT
Pragma: No-cache
Server: Apache-Coyote/1.1
Content-Length: 7891
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 05:30:00 IST
Client-Date: Tue, 12 Nov 2013 05:41:38 GMT
Client-Peer: RHEV-M:8443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=LT/CN=CA-10.101.3.106.73692
Client-SSL-Cert-Subject: /C=US/O=LT/CN=10.101.3.106
Client-SSL-Cipher: DHE-RSA-AES128-SHA
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
X-Powered-By: Servlet 2.5; JBoss-5.0/JBossWeb-2.1
[V] Statistics: Getting Memory Usage.
[V] Statistics: Memory Usage of RHEV: 61.80.
[V] Statistics: warning value: 60.
[V] Statistics: critical value: 80.
[V] Statistics: Performance data: |memory=61.80%;60;80;0; memory.cached=0;;;0; memory.used=40632424857.6;;;0; memory.buffers=0;;;0; .
[V] Statistics: Output: 61.80% used (RHEV)
RHEV WARNING: memory warning - 61.80% used (RHEV) |memory=61.80%;60;80;0; memory.cached=0;;;0; memory.used=40632424857.6;;;0; memory.buffers=0;;;0;

Wrong CPU value for vms

Version : 1.4.0
When monitoring the cpu usage of a vm, I get a percentage over 100%.
[root@nagios]# ./check_rhev3 -H RHEVM -a admin@internal:password -M VM -l cpu -w 10 -c 15
RHEV CRITICAL: cpu critical - 170% used (VM) |cpu=170%;10;15;0; cpu.current.guest=100;;;0; cpu.current.hypervisor=68;;;0;

As a workaround I have edited the check_statistics function.

}else{
  # get warning and critical values for load based on physical CPU cores
  if ((! defined $o_warn || ! defined $o_crit) && $statistics eq "cpu.load.avg.5m"){
    my $lref  = get_result("/$url/$id{ $key }","cpu","topology");
    my %cputop = %{ $lref };
    print "[D] check_statistics: \%cputop: " if $o_verbose == 3; print Dumper(%cputop) if $o_verbose == 3;
    # warning = #cores
    # critical = #cores * 2
    $o_warn = $cputop{ 'sockets' } * $cputop{ 'cores' } unless defined $o_warn;
    $o_crit = $cputop{ 'sockets' } * $cputop{ 'cores' } * 2 unless defined $o_crit;
  }
  #-----Updated by Franz Geiser-----
  #Get the number of sockets and cores
  if ($statistics eq "cpu" && $component eq "vms"){
    my $lref  = get_result("/$url/$id{ $key }","cpu","topology");
    my %cputop = %{ $lref };
    print "[D] check_statistics: \%cputop: " if $o_verbose == 3; print Dumper(%cputop) if $o_verbose == 3;
    # warning = #cores
    # critical = #cores * 2
    $cpu_num = $cputop{ 'sockets' } * $cputop{ 'cores' };
  }
  #------------------------------------
  # check cpu, load and memory
  my $iret = get_stats($component,$id{ $key },$subcheck,$statistics,$key);
  my %temp = %{ $iret };
  $rethash{$key} = $temp{$key};
  #-----Updated by Franz Geiser-----
  if ($statistics eq "cpu" && $component eq "vms"){
    #divide the CPU usage by the total number of cores
    $rethash{$key}{usage} = int($rethash{$key}{usage}/$cpu_num);
  }
  #------------------------------------
  print "[D] check_statistics: \%rethash: " if $o_verbose == 3; print Dumper(%rethash) if $o_verbose == 3;
}
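The idea behind the workaround, reduced to its arithmetic, is sketched below in Python for illustration. The function name is hypothetical, and the clamp to 100% is an addition of this sketch that the Perl workaround above does not contain.

```python
def normalize_vm_cpu(raw_percent: float, sockets: int, cores_per_socket: int) -> float:
    """Scale a summed per-core CPU percentage down to a 0-100% value."""
    total_cores = max(1, sockets * cores_per_socket)  # guard against a zero topology
    return min(100.0, raw_percent / total_cores)      # clamp is an assumption of this sketch


# The 170% from the report, on a VM assumed to have 1 socket with 2 cores.
print(normalize_vm_cpu(170, 1, 2))
```

Dividing by the core count means that a VM which fully loads one of two cores reports 50% instead of 100%.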

plugin not working with ovirt 3.3.1

hello
i was testing the plugin and it was working fine on ovirt 3.3.0
now i did an upgrade to 3.3.1 and the plugin is no longer working

any clues?

./check_rhev3 -H ovirt -p 443 -a admin@internal:xxxxxxxxx -C WestmereCluster -vvvv
[V] Starting the main script.
[V] Checking which component to monitor.
[D] check_cluster: Called function check_cluster
[V] Cluster: Checking cluster WestmereCluster.
[V] Cluster: No check is specified, checking cluster host status.
[D] check_cluster_status: Called function check_cluster_status.
[V] Status: Checking status of hosts.
[D] check_cluster_status: Input parameter $subcheck: hosts
[D] get_result: Called function get_result.
[D] get_result: Input parameter $_[0]: /clusters?search=name%3DWestmereCluster
[D] get_result: Input parameter $xml: clusters
[D] get_result: Input parameter $search: id
[D] rhev_connect: Called function rhev_connect.
[V] REST-API: Connecting to REST-API.
[D] rhev_connect: Input parameter: /clusters?search=name%3DWestmereCluster.
[V] REST-API: RHEVM-API URL: https://vega:443/api/clusters?search=name%3DWestmereCluster
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: RHEVM-API Password: xxxxxxxxx
[V] REST-API: cookie filename: dmVnYS1hZG1pbkBpbnRlcm5hbAo=
[D] rhev_connect: Trying cookie authentication.
[D] rhev_connect: No cookie file found - using username and password
[D] rest_api_connect: Called function rest_api_connect.
[V] REST-API: Connecting to REST-API.
[D] rest_api_connect: Input parameter: HTTP::Request=HASH(0x2312228).
[D] rest_api_connect: Input parameter: LWP::UserAgent=HASH(0x2956858).
[D] rest_api_connect: Input parameter: /var/tmp/dmVnYS1hZG1pbkBpbnRlcm5hbAo=.
RHEV UNKNOWN: Failed to connect to RHEVM-API or received invalid response.

use cookie based authentication as default authentication method

Since check_rhev3 1.2 it is possible to use cookie-based authentication instead of username and password based authentication for RHEV >= 3.1 and oVirt >= 3.1 environments by specifying option "-o".

Most users did not notice this option. As discussed on the oVirt mailing list, it is better to try cookie-based authentication first and fall back to username and password.

Workflow for this could be:
[Diagram: check_rhev3-authentication]
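A rough sketch of the "cookie first, then fall back" order, in Python for illustration only. The two callables stand in for the real REST-API requests and are hypothetical; nothing here reflects the plugin's actual function names.

```python
from typing import Callable, Optional


def authenticate(cookie: Optional[str],
                 try_cookie: Callable[[str], bool],
                 try_password: Callable[[], bool]) -> str:
    """Return which method succeeded: 'cookie' or 'password'."""
    # 1. Use a stored session cookie if there is one and the API accepts it.
    if cookie and try_cookie(cookie):
        return "cookie"
    # 2. Otherwise fall back to username and password.
    if try_password():
        return "password"
    raise PermissionError("cookie and username/password authentication both failed")


# Example: a stale cookie is rejected, so the password login is used.
print(authenticate("JSESSIONID=stale", lambda c: False, lambda: True))
```

After a successful password login the new session cookie would be stored, so that the next check run can take the first branch.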

Host in maintenance mode reports 100% cpu

Hi,
A host in maintenance mode reports 100% cpu:

./check_rhev3 -H khk9dsg32.ip.tdk.dk -t 60 -p 443 -a admin@internal:xxxxxxxxxxxx -R khk9dsk34.ip.tdk.dk -l cpu -s usage
RHEV CRITICAL: cpu critical - 100% used (khk9dsk34.ip.tdk.dk) |cpu=100%;60;80;0; cpu.current.user=0;;;0; cpu.current.system=0;;;0; cpu.current.idle=0;;;0;

I would prefer a WARNING state with 0% load and a text saying the host is in maintenance.

thanks,
Peter Calum
Denmark

VM's in Host status result (Feature request)

It would be nice to also see in Nagios how many VMs are running on each host.
Example : /check_rhev3 -H khk9dsg31.ip.tdk.dk -p 443 -a admin@internal:xxxxxxx -R khk9dsk30.ip.tdk.dk -l status
Result now:
RHEV OK: Hosts ok - 1/1 Hosts with state UP|Hosts=1;1;1;0;
New Result :
RHEV OK: Host ok - 1/1 Host UP, 12 VM's, 12 active, 0 migrating |Hosts=1;1;1;0; VMS=12;12;0;0;
Thanks
Peter Calum

Add option for state of vms/hosts

Some users want a given vm state to count as critical, others as warning or ok, when checking the running vms on a host or in a cluster. So an option should be implemented to specify not only the number of vms for the critical/warning state but also a range of states.

Monitor datacenter quotas

Add support for monitoring datacenter quotas with the following default values (make it customizable as well):
warning = cluster_soft_limit_pct | storage_soft_limit
critical = 100% utilization

build error

I am trying to build check_rhev3 under rhel 6. I am seeing the following issue. All the required perl modules are installed.

  • aclocal: this is fine

  • autoconf:

[root@monitor3 check_rhev3-master]# autoconf
configure.in:35: error: possibly undefined macro: AC_PROG_PERL_MODULES
If this token and others are legitimate, please use m4_pattern_allow.

See the Autoconf documentation.

Line 35 is: AC_PROG_PERL_MODULES( LWP::UserAgent, , AC_MSG_ERROR(Missing Perl module LWP::UserAgent))

Is it not seeing the perl modules?

Installed perl modules:

[root@monitor3 check_rhev3-master]# instmodsh
Available commands are:
l - List all installed modules
m - Select a module
q - Quit the program
cmd? l
Installed modules are:
Encode::Locale
File::Listing
Getopt::Long
HTTP::Cookies
HTTP::Daemon
HTTP::Date
HTTP::Message
HTTP::Negotiate
IO::HTML
LWP
LWP::MediaTypes
Net::HTTP
Perl
WWW::RobotRules
XML::SAX::Expat

XML::Simple

Thanks!
-Paras

More detailed information of datacenter, host, vm and storagedomain status

check_rhev3 should provide more detailed information of the status of datacenters, hosts, vms and storagedomains. At the moment only the amount of UP elements is given.

Expected verbose output:
$ ./check_rhev3.pl -H localhost -a admin@internal:password -D "*" -v
RHEV CRITICAL: Datacenters critical - 1/3 Datacenters with state UP [Details: 2 uninitialized, 1 up]|up=1;3;3;0; uninitialized=2;;;0;

Non-verbose output:
$ ./check_rhev3.pl -H localhost -a admin@internal:password -D "*"
RHEV CRITICAL: Datacenters critical - 1/3 Datacenters with state UP |up=1;3;3;0; uninitialized=2;;;0;

Detailed performance data should help getting an overview of the virtualization environment.
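How the verbose "Details" part could be assembled from the per-element states is sketched below (Python for illustration; the function name and the sort order by count are assumptions chosen to reproduce the expected output above). Performance data is left out to keep the sketch short.

```python
from collections import Counter


def format_status(component: str, states: list, verbose: bool = False) -> str:
    """Build the status text, optionally with a per-state breakdown."""
    counts = Counter(states)
    text = f"{counts.get('up', 0)}/{len(states)} {component} with state UP"
    if verbose:
        # Most frequent state first, ties broken alphabetically.
        ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        details = ", ".join(f"{number} {state}" for state, number in ordered)
        text += f" [Details: {details}]"
    return text


print(format_status("Datacenters", ["up", "uninitialized", "uninitialized"], verbose=True))
```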

check for "update available"

Hi,

Thanks for the script, it's working fine and as expected. One thing I am missing is a check for whether an update is available for the hosts.

Did a little search and found this:

https://ovirt.github.io/ovirt-engine-api-model/master/#services/host
POST /hosts/{host:id}/upgradecheck

I think that can be implemented, but not sure how. I just need a Warning if an Update is available on one of my hosts.

Would be nice if someone has the time to do this.

much appreciated

(new) api user not working

Hi,

First, thanks for the script!

I have oVirt Engine 4.1 and the script works with the default admin@internal user. I want to create a new user (ideally read-only) just for the API, but I can't get it to work. I tried a new admin user and tried to add it to the viewer group, which I could not find.

I also couldn't find proper oVirt API documentation.

Thanks for your / someone's help.

RHEV UNKNOWN: Failed to connect to RHEVM-API or received invalid response.

Hello,
I have installed the new Release of your plugin, I can use it in the console and all works great, but I have this message in the Nagios portal.
I've tried a lot of options -a -f, without success, I suspect the credentials but not sure. The password has an @ character, I've tried to use escape \ but I still have the error.
Can you help me to debug it,
Thanks Alain

Wrong CPU Usage on hypervisor maintenance

Hi,

when a hypervisor is in maintenance mode in ovirtm/rhevm the plugin reports (wrongly) 100% cpu usage (and state CRITICAL depending on thresholds)
This is probably because the api sends cpu.idle, cpu.user and cpu.system as 0 value. In that case (all three values: 0) the plugin should report UNKNOWN state.
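The suggested guard could look like the following sketch (Python for illustration). It assumes that usage is derived as 100 minus the idle value, which would explain the 100% result when idle is 0; the function name is hypothetical.

```python
from typing import Optional


def host_cpu_usage(idle: float, user: float, system: float) -> Optional[float]:
    """Return CPU usage in percent, or None if the API delivered no real data."""
    # All three values being 0 means there are no statistics (e.g. maintenance mode),
    # not that the CPU is fully busy.
    if idle == 0 and user == 0 and system == 0:
        return None
    return 100.0 - idle  # assumption: usage is derived from the idle value


usage = host_cpu_usage(0, 0, 0)
print("RHEV UNKNOWN" if usage is None else f"{usage}% used")
```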

Cheers,
Benjamin

VM CPU usage graphs show values over 100%

CPU usage graph shows values of 'cpu.current.guest' instead of 'cpu', so they can be way over 100%. I made a change to PNP template so it uses relative 'cpu' value instead (up to 100%):

--- check_rhev3.php-orig        2014-06-27 09:29:51.628727867 +0200
+++ check_rhev3.php     2014-06-27 09:55:57.804063199 +0200
@@ -39,7 +39,7 @@

     # process VM CPU stats
     $def[1] .= "CDEF:sp1=100,var1,- ";
-    $def[1] .= "CDEF:sp2=var2 ";
+    $def[1] .= "CDEF:sp2=var1 ";
     $def[1] .= "CDEF:sp3=var3 ";

     $def[1] .= "AREA:sp2#000080:\"Guest      \" ";

Authentication fails if password contains equal sign

If password of API user contains an equal sign (=) authentication fails as = is used for split.

  }elsif ($_ =~ /^password=/){
    my @tmp = split(/=/, $_);

Substituting "password=" with "" seems to be the better approach.
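The underlying fix is to split only at the first equal sign. A small sketch in Python for illustration (the function name is made up); in Perl the same effect can be had by giving split a limit of 2, or by removing the "password=" prefix as suggested above.

```python
def parse_auth_line(line: str) -> tuple:
    """Split 'key=value' at the first '=' only, so the value may contain '='."""
    key, separator, value = line.rstrip("\n").partition("=")
    if not separator:
        raise ValueError(f"not a key=value line: {line!r}")
    return key, value


print(parse_auth_line("password=abc=123==\n"))
```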

RHEV UNKNOWN: Can't open file /var/tmp/xxx for writing: Permission denied

Hi

We get "RHEV UNKNOWN: Can't open file /var/tmp/ODcuMjU0LjIxNS4xNDAtYWRtaW5AaW50ZXJuYWwK for writing: Permission denied" whenever we try to run the plugin as different users.

It would be better to automatically delete the tmp file after the command finishes, so that users don't have to delete the tmp file manually.

Br/Prashanth.P

Improve search filters

Every component like datacenters, hosts,... accepts regular expressions additionally to full names. Some of these regular expressions don't behave in the same way as in search bar of RHEV/oVirt.

E.g. in this example no storage is found when using wild cards instead of datacenter name:
$ ./check_rhev3.pl -H rhevm -a admin@internal:password -D "*" -l storage -s usage
RHEV CRITICAL: storage critical - No storage found!

Include version in debug output

Debug output needs some improvements:

  • remove password as it doesn't help troubleshooting and users have to substitute it when reporting bugs
  • add plugin version of check_rhev3

storage domain usage warnings/critical with fixed values

As suggested by Stefan Marleaux, warning and critical values for storage domains should also accept fixed values (e.g. GB free).
At the moment warning and critical can only be given in percent, which is inflexible when extending a storage.

example:
$ check_rhev3 -H localhost -f .authrc -S isos -l usage -w 80 -c 90
warning if storage usage is more than 80%
critical if storage usage is more than 90%

$ check_rhev3 -H localhost -f .authrc -S isos -l usage -w 200G -c 100G
warning if storage has less than 200G free
critical if storage has less than 100G free
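One way to tell the two threshold styles apart is by the unit suffix, as sketched below in Python for illustration. The function names, the supported suffixes and the use of binary units (1G = 1024^3 bytes) are assumptions of this example, not part of the proposal.

```python
import re

# Assumption: binary units, so 1G = 1024**3 bytes.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_threshold(text: str) -> tuple:
    """'80' -> ('percent_used', 80.0); '200G' -> ('bytes_free', 200 * 1024**3)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text.strip().upper())
    if not match:
        raise ValueError(f"invalid threshold: {text!r}")
    number, unit = float(match.group(1)), match.group(2)
    if unit:
        return "bytes_free", number * UNITS[unit]
    return "percent_used", number


def exceeds(threshold: str, used_bytes: float, total_bytes: float) -> bool:
    """True if the storage violates the threshold."""
    kind, limit = parse_threshold(threshold)
    if kind == "percent_used":
        return used_bytes / total_bytes * 100 > limit
    return (total_bytes - used_bytes) < limit


# 950G used of 1000G leaves 50G free, which is below a 100G limit.
print(exceeds("100G", 950 * 1024 ** 3, 1000 * 1024 ** 3))
```

Plain numbers keep their current meaning (percent used), so existing checks would not change behaviour.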

Set memory usage of vms to 0 if negative

Memory is the amount of memory used on the hypervisor, not the amount of memory used in the guest. If this value is below 0, set it to 0 (negative value is a bug in oVirt/RHEV-API)

check_rhev not providing results after upgrade to RHEVM 3.6

Hi Team,

We recently upgraded our RHEV-M machine from RHEVM 3.2 to RHEVM 3.6.

After the upgrade, the check_rhev3 plugin is not able to get the host data for most of the services, as shown below:

[root@mon01 libexec]# ./check_rhev3.pl -H XXXXXX -a nagios@internal:XXXXXX -R hyp-app-03 -l memory
RHEV UNKNOWN: memory unknown - Performance data not found!
[root@mon01 libexec]# ./check_rhev3.pl -H XXXXXX -a nagios@internal:XXXXXX -R hyp-app-03 -l cpu
RHEV UNKNOWN: cpu unknown - Performance data not found!
[root@mon01 libexec]# ./check_rhev3.pl -H XXXXXX -a nagios@internal:XXXXXXXX -R hyp-app-03 -l load
RHEV UNKNOWN: cpu.load.avg.5m unknown - Performance data not found!

I have replaced the RHEVM hostname and password with XXXX

Wrong return message (minor issue)

./check_rhev3 -H khk9dsg31.ip.tdk.dk -p 443 -a admin@internal:xxxxxxxxx -R khk9dsk30.ip.tdk.dk -l network -s status
RHEV OK: Hosts ok - 6/6 Nics with state Active|nics=6;6;6;0;

It returns ‘Hosts ok’ – I would expect ‘Network ok’ or 'Nics Ok'
Thanks
Peter Calum

Check RHEV Host VMs (-l vms) fails with RHEV UNKNOWN: Host ' ' not found when no vms on host

I have been running this script in Nagios for a while and suddenly it failed.
Running it from the CLI:
check_rhev3.pl -H rhevm -p 443 -f .rhevauth -R RHEVHOST -l vms
RHEV UNKNOWN: Host RHEVHOST not found.

I saw that there is no vm running on RHEVHOST. If I start a VM on RHEVHOST, the "RHEV UNKNOWN" goes away.

I think this is a bug.

Have provided a -vvv output.
check_rhev3.pl -vvv -H rhevm -p 443 -f .rhevauth -R RHEVHOST -l vms
[V] Starting the main script.
[V] This is check_rhev3 version 1.6.0.
[V] Checking which component to monitor.
[D] check_host: Called function check_host.
[V] Host: Checking host RHEVHOST.
[D] check_cstatus: Called function check_cstatus.
[D] check_status: Called function check_status.
[V] Status: Checking status of hostvms.
[D] check_status: Input parameter $components: hostvms
[D] check_status: Input parameter $search: RHEVHOST
[D] check_status: Converting variables.
[D] check_status: Converted variable $components: hostvms
[D] check_status: Converted variable $component: hostvm
[D] rhev_connect: Called function rhev_connect.
[V] REST-API: Connecting to REST-API.
[D] rhev_connect: Input parameter: /vms?search=host%3DRHEVHOST.
[V] REST-API: RHEVM-API URL: https://rhevm.:443/api/vms?search=host%3DRHEVHOST
[V] REST-API: RHEVM-API User:
[V] REST-API: cookie filename:
[D] rhev_connect: Trying cookie authentication.
[D] rhev_connect: Using cookie: JSESSIONID=
[V] REST-API: Cookie authentication failed - using username and password.
[D] rest_api_connect: Called function rest_api_connect.
[V] REST-API: Connecting to REST-API.
[D] rest_api_connect: Input parameter: HTTP::Request=HASH(0x1684af0).
[D] rest_api_connect: Input parameter: LWP::UserAgent=HASH(0x1677db8).
[D] rest_api_connect: Input parameter:
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 28 Mar 2017 12:03:02 GMT
Pragma: No-cache
Vary: Accept-Encoding
Content-Length: 63
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 01:00:00 CET
Client-Date: Tue, 28 Mar 2017 12:03:02 GMT
Client-Peer: xxx.xxx.xxx.xxx:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: CN=Certificate Authority
Client-SSL-Cert-Subject: CN=rhevm
Client-SSL-Cipher: ECDHE-RSA-AES256-GCM-SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
JSESSIONID:
[D] rest_api_connect:

[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 28 Mar 2017 12:03:02 GMT
Pragma: No-cache
Vary: Accept-Encoding
Content-Length: 63
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 01:00:00 CET
Client-Date: Tue, 28 Mar 2017 12:03:02 GMT
Client-Peer: xxx.xxx.xxx.xxx:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: CN=Certificate Authority
Client-SSL-Cert-Subject: CN=rhevm
Client-SSL-Cipher: ECDHE-RSA-AES256-GCM-SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
JSESSIONID:
[D] rhev_connect:

[D] check_status: %result: [D] check_status: Looping through %result
[V] Status: Search pattern RHEVHOST not found.
[D] print_notfound: Called function print_notfound.
[D] print_notfound: Input parameter: Host
[D] print_notfound: Input parameter: RHEVHOST
RHEV UNKNOWN: Host RHEVHOST not found.
