ovido / check_rhev3
This plugin has a new home: https://github.com/rk-it-at/check_rhv
License: GNU General Public License v3.0
The default port for the REST-API is 8443 in check_rhev3, but most installations of RHEV and oVirt now run on port 443.
IMHO it would make sense to change the default port, but this could break existing RHEV/oVirt checks where no port is specified and the REST-API listens on port 8443. Notes in the ChangeLog and ReleaseNotes are therefore essential.
When using memcached in a virtualized guest on RHEV 3.2, rhev-guest-agent reports a negative memory usage value:
memory.used
Memory used (agent)
-1975684957
GAUGE
BYTES
The fix for this (as reported by Jonathan) is to multiply the value by -1.
When checking e.g. the status of all hosts in a cluster, only the number of hosts in a certain state is displayed, but not which hosts are not OK. The same applies to VMs.
When running the check with -v, this information should be displayed.
When check_rhev3 was created, the RHEV 3.0 REST-API reported network interface traffic in MBytes/s and not in Bytes/s as documented in the RHEV developer guide. This seems to have changed now.
Verify traffic against oVirt 3.3 and RHEV 3.2 and change the interface traffic calculation accordingly.
Is it possible to monitor Load/Mem/CPU of individual cluster nodes?
E.g. I have 2 hosts in one cluster; how do I monitor the Load/Mem/CPU of each host?
RHEV OK: Hosts ok - 2/2 Hosts with state UP |Hosts_up=2;2;4;0;
http://www.ovirt.org/develop/release-management/features/engine/cumulative-rx-tx-statistics/
I guess it could be done in the same way as NIC errors are checked now.
Hello
I have tried to install the rpm on my system.
[root@nagios ~]# uname -a
Linux nagios.ft.uam.es 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 15:13:20 CDT 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@nagios ~]# cat /etc/redhat-release
Scientific Linux release 6.8 (Carbon)
I have perl installed:
[root@nagios ~]# perl -v
This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi
...
And the modules:
[root@nagios ~]# instmodsh
Available commands are:
l - List all installed modules
m - Select a module
q - Quit the program
cmd? l
Installed modules are:
Encode::Locale
Getopt::Long
HTTP::Cookies
HTTP::Date
HTTP::Negotiate
IO::HTML
LWP::MediaTypes
Perl
Test::Simple
WWW::RobotRules
XML::NamespaceSupport
XML::SAX
XML::SAX::Base
XML::SAX::Expat
XML::Simple
cmd? q
I get this error message while installing the rpm:
[root@nagios check_rhev3-1.5]# yum install /root/rpmbuild/RPMS/x86_64/nagios-plugins-rhev3-1.5-1.el6.x86_64.rpm
Loaded plugins: priorities, security, versionlock
Setting up Install Process
Examining /root/rpmbuild/RPMS/x86_64/nagios-plugins-rhev3-1.5-1.el6.x86_64.rpm: nagios-plugins-rhev3-1.5-1.el6.x86_64
Marking /root/rpmbuild/RPMS/x86_64/nagios-plugins-rhev3-1.5-1.el6.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package nagios-plugins-rhev3.x86_64 0:1.5-1.el6 will be installed
--> Processing Dependency: /bin/perl for package: nagios-plugins-rhev3-1.5-1.el6.x86_64
--> Processing Dependency: /bin/perl for package: nagios-plugins-rhev3-1.5-1.el6.x86_64
--> Finished Dependency Resolution
--> Finding unneeded leftover dependencies
Found and removing 0 unneeded dependencies
Error: Package: nagios-plugins-rhev3-1.5-1.el6.x86_64 (/nagios-plugins-rhev3-1.5-1.el6.x86_64)
Requires: /bin/perl
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
I appreciate any help. Thanks and regards,
almudena
The plugin exits with "RHEV CRITICAL: Can't connect to RHEVM-API." if the RHEV REST-API doesn't return the expected results.
As the reason for this is not necessarily an unsuccessful connection (it can also be that e.g. statistics for a NIC aren't available), provide better error messages.
Furthermore:
Traffic check for unknown hosts raises a Perl exception:
./check_rhev3.pl -H engine -a admin@internal:password -R 'bla' -l network -s traffic
Use of uninitialized value $output in concatenation (.) or string at ./check_rhev3.pl line 1036.
RHEV UNKNOWN: traffic unknown - |
Hello
I changed the password of admin@internal; since then the plugin does not work:
[V] Starting the main script.
[V] This is check_rhev3 version 1.4.0.
[V] Checking which component to monitor.
[D] check_dc: Called function check_dc
[V] Datacenter: Checking datacenter default.
[D] check_istatus: Called function check_istatus.
[V] Status: Checking status of data_centers.
[D] check_istatus: Input parameter $component: data_centers
[D] check_istatus: Input parameter $search: default
[D] check_istatus: Input parameter $subcheck: storagedomains
[D] check_istatus: Converting variables.
[D] check_istatus: Converted variable $url: datacenters
[D] get_result: Called function get_result.
[D] get_result: Input parameter $_[0]: /datacenters?search=name%3Ddefault
[D] get_result: Input parameter $xml: data_centers
[D] get_result: Input parameter $search: id
[D] rhev_connect: Called function rhev_connect.
[V] REST-API: Connecting to REST-API.
[D] rhev_connect: Input parameter:
[V] REST-API: RHEVM-API URL:
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: cookie filename: czAxdnN5c21ndC1hZG1pbkBpbnRlcm5hbAo=
[D] rhev_connect: Trying cookie authentication.
[D] rhev_connect: No cookie file found - using username and password
[D] rest_api_connect: Called function rest_api_connect.
[V] REST-API: Connecting to REST-API.
[D] rest_api_connect: Input parameter: HTTP::Request=HASH(0x20315190).
[D] rest_api_connect: Input parameter: LWP::UserAgent=HASH(0x2036cdd0).
[D] rest_api_connect: Input parameter: /var/tmp/czAxdnN5c21ndC1hZG1pbkBpbnRlcm5hbAo=.
[V] REST-API: Client-Date: Wed, 12 Oct 2016 07:14:22 GMT
[D] rest_api_connect: RHEV UNKNOWN: Failed to connect to RHEVM-API or received invalid response.
Connecting to the API works via curl.
Is there any cache where the old password could be stored?
Thank you
At the moment the default values for CPU load (5min avg) are 2 (warning) and 4 (critical) if users don't specify other thresholds.
A better solution would be a calculation based on physical CPU cores:
warning = #cores
critical = #cores * 2
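The proposal above can be sketched roughly as follows. This is a Python illustration of the logic only, not the plugin's Perl code; the function name `default_load_thresholds` and its parameters are hypothetical, and it assumes the number of physical cores is sockets times cores per socket.

```python
from typing import Optional, Tuple


def default_load_thresholds(
    sockets: int,
    cores_per_socket: int,
    warn: Optional[float] = None,
    crit: Optional[float] = None,
) -> Tuple[float, float]:
    """Return (warning, critical) thresholds for the 5 min load average.

    Thresholds given by the user are kept; missing ones are derived
    from the core count (assumption: cores = sockets * cores_per_socket).
    """
    total_cores = sockets * cores_per_socket
    if warn is None:
        warn = total_cores          # warning = #cores
    if crit is None:
        crit = total_cores * 2      # critical = #cores * 2
    return warn, crit


if __name__ == "__main__":
    # Host with 2 sockets and 4 cores each, no thresholds given
    print(default_load_thresholds(2, 4))
    # User supplied only a warning threshold
    print(default_load_thresholds(2, 4, warn=3))
```

Deriving only the missing threshold keeps explicit -w/-c values untouched, so existing checks with fixed thresholds would not change behaviour.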
Improve formatting of source code by replacing tabs with spaces.
Some of the formatting done with vim got lost after migrating to eclipse.
Hello,
I have a problem with the check on oVirt 3.6.
I have 2 clusters.
One is OK, but in the second, the data is truncated on the XML:
The options:
-H host -a login@domain:pass -l hosts -C cl1 -o
(I have the same problem with the network )
<summary>
<active>12</active>
<migrating>0</migrating>
<total>12</total>
</summary>
<protocol>stomp</protocol>
<os type="oVirt Node">
<version full_version="3.6 - 0.999.201606071021.el7.centos"RHEV CRITICAL: Error in XML returned from RHEVM - enable debug mode for details.
Can't use an undefined value as a HASH reference at test.pl line 700.
Do you have any idea what causes this issue?
I don't know if the issue is in the script or a limitation of oVirt.
Best regards
The cluster status is based on UP-status of hosts only.
But for example state Maintenance shouldn't result in a critical state, as someone is working on this host and did put it into maintenance mode.
Host status for clusters should have the same logic as vms per cluster. I suggest to use critical for:
Upgraded from 1.2 to 1.3 and now I get :
cat: /var/tmp/YXVzMDJnbW92aXJ0MDEuY29ycC52b2x1c2lvbi5jb20tc3ZjX25hZ2lvc0Bjb3JwLnZvbHVzaW9u: No such file or directory
Not sure if this has been already discussed but with my rhev 3.2 environment, the network traffic status is somewhat confusing. Please see for p1p1 interface.
Status Information: RHEV CRITICAL: traffic critical - em3: 0 Mbit/s p5p1: 0 Mbit/s em1: 0 Mbit/s p1p1: 104000000 Mbit/s em2: 0 Mbit/s (ovprd2)
How much of data is that in MBps?
Thanks!
Paras.
I would like to check the storage:
./check_rhev3.pl -D datacenter -H manager.domain.com -a user@ldap:password -l storage -s usage
But the following warning appears multiple times above the results:
Warning: <logical_unit> element has non-unique value in 'id' key attribute: 2...31 at /usr/lib64/nagios/plugins/custom/check_rhev3.pl line 1715
There are duplicate key attributes. What am I doing wrong?
Example:
$ sudo -u nagios ./plugin-dir/check_rhev3.pl -H my-very-long-rhev-manager-hostname -f /etc/check_rhev/check_rhev3.cfg -D foobar
Unsuccessful stat on filename containing newline at ./plugin-dir/check_rhev3.pl line 1678.
RHEV UNKNOWN: Datacenters foobar not found.
$
It appears to be related to the way the cookie filename is built. It may include a newline when the hostname combined with username@domain is too long.
$ echo my-very-long-rhev-manager-hostname-myusername@my.domaine.com | base64
bXktdmVyeS1sb25nLXJoZXYtbWFuYWdlci1ob3N0bmFtZS1teXVzZXJuYW1lQG15LmRvbWFpbmUu
Y29tCg==
$
Regards,
Eric
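The wrapping behaviour reported above can be reproduced with a short Python sketch (an illustration only, not the plugin's Perl code). It assumes the cookie filename is the base64 encoding of `host-user` followed by a newline, which is what the sample filenames in these reports decode to. Both function names are hypothetical. `base64.encodebytes` wraps its output at 76 characters like the `base64` command does by default, while `base64.urlsafe_b64encode` returns a single line and avoids `/` in the result.

```python
import base64


def cookie_filename_wrapped(host: str, user: str) -> str:
    """Mimic `echo host-user | base64`: trailing newline in the input,
    output wrapped after 76 characters."""
    return base64.encodebytes(f"{host}-{user}\n".encode()).decode()


def cookie_filename_single_line(host: str, user: str) -> str:
    """One possible fix: encode without line wrapping and without the
    trailing newline, using the URL-safe alphabet so that the result
    never contains a path separator."""
    return base64.urlsafe_b64encode(f"{host}-{user}".encode()).decode()


if __name__ == "__main__":
    host = "my-very-long-rhev-manager-hostname"
    user = "myusername@my.domaine.com"
    print(repr(cookie_filename_wrapped(host, user)))
    print(repr(cookie_filename_single_line(host, user)))
```

Note that switching the encoding would change the filename of existing cookie files, so old files under /var/tmp would simply no longer be reused.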
Found that the storage usage check does not give the correct output when multiple storage domains are attached to a datacenter.
/opt/plugins/check_rhev3 -H xxx -f /etc/.rhevpw -D xxx -l storage -s usage
RHEV WARNING: storage warning - 77.92% used (xxxx) |storage_XXX=77.92%;60;80;0;
<type>nfs</type>
<path>/vol/01_kvm/iso</path>
</storage>
<available>289910292480</available>
<used>998579896320</used>
<committed>0</committed>
<storage_format>v1</storage_format>
<type>nfs</type>
<path>/vol/01_kvm2/data1</path>
</storage>
<available>1714765692928</available>
<used>1486058684416</used>
<committed>1471026298880</committed>
<storage_format>v3</storage_format>
</storage_domain>
<type>nfs</type>
</storage>
<available>289910292480</available>
<used>998579896320</used>
<committed>8256000884736</committed>
<storage_format>v3</storage_format>
</storage_domain>
The check against $tmp_state on line 1630 and 1633 means that any thresholds specified will be ignored.
Example:
/usr/local/bin/check_rhev3.pl -H rhevm.somesite.net -f /etc/nagios/rhevm_credentials -C mycluster -l vms -w 50 -c 20
RHEV CRITICAL: Vms critical - 62/66 Vms with state UP |Vms_up=62;50;20;0; restoring_state=0;;;0; image_illegal=0;;;0; powering_down=0;;;0; reboot_in_progress=0;;;0; image_locked=0;;;0; unassigned=0;;;0; unknown=0;;;0; down=4;;;0; wait_for_launch=0;;;0; not_responding=0;;;0; powering_up=0;;;0; saving_state=0;;;0; migrating_to=0;;;0; paused=0;;;0; migrating_from=0;;;0; suspended=0;;;0;
When checking storage domains with the "*" option and the "-l usage" option, the output is not correct. Only the output for the last volume is written to STDOUT. I was expecting information about all volumes.
Example (I currently have 15 volumes in ovirt-engine):
$ ./check_rhev3 -H ovirt-engine -A /ovirt-engine/api -a username:password -S \* -l usage
RHEV OK: storage ok - 29.17% used (Export) |storage_Export=29.17%;60;80;0;
"Export" is the last storage; the 14 other storages are missing from the output.
When using the "-vvv" option, we clearly see that all volumes are analysed, but only the last one is printed, so maybe it is a concatenation bug?
FYI: I have seen this problem with other options using '*', but have not memorised them all.
When testing vm pool usage in oVirt 4.1 (maybe in older versions as well, but not yet verified) the following error occurs:
$ ./check_rhev3.pl -H engine.lab.rk-it.at -f /tmp/auth -A /ovirt-engine/api -P test-centos -l usage
Not an ARRAY reference at ./check_rhev3.pl line 610.
oVirt: ovirt-engine-4.1.0.4-1.el7.centos.noarch
Plugin version: check_rhev3 1.2.0
RHEV OK: Version ok - Default: 3.0
The memory usage of a RHEV-H host is reported incorrectly: the plugin shows 61.80% used, whereas the actual host usage is only 15% (checked using the free -m command).
Debug output is attached; please help me fix the usage value.
check_rhev3.pl -H RHEVM -a admin@internal:ctrls.123 -R RHEV -l memory -s mem
[V] Starting the main script.
[V] Checking which component to monitor.
[V] Host: Checking host RHEV.
[V] Statistics: Checking statistics of hosts.
[V] REST-API: Connecting to REST-API.
[V] REST-API: RHEVM-API URL: https://RHEV-M:8443/api/hosts?search=RHEV
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: RHEVM-API Password: ctrls.123
[V] REST-API: cookie filename: MTA1sTXzLjI0OC44sMC4xOTQtaSdfuYWwK
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 12 Nov 2013 05:40:29 GMT
Pragma: No-cache
Server: Apache-Coyote/1.1
Content-Length: 2631
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 05:30:00 IST
Client-Date: Tue, 12 Nov 2013 05:41:38 GMT
Client-Peer: RHEV-M:8443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=LT/CN=CA-10.101.3.106.73692
Client-SSL-Cert-Subject: /C=US/O=LT/CN=10.101.3.106
Client-SSL-Cipher: DHE-RSA-AES128-SHA
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
X-Powered-By: Servlet 2.5; JBoss-5.0/JBossWeb-2.1
RHEV: f16b5c86-13e5-11e3-aa8e-3440b5a4784a
[V] Statistics: RHEV: f16b5c86-13e5-11e3-aa8e-3440b5a4784a.
[V] Stats: Checking statistics of hosts.
[V] REST-API: Connecting to REST-API.
[V] REST-API: RHEVM-API URL: https://RHEV-M:8443/api/hosts/f16b5c86-13e5-11e3-aa8e-3440b5a4784a/statistics
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: RHEVM-API Password: password
[V] REST-API: cookie filename: MTA1sTXzLjI0OC44sMC4xOTQtaSdfuYWwK
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 12 Nov 2013 05:40:29 GMT
Pragma: No-cache
Server: Apache-Coyote/1.1
Content-Length: 7891
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 05:30:00 IST
Client-Date: Tue, 12 Nov 2013 05:41:38 GMT
Client-Peer: RHEV-M:8443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=LT/CN=CA-10.101.3.106.73692
Client-SSL-Cert-Subject: /C=US/O=LT/CN=10.101.3.106
Client-SSL-Cipher: DHE-RSA-AES128-SHA
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
X-Powered-By: Servlet 2.5; JBoss-5.0/JBossWeb-2.1
[V] Statistics: Getting Memory Usage.
[V] Statistics: Memory Usage of RHEV: 61.80.
[V] Statistics: warning value: 60.
[V] Statistics: critical value: 80.
[V] Statistics: Performance data: |memory=61.80%;60;80;0; memory.cached=0;;;0; memory.used=40632424857.6;;;0; memory.buffers=0;;;0; .
[V] Statistics: Output: 61.80% used (RHEV)
RHEV WARNING: memory warning - 61.80% used (RHEV) |memory=61.80%;60;80;0; memory.cached=0;;;0; memory.used=40632424857.6;;;0; memory.buffers=0;;;0;
Version : 1.4.0
When monitoring the CPU usage of a VM, I get a percentage over 100%.
[root@nagios]# ./check_rhev3 -H RHEVM -a admin@internal:password -M VM -l cpu -w 10 -c 15
RHEV CRITICAL: cpu critical - 170% used (VM) |cpu=170%;10;15;0; cpu.current.guest=100;;;0; cpu.current.hypervisor=68;;;0;
As a workaround I have edited the check_statistics function.
}else{
# get warning and critical values for load based on physical CPU cores
if ((! defined $o_warn || ! defined $o_crit) && $statistics eq "cpu.load.avg.5m"){
my $lref = get_result("/$url/$id{ $key }","cpu","topology");
my %cputop = %{ $lref };
print "[D] check_statistics: \%cputop: " if $o_verbose == 3; print Dumper(%cputop) if $o_verbose == 3;
# warning = #cores
# critical = #cores * 2
$o_warn = $cputop{ 'sockets' } * $cputop{ 'cores' } unless defined $o_warn;
$o_crit = $cputop{ 'sockets' } * $cputop{ 'cores' } * 2 unless defined $o_crit;
}
#-----Updated by Franz Geiser-----
# Get the number of sockets and cores
if ($statistics eq "cpu" && $component eq "vms"){
my $lref = get_result("/$url/$id{ $key }","cpu","topology");
my %cputop = %{ $lref };
print "[D] check_statistics: \%cputop: " if $o_verbose == 3; print Dumper(%cputop) if $o_verbose == 3;
# warning = #cores
# critical = #cores * 2
$cpu_num = $cputop{ 'sockets' } * $cputop{ 'cores' };
}
#------------------------------------
# check cpu, load and memory
my $iret = get_stats($component,$id{ $key },$subcheck,$statistics,$key);
my %temp = %{ $iret };
$rethash{$key} = $temp{$key};
#-----Updated by Franz Geiser-----
if ($statistics eq "cpu" && $component eq "vms"){
# divide the CPU usage by the total number of cores
$rethash{$key}{usage} = int($rethash{$key}{usage}/$cpu_num);
}
#------------------------------------
print "[D] check_statistics: \%rethash: " if $o_verbose == 3; print Dumper(%rethash) if $o_verbose == 3;
}
The endpoint has moved to /ovirt-engine/api, but the code still refers to /api.
This may work in 3.6, but will fail on 4.0.
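A configurable base path would cover both layouts. The following Python sketch only illustrates the idea and is not the plugin's Perl code; the function `build_api_url` and its defaults are hypothetical. The URL shape and the URL-encoded search term follow the debug output quoted in these reports (e.g. `/clusters?search=name%3DWestmereCluster`).

```python
from typing import Optional
from urllib.parse import quote


def build_api_url(
    host: str,
    resource: str,
    search: Optional[str] = None,
    port: int = 443,
    api_path: str = "/ovirt-engine/api",
) -> str:
    """Assemble a REST-API URL from host, port, base path and resource.

    The defaults (port 443, /ovirt-engine/api) are assumptions for newer
    engines; older installations would pass api_path="/api" or port=8443.
    """
    url = f"https://{host}:{port}{api_path.rstrip('/')}/{resource.lstrip('/')}"
    if search:
        # Encode everything, including "=", as in the debug output
        url += "?search=" + quote(search, safe="")
    return url


if __name__ == "__main__":
    print(build_api_url("vega", "clusters", "name=WestmereCluster", api_path="/api"))
    print(build_api_url("engine", "hosts"))
```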
hello
i was testing the plugin and it was working fine on ovirt 3.3.0
now i did an upgrade to 3.3.1 and the plugin is no longer working
any clues?
./check_rhev3 -H ovirt -p 443 -a admin@internal:xxxxxxxxx-C WestmereCluster -vvvv
[V] Starting the main script.
[V] Checking which component to monitor.
[D] check_cluster: Called function check_cluster
[V] Cluster: Checking cluster WestmereCluster.
[V] Cluster: No check is specified, checking cluster host status.
[D] check_cluster_status: Called function check_cluster_status.
[V] Status: Checking status of hosts.
[D] check_cluster_status: Input parameter $subcheck: hosts
[D] get_result: Called function get_result.
[D] get_result: Input parameter $_[0]: /clusters?search=name%3DWestmereCluster
[D] get_result: Input parameter $xml: clusters
[D] get_result: Input parameter $search: id
[D] rhev_connect: Called function rhev_connect.
[V] REST-API: Connecting to REST-API.
[D] rhev_connect: Input parameter: /clusters?search=name%3DWestmereCluster.
[V] REST-API: RHEVM-API URL: https://vega:443/api/clusters?search=name%3DWestmereCluster
[V] REST-API: RHEVM-API User: admin@internal
[V] REST-API: RHEVM-API Password: xxxxxxxxx
[V] REST-API: cookie filename: dmVnYS1hZG1pbkBpbnRlcm5hbAo=
[D] rhev_connect: Trying cookie authentication.
[D] rhev_connect: No cookie file found - using username and password
[D] rest_api_connect: Called function rest_api_connect.
[V] REST-API: Connecting to REST-API.
[D] rest_api_connect: Input parameter: HTTP::Request=HASH(0x2312228).
[D] rest_api_connect: Input parameter: LWP::UserAgent=HASH(0x2956858).
[D] rest_api_connect: Input parameter: /var/tmp/dmVnYS1hZG1pbkBpbnRlcm5hbAo=.
RHEV UNKNOWN: Failed to connect to RHEVM-API or received invalid response.
Since check_rhev3 1.2 it has been possible to use cookie-based authentication instead of username and password authentication for RHEV >= 3.1 and oVirt >= 3.1 environments by specifying the option "-o".
Most users did not notice this option. As discussed on the oVirt mailing list, it is better to try cookie-based authentication first and fall back to username and password.
Hi,
A host in maintenance mode reports 100% CPU:
./check_rhev3 -H khk9dsg32.ip.tdk.dk -t 60 -p 443 -a admin@internal:xxxxxxxxxxxx -R khk9dsk34.ip.tdk.dk -l cpu -s usage
RHEV CRITICAL: cpu critical - 100% used (khk9dsk34.ip.tdk.dk) |cpu=100%;60;80;0; cpu.current.user=0;;;0; cpu.current.system=0;;;0; cpu.current.idle=0;;;0;
I would prefer a WARNING state with 0% load and a text saying that the host is in maintenance.
thanks,
Peter Calum
Denmark
It would be nice to also see in Nagios how many VMs are running on each host.
Example : /check_rhev3 -H khk9dsg31.ip.tdk.dk -p 443 -a admin@internal:xxxxxxx -R khk9dsk30.ip.tdk.dk -l status
Result now:
RHEV OK: Hosts ok - 1/1 Hosts with state UP|Hosts=1;1;1;0;
New Result :
RHEV OK: Host ok - 1/1 Host UP, 12 VM's, 12 active, 0 migrating |Hosts=1;1;1;0; VMS=12;12;0;0;
Thanks
Peter Calum
When checking the running VMs on a host or in a cluster, some users want a given VM state to count as critical, others as warning or OK. So an option should be implemented to specify not only the number of VMs for the critical/warning state but also a range of states.
Add support for monitoring datacenter quotas with the following default values (make it customizable as well):
warning = cluster_soft_limit_pct | storage_soft_limit
critical = 100% utilization
I am trying to build check_rhev3 under rhel 6. I am seeing the following issue. All the required perl modules are installed.
[root@monitor3 check_rhev3-master]# autoconf
configure.in:35: error: possibly undefined macro: AC_PROG_PERL_MODULES
If this token and others are legitimate, please use m4_pattern_allow.
Line 35 is: AC_PROG_PERL_MODULES( LWP::UserAgent, , AC_MSG_ERROR(Missing Perl module LWP::UserAgent))
Is it not seeing the perl modules?
Installed Perl modules:
[root@monitor3 check_rhev3-master]# instmodsh
Available commands are:
l - List all installed modules
m - Select a module
q - Quit the program
cmd? l
Installed modules are:
Encode::Locale
File::Listing
Getopt::Long
HTTP::Cookies
HTTP::Daemon
HTTP::Date
HTTP::Message
HTTP::Negotiate
IO::HTML
LWP
LWP::MediaTypes
Net::HTTP
Perl
WWW::RobotRules
XML::SAX::Expat
Thanks!
-Paras
check_rhev3 should provide more detailed information of the status of datacenters, hosts, vms and storagedomains. At the moment only the amount of UP elements is given.
Expected verbose output:
$ ./check_rhev3.pl -H localhost -a admin@internal:password -D "*" -v
RHEV CRITICAL: Datacenters critical - 1/3 Datacenters with state UP [Details: 2 uninitialized, 1 up]|up=1;3;3;0; uninitialized=2;;;0;
Non-verbose output:
$ ./check_rhev3.pl -H localhost -a admin@internal:password -D "*"
RHEV CRITICAL: Datacenters critical - 1/3 Datacenters with state UP |up=1;3;3;0; uninitialized=2;;;0;
Detailed performance data should help getting an overview of the virtualization environment.
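The requested text output could be assembled along these lines. This is a Python sketch for illustration, not the plugin's Perl code; `format_status` is a hypothetical name, performance data is left out, and the rule "CRITICAL unless every element is UP" is a simplifying assumption that ignores -w/-c thresholds.

```python
from collections import Counter
from typing import Iterable


def format_status(component: str, states: Iterable[str], verbose: bool = False) -> str:
    """Build the status text for a list of element states.

    With verbose=True a [Details: ...] section is appended that lists
    how many elements are in each state, most frequent state first.
    """
    counts = Counter(state.lower() for state in states)
    total = sum(counts.values())
    up = counts.get("up", 0)
    # Simplifying assumption: anything other than "all UP" is critical
    result = "OK" if up == total else "CRITICAL"
    text = (
        f"RHEV {result}: {component} {result.lower()} - "
        f"{up}/{total} {component} with state UP"
    )
    if verbose:
        ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        details = ", ".join(f"{number} {name}" for name, number in ordered)
        text += f" [Details: {details}]"
    return text


if __name__ == "__main__":
    states = ["up", "uninitialized", "uninitialized"]
    print(format_status("Datacenters", states, verbose=True))
    print(format_status("Datacenters", states))
```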
Hi,
Thanks for the script, it's working fine and as expected. One thing I am missing is a check for whether an update is available for the hosts.
Did a little search and found this:
https://ovirt.github.io/ovirt-engine-api-model/master/#services/host
POST /hosts/{host:id}/upgradecheck
I think that can be implemented, but not sure how. I just need a Warning if an Update is available on one of my hosts.
Would be nice if someone has the time to do this.
much appreciated
Regex for nic search isn't strict enough.
When using -n eth0 not only eth0, but also e.g. eth0.100 is a valid interface.
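One way to make the match strict while still accepting regular expressions is to require that the pattern matches the whole interface name. A minimal Python sketch of that idea (not the plugin's Perl code; `match_nics` is a hypothetical helper):

```python
import re
from typing import List


def match_nics(pattern: str, nic_names: List[str]) -> List[str]:
    """Return the interface names that the pattern matches completely.

    re.fullmatch anchors the pattern at both ends, so "eth0" no longer
    matches "eth0.100", while an explicit pattern still can.
    """
    regex = re.compile(pattern)
    return [name for name in nic_names if regex.fullmatch(name)]


if __name__ == "__main__":
    nics = ["eth0", "eth0.100", "eth1"]
    print(match_nics("eth0", nics))              # only the untagged interface
    print(match_nics(r"eth0(\.\d+)?", nics))     # eth0 and its VLAN interfaces
```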
Hi,
First, thanks for the script!
I have oVirt Engine 4.1 and the script is working with the default admin@internal user. I want to create a new user (preferably read-only) just for the API, but I can't get it to work. I tried with a new admin user and tried to add it to the viewer group, which I could not find.
I also couldn't find proper oVirt API documentation.
Thanks for your / someone's help.
Hello,
I have installed the new Release of your plugin, I can use it in the console and all works great, but I have this message in the Nagios portal.
I've tried a lot of options -a -f, without success, I suspect the credentials but not sure. The password has an @ character, I've tried to use escape \ but I still have the error.
Can you help me to debug it,
Thanks Alain
Hi,
when a hypervisor is in maintenance mode in ovirtm/rhevm the plugin reports (wrongly) 100% cpu usage (and state CRITICAL depending on thresholds)
This is probably because the api sends cpu.idle, cpu.user and cpu.system as 0 value. In that case (all three values: 0) the plugin should report UNKNOWN state.
Cheers,
Benjamin
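The suggested handling can be sketched as follows in Python (an illustration, not the plugin's Perl code). The function name is hypothetical, and the sketch assumes that usage is computed as 100 minus the idle value, which would explain why all-zero statistics show up as 100%. The default thresholds 60 and 80 are taken from the performance data quoted in these reports; comparing with `>=` is an assumption.

```python
from typing import Optional, Tuple


def cpu_usage_state(
    user: float,
    system: float,
    idle: float,
    warn: float = 60.0,
    crit: float = 80.0,
) -> Tuple[str, Optional[float]]:
    """Return (state, usage in percent) for one host.

    If user, system and idle are all 0, the API delivered no usable
    statistics (e.g. host in maintenance), so the state is UNKNOWN.
    """
    if user == 0 and system == 0 and idle == 0:
        return "UNKNOWN", None
    usage = 100.0 - idle  # assumption: usage is derived from idle time
    if usage >= crit:
        return "CRITICAL", usage
    if usage >= warn:
        return "WARNING", usage
    return "OK", usage


if __name__ == "__main__":
    print(cpu_usage_state(0, 0, 0))     # host in maintenance
    print(cpu_usage_state(10, 5, 85))   # lightly loaded host
```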
CPU usage graph shows values of 'cpu.current.guest' instead of 'cpu', so they can be way over 100%. I made a change to PNP template so it uses relative 'cpu' value instead (up to 100%):
--- check_rhev3.php-orig 2014-06-27 09:29:51.628727867 +0200
+++ check_rhev3.php 2014-06-27 09:55:57.804063199 +0200
@@ -39,7 +39,7 @@
# process VM CPU stats
$def[1] .= "CDEF:sp1=100,var1,- ";
- $def[1] .= "CDEF:sp2=var2 ";
+ $def[1] .= "CDEF:sp2=var1 ";
$def[1] .= "CDEF:sp3=var3 ";
$def[1] .= "AREA:sp2#000080:\"Guest \" ";
If the password of the API user contains an equal sign (=), authentication fails because = is used as the split delimiter.
}elsif ($_ =~ /^password=/){
my @tmp = split(/=/, $_);
Substituting "password=" with "" seems to be the better approach.
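The underlying idea is to split only at the first equal sign, so that everything after `password=` is kept. A Python sketch of that idea (not the plugin's Perl code; `parse_auth_line` is a hypothetical helper, and the `username=` key in the usage example is assumed by analogy with `password=`):

```python
from typing import Optional, Tuple


def parse_auth_line(line: str) -> Optional[Tuple[str, str]]:
    """Split a "key=value" line at the first "=" only.

    Further "=" characters stay part of the value, so passwords that
    contain "=" are preserved. Returns None if the line has no "=".
    """
    key, separator, value = line.rstrip("\n").partition("=")
    if not separator:
        return None
    return key, value


if __name__ == "__main__":
    print(parse_auth_line("username=admin@internal\n"))
    print(parse_auth_line("password=abc=123==\n"))
```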
Check network interface errors with check_rhev3, too.
Note: This requires implementation of temp files.
Hi
Getting "RHEV UNKNOWN: Can't open file /var/tmp/ODcuMjU0LjIxNS4xNDAtYWRtaW5AaW50ZXJuYWwK for writing: Permission denied" whenever we try to run the plugin as different users.
It would be better to delete the temp file automatically after the command finishes, so that users do not have to delete it manually.
Br/Prashanth.P
Every component like datacenters, hosts,... accepts regular expressions additionally to full names. Some of these regular expressions don't behave in the same way as in search bar of RHEV/oVirt.
E.g. in this example no storage is found when using wild cards instead of datacenter name:
$ ./check_rhev3.pl -H rhevm -a admin@internal:password -D "*" -l storage -s usage
RHEV CRITICAL: storage critical - No storage found!
Debug output needs some improvements:
As suggested by Stefan Marleaux, it should also be possible to give warning and critical values for storage domains as fixed values (e.g. GB free).
At the moment warning and critical thresholds can only be percentages, which is inflexible when extending a storage.
Example:
$ check_rhev3 -H localhost -f .authrc -S isos -l usage -w 80 -c 90
warning if storage usage is more than 80%
critical if storage usage is more than 90%
$ check_rhev3 -H localhost -f .authrc -S isos -l usage -w 200G -c 100G
warning if storage has less than 200G free
critical if storage has less than 100G free
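The two threshold styles could be told apart by their suffix. The Python sketch below illustrates this and is not the plugin's Perl code; `parse_threshold` and `threshold_exceeded` are hypothetical names, and treating "G" as 1024^3 bytes is an assumption. The byte counts in the usage example are the `<available>` and `<used>` values quoted in one of the storage reports above.

```python
import re
from typing import Tuple

# Assumption: suffixes are binary multiples (G = 1024**3 bytes)
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_threshold(text: str) -> Tuple[str, float]:
    """Parse "80" as percent used, "200G" as bytes that must stay free."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT])?", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"invalid threshold: {text!r}")
    value = float(match.group(1))
    unit = match.group(2)
    if unit is None:
        return "percent_used", value
    return "bytes_free", value * _UNITS[unit.upper()]


def threshold_exceeded(threshold: Tuple[str, float], used: int, available: int) -> bool:
    """True if usage is above the percentage or free space is below the limit."""
    kind, limit = threshold
    if kind == "percent_used":
        total = used + available
        return total > 0 and 100.0 * used / total > limit
    return available < limit


if __name__ == "__main__":
    used, available = 998579896320, 289910292480
    print(threshold_exceeded(parse_threshold("80"), used, available))
    print(threshold_exceeded(parse_threshold("300G"), used, available))
```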
Memory is the amount of memory used on the hypervisor, not the amount of memory used in the guest. If this value is below 0, set it to 0 (negative value is a bug in oVirt/RHEV-API)
The RHEV 3.2 REST-API returns "Operation Failed" for network interface statistics of tagged NICs (e.g. eth0.100), so check_rhev3 exits with "RHEV CRITICAL: Can't connect to RHEVM-API.".
Only RHEV 3.2 is affected; all other RHEV versions (3.0, 3.1) and oVirt are working fine.
Update PNP templates for detailed stati of datacenters, hosts and vms (#18)
Hi Team,
We recently upgraded rhevm machine from RHEVM3.2 to RHEVM3.6
After the upgrade check_rhevm3 plugin is not able to get the data for host for most of the services as below
[root@mon01 libexec]# ./check_rhev3.pl -H XXXXXX -a nagios@internal:XXXXXX -R hyp-app-03 -l memory
RHEV UNKNOWN: memory unknown - Performance data not found!
[root@mon01 libexec]# ./check_rhev3.pl -H XXXXXX -a nagios@internal:XXXXXX -R hyp-app-03 -l cpu
RHEV UNKNOWN: cpu unknown - Performance data not found!
[root@mon01 libexec]# ./check_rhev3.pl -H XXXXXX -a nagios@internal:XXXXXXXX -R hyp-app-03 -l load
RHEV UNKNOWN: cpu.load.avg.5m unknown - Performance data not found!
I have replaced the RHEVM hostname and password with XXXX
./check_rhev3 -H khk9dsg31.ip.tdk.dk -p 443 -a admin@internal:xxxxxxxxx -R khk9dsk30.ip.tdk.dk -l network -s status
RHEV OK: Hosts ok - 6/6 Nics with state Active|nics=6;6;6;0;
It returns ‘Hosts ok’ – I would expect ‘Network ok’ or 'Nics Ok'
Thanks
Peter Calum
I have been running this script in Nagios for a while and suddenly it failed.
Running it from the CLI:
check_rhev3.pl -H rhevm -p 443 -f .rhevauth -R RHEVHOST -l vms
RHEV UNKNOWN: Host RHEVHOST not found.
I saw that there is no VM running on the RHEVHOST. If I start a VM on the RHEVHOST, the "RHEV UNKNOWN" goes away.
I think this is a bug.
Have provided a -vvv output.
check_rhev3.pl -vvv -H rhevm -p 443 -f .rhevauth -R RHEVHOST -l vms
[V] Starting the main script.
[V] This is check_rhev3 version 1.6.0.
[V] Checking which component to monitor.
[D] check_host: Called function check_host.
[V] Host: Checking host RHEVHOST.
[D] check_cstatus: Called function check_cstatus.
[D] check_status: Called function check_status.
[V] Status: Checking status of hostvms.
[D] check_status: Input parameter $components: hostvms
[D] check_status: Input parameter $search: RHEVHOST
[D] check_status: Converting variables.
[D] check_status: Converted variable $components: hostvms
[D] check_status: Converted variable $component: hostvm
[D] rhev_connect: Called function rhev_connect.
[V] REST-API: Connecting to REST-API.
[D] rhev_connect: Input parameter: /vms?search=host%3DRHEVHOST.
[V] REST-API: RHEVM-API URL: https://rhevm.:443/api/vms?search=host%3DRHEVHOST
[V] REST-API: RHEVM-API User:
[V] REST-API: cookie filename:
[D] rhev_connect: Trying cookie authentication.
[D] rhev_connect: Using cookie: JSESSIONID=
[V] REST-API: Cookie authentication failed - using username and password.
[D] rest_api_connect: Called function rest_api_connect.
[V] REST-API: Connecting to REST-API.
[D] rest_api_connect: Input parameter: HTTP::Request=HASH(0x1684af0).
[D] rest_api_connect: Input parameter: LWP::UserAgent=HASH(0x1677db8).
[D] rest_api_connect: Input parameter:
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 28 Mar 2017 12:03:02 GMT
Pragma: No-cache
Vary: Accept-Encoding
Content-Length: 63
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 01:00:00 CET
Client-Date: Tue, 28 Mar 2017 12:03:02 GMT
Client-Peer: xxx.xxx.xxx.xxx:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: CN=Certificate Authority
Client-SSL-Cert-Subject: CN=rhevm
Client-SSL-Cipher: ECDHE-RSA-AES256-GCM-SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
JSESSIONID:
[D] rest_api_connect:
[V] REST-API: Cache-Control: no-cache
Connection: close
Date: Tue, 28 Mar 2017 12:03:02 GMT
Pragma: No-cache
Vary: Accept-Encoding
Content-Length: 63
Content-Type: application/xml
Expires: Thu, 01 Jan 1970 01:00:00 CET
Client-Date: Tue, 28 Mar 2017 12:03:02 GMT
Client-Peer: xxx.xxx.xxx.xxx:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: CN=Certificate Authority
Client-SSL-Cert-Subject: CN=rhevm
Client-SSL-Cipher: ECDHE-RSA-AES256-GCM-SHA384
Client-SSL-Socket-Class: IO::Socket::SSL
Client-SSL-Warning: Peer certificate not verified
JSESSIONID:
[D] rhev_connect:
[D] check_status: %result: [D] check_status: Looping through %result
[V] Status: Search pattern RHEVHOST not found.
[D] print_notfound: Called function print_notfound.
[D] print_notfound: Input parameter: Host
[D] print_notfound: Input parameter: RHEVHOST
RHEV UNKNOWN: Host RHEVHOST not found.