centreon / centreon-engine
Extremely fast monitoring scheduler, forked from Nagios
License: GNU General Public License v2.0
Hello,
It would be great if a service's notification interval were no longer applied once the service enters an escalation period.
For example:
I have a service that notifies contacts every two hours and an escalation that notifies every 5 minutes.
The escalation begins at 10:00 and the last notification for my service was sent at 9:30.
Currently, the next notification is sent at 11:30 rather than at 10:00. I think this is a problem, and fixing it could be an enhancement for the next version.
Regards,
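The timing in this request can be worked through with a small sketch. It is only an illustration of the arithmetic in the post, not Centreon Engine code; the function name and the rule used for the requested behaviour (notify at the start of the escalation if that comes first) are assumptions, and the calendar date is arbitrary.

```python
from datetime import datetime, timedelta

def next_notification_times(last_sent, service_interval, escalation_start):
    """Return (current_behaviour, requested_behaviour) for the next notification."""
    # Behaviour described in the report: the service interval keeps running.
    current = last_sent + service_interval
    # Requested behaviour (assumed rule): if the escalation opens before the
    # service interval elapses, notify when the escalation opens.
    if escalation_start > last_sent:
        requested = min(current, escalation_start)
    else:
        requested = current
    return current, requested

day = datetime(2016, 1, 4)  # arbitrary date, only the times matter
current, requested = next_notification_times(
    last_sent=day.replace(hour=9, minute=30),
    service_interval=timedelta(hours=2),
    escalation_start=day.replace(hour=10, minute=0),
)
print(current.strftime("%H:%M"), requested.strftime("%H:%M"))
```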
Good day,
Since upgrading my Centreon platform to version 2.7.1, all my contacts that have notifications disabled have started receiving all notifications.
Does anybody have the same issue?
I tested the simplest possible configuration, that is, without any inheritance (no contact group, no contact template, no host template, ...).
A contact with notifications disabled is linked to a host, and the user still receives notifications for both UP and DOWN states.
Thanks for your help.
In normal operation, the Centreon Engine check timeout should never be reached. When it is, however, nodes should be set to UNKNOWN, as that is precisely the meaning of a timeout: we don't know the real node state. It could be caused by multiple factors: overloaded machine, network latency, bad check plugin, ...
Comment from Mathieu Cinquin:
I just ran several tests on how centreon-engine takes timeperiods into account. After applying a timeperiod with an exclude timeperiod, if you force a check during the excluded timeperiod, the date of the next check is more consistent and no longer reflects the exclude timeperiod.
See issue #3 for explanations.
I hope this is clear.
Hi,
I ran into a problem where centengine.log shows a lot of errors like this:
Error: Service check command execution failed: Connector 'Perl Connector' failed to restart
Do you have any idea what can cause this?
How can I restart the connector myself?
Centreon Engine no longer seems to detect parent/child relationships: when the parent goes down, the child is still seen as DOWN and not as UNREACHABLE.
This is a feature request.
With non-OK states, we already can set via the retries setting how often the check is retried until the host is considered down (hard state). I want to have the same for the OK state, i.e., to be able to enforce multiple successful retries before the host is considered OK.
Sometimes hosts (or services) are down and recover for a short time before failing again. This is not the same as flapping, since flapping checks for frequent OK<->non-OK changes. My case is more like non-OK, non-OK, ...., non-OK, OK, non-OK, ... , non-OK
The problem is that the short period of OK causes two notifications (a host-up and another host-down notification).
Escalation schemes also do not work well, because the downtime counter starts again at zero.
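A minimal sketch of the requested behaviour, assuming a hypothetical `recovery_attempts` setting that mirrors the existing retries setting for non-OK states. This is not how Centreon Engine is implemented; it only shows how requiring several consecutive OK results would suppress the brief recovery described above.

```python
def hard_states(results, max_check_attempts, recovery_attempts):
    """Return the hard state after each check result ("OK" or "DOWN").

    recovery_attempts is a hypothetical setting: the number of consecutive
    OK results required before a DOWN host is considered OK again.
    """
    hard = "OK"
    streak_state, streak = None, 0
    history = []
    for result in results:
        # Count consecutive identical results.
        if result == streak_state:
            streak += 1
        else:
            streak_state, streak = result, 1
        needed = recovery_attempts if result == "OK" else max_check_attempts
        if result != hard and streak >= needed:
            hard = result
        history.append(hard)
    return history

checks = ["DOWN", "DOWN", "DOWN", "OK", "DOWN", "DOWN", "DOWN"]
# With a single OK sufficing, the short recovery flips the hard state twice.
print(hard_states(checks, max_check_attempts=3, recovery_attempts=1))
# With two consecutive OKs required, the host stays DOWN throughout.
print(hard_states(checks, max_check_attempts=3, recovery_attempts=2))
```

With `recovery_attempts=2` the hard state never leaves DOWN during the short recovery, so neither the extra pair of notifications nor the reset of the downtime counter would occur in this model.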
In the database, "last_state_change" and "last_hard_state_change" have the same value.
They should not be the same.
centengine 1.5.1
I have a custom macro with no value. I set its value to 'XXX', for example (with an external command).
I then remove the macro (with an external command) and the macro is correctly removed. But when I use the external commands 'ENABLE_PASSIVE_SVC_CHECKS' and 'DISABLE_PASSIVE_SVC_CHECKS', the macro value comes back.
Hi,
I have a command that should report by email how long a service has been down, but it always returns 0. With hosts the
As reported by Thomas Esteve:
Hello,
During a fresh installation on an Ubuntu 14.10 (Utopic) server,
after installing centreon-clib, centreon-engine and centreon-broker from source,
when running the 'install.sh -i' script of centreon-2.5.4,
at the question:
What is the Monitoring engine init.d script ?
/etc/init.d/centengine
the script does not find /etc/init.d/centengine, and for good reason: for the cmake step of centreon-engine, the documentation specifies -DWITH_STARTUP_DIR=/etc/init.d,
whereas what is generated is not an init.d script but a startup.conf file that must be located in /etc/init.
No matter: 'mv /etc/init.d/centengine.conf /etc/init' puts the generated script back in its proper place,
BUT replacing the init.d path with the startup.conf path in the installer still does not work:
What is the Monitoring engine init.d script ?
/etc/init/centengine.conf
The script loops forever on the grounds that this file is not executable.
startup.conf files are launched through the 'service mydaemon start' command and are indeed not executable.
WORKAROUND:
I tried to cheat by submitting
/usr/bin/service centengine
service centengine
without success... so I fooled it by entering
/bin/echo
but I end up with a polluted sudoers file, and probably with some polluted fields in the database that I have not looked for yet.
Hi.
I previously used Centreon 2.5.4 with Nagios 3.0. There, all the configuration for who should receive notifications was in the host templates. I had no notification configuration for people in the services, since the services inherited the contacts from the host templates.
Now I've set up a new server with Centreon 2.6.1 and replaced Nagios with Centengine 1.4.11. With the new setup and the same host and service configuration, I get notifications when a host goes down, but when a service goes non-OK I don't get notified.
If I put a contact for notification directly in the service definition, I get notified when a service goes down. So it seems the problem is that the service is no longer able to inherit the host's contacts for service alerts.
When I check centengine.log I can see the service alert in the log, but no notification is created unless I configure a contact directly in the service.
I've also tried with Centengine 1.4.13, but the problem is the same.
Kind regards
Bjorn Tore.
Comment from Kevin Duret:
Currently, check_command is mandatory for a service.
I think it should not be, because we can have passive checks that don't use this field.
Hello,
Is there a way to schedule downtime for a BA (BAM) with an EXTERNAL COMMAND?
If not, can I consider that a BA is a service and use "SCHEDULE_SVC_DOWNTIME", where
service_description and host_name are those defined in centreon-bam-services.cfg?
If I understand correctly, centreon-bam-services.cfg is defined on both the poller and the central server, so must I run the external command twice, once for each (poller and central)?
Thank you for your help.
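For reference, the sketch below builds a `SCHEDULE_SVC_DOWNTIME` line in the standard Nagios external-command format (`[timestamp] COMMAND;arg;arg;...`). It does not answer whether a BA can be targeted this way; the host name and service description used here are hypothetical placeholders standing in for whatever centreon-bam-services.cfg actually defines.

```python
import time

def schedule_svc_downtime(host_name, service_description, start, end,
                          author, comment, fixed=True, trigger_id=0,
                          duration=0, now=None):
    """Build a SCHEDULE_SVC_DOWNTIME external command line (Nagios format)."""
    timestamp = int(time.time() if now is None else now)
    fields = [
        "SCHEDULE_SVC_DOWNTIME",
        host_name,
        service_description,
        str(int(start)),          # downtime start, epoch seconds
        str(int(end)),            # downtime end, epoch seconds
        "1" if fixed else "0",    # fixed (1) or flexible (0) downtime
        str(trigger_id),          # 0 means not triggered by another downtime
        str(duration),            # seconds, used for flexible downtimes
        author,
        comment,
    ]
    return "[{}] {}".format(timestamp, ";".join(fields))

# Hypothetical names: replace with the values from centreon-bam-services.cfg.
line = schedule_svc_downtime("bam_host", "ba_example", 2000, 5600,
                             "admin", "maintenance", duration=3600, now=1000)
print(line)
```

The resulting line would then be written to the engine's external command file on each server that must honour the downtime.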
CES 3.3 iso, updated to Centreon 2.7.5
Jul 22 12:27:40 centreon kernel: centengine[1391]: segfault at 5791b470 ip 000000316913382f sp 00007fca775f80e8 error 4 in libc-2.12.so[3169000000+18a000]
/etc/init.d/centengine: line 176: kill: (24702) - No such process
centengine status: /usr/sbin/centengine dead with existing [FAILED]: /var/run/centengine.pid
centcore (pid 14044) is running...
cbd (pid 14091) is running...
cbd (pid 14124) is running...
"# service centengine start" works to restart it, also see bug #51
With Centreon Engine 1.4.15, event handlers are not triggered, even though the debug file marks them as launched...
Charles Judith wrote here: https://forge.centreon.com/issues/4419
I think it would be good to take this point into account so that the new generation of RPMs (new version of Engine) starts with this fix.
Hi,
Here are the home directories for the Centreon users:
centreon:x:101:102::/var/spool/centreon:/bin/bash
centreon-broker:x:102:103::/var/spool/centreon-broker:/bin/bash
centreon-engine:x:104:105::/var/lib/centreon-engine:/bin/bash
I think it would be better to have consistent home directories for these users.
What do you think?
Thanks,
Hi,
This is just a question. I've read that centreon-engine was started from Nagios 3.x.
Can you tell me whether centreon-engine is compatible with all Nagios plugins and works in the same way?
Thanks
Hello,
We have recently had a memory leak problem related to the "centengine" process. After about 9 days, we are forced to restart the process.
We spotted this in the logs:
_[1464158699] [14759] Error: can't execute service notification 'DTS' : [/ml/apps/build/centreon-clib-1.4.2/src/process_posix.cc:466(static pid_t com::centreon::process::create_process_with_setpgid(char**, char**))] could not create process: Cannot allocate memory
I am attaching a graph showing the memory usage of centengine (2 days).
Can you help us fix this problem, please?
We use Centreon 2.6.0 with Centreon Engine 2.4.12 and Centreon Broker 2.8.2, and it seems that since the last updates recovery notifications are no longer sent. I checked the whole configuration:
Every other type of notification (warning, error) is sent correctly. Is this a known issue?
Hi,
I have a service that has been in the OK state for 7 months. When the service state changes to UNREACHABLE (because of the parent host), the state duration stays at 7 months. I think the state duration should be reset.
Best regards,
As reported by Loic Fontaine:
Hi,
There is timezone management in centreon-engine 1.5.x. It would be great to have a macro
Best regards,
To reproduce the bug:
As reported by Francis Fachinan:
The default permissions defined on the centreon-engine home directory are drwxrwxr-x centreon-engine centreon-engine (775).
When the SSH keys for centreon-engine are shared with different servers, the permissions need to be changed manually in order to respect SSH security rules.
However, when centreon-engine is updated, the permissions are set back to 775.
This causes problems when centreon-engine tries to authenticate on other servers.
Is it possible not to modify the permissions on the home directory when updating centreon-engine?
Max Mongardini says :
When adding a new resource, the macro expression field only allows 256 characters.
Although 256 chars sounds like a lot, I found myself in a situation where I had to create a resource with plenty of key=values cookies to pass to a command and the string was truncated.
This ticket comes from the Centreon forge, id 6354.
Description
This feature will implement timezone management in Centreon Engine. It will be possible to specify which timezone to use per host, service or contact.
Timeperiod identification
Technically a timeperiod is now identified by its name and a timezone. The old behavior is still supported (no timezone specified) but under the hood this timeperiod will be identified by its name and the local/UTC (don't know yet) timezone.
Timezone configuration
Most objects (hosts, services, escalations, dependencies, ...) get a new timezone parameter that is inherited when applicable. This parameter is a string referencing a timezone, following the POSIX syntax of the TZ environment variable. If this parameter is specified for the object in question, it affects the timeperiods used by this object (unless such periods use a specific timezone of their own). Examples will explain more than text.
// Some widely used timeperiod.
define timeperiod {
  timeperiod_name workhours
  monday 9:00-18:00
  tuesday 9:00-18:00
  wednesday 9:00-18:00
  thursday 9:00-18:00
  friday 9:00-18:00
}
// Host will be checked from monday to friday, from 9am to 6pm Paris local time.
define host {
  host_name my_host
  check_period workhours
  timezone :Europe/Paris
}
// Service will be checked from monday to friday, from 9am to 6pm Paris local time (timezone is inherited).
define service {
  host my_host
  service_description my_service1
}
// Service will be checked from monday to friday, from 9am to 6pm New York local time (timezone override).
define service {
  host my_host
  service_description my_service2
  check_period workhours,:America/New_York
}
Timezone computation
Timeperiod computation is mainly based on mktime. We will override the program timezone by setting TZ to the target timezone and calling tzset() before any computation involving mktime (or other).
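The TZ override described above can be sketched as follows. This is an illustration of the technique (set TZ, call tzset(), then mktime), not the engine's code; the helper name is made up, `time.tzset()` is only available on Unix, and the example uses explicit POSIX TZ rules for Central European and US Eastern time so that it does not depend on an installed timezone database.

```python
import os
import time

def mktime_in_timezone(tz, year, month, day, hour, minute):
    """Epoch seconds for a wall-clock time interpreted in the timezone `tz`."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()  # make the C library pick up the new TZ value
    try:
        # tm_isdst = -1 lets mktime decide whether DST applies.
        return int(time.mktime((year, month, day, hour, minute, 0, 0, 0, -1)))
    finally:
        # Restore the program's original timezone.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()

PARIS = "CET-1CEST,M3.5.0,M10.5.0/3"   # POSIX rule for Central European time
NEW_YORK = "EST5EDT,M3.2.0,M11.1.0"    # POSIX rule for US Eastern time

# Start of the 9:00 work period on Monday 2016-01-04 in each timezone.
paris_start = mktime_in_timezone(PARIS, 2016, 1, 4, 9, 0)
new_york_start = mktime_in_timezone(NEW_YORK, 2016, 1, 4, 9, 0)
print((new_york_start - paris_start) // 3600)  # hours between the two starts
```

Restoring TZ in the `finally` block matters in a long-running daemon, because the override is process-wide and would otherwise leak into every later time computation.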
Comment from Mathieu Cinquin:
I just ran several tests on how centreon-engine takes timeperiods into account. I noticed that when a timeperiod with include or exclude exceptions is changed or added on services, it is not applied after a restart of centreon-engine. For it to be taken into account, a forced check must be performed on the services concerned.
Example:
- create a new timeperiod (monday->friday | 17:15-18:00) with an exclude timeperiod (Time Range exceptions xxxx/xx/xx - xxxx/xx/xx | 00:00 - 24:00)
- update the timeperiod on the service concerned
- export the configuration and restart centreon-engine
- verify the next check => NOK: the next check date is not correct
- force a check on the service concerned
- verify the next check => OK
I hope this is clear.
centengine 1.5.1
I have an empty macro for a service. I have:
]> select * from customvariables where name = 'TICKET_ID' AND service_id = 761;
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+-------+
| customvariable_id | host_id | name | service_id | default_value | modified | type | update_time | value |
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+-------+
| 1087797 | 101 | TICKET_ID | 761 | | 0 | 1 | 1466435753 | |
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+-------+
I add a value with an external command:
> select * from customvariables where name = 'TICKET_ID' AND service_id = 761;
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+------------------+
| customvariable_id | host_id | name | service_id | default_value | modified | type | update_time | value |
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+------------------+
| 1087797 | 101 | TICKET_ID | 761 | | 1 | 1 | 1466435968 | 2016062085000209 |
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+------------------+
I remove the value:
]> select * from customvariables where name = 'TICKET_ID' AND service_id = 761;
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+-------+
| customvariable_id | host_id | name | service_id | default_value | modified | type | update_time | value |
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+-------+
| 1087797 | 101 | TICKET_ID | 761 | | 1 | 1 | 1466435995 | |
+-------------------+---------+-----------+------------+---------------+----------+------+-------------+-------+
I restart the engine:
> select * from customvariables where name = 'TICKET_ID' AND service_id = 761;
+-------------------+---------+-----------+------------+------------------+----------+------+-------------+------------------+
| customvariable_id | host_id | name | service_id | default_value | modified | type | update_time | value |
+-------------------+---------+-----------+------------+------------------+----------+------+-------------+------------------+
| 1095495 | 101 | TICKET_ID | 761 | 2016062085000209 | 0 | 1 | 0 | 2016062085000209 |
+-------------------+---------+-----------+------------+------------------+----------+------+-------------+------------------+
As reported by Martin Soltis:
I don't know what to name this topic.
Last night, notifications from Centreon Engine went mad: it sent many notifications within a few seconds.
And of course I have escalations set to send notifications after 10 minutes.
There is also an interesting warning before or after every notification:
[1409706224] [15656] Warning: Check result queue contained results for service 'temp_core1' on host 'r2.example.com', but the service could not be found! Perhaps you forgot to define the service in your config files ?
I tried to find some information and found the same warning, but it was related to Nagios with two running instances, so I think this is a different problem.
Centreon Engine is 1.4.6.
Plugin execution may lead to warnings (this is the case today with Centreon Plugins), these warnings are displayed in stderr. This output is not redirected anywhere at the moment so a user cannot see them.
The goal is to redirect these messages to the logs table in the DB; users will then be able to see them next to the other Engine logs.
To be discussed : find a way (new flag?) in DB to make it possible to filter them from other logs?
If so, Web is likely to be impacted, so we will need to create a new ticket/issue in their tracker.
CES 3.3 iso, updated to Centreon 2.7.5, right after reloading poller config, line in /var/log/messages:
Jul 22 11:21:38 centreon kernel: centengine[24204] general protection ip:316913382f sp:7f24560320e8 error:0 in libc-2.12.so[3169000000+18a000]
From the shell:
/etc/init.d/centengine: line 176: kill: (17971) - No such process
centengine status: /usr/sbin/centengine dead with existing [FAILED]: /var/run/centengine.pid
centcore (pid 14044) is running...
cbd (pid 14091) is running...
cbd (pid 14124) is running...
Using the web interface reload option doesn't restart it, "# service centengine start" does, no further problems apparent yet.
Sometimes, on reload, services are duplicated, and sometimes deleted services are not even removed.
Everything returns to normal after a restart.
We should add the possibility to recursively evaluate all macros in Centreon Engine.
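A minimal sketch of what recursive macro evaluation could look like, assuming the `$NAME$` macro syntax and a plain dictionary of macro values. The function, the depth limit and the sample values are all hypothetical; this is not the engine's macro processor.

```python
import re

MACRO = re.compile(r"\$([A-Z0-9_]+)\$")

def expand(text, macros, max_depth=10):
    """Expand $NAME$ macros repeatedly until nothing changes.

    Unknown macros are left untouched. max_depth is a hypothetical guard
    against macros that reference each other in a cycle.
    """
    for _ in range(max_depth):
        expanded = MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), text)
        if expanded == text:
            return expanded
        text = expanded
    raise ValueError("macro expansion did not converge (possible cycle)")

# Hypothetical values: a custom macro whose value contains other macros.
macros = {
    "_HOSTURL": "http://$HOSTADDRESS$:$_HOSTPORT$",
    "HOSTADDRESS": "10.0.0.1",
    "_HOSTPORT": "8080",
}
print(expand("check_http -u $_HOSTURL$", macros))
```

A single pass would stop at `http://$HOSTADDRESS$:$_HOSTPORT$`; repeating the substitution until the text is stable is what makes the evaluation recursive.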
Hello,
I added an escalation for many services with this parameter:
It should notify the contacts linked to the escalation every minute, but it sends a notification every 5 minutes.
So the escalation works, because Centreon-Engine uses the right contacts, but the "Notification Interval" parameter doesn't work.
Regards,
The additive inheritance flag (+) has at least two issues in the current version of Centreon Engine.
First, this flag itself should not be inherited (to prevent additive inheritance in upstream templates if they exist).
Second, additive inheritance should work only with the first template, not all of them.
Hi,
Could you add an option to display warning messages when we test the configuration?
Hello,
Several times I have observed a latency problem; it occurs randomly.
The latency reaches an absurd value (for example: 1456719256.562702). I am sorry, but since I cannot reproduce the problem, I have no log to provide.
As you can see in the example above, though, the beginning of the value makes me think of a timestamp.
Perhaps it is a calculation problem, or a concatenation of two variables.
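The timestamp hunch can be checked with a quick conversion. The snippet below only interprets the reported value as Unix epoch seconds; it says nothing about where in the engine such a value could come from.

```python
from datetime import datetime, timezone

reported_latency = 1456719256.562702  # value quoted in the report, in seconds

# Interpreted as Unix epoch seconds, the value is a plausible calendar date.
as_date = datetime.fromtimestamp(reported_latency, tz=timezone.utc)
print(as_date.isoformat())

# Interpreted as a duration, it would be several decades.
print(round(reported_latency / (365.25 * 86400), 1), "years")
```

The value decodes to a date in early 2016, which is consistent with an absolute time being reported where a difference between two times was expected.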
Hi, I have a problem with the inheritance of contacts and contact groups.
I am using the following software versions:
centreon-engine 1.4.13
centreon 2.6.1
centreon-broker 2.8.2
First, I create a host that uses a host template.
A few service templates are connected to this host template.
Then I connect a contact group to this host.
After creation, all contacts are notified if a problem occurs with the host or with some services of its host template.
That's all fine, because the services inherit from the host they are connected to.
Now I additionally want to inform another group if there is a problem with any instance of one particular service template.
That's why I connect this group to that particular service template.
The problem is that only this group is informed, and the group connected to the host is no longer informed.
Is it possible to create a new "Inherit contacts from host" function that always works? Not only for services, but also for service templates!
I know there is such a function for services, but it only works if the service has no contacts defined.
Such a new function would be very helpful, because at the moment I have to add all contact groups of the host to the services manually after creating them from templates.
If you need more information from me, let me know.
Kind regards.
PS: or is there another way to obtain this behaviour?
Hello,
I am trying to apply this configuration:
define timeperiod{
name Exclusion
timeperiod_name Exclusion
alias Exclusion
november 5 10:00-13:00
}
define timeperiod{
name Enterprise
timeperiod_name Enterprise
alias Enterprise
sunday 08:00-20:00,20:20-24:00
monday 08:00-20:00,20:20-24:00
tuesday 08:00-20:00,20:20-24:00
wednesday 08:00-20:00,20:20-24:00
thursday 08:00-20:00,20:20-24:00
friday 08:00-20:00,20:20-24:00
saturday 08:00-20:00,20:20-24:00
exclude Exception
}
The problem is that my service (with the check timeperiod "Enterprise") is still checked on 5 November between 10:00 and 13:00.
Regards
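One detail worth checking in the configuration as posted: the `exclude` directive names `Exception`, while the timeperiod defined above it is called `Exclusion`. Whether this mismatch causes the reported behaviour or is only a transcription slip is not established by the post. A small, hypothetical checker for such dangling references could look like this (it handles only the simple `define timeperiod{ ... }` layout shown here):

```python
import re

def undefined_excludes(config_text):
    """Return (timeperiod, excluded_name) pairs whose target is not defined."""
    defined, referenced = set(), []
    blocks = re.findall(r"define\s+timeperiod\s*\{(.*?)\}", config_text, re.S)
    for block in blocks:
        fields = {}
        for line in block.strip().splitlines():
            parts = line.split(None, 1)  # directive name, then its value
            if len(parts) == 2:
                fields[parts[0]] = parts[1].strip()
        name = fields.get("timeperiod_name", "")
        defined.add(name)
        for target in fields.get("exclude", "").split(","):
            if target.strip():
                referenced.append((name, target.strip()))
    return [(owner, target) for owner, target in referenced
            if target not in defined]

CONFIG = """
define timeperiod{
    timeperiod_name Exclusion
    november 5      10:00-13:00
}
define timeperiod{
    timeperiod_name Enterprise
    monday          08:00-20:00,20:20-24:00
    exclude         Exception
}
"""
print(undefined_excludes(CONFIG))
```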
I'm migrating from Centreon 2.5.4 to CES 3.3 and I can't find out whether this feature is implemented:
Implicit Dependencies for Services on Host:
automatically adds an implicit dependency for services on their host. That way service notifications are suppressed when a host is DOWN or UNREACHABLE.
Service checks are still executed. If you want to prevent them from happening, you can apply the following dependency to all services setting their host as parent_host_name and disabling the checks. assign where true matches on all Service objects.
Can you help me?
Regards,
Hi,
Do you think it would be a good idea to implement a feature that lets the user define whether the services of a dependent host should be checked or not?
e.g :
-> my host A depends upon my host B.
-> If my host B is down then my host A is not checked anymore
-> With the proposed feature, setting the option to yes would disable all the service checks of my host B.
Thanks,
This feature allows plugins to run with a specific timeout for each host/service.
The following field will be added to the host and service configuration: check_timeout. It specifies the plugin check timeout. If this value is not set, the default global timeout will be used.
Example
define host {
name Central
check_command check_centreon_dummy
check_timeout 1
}
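The fallback rule above (per-object `check_timeout`, else the global timeout) can be sketched as below. This is an illustration, not engine code: the function name and the 60-second global default are assumptions, and mapping a timed-out check to UNKNOWN follows the suggestion made elsewhere on this page rather than confirmed engine behaviour.

```python
import subprocess
import sys

STATE_UNKNOWN = 3        # conventional plugin exit code for UNKNOWN
GLOBAL_TIMEOUT = 60      # hypothetical global default, in seconds

def run_check(argv, check_timeout=None, global_timeout=GLOBAL_TIMEOUT):
    """Run a plugin with its own timeout, falling back to the global one."""
    timeout = check_timeout if check_timeout is not None else global_timeout
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # The real state is not known when the plugin did not answer in time.
        return STATE_UNKNOWN, "(check timed out after {}s)".format(timeout)
    return done.returncode, done.stdout.strip()

# Stand-ins for plugins: one answers immediately, one hangs for 5 seconds.
fast = [sys.executable, "-c", "print('OK')"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_check(fast, check_timeout=5))
print(run_check(slow, check_timeout=0.5))
```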
Hello,
I have set all relationships on my hosts and I would like to show a host's child or parent hosts in the notifications sent for it. This would be useful for my technicians.
Best regards.
As reported by Stephane Duret:
Retry check interval is never the same.
I execute this query in MySQL:
SELECT FROM_UNIXTIME(ctime) AS date, output, IF(type='1', 'HARD', 'SOFT') AS State FROM centreon_storage.logs WHERE host_id = 147 AND ctime > UNIX_TIMESTAMP('2014-09-30') AND ctime < UNIX_TIMESTAMP('2014-10-01') AND service_id IS NULL ORDER BY ctime
And this is the output:
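+---------------------+---------------------------------------------------+-------+
| date                | output                                            | State |
+---------------------+---------------------------------------------------+-------+
| 2014-09-30 16:04:47 | CRITICAL - srv-mysql-01: rta 0.000ms, lost 100%\n | SOFT  |
| 2014-09-30 16:05:52 | CRITICAL - srv-mysql-01: rta 0.000ms, lost 100%\n | SOFT  |
| 2014-09-30 16:06:12 | CRITICAL - srv-mysql-01: rta 0.000ms, lost 100%\n | SOFT  |
| 2014-09-30 16:06:52 | CRITICAL - srv-mysql-01: rta 0.000ms, lost 100%\n | SOFT  |
| 2014-09-30 16:06:57 | CRITICAL - srv-mysql-01: rta 0.000ms, lost 100%\n | HARD  |
| 2014-09-30 16:07:47 | OK - srv-mysql-01: rta 2.651ms, lost 0%\n         | HARD  |
+---------------------+---------------------------------------------------+-------+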
This is a problem because I use this data in Centreon-BI reports, and the unavailability is 50 seconds (HARD state), which is less than 1 minute.
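The irregular spacing can be read directly off the timestamps in the query output. The snippet below just computes the gaps between consecutive log entries; the row data is copied from the output above.

```python
from datetime import datetime

# (timestamp, state type) rows copied from the query output.
rows = [
    ("2014-09-30 16:04:47", "SOFT"),
    ("2014-09-30 16:05:52", "SOFT"),
    ("2014-09-30 16:06:12", "SOFT"),
    ("2014-09-30 16:06:52", "SOFT"),
    ("2014-09-30 16:06:57", "HARD"),  # CRITICAL becomes HARD
    ("2014-09-30 16:07:47", "HARD"),  # back to OK
]

times = [datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S") for stamp, _ in rows]
# Seconds elapsed between each entry and the next one.
gaps = [int((later - earlier).total_seconds())
        for earlier, later in zip(times, times[1:])]
print(gaps)
```

The first four gaps are the retry intervals, none of which match each other; the last gap is the HARD CRITICAL period that ends up in the report.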
As reported by Sébastien Boulianne:
Hello,
When generating the configuration on a poller, as you can see below, it says "Total Warning:0"...
Yet I count 20 of them. :D
Could you please tell Centreon Engine to go back to school? :P
Thanks for your great work, and see you soon.
Hi,
I scheduled a downtime on a hostgroup that contains 1200 hosts.
The downtime does not work.
In /var/log/centreon/centcore.log I can see that it is scheduled, but according to /var/log/centreon-engine/centengine.log it is not applied.
I tested with another hostgroup (5 hosts) and there is no bug.
Is there a limit?
Regards
Hello,
When I use backslashes in a macro value, Centreon-Engine does not handle them well.
For example, this is my service:
define service {
host_name Windows-2012
service_description Disk_C
register 1
use OS-Windows-Disks-NRPE-custom
_DRIVE C:\\
_SERVICE_ID 1626
}
And this is the command line found in centengine.debug:
/usr/lib/nagios/plugins/check_centreon_nrpe -H 10.50.1.158 -p 5666 -t 30 -u -m 8192 -c check_drivesize -a "drive=C:\_SERVICE_ID 1626"
Centreon > 2.7.2 adds a "space" at the end of the line because we had a problem with "\n".
Now I think it's a Centreon-Engine problem.
For information, in services.cfg there is a space after "_DRIVE C:\\".
Regards,
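The command line in the debug output is what one would get if a trailing backslash were treated as a line continuation, so that the `_SERVICE_ID` line is appended to the `_DRIVE` value. The sketch below reproduces that effect; it is an assumption about the cause, not the engine's actual parser, and the function name is made up.

```python
def join_continued_lines(lines):
    """Join lines ending in a backslash with the line that follows them."""
    joined, pending = [], ""
    for raw in lines:
        line = raw.rstrip("\n")
        if pending:
            # Continuation: append this line, minus its indentation.
            line = pending + line.lstrip()
            pending = ""
        if line.endswith("\\"):
            pending = line[:-1]  # drop the continuation backslash
        else:
            joined.append(line)
    if pending:
        joined.append(pending)
    return joined

# Value ends with a backslash: the next directive is swallowed.
print(join_continued_lines(["_DRIVE C:\\\\\n", "_SERVICE_ID 1626\n"]))
# A trailing space after the backslashes keeps the two lines separate.
print(join_continued_lines(["_DRIVE C:\\\\ \n", "_SERVICE_ID 1626\n"]))
```

In this model the trailing space added by Centreon prevents the merge at the line-joining stage, which would mean the remaining problem lies in a later step that trims the space before the backslash is examined.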
On release 1.4.12, service recovery notifications are not sent to contacts or contact groups. All other service notifications, such as warning, critical or unknown, are sent correctly. I checked the whole configuration and found no reason why the recovery notifications are missing.
After a downgrade to release 1.4.11, all notifications are sent correctly.
I am quite convinced that this has nothing to do with what is explained here.
I tested it on a fresh Centreon installation with the same result. Please tell me what my mistake is, or let me know if this is a bug.
As reported by Martin Lunze:
I am using the following software:
centreon-engine 1.4.7
centreon 2.5.1
centreon-broker 2.6.1
After I define a host with some checks and add a single contact and a contact group to the host, the following behaviour occurs:
If a host notification is sent, both the single contact and all members of the contact group are notified.
If a service notification is sent, only the members of the contact group are notified.
So I created a new contact group with only one member (the same single contact that was first attached to the host).
When I add this new contact group to the host and restart centreon-engine, that contact is notified too.
This behaviour is reproducible with all groups, single contacts and hosts.
Is there a bugfix?
It looks to me as if the contacts are overwritten by the contact groups, but only for the checks of the host.
I would be very pleased to hear from you.