lausser / check_logfiles

A plugin (monitoring-plugin, not nagios-plugin, see also http://is.gd/PP1330) which scans logfiles for patterns.

Home Page: https://omd.consol.de/docs/plugins/check_logfiles/

License: GNU General Public License v2.0

Shell 2.83% Perl 95.41% Awk 0.23% Max 0.05% Makefile 0.31% M4 0.72% VBScript 0.46%
icinga monitoring naemon nagios

check_logfiles's Introduction

Meet me at...

Monitoring-Workshop 2019

Description

check_logfiles is a plugin for Icinga which scans log files for specific patterns.


Motivation

The conventional plugins which scan log files are not adequate in a mission-critical environment. In particular, their inability to handle logfile rotation and to include the rotated archives in the scan leaves gaps in the monitoring. check_logfiles was written because these deficiencies would have prevented Nagios from replacing a proprietary monitoring system.

Features

  • Detection of rotations - logfiles are usually rotated and compressed nightly, and each operating system or company has its own naming scheme. If a rotation happens between two runs of check_logfiles, the rotated archive also has to be scanned to avoid gaps. The most common rotation schemes are predefined, but you can describe any strategy (in short: where and under which name a logfile is archived).
  • More than one pattern can be defined, and patterns can be classified as warning patterns and critical patterns.
  • Triggered actions - usually Nagios plugins return just an exit code and a line of text describing the result of the check. Sometimes, however, you want to run some code during the scan every time you get a hit. check_logfiles lets you call scripts either after every hit or at the beginning or end of its runtime.
  • Exceptions - if a pattern matches, the matched line could be a special case which should not be counted as an error. You can define exception patterns which are more specific versions of your critical/warning patterns. Such a match then cancels the alert.
  • Thresholds - you can define the number of matching lines which are necessary to trigger an alert.
  • Protocol - the matching lines can be written to a protocol file whose name is included in the plugin's output.
  • Macros - pattern definitions and logfile names may contain macros, which are resolved at runtime.
  • Performance data - the number of scanned lines and the number of warnings/criticals are reported.
  • Windows - the plugin works on Unix as well as on Windows (e.g. with ActiveState Perl).
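
A minimal, hedged configuration sketch of how these features fit together (the tag, file names and patterns below are invented for illustration; the option names themselves appear in the documentation and in the issues further down this page):

@searches = (
  {
    tag                => 'app',                    # label used in output, perfdata and seekfile
    logfile            => '/var/log/app.log',       # hypothetical logfile
    rotation           => 'debian',                 # one of the predefined rotation schemes
    criticalpatterns   => ['ERROR', 'Failed password'],
    criticalexceptions => ['ERROR .* self-test'],   # matches that cancel an alert
    warningpatterns    => ['WARN'],
    warningthreshold   => 3,                        # alert only after 3 matching lines
    options            => 'script,protocol',
    script             => 'notify.sh',              # hypothetical handler script
  },
);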

Examples

nagios$ check_logfiles --logfile /var/adm/messages \
     --criticalpattern 'Failed password' --tag ssh
CRITICAL - (4 errors) - May  9 11:33:12 localhost sshd[29742] Failed password for invalid user8 ... |ssh_lines=27 ssh_warnings=0 ssh_criticals=4 ssh_unknowns=0

Homepage

The full documentation can be found here: check_logfiles @ ConSol Labs

check_logfiles's People

Contributors

adrianlzt, akamoto, datamuc, lausser, sni, urbnw


check_logfiles's Issues

nosavethreshold from cmd line

Hi,

My understanding of Perl is tenuous at best, but we want to define nosavethreshold & thresholdexpiry as part of the command-line parameters (line number 5848) under the options (line 5911) section. Correct?

Whenever I try to execute the plugin with the --nosavethreshold flag, I get the help manual. After adding those parameters the plugin seems to work as expected. If adding the parameters is incorrect, please let me know how I can get the expected behavior (nosavethreshold from the command line) from the existing source.

Thanks,
Maxwell Ramirez
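
For illustration only, a hedged sketch of how two such flags could be declared in a Getopt::Long option list. This is not the actual plugin source; the real option table lives around the line numbers mentioned above, and the option names are taken from the issue:

use strict;
use warnings;
use Getopt::Long;

my %commandline;
GetOptions(\%commandline,
    # ... existing options ...
    "nosavethreshold",       # boolean switch
    "thresholdexpiry=i",     # integer value (seconds)
) or die "wrong usage\n";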

seekfiles ignored after server reboot

Hi,

I rebooted the server, and even though the seekfile was there, the logfile was fully processed again.

$protocolsdir = '/usr/local/test/tmp/logfiles';
$seekfilesdir = '/usr/local/test/tmp/logfiles';
$protocolretention = '30';
$scriptpath = '/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/usr/local/test/bin';
$options = 'preview=1';

@searches = (
  {
    tag => 'test',
    logfile => '/usr/local/test/logs/test_log',
    criticalpatterns => [
      '\{test\}'
    ],
    options => 'nologfilenocry,nosavethresholdcount,maxlength=1024,allyoucaneat,criticalthreshold=1,script',
    script => 'logfiles_test.sh'
  }
);

Because of this, the script was run again even though it was not supposed to.

Is this normal behaviour? Am I missing something? Shouldn't the seekfile always be used if it exists?

Thank you.

Ubuntu Rotation

Hello,
I see your plugin supports different rotation methods. While looking for the best match for my default Ubuntu installation, I searched the source code. It seems DEBIAN is almost right for it:
$self->{filenamepattern} = sprintf "%s.0|%s.*[0-9]*.gz",

However, for the default logfiles in Ubuntu it would need to be:
$self->{filenamepattern} = sprintf "%s.1|%s.*[1-9]*.gz",
e.g.
syslog
syslog.1
syslog.2.gz
...

I'm not a Linux master; I think the log rotation is managed by logrotate on Ubuntu, but the logrotate method does not fit either:
$self->{filenamepattern} = sprintf "%s.*[0-9]*.gz", $self->{logbasename};

This is because, by default, the first archive file is not gzipped on an Ubuntu installation.

Of course I see I can specify the rotation pattern myself and I will do this, but you may consider adjusting the code and also creating some docs for the available rotation methods, as I couldn't find them anywhere besides the source code.

Thank you for your work!
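
For illustration, a hedged sketch of a user-defined rotation pattern for the Ubuntu naming scheme described above (syslog, syslog.1, syslog.2.gz, ...). The regular expression is an assumption, not one of the predefined schemes:

@searches = (
  {
    tag              => 'syslog',
    logfile          => '/var/log/syslog',
    rotation         => 'syslog\.1|syslog\.[0-9]+\.gz',   # assumed pattern for the rotated archives
    criticalpatterns => ['some pattern'],                 # placeholder
  },
);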

Please add newer tags to your repository

Your git repository contains tags, but the latest tag is 3.5 from 2012. Please be so kind as to tag later commits too. Thanks!

Reasoning: I plan to package the tool for Debian (#1023474). Searching for new upstream versions is easier when they are tagged.

Logfiles without a fixed name

Hi,

it would be super nice if the plugin could work with logfiles like this:

-rw-rw---- 1 user group 1081070 Jan 31 23:59 t20170131000000
-rw-rw---- 1 user group 961113 Feb 1 23:59 t20170201000000
-rw-rw---- 1 user group 1050532 Feb 2 23:59 t20170202000000
-rw-rw---- 1 user group 1471560 Feb 3 23:59 t20170203000000
-rw-rw---- 1 user group 837051 Feb 4 23:59 t20170204000000
-rw-rw---- 1 user group 438710 Feb 5 23:59 t20170205000000
-rw-rw---- 1 user group 990532 Feb 6 17:01 t20170206000000

I could not figure out how to get this working with the otherwise very good check_logfiles plugin.

Regards,

Sven
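
For illustration, a hedged sketch of one way such date-stamped names are often addressed: check_logfiles macros can be embedded in the logfile name, so a file like t20170206000000 might be matched as below (the directory is invented; the CL_DATE_* macros are the ones used elsewhere on this page):

@searches = (
  {
    tag              => 'tlog',
    logfile          => '/some/dir/t$CL_DATE_YYYY$$CL_DATE_MM$$CL_DATE_DD$000000',
    criticalpatterns => ['ERROR'],   # placeholder
  },
);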

logfilemissing=critical only working on the command line, not in the config

Hi,

I'm using check_logfiles v3.11 and I would like to use the logfilemissing option in the config.
If I'm using it on the command line, it's working fine:

/usr/lib/nagios/plugins/check_logfiles --logfile /var/log/sgv/cmdbsync.log --logfilemissing critical

CRITICAL - (1 errors in check_logfiles.protocol-2020-05-22-15-20-08) - could not find logfile /var/log/sgv/cmdbsync.log |'default_lines'=0 'default_warnings'=0 'default_criticals'=1 'default_unknowns'=0

When I'm using the config, it doesn't have any effect and returns UNKNOWN instead of CRITICAL:

/usr/lib/nagios/plugins/check_logfiles -f /etc/sgv/check_logfiles_cmdbsync.conf

UNKNOWN - (1 unknown in check_logfiles_cmdbsync.protocol-2020-05-22-15-21-41) - could not find logfile /var/log/sgv/cmdbsync.log |'default_lines'=0 'default_warnings'=0 'default_criticals'=0 'default_unknowns'=1

Here is the very simple config:

$MACROS = {
        LOGFILE => '/var/log/sgv/cmdbsync.log'
};

@searches = (
{
        logfile => '$LOGFILE$',
        logfilemissing => 'critical'
});

Any idea ?

Thanks for your help

Kind regards

Scripts not working as expected?

Hi, first of all thanks for this project.
I'm trying to do a simple critical pattern search over some logs. check_logfiles works perfectly, but I'm not able to trigger a basic test script that should simply echo some strings (in production I would like to make it send an email).

I'll give you an example.
This is the config file:

$scriptpath = '/data';

@searches = (
  {
    tag => 'test',
    logfile => '/tmp/test.log',
    criticalpatterns => 'Error',
    script => 'alert.sh',
    options => 'script'
  }
);

If I append "Error" to /tmp/test.log and run check_logfiles I have a critical event as expected...

[root@drakaris ~]# echo "test" >> /tmp/test.log ; echo "Error" >> /tmp/test.log
[root@drakaris ~]# /usr/local/nagios/libexec/check_logfiles --timeout=30 --config /usr/local/nagios/libexec/logwarn.cfg
CRITICAL - (1 errors in logwarn.protocol-2022-05-03-17-45-44) - Error |'test_lines'=2 'test_warnings'=0 'test_criticals'=1 'test_unknowns'=0

...but I have no output from the /data/alert.sh script (which has execution permissions), which should produce this:

[root@gsanas ~]# ls -la /data/alert.sh
-rwxr-xr-x 1 root root 116  3 mag 17.44 /data/alert.sh
[root@gsanas ~]# /data/alert.sh
HELP!!!
HELP!!!
HELP!!!
HELP!!!
HELP!!!
HELP!!!
HELP!!!

Did I miss something?
I'm using check_logfiles v4.0.1.5 on CentOS 7

How to make check_logfiles not delete an alert?

Hi!
Thanks in advance!

I am not sure if it is my ignorance of the check_logfiles parameters or it is not possible to do it at the moment.

The fact is that this seems like very basic functionality: searching for a string and alerting until it disappears from a file.
I have a log file; if "string1" appears, the check_logfiles plugin shows me an alert in Nagios, but I don't want this alert to be cleared until "string1" disappears from the log file, even while subsequent checks keep running on this log file.

Can somebody help me?

Thanks!
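
For reference, a hedged sketch built only from options that appear elsewhere on this page: the 'sticky' option (seen here as 'sticky' and 'sticky=90') is the mechanism usually associated with keeping an alert raised across later runs. Whether it can be cleared by the disappearance of the string itself is exactly the open question of this issue:

@searches = (
  {
    tag              => 'string1',
    logfile          => '/path/to/logfile',   # placeholder path
    criticalpatterns => ['string1'],
    options          => 'sticky=3600',        # keep the alert raised for an hour (assumed semantics)
  },
);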

Incorrect perfdata format when tag has white spaces

According to the doc of Icinga:

the single quotes for the label are optional. Required if spaces, = or ' are in the label

When the tag contains white space, the perfdata is generated incorrectly:

$ check_logfiles.pl --logfile=log --tag "some space"
OK - no errors or warnings|some space_lines=0 some space_warnings=0 some space_criticals=0 some space_unknowns=0

Should be:

OK - no errors or warnings|'some space_lines'=0 'some space_warnings'=0 'some space_criticals'=0 'some space_unknowns'=0

check_logfiles.exe

How can I obtain a Windows executable check_logfiles.exe?

I can't open README.exe on Windows 10.


Also when I try to use the Perl script directly, I get

perl .\check_logfiles.pl
Global symbol "%ERRORS" requires explicit package name at .\check_logfiles.pl line 221.
Execution of .\check_logfiles.pl aborted due to compilation errors.

Thanks for any info on how to use this on Windows.

CL_CAPTURE_GROUPx not present in $ENV

Hello,
We are testing the latest version 3.12, and running v3.7.3.1 in production.
We are trying to capture groups and count them in a supersmart postscript.
@searches = (
  {
    tag => 'rsync',
    logfile => '$MY_LOGDIR$/$MY_LOGFILE$',
    type => 'virtual',
    criticalpatterns => '(total size)',
    options => 'capturegroups,noprotocol,noperfdata,report=long,nosavethresholdcount'
  }
);

And I'm trying to get: $ENV{CHECK_LOGFILES_CL_CAPTURE_GROUPS}

Is there a mistake in my code (grouping or variable)?
Or is this not true in the doc:
"The number of these macros (the highest counter of CL_CAPTURE_GROUPx) can be found in CL_CAPTURE_GROUPS. These macros are best used as environment variables in a handler script."

Thanks

exe: premature return when using --environment

When compiled as exe, check_logfiles returns prematurely when using --environment on the Windows commandline. Tested with version 3.9 on different Windows machines (7/8/10, 64bit)

Test scenario

logfile with 200k lines.

minimalistic cfg file:

@searches = ({
  logfile => 'alllogs.txt',
  type	=> "virtual", 
  warningpatterns => '.*',
});

OK without environment

C:\cl_exe_test>check_logfiles_neu.exe -f cl_exe_test.cfg [enter]
[~8s execution time]
WARNING - (199129 warnings) - _(null)_ ...|'default_lines'=199129 'default_warnings'=199129 'default_criticals'=0 'default_unknowns'=0
C:\cl_exe_test>

NOK with environment

C:\cl_exe_test>check_logfiles_neu.exe -f cl_exe_test.cfg --environment WEBMODULE=foo [enter]
[~1s execution time]
C:\cl_exe_test>[enter]
C:\cl_exe_test>[enter][enter][enter][enter][enter][enter]....[enter][enter][enter][enter]
C:\cl_exe_test>WARNING - (199129 warnings) - _(null)_ ...|'default_lines'=199129 'default_warnings'=199129 'defaul
t_criticals'=0 'default_unknowns'=0
C:\cl_exe_test>

summary

As a result, when called by NSclient, check_logfiles services do not contain a service output.

workaround

Replacing environment variables with macros worked for my requirement.
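
A hedged sketch of that workaround: define the value as a macro in the configuration file instead of passing it via --environment. WEBMODULE is the example name from this issue, and the pattern usage is purely illustrative (macros may appear in pattern definitions and logfile names):

$MACROS = {
  WEBMODULE => 'foo',
};

@searches = ({
  logfile         => 'alllogs.txt',
  type            => 'virtual',
  warningpatterns => '.*$WEBMODULE$.*',   # macro resolved at runtime
});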

second critical pattern doesn't work

Hi
I want to use this check to scan logfiles: if a file contains one of 2 critical patterns, a critical alert should be shown in Nagios.

The first critical pattern is MemoryException: xxx bytes not available. I need the check to go critical if xxx is more than n bytes.
I know that xxx > 3.5 GB in bytes, and in this case the check must be in a critical state.
This part is done and works.

I want to add second critical pattern TradeBase: not enough memory for index

The problem is that the check ignores the second critical pattern.
I have tried different variants but none of them worked.

My config for the check is now:

@searches = ({
  logfile => "test.log",
  criticalpatterns => {
    'MemoryException: (\d+) bytes not available',
    'TradeBase: not enough memory for index(.*)'
  },
  options => "supersmartscript, capturegroups,noprotocol",
  script => sub {
    my $gigabytes = $ENV{CHECK_LOGFILES_CAPTURE_GROUP1} / (1024*1024*1024);
    if ($gigabytes > 3.5) {
      print $ENV{CHECK_LOGFILES_SERVICEOUTPUT};
      return 2;
    } else {
      return 0;
    }
  },
});

Script author replied

What you can try is to use this as an example…
criticalpatterns => {
  'door-open'   => 'Door is open! (.*)',
  'door-closed' => 'Door is closed! (.*)'
},
options => 'script,capturegroups',
script => sub {
  my $state = $CHECK_LOGFILES_PRIVATESTATE;
  my $date = $ENV{CHECK_LOGFILES_CAPTURE_GROUP1};
  printf STDERR "%s\n", $ENV{CHECK_LOGFILES_KAKA};
  # ... convert into an epoch timestamp

  if ($ENV{CHECK_LOGFILES_PATTERN_KEY} eq 'door-closed') {
    delete $state->{opening_time};
    printf "Door %s is closed\n", $ENV{CHECK_LOGFILES_TAG};
    return 0;
  } elsif ($ENV{CHECK_LOGFILES_PATTERN_KEY} eq 'door-open') {
Criticalpatterns is written as a key-value structure.
$ENV{CHECK_LOGFILES_PATTERN_KEY} is the key of the matching pattern.
$ENV{CHECK_LOGFILES_CAPTURE_GROUP1} is the portion inside ().

I can't understand how to add the second critical pattern (((

Can someone help?
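
For illustration only, a hedged sketch based on the key-value structure quoted above, showing both patterns declared side by side (the pattern keys are made up, and the threshold handling is kept as simple as in the original snippet):

@searches = ({
  logfile          => "test.log",
  criticalpatterns => {
    'mem-exception' => 'MemoryException: (\d+) bytes not available',
    'tradebase'     => 'TradeBase: not enough memory for index',
  },
  options => "supersmartscript,capturegroups,noprotocol",
  script  => sub {
    if ($ENV{CHECK_LOGFILES_PATTERN_KEY} eq 'mem-exception') {
      # only the size check decides whether this match is critical
      my $gigabytes = $ENV{CHECK_LOGFILES_CAPTURE_GROUP1} / (1024*1024*1024);
      if ($gigabytes > 3.5) {
        print $ENV{CHECK_LOGFILES_SERVICEOUTPUT};
        return 2;
      }
      return 0;
    }
    # any other key (here only 'tradebase') is treated as critical
    print $ENV{CHECK_LOGFILES_SERVICEOUTPUT};
    return 2;
  },
});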

README.windows-exe outdated

README.windows-exe is outdated and can no longer be followed to build an exe.

At least the following part is not usable, as PAR-Packer-0.980 is not available anymore. 1.003, 1.035, 1.036 and 1.037 are available.

 b)
   cd PAR-Packer-0.980-<a lot of crap>
   edit myldr/Makefile.PL and add
   $file =~ s/^lib// if $^O eq "MSWin32"; 
   after line 142.
   perl Makefile.PL
   dmake
   dmake install

Could this procedure be updated?

BR,
Yannick

logs with dateext (RHEL /var/log/messages)

Hi,

We are trying to configure check_logfiles for /var/log/messages on RHEL 6/7 servers which use date as a suffix of the rotated file.

Extract of our search directive:

@searches = (
  {
    tag => 'messages',
    logfile => '/var/log/messages',
    rotation => 'messages\-[0-9]{8}',
    criticalpatterns => [
      'Redundancy lost',
      '[d,D]egraded',
      '[e,E]rror',
      'ERROR',
    ],
    ...

Everything went fine before the log rotation on 21.10.2018.
The log files looked like this:

/var/log/messages
/var/log/messages-20181014
/var/log/messages-20181007
/var/log/messages-20180930
/var/log/messages-20180923

After the log rotation, they looked like this:

/var/log/messages
/var/log/messages-20181021
/var/log/messages-20181014
/var/log/messages-20181007
/var/log/messages-20180930

We then got a bunch of alerts that we had already seen in /var/log/messages, which probably came from the new /var/log/messages-20181021 file.

How can we handle this?
Is it better not to configure the rotation directive for these kinds of logs?
Are we missing something?

Thanks,
Dominique

No performance data in external script

My check looks like:

@searches = (
  {
        tag => 'ConReset',
        type => 'virtual',
        logfile => '/var/log/nginx/error.log',
        #criticalpatterns => '.*(?:failed|timed out) \(1\d\d: Connection (?:reset by peer|timed out)\) while reading .*',
        criticalpatterns => '.*directory index of .*',
#       warningpatterns => ['.*error.*, '.*fatal.*'],
#       warningthreshold => '10',
        rotation => 'debian',
        script => 'send_mail',
        scriptparams => '$MY_MAIL$ $CL_SERVICEPERFDATA$',
        options => 'perfdatai,script'
  }
);

cat scripts/send_mail

#!/bin/bash
MAIL=$1
PERFDATA=$2
HOST=$(hostname -f)
var1=$(printf "%s: %d:%d Uhr status is %s\n" \
	$HOST \
	$CHECK_LOGFILES_DATE_HH \
	$CHECK_LOGFILES_DATE_MI \
	$CHECK_LOGFILES_SERVICESTATE)
var2=$(printf "I found something:\n")
var3=$(printf "%s\n\n" "$CHECK_LOGFILES_SERVICEOUTPUT")

echo "$PERFDATA" >> /tmp/foo

#echo -e $var1 "\n\n"$var2"\n"$var3 | mail -s "check_logfile $HOST" $MAIL

Debug Output:

tail -f /tmp/foo
$
$ <- when try to get $CL_SERVICEPERFDATA$
$
$
ConReset
ConReset
ConReset
CRITICAL
CRITICAL
CRITICAL
$
$
$

I can't get the performance data here.
I want to loop to send all errors in one mail.

Best Regards

config file option

Hi Lausser,

I really like your check_logfiles plugin, but I'm unable to make it work with the -f option :(
I've created this config file:
@searches = ( { tag => 'test', logfile => '/var/log/messages', criticalpatterns => ['EXIT'] } );
very simple, just to test.
I've seen several lines like "Apr 3 01:55:21 server01 xinetd[4678]: EXIT: vnetd status=0 pid=32460 duration=0(sec)" in /var/log/messages

Then:
./check_logfiles -d --logfile /var/log/messages --criticalpattern='EXIT' --type=virtual
CRITICAL - (6 errors in check_logfiles.protocol-2017-04-04-16-03-12) - Apr 4 02:44:45 server01 xinetd[4678]: EXIT: vnetd status=0 pid=19013 duration=0(sec) ...|default_lines=31 default_warnings=0 default_criticals=6 default_unknowns=0
But when I use the config file option:
./check_logfiles --config /opt/nagios/conf/error_kernel_patterns.cfg --type=virtual
OK - no errors or warnings|test_lines=0 test_warnings=0 test_criticals=0 test_unknowns=0

What am I missing?
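
One visible difference between the two invocations is that the command line also passes --type=virtual, while the config file entry has no type key. A hedged sketch of the config with that key added, as an experiment rather than a confirmed fix:

@searches = (
  {
    tag              => 'test',
    logfile          => '/var/log/messages',
    criticalpatterns => ['EXIT'],
    type             => 'virtual',   # mirrors the --type=virtual used on the command line
  },
);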

Feature request: Indicate the user who should execute the search, by parameters

Hi!
Thanks in advance for your work! That's great!

Would it be possible to execute the search for a string in a log as a specific user (the owner of the log), indicated by a parameter, for example?

My problem is that I am looking for a string in a log restricted to a certain user; to be able to do that I execute check_logfiles with sudo, but I would like to avoid it.

I have seen the example of doing it by touching the ACLs, but these are machines on which we cannot do that.

thanks

for some reason it is not working on network device logs

Peace,

sudo ./check_logfiles --logfile=/var/log/messages --criticalpattern "sshd"
CRITICAL - (1 errors in check_logfiles.protocol-2020-02-28-10-33-45) - Feb 28 10:33:45 [email protected] sudo[19162]:   nagios : TTY=pts/0 ; PWD=/usr/lib64/nagios/plugins ; USER=root ; COMMAND=./check_logfiles --logfile=/var/log/messages --criticalpattern sshd |'default_lines'=17 'default_warnings'=0 'default_criticals'=1 'default_unknowns'=0

but on a cisco router log:

sudo ./check_logfiles --logfile=/var/log/cisco/ciscoRouter/2020/02/28/local7.log  --criticalpattern="User=ali" 
OK - no errors or warnings|'default_lines'=308 'default_warnings'=0 'default_criticals'=0 'default_unknowns'=

Am I doing something wrong, or is there some bug?

Use both command line parameter and config file

Hello,
I would like to use check_logfiles with the command line parameter --logfile together with --config, and use a variable in the config file to re-use the logfile name.
I can't find documentation on what kind of variable to use in @searches:

  • $CL_LOGFILE$
  • $ENV{CHECK_LOGFILES_LOGFILE}
  • another one?

Or is it not possible to use both the config file and the command line parameter for the logfile? In that case, it would be a good idea, no?
Thanks

Critical pattern found but plugin exits OK

Hi Gerhard

Version:

$ /usr/lib/nagios/plugins/check_logfiles -V
check_logfiles v4.0.1.3

Config file:

#$seekfilesdir = '/var/tmp';
$seekfilesdir = '/etc/nagios';
# where the state information will be saved.

$protocolsdir = '/var/tmp';
# where protocols with found patterns will be stored.

$scriptpath = '/usr/lib/nagios/plugins';
# where scripts will be searched for.

@searches = (
  {
    tag => 'logstash',
    logfile => '/var/log/logstash/logstash-plain.log',
    criticalpatterns => ['ERROR', 'Pipeline started'],
    options => 'noprotocol,nocount,sticky,nosavethresholdcount,nosavestate,allyoucaneat'
  }
);

Note: I set the $seekfilesdir on purpose to a non-writeable directory because all the options were ignored and that was the only way the plugin would read all lines of the log file.

The script runs with this config file and finds the matched pattern (Pipeline started), but instead of showing CRITICAL output and exiting, the plugin returns OK:

$ /usr/lib/nagios/plugins/check_logfiles -f /etc/nagios/logfiles-xxx.conf -v
Fri Feb 25 08:00:07 2022: ==================== /var/log/logstash/logstash-plain.log ==================
Fri Feb 25 08:00:07 2022: try pre2seekfile /etc/nagios/logfiles-xxx.logstash-plain.log.logstash instead
Fri Feb 25 08:00:07 2022: try pre3seekfile /tmp/logfiles-xxx._var_log_logstash_logstash-plain.log.logstash instead
Fri Feb 25 08:00:07 2022: no seekfile /etc/nagios/logfiles-xxx._var_log_logstash_logstash-plain.log.logstash found
Fri Feb 25 08:00:07 2022: but logfile /var/log/logstash/logstash-plain.log found
Fri Feb 25 08:00:07 2022: eat all you can
Fri Feb 25 08:00:07 2022: ILS lastlogfile = /var/log/logstash/logstash-plain.log
Fri Feb 25 08:00:07 2022: ILS lastoffset = 0 / lasttime = 0 (Thu Jan  1 01:00:00 1970) / inode = 66305:262625
Fri Feb 25 08:00:07 2022: the logfile grew to 5010
Fri Feb 25 08:00:07 2022: opened logfile /var/log/logstash/logstash-plain.log
Fri Feb 25 08:00:07 2022: logfile /var/log/logstash/logstash-plain.log (modified Wed Jan 19 08:58:31 2022 / accessed Thu Feb 24 15:20:43 2022 / inode 262625 / inode changed Wed Jan 19 08:58:31 2022)
Fri Feb 25 08:00:07 2022: relevant files: logstash-plain.log
Fri Feb 25 08:00:07 2022: moving to position 0 in /var/log/logstash/logstash-plain.log
Fri Feb 25 08:00:07 2022: MATCH CRITICAL Pipeline started with [2022-01-19T07:41:22,202][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
Fri Feb 25 08:00:07 2022: MATCH CRITICAL Pipeline started with [2022-01-19T08:58:31,240][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
Fri Feb 25 08:00:07 2022: stopped reading at position 5010
Fri Feb 25 08:00:07 2022: no sticky error from last run
OK - no errors or warnings|'logstash_lines'=30 'logstash_warnings'=0 'logstash_criticals'=2 'logstash_unknowns'=0

What am I doing wrong? I can't find the needle.

thx

Some macros still broken in v3.7.1.1

The threshold macros don't seem to work in v3.7.1.1. They were working in v3.6.3.

  • --warning=. Complex handler-scripts can be provided with a warning-parameter (of course --critical is possible, too) this way. Inside the scripts the value is accessible as the macro CL_WARNING (resp. CL_CRITICAL).
{
    tag              => 'some_tag',
    logfile          => 'some_log',
    criticalpatterns => [
        'some_string',
    ],
    options          => 'supersmartscript',
    script => sub {
        foreach my $key (sort(keys %ENV)) {
            next unless $key =~ /^CHECK/;
            printf "%s=%s\n", $key, $ENV{$key};
        }
        return;
    }
},

[REQUEST] Different thresholds for different time range

Hello, and thanks for this excellent plugin for monitoring logs.

Is it possible to set different warning/critical thresholds for different time ranges?

For example, between 06:00 and 22:00 a critical state should occur when 5 critical patterns are found, but for the time range 22:01-05:59, 15 critical patterns need to be found.

Can you implement this feature to your script?

thresholdexpiry

Hello,
thanks for this great job!
I can't find anything about this option: thresholdexpiry

  • what is it used for?
  • what is its parameter?

I am looking at this use case:

  • find a pattern in a log file
  • the pattern must be present in the file in the last 24 hours prior to the execution of the check
  • beyond that, it should return OK

thank you

Negative criticalpatterns

Hello, it seems negative criticalpatterns don't work in 3.7.1.4.
My log :
Tue Sep 29 17:51:28 2015: relevant files: aide.log
Tue Sep 29 17:51:28 2015: moving to position 0 in /var/log/aide/aide.log
Tue Sep 29 17:51:28 2015: negative pattern All files match AIDE database. Looks okay found.
Tue Sep 29 17:51:28 2015: stopped reading at position 2280
Tue Sep 29 17:51:28 2015: keeping position 2280 and time 1443541885 (Tue Sep 29 17:51:25 2015) for inode 64512:73740 in mind
OK - no errors or warnings|aide_linux_stat_lines=40 aide_linux_stat_warnings=0 aide_linux_stat_criticals=0 aide_linux_stat_unknowns=0

My config :
@searches = (
  {
    tag => 'aide_linux_stat',
    logfile => '$MY_LOGDIR$/$MY_LOGFILE$',
    criticalpatterns => [
      '!All files match AIDE database. Looks okay'
    ],
    options => 'allyoucaneat,protocol,nocount',
  }
);

Thanks

error with check_logfiles

I have installed check_logfiles-3.8.1.2.tar.gz on centos : CentOS release 6.8 (Final)

The configuration file is : /usr/local/nagios/etc/check_logfiles.cfg
$scriptpath = '/usr/bin';
$seekfilesdir = '/usr/local/nagios/log';
$protocolsdir = '/usr/local/nagios/log';
$prescript = 'sudo';
$prescriptparams = 'setfacl -m u:$CL_USERNAME$:r-- /var/log/messages*';
$options = 'supersmartprescript';
@searches = (
  {
    tag => 'Messages',
    logfile => '/var/log/messages',
    rotation => 'Linux',
    criticalpatterns => [
      'xinetd.*START',
      'xinetd.*EXIT'
    ],
    criticalthreshold => 1,
    warningpatterns => [
      'snmptrapd'
    ],
    warningthreshold => 1,
  }
);

When I launch the command, I get an error message:

/usr/local/nagios/libexec/check_logfiles --config /usr/local/nagios/etc/check_logfiles.cfg
Use of uninitialized value in substr at /usr/local/nagios/libexec/check_logfiles line 1880.
Use of uninitialized value in substr at /usr/local/nagios/libexec/check_logfiles line 1880.
Use of uninitialized value in substr at /usr/local/nagios/libexec/check_logfiles line 1880.

WARNING - (28 warnings in check_logfiles.protocol-2017-10-25-14-08-18) - Oct 25 14:08:16 ptsxcent01 snmptrapd[1784]: 2017-10-25 14:08:16 192.168.100.247(via UDP: [192.168.100.247]:32772->[10.27.2.148]) TRAP, SNMP v1, community public#012#011SNMPv2-SMI::experimental.94 Enterprise Specific Trap (4) Uptime: 177 days, 22:55:44.04#012#011SNMPv2-SMI::experimental.94.1.11.1.3.16.0.80.235.26.112.240.0.0.0.0.0.0.0.0.0.216431 = INTEGER: 216431#011SNMPv2-SMI::experimental.94.1.11.1.7.16.0.80.235.26.112.240.0.0.0.0.0.0.0.0.0.216431 = INTEGER: 2#011SNMPv2-SMI::experimental.94.1.11.1.8.16.0.80.235.26.112.240.0.0.0.0.0.0.0.0.0.216431 = OID: SNMPv2-SMI::zeroDotZero#011SNMPv2-SMI::experimental.94.1.11.1.9.16.0.80.235.26.112.240.0.0.0.0.0.0.0.0.0.216431 = STRING: "SEC-1193 Security violation: Login failure attempt via HTTP. IP Addr: 192.168.100.172." ...|'Messages_lines'=28 'Messages_warnings'=28 'Messages_criticals'=0 'Messages_unknowns'=0

Could I have some help ?
Thank you

Unknown option: maxage

Hi,

I'm using check_logfiles v3.11 and I would like to use the maxage parameter. If I specify the option on the command line with --maxage, I get the message "Unknown option: maxage". It has no effect in the configuration either.

/usr/lib/nagios/plugins/check_logfiles --logfile /var/log/sgv/cmdbsync.log --logfilemissing critical --maxage 1h

According to the changelog, it should have been available since version 3.9.

Any idea ?

Thanks for your help

Kind regards

rotation detection fails if the last-modified time of the rotated file and the new logfile are the same -> creates false CRITICAL states

Hello,

I encountered a bug in the rotation handling of check_logfiles which occurs when the rotated file has the same last-modified timestamp as the logfile itself. This frequently happens to us when using the logback project (http://logback.qos.ch/), because this engine rotates lazily: if we configure a rotation at midnight, the log is rotated each day when the first log entry is to be written after midnight. When this happens, the log is rotated and the line to be logged is then written into the (new) logfile. This usually means both files have the same timestamp.

The bug is reproducible. I have created a test script to reproduce the issue. Please see https://gist.github.com/betagan/3bf1356adb98b580e91a

You will need a check_logfiles executable in the same directory as the script or change the CLCMD variable in the top. Other than that, just execute the script.

This shows that although we only add "OK" type lines after the rotation, we still get a CRITICAL return code for the second execution of check_logfiles after the rotation.

example output is shown here: https://gist.github.com/betagan/df30f9010174333863da

We tracked the issue down to the following line in the check_logfiles executable, found in the collectfiles sub:

if ((stat $archive)[9] >=
        $self->{laststate}->{logtime}) {

We have (temporarily) changed this to be > instead of >= but we don't know if this breaks anything down the road. Any comment regarding this is very much welcomed.
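
For reference, a hedged sketch of that temporary local change (in the collectfiles sub), shown only to illustrate the comparison described above; it is not a vetted upstream fix:

if ((stat $archive)[9] >
        $self->{laststate}->{logtime}) {
    # ... the archive is still included in this scan ...
}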

All tests were performed with check_logfiles v3.6.2.1.

We are happy to provide more information if required.

"winwarncrit" option causes Windows eventlog messages to be prefixed with "EE_(EE|WW)_TT"

When monitoring the Windows eventlog, if the winwarncrit option is specified the log messages output by the plugin are prefixed with one of the following:

  • EE_EE_TT
  • EE_WW_TT
  • EE_UU_TT

I've tested using the latest git head (v4.0.1.6 at the time of writing) and the issue is still present. Steps to reproduce the issue are provided below:

Configuration file:

$options = 'report=long, maxlength=768';
$protocolretention=1;

@searches =
(
##
## Windows System EventLog Check
##
{
                tag => 'system-eventlog',
                script => sub {
                                my $trimlength = 500;
                                $op_trim = substr($ENV{CHECK_LOGFILES_SERVICEOUTPUT},0,$trimlength);
                                print $op_trim;
                                return $ENV{CHECK_LOGFILES_SERVICESTATEID};
                },
                type => 'wevtutil',
                eventlog => {
                                eventlog => 'application',
                                include => {
                                },
                },
                criticalpatterns => [
                                '.*',
                ],
                criticalexceptions => [
                ],
                warningpatterns => [
                ],
                warningexceptions => [
                ###STARTOFWARNINGEXCEPTIONS###
                ##ENDOFWARNINGEXCEPTIONS###
                ],
                okpatterns => [
                ###STARTOFOKPATTERNS###
                ###ENDOFOKPATTERNS###  
                ],
                options => 'nocase,supersmartscript,winwarncrit,sticky=90,preferredlevel=warning',
},
)

Clear the Application event log in Event Viewer and then run check_logfiles:

PS C:\Users\xxx\check_logfiles-522ebe\plugins-scripts> perl .\check_logfiles -f .\test.conf
OK - no errors or warnings|'system-eventlog_lines'=0 'system-eventlog_warnings'=0 'system-eventlog_criticals'=0 'system-eventlog_unknowns'=0

Use Powershell to add a warning to the event log:

New-EventLog -LogName Application -Source CheckLogfilesTest
Write-EventLog -LogName "Application" -Source "CheckLogfilesTest" -EventId 4242 -EntryType Warning -Message "Test warning event"

Run check_logfiles again:

PS C:\Users\xxx\check_logfiles-522ebe\plugins-scripts> perl .\check_logfiles -f .\test.conf
WARNING - (1 warnings in test.protocol-2022-05-05-14-08-58) - EE_WW_TT2022-05-05T14:08:50 4242 Test warning event |'system-eventlog_lines'=1 'system-eventlog_warnings'=1 'system-eventlog_criticals'=0 'system-eventlog_unknowns'=0
tag system-eventlog WARNING
EE_WW_TT2022-05-05T14:08:50 4242 Test warning event

Add an error to the event log using Powershell:

Write-EventLog -LogName "Application" -Source "CheckLogfilesTest" -EventId 4242 -EntryType Error -Message "Test error event"

Run check_logfiles again:

PS C:\Users\xxx\check_logfiles-522ebe\plugins-scripts> perl .\check_logfiles -f .\test.conf
CRITICAL - (2 errors, 1 warnings in test.protocol-2022-05-05-14-09-35) - EE_EE_TT2022-05-05T14:09:30 4242 Test error event ...|'system-eventlog_lines'=1 'system-eventlog_warnings'=1 'system-eventlog_criticals'=2 'system-eventlog_unknowns'=0
tag system-eventlog CRITICAL
EE_EE_TT2022-05-05T14:09:30 4242 Test error event
EE_EE_TT2022-05-05T14:09:30 4242 Test error event
EE_WW_TT2022-05-05T14:08:50 4242 Test warning event

Note how each message is prefixed with "EE_(EE|WW)_TT". However when I remove the winwarncrit option from the configuration the output doesn't include those prefixes:

PS C:\Users\xxx\check_logfiles-522ebe\plugins-scripts> perl .\check_logfiles -f .\test.conf
CRITICAL - (3 errors in test.protocol-2022-05-05-14-15-31) - 2022-05-05T14:12:02 4242 Test error event ...|'system-eventlog_lines'=3 'system-eventlog_warnings'=0 'system-eventlog_criticals'=3 'system-eventlog_unknowns'=0
tag system-eventlog CRITICAL
2022-05-05T14:08:50 4242 Test warning event
2022-05-05T14:09:30 4242 Test error event
2022-05-05T14:12:02 4242 Test error event

I tried changing the type option from "wevtutil" to "eventlog" but the issue remained.

check_logfiles.exe fails to use a path with spaces in the argument --config

Hello,

After compiling a new version of check_logfiles.exe and using it in a different environment, I discovered that I cannot use a path with spaces in the --config argument. Usual paths in a Windows environment may contain "Program Files".

C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles>"C:\Program Files\ICINGA2\/sbin/custom_plugins\check_logfiles\check_logfiles.exe" --config "C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles\logfiles_sap.cfg" --searches backupsap
UNKNOWN - can not load configuration file C:\Program

I guess it was not supported from the beginning. Both the latest and an old version of the plugin output the same error:

C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles>"C:\Program Files\ICINGA2\/sbin/custom_plugins\check_logfiles\check_logfiles_3.4.2.exe" --config 'C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles\logfiles_sap.cfg' --searches backupsap
UNKNOWN - can not load configuration file 'C:\Program

C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles>"C:\Program Files\ICINGA2\/sbin/custom_plugins\check_logfiles\check_logfiles_3.8.1.exe" --config 'C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles\logfiles_sap.cfg' --searches backupsap
UNKNOWN - can not load configuration file 'C:\Program

Running the script in the location of the config files and adapting the --config argument works, but that's not usable by the launching agent.

C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles>"C:\Program Files\ICINGA2\/sbin/custom_plugins\check_logfiles\check_logfiles.exe" --config logfiles_sap.cfg --searches backupsap
CRITICAL - (1 errors) - cannot write status file C:\Program Files\ICINGA2\sbin\custom_plugins\check_logfiles\seek/logfiles_sap.C__scripts_logs_backupsap.cmd.log.backupsap! check your filesystem (permissions/usage/integrity) and disk devices |'backupsap_lines'=0 'backupsap_warnings'=0 'backupsap_criticals'=1 'backupsap_unknowns'=0

It would be great to be able to use paths with spaces with the --config option.

BR
Yannick

Fail or warn if $protocolsdir does not exist or is not writable

Hi,

I have a custom $protocolsdir which I use in check_logfiles. When this directory doesn't exist or isn't writable, check_logfiles should output a warning plus the corresponding exit code, or something similar (I am monitoring the $protocolsdir for new entries/files, therefore I would like to know if there is any problem writing the protocol files).

-Thanks

specific options for one search pattern

Hey,
we use check_logfiles, among other things, for monitoring logs on our OpenSIPS proxies. I want to set criticalthreshold and warningthreshold for just one special pattern. Our config has about 40 patterns to match, and they must raise a warning at the first match, but one special pattern should warn or go critical on its first match.
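
A hedged sketch of one way per-pattern thresholds are often expressed with this plugin: keep the bulk patterns in one search and give the special pattern its own tagged search on the same logfile, with its own threshold settings (the tags, path and patterns below are placeholders):

@searches = (
  {
    tag             => 'opensips_bulk',
    logfile         => '/var/log/opensips.log',       # placeholder path
    warningpatterns => ['pattern_01', 'pattern_02'],   # ... the ~40 ordinary patterns
  },
  {
    tag               => 'opensips_special',
    logfile           => '/var/log/opensips.log',
    criticalpatterns  => ['the one special pattern'],
    criticalthreshold => 1,                            # alert on the first match
  },
);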

Test suite tests not always reproducible

I noticed that at least some test files do not generate reproducible results. Have a look at two test runs on the ppc64el architecture (not sure if that matters). One of the two runs was successful, the other failed. Both times the same source code of check_logfiles was used. AFAICT this is not the only case where the test suite generates non-reproducible results.

Thanks for having a look at this!

send_nsca is not working

I'm trying to monitor access_log on Fedora Linux httpd (looking for 404s).
Here are my config files:

cat check_httpd.cfg 
# Where to look for executables
$scriptpath = '/usr/bin:usr/sbin:/usr/lib64/nagios/plugins';

$MACROS = {
	NAGIOS_HOSTNAME => "igor-test",
	CL_NSCA_HOST_ADDRESS => "dummy.com",
	CL_NSCA_PORT => 5667
	};

#Change permissions to be able to read log files
$prescript = 'sudo';
$prescriptparams = 'chmod 705 /var/log/httpd';

@searches = ({
	tag => 'httpd_404',
	logfile => '/var/log/httpd/access_log',
	rotation => 'SUSE',
	criticalpatterns => [
	    '404'],
	    
	script => 'send_nsca',
	scriptparams => '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$ -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$',
	scriptstdin => '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t$CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n',
	},
	);

I generated some 404 errors by accessing non-existent links.
Now trying to run the script:

./check_logfiles --config check_httpd.cfg 
CRITICAL - (2 errors in check_httpd.protocol-2017-06-13-14-41-59) - ::1 - - [13/Jun/2017:14:41:06 +0300] "GET /1111 HTTP/1.1" 404 202 "-" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0" ...|'httpd_404_lines'=2 'httpd_404_warnings'=0 'httpd_404_criticals'=2 'httpd_404_unknowns'=0

So far, check_logfiles can see the errors in the log.

But send_nsca is not working. I'm using tcpdump to check for traffic on port 5667, but there is nothing.

Binaries are in the $scriptpath:

which sudo
/usr/bin/sudo

which send_nsca
/usr/sbin/send_nsca

Config for send_nsca is readable by current user:

cat /etc/nagios/send_nsca.cfg 
####################################################
# Sample NSCA Client Config File 
# Written by: Ethan Galstad ([email protected])
# 
# Last Modified: 02-21-2002
####################################################
........

send_nsca is working from command line:

send_nsca -H dummy.com -c ./send_nsca.cfg < ./nsca_test
1 data packet(s) sent to host successfully.

Not sure what can be wrong here.

Evaluate next line after match found

Hi,

I am using check_logfiles a lot and am quite happy with it. What I could not resolve so far is this:

If a critical match is found in line n, I want to hold off ringing the alarm and evaluate the next line n+1 instead. If it matches pattern ABC, I want to ignore the match, and ring the alarm if it does not.

So if my critical pattern was ERROR, I want that these 2 would be ignored:

ERROR global catastrophical error:
just a test though

But here, a critical error should be raised:

ERROR global catastrophical error:
somethin really bad happened that should be investigated

Any hint how this can be done with a supersmartscript?

Thanks!

Missing documentation for logfilemissing

I guess that this feature returns a CRITICAL or WARNING error when the logfile is missing, but I'd like to be sure before using it.
Can you add documentation for this option?

How to get yesterday's log file date

Hi,

We are trying to monitor a log file today which has yesterday's date (log_2017_09_23.txt).

Currently we have the macro below to get today's date, but unfortunately we need a similar macro to find yesterday's date.
logfile => '$LOGDIR$/log_$CL_DATE_YYYY$$CL_DATE_MM$$CL_DATE_DD$.txt',

Kindly let us know if there is any way to get yesterday's date.

Thanks
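
For illustration, a hedged sketch of one workaround: because the configuration file is evaluated as Perl, a custom "yesterday" macro can be computed locally. YESTERDAY is a made-up macro name (not a built-in one), the date format is assumed from the filename quoted above, and LOGDIR is assumed to be defined as in that snippet:

use POSIX qw(strftime);

$MACROS = {
  YESTERDAY => strftime("%Y_%m_%d", localtime(time - 24 * 60 * 60)),
};

@searches = (
  {
    tag     => 'yesterday_log',
    logfile => '$LOGDIR$/log_$YESTERDAY$.txt',
  },
);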

Scan log and alarm when pattern is found X times in a row

Is it possible to scan a logfile for a pattern and alarm only when it is found X times in a row?

For example I have this log:

2018-01-13 14:53:56.468  INFO 4607 [http-nio-6011-exec-1] --- c.m.fare.comm.resource.CoreResImpl       : Core handshake
2018-01-13 14:54:07.557  INFO 4607 [pool-2-thread-1] --- c.m.f.fs.service.TrnSendingServiceImpl   : Na core odeslano 0 transakci
2018-01-13 14:54:37.549  INFO 4607 [pool-2-thread-1] --- c.m.f.fs.service.TrnSendingServiceImpl   : Na core odeslano 0 transakci
2018-01-13 14:55:04.197  INFO 4607 [http-nio-6011-exec-3] --- c.m.fare.comm.resource.CoreResImpl       : Core handshake
2018-01-13 14:55:07.552  INFO 4607 [pool-2-thread-1] --- c.m.f.fs.service.TrnSendingServiceImpl   : Na core odeslano 0 transakci
2018-01-13 14:55:37.621  INFO 4607 [pool-2-thread-1] --- c.m.f.fs.service.TrnSendingServiceImpl   : Na core odeslano 0 transakci
2018-01-13 14:56:07.560  INFO 4607 [pool-2-thread-1] --- c.m.f.fs.service.TrnSendingServiceImpl   : Na core odeslano 0 transakci
2018-01-13 14:56:37.553  INFO 4607 [pool-2-thread-1] --- c.m.f.fs.service.TrnSendingServiceImpl   : Na core odeslano 0 transakci
2018-01-13 14:57:04.297  INFO 4607 [http-nio-6011-exec-5] --- c.m.fare.comm.resource.CoreResImpl       : Core handshake

and I want a critical state to occur only if the pattern Na core odeslano 0 transakci is found 4 or more times in a row.

Thx for your time.

Multiple line output

We need an option to print not only the matching line but also context lines, i.e. n / m lines of the log before / after the matching line. Could this be implemented? Many thanks!

check_logfiles on /var/log/message resulting in frequent socket timeouts

Hello,

I'm looking for some guidance as this issue has been plaguing me for a little while now and I'm almost positive it's related to something I'm doing inefficiently.

I am using the "check_logfiles" plugin against my syslog located at /var/log/messages. I wanted the granularity of defining different properties and thresholds for different patterns so I am choosing to use different .cfg patterns and different nagios service checks. I have been receiving many socket timeouts from these service checks. They are not constant and happen on different hosts but it occurs all day long intermittently on different servers

It should be noted, there are also unrelated checks that are not exhibiting the same "socket timeout" behavior.

Here are the config files in question:

check_logfiles_messages_qla_critical.cfg
@searches = (
  {
    tag => 'critical qla',
    logfile => '/var/log/messages',
    criticalpatterns => 'Abort command issued nexus',
    options => "criticalthreshold=15",
  },
);

check_logfiles_messages_qla_warning.cfg
@searches = (
  {
    tag => 'warning qla',
    logfile => '/var/log/messages',
    warningpatterns => ['QUEUE FULL detected', 'FCPort state transitioned from'],
    options => "warningthreshold=8",
  },
);

Other examples that seem to run just fine (no intermittent socket timeouts)...
@searches = (
  {
    tag => 'lpfc',
    logfile => '/var/log/messages',
    criticalpatterns => 'kernel: lpfc',
  },
);

Below is how the Nagios command is being issued; sudoers has already been configured. I recently added the --rununique flag to see if that would help; it hasn't. Any help/guidance/insight into what this plugin is doing that I might be overlooking would be extremely helpful! For example, I know that a temporary index file gets created; is it possible that several of these index files are being created and conflicting with each other, or somehow confusing the script?

/usr/bin/sudo /usr/lib64/nagios/plugins/check_logfiles --rununique -f /etc/nagios/plugins/check_logfiles_messages_qla_critical.cfg

Check Alerts problems

Hello and thanks for this awesome work !

Maybe I'm just a noob and this is only because I don't know how it works, but I have a problem with Nagios alerts.

On the first launch, all is OK: I get my log errors and I'm happy!
On the second check it says no errors, like a reset (rotation, seekfile?), but my log file is still full of errors!?

Some tests in the test suite fail

I downloaded version 4.1.1 and ran the test suite. Most of the tests run OK, but some fail. Here is the test summary; the full log is attached.

Test Summary Report

040wevtutilfilt.t    (Wstat: 512 (exited 2) Tests: 0 Failed: 0)
  Non-zero exit status: 2
  Parse errors: No plan found in TAP output
041wevtutil.t        (Wstat: 512 (exited 2) Tests: 0 Failed: 0)
  Non-zero exit status: 2
  Parse errors: No plan found in TAP output
041wevtutilps.t      (Wstat: 512 (exited 2) Tests: 0 Failed: 0)
  Non-zero exit status: 2
  Parse errors: No plan found in TAP output
100flooddetect.t     (Wstat: 256 (exited 1) Tests: 3 Failed: 1)
  Failed test:  3
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 36 tests but ran 3.
Files=69, Tests=693, 1790 wallclock secs ( 0.45 usr  0.30 sys + 24.40 cusr 11.76 csys = 36.91 CPU)
Result: FAIL
Failed 4/69 test programs. 1/693 subtests failed.

test_suite.zip
