fxnn / gowatch
Configurable logfile analysis for your server.
License: MIT License
In #22, we implement wildcards to support rotated logs. Most importantly, this implies that we must be able to handle gzipped log files transparently, i.e. just as if the file were uncompressed.
Each log entry has a timestamp. One must be able to compare timestamps, at least relative to the current time, or even against some given timestamp.
Therefore, we could have predicates as:
timestamp: {before: "-2d"} // not newer than 2 days in past
timestamp: {after: "-1h30m"} // not older than 1 hour and 30 minutes
timestamp: {after: "2015-01-01T00:00:00Z"} // in 2015
Note that the last variant would need some fixed timestamp format. RFC 3339 (a profile of ISO 8601) feels right here.
Turns out that Syslog doesn't log the year.
Apr 5 05:39:01 lvps176-28-9-153 CRON[5773]: pam_unix(cron:session): session opened for user root by (uid=0)
Looks like we need to calculate the year ourselves. We could do this by assuming the current year and falling back to the previous year whenever that would place the entry in the future.
Guess it's bad style to have
log.Fatalf("Error message")
return logentry.AcceptNothingPredicate{} // actually never executed
all over the code. The decision to quit immediately should be made in only one place in the code; everywhere else, normal error handling (by returning errors) should be used.
Currently, to parse files with grok, one has to say parser: grok. Equally, to summarize log entries with grok, one has to say summarizer: grokcounter.
Since those currently seem to be the most useful ones to me, we could just make them the defaults.
See build #44.2: it fails because of what looks like a bug inside golang.org/x/text, causing the call
collate.New(language.AmericanEnglish, collate.Numeric)
to panic as shown. Currently, we work around it by hard-coding language.Und.
To be able to read rotated log files, we need to implement wildcards in the log file path.
Now, it would be a problem if we piped log files from several months (or even years) through gowatch just to find out that every single line is filtered out because of a timestamp predicate. Therefore, the parser should apply timestamp predicates to file modification timestamps also.
While currently, one can only add tags to each line of a logfile, it should be possible to add tags only to lines matching a given predicate.
Though this is currently possible by parsing the same file multiple times with different predicates, that's not the way it's supposed to work.
Possible syntax would be
logfiles:
  - filename: /path/to/file.log
    tags:
      - tag_for_each_line
      - tag_for_few_lines: {
          Message: {contains: some text}
        }
Alternatively, we could provide this feature only through an extended mapping section, in addition to the existing logfiles and summary sections.
(Note that this already uses the new predicate syntax from #4.)
Setting up a configuration for many logfiles, including expressive summaries, is a hard piece of work; therefore, users will want to share pieces of their configuration.
To facilitate sharing, we should make it as easy as possible. Therefore, one should be able to import configuration from a URL. Apart from the remote location, everything should work as described in #6.
Supported protocols should be HTTP and HTTPS to begin with. For faster execution, gowatch should cache the imported files between invocations. We should keep in mind that some users will want to use gowatch in an offline mode; we could support this by downloading once and never again -- maybe just by using the cache.
Currently, a configuration file looks as follows:
logfiles:
  - filename: /var/log/auth.log
    tags: ['auth.log']
    timelayout: Stamp
    config: {pattern: '%{SYSLOGBASE} %{GREEDYDATA:Message}'}
summary:
  - summarizer: count
    title: auth.log
    where: {tags: {contains: 'auth.log'}}
    config: {
      'sudo [%{user}->%{effective_user}] %{command}': '\s*%{USER:user}\s*: TTY=%{DATA:tty} ; PWD=%{PATH:pwd} ; USER=%{USER:effective_user} ; COMMAND=%{PATH:command}(: %{GREEDYDATA:arguments})?'
    }
Parts of it are made to be easy to read, like where: {tags: {contains: 'auth.log'}}. Everyone should know what's meant, and I also feel that it's quite intuitive and thus easy to write and remember.
This should be done with all keywords in the file (as far as possible). Ideas:

do: count (instead of summarizer)
with: {pattern: 'abc'} (instead of config)

The config file should provide a means of adding grok patterns, either by naming a file in the usual format
PATTERN_NAME (my)?p[at]tern
or by just providing patterns inside the YAML file. The syntax could be

patterns: {
  PATTERN_NAME: '(my)?p[at]tern'
}

for inline patterns, and

patternsource: /path/to/patternfile

for a pattern file.
The main summarizer we currently have is the GrokCounter, which takes a set of named patterns and counts the occurrences of each pattern.
Dovecot: Failed Login Attempts
==============================
5.196.31.23: 1
49.248.147.211: 1
52.6.24.186: 4
52.6.71.222: 3
52.6.130.221: 2
54.208.194.166: 1
Now, what I'd like to see is that we not only have the number of occurrences per pattern, but can also see what actually happened. In the above example, we could list the user names per IP.
Dovecot: Failed Login Attempts
==============================
5.196.31.23: webmaster
49.248.147.211: admin
52.6.24.186: joe, webmaster, admin, adm
52.6.71.222: adm, admin, joe
52.6.130.221: frank, joe
54.208.194.166: user
It's not yet clear to me how to specify the match to be displayed. The configuration for the GrokCounter is
- summarizer: count
  config: {
    '%{login_host}': 'auth\(%{PROG}\): %{PROG}\(%{USER},%{IPORHOST:login_host}\): unknown user'
  }
Guess we need a tuple or something, so that we can specify the pattern and the match to be displayed:
- summarizer: count
  config: {
    '%{login_host}': ['%{user}', 'auth\(%{PROG}\): %{PROG}\(%{USER:user},%{IPORHOST:login_host}\): unknown user']
  }
Unfortunately, tuples are hard to read. So, another map?
- summarizer: count
  config: {
    '%{login_host}': {
      list: '%{user}',
      for: 'auth\(%{PROG}\): %{PROG}\(%{USER:user},%{IPORHOST:login_host}\): unknown user'
    }
  }
Currently, predicates work like
where: {
  allof: [{
    field: Message,
    contains: some text
  }, {
    field: Message,
    matches: '%{SOME_PATTERN}'
  }]
}
This could be made a lot more concise by removing the "field" key and mapping fields to conditions instead:
where: {
  Message: {contains: some text, matches: '%{SOME_PATTERN}'},
  not: {Message: {matches: '%{ANOTHER_PATTERN}'}}
}
This, of course, comes at the expense of prohibiting the custom field names not, allof and anyof. Guess we can live with that.
Within a configuration file, it should be possible to import other configuration files by using an import statement on the top level.
Syntax should be as follows. First, we should be able to import a single file.
import: /path/to/import.yml
Then, we should be able to import multiple files at once.
import:
  - /path/to/import1.yml
  - /path/to/import2.yml
The semantics shall be as follows. On importing a configuration file, every logfile defined therein must be added to the logfiles defined in the importing configuration file. Equally, every summary defined in the imported configuration file must be added to those defined in the importing configuration file.
The following summarizer is malfunctioning:
- summarizer: count
  title: Hosts of Discarded and Junk Mails
  where: {
    allof: {
      tags: {contains: 'mail.log'},
      SYSLOGPROG: {contains: 'dovecot'}
    }
  }
  config: {
    '%{msgid_host}': 'deliver\(%{USER:user}\): sieve: msgid=<%{DATA:msgid_nonhost}@%{IPORHOST:msgid_host}>: marked message to be discarded if not explicitly delivered',
    '%{msgid_host}': "deliver\\(%{USER:user}\\): sieve: msgid=<%{DATA:msgid_nonhost}@%{IPORHOST:msgid_host}>: stored mail into mailbox 'Junk'"
  }
The reason for this is the duplicate key in the config section: currently, only one of the two entries gets processed. To resolve the problem, we must use YAML lists instead, since map keys have to be unique.
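One way out, sketched as a sequence of single-entry maps (an assumption about the eventual syntax, not a settled design), which keeps both patterns under the same display key:

```yaml
config:
  - '%{msgid_host}': 'deliver\(%{USER:user}\): sieve: msgid=<%{DATA:msgid_nonhost}@%{IPORHOST:msgid_host}>: marked message to be discarded if not explicitly delivered'
  - '%{msgid_host}': "deliver\\(%{USER:user}\\): sieve: msgid=<%{DATA:msgid_nonhost}@%{IPORHOST:msgid_host}>: stored mail into mailbox 'Junk'"
```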