alert_manager's People

Contributors

iceeyz, johnfromthefuture, lkm, ltmon, mkldon, moshekaplan, my2ndhead, neo23x0, simcen, wseng

alert_manager's Issues

Should we deliver the urgencies as a sample file?

We should probably ship the app with an alert_urgencies.csv.sample and point transforms.conf to that file.

This allows users to have their own urgency definitions and we don't overwrite the csv when updating the app.
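One possible reading of this, as a rough sketch: the app ships only the .sample file, and a setup step copies it to the live CSV when no user copy exists yet. The helper name `ensure_urgencies_csv` and the idea of a separate live file are assumptions, not something the app does today.

```python
import shutil
from pathlib import Path


def ensure_urgencies_csv(lookups_dir):
    """Copy alert_urgencies.csv.sample to alert_urgencies.csv if no user copy exists.

    Hypothetical helper. Returns True if the sample was copied,
    False if an existing user file was left untouched.
    """
    lookups = Path(lookups_dir)
    sample = lookups / "alert_urgencies.csv.sample"
    target = lookups / "alert_urgencies.csv"
    if target.exists():
        # Never overwrite the user's own urgency definitions.
        return False
    shutil.copyfile(sample, target)
    return True
```

Because an update would only replace the .sample file, user changes to alert_urgencies.csv survive.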

Idea blatantly stolen from ES.

Change incident drilldown to a pulldown menu

Currently, there are two drilldowns:

  1. Click on magnifier to open a new window and run the alert search
  2. Click on incident row to load alert job results in inpage-drilldown panel

But there should be a third option: run the event search of the alert (without the statistics commands).

To better organize the drilldowns, a pulldown menu should be integrated into the incident row.

Add installation verification

  • Check if TA is installed
  • Check if "alerts" index exists, warn user
  • Check if "custom" index already exists, warn if so
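A rough sketch of how these checks could be expressed, kept independent of how the lists of apps and indexes are fetched from Splunk. The function name, the `custom_index` parameter and the warning texts are hypothetical. The sketch reads the "custom" index as a user-chosen index name, which is only one possible interpretation of the third check.

```python
def verify_installation(installed_apps, existing_indexes, custom_index=None):
    """Return a list of warning strings for the three checks above.

    installed_apps / existing_indexes: collections of names, however obtained.
    custom_index: optional user-chosen index name (assumption, see lead-in).
    """
    warnings = []
    if "TA-alert_manager" not in installed_apps:
        warnings.append("TA-alert_manager is not installed")
    if "alerts" not in existing_indexes:
        warnings.append('index "alerts" does not exist')
    if custom_index is not None and custom_index in existing_indexes:
        warnings.append('custom index "%s" already exists' % custom_index)
    return warnings
```

An empty list means all checks passed. Anything else can be shown to the user as-is.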

App should provide an auto_subsequent_resolve option

Currently we have an auto_previous_resolve option that closes existing alerts in state new when a new alert with the same name arrives.

This works for use cases where we don't care about the previous alert and just want to know which alerts are in an active state.

An alternative use case is wanting to know how long an alert has persisted. For that, we need an option to preserve the first incident and close all subsequent alerts.
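To make the difference between the two options concrete, here is a minimal sketch over plain dicts. The incident layout and the status names "auto_previous_resolved" and "auto_subsequent_resolved" are assumptions for illustration. Only the option names come from this issue.

```python
def apply_new_incident(existing, new_incident, mode):
    """Add new_incident to existing, resolving incidents according to mode.

    existing: list of dicts with at least 'alert' and 'status' keys.
    mode: "auto_previous_resolve", "auto_subsequent_resolve" or None.
    """
    open_same = [
        i for i in existing
        if i["alert"] == new_incident["alert"] and i["status"] == "new"
    ]
    if mode == "auto_previous_resolve":
        # Keep only the newest incident open.
        for incident in open_same:
            incident["status"] = "auto_previous_resolved"
    elif mode == "auto_subsequent_resolve" and open_same:
        # Keep the first incident open and close the one that just arrived.
        new_incident["status"] = "auto_subsequent_resolved"
    existing.append(new_incident)
    return existing
```

With auto_subsequent_resolve the first incident keeps its original creation time, so the duration of the alert can be read from it.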

Specification for custom hooks/integration of external systems [CustomIncidentHandler]

The plan is to provide an interface for building custom Python classes that catch events/hooks of the alert handler, in order to integrate custom actions into the incident workflow.

List of currently known hooks:

  • create incident
  • auto previous resolve incident
  • auto ttl resolve incident
  • change incident detail
    • assign
    • change status
    • change priority

List of information needed on each hook:
[ to be defined ]
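As a starting point for the discussion, a rough sketch of what such an interface could look like. Everything here is hypothetical: the method names, the `HookDispatcher` class and the arguments passed to each hook (which are still to be defined above).

```python
class CustomIncidentHandler:
    """Base class: subclasses override only the hooks they care about."""

    def on_create_incident(self, incident):
        pass

    def on_auto_previous_resolve(self, incident):
        pass

    def on_auto_ttl_resolve(self, incident):
        pass

    def on_change_incident(self, incident, field, old_value, new_value):
        # field would be e.g. "owner", "status" or "priority" (assumed names)
        pass


class HookDispatcher:
    """Calls every registered handler when the alert handler fires a hook."""

    HOOKS = (
        "on_create_incident",
        "on_auto_previous_resolve",
        "on_auto_ttl_resolve",
        "on_change_incident",
    )

    def __init__(self):
        self._handlers = []

    def register(self, handler):
        self._handlers.append(handler)

    def fire(self, hook, *args):
        if hook not in self.HOOKS:
            raise ValueError("unknown hook: %s" % hook)
        for handler in self._handlers:
            getattr(handler, hook)(*args)
```

An integration with an external system would then subclass CustomIncidentHandler and be registered once at startup.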

@my2ndhead: Let's discuss on Friday how to proceed.

Create alert_manager add-on for distributed environments

Split the index-time Splunk configuration of the alert_manager into a separate add-on:

  • TA-alert_manager
    • props.conf: index time configurations for alert_metadata, alert_results, incident_change
    • indexes.conf: alerts index

The open question is whether the add-on must also be used on an all-in-one instance (indexing and searching on the same box), or whether the main alert_manager app should contain the necessary configuration, as it already does today.

Add trend indicators to the incident posture dashboard

Add trend indicators for the single values (number of incidents by urgency) to the incident posture dashboard.
The information should include:

  • Up/down arrow indicating an increase/decrease in the number of incidents
  • Increase/decrease as number
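The two pieces of information can be derived from the counts of the current and the previous time range. A minimal sketch, with a hypothetical function name and plain strings standing in for the arrows:

```python
def trend_indicator(current, previous):
    """Return (direction, delta) for two incident counts.

    direction is "up", "down" or "flat"; delta is the signed difference.
    """
    delta = current - previous
    if delta > 0:
        direction = "up"
    elif delta < 0:
        direction = "down"
    else:
        direction = "flat"
    return direction, delta
```

For example, 12 incidents now against 9 in the previous range gives ("up", 3).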

Integrate e-mail notifications

There should be an option to activate e-mail notifications for each hook/action per alert (alert settings).
It should also be possible to define an HTML template (or RTF?) per notification type and alert.

Current hooks on which an e-mail notification could be sent:

  • create incident
  • auto previous resolve incident
  • auto ttl resolve incident
  • change incident detail
    • assign
    • change status
    • change priority

List of information needed on each hook:
[ to be defined ]

This could possibly be integrated as a CustomIncidentHandler (see issue #7).

Previous status logged wrong

previous_status has the same value as status:

time=2014-12-28T11:20:30.044053 severity=INFO origin="alert_manager_scheduler" event_id="b4725db3c989febc75aa76382f27d432" user="splunk-system-user" action="auto_ttl_resolve" previous_status="auto_ttl_resolved" status="auto_ttl_resolved" job_id="scheduler__admin__testapp__RMD5384fa1812534b4c4_at_1419761700_97"
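The cause is not confirmed here, but one likely explanation is that the status is overwritten before previous_status is read. A minimal sketch of the intended order, using a hypothetical function and a plain dict as the incident:

```python
def auto_ttl_resolve(incident):
    """Resolve an incident and return the key=value pairs to log."""
    # Capture the old status BEFORE overwriting it.
    previous_status = incident["status"]
    incident["status"] = "auto_ttl_resolved"
    return 'action="auto_ttl_resolve" previous_status="%s" status="%s"' % (
        previous_status,
        incident["status"],
    )
```

With an incident in state new, the log fragment then shows previous_status="new" and status="auto_ttl_resolved".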

Enrichment of incidents with results

It would be good if incidents could be enriched with data from the results.

The detail section could show, per alert, selected (configurable) fields and their values. This would help incident investigators, who have to decide on further actions, without having to look at the full results.

Either the fields are fully configurable, or we provide a set of frequently used fields from CIM, like src, host, user, action.

When there are multiple rows, we have to think about how to present the fields.

curl --get -k https://localhost:8089/servicesNS/admin/testapp/search/jobs/1418908235.66/results -d "output_mode=csv" -d "f=host f=action f=user" -u
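For the multi-row case, one option is to collapse each configured field to its distinct values. A rough sketch that works on already-parsed result rows (dicts). The function name and the choice to show distinct values are assumptions, not a decided design.

```python
def enrich_incident(rows, fields):
    """Collect the distinct, non-empty values of each configured field.

    rows: list of dicts, one per result row.
    fields: field names to show in the incident detail section.
    Values keep the order in which they first appear.
    """
    summary = {}
    for field in fields:
        values = []
        for row in rows:
            value = row.get(field)
            if value not in (None, "") and value not in values:
                values.append(value)
        summary[field] = values
    return summary
```

A field that is missing from all rows ends up with an empty list, so the detail section can simply skip it.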

Docs + Readme: Typos and Optimization

README.md

old: Due the usage of the App Key Value Store
new: Due to the usage of the App Key Value Store

old: Disable saving Alert results to index: Wheter to
new: Disable saving Alert results to index: Whether to

old: configured as globally visible are showed in the list
new: configured as globally visible are shown in the list

old: E-mail notifications on incident assignement
new: E-mail notifications on incident assignment

old: cd $SPLUNK_HOME/bin/script && ln -s ../..
new: cd $SPLUNK_HOME/bin/scripts && ln -s ../..

alert_manager_scheduler is not working on Windows

The Splunk scripted input for the alert_manager_scheduler (e.g. for the auto_ttl_resolve alert scenario) currently only works on Linux, since a shell script is used to wrap the Python script.

For the final submission, the scheduler should also work on Windows.

alert_handler.py needs to be app aware

Currently this only gets searches from the search app, or searches that are shared globally.

Get savedsearch settings

uri = '/servicesNS/nobody/search/admin/savedsearch/%s' % alert

The token after "nobody" should change according to the alert's app.
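A minimal sketch of an app-aware version, assuming the alert's app name is available to the handler (where it comes from is not covered here). The function name is hypothetical, and the sketch uses Python 3's urllib.parse to escape both path segments.

```python
from urllib.parse import quote


def savedsearch_uri(alert, app):
    """Build the savedsearch URI for an alert living in the given app."""
    return "/servicesNS/nobody/%s/admin/savedsearch/%s" % (
        quote(app, safe=""),
        quote(alert, safe=""),
    )
```

Calling savedsearch_uri("My Alert", "testapp") yields /servicesNS/nobody/testapp/admin/savedsearch/My%20Alert.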

index is hardcoded

We should not hardcode the index name in the data model.

  1. Create an eventtype for the datamodel including the index name
  2. Adjust dashboards containing the index name, refer to the eventtype
  3. Add instructions on how to change the index name manually
  4. Let the app setup configure the index

Add usage instructions to a help dashboard

According to the rules, we should add a help dashboard containing usage instructions:

  • How to configure alert and incident settings
  • How to use the incident posture and drilldowns
  • How to use the reporting dashboard
  • How to use the KPI dashboards

Provide demo data

  • The app should provide demo / sample data, to work without full configuration of alerts
  • The demo data should cover all features
  • If possible, activate/deactivate demo data through the UI

Object name refactoring

  • alert_settings should be incident_settings
  • alert_users should be ???
  • alert_users: "user" should be "name"
