alertmanager / alert_manager
Splunk Alert Manager with advanced reporting on alerts, workflows (modify assignee, status, severity) and auto-resolve features
License: Other
sourcetype = alert_results
There seems to be a limitation around 10000 characters per result event produced by alert_handler.py
If there is more data the result does not get written (indexed) completely and therefore breaks the JSON envelope.
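One possible workaround, sketched here under assumptions, is to split the results into several smaller events so that each one stays below the observed limit and remains valid JSON on its own. The function name `chunk_results`, the envelope keys (`incident_id`, `part`, `fields`) and the 10000-character threshold are illustrative and are not taken from alert_handler.py.

```python
import json

# Assumed limit, based on the observation above (not a documented constant).
MAX_EVENT_CHARS = 10000


def _envelope(incident_id, part, rows):
    """Serialize one self-contained JSON event (hypothetical envelope layout)."""
    return json.dumps({"incident_id": incident_id, "part": part, "fields": rows})


def chunk_results(incident_id, rows, max_chars=MAX_EVENT_CHARS):
    """Split result rows into JSON events that each stay under max_chars.

    A single row that is larger than max_chars is still emitted as its own
    (oversized) event; handling that case is left out of this sketch.
    """
    events, current = [], []
    for row in rows:
        candidate = _envelope(incident_id, len(events), current + [row])
        if current and len(candidate) > max_chars:
            # Adding this row would exceed the limit: flush what we have.
            events.append(_envelope(incident_id, len(events), current))
            current = [row]
        else:
            current.append(row)
    if current:
        events.append(_envelope(incident_id, len(events), current))
    return events
```

Each returned string can be written as a separate event; the `part` counter lets a search put the pieces back together.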
Probably we should deliver the app with an alert_urgencies.csv.sample and point transforms.conf to that file.
This allows users to have their own urgency definitions and we don't overwrite the csv when updating the app.
Idea blatantly stolen from ES.
Please check in eventtypes.conf
Now: -24h to -24h
Should be: 2x outer timerange
Deleting an Alert under settings does not seem to work.
Currently, there are two drilldowns:
But there should be a third option: running the event search of the alert (without the statistics command).
To better organize the drilldowns, a pulldown menu should be integrated into the incident row.
If this is the preferred behavior, maybe we should warn about this?
Or maybe we should keep the old incidents open, and only start closing incidents from the time of checkboxing onwards?
When adding a new row to the alert_users or alert_settings handsontable and trying to save, the backend returns an HTTP 500 error.
The double negative is confusing. How about replacing it with a positive setting, e.g. save_to_index [0|1]?
Typo in alert_manager.conf.spec: "Wheter to save results to alerts index or not" ("Wheter" should be "Whether").
For reporting, the previous_status should be logged for the auto_resolve_* statuses.
I think the trend numbers should be white, as red can be hard to read (e.g. on a monitoring screen).
Currently we have an auto_previous_resolve option that closes existing alerts in state new when a new alert with the same name arrives.
This works for use cases where we don't care about the previous alert and just want to know which alerts are in an active state.
An alternative use case is wanting to know how long an alert has persisted. For that we need an option to preserve the first incident and close all subsequent alerts.
The search string looks a bit odd:
[...] |search status="new" status="resolved" [...]
Is that intended?
The alert_handler.py script breaks trying to get the job at
serverResponse, serverContent = rest.simpleRequest(uri, sessionKey=sessionKey, getargs={'output_mode': 'json'})
(line 115).
Currently, opening the dialog shows the default owner (?)
According to http://docs.splunk.com/Documentation/Splunk/latest/Admin/savedsearchesconf, alert.severity may have a value from 1 to 6. The alert manager only knows severities from 1 to 5.
The plan is to provide an interface for building custom Python classes that catch events/hooks of the alert handler, in order to extend the incident workflow with custom actions.
List of current known hooks:
List of information needed on each hook:
[ to be defined ]
@my2ndhead: Let's discuss on Friday how to proceed.
See title.
It looks like auto_previous_resolve does not log status changes at all.
Use saved search description, parse HTML
Split Splunk configuration of the alert_manager related to index time configuration to a separate addon:
The question is whether the addon also has to be installed on an all-in-one instance (indexing and searching on the same box), or whether the main alert_manager app should contain the necessary configurations, as it already does today.
The alert handler grabs incidents from the collection when auto_previous_resolve is active for a certain alert. The filter used with the query returns all incidents related to the alert instead of only new ones.
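As a rough illustration of a narrower filter, the sketch below builds a KV store query that matches both the alert name and the status `new`. The collection name `incidents`, the app name `alert_manager` and the field names `alert` and `status` are assumptions, since the actual schema is not shown here; only the general `storage/collections/data` endpoint with a JSON `query` parameter is standard Splunk KV store REST usage.

```python
import json
from urllib.parse import quote


def build_open_incident_query(alert_name, app="alert_manager", collection="incidents"):
    """Return a KV store URI that selects only *new* incidents of one alert.

    The app, collection and field names are hypothetical placeholders.
    """
    # Both conditions in one JSON object are combined with a logical AND.
    query = json.dumps({"alert": alert_name, "status": "new"}, sort_keys=True)
    return "/servicesNS/nobody/%s/storage/collections/data/%s?query=%s" % (
        app,
        collection,
        quote(query),
    )
```

The resulting URI could then be passed to the same REST helper the handler already uses for its other requests.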
Add trend indicators for single values (number of incidents by urgency) to the incident posture dashboard.
The information should include.
There should be an option to activate email notification for each hook/action by alert (alert settings).
It should also be possible to define an HTML template (or RTF?) per notification type and alert.
Current hooks when email notification could be sent:
List of information needed on each hook:
[ to be defined ]
This could perhaps be integrated as a CustomIncidentHandler (see issue #7).
After editing the urgency of an alert, the corresponding single value in Incident Posture is not updated.
Unfortunately we can't use this TTL in this way, because it counts down towards 0 automatically:
entry['ttl'] = job['entry'][0]['content']['ttl']
We could use alert.expires
http://docs.splunk.com/Documentation/Splunk/6.2.0/RESTREF/RESTsearch#saved.2Fsearches
Valid values: [number][time-unit]
Sets the period of time to show the alert in the dashboard. Defaults to 24h.
Use [number][time-unit] to specify a time. For example: 60 = 60 seconds, 1m = 1 minute, 1h = 60 minutes = 1 hour.
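To use alert.expires as a fixed TTL, the handler would have to convert a `[number][time-unit]` value into seconds. A minimal sketch of such a conversion follows; the function name `parse_ttl` is made up for illustration, and the `d` (days) unit is an assumption that goes beyond the examples quoted above.

```python
import re

# Seconds per unit. "s", "m" and "h" follow the examples above; "d" is assumed.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_ttl(value, default="24h"):
    """Convert a [number][time-unit] string into seconds.

    A bare number is treated as seconds; an empty value falls back to the
    documented default of 24h.
    """
    text = (value or default).strip().lower()
    match = re.fullmatch(r"(\d+)([smhd]?)", text)
    if match is None:
        raise ValueError("invalid time specification: %r" % value)
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit or "s"]
```

For example, `parse_ttl("1h")` yields 3600, and `parse_ttl(None)` yields 86400 because of the 24h default.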
previous_status has the same value as status:
time=2014-12-28T11:20:30.044053 severity=INFO origin="alert_manager_scheduler" event_id="b4725db3c989febc75aa76382f27d432" user="splunk-system-user" action="auto_ttl_resolve" previous_status="auto_ttl_resolved" status="auto_ttl_resolved" job_id="scheduler__admin__testapp__RMD5384fa1812534b4c4_at_1419761700_97"
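One plausible cause (an assumption, since the scheduler code is not shown here) is that the status is overwritten before the log line is built. The sketch below reads the old value first; `resolve_incident` and its parameters are hypothetical names, not the scheduler's actual API.

```python
def resolve_incident(incident, new_status, action, log):
    """Change an incident's status and log the transition.

    `incident` is a dict with a "status" key and `log` is any callable that
    accepts a string; both are illustrative stand-ins.
    """
    # Capture the old value *before* it is overwritten.
    previous_status = incident["status"]
    incident["status"] = new_status
    log('action="%s" previous_status="%s" status="%s"'
        % (action, previous_status, new_status))
    return incident
```

With an incident in state `new`, calling `resolve_incident(incident, "auto_ttl_resolved", "auto_ttl_resolve", log)` logs `previous_status="new"` rather than repeating the new status.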
Use bootstrap views instead
After conversion to HTML, the dashboard shows the filename as label in the App nav
It would be good if incidents could be enriched with data from the results.
The detail section could show selected (configurable) fields and their values per alert. This would help incident investigators, who have to decide on further actions, without having to look at the full results.
Either the fields are fully configurable, or we provide a set of commonly used CIM fields such as src, host, user and action.
When there are multiple rows, we have to think about how to present the fields.
curl --get -k https://localhost:8089/servicesNS/admin/testapp/search/jobs/1418908235.66/results -d "output_mode=csv" -d "f=host f=action f=user" -u
old: Due the usage of the App Key Value Store
new: Due to the usage of the App Key Value Store
old: Disable saving Alert results to index: Wheter to
new: Disable saving Alert results to index: Whether to
old: configured as globally visible are showed in the list
new: configured as globally visible are shown in the list
old: E-mail notifications on incident assignement
new: E-mail notifications on incident assignment
old: cd $SPLUNK_HOME/bin/script && ln -s ../..
new: cd $SPLUNK_HOME/bin/scripts && ln -s ../..
They only show up if the alert is set to export = global or if it is placed in the alert manager app.
The Splunk scripted input for the alert_manager_scheduler (e.g. for the auto_ttl_resolve alert scenario) is currently only available on Linux, since a shell script is used to wrap the python script.
For the final submission, the scheduler should also be working on Windows.
This only retrieves searches from the search app or searches that are shared globally.
uri = '/servicesNS/nobody/search/admin/savedsearch/%s' % alert
The token after "nobody" should change according to the alert's app.
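A minimal sketch of the suggested change is to pass the app name in instead of hardcoding `search`. How the handler learns the alert's app is not shown here, so `app` is simply a parameter, and the function name `saved_search_uri` is illustrative.

```python
from urllib.parse import quote


def saved_search_uri(alert, app):
    """Build the saved search URI inside the alert's own app namespace."""
    # Quote both parts so names with spaces or slashes stay a single segment.
    return "/servicesNS/nobody/%s/admin/savedsearch/%s" % (
        quote(app, safe=""),
        quote(alert, safe=""),
    )
```

For an alert named `My Alert` in `testapp`, this produces `/servicesNS/nobody/testapp/admin/savedsearch/My%20Alert`.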
Several places in dashboards and configurations:
The words "Severity", "Urgency" and "Priority" are used in a mixed way that could confuse users.
Status change not logged.
We should not hardcode the index name in the data model.
I already have 7 users and can't see the save button.
According to the rules, we should add a help dashboard containing usage instructions:
Configure the alert manager as a new type of alert action; make it possible to enable it as an option in savedsearches.conf and, if possible, in the settings UI.
.py scripts in ../bin/ must be committed to git with executable permissions.