
incident-bot's Introduction

incident-bot


Incident management framework centered around a ChatOps bot for Slack to allow your teams to easily and effectively identify and manage technical incidents impacting your cloud infrastructure, your products, or your customers' ability to use your applications and services.

Check out Incident Bot's Documentation

Need support or just want to chat with us? Join us on Discord.

Interacting with the bot is incredibly easy through the use of modals and simplified commands:

Featuring a rich web management UI:

Features at a Glance

  • Helps you declare and run incidents - All the automation you'll need to organize, strategize, and explain
    • Create a war room Slack channel - Create a Slack channel automatically, prepopulate it with key information, and manage all of your incidents in a centralized digest channel
    • Control from start to finish - Shift the incident through status and severity from a management menu - never leave the channel
  • Helps you find the right people to assist - Page teams, automatically add groups or users, and start putting out fires
    • Manage user participation - Invite key users to an incident channel automatically - users can be elected to roles or can claim them
    • Send out internal updates - Keep your internal users up to date via the incident digest channel
  • Handles organizing facts, documentation, and evidence - Automatically build a postmortem doc with a timeline, attach evidence, and collect relevant data
  • Integrates with your favorite tools
    • Confluence - Automatically format and create a postmortem document in Confluence
    • Jira - Create and associate Issues for your incidents directly from the channel
    • PagerDuty & OpsGenie - Interact with teams and page or invite them to incidents
    • Statuspage - Create and manage a Statuspage incident directly within the Slack channel
    • Zoom - Create a Zoom meeting for each incident and populate the channel with the link

New features are being added all the time.

Quick Start

  • Create a Slack app for this application. You can name it whatever you'd like, but incident-bot seems to make the most sense.
  • Choose to create the app from an app manifest, then copy manifest.yaml from this repository and paste it in to configure the app automatically.
  • You'll need the app token, bot token, and user token for your application. Provide them as SLACK_APP_TOKEN, SLACK_BOT_TOKEN, and SLACK_USER_TOKEN. These can be found on the app's configuration page in Slack. For more information on Slack tokens, see the documentation here.
  • You'll need a Postgres instance to connect to.
  • Configure the app using config.yaml and deploy it to Kubernetes, Docker, or whichever platform you choose. The structure of the config.yaml is explained in the documentation linked below.
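
As a quick sanity check before deploying, the three token variables named above can be verified in Python. This is a minimal sketch, not part of the application: the helper name missing_slack_tokens is hypothetical, and only the variable names come from the steps above.

```python
import os
from typing import List, Mapping

# Variable names taken from the Quick Start steps above.
REQUIRED_TOKENS = ("SLACK_APP_TOKEN", "SLACK_BOT_TOKEN", "SLACK_USER_TOKEN")


def missing_slack_tokens(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required Slack token variables that are unset or empty.

    Hypothetical helper for illustration; the bot may validate its
    configuration differently.
    """
    return [name for name in REQUIRED_TOKENS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_slack_tokens()
    if missing:
        raise SystemExit(f"Missing Slack tokens: {', '.join(missing)}")
    print("All Slack tokens are set.")
```

Passing the environment as an argument keeps the check easy to exercise without touching the real process environment.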

Full setup documentation is available here.

Kubernetes

  • The Helm chart is the recommended way to deploy - instructions are available here.
  • You can use kustomize. More details available here.

Testing

Tests will run on each pull request and merge to the primary branch. To run them locally:

make -C backend run-tests

Feedback

This application is not meant to solve every problem with regard to incident management. It was created as an open-source alternative to paid solutions that integrate with Slack.

If you encounter issues with functionality or wish to see new features, please open an issue and let us know.

We encourage you to join the community Discord if you wish to interact with us directly.

Contributing

A pull request template will ask required questions for each pull request. Most importantly, you should make sure to bump all version refs throughout the app. There is a script for this in which the only argument is the version to bump to:

./scripts/version-bump.sh v1.6.3

incident-bot's People

Contributors

aflansburg, dependabot[bot], echoboomer, imdevinc, jklnr, lancesandino

incident-bot's Issues

Document zoom integration

Document how to create the zoom app.

I have created it but not sure what to put for the oauth request page.

image

bot errors out and fails to create incident when slack fields are not set

image

ERROR:slack_bolt.App:Error: 'NoneType' object has no attribute 'get'
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/slack_bolt/listener/thread_runner.py", line 120, in run_ack_function_asynchronously
    listener.run_ack_function(request=request, response=response)
  File "/usr/local/lib/python3.11/site-packages/slack_bolt/listener/custom_listener.py", line 50, in run_ack_function
    return self.ack_function(
           ^^^^^^^^^^^^^^^^^^
  File "/incident-bot/bot/slack/modals.py", line 344, in handle_submission
    parsed = parse_modal_values(body)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/incident-bot/bot/templates/tools.py", line 24, in parse_modal_values
    result[title] = content.get("selected_option").get("value")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'

WARNING:slack_bolt.App:handle_submission didn't call ack()
WARNING:slack_bolt.App:Unhandled request ({'type': 'view_submission', 'view': {'type': 'modal', 'callback_id': 'open_incident_modal'}})

[Suggestion] You can handle this type of event with the following listener function:

@app.view("open_incident_modal")
def handle_view_submission_events(ack, body, logger):
    ack()
    logger.info(body)

On repeat Start presses, you do get this error

image


IMO there should be defaults, or the error message should be handled better (if possible)
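
The traceback above fails on content.get("selected_option").get("value") because the first lookup returns None when a field is left unset. One way to tolerate that is sketched below. The helper name selected_value and its default parameter are hypothetical and are not the project's actual fix.

```python
from typing import Optional


def selected_value(content: Optional[dict], default: Optional[str] = None) -> Optional[str]:
    """Read selected_option.value from a modal field, tolerating unset fields.

    Hypothetical helper: falls back to `default` when the field, or its
    "selected_option" entry, is missing or None.
    """
    # `or {}` covers both a missing key and an explicit None value,
    # so the chained lookup can no longer raise AttributeError.
    option = (content or {}).get("selected_option") or {}
    return option.get("value", default)
```

A caller could then either pass a configured default or treat a None result as a validation error and report it back to the user in the modal.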

Failing to insert pinned messages from Slack - user missing

Hi,

We're experiencing an issue where the Slack integration fails to insert pinned messages due to a missing user ID. This might be an issue in the Slack app, but we're not sure how to fix it. It works for at least one user, but we haven't found a pattern.

We're running the latest release v1.4.10.

image

Application log output:

ERROR:slack.logging:Audit log row create failed for incident inc-202362793-test-pinning-in-channel: (psycopg2.errors.NotNullViolation) null value in column "user" of relation "incident_logging" violates not-null constraint

DETAIL:  Failing row contains (9, inc-202362793-test-pinning-in-channel, , then vi have a very important message, \x, , 27/06/2023 09:04:57 CEST, null).

[SQL: INSERT INTO incident_logging (incident_id, title, content, img, mimetype, ts, "user") VALUES (%(incident_id)s, %(title)s, %(content)s, %(img)s, %(mimetype)s, %(ts)s, %(user)s) RETURNING incident_logging.id]

[parameters: {'incident_id': 'inc-202362793-test-pinning-in-channel', 'title': '', 'content': 'then vi have a very important message', 'img': <psycopg2.extensions.Binary object at 0x7fd41d8ef030>, 'mimetype': '', 'ts': '27/06/2023 09:04:57 CEST', 'user': None}]

emoji documentation

There should be documentation around what emojis should be created and even examples/downloads of them to import to the slack channel.

Jira ticket does not get created

Is it possible that if the hardcoded jira ticket priority values don't align with those set up in the project, then the ticket does not get created? The default options currently are: low, medium, high, but in my project the priority options are: blocker, critical, major, minor, trivial. Would it be possible to either dynamically fetch them or be able to define them in values.yaml manually?

Part of the logs that might be relevant:
WARNING:atlassian.jira:Creating issue "test13"
DEBUG:atlassian.rest_client:curl --silent -X POST -H 'Content-Type: application/json' -H 'Accept: application/json' --data '"{\"fields\": {\"description\": \"new test\", \"issuetype\": {\"name\": \"Task\"}, \"labels\": [\"incident-management\", \"etc\", \"inc-20239291912-test13\"], \"priority\": {\"id\": null}, \"project\": {\"id\": \"10019\"}, \"summary\": \"test13\"}}"' 'https://team-xxxxxxxxxxxx.atlassian.net/rest/api/2/issue'
DEBUG:urllib3.connectionpool:https://team-xxxxxxxxxxxx.atlassian.net:443 "POST /rest/api/2/issue HTTP/1.1" 400 None
DEBUG:atlassian.rest_client:HTTP: POST rest/api/2/issue -> 400 Bad Request
DEBUG:atlassian.rest_client:HTTP: Response text -> {"errorMessages":[],"errors":{"priority":"Specify the Priority (id or name) in the string format"}}
ERROR:atlassian.rest_client:'str' object has no attribute 'get'
ERROR:jira:Error creating Jira issue: 400 Client Error: Bad Request for url: https://team-xxxxxxxxxxxx.atlassian.net/rest/api/2/issue
ERROR:slack.modals:'NoneType' object has no attribute 'get'
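
The request in the log sends "priority": {"id": null}, which Jira rejects. A possible way to build the fields payload is sketched below: it refers to the priority by name and leaves the key out when no priority is known. The function build_issue_fields and its parameters are hypothetical. Whether omitting the priority is accepted depends on how the Jira project is configured, so treat this as an assumption.

```python
from typing import List, Optional


def build_issue_fields(
    summary: str,
    description: str,
    project_id: str,
    issue_type: str,
    labels: List[str],
    priority_name: Optional[str] = None,
) -> dict:
    """Build a Jira issue "fields" payload (hypothetical helper).

    The priority is referenced by name and only included when one is
    provided, so a null priority id is never sent.
    """
    fields = {
        "summary": summary,
        "description": description,
        "project": {"id": project_id},
        "issuetype": {"name": issue_type},
        "labels": labels,
    }
    if priority_name:
        fields["priority"] = {"name": priority_name}
    return fields
```

With this shape, the list of allowed priority names could come from configuration or be fetched from the project, as the issue suggests.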

Slack API rate limited when getting list of channels from a workspace

ERROR:slack.client:Error getting channel list from Slack workspace: The request to the Slack API failed. (url: https://www.slack.com/api/conversations.list)
The server responded with: {'ok': False, 'error': 'ratelimited'}

Would be best to store channel list at startup and refresh on an interval.
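
A minimal sketch of that idea, assuming the channel list is fetched by some callable that wraps the Slack call. ChannelListCache, fetch, and ttl_seconds are hypothetical names, and a real implementation would also need to handle fetch errors and thread safety.

```python
import time
from typing import Callable, List, Optional


class ChannelListCache:
    """Cache a channel list and refresh it at most once per TTL window."""

    def __init__(
        self,
        fetch: Callable[[], List[dict]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # callable that performs the real API request
        self._ttl = ttl_seconds      # how long a cached list stays valid
        self._clock = clock          # injectable so the cache can be tested
        self._channels: Optional[List[dict]] = None
        self._fetched_at = 0.0

    def get(self) -> List[dict]:
        """Return the cached list, calling fetch only when it is missing or stale."""
        now = self._clock()
        if self._channels is None or now - self._fetched_at >= self._ttl:
            self._channels = self._fetch()
            self._fetched_at = now
        return self._channels
```

Calling get() once at startup warms the cache. Later calls within the TTL window reuse the stored list instead of hitting conversations.list again.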

Support OpsGenie

OpsGenie is Atlassian's alert management tool, similar to PagerDuty.

[BUG] Upgrading severities

Describe the bug
I updated the severities in values.yaml, and when I open an incident I see both the old and the new ones.

Version
1.8.3

To Reproduce
Steps to reproduce the behavior:

  1. Change the severities in values.yaml and deploy the change

Environment (please complete the following information):

  • Platform: Kubernetes

change loglevel of healthchecks

Healthchecks are logging under info which can get quite noisy and cause actual errors to be lost.

In our case, it can also cause a cost increase since we ship logs to datadog.

INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:14:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:14:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:14:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:14:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK

INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:15:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:15:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:15:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:15:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK

INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:16:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:16:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:16:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:16:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK

INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:17:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:17:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:17:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:17:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK

INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:18:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:18:17 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:18:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK
INFO:api:10.230.5.173:3000 10.230.5.1 [09/03/2023 14:18:47 CST] "GET /api/v1/health" - kube-probe/1.25 - HTTP/1.1 200 OK

Consider moving healthchecks to debug or similar loglevel.
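
Until the level is changed, one workaround is a logging filter that drops health check records. The sketch below uses only the standard logging module. The class name HealthCheckFilter is hypothetical, and the path and the logger name "api" are taken from the log lines above.

```python
import logging


class HealthCheckFilter(logging.Filter):
    """Drop log records that mention the health check endpoint."""

    def __init__(self, path: str = "/api/v1/health") -> None:
        super().__init__()
        self._path = path

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False tells the logging machinery to discard the record.
        return self._path not in record.getMessage()


# Assumption: request logs are emitted through the logger named "api",
# as the "INFO:api:" prefix above suggests.
logging.getLogger("api").addFilter(HealthCheckFilter())
```

A filter attached to a logger only applies to records logged through that logger directly, so it may need to be attached to a handler instead if the records originate from a child logger.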

PagerDuty integration - resolve the PD incident too

Discussed in #72

Originally posted by chadcancode December 14, 2022
It looks like incident-bot can send a new PagerDuty alert when an incident is created. This is great. What would be cool too is if incident-bot would resolve the PD alert.

fix(slack_app_manifest): files:read is missing

Hello :)
That's a very nice project!
One thought: I think the files:read scope is missing. I've seen files:write in slack_app_manifest.yaml, but read access is required to save the attachment in the database (otherwise we get default Slack HTML content).
Cheers!

zoom puts people in waiting room

The Server-to-Server OAuth setup was working about a week ago. Now, when a meeting is created, people are put in a waiting room and the meeting won't start.

I have reached out to Zoom but haven't heard anything. My first assumption was something in the API changed?

We have default zoom waiting rooms off, and allow people to join meetings without the host.

Allow for automatic Jira ticket creation

Currently, creating a Jira ticket can only be done manually once the incident is created. For our use case, having the JIRA ticket automatically created would be beneficial. I think we can make this a configurable option for users, something like:

jira:
  auto_create_incident: True
  auto_create_incident_type: 'Task'
  project: INCMGMT
  issue_types: ['Task', 'Epic', 'Story']
  priorities: ['High', 'Medium', 'Low']
  labels:
    - incident-management
    - etc

App home view in Slack shows an error when many incidents are open

incident-bot-7fbcc8c89-nqsrq incident-bot 07-25 14:08:07 ERROR:handler.py:update_home_tab:Error publishing home tab: The request to the Slack API failed. (url: https://www.slack.com/api/views.publish)
incident-bot-7fbcc8c89-nqsrq incident-bot 07-25 14:08:07 The server responded with: {'ok': False, 'error': 'invalid_arguments', 'response_metadata': {'messages': ['[ERROR] failed to match all allowed schemas [json-pointer:/view]', '[ERROR] no more than 5 items allowed [json-pointer:/view/blocks/20/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/view/blocks/21/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/view/blocks/22/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/view/blocks/23/accessory/options]']}}
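
The error reports that each accessory accepts no more than 5 options. One way to stay under that limit is to split a long option list into groups of at most five and render one accessory per group. The sketch below shows only the splitting step. chunk_options is a hypothetical helper, not the project's code.

```python
from typing import Iterator, List, Sequence

# Limit reported in the Slack error above.
MAX_ACCESSORY_OPTIONS = 5


def chunk_options(
    options: Sequence[dict], size: int = MAX_ACCESSORY_OPTIONS
) -> Iterator[List[dict]]:
    """Yield consecutive groups of at most `size` options, preserving order."""
    for start in range(0, len(options), size):
        yield list(options[start:start + size])
```

Switching to an element type that allows more options would be an alternative, but that changes how the view looks and behaves.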

Verify whether or not integration APIs need pagination

Integration APIs generally haven't been implemented with pagination which may be a problem for larger orgs. We should go through and verify for each one whether or not endpoints being used will need pagination or other types of protection.
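
For the Slack endpoints, which use cursor-based pagination through response_metadata.next_cursor, a generic loop could look like the sketch below. iterate_pages and fetch_page are hypothetical names, and other integrations paginate differently, so each would need its own check.

```python
from typing import Callable, Iterator, Optional


def iterate_pages(
    fetch_page: Callable[[Optional[str]], dict], items_key: str
) -> Iterator[dict]:
    """Yield items from every page of a cursor-paginated response.

    `fetch_page` is a hypothetical callable that takes a cursor (None for
    the first page) and returns the decoded response as a dict.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get(items_key, [])
        # An empty or missing next_cursor means there are no more pages.
        cursor = (page.get("response_metadata") or {}).get("next_cursor")
        if not cursor:
            break
```

Because it is a generator, callers can stop early, which also limits the number of requests made against rate-limited endpoints.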

Pushpin feature not working with custom channel names

Hi,

After we started using the new custom channel name feature, I noticed that the pushpin feature (saving messages to the timeline) stopped working.

I found that the case handling the pushpin reaction has a condition that the channel name starts with inc-, and our channel names now start with incident-.

Could we change this to use the channel_name_prefix config item instead?

https://github.com/echoboomer/incident-bot/blob/main/backend/bot/slack/handler.py#L409
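
A minimal sketch of the requested check. It assumes channel_name_prefix holds the prefix with or without a trailing hyphen. is_incident_channel is a hypothetical helper and does not reflect the code at the link above.

```python
def is_incident_channel(channel_name: str, prefix: str = "inc") -> bool:
    """Return True when the channel name starts with the configured prefix.

    Assumption: `prefix` comes from the channel_name_prefix config item.
    A trailing hyphen is normalized so "incident" and "incident-" behave
    the same.
    """
    return channel_name.startswith(prefix.rstrip("-") + "-")
```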

License is missing

Hi,

this project looks very interesting to me. I was planning on writing a similar bot. Which brings me to my question: What license is this source code released under?

Best regards

Should allow deleting database record on resolve

It's possible in some environments that some things stored in the database per-record, like the meeting link, could be considered PII. There should be an option to delete records when an incident is resolved. This does prevent reopening an incident and may introduce other issues.

Attachments not displayed on FO

Discussed in #78

Originally posted by virgileJT December 15, 2022
I can't get the attachment (image) to display properly in the FO. Instead, I get a 401 "Missing Authorization Header" error.

So far, I can see the BO requires a token
backend/bot/api/routes/iincident.py l147 :
@jwt_required()
def get_delete_item_by_id(incident_id, id):

but there is no token sent when displaying the picture on the FO
frontend/src/incident/Signle-incidents.js l678 :
{imgData.map((item) => (
{item.title}

I don't know what the best practice is here. Maybe rewrite the React part to fetch attachments one by one?

Slack Pager command does not work

When trying to get a list of people who are oncall, it no longer works. Likely similar to the issue of the slackbot home page not working:

incident-bot 09-11 17:36:57 ERROR:slack_bolt.App:Error: The request to the Slack API failed. (url: https://www.slack.com/api/chat.postMessage)
incident-bot 09-11 17:36:57 The server responded with: {'ok': False, 'error': 'invalid_blocks', 'errors': ['no more than 5 items allowed [json-pointer:/blocks/2/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/3/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/4/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/5/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/6/accessory/options]'], 'response_metadata': {'messages': ['[ERROR] no more than 5 items allowed [json-pointer:/blocks/2/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/3/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/4/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/5/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/6/accessory/options]']}}
incident-bot 09-11 17:36:57 Traceback (most recent call last):
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_bolt/listener/thread_runner.py", line 120, in run_ack_function_asynchronously
incident-bot 09-11 17:36:57     listener.run_ack_function(request=request, response=response)
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_bolt/listener/custom_listener.py", line 50, in run_ack_function
incident-bot 09-11 17:36:57     return self.ack_function(
incident-bot 09-11 17:36:57            ^^^^^^^^^^^^^^^^^^
incident-bot 09-11 17:36:57   File "/incident-bot/bot/slack/handler.py", line 79, in handle_mention
incident-bot 09-11 17:36:57     say(blocks=resp, text="")
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_bolt/context/say/say.py", line 49, in __call__
incident-bot 09-11 17:36:57     return self.client.chat_postMessage(
incident-bot 09-11 17:36:57            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_sdk/web/client.py", line 2112, in chat_postMessage
incident-bot 09-11 17:36:57     return self.api_call("chat.postMessage", json=kwargs)
incident-bot 09-11 17:36:57            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_sdk/web/base_client.py", line 156, in api_call
incident-bot 09-11 17:36:57     return self._sync_send(api_url=api_url, req_args=req_args)
incident-bot 09-11 17:36:57            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_sdk/web/base_client.py", line 187, in _sync_send
incident-bot 09-11 17:36:57     return self._urllib_api_call(
incident-bot 09-11 17:36:57            ^^^^^^^^^^^^^^^^^^^^^^
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_sdk/web/base_client.py", line 317, in _urllib_api_call
incident-bot 09-11 17:36:57     ).validate()
incident-bot 09-11 17:36:57       ^^^^^^^^^^
incident-bot 09-11 17:36:57   File "/usr/local/lib/python3.11/site-packages/slack_sdk/web/slack_response.py", line 199, in validate
incident-bot 09-11 17:36:57     raise e.SlackApiError(message=msg, response=self)
incident-bot 09-11 17:36:57 slack_sdk.errors.SlackApiError: The request to the Slack API failed. (url: https://www.slack.com/api/chat.postMessage)
incident-bot 09-11 17:36:57 The server responded with: {'ok': False, 'error': 'invalid_blocks', 'errors': ['no more than 5 items allowed [json-pointer:/blocks/2/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/3/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/4/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/5/accessory/options]', 'no more than 5 items allowed [json-pointer:/blocks/6/accessory/options]'], 'response_metadata': {'messages': ['[ERROR] no more than 5 items allowed [json-pointer:/blocks/2/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/3/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/4/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/5/accessory/options]', '[ERROR] no more than 5 items allowed [json-pointer:/blocks/6/accessory/options]']}}

Jira ticket issue types are not limited with selected project

As of the latest update (version 1.4.25), when creating a new Jira ticket, the dynamically fetched issue types are not restricted to the selected project but are gathered from all Jira projects. This means issue types are repeated if they share a name. Luckily, it doesn't matter which duplicate you choose; the ticket still gets created. But if you choose an issue type that does not exist in the given project, you get an error: Hmmm.. that didn't work. Check my logs for more information.

image
image

Block creation needs to be cleaned up and streamlined

Block creation for Slack modals needs to be cleaned up. It would be nice to use classes and generate boilerplate things like headers. There's also a lot of indexing for dictionaries that could easily be broken if block order ever changes, so we should really be using get and looking for specific block ids instead.

No incident recorded in the database

It seems that incidents are not saved in the database.
A critical error is visible in the logs:

INFO:bot.incident.incident:Creating incident channel: inc-202210281751-test-4
ERROR:bot.incident.incident:Error sending message to incident digest channel: The request to the Slack API failed. (url: https://www.slack.com/api/chat.postMessage)
The server responded with: {'ok': False, 'error': 'not_in_channel'}
INFO:bot.incident.incident:Sending message to digest channel for: inc-202210281751-test-4
INFO:bot.incident.incident:Writing incident entry to database for inc-202210281751-test-4...
CRITICAL:bot.incident.incident:Error writing entry to database: local variable 'digest_message' referenced before assignment
ERROR:bot.models.incident:Incident update failed for inc-202210281751-test-4: No row was found when one was required

how to disable integrations

I thought that leaving them commented out, or even set to an empty {}, would disable them, but so far it seems that we must specify the following (even if not in use):

  • jira
  • confluence
  • statuspage

[FEATURE] Disable tracking in digest channel

Is your feature request related to a problem? Please describe.
We've just gone live with incident-bot and replaced our old incident manager bot. We're using the main engineering channel (with more than 400 members) as the digest channel to ensure people are notified when an incident is opened (yes, we don't have a lot of incidents). This means that we're continuously getting the notification below:

image

We would prefer to keep the main channel as our incident channel as incident bot has a nice level of information.

Describe the solution you'd like
A config option to disable tracking of chatter in the digest channel (or at least ignore it). https://github.com/echoboomer/incident-bot/blob/main/backend/bot/slack/handler.py#L521-L565

Describe alternatives you've considered
N/A

Additional context
N/A

Zoom shouldn't be the only meeting provider

Discussed in #60

Originally posted by echoboomer December 6, 2022
Make it more flexible for which provider is used for meetings and meeting links.

In extending this logic, what is the list of meeting link providers we should work with to automatically generate meeting links and post them to the channels?

Bot fails to create jira ticket when Issue/Priority are not selected

When you click on Create a Jira Ticket the modal seems to default to Task as the Issue Type and low as the Priority (as seen in the image) but it doesn't actually default.

image

If you do not select an Issue Type and Priority you get the following error:

incident-bot 07-13 18:14:45 ERROR:slack_bolt.App:Error: 'NoneType' object has no attribute 'get'
incident-bot 07-13 18:14:45 Traceback (most recent call last):
incident-bot 07-13 18:14:45   File "/usr/local/lib/python3.11/site-packages/slack_bolt/listener/thread_runner.py", line 120, in run_ack_function_asynchronously
incident-bot 07-13 18:14:45     listener.run_ack_function(request=request, response=response)
incident-bot 07-13 18:14:45   File "/usr/local/lib/python3.11/site-packages/slack_bolt/listener/custom_listener.py", line 50, in run_ack_function
incident-bot 07-13 18:14:45     return self.ack_function(
incident-bot 07-13 18:14:45            ^^^^^^^^^^^^^^^^^^
incident-bot 07-13 18:14:45   File "/incident-bot/bot/slack/modals.py", line 1700, in handle_submission
incident-bot 07-13 18:14:45     parsed = parse_modal_values(body)
incident-bot 07-13 18:14:45              ^^^^^^^^^^^^^^^^^^^^^^^^
incident-bot 07-13 18:14:45   File "/incident-bot/bot/templates/tools.py", line 24, in parse_modal_values
incident-bot 07-13 18:14:45     result[title] = content.get("selected_option").get("value")
incident-bot 07-13 18:14:45                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
incident-bot 07-13 18:14:45 AttributeError: 'NoneType' object has no attribute 'get'
incident-bot 07-13 18:14:48 WARNING:slack_bolt.App:handle_submission didn't call ack()
incident-bot 07-13 18:14:48 WARNING:slack_bolt.App:Unhandled request ({'type': 'view_submission', 'view': {'type': 'modal', 'callback_id': 'open_incident_create_jira_issue_modal'}})
incident-bot 07-13 18:14:48 ---
incident-bot 07-13 18:14:48 [Suggestion] You can handle this type of event with the following listener function:
incident-bot 07-13 18:14:48
incident-bot 07-13 18:14:48 @app.view("open_incident_create_jira_issue_modal")
incident-bot 07-13 18:14:48 def handle_view_submission_events(ack, body, logger):
incident-bot 07-13 18:14:48     ack()
incident-bot 07-13 18:14:48     logger.info(body)
incident-bot 07-13 18:14:48

RCA Confluence - Pinned Messages - No Spaces or Punctuation

Seeing behavior in which the pinned messages in a slack channel are persisted correctly, but when an RCA is created in confluence, all spaces and punctuation are taken out.

In the incident page itself on incident bot web:

Jack Johnson
27/10/2023 20:05:08 UTC
<https://github.com/namespace/repo/pull/4759> filed

Mary Ann
27/10/2023 20:55:21 UTC
PR is tested and validated locally, merged to main. After it builds we will deploy to pluto

In the RCA Confluence page:

Pinned Messages
These messages were pinned during the incident by users in Slack.
This information is useful for establishing the incident timeline and providing diagnostic data.
Jack Johnson @ 27/10/2023 20:01:35 UTC -  httpsgithubcomnamespacerepopull4759filed
Mary Ann @ 27/10/2023 20:55:21 UTC -  PRistestedandvalidatedlocallymergedtomainAfteritbuildswewilldeploytopluto
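
If the characters are being stripped to keep the page body valid markup, escaping them would preserve the text instead. The sketch below assumes the page body is submitted as XHTML-style markup. render_pinned_message is a hypothetical helper, not the project's implementation.

```python
from html import escape


def render_pinned_message(user: str, timestamp: str, text: str) -> str:
    """Render one pinned message as a paragraph with markup-safe text.

    html.escape replaces &, <, > and quotes with entities, so spaces and
    punctuation survive and characters such as "<" cannot break parsing.
    """
    return (
        f"<p><strong>{escape(user)}</strong> @ {escape(timestamp)} - "
        f"{escape(text)}</p>"
    )
```

With the first message above, the angle brackets around the pull request URL become &lt; and &gt; while the rest of the text stays readable.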

just for information

Hi Team ,

Can you publish a tagged version that has only the Slack bot functionality, without any Flask setup or web app functionality?

I want to test it and use it for demo purposes without setting up any integrations at all. How can I do that?

Allow for updating of Jira ticket status as incident changes status

It would be nice if we could map incident statuses to Jira status optionally, if a ticket exists. Something like the example below. If a status_mapping doesn't exist for an incident status, then the Jira ticket wouldn't get an update during that time.

jira:
  update_status: True
  status_mapping:
    - incident_status: Investigating
      jira_status: Open
    - incident_status: Identified
      jira_status: In Progress
    - incident_status: Monitoring
      jira_status: In Review
    - incident_status: Resolved
      jira_status: Done
  project: INCMGMT
  issue_types: ['Task', 'Epic', 'Story']
  priorities: ['High', 'Medium', 'Low']
  labels:
    - incident-management
    - etc
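
A sketch of how the proposed mapping could be resolved once the configuration is loaded into a dict. The keys mirror the example above, and jira_status_for is a hypothetical helper. It returns None when updates are disabled or no mapping exists, which matches the "no update" behavior described.

```python
from typing import Optional


def jira_status_for(incident_status: str, jira_config: dict) -> Optional[str]:
    """Return the Jira status mapped to an incident status, or None.

    Hypothetical helper for the proposed update_status / status_mapping
    keys; matching is case-insensitive.
    """
    if not jira_config.get("update_status"):
        return None
    for entry in jira_config.get("status_mapping", []):
        if entry.get("incident_status", "").lower() == incident_status.lower():
            return entry.get("jira_status")
    return None
```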

consider json format logs

Consider making an optional config to have logs in json format

This helps when shipping logs to third parties like kibana or datadog
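
A minimal sketch of such a formatter using only the standard library. The class name JsonFormatter and the chosen field names are assumptions, and a config flag to enable it would still need to be added.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include the traceback text when the record carries an exception.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(level=logging.INFO, handlers=[handler])
    logging.getLogger("api").info("service started")
```

Emitting one object per line keeps the output compatible with log shippers that parse line-delimited JSON.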

Add runbooks/workflows

A way to define incident management-related runbooks/playbooks/workflows could be useful.

RCA confluence failure to create

We've run into this quite a few times recently (randomly), haven't been able to determine the cause. Although if I had to guess, it was related to this other issue as it might be happening when images are pinned.

incident-bot 09-08 15:30:03 INFO:incident.actions:Creating rca channel: inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist-rca
incident-bot 09-08 15:30:04 INFO:slack.client:User already in channel or is one of ['api', 'web']. Skipping invite.
incident-bot 09-08 15:30:04 INFO:confluence:Creating RCA 2023-09-08 - inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist - User Replay Deleted Companyusers That Should Still Exist in Confluence space IR under parent 2023...
incident-bot 09-08 15:30:09 INFO:atlassian.confluence:Creating page "IR" -> "2023-09-08 - inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist - User Replay Deleted Companyusers That Should Still Exist"
incident-bot 09-08 15:30:09 ERROR:confluence:com.atlassian.confluence.api.service.exceptions.BadRequestException: Error parsing xhtml: Unexpected character '@' (code 64) in content after '<' (malformed start element?).
incident-bot 09-08 15:30:09  at [row,col {unknown-source}]: [167,73]
incident-bot 09-08 15:30:09 ERROR:incident.actions:Error sending RCA update to RCA channel: The request to the Slack API failed. (url: https://www.slack.com/api/chat.postMessage)
incident-bot 09-08 15:30:09 The server responded with: {'ok': False, 'error': 'invalid_blocks', 'errors': ['must provide a string [json-pointer:/blocks/4/elements/0/url]'], 'response_metadata': {'messages': ['[ERROR] must provide a string [json-pointer:/blocks/4/elements/0/url]']}}
incident-bot 09-08 15:30:09 INFO:incident.actions:Sent resolution info to inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist.
incident-bot 09-08 15:30:10 INFO:incident.actions:Updating incident record in database with new status for inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist
incident-bot 09-08 15:30:11 INFO:incident.actions:Updated incident status for inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist to resolved.
incident-bot 09-08 15:31:23 INFO:incident.actions:Sending chat transcript to inc-20238241622-user-replay-deleted-companyusers-that-should-still-exist.
incident-bot 09-08 15:31:30 WARNING:slack_bolt.App:Unhandled request ({'type': 'block_actions', 'block_id': 'resolution_buttons', 'action_id': 'n8u'})
incident-bot 09-08 15:31:30 ---
incident-bot 09-08 15:31:30 [Suggestion] You can handle this type of event with the following listener function:
incident-bot 09-08 15:31:30
incident-bot 09-08 15:31:30 @app.action("n8u")
incident-bot 09-08 15:31:30 def handle_some_action(ack, body, logger):
incident-bot 09-08 15:31:30     ack()
incident-bot 09-08 15:31:30     logger.info(body)
incident-bot 09-08 15:31:30

Assigning roles

Discussed in #73

Originally posted by chadcancode December 14, 2022

  1. It would be nice to not have to require assigning certain roles before resolving an incident. For example, we don't use the communications liaison role. That's part of being the incident commander.
  2. When I open an incident, I automatically page an Incident Commander. I would love to auto-assign them to that role.

Add Jira integration

It could be useful to add a Jira integration so issues can be created in relation to incidents.
