rajnandan1 / kener

2.0K 6.0 66.0 16.67 MB

Kener is a Modern Self hosted Status Page, batteries included

Home Page: https://kener.ing/

License: MIT License

JavaScript 51.61% HTML 0.36% CSS 2.13% Svelte 38.80% TypeScript 5.82% Dockerfile 1.28%

monitoring monitoring-tool nodejs status-page sveltekit

kener's Issues

reading the per minute json data is too slow

Describe the bug
hey, love the site! But my site is taking 8+ seconds to respond now that we have 20+ days across 5 monitors.

Expected behavior
reasonable response time for webpage is <1 second

Additional context
Could we summarize the JSON files into something that is faster to read on page load? Perhaps a separate JSON file with pre-computed per-day counts for UP/DOWN/DEGRADED, so the page doesn't need to count every minute on every load? That should be a lot faster. Once we hit 90 days that'll be 60 * 24 * 90 = 129,600 (about 130k) data points per monitor, and 129,600 * 5 monitors = 648,000 (about 650k) data points to process on every page load.
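A minimal sketch of the pre-computed summary idea, not Kener's actual code. It assumes the per-minute data is an object keyed by Unix timestamp in seconds whose values carry a `status` field; `summarizeByDay` is a hypothetical helper name.

```javascript
// Hypothetical helper: collapse per-minute samples into per-day counts.
// Assumes `minuteData` maps a Unix timestamp (seconds) to { status }.
function summarizeByDay(minuteData) {
    const days = {};
    for (const [timestamp, entry] of Object.entries(minuteData)) {
        // Bucket by UTC calendar day, e.g. "2024-01-19".
        const day = new Date(Number(timestamp) * 1000).toISOString().slice(0, 10);
        if (!days[day]) {
            days[day] = { UP: 0, DOWN: 0, DEGRADED: 0 };
        }
        // Ignore any status that is not one of the three known ones.
        if (Object.hasOwn(days[day], entry.status)) {
            days[day][entry.status] += 1;
        }
    }
    return days;
}
```

Written once per collection run, a summary like this would leave at most 90 small records per monitor to read on page load instead of roughly 130k.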

thanks!

Testing degraded status

Test incident body

[start_datetime:1705675160] [end_datetime:1705675250]

eval monitor not working correctly

Describe the bug
Docs unclear or app not working as intended. I'm not getting the eval monitor to work

To Reproduce
Steps to reproduce the behavior:
Copy just the eval monitor from the docs, paste it in an empty monitor config.

Expected behavior
A working monitor

Additional context
On latest release. I tried everything with the syntax, without any luck.

Kener is running on port 3000!
undefined:1
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at eval (eval at Startup (file:///app/scripts/startup.js:114:32), <anonymous>:2:19)
    at eval (eval at Startup (file:///app/scripts/startup.js:114:32), <anonymous>:12:1)
    at Startup (file:///app/scripts/startup.js:114:32)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
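In general, `JSON.parse` throws this exact message when it is given an empty or truncated string. Whether that is the cause here is not confirmed. The sketch below shows one defensive pattern; `parseJsonOrNull` is a hypothetical helper and not part of Kener.

```javascript
// Hypothetical helper, not part of Kener: parse a body without throwing.
function parseJsonOrNull(text) {
    // JSON.parse("") throws "Unexpected end of JSON input", so bail out early.
    if (typeof text !== "string" || text.trim() === "") {
        return null;
    }
    try {
        return JSON.parse(text);
    } catch (err) {
        // Malformed or truncated JSON also ends up here.
        return null;
    }
}
```

A caller can then treat `null` as "no usable body" rather than crashing at startup.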

2nd incident test

This is the second incident

EU Login Cluster Downtime

Impact Description:

The EU Login Cluster Downtime refers to the temporary unavailability of the EU Login authentication service due to maintenance, upgrades, or unexpected technical issues. During this downtime, users will be unable to access or authenticate themselves on various EU platforms and services that rely on EU Login for user authentication.

Impacted Services:

European Commission websites and portals
EU Member State platforms and services
European Union Agency systems
EU-funded project management platforms

Expected Consequences:

User Access Disruption: Users will be unable to log in or access services that require EU Login authentication. This includes accessing personal accounts, submitting applications, and interacting with EU services.
Service Interruptions: EU platforms and services relying on EU Login will experience disruptions, potentially leading to unavailability or limited functionality.
Delays in Workflows: Processes depending on EU Login authentication for user verification or authorization will be temporarily halted, causing delays in project management, application processing, and other critical workflows.
User Frustration: Users may experience frustration and inconvenience due to the inability to access services and perform necessary tasks during the EU Login downtime.
Potential Data Loss: In rare cases, if the downtime coincides with data submission or processing, there is a slight risk of data loss or inconsistencies.

Mitigation Measures:

Communication: Advance notice will be provided to users and stakeholders about the scheduled downtime, its duration, and any alternative access methods or workarounds.
Redundancy and Failover: The EU Login system is designed with redundancy and failover mechanisms to minimize the impact of unexpected outages and ensure service availability.
Rapid Response and Recovery: Dedicated teams will be on standby to address any technical issues promptly and restore the service as quickly as possible.
Monitoring and Updates: Continuous monitoring will be in place to track the progress of the downtime, and regular updates will be provided to inform users about the status and expected resolution time.

Please note that this is a sample impact description and should be customized based on the specific details and requirements of the EU Login Cluster Downtime.

Impact in Mumbai

Show "Status OK" in Green after the outage is resolved

Is your feature request related to a problem? Please describe.

If there has been >=1 minute of downtime today, the main status description shows Down for N minutes in red all day, even if we are currently back UP, until it rolls over to the next day.

I think it should be clearer that we are back UP, so I would like it to say Status OK in green again once we are back UP, but continue to show today's bar as red. Then the user can still mouse over the red bar to see how many minutes we were down today.

Describe the solution you'd like

Currently:

Preferred:

Describe alternatives you've considered

None

Additional context

I hacked this in my system by looking for the last update and forcing it to green.

I did something like this in ninety.js but there is probably a better solution.

in getDayData() I added:

    // Modified to show green if we are up right now.
    // Looks at the most recent entry in today's data (day0).
    const keys = Object.keys(day0);
    const last = day0[keys[keys.length - 1]].status;
    if (last === "UP") {
        cssClass = StatusObj.UP;
        message = "Status OK";
    }

Hello

Your Incident Description goes here. Markdown Supported

[start_datetime:1706267776]

Impact in Mumbai

Request to add Chinese language support to the page

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

API

[start_datetime:1702405860]

Impact in Mumbai

Horizontal scaling?

Is your feature request related to a problem? Please describe.
I am worried that if I adopt this product, it won't scale well in high traffic. I would want to be able to have multiple node servers handling requests in a round-robin fashion, hosted on a fleet of containers. But from what I gather from looking at the docs and code, this isn't possible. If I run multiple instances, I think I'd be running multiple collectors, which would duplicate metrics.

Describe the solution you'd like
I would like to be able to run the server in "view only mode" and disable metric collection. This way, I can run one metric collector and leave the rest of the instances as view only, so that I can scale out new instances to meet traffic needs.
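A rough sketch of what such a switch could look like, assuming a hypothetical `KENER_VIEW_ONLY` environment variable. Kener does not document this flag; the name and both functions are made up for illustration.

```javascript
// Hypothetical flag: KENER_VIEW_ONLY is NOT a documented Kener setting.
function shouldCollectMetrics(env) {
    return env.KENER_VIEW_ONLY !== "true";
}

// `startCollector` stands in for whatever schedules the monitor checks.
function startInstance(env, startCollector) {
    const collecting = shouldCollectMetrics(env);
    if (collecting) {
        startCollector();
    }
    return collecting ? "collector+viewer" : "viewer";
}
```

One instance would run without the flag and collect, while the instances behind the load balancer would set it and only serve pages from the shared data.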

Describe alternatives you've considered

modify the source code myself,
simply have 1 instance and hope it handles traffic load,
simply have many instances, and don't care that the metrics are duplicated. This would mean that different users may have a slightly different metric reporting experience.

Additional context
I think this project is awesome! I would love to adopt it for my use case :)

Testing scheduled status

Test incident body

[start_datetime:1705675617] [end_datetime:1705675717]

Docker image does not seem to work

Describe the bug
The Docker image cannot run. A new tag needs to be pushed to Docker Hub.

kener | Error [ERR_MODULE_NOT_FOUND]: Cannot find module '/app/src/lib/helpers.js' imported from /app/scripts/ninety.js
kener | at new NodeError (node:internal/errors:405:5)
kener | at finalizeResolution (node:internal/modules/esm/resolve:327:11)
kener | at moduleResolve (node:internal/modules/esm/resolve:980:10)
kener | at defaultResolve (node:internal/modules/esm/resolve:1193:11)
kener | at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:403:12)
kener | at ModuleLoader.resolve (node:internal/modules/esm/loader:372:25)
kener | at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:249:38)
kener | at ModuleWrap.<anonymous> (node:internal/modules/esm/module_job:76:39)
kener | at link (node:internal/modules/esm/module_job:75:36) {
kener | url: 'file:///app/src/lib/helpers.js',
kener | code: 'ERR_MODULE_NOT_FOUND'
kener | }

When can this be a static page that has no backend component

Is your feature request related to a problem? Please describe.
There are plenty of free hosted products on the market for status page management, such as instatus. We currently use one of these, and we don't have to run anything at all. And it is already free.

Describe the solution you'd like
We would love to host a static webpage ourselves, such as kener, but it would have to fit completely in an S3 bucket. We would want to generate all other data for the site and deploy it with the website on incident updates.

Additional context
We never want to have to run a service. Alternatively, since we use AWS, we would be okay with an S3 bucket static website AND a Lambda Function that ran on a CloudWatch Schedule Rule that would update the bucket based on the feedback. Running a full container as well as SSR to accomplish this will never work for us.

Impact in Mumbai

API

[start_datetime:1702405860]

Closed Incident

Third incident, which is closed

Widget error

Hi,
it's not possible to add the widget / script to any site.
It shows "Refused to display 'https://xxxxxxxxx' in a frame because it set 'X-Frame-Options' to 'sameorigin'."
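For context, browsers refuse cross-origin framing when a response carries `X-Frame-Options: sameorigin`. The standard way to allow specific sites is the `Content-Security-Policy` `frame-ancestors` directive. The sketch below only builds the header; `frameHeaders` is a hypothetical helper, and where Kener sets this header is not confirmed here.

```javascript
// Hypothetical helper: choose framing headers for the widget response.
// `allowedOrigins` is an array such as ["https://example.com"].
function frameHeaders(allowedOrigins) {
    if (allowedOrigins.length === 0) {
        // No embedding sites configured: keep the restrictive default.
        return { "X-Frame-Options": "SAMEORIGIN" };
    }
    // frame-ancestors lists the origins allowed to embed this page.
    return {
        "Content-Security-Policy": `frame-ancestors 'self' ${allowedOrigins.join(" ")}`,
    };
}
```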

Supp

this is supp [start_datetime:1702651340]

Email notifications

Is your feature request related to a problem? Please describe.
This is not related to a problem. It's purely a new feature request.

Describe the solution you'd like
When a monitored service doesn't respond to Kener's health check, there should be an option for it to email an interested party, for example the admin for that service. Other notification types would be nice, like SMS or DMing a person on a site like Discord or Slack, but they are not required for the initial implementation of this feature.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
The email notification interval should be separate from the polling interval for the health check, to prevent flooding the interested party's inbox. It may also be useful to implement this as two separate intervals: one for the initial detection of the outage and a second for sending reminders. For example, 1 minute for the first and increments of 15 minutes for the second.
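The two-interval idea could be sketched roughly as follows. All names are hypothetical, times are in milliseconds, and nothing here reflects an existing Kener feature.

```javascript
// Hypothetical throttle: decide whether to send a notification right now.
// lastSentMs is null until the first alert for this outage has gone out.
function shouldNotify(nowMs, outageStartMs, lastSentMs, opts) {
    const { initialDelayMs, reminderEveryMs } = opts;
    if (lastSentMs === null) {
        // First alert: only after the outage has lasted the initial interval.
        return nowMs - outageStartMs >= initialDelayMs;
    }
    // Reminders: spaced by the (longer) reminder interval.
    return nowMs - lastSentMs >= reminderEveryMs;
}
```

With `initialDelayMs` set to one minute and `reminderEveryMs` set to fifteen minutes, a health check that polls every few seconds would still produce at most one email per fifteen minutes after the first alert.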

Impact in Mumbai

automatic page refresh

hi, I took this for a test drive today. It is very nice!

One thing I noticed was that the page doesn't refresh itself when the monitor status or incident changes and it requires a manual refresh of the page to see the update. Is this something that you are considering adding in the future?

thanks! -Ray

API

[start_datetime:1702405860]

Impact in Mumbai

Outage in Mumbai

Slow to render main page

Describe the bug

Hi, love the site, we are using it in production now!

But it is taking over 2 seconds to render the main page with 5 monitors.

Is there anything we can do to speed it up? Maybe send the headers to allow caching for a minute or two?
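As an illustration of the caching suggestion, a response can carry a `Cache-Control` header with a short `max-age`. This is a generic Node-style sketch and not Kener's code; `cacheControlHeader` and `handleStatusPage` are hypothetical names.

```javascript
// Hypothetical helper: build a Cache-Control value for a short-lived page.
function cacheControlHeader(maxAgeSeconds) {
    if (!Number.isInteger(maxAgeSeconds) || maxAgeSeconds < 0) {
        throw new RangeError("maxAgeSeconds must be a non-negative integer");
    }
    return `public, max-age=${maxAgeSeconds}`;
}

// `res` is any object with Node's setHeader/end methods; `renderPage`
// stands in for whatever produces the status page HTML.
function handleStatusPage(req, res, renderPage) {
    // Let browsers and proxies reuse the page for one minute.
    res.setHeader("Cache-Control", cacheControlHeader(60));
    res.end(renderPage());
}
```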

Lighthouse Report

Impact in Mumbai

Bugs reported by team

If I update the end time of a closed incident, it should be reflected again in the grid
The Maintenance tag should show up

API

[start_datetime:1702405860]

API

body 1
[start_datetime:1702405860]

Impact in Mumbai

Hello Incident

A sample Incident for our planet

[start_datetime:1703245156]
[end_datetime:1703246256]

Feature: Any plans on making this agnostic to gitlab, bitbucket etc?

Hi... I stumbled on this via Hacker News & looks nice 👍🏾
Any plans on making this agnostic to gitlab, bitbucket etc?

There are also a number of grammatical errors scattered throughout the homepage and codebase in case you were not aware 😉

Document 'timeout'

https://kener.ing/docs#h2understanding-monitors

timeout: 4000 is in the sample config for api/post request....

Can timeout be used with API GET requests?
What unit of time is used? What does 4000 correspond to?

API

body 1
[start_datetime:1702405860]

A sample Incident for our planet

[start_datetime:1703158756]
[end_datetime:1703159856]

Hello Kener, here is an incident for you

Outage in Mumbai

Test only

{
  "name": "Kener"
}

[start_datetime:1702405740] [end_datetime:1702405920]

Impact in Mumbai

New test incident

Hello,

Multiline

Empty space

Indentation

Test incident

Test incident body

[start_datetime:1705673298] [end_datetime:1705673390]

Support navigating to the website

Is your feature request related to a problem? Please describe.

Currently I use the description field to paste the website address that is being checked:

- name: Website
  description: https://orhun.dev
  tag: website
  image: "https://orhun.dev/img/crow.png"
  api:
    method: GET
    url: https://orhun.dev

Describe the solution you'd like

It would be nice to have a button to open the website in a new tab.

Describe alternatives you've considered
None.

Additional context
Maybe this is possible somehow but I couldn't find it.

My instance is here: https://status.orhun.dev

Outage in Mumbai

Testing scheduled status 2

Test incident body

[start_datetime:1705675917] [end_datetime:1705676200]

Show daily status bar as Red only if it passes a threshold

Is your feature request related to a problem? Please describe.

Some of the services that I am monitoring are not the most reliable so sometimes I get 1 or 2 random missed pings per day and then I get a red bar for that whole day and it looks bad. I would like to set a threshold of down minutes per day before I get a red bar.

Describe the solution you'd like

I would like a threshold setting where I can set it to only show the day as DOWN (red) if it was down for more than N minutes. Most likely having this setting per monitor would be best, although I'm fine using it globally across all monitors.

Describe alternatives you've considered

None

Additional context

I did something like this in ninety.js and it seems to work.

from

    if (dayData.DOWN > 0) {
        cssClass = StatusObj.DOWN;
        message = getDayMessage("Down", dayData.DOWN);
    }

to

    if (dayData.DOWN > 3) { // modification to suppress a few misses
        cssClass = StatusObj.DOWN;
        message = getDayMessage("Down", dayData.DOWN);
    }

The simple way I implemented it here may be problematic, because a real outage has to pass this threshold before it is displayed on the page. I'm not sure if there is a way around that, at least for an outage declared through a GitHub issue (I have not tested this).
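One possible shape for a configurable version, sketched under assumptions: `downThreshold` is a hypothetical per-monitor setting, `hasDeclaredIncident` is a hypothetical flag for an outage declared through a GitHub issue, and `dayData` is assumed to carry a `DOWN` count as in the snippets above.

```javascript
// Hypothetical helper: should this day be drawn as DOWN (red)?
function isDayDown(dayData, monitor, hasDeclaredIncident = false) {
    // Fall back to 0 so monitors without the setting behave as before.
    const threshold = monitor.downThreshold ?? 0;
    if (hasDeclaredIncident) {
        // A declared incident bypasses the threshold.
        return dayData.DOWN > 0;
    }
    return dayData.DOWN > threshold;
}
```

With `downThreshold: 3`, one or two missed pings leave the day green, while a declared incident with any downtime still turns it red.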
