robusta-dev / robusta
Kubernetes observability and automation, with an awesome Prometheus integration
Home Page: https://home.robusta.dev/
License: MIT License
Motivation
Lots of people have issues with persistent volumes. We should add detections for these and recommend fixes.
Example issues
We love CIVO cloud and this would make it easier for people to install robusta on their clusters
Today we try to write Robusta actions that can be re-used in multiple scenarios. For example, an action that fetches logs should be usable both in the case of a crashing pod and a prometheus alert.
In the past, we sometimes wrote actions that included triggering logic too. For example, the restart_loop_reporter action is connected to the trigger on_pod_update. This fires very frequently and not only when a pod restarts. Therefore the restart_loop_reporter action has triggering logic which decides when the action should even do anything.
This breaks the normal separation of triggers and actions. To solve this problem, we introduced the ability to write custom triggers. For example, you can write a crashloop_backoff trigger which inherits from on_pod_update and only fires on pod updates which are due to a crashing pod.
We should rewrite old actions to use the new custom-triggers API. This will lead to more re-usable code.
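As a rough illustration of the filtering that such a custom trigger would own, here is a minimal sketch. It does not use Robusta's trigger API; the function names and the plain-dict pod representation are assumptions made for the example. It only shows the decision "did a container restart between the old and new pod objects?".

```python
from typing import Any, Dict, List


def _restart_counts(pod: Dict[str, Any]) -> Dict[str, int]:
    """Map container name -> restartCount from a pod's status (hypothetical helper)."""
    statuses: List[Dict[str, Any]] = pod.get("status", {}).get("containerStatuses") or []
    return {s["name"]: s.get("restartCount", 0) for s in statuses}


def should_fire_on_restart(old_pod: Dict[str, Any], new_pod: Dict[str, Any]) -> bool:
    """Return True only if some container's restart count grew between the two versions."""
    old_counts = _restart_counts(old_pod)
    new_counts = _restart_counts(new_pod)
    return any(count > old_counts.get(name, 0) for name, count in new_counts.items())
```

With this kind of check living in the trigger, an action such as logs_enricher would not need to know why it was invoked.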
Actions to rewrite:
logs_enricher action (already exists) and a new on_restart_loop custom trigger
offer_to_resize_hpa action and a new on_hpa_max custom trigger
Is your feature request related to a problem? Please describe.
No
Describe the solution you'd like
add Mattermost support in sinks
Describe alternatives you've considered
N/A
Additional context
N/A
Reporting on behalf of @shfisher
If you enter something wrong in robusta gen-config (e.g. a bad slack channel) then we shouldn't abort the installation and force you to start again. Instead give the user a second chance.
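One possible shape for the "second chance" behaviour is a small retry loop around each prompt. This is a sketch only: prompt_until_valid, ask, and is_valid are hypothetical names, not part of the robusta CLI.

```python
from typing import Callable, Optional


def prompt_until_valid(
    ask: Callable[[], str],
    is_valid: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask repeatedly until the answer validates; return None once attempts run out."""
    for attempt in range(1, max_attempts + 1):
        answer = ask().strip()
        if is_valid(answer):
            return answer
        remaining = max_attempts - attempt
        print(f"'{answer}' was not accepted; {remaining} attempt(s) left.")
    return None
```

In a real CLI, ask could wrap an interactive prompt and is_valid could test that the Slack channel can actually be reached.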
I would like to have a way to inspect and analyze failed requests to the AWS API, mainly those that occur when a deployment to Kubernetes is made. In this case, I usually have access to the request's UUID, but not to the request itself or to the response.
For example, I failed to deploy an ingress controller due to a permissions issue. During the deployment process, I got events describing an unauthorized operation with a specific UUID. If I had information about the request, I could deduce what the operation was, and what were the missing permissions.
We can start by supporting only one cloud provider (e.g. EKS) - that's OK
The goal is to make it easier for people to submit contributions to the docs. Many people don't understand today that the docs are open source.
Thank you to @Sayanta66 for the idea!
Describe the bug
Python 3.10.1
Error installing Helm chart: Error: create: failed to create: Request entity too large: limit is 3145728
To Reproduce
Steps to reproduce the behavior:
./helm/robusta
pip install -U robusta-cli
robusta gen-config
helm upgrade -i robusta ./ -f ./generated_values.yaml --debug
Expected behavior
Helm chart installation success
Logs
$ helm upgrade -i robusta ./ -f ./generated_values.yaml --debug
history.go:56: [debug] getting history for release robusta
Release "robusta" does not exist. Installing it now.
install.go:178: [debug] Original chart version: ""
install.go:199: [debug] CHART PATH: /home/nolche/Git/robusta/helm/robusta
walk.go:74: found symbolic link in path: /home/nolche/Git/robusta/helm/robusta/venv/bin/python resolves to /usr/bin/python3.10
walk.go:74: found symbolic link in path: /home/nolche/Git/robusta/helm/robusta/venv/bin/python3 resolves to /usr/bin/python3.10
walk.go:74: found symbolic link in path: /home/nolche/Git/robusta/helm/robusta/venv/bin/python3.10 resolves to /usr/bin/python3.10
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD alertmanagerconfigs.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD alertmanagers.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD podmonitors.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD probes.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD prometheuses.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD prometheusrules.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD servicemonitors.monitoring.coreos.com is already present. Skipping.
client.go:128: [debug] creating 1 resource(s)
install.go:151: [debug] CRD thanosrulers.monitoring.coreos.com is already present. Skipping.
Error: create: failed to create: Request entity too large: limit is 3145728
helm.go:88: [debug] Request entity too large: limit is 3145728
create: failed to create
helm.sh/helm/v3/pkg/storage/driver.(*Secrets).Create
helm.sh/helm/v3/pkg/storage/driver/secrets.go:164
helm.sh/helm/v3/pkg/storage.(*Storage).Create
helm.sh/helm/v3/pkg/storage/storage.go:69
helm.sh/helm/v3/pkg/action.(*Install).RunWithContext
helm.sh/helm/v3/pkg/action/install.go:340
main.runInstall
helm.sh/helm/v3/cmd/helm/install.go:267
main.newUpgradeCmd.func2
helm.sh/helm/v3/cmd/helm/upgrade.go:124
github.com/spf13/cobra.(*Command).execute
github.com/spf13/cobra@v1.2.1/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/cobra@v1.2.1/command.go:974
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/cobra@v1.2.1/command.go:902
main.main
helm.sh/helm/v3/cmd/helm/helm.go:87
runtime.main
runtime/proc.go:255
runtime.goexit
runtime/asm_amd64.s:1581
Describe the bug
After a pod was CPU-throttled, Robusta detected it through its connection to Prometheus. After that, however, Robusta lost its connection to Prometheus.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Robusta should send a notification to let the user know about the error (it can't run some queries against Prometheus).
Logs
2022-02-17 04:35:15.024 INFO Successfully loaded Kubernetes resource happy-backend-7bdcf76cf-p54t7 for alert CPUThrottlingHigh
2022-02-17 04:35:15.231 INFO Successfully loaded Kubernetes resource happy-backend-7bdcf76cf-p54t7 for alert CPUThrottlingHigh
2022-02-17 04:37:24.557 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f578cc17310>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/v1/query_range?query=sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_throttled_periods_total%7Bcontainer%21%3D%22%22%7D%5B5m%5D%29%29+%2F+sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_periods_total%5B5m%5D%29%29+%3E+%2825+%2F+100%29&start=1644971170&end=1645072515&step=1689.0859742833334&timeout=90.0
2022-02-17 04:37:24.558 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f578ccdfd30>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/v1/query_range?query=sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_throttled_periods_total%7Bcontainer%21%3D%22%22%7D%5B5m%5D%29%29+%2F+sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_periods_total%5B5m%5D%29%29+%3E+%2825+%2F+100%29&start=1645068915&end=1645072515&step=60.0&timeout=90.0
2022-02-17 04:37:24.560 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f578cbb5160>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/v1/query_range?query=sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_throttled_periods_total%7Bcontainer%21%3D%22%22%7D%5B5m%5D%29%29+%2F+sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_periods_total%5B5m%5D%29%29+%3E+%2825+%2F+100%29&start=1644970270&end=1645072515&step=1704.0781680833331&timeout=90.0
2022-02-17 04:39:37.662 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5770701190>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/v1/query_range?query=sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_throttled_periods_total%7Bcontainer%21%3D%22%22%7D%5B5m%5D%29%29+%2F+sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_periods_total%5B5m%5D%29%29+%3E+%2825+%2F+100%29&start=1644970270&end=1645072515&step=1704.0781680833331&timeout=90.0
2022-02-17 04:39:37.663 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f57707014f0>: Failed to establish a new connection: [Errno 110] Connection timed out')': /api/v1/query_range?query=sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_throttled_periods_total%7Bcontainer%21%3D%22%22%7D%5B5m%5D%29%29+%2F+sum+by%28container%2C+pod%2C+namespace%29+%28increase%28container_cpu_cfs_periods_total%5B5m%5D%29%29+%3E+%2825+%2F+100%29&start=1645068915&end=1645072515&step=60.0&timeout=90.0
Additional info
After restarting the deployment, Robusta can connect to Prometheus again and send alerts.
I use the Prometheus Stack with the configuration below:
config:
  global:
    resolve_timeout: 5m
  route:
    group_by: ['job']
    group_wait: 30s
    group_interval: 30m
    repeat_interval: 4h
    receiver: 'robusta'
    routes:
    - match:
        alertname: Watchdog
      receiver: 'null'
  receivers:
  - name: 'null'
  - name: 'robusta'
    webhook_configs:
    - url: 'http://robusta-runner.robusta.svc.cluster.local/api/alerts'
      send_resolved: true
Motivation
It is useful to track changes to ClusterRoleBindings to stay on top of who has what permissions.
Suggested Feature
Robusta already has triggers for ClusterRoleBinding changes (see docs) but there are no builtin actions set up for those triggers. We should add an action called cluster_permissions_watcher which notifies when ClusterRoleBindings change and outputs summarized information about the change.
Alternatives
You can monitor ClusterRoleBindings today using the resource_babysitter action (see tutorial and docs) but the output there is very generic and technical. (It just shows a diff.) If we are going to implement an action for this it should be optimized for ClusterRoleBindings and print more useful data like "The ClusterRole named XYZ now has permission to...."
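To make the idea of "summarized information" concrete, here is a minimal sketch that compares the subjects of two versions of a ClusterRoleBinding and produces human-readable lines. It works on plain dicts shaped like the Kubernetes object; summarize_binding_change and the message wording are assumptions for illustration, not an existing Robusta action.

```python
from typing import Any, Dict, List, Set, Tuple

Subject = Tuple[str, str, str]  # (kind, namespace, name)


def _subjects(binding: Dict[str, Any]) -> Set[Subject]:
    """Collect the binding's subjects as comparable tuples."""
    return {
        (s.get("kind", ""), s.get("namespace", ""), s.get("name", ""))
        for s in binding.get("subjects") or []
    }


def summarize_binding_change(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Describe which subjects gained or lost the ClusterRole referenced by the binding."""
    role = new.get("roleRef", {}).get("name", "<unknown>")
    added = sorted(_subjects(new) - _subjects(old))
    removed = sorted(_subjects(old) - _subjects(new))
    lines = [f"{kind} {name} was granted ClusterRole {role}" for kind, _, name in added]
    lines += [f"{kind} {name} lost ClusterRole {role}" for kind, _, name in removed]
    return lines
```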
Motivation
Many golang applications expose debug information using pprof. We should add a playbook action to collect that data.
I received a report that one pod didn't start running when checking Robusta on the Kubernetes bundled inside Docker Desktop. (https://docs.docker.com/desktop/kubernetes/)
I believe it was the node-exporter pod. (@piomin is that right?)
We should check this and fix support for Docker Desktop if it doesn't work.
CallbackBlocks are part of Robusta's API. They're used to write playbooks where the user clicks a button (e.g. in Slack) and this triggers another playbook which runs only when the user clicked.
This powers, for example, the playbook that lets you increase the HPA max replicas. (See docs)
Today CallbackBlocks aren't documented well in the docs on writing playbook actions.
We should document them there in a new page in that section of the docs. The page will be all about callbacks, how they work, and how to write a playbook that uses them.
For reference, you can search for CallbackBlock in the playbooks/ directory and read the existing playbooks to see how it works.
Is your feature request related to a problem? Please describe.
Using the Python memory profiler is tricky: it is unclear when the profiling actually starts and stops, so timing related actions is difficult.
Describe the solution you'd like
I would like an indication to when the profiling starts and when it stops.
Motivation
Sometimes alerts are really noisy and you just want to silence them temporarily. You can do so by opening AlertManager and creating a silence in the UI (or by using the amtool cli) but you can't do so directly from Slack.
Suggested Solution
CallbackBlock
which lets the user trigger (1) when they choose
Title says it all. It should match the other badges already there.
The pod ps output is with the global pid (not namespaced) - this should be documented
Trying to enable git playbooks in my private generated_values.yaml file causes an error
coalesce.go:163: warning: skipped value for playbookRepos: Not a table.
Steps to reproduce the behavior:
In the generated_values.yaml file add:
playbookRepos:
  some_git_playbooks:
    url: "[email protected]:robusta-dev/my-playbooks.git"
helm install robusta ...
Expected behavior
The git playbooks repo should be installed and loaded into Robusta runner
In other words, Robusta doesn't create the channel.
Add something to error messages in gen-config and perhaps in robusta log output. E.g.
Cannot send to Slack channel XYZ. Please verify that the channel exists and the Robusta app was added to the channel. (See video in [docs](https://docs.robusta.dev/master/catalog/sinks/slack.html#sending-robusta-notifications-to-a-private-channel))
Thank you to @orclassiq for reporting
Today our sinks are documented in two places:
I've seen a few people confused by this. They go to the Sinks page and read that we support, e.g., Kafka but they can't find details on how to configure Kafka.
I suggest we move the details on how to configure each sink out of the Configuration Guide page (the 2nd page mentioned above) and into a unique page for each sink. Each sink should have a subpage under the Sinks page and it should have both the screenshot and the details on yaml configuration.
@arikalon1 wdyt?
Is your feature request related to a problem? Please describe.
Usually, when installing a helm chart, it will install all resources related to the deployment into a separate ns. However, when installing robusta, it does not create a new ns but installs everything into the default ns.
Describe the solution you'd like
I would like a new ns to be created called robusta.
Describe alternatives you've considered
N/A
Additional context
I think it is common practice to have a separate ns so it would be nice if that was the case when installing the Robusta Helm Chart.
Motivation
This is a common Prometheus alert and we should enrich it out of the box.
Today images are attached to Slack messages as attachments. It would be nice to embed the images directly into the message itself so that you don't need to click to see them.
This StackOverflow question seems relevant
When implementing this, we will need to verify that images remain private and are not available outside of the Slack community in which they're shared.
We probably can't embed SVG images and that's OK.
Thanks to @tim-sendible for the idea
Describe the bug
robusta playbooks push fails if robusta is installed in a namespace other than the default namespace and the --namespace flag is not specified. Since it is not recommended to have pods in the default namespace, it is common to install robusta in its own namespace.
To Reproduce
Steps to reproduce the behavior:
helm install robusta robusta/robusta -f ./generated_values.yaml -n robusta --create-namespace
robusta playbooks push ./my-playbooks-project-root
Expected behavior
An appropriate error should be displayed instead of the Python error.
Errors
======================================================================
Uploading playbooks code...
======================================================================
No resources found in default namespace.
======================================================================
Runner pod not found.
======================================================================
======================================================================
Fetching logs...
======================================================================
Error from server (NotFound): deployments.apps "robusta-runner" not found
Traceback (most recent call last):
File "/home/avinashupadhyaya/.local/bin/robusta", line 8, in <module>
sys.exit(app())
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
return get_command(self)(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 497, in wrapper
return callback(**use_params) # type: ignore
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/robusta/cli/playbooks_cmd.py", line 84, in push
return
File "/usr/lib/python3.9/contextlib.py", line 124, in __exit__
next(self.gen)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/robusta/cli/utils.py", line 93, in fetch_runner_logs
subprocess.check_call(
File "/usr/lib/python3.9/subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'kubectl logs deployment/robusta-runner -c runner --since=1s' returned non-zero exit status 1.
Robusta version
robusta version
version 0.8.26
Additional context
Sorry, I opened the issue without checking for the namespace flag. I found the namespace flag but would like appropriate error messages to be displayed if possible.
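A sketch of how the CLI could turn the failing kubectl call from the traceback into a readable hint. The helper names (namespace_hint, run_kubectl) and the message text are hypothetical; only subprocess.check_call and kubectl's --namespace flag come from the report above and standard tooling.

```python
import subprocess
import sys
from typing import List, Optional


def namespace_hint(namespace: Optional[str]) -> str:
    """Build a user-facing message naming the namespace that was searched."""
    checked = namespace or "default"
    return (
        f"Could not find the Robusta runner in namespace '{checked}'. "
        "If Robusta is installed elsewhere, pass --namespace <name>."
    )


def run_kubectl(args: List[str], namespace: Optional[str] = None) -> bool:
    """Run kubectl and print a hint instead of a Python traceback on failure."""
    cmd = ["kubectl", *args]
    if namespace:
        cmd += ["--namespace", namespace]
    try:
        subprocess.check_call(cmd)
        return True
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Non-zero exit status or kubectl missing from PATH: show the hint only.
        print(namespace_hint(namespace), file=sys.stderr)
        return False
```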
This is based on feedback from @tim-sendible and I believe we've heard it at least once before from an existing user.
The general idea is that when you define a PrometheusRule (e.g. using the Prometheus Operator) you should be able to define alongside that rule the automations to run when that alert fires. This way you don't need to define the alert in one place (PrometheusRule) and the automation in another (Robusta's values.yaml)
For simple automations, this should be easy. We can parse an annotation like:
robusta.dev/action: logs_enricher
For actions which take parameters, the syntax might get more complicated.
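For the simple, parameter-free case, the parsing could look roughly like this. The annotation key comes from the example above; accepting a comma-separated list of action names is an assumption made for the sketch, as is the function name.

```python
from typing import Dict, List

ACTION_ANNOTATION = "robusta.dev/action"


def actions_from_annotations(annotations: Dict[str, str]) -> List[str]:
    """Return the action names declared on a rule (comma-separated is an assumed syntax)."""
    raw = annotations.get(ACTION_ANNOTATION, "")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

Actions with parameters would need a richer value format, which this sketch deliberately leaves open.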
@tim-sendible thank you for the feedback and feel free to comment if I've misunderstood or missed anything
Line 27 in cbef6bd
We might need to update pip in the Dockerfile before running the command
Hi everyone, I am new here as a member and I would love to contribute to the troubleshooting section. The Robusta docs already have Python and Java guides, each with its own SDK.
I would like to create an SDK for golang so that the documentation and Robusta's features are enhanced further. Can you provide any guidance on how we can build it and how the Python/Java SDKs work? I want to learn and improve my skills along the way. Looking forward to hearing from the community.
Describe the solution you'd like
Telegram is a very popular tool right now for sending alerts, and I think it would be very handy if a Telegram sink were supported.
When setting playbooksPersistentVolume: true, the chart tries to create a PVC of 128Mi.
Some storage classes don't support such a small size, and the PVC creation fails.
Steps to reproduce the behavior:
Set playbooksPersistentVolume: true in the values.yaml file
helm install robusta ...
Expected behavior
There should be a helm value that allows overriding the default PVC size
In default messages we send to Slack, we should specify the cluster name. This is important for multi-cluster environments / environments with more than one AlertManager.
We should make sure that we send the cluster name for:
If possible, it would be nice to let users customize our bot-name (Robusta) or at least to specify the cluster name there too.
Thanks again to @tim-sendible for the feedback
I have several containers inside the pod, and when the pod crashes I want to see in the Slack message exactly which container caused it. For example, my workflow launches some init containers before the pod starts (fetch secrets, internal dependencies, flyway). For now I get the same message 'Crashing pod in namespace ', and I need to describe the pod to understand which container failed. I want to look at the message and immediately know: ah, it's a flyway problem, no panic.
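A minimal sketch of how the crashing container could be named, including init containers such as flyway. It reads a pod represented as a plain dict with the standard status fields; crashing_containers is a hypothetical helper, not existing Robusta code.

```python
from typing import Any, Dict, List


def crashing_containers(pod: Dict[str, Any]) -> List[str]:
    """List containers (init containers first) currently waiting in CrashLoopBackOff."""
    status = pod.get("status", {})
    found: List[str] = []
    for key, label in (
        ("initContainerStatuses", "init container"),
        ("containerStatuses", "container"),
    ):
        for container in status.get(key) or []:
            waiting = (container.get("state") or {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                found.append(f"{label} {container['name']}")
    return found
```

The resulting strings could be appended to the existing "Crashing pod" message so the failing container is visible at a glance.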
Describe the bug
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Alt text should be meaningful.
If the logo is the logo for Robusta, then the word Robusta should be in the alt text.
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
Is your feature request related to a problem? Please describe.
how we can send events to ElasticSearch
Describe the bug
resource_babysitter displays no resource name if 'metadata' as a whole is included in omitted_fields
To Reproduce
Expected behavior
The changed deployment's name is displayed
Thank you @tuananh2508 for reporting it
Motivation
It can be hard to understand exactly what data is contained inside a Kubernetes volume. We can provide visibility with Robusta actions
Suggested Implementation
Add a Robusta action which:
ls -R on the volume to show all files.
Bonus
Add support for VolumeSnapshots too. For VolumeSnapshots you will first need to create a temporary Volume based on the snapshot, then do the above, and finally delete the temporary volume.
Caveats
The above sometimes won't work if the Volume is in use, depending on the mount's AccessMode. This can be fixed in various ways. For example, for ReadWriteOnce it can be bypassed by running the reader pod on the same node as the pod that is currently using the volume. Alternatively, it could be fixed for all AccessModes by always creating a VolumeSnapshot and reading the snapshot, not the original.
In any event, a first version doesn't need to support any of this.
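As a sketch of a first version, the reader pod could be described by a manifest like the one built below: it mounts the PersistentVolumeClaim read-only and runs ls -R on it. The function name, the busybox image, the /data mount path, and the generateName prefix are all assumptions for illustration; creating the pod and collecting its logs is left out.

```python
from typing import Any, Dict


def volume_reader_pod(pvc_name: str, namespace: str) -> Dict[str, Any]:
    """Build a Pod manifest that lists every file on the given PVC."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"generateName": "volume-reader-", "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",  # run once, then keep logs for collection
            "containers": [
                {
                    "name": "reader",
                    "image": "busybox",  # assumed image; any image with ls works
                    "command": ["ls", "-R", "/data"],
                    "volumeMounts": [
                        {"name": "target", "mountPath": "/data", "readOnly": True}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "target",
                    "persistentVolumeClaim": {"claimName": pvc_name, "readOnly": True},
                }
            ],
        },
    }
```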
When defining multiple playbooks with different sinks, the sinks of one playbook may override the sinks of other playbooks.
Steps to reproduce the behavior:
Expected behavior
An update message is sent to both sinks
thanks @tuananh2508 for reporting this
Describe the bug
Providing an invalid account token to the robusta CLI while generating the values.yaml file throws different Python errors in different scenarios.
To Reproduce
Steps to reproduce the behavior:
robusta gen-config
Answer y to "Would you like to use Robusta UI? This is HIGHLY recommended. [y/N]:" and to "Do you already have a Robusta account? [y/N]:"
Enter an invalid token at "Please insert your Robusta account token:". Some examples are inserting the account_id or the actual account token with a few characters removed/changed.
Expected behavior
Robusta should catch these errors and display an informative error message.
Errors
Traceback (most recent call last):
File "/home/avinashupadhyaya/.local/bin/robusta", line 8, in <module>
sys.exit(app())
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
return get_command(self)(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 497, in wrapper
return callback(**use_params) # type: ignore
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/robusta/cli/main.py", line 234, in gen_config
token = json.loads(base64.b64decode(robusta_api_key))
File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.9/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 375 (char 374)
Traceback (most recent call last):
File "/home/avinashupadhyaya/.local/bin/robusta", line 8, in <module>
sys.exit(app())
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
return get_command(self)(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 497, in wrapper
return callback(**use_params) # type: ignore
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/robusta/cli/main.py", line 234, in gen_config
token = json.loads(base64.b64decode(robusta_api_key))
File "/usr/lib/python3.9/base64.py", line 87, in b64decode
return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
Traceback (most recent call last):
File "/home/avinashupadhyaya/.local/bin/robusta", line 8, in <module>
sys.exit(app())
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 214, in __call__
return get_command(self)(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/typer/main.py", line 497, in wrapper
return callback(**use_params) # type: ignore
File "/home/avinashupadhyaya/.local/lib/python3.9/site-packages/robusta/cli/main.py", line 234, in gen_config
token = json.loads(base64.b64decode(robusta_api_key))
File "/usr/lib/python3.9/json/__init__.py", line 341, in loads
s = s.decode(detect_encoding(s), 'surrogatepass')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd3 in position 0: invalid continuation byte
Additional context
N/A
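All three tracebacks above come from the same expression, json.loads(base64.b64decode(robusta_api_key)). A minimal sketch of guarding it follows; decode_account_token is a hypothetical wrapper, and returning None on failure is just one possible design.

```python
import base64
import json
from typing import Any, Dict, Optional


def decode_account_token(token: str) -> Optional[Dict[str, Any]]:
    """Decode a base64-encoded JSON token; return None if it is malformed."""
    try:
        decoded = json.loads(base64.b64decode(token))
    except ValueError:
        # binascii.Error, UnicodeDecodeError and json.JSONDecodeError
        # are all subclasses of ValueError, so this covers each traceback.
        return None
    return decoded if isinstance(decoded, dict) else None
```

The caller can then print an informative message and ask for the token again instead of exiting with a traceback.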
I got the above error when trying to manually trigger an action as instructed in the documentation.
"robusta playbooks trigger python_debugger name=podname namespace=default"
I'm new to kubernetes and everything related, so please let me know what information you need to help me help you debug this issue. Someone mentioned that they will need to see the logs for robusta-runner. Let me know where I can find these logs so I can get them to you for you to review.
Thanks.
Describe the bug
Playbooks pushed to namespace other than default are not present in robusta playbooks list
To Reproduce
Steps to reproduce the behavior:
helm install robusta robusta/robusta -f ./generated_values.yaml -n robusta --create-namespace
robusta playbooks push ./my-playbooks-project-root --namespace robusta
robusta playbooks list --namespace robusta | grep 'my_action', it returns nothing.
Expected behavior
The custom playbook pushed should be present in the list of playbooks returned by robusta playbooks list --namespace robusta
Screenshots
Happy to provide the contents of pyproject.toml and the file in my_action
Robusta version
robusta version
version 0.8.26
Additional context
N/A
Hi,
thank you for the great tool!
We might be able to contribute on the feature request below but I wanted to first check if you had any thoughts, design concerns or suggestions:
It would be nice to be able to route the Robusta messages similarly to how alertmanager does it.
Here's an example for Slack:
- slack_sink:
    name: prod_alerts
    match:
      namespace: prod-*
    slack_channel: prod-issues
would send messages where namespace matches prod-* to the prod-issues slack channel.
Ideally it should be possible to match against all attributes of the Finding class
thanks!
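The proposed matching could be sketched with shell-style wildcards from Python's standard library. Treating a finding as a flat dict of string attributes, and the name finding_matches, are assumptions for illustration rather than how the Finding class is actually structured.

```python
from fnmatch import fnmatchcase
from typing import Dict


def finding_matches(attributes: Dict[str, str], match: Dict[str, str]) -> bool:
    """Return True if every pattern in `match` matches the corresponding attribute."""
    return all(
        fnmatchcase(attributes.get(key, ""), pattern)
        for key, pattern in match.items()
    )
```

A sink with no match section would receive everything, since an empty set of patterns matches trivially.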
Currently I can't use robusta to get notifications about ingresses.
I would like to get notifications about ingresses (or other resources) the same way I get them about pods.
Today we don't send resource creations/deletions to the Robusta UI. We should send them because they add useful information.
The relevant code to change is in values.yaml. The change is probably trivial, but it requires testing. If you work on this, please document any manual testing that you did to verify this works.
Relevant code in values.yaml:
platformPlaybooks:
- triggers:
  - on_deployment_update: {}
  actions:
  - resource_babysitter: {}
  sinks:
  - "robusta_ui_sink"
- triggers:
  - on_daemonset_update: {}
  actions:
  - resource_babysitter: {}
  sinks:
  - "robusta_ui_sink"
- triggers:
  - on_statefulset_update: {}
  actions:
  - resource_babysitter: {}
  sinks:
  - "robusta_ui_sink"
See also: https://docs.robusta.dev/master/catalog/triggers/kubernetes.html
Is your feature request related to a problem? Please describe.
I'd like an additional sink, and one that's very extensible.
Describe the solution you'd like
I use ntfy.sh for push notifications to my phone. It accepts PUT/POST to an endpoint, and then pushes them to the subscriber. I suspect there are other tools that would also accept a simple payload that could make use of this kind of output.
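To show how small such a sink could be, here is a sketch that builds the HTTP request for ntfy: a POST to the topic URL with the message as the body and an optional Title header. The function name and default title are assumptions; sending the request (for example with urllib.request.urlopen) is left to the caller.

```python
import urllib.request


def build_ntfy_request(
    topic: str,
    message: str,
    title: str = "Robusta",
    server: str = "https://ntfy.sh",
) -> urllib.request.Request:
    """Build (but do not send) a POST request that publishes a message to an ntfy topic."""
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/{topic}",
        data=message.encode("utf-8"),
        headers={"Title": title},
        method="POST",
    )
```

Because the payload is just a URL, a body, and a few headers, the same shape would likely fit other simple webhook-style receivers.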
Is your feature request related to a problem? Please describe.
I'd like to be able to watch for changes to CRDs, but the docs don't describe that. It appears it's not supported.
Describe the solution you'd like
Extend support to register CRDs to trigger on changes
Describe alternatives you've considered
Doing it myself (writing a small service), looking at kube-watch and the open PRs there for how this may be supported.
Additional context
N/A
Thank you!
When I log out and log back in to the UI, I need to select the cluster and namespace again (instead of the default ALL). I want to configure which cluster and namespace are shown by default, or at least have the UI remember my last choice.
Describe the bug
Helm values file is defaulted to runner version 0.0.0, which causes the installation documented on the website to fail.
To Reproduce
Steps to reproduce the behavior:
The runner never deploys because the image tag is defaulted to 0.0.0
Expected behavior
I should be able to install this like a normal Helm chart, with sensible defaults.