Comments (9)
Hello @bharathguvvala
Thanks for reporting the gap in the docs, and sorry for the problem :(
I think the problem should be addressed by improving our docs about this behavior and how to solve it. I'd address it through the docs because updating the CR is part of KEDA's normal operation (it doesn't depend on the fallback) and the status is part of the CRD, so users can build checking systems on top of it.
To give some context about that change: the underlying controller added rate limit support with quite restrictive values (5 and 10, IIRC). We increased those values for the most common user type (< 100 ScaledObjects), to prevent bursting the API server by mistake.
If you are willing to help with the documentation, it'd be awesome! Real user experiences are the best teachers for other folks.
from keda.
Willing to participate in the discussion and to contribute to code/documentation.
@JorTurFer For scaledobjects that have not been configured with the fallback option, isn't it better to skip updating the fallback health status on every metrics read call, which would avoid redundant calls to the K8s API server? Instead, this condition could be updated from the controller during scaledobject reconciliations. That also means the fallback health stats are only updated for scaledobjects where the fallback is enabled; for the rest, they are defaulted at the time of the scaledobject reconciliation.
Regarding the documentation, I'll raise a PR in the next couple of days which provides guidance on setting up and configuring KEDA on large clusters with a high number of deployments.
@JorTurFer For scaledobjects that have not been configured with the fallback option, isn't it better to skip updating the fallback health status on every metrics read call, which would avoid redundant calls to the K8s API server? Instead, this condition could be updated from the controller during scaledobject reconciliations. That also means the fallback health stats are only updated for scaledobjects where the fallback is enabled; for the rest, they are defaulted at the time of the scaledobject reconciliation.
Your point is interesting; it's true that updating the fallback status all the time when the feature is disabled doesn't make sense. Checking the code, I noticed that we may be updating the value even when nothing has changed.
Checking the fallback logic, we are calling updateStatus, and there we don't check whether the status has really changed before patching the resource:
Lines 115 to 129 in f2d86a8
I think that we can improve that logic to reduce the calls to the API server. @zroubalik @dttung2905 @wozniakjan WDYT?
Checking the fallback logic, we are calling updateStatus, and there we don't check whether the status has really changed before patching
That is a pretty good optimization, especially given there is already a DeepCopy available (Line 116 in f2d86a8), and the check could be as simple as !reflect.DeepEqual().
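A minimal sketch of that check, assuming simplified stand-in types rather than KEDA's real ones: `HealthStatus`, `ScaledObjectStatus`, `statusPatcher`, `countingPatcher` and `updateStatusIfChanged` are all hypothetical names for illustration, and the patcher only counts calls instead of talking to an API server.

```go
package main

import (
	"fmt"
	"reflect"
)

// HealthStatus and ScaledObjectStatus are hypothetical, simplified stand-ins
// for the status types stored on a ScaledObject.
type HealthStatus struct {
	NumberOfFailures int32
	Status           string
}

type ScaledObjectStatus struct {
	Health map[string]HealthStatus
}

// statusPatcher is a hypothetical abstraction over the call that patches
// the status subresource on the API server.
type statusPatcher interface {
	PatchStatus(name string, status ScaledObjectStatus) error
}

// countingPatcher records how many patches would have reached the API server.
type countingPatcher struct{ calls int }

func (p *countingPatcher) PatchStatus(string, ScaledObjectStatus) error {
	p.calls++
	return nil
}

// updateStatusIfChanged patches only when desired differs from current.
// Note that reflect.DeepEqual treats a nil map and an empty map as different,
// so callers should initialize maps consistently.
func updateStatusIfChanged(p statusPatcher, name string, current, desired ScaledObjectStatus) (bool, error) {
	if reflect.DeepEqual(current, desired) {
		return false, nil // nothing changed, skip the API call
	}
	if err := p.PatchStatus(name, desired); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	p := &countingPatcher{}
	current := ScaledObjectStatus{Health: map[string]HealthStatus{"s0": {0, "Happy"}}}
	same := ScaledObjectStatus{Health: map[string]HealthStatus{"s0": {0, "Happy"}}}
	changed := ScaledObjectStatus{Health: map[string]HealthStatus{"s0": {1, "Failing"}}}

	for _, desired := range []ScaledObjectStatus{same, changed} {
		if _, err := updateStatusIfChanged(p, "demo", current, desired); err != nil {
			fmt.Println("patch failed:", err)
		}
	}
	fmt.Println("patch calls:", p.calls) // prints "patch calls: 1"
}
```

Only the second update reaches the patcher, because the first desired status is deeply equal to the current one.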
Regarding the documentation, I'll raise a PR in the next couple of days which provides guidance on setting up and configuring KEDA on large clusters with a high number of deployments.
Thank you @bharathguvvala, that would be terrific. Also, feel free to introduce code improvements; smaller PRs are generally easier to get merged.
@wozniakjan @JorTurFer I have made a change to disable health status updates if the fallback for the scaledobject is not configured. I will go ahead with adding the tests if the intent of this change is okayed. On top of this, we could add logic to avoid redundant updates where a fallback is configured in a scaledobject.
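The intent of such a change could be sketched roughly as follows. This is not the actual patch: `Fallback`, `ScaledObject`, `recordMetricResult` and `errScalerUnavailable` are hypothetical, simplified names, and the boolean return value merely stands in for "a status write would be issued".

```go
package main

import (
	"errors"
	"fmt"
)

// errScalerUnavailable is a hypothetical error used to simulate a failed metrics read.
var errScalerUnavailable = errors.New("scaler unavailable")

// Fallback and ScaledObject are hypothetical, simplified stand-ins for the CRD types.
type Fallback struct {
	FailureThreshold int32
	Replicas         int32
}

type ScaledObject struct {
	Name     string
	Fallback *Fallback        // nil when no fallback is configured
	Failures map[string]int32 // per-metric failure count kept in the status
}

// recordMetricResult updates the per-metric failure count after a metrics read.
// It returns true when the status was touched and would need to be written
// back, and false when the update was skipped because no fallback is configured.
func recordMetricResult(so *ScaledObject, metricName string, err error) bool {
	if so.Fallback == nil {
		return false // fallback disabled: no health bookkeeping, no API call
	}
	if so.Failures == nil {
		so.Failures = make(map[string]int32)
	}
	if err != nil {
		so.Failures[metricName]++
	} else {
		so.Failures[metricName] = 0
	}
	return true
}

func main() {
	plain := &ScaledObject{Name: "no-fallback"}
	guarded := &ScaledObject{
		Name:     "with-fallback",
		Fallback: &Fallback{FailureThreshold: 3, Replicas: 2},
	}

	fmt.Println(recordMetricResult(plain, "s0", errScalerUnavailable))   // prints "false"
	fmt.Println(recordMetricResult(guarded, "s0", errScalerUnavailable)) // prints "true"
	fmt.Println(guarded.Failures["s0"])                                  // prints "1"
}
```

The sketch deliberately leaves out the second optimization mentioned above: when a fallback is configured, every read still reports a write, even if the count did not change.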
@JorTurFer @wozniakjan Taking a step back, I was wondering whether the error count information needs to be written back to the scaledobject status at all. Since this is transient information used to implement a form of circuit breaking, isn't it more appropriate to keep it inside the operator (in memory) and only update the condition whenever it flips?
I presume this information is updated in the status only to make it persistent so that it survives operator restarts, but that is expensive: the update is performed in the read path of GetMetrics and can affect GetMetrics latencies, which in turn can affect the autoscaler SLOs. If the same error count information can be reconstructed from scratch based on the new set of errors, then why not avoid persisting it?
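To make the proposal concrete, here is a rough sketch of an in-memory failure counter that signals a status write only when the health condition flips. It is an illustration under assumptions, not KEDA code: `failureTracker`, `newFailureTracker` and `Observe` are hypothetical names, and the threshold semantics (unhealthy once consecutive failures reach the threshold) are assumed for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// failureTracker is a hypothetical in-memory replacement for the failure
// counts that are otherwise persisted in the ScaledObject status.
type failureTracker struct {
	mu        sync.Mutex
	threshold int
	failures  map[string]int // consecutive failures per metric key
}

func newFailureTracker(threshold int) *failureTracker {
	return &failureTracker{threshold: threshold, failures: make(map[string]int)}
}

// Observe records the outcome of one metrics read. It returns the current
// health and whether the health flipped, which is the only case in which
// the condition on the ScaledObject would need to be updated.
func (t *failureTracker) Observe(key string, failed bool) (healthy, flipped bool) {
	t.mu.Lock()
	defer t.mu.Unlock()

	prev := t.failures[key]
	wasHealthy := prev < t.threshold

	next := 0 // a successful read resets the counter
	if failed {
		next = prev + 1
	}
	t.failures[key] = next

	healthy = next < t.threshold
	return healthy, healthy != wasHealthy
}

func main() {
	tracker := newFailureTracker(3)
	// Four failed reads followed by one successful read.
	outcomes := []bool{true, true, true, true, false}

	writes := 0
	for _, failed := range outcomes {
		if _, flipped := tracker.Observe("demo/s0", failed); flipped {
			writes++ // only here would the operator patch the condition
		}
	}
	fmt.Printf("status writes: %d of %d observations\n", writes, len(outcomes))
	// prints "status writes: 2 of 5 observations"
}
```

The trade-off is the one raised above: the counts are lost on an operator restart and have to be rebuilt from new errors, in exchange for keeping API writes out of the metrics read path.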