castai / helm-charts
CAST AI Kubernetes helm charts
License: Apache License 2.0
We are using https://external-secrets.io to sync our secrets between the backend and Kubernetes. Would it be possible to implement one of the following options so that the secret in the helm chart is created conditionally: only when no "self-managed" secret is specified, or skipped when the secret already exists?
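For context, an externally managed secret of the kind described above could look roughly like the sketch below. The store name, remote key path, and resource names are hypothetical, the `API_KEY` data key is an assumption about what the charts read, and the `apiVersion` depends on the installed External Secrets Operator release.

```yaml
# Sketch only: names marked "hypothetical" are placeholders, not values from the chart.
apiVersion: external-secrets.io/v1beta1   # may differ depending on the operator version
kind: ExternalSecret
metadata:
  name: castai-api-key            # hypothetical
  namespace: castai-agent
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: my-backend-store        # hypothetical store pointing at the secrets backend
  target:
    name: castai                  # Kubernetes Secret the chart would be pointed at
    creationPolicy: Owner
  data:
    - secretKey: API_KEY          # assumed key name expected by the chart
      remoteRef:
        key: castai/api-key       # hypothetical path in the backend
```

With a secret synced this way, the chart only needs to reference the Secret by name rather than create it.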
This is a Helm antipattern: the namespace should be managed by Helm, and you can get its value in templates by referencing .Release.Namespace
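A minimal sketch of what that looks like in a template, assuming a hypothetical ServiceAccount manifest (the resource shown is illustrative, not taken from the chart). `.Release.Name` and `.Release.Namespace` are Helm built-in objects.

```yaml
# Hypothetical template fragment: the namespace comes from the release,
# not from a hard-coded value or a Namespace manifest inside the chart.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
```

The namespace is then chosen at install time with `--namespace`, and Helm can create it on demand with `--create-namespace`.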
Installing any evictor version from .0.24 onward fails with a duplicate key error:
Helm install failed for release castai-agent/castai-evictor with chart [email protected]: error while running post render on files: map[string]interface {}(nil): yaml: unmarshal errors: line 98: mapping key "resources" already defined at line 73
It looks like there are now two resources fields following the addition of the VPA, so no evictor deployment will work with .0.24 until this is resolved. I have pinned our installations to the latest working version, .0.23.57.
My values for reference:
values:
  dryRun: false
  aggressiveMode: true
  apiKeySecretRef: castai
  managedByCASTAI: true
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 128Mi
We get the warnings below while deploying the chart. We need to upgrade to the latest Kubernetes version and therefore need these warnings fixed, but we don't want to maintain our own local copy of the chart; we want to keep using this upstream chart.
helm upgrade --install cluster-controller -f values.yaml castai-helm/castai-cluster-controller -n castai-agent
W0720 06:02:27.182588 588 gcp.go:120] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.25+; use gcloud instead.
To learn more, consult https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
W0720 06:02:44.999440 588 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0720 06:02:45.323757 588 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0720 06:02:45.662278 588 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
Release "cluster-controller" has been upgraded. Happy Helming!
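One common way charts handle this deprecation is to pick the PodDisruptionBudget API version from the cluster's capabilities. The fragment below is a sketch of that pattern, not the chart's actual template; the selector label and `minAvailable` value are assumptions.

```yaml
# Hypothetical templates/pdb.yaml: use policy/v1 where the cluster serves it,
# and fall back to policy/v1beta1 on older clusters.
{{- if .Capabilities.APIVersions.Has "policy/v1/PodDisruptionBudget" }}
apiVersion: policy/v1
{{- else }}
apiVersion: policy/v1beta1
{{- end }}
kind: PodDisruptionBudget
metadata:
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  minAvailable: 1                                       # assumed value
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}   # assumed label
```

Because the check runs at render time, the same upstream chart can serve clusters before and after v1.25 without a local fork.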
Most other cast.ai repos allow for:
apiKeySecretRef: ""
This is handy when secrets are managed and synced to the cluster outside of the Helm apply, as ours are.
The chart does have secretName, but that creates a secret with that name from whatever is passed in .Values.apiKey.
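The requested behaviour could be sketched as a guard around the Secret template: render the Secret only when no existing secret is referenced. The value names `apiKeySecretRef`, `secretName`, and `apiKey` come from the discussion above; the template itself and the `API_KEY` data key are assumptions, not the chart's actual contents.

```yaml
# Hypothetical templates/secret.yaml: skipped entirely when apiKeySecretRef is set,
# so an externally synced Secret is never overwritten by the chart.
{{- if not .Values.apiKeySecretRef }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ .Values.secretName }}
  namespace: {{ .Release.Namespace }}
type: Opaque
stringData:
  API_KEY: {{ .Values.apiKey | quote }}   # assumed key name
{{- end }}
```

The workload would then read the key from `apiKeySecretRef` when it is set and from `secretName` otherwise.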
Do you have a clear guide for installing it on OpenShift 4.x? I mean recommended values.
As you may know, OCP 4.x comes with a monitoring stack by default, and it also enforces some security restrictions at runtime.
Hey Team!
How can we deploy all the charts without manually copy-pasting the cluster ID, which is created only after castai-agent is deployed?
Right now I have multiple clusters with castai-agent and want to add castai-kvisor, but I am stuck on the cluster ID.
When I tried adding the STATIC_CLUSTER_ID parameter to castai-agent, it failed with a "cluster not found" error.
Hi Team,
We are seeing the following failure while setting up castai-agent:
k logs -n castai-agent castai-agent-bc568646c-dpngq
time="2023-02-08T13:39:13Z" level=info msg="running agent version: GitCommit=\"0ebd8d1fa65524cebda49b791a4e9e4a1fceb0b2\" GitRef=\"refs/tags/v0.42.1\" Version=\"v0.42.1\"" version=v0.42.1
time="2023-02-08T13:39:13Z" level=info msg="platform URL: https://api.cast.ai" version=v0.42.1
time="2023-02-08T13:39:13Z" level=info msg="starting healthz on port: 9876" version=v0.42.1
time="2023-02-08T13:39:13Z" level=error msg="agent stopped with an error: healthz server: http: Server closed" version=v0.42.1
time="2023-02-08T13:39:13Z" level=fatal msg="agent failed: getting provider: configuring aws client: getting instance region: EC2MetadataRequestError: failed to get EC2 instance identity document\ncaused by: EC2MetadataError: failed to make EC2Metadata request\nrequest blocked by allow-route-regexp \"^$\": /latest/dynamic/instance-identity/document\n\n\tstatus code: 404, request id: " version=v0.42.1
Best Regards
Ganesh Kumar
We create all our namespaces from a centralized repo to make it easy to apply any kind of labels or annotations we desire to all the workloads running in the namespace.
Before installing the CAST AI agent into our cluster using Helm, we created the required namespace (castai-agent) from the centralized repo mentioned above. When we run helm install or helm upgrade to deploy the agent, Helm cannot apply the rendered Namespace manifest because the namespace already exists, as shown below:
Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. Unable to continue with update: Namespace "castai-agent" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "castai-agent"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "castai-agent"
Please remove the namespace creation manifest from the helm chart for castai-agent so that helm can create the namespace if needed.
I see a similar issue was opened in the past but was closed: #135
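Until the chart changes, one possible workaround is to give the pre-created namespace the ownership metadata that the error message asks for, so Helm can adopt it. The sketch below uses the release name and namespace quoted in the error; whether adopting the namespace into the release is acceptable depends on how the centralized repo manages it.

```yaml
# Sketch: Namespace carrying the Helm ownership label and annotations
# named in the "invalid ownership metadata" error above.
apiVersion: v1
kind: Namespace
metadata:
  name: castai-agent
  labels:
    app.kubernetes.io/managed-by: Helm
  annotations:
    meta.helm.sh/release-name: castai-agent
    meta.helm.sh/release-namespace: castai-agent
```

Note that once adopted, the namespace belongs to the release, so uninstalling the release may delete it.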
Running a Kubernetes cluster with TLS 1.3 only:
$ curl -vkI https://10.96.0.1:443/api
* Trying 10.96.0.1:443...
* Connected to 10.96.0.1 (10.96.0.1) port 443 (#0)
* ALPN: offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN: server accepted h2
* Server certificate:
* subject: CN=kube-apiserver
* start date: Jul 5 15:28:54 2023 GMT
* expire date: Sep 11 09:17:14 2024 GMT
* issuer: CN=kubernetes
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* using HTTP/2
* h2h3 [:method: HEAD]
* h2h3 [:path: /api]
* h2h3 [:scheme: https]
* h2h3 [:authority: 10.96.0.1]
* h2h3 [user-agent: curl/8.0.1]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0x7f1030631af0)
> HEAD /api HTTP/2
> Host: 10.96.0.1
> user-agent: curl/8.0.1
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 405
HTTP/2 405
< audit-id: 793432e5-8d19-4714-88f7-b74a1737d9c5
audit-id: 793432e5-8d19-4714-88f7-b74a1737d9c5
< cache-control: no-cache, private
cache-control: no-cache, private
< content-type: application/json
content-type: application/json
< x-kubernetes-pf-flowschema-uid: b7b912df-2347-4ff0-bc46-03db625f6b68
x-kubernetes-pf-flowschema-uid: b7b912df-2347-4ff0-bc46-03db625f6b68
< x-kubernetes-pf-prioritylevel-uid: 91aefef1-aa34-43dd-bcba-e0d14858263e
x-kubernetes-pf-prioritylevel-uid: 91aefef1-aa34-43dd-bcba-e0d14858263e
< content-length: 229
content-length: 229
< date: Thu, 25 Jan 2024 18:18:54 GMT
date: Thu, 25 Jan 2024 18:18:54 GMT
<
* Connection #0 to host 10.96.0.1 left intact
Castai-agent fails to connect to the API server:
I0125 18:10:12.686315 1 autoscaler.go:46] Scaling namespace: castai-agent, target: deployment/castai-agent
E0125 18:10:12.692345 1 autoscaler.go:49] failed to discover preferred resources: Get https://10.96.0.1:443/api?timeout=32s: remote error: tls: protocol version not supported
The agent expects the namespace to be castai-agent; otherwise we get:
E0801 03:04:48.260087 1 leaderelection.go:334] error initially creating leader election record: namespaces "castai-agent" not found
E0801 03:04:51.692050 1 leaderelection.go:334] error initially creating leader election record: namespaces "castai-agent" not found
E0801 03:04:55.826509 1 leaderelection.go:334] error initially creating leader election record: namespaces "castai-agent" not found
E0801 03:04:58.587909 1 leaderelection.go:334] error initially creating leader election record: namespaces "castai-agent" not found
E0801 03:05:01.360483 1 leaderelection.go:334] error initially creating leader election record: namespaces "castai-agent" not found
E0801 03:05:05.235647 1 leaderelection.go:334] error initially creating leader election record: namespaces "castai-agent" not found
The CAST AI agent is installed on an EKS cluster, but it reports a spot node as on-demand. Is there a way to make the agent identify the node as spot?
Can you add apiKeySecretRef for aks-init instead of relying on the secret castai-cluster-controller?