community-charts / helm-charts
Community Helm Charts
License: MIT License
A critical remote file access vulnerability has been reported in older versions of MLflow, affecting the mlflow server and mlflow ui commands. More details here: GHSA-xg73-94fp-g449
Can we please get an update to MLflow >= 2.21!
No response
I am trying to deploy MLflow in a local Kubernetes cluster.
First, I created an empty database in Postgres 15 at postgresql://postgres-service.hm-postgres.svc:5432/hm_mlflow_db.
However, I failed to deploy MLflow with databaseMigration: true using:
helm upgrade \
mlflow \
mlflow \
--install \
--repo=https://community-charts.github.io/helm-charts \
--namespace=hm-mlflow \
--create-namespace \
--values=my-values.yaml
my-values.yaml:
backendStore:
  databaseMigration: true
  databaseConnectionCheck: true
  postgres:
    enabled: true
    host: postgres-service.hm-postgres.svc
    port: 5432
    database: hm_mlflow_db
    user: admin
    password: passw0rd
Any guidance would be appreciated, thanks!
version.BuildInfo{Version:"v3.12.0", GitCommit:"c9f554d75773799f72ceef38c51210f1842a1dea", GitTreeState:"clean", GoVersion:"go1.20.3"}
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.2", GitCommit:"5835544ca568b757a8ecae5c153f317e5736700e", GitTreeState:"clean", BuildDate:"2022-09-21T14:33:49Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"darwin/amd64"} Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3+k3s1", GitCommit:"01ea3ff27be0b04f945179171cec5a8e11a14f7b", GitTreeState:"clean", BuildDate:"2023-03-27T22:04:57Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/arm64"}
mlflow
0.7.19
The dbchecker instance shows:
[INFO] Waiting for Database to become ready...
[INFO] Database OK
However, mlflow-db-migration shows this and keeps retrying:
2023/07/03 23:39:31 WARNING mlflow.store.db.utils: SQLAlchemy engine could not be created. The following exception is caught.
(psycopg2.OperationalError) SCRAM authentication requires libpq version 10 or above
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Operation will be retried in 0.1 seconds
2023/07/03 23:39:31 WARNING mlflow.store.db.utils: SQLAlchemy engine could not be created. The following exception is caught.
(psycopg2.OperationalError) SCRAM authentication requires libpq version 10 or above
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Operation will be retried in 0.3 seconds
2023/07/03 23:39:31 WARNING mlflow.store.db.utils: SQLAlchemy engine could not be created. The following exception is caught.
(psycopg2.OperationalError) SCRAM authentication requires libpq version 10 or above
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Operation will be retried in 0.7 seconds
2023/07/03 23:39:32 WARNING mlflow.store.db.utils: SQLAlchemy engine could not be created. The following exception is caught.
(psycopg2.OperationalError) SCRAM authentication requires libpq version 10 or above
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Operation will be retried in 1.5 seconds
2023/07/03 23:39:33 WARNING mlflow.store.db.utils: SQLAlchemy engine could not be created. The following exception is caught.
(psycopg2.OperationalError) SCRAM authentication requires libpq version 10 or above
First of all, thanks to everyone who created this Helm chart; it is really good and easy to use.
However, I ran into a problem when enabling the ServiceMonitor and Prometheus metrics alongside the Deployment. The ServiceMonitor created for MLflow is generally correct, yet in its current form it does not work for me.
I use the latest Prometheus deployed with the official Helm chart. The MLflow metrics did not show up under Targets; the endpoint was visible in the Service Discovery panel of the Prometheus dashboard, but appeared as 0/1 active targets.
After a couple of hours of debugging, I manually changed targetPort: 80 to port: http in the deployed ServiceMonitor manifest. It worked straight away!
What I propose is a simple fix:
According to the official Prometheus troubleshooting docs, the port specified in a ServiceMonitor should be referenced by name instead of by port number (Link to docs).
The simple fix would be to change targetPort: 80 to port: http in templates/servicemonitor.yaml. The port name http is already hardcoded, so it can be used directly, or a new parameter could be introduced to give users the freedom to choose the port name.
I am aware that a port number of type Integer should also work...
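To make the proposed change concrete, here is a minimal sketch of a ServiceMonitor that selects the endpoint by Service port name. It is not the chart's actual template: the resource name, namespace, selector label, metrics path, and scrape interval are all assumptions chosen for illustration.

```yaml
# Sketch only: a ServiceMonitor that refers to the Service port by name.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mlflow                         # hypothetical resource name
  namespace: mlflow                    # hypothetical namespace
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: mlflow   # assumed label on the chart's Service
  endpoints:
    - port: http                       # Service port name, replacing targetPort: 80
      path: /metrics                   # assumed metrics path
      interval: 30s                    # assumed scrape interval
```

With port, Prometheus resolves the endpoint through the named Service port, so the value has to match the port name declared on the Service.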
3.6.0
1.19
mlflow
0.2.21
No response
No response
No response
No response
helm install --namespace mlflow mlflow-tracking-server community-charts/mlflow --set serviceMonitor.enabled=true
No response
Right now, the way user and password are exposed in the chart requires putting both into the repository as plain text. It would be better to support predefined secrets. I can think of various ways to achieve this.
One way this could work:
Cannot think of many alternatives right now
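For illustration only, and not necessarily what the requester had in mind: credentials can live in a Secret created outside the chart and be injected into the container through secretKeyRef. The Secret name, key names, and environment variable names below are hypothetical.

```yaml
# Sketch only: Secret managed outside the chart, so no credentials are committed.
apiVersion: v1
kind: Secret
metadata:
  name: mlflow-db-credentials        # hypothetical name
type: Opaque
stringData:
  username: "<db-user>"              # placeholder
  password: "<db-password>"          # placeholder
---
# Container fragment (not a complete manifest) that reads the credentials.
env:
  - name: DB_USER                    # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: mlflow-db-credentials
        key: username
  - name: DB_PASSWORD                # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: mlflow-db-credentials
        key: password
```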
No response
Add existing contributors to the README file
While installing the mlflow chart, I am trying to migrate from an old MLflow version to a new one. I'm using the backendStore.databaseMigration: true value for that. But the mlflow pod failed to start with this error:
mlflow.exceptions.MlflowException: Detected out-of-date database schema (found version c48cb773bb87, but expected cc1f77228345). Take a backup of your database, then run 'mlflow db upgrade <database_uri>' to migrate your database to the latest schema. NOTE: schema migration may result in database downtime - please consult your database's documentation for more detail.
From the looks of things, the migration Job should have pre-install,pre-upgrade hooks instead of post-install,post-upgrade, but I could be wrong here.
Running the Job from the chart manually with kubectl fixed this issue for me, but it will probably reappear with the next release.
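A rough sketch of what the suggested hook change could look like on the migration Job follows. The Job name, image placeholder, hook weight, delete policy, and backoff limit are assumptions; only the hook names and the mlflow db upgrade command come from the report above.

```yaml
# Sketch only: migration Job that runs before the release's workloads are upgraded.
apiVersion: batch/v1
kind: Job
metadata:
  name: mlflow-db-migration                      # hypothetical name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade      # instead of post-install,post-upgrade
    "helm.sh/hook-weight": "0"                   # assumed ordering weight
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 3                                # assumed retry budget
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db-migration
          image: "<mlflow-image>"                # placeholder
          command: ["mlflow", "db", "upgrade", "<database_uri>"]
```

One caveat with this approach: pre-install hooks run before the chart's regular resources are created, so any Secret or ConfigMap the Job reads would need to exist already or be created as an earlier hook.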
Thanks!
v3.9.3
v1.24.3
mlflow
0.6.0
No response
DB migration job should run before mlflow pod upgrade.
mlflow:
  nodeSelector:
    redacted: Shared
  ingress:
    enabled: true
  artifactRoot:
    s3:
      enabled: true
      bucket: "redacted"
      awsAccessKeyId: ""
      awsSecretAccessKey: ""
  extraEnvVars:
    AWS_DEFAULT_REGION: eu-central-1
    MLFLOW_S3_ENDPOINT_URL: https://bucket.redacted.s3.eu-central-1.vpce.amazonaws.com
  backendStore:
    databaseMigration: true
    databaseConnectionCheck: true
    mysql:
      enabled: true
      host: "redacted.eu-central-1.rds.amazonaws.com"
      database: "mlflow"
      user: ""
      password: ""
helm upgrade --install --values override.yaml --wait --create-namespace --atomic --timeout 15m0s -f secrets://secrets.yaml shared-services ./shared-services
Chart was installed as a part of another umbrella chart
The new staticPrefix argument being under extraArgs breaks the chart for users that need to use the extraArgs
version.BuildInfo{Version:"v3.8.1", GitCommit:"5cb9af4b1b271d11d7a97a71df3ac337dd94ad37", GitTreeState:"clean", GoVersion:"go1.17.8"}
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:38:33Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"darwin/arm64"} Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/arm64"}
mlflow
0.2.7
The newly added staticPrefix parameter under extraArgs breaks the chart when used because it tries to add an extra argument to the mlflow server command that doesn't exist.
No response
No response
No response
helm install -f mlflow/values.yaml mlflow ./mlflow/
I am creating a pull request to address this in a slightly different way and haven't tested it yet. I just wanted to open a request to highlight a solution.
You could also handle the staticPrefix as a separate argument in the extraEnv when starting up the mlflow server to make this work more smoothly for the end user, but this solution should work as well.
unable to get deploy config.","error":"configmaps \"inferenceservice-config\" is forbidden: User \"system:serviceaccount:kserve:default\" cannot get resource \"configmaps\" in API group \"\" in the namespace \"kserve\"",
3.8.0
1.22.5
kserve
1.0.1
After installing the current kserve Helm chart, I had an issue where the kserve deployment could never deploy successfully: the kserve default Service Account has neither a cluster role nor a cluster role binding, so it couldn't access the configmap it needed.
Once I created a cluster role binding to a cluster role that had the proper API group permissions, the deployment was able to access the inferenceservice-config configmap and continue its startup.
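The workaround described above might look roughly like the following. The role name is hypothetical, and the list and watch verbs are an assumption beyond the get permission named in the error message; the service account and namespace are taken from that message.

```yaml
# Sketch only: grant the kserve default ServiceAccount read access to configmaps.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kserve-configmap-reader      # hypothetical name
rules:
  - apiGroups: [""]                  # core API group, as in the error message
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]  # the error only mentions "get"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kserve-configmap-reader      # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kserve-configmap-reader
subjects:
  - kind: ServiceAccount
    name: default
    namespace: kserve
```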
No response
No response
No response
helm install [RELEASE_NAME] community-charts/kserve
No response
Currently, it's not easy to define MySQL DB settings for the MLflow Helm chart.
The MySQL driver was already added to the mlflow Docker image in version 1.27.0.36. We need to have the following settings in the values.yaml file.
backendStore:
  mysql:
    # -- Specifies if you want to use mysql backend storage
    enabled: false
    # -- MySQL host address. e.g. your Amazon RDS for MySQL
    host: "" # required
    # -- MySQL service port
    port: 3306 # required
    # -- mlflow database name created before in the mysql instance
    database: "" # required
    # -- mysql database user name which can access to mlflow database
    user: "" # required
    # -- mysql database user password which can access to mlflow database
    password: "" # required
NONE
No response
When you run mlflow behind a proxy, you often do not want to serve it at the root path. This means that you configure mlflow with --static-prefix and ensure it is served under that prefix.
As the chart is designed right now, the mlflow server can be started with this extra argument, but the readiness and liveness probes aren't configurable to use the prefix added to the mlflow server.
Parameterize the readiness probe and liveness probe paths in the deployment so that chart users can configure them.
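As a sketch of the desired outcome, the rendered container spec could end up with probe paths that carry the same prefix as the server. The /mlflow prefix, the container name, the container port name http, and the assumption that the health endpoint is served at <prefix>/health are all illustrative rather than taken from the chart.

```yaml
# Sketch only: container fragment with probe paths aligned to the static prefix.
containers:
  - name: mlflow                       # hypothetical container name
    args:
      - "--static-prefix=/mlflow"      # example prefix
    livenessProbe:
      httpGet:
        path: /mlflow/health           # assumed health path under the prefix
        port: http                     # assumed named container port
    readinessProbe:
      httpGet:
        path: /mlflow/health           # assumed health path under the prefix
        port: http
```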
NONE
No response
Yes, the old version of kserve has some bugs. Hence we need to update the kserve Helm chart to the latest version.
Update the kserve Helm chart to the latest version
NA
No response
I have a local minikube cluster. I installed the Helm chart with some changed settings; see below for the changed values. Everything else is the same as in the default values.yaml file. For the DB backend I am using bitnami/postgresql, and for S3 storage a MinIO instance. I have also created an initial bucket named "mlflow" in MinIO.
Then I created a simple k8s pod to run the simple training example from the MLflow docs. This pod has the following environment variable set: MLFLOW_TRACKING_URI=http://mlflow.airflow.svc.cluster.local:5000
Here is the link to that code. I can see the metadata about the model in the UI; however, the artifact section in the UI is empty and the bucket is also empty.
version.BuildInfo{Version:"v3.9.0", GitCommit:"7ceeda6c585217a19a1131663d8cd1f7d641b2a7", GitTreeState:"clean", GoVersion:"go1.17.5"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
mlflow
latest
No response
I would expect the artifacts to appear in the MinIO bucket.
Install the Helm chart with the MinIO and PostgreSQL config. Run a simple example from the docs.
backendStore:
  databaseMigration: true
  databaseConnectionCheck: true
  postgres:
    enabled: true
    host: mlflow-postgres-postgresql.airflow.svc.cluster.local
    database: mlflow_db
    user: mlflow
    password: mlflow
artifactRoot:
  proxiedArtifactStorage: true
  s3:
    enabled: true
    bucket: mlflow
    awsAccessKeyId: {{ requiredEnv "MINIO_USERNAME" }}
    awsSecretAccessKey: {{ requiredEnv "MINIO_PASSWORD" }}
extraEnvVars:
  MLFLOW_S3_ENDPOINT_URL: minio.airflow.svc.cluster.local
helm install mlflow-release community-charts/mlflow --values values.yaml
No response
When we run the pipeline for the kserve chart, the kind tests fail because cert-manager is not installed in the cluster.
v3.9.0
v1.24.2
kserve
0.0.2
The kserve controller waits forever and reports the following error.
Warning FailedMount 19s (x8 over 83s) kubelet MountVolume.SetUp failed for volume "cert" : secret "kserve-webhook-server-cert" not found
No response
No response
No response
helm install mlflow community-charts/kserve
No response
When an ingress resource is created with annotations that refer to a secret, it would be nice to be able to deploy that secret along with the chart.
An additional Helm value that adds a secret from a list of key-value pairs representing the secret's data
No response
When we open a pull request, the chart-testing (lint) step in the release.yaml file fails with the following error.
Error: Error linting charts: Error processing charts
------------------------------------------------------------------------------------------------------------------------
✖︎ mlflow => (version: "0.1.47", path: "charts/mlflow") > Error validating maintainer 'Burak Ince': 404 Not Found
------------------------------------------------------------------------------------------------------------------------
This happens because the maintainer name checked by the ct lint command must be a GitHub username rather than a real name.
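Two possible ways to address this are sketched below as separate file fragments. The username is a placeholder, and whether to disable maintainer validation at all is a project decision rather than something stated in the report.

```yaml
# Chart.yaml (fragment): use a GitHub username that resolves on github.com.
maintainers:
  - name: "<github-username>"      # placeholder, not the real account name
---
# ct-lint.yaml (fragment): alternatively, skip the maintainer check entirely.
validate-maintainers: false
```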
v3.9.0
v1.24.2
mlflow
0.1.47
No response
No response
No response
No response
ct lint --debug --config ./.github/configs/ct-lint.yaml --lint-conf ./.github/configs/lintconf.yaml
No response
When we install MLflow on on-premises systems such as a Raspberry Pi-based Kubernetes cluster, a SaaS solution for blob storage and the DB may not be available.
Maybe we can have one flag for the Postgres DB and another flag for the MinIO installation. They could be false by default, and both could be sub-charts of our Helm chart.
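A minimal sketch of how such optional sub-charts are usually wired in Helm, using the condition field of a dependency. The repository URL, version placeholders, and value keys are assumptions for illustration, not the chart's actual layout.

```yaml
# Chart.yaml (fragment): optional sub-charts toggled by a values flag.
dependencies:
  - name: postgresql
    repository: https://charts.bitnami.com/bitnami   # assumed repository
    version: "x.y.z"                                 # placeholder version
    condition: postgresql.enabled
  - name: minio
    repository: https://charts.bitnami.com/bitnami   # assumed repository
    version: "x.y.z"                                 # placeholder version
    condition: minio.enabled
---
# values.yaml (fragment): both sub-charts are off unless explicitly enabled.
postgresql:
  enabled: false
minio:
  enabled: false
```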
No response
If an automated task updates README files, the pipeline fails because no new Helm chart version exists. README-only changes must be excluded from the Helm chart release process.
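One way to achieve this in GitHub Actions is a paths-ignore filter on the release workflow trigger. This is a sketch: the branch name and the glob pattern are assumptions about the repository layout.

```yaml
# .github/workflows/release.yaml (fragment): skip releases for README-only pushes.
on:
  push:
    branches:
      - main                 # assumed default branch
    paths-ignore:
      - "**/README.md"       # assumed location pattern of generated README files
```

The workflow still runs when a push touches a README together with other files, because paths-ignore only skips pushes in which every changed file matches the pattern.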
NONE
NONE
all charts
NONE
No response
No response
No response
No response
NONE
No response
When I deploy the mlflow chart into the cluster, I don't have any postgres database running externally. So I have to deploy the postgres helm chart first and then set the pointers to the database manually. It'd be preferable if the mlflow chart had postgresql dependency available.
Postgresql helm chart as an optional dependency.
Installing postgresql separately or using a cloud-provided database, neither of which I'm very happy about.
No response
Apparently, the default database setting was intended to use an in-memory SQLite database.
To do this, the special filename ":memory:" is used (see https://www.sqlite.org/inmemorydb.html).
However, in the deployment file the trailing colon is missing, which means that SQLite creates a file /mlflow/:memory.
While I think it would also be perfectly fine to persist the SQLite3 database (for which a PersistentVolumeClaim would be needed), the malformed ":memory" creates a strangely named file and does NOT serve the purpose of an in-memory database.
Irrelevant
Irrelevant
mlflow
0.7.19
I found a weird-looking file ":memory" within /mlflow in the mlflow container.
Reading the output of "helm template ...", I see that a URI ending with ":memory", without the trailing colon, is used. I checked the source, and it does indeed contain a typo.
I would expect the URI to end with ":memory:".
Whenever you install the chart without specifying a PostgreSQL or MySQL database.
None
A normal install without values.
I understand the error might have arisen because YAML will not let you end a string literal with a colon unless you quote the string. At some point the trailing colon may have been removed without considering its meaning.
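The quoting issue can be illustrated with a small fragment. The args layout is an assumption about how the deployment passes the flag; the point is only that quoting preserves the trailing colon.

```yaml
# Unquoted, the trailing colon turns the list item into a mapping key
# with an empty value, so the following line does not yield a plain string:
#   - --backend-store-uri=sqlite:///:memory:
# Quoted, the full URI survives, including the trailing colon:
args:
  - "--backend-store-uri=sqlite:///:memory:"
```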
backend-store-uri did not render properly when using postgresql
3.12.2
1.28.1
mlflow
0.7.19
No response
No response
No response
No response
helm install mlflow community-charts/mlflow --values values.yaml
No response