
Model Deployment at Scale on Kubernetes 🦄️

Home Page: https://bentoml.com

License: Other

Go 36.95% PLpgSQL 0.88% HTML 0.30% JavaScript 0.33% TypeScript 54.25% SCSS 0.11% CSS 2.94% Makefile 0.38% Dockerfile 0.01% Shell 2.77% Smarty 0.33% Nix 0.60% Mustache 0.15%
bentoml kubernetes mlops model-deployment model-serving k8s machine-learning

yatai's Introduction

๐Ÿฆ„๏ธ Yatai: Model Deployment at Scale on Kubernetes


โš ๏ธ Yatai for BentoML 1.2 is currently under construction. See Yatai 2.0 Proposal for more details.


Yatai (ๅฑ‹ๅฐ, food cart) is the Kubernetes deployment operator for BentoML.

It lets DevOps teams seamlessly integrate BentoML into their GitOps workflow, deploying and scaling machine learning services on any Kubernetes cluster.

👉 Join our Slack community today!


Why Yatai?

Yatai empowers developers to deploy BentoML on Kubernetes, optimized for CI/CD and DevOps workflows.

Yatai is cloud native and DevOps friendly. Via its Kubernetes-native workflow, specifically the BentoDeployment CRD (Custom Resource Definition), DevOps teams can easily fit BentoML-powered services into their existing workflows.
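
Concretely, a BentoDeployment manifest can be versioned and applied like any other Kubernetes resource; a minimal sketch of the GitOps loop (bento_deployment.yaml is a placeholder name, and the full resource shape appears in the Quick Tour below):

git add bento_deployment.yaml
git commit -m "deploy updated bento"
kubectl apply -f bento_deployment.yaml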

Getting Started

  • 📖 Documentation - Overview of the Yatai docs and related resources
  • ⚙️ Installation - Hands-on instructions on how to install Yatai for production use
  • 👉 Join Community Slack - Get help from our community and maintainers

Quick Tour

Let's try out Yatai locally in a minikube cluster!

โš™๏ธ Prerequisites:

  • Install the latest minikube: https://minikube.sigs.k8s.io/docs/start/
  • Install the latest Helm: https://helm.sh/docs/intro/install/
  • Start a minikube Kubernetes cluster: minikube start --cpus 4 --memory 4096. If you are using macOS, use the hyperkit driver to avoid the macOS Docker Desktop networking limitation.
  • Check that the minikube cluster status is "running": minikube status
  • Make sure your kubectl is configured with the minikube context: kubectl config current-context
  • Enable the ingress controller: minikube addons enable ingress
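
Put together, the local cluster setup from the list above is just a few commands (a condensed restatement, assuming minikube and Helm are already installed; add --driver=hyperkit on macOS per the note above):

minikube start --cpus 4 --memory 4096
minikube status
kubectl config current-context
minikube addons enable ingress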

🚧 Install Yatai

Install Yatai with the following script:

bash <(curl -s "https://raw.githubusercontent.com/bentoml/yatai/main/scripts/quick-install-yatai.sh")

This script will install Yatai along with its dependencies (PostgreSQL and MinIO) on your minikube cluster.

Note that this installation script is made for development and testing use only. For production deployment, check out the Installation Guide.

To access Yatai web UI, run the following command and keep the terminal open:

kubectl --namespace yatai-system port-forward svc/yatai 8080:80

In a separate terminal, run:

YATAI_INITIALIZATION_TOKEN=$(kubectl get secret yatai-env --namespace yatai-system -o jsonpath="{.data.YATAI_INITIALIZATION_TOKEN}" | base64 --decode)
echo "Open in browser: http://127.0.0.1:8080/setup?token=$YATAI_INITIALIZATION_TOKEN"

Open the URL printed above from your browser to finish admin account setup.

๐Ÿฑ Push Bento to Yatai

First, get an API token and login to the BentoML CLI:

  • Keep the kubectl port-forward command in the step above running

  • Go to Yatai's API tokens page: http://127.0.0.1:8080/api_tokens

  • Create a new API token from the UI, making sure to assign "API" access under "Scopes"

  • Copy the login command upon token creation and run as a shell command, e.g.:

    bentoml yatai login --api-token {YOUR_TOKEN} --endpoint http://127.0.0.1:8080

If you don't already have a Bento built, run the following commands from the BentoML Quickstart Project to build a sample Bento:

git clone https://github.com/bentoml/bentoml.git && cd ./bentoml/examples/quickstart
pip install -r ./requirements.txt
python train.py
bentoml build

Push your newly built Bento to Yatai:

bentoml push iris_classifier:latest

🔧 Install the yatai-image-builder component

Yatai's image-building feature ships as a separate component; install it with the following script:

bash <(curl -s "https://raw.githubusercontent.com/bentoml/yatai-image-builder/main/scripts/quick-install-yatai-image-builder.sh")

This will install the BentoRequest CRD (Custom Resource Definition) and the Bento CRD in your cluster. Again, this script is intended for development and testing purposes only.

🔧 Install the yatai-deployment component

Yatai's deployment feature ships as a separate component; install it with the following script:

bash <(curl -s "https://raw.githubusercontent.com/bentoml/yatai-deployment/main/scripts/quick-install-yatai-deployment.sh")

This will install the BentoDeployment CRD (Custom Resource Definition) in your cluster and enable the deployment UI in Yatai. Again, this script is intended for development and testing purposes only.
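
As a quick sanity check (a hedged aside; exact CRD names may vary by version), you can list the resource definitions these two components registered under the yatai.ai API groups:

kubectl get crds | grep yatai.ai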

🚢 Deploy Bento!

Once the yatai-deployment component is installed, Bentos pushed to Yatai can be deployed to your Kubernetes cluster and exposed via a Service endpoint.

A Bento Deployment can be created via applying a BentoDeployment resource:

Define your Bento deployment in a my_deployment.yaml file:

apiVersion: resources.yatai.ai/v1alpha1
kind: BentoRequest
metadata:
    name: iris-classifier
    namespace: yatai
spec:
    bentoTag: iris_classifier:3oevmqfvnkvwvuqj  # check the tag by `bentoml list iris_classifier`
---
apiVersion: serving.yatai.ai/v2alpha1
kind: BentoDeployment
metadata:
    name: my-bento-deployment
    namespace: yatai
spec:
    bento: iris-classifier
    ingress:
        enabled: true
    resources:
        limits:
            cpu: "500m"
            memory: "512Mi"
        requests:
            cpu: "250m"
            memory: "128Mi"
    autoscaling:
        maxReplicas: 10
        minReplicas: 2
    runners:
        - name: iris_clf
          resources:
              limits:
                  cpu: "1000m"
                  memory: "1Gi"
              requests:
                  cpu: "500m"
                  memory: "512Mi"
          autoscaling:
              maxReplicas: 4
              minReplicas: 1

Apply the deployment to your minikube cluster:

kubectl apply -f my_deployment.yaml

Now you can check the deployment status via kubectl -n yatai get bentodeployment
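
To go one step further, here is a hedged verification sketch: the Service name my-bento-deployment and port 3000 (BentoML's default HTTP port) are assumptions about what yatai-deployment creates, and the /classify route comes from the quickstart service:

kubectl -n yatai get pods
kubectl -n yatai port-forward svc/my-bento-deployment 3000:3000
curl -X POST -H 'Content-Type: application/json' -d '[[5.1, 3.5, 1.4, 0.2]]' http://127.0.0.1:3000/classify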

Community

Contributing

There are many ways to contribute to the project:

  • If you have any feedback on the project, share it with the community in GitHub Discussions under the BentoML repo.
  • Report issues you're facing and "Thumbs up" on issues and feature requests that are relevant to you.
  • Investigate bugs and review other developers' pull requests.
  • Contribute code or documentation to the project by submitting a GitHub pull request. See the development guide.

License

Elastic License 2.0 (ELv2)


yatai's Issues

Local Container bento (works) <-> Yatai build Bento (fails with async issue)

Hi Team,

I'm trying to serve my models using the Yatai server, but when I try to run inference against the endpoint it errors out with the following.

API response -
"An error has occurred in BentoML user code when handling this request, find the error details in server logs"

Container logs (the boxed rich-formatted traceback, condensed to its call frames):

/home/bentoml/bento/src/service.py:23 in predict (results = bl_model_runner.run(img))
/opt/conda/lib/python3.9/site-packages/bentoml/_internal/runner/runner.py:165 in run
/opt/conda/lib/python3.9/site-packages/bentoml/_internal/runner/remote.py:152 in run
/opt/conda/lib/python3.9/site-packages/anyio/from_thread.py:35 in run
/opt/conda/lib/python3.9/site-packages/anyio/_backends/_asyncio.py:847 in run_async_from_thread
/opt/conda/lib/python3.9/concurrent/futures/_base.py:445 in result
/opt/conda/lib/python3.9/concurrent/futures/_base.py:390 in __get_result
/opt/conda/lib/python3.9/site-packages/bentoml/_internal/runner/remote.py:144 in async_run
/opt/conda/lib/python3.9/site-packages/bentoml/_internal/runner/remote.py:125 in _async_req
/opt/conda/lib/python3.9/site-packages/aiohttp/client.py:1138 in __aenter__
/opt/conda/lib/python3.9/site-packages/aiohttp/client.py:559 in _request
/opt/conda/lib/python3.9/site-packages/aiohttp/client_reqrep.py:898 in start
/opt/conda/lib/python3.9/site-packages/aiohttp/streams.py:616 in read

As you can see, I have not used any async functionality anywhere in the Bento service code, yet the Bento built on the Yatai server still fails with the above "async - ServerDisconnectedError: Server disconnected" errors.

from collections import defaultdict

import numpy as np
from PIL.ImageOps import exif_transpose

@svc.api(input=io.Image(), output=io.JSON())  # svc, io, and bl_model_runner are defined earlier in service.py
def predict(img):
    img = np.asarray(exif_transpose(img))
    results = bl_model_runner.run(img)
    p = results.pandas().xyxy[0]
    out = defaultdict(list)
    # (rest of the handler truncated in the original report)

If I simply run the bentoml containerize command and then do a docker run, the service works without any errors.
Am I missing something? Please impart some wisdom 🙏

Thanks

Poor Yatai documentation

There is no documentation about serving and routing to different Bentos. What's the API? I found nothing.

Yatai NGINX controller scope

Hi bentoml team 👋

I have a shared cluster where NGINX is already installed.
With the current version, the Yatai NGINX controller always tries to reconcile ingress objects that are outside the Yatai namespace and use the nginx ingress class, even if the Yatai controller is installed with ingress classname yatai-nginx.

  1. Is it possible to scope the Yatai NGINX controller to watch only the Yatai namespace with specific labels?
    https://docs.nginx.com/nginx-ingress-controller/installation/running-multiple-ingress-controllers/

  2. Is it possible to fix the Yatai NGINX installation to watch only the yatai-ingress ingress class?
    I noticed that controller-value is not set in the release; it may be a clue.

[Docs] Missing Logging and Monitoring install procedure

Hello bentoml team 👋

  • I deployed yatai following the getting started guide on the readme
  • On the UI, in the deployments section, I see that See Logs and Monitor are not available
  • The message is "Please install yatai component X"

I have looked at the readme.md, the Yatai Administrator's Guide, and the BentoML docs but found no reference to installing Yatai components.
I know that the 1.0.0 release mark is going to be hit soon; maybe the documentation will cover additional components then?

Thanks for your help

Error: [cli] `push` failed: request failed with status code 400: {"error":"ToModelRepositorySchemas: ToModelSchemas: GetImageName: cannot get ingress yatai-docker-registry: ingresses.networking.k8s.io \"yatai-docker-registry\" not found"}

Following the getting started tutorial from GitHub with minikube (https://github.com/bentoml/Yatai), I'm stuck at this step:
bentoml push iris_classifier:latest
I have successfully completed the other steps, like building the Bento, and created the admin account in the Yatai web interface at http://yatai.127.0.0.1.sslip.io/

Somehow the Docker registry service in the minikube cluster is not working.
Is anyone else facing this issue? I'm using Docker Desktop for Windows 10.

Yatai UI Design

  • model list detail page
  • bento repo, version list
  • bento details page
  • home page
  • Deployment creation page
  • Canary rollout workflow
  • Deployment list page
  • Deployment details page

How to install Yatai with the latest BentoML version ?

I would like to test a fix recently pushed in BentoML (bentoml/BentoML#2371).
I could test the fix locally on a running pod, and it seems to work, although I get another error further in the call stack...
In order to test things properly, I would like to re-install / upgrade Yatai on my K8s cluster so that the pods created at deployment time take the proper BentoML version.
Can you tell me how I can do that?

[Push] Model Push Fails with Getting Started Examples

Hello bentoml team 👋 🍱

Thanks for open-sourcing your work 📖
First time playing with your tools; it feels smooth so far!

I'm running the Getting Started Example on my machine

I use minikube locally to set up the services, as suggested.

Small bug first for step 1:

  • sudo minikube tunnel doesn't work for me:
🤷  Profile "minikube" not found. Run "minikube profile list" to view all profiles.
👉  To start a cluster, run: "minikube start"
  • However, minikube tunnel does work and does prompt for the sudo password

Anyway, the not-so-small bug from step 3:

  • Login to Yatai seems to have gone smoothly: [cli] login successfully!
  • The Bento build went well too: [cli] Successfully built Bento(tag="iris_classifier:[...]
  • However, the bento push command did not go well:
Error: [cli] `push` failed: request failed with status code 400: {"error":"ToModelRepositorySchemas: ToModelSchemas: GetImageName: cannot get ingress yatai-docker-registry: ingresses.networking.k8s.io \"yatai-docker-registry\" not found"}

I'm wondering if there is an option/variable to change in the values for the Helm chart?
Indeed, k9s doesn't report a yatai-docker-registry service.

I will continue to look on my side and will provide an answer if I make it work 👍
Thanks for your feedback, have a nice weekend

feat: Support using k8s configmap/secret as environment variable value in deployment

Currently, deployment creation allows users to set an env var and pass it to the BentoServer container. However, in production scenarios users may want to source env var values directly from existing ConfigMap or Secret resources in their k8s cluster, e.g.:

env:
  # Define the environment variable
  - name: SPECIAL_LEVEL_KEY
    valueFrom:
      configMapKeyRef:
        # The ConfigMap containing the value you want to assign to SPECIAL_LEVEL_KEY
        name: special-config
        # Specify the key associated with the value
        key: special.how

Yatai should allow users to configure this when creating a deployment, and show an error when the referenced ConfigMap/Secret is missing from the cluster.
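
For local testing, the ConfigMap referenced above could be created with a one-liner (a hedged sketch; the name and key come from the example, the value is arbitrary):

kubectl -n yatai create configmap special-config --from-literal=special.how=very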

Yatai - Disable component installation if external

Hi bento team,

When specifying an external Docker registry, Yatai still wants to install a local registry in the yatai-components namespace.
Is it possible to bypass this installation when an external registry is specified with externalDockerRegistry.enabled: true?
We have the same behaviour with externalS3.enabled: true.

The only component that is bypassed when not needed is PostgreSQL.

Allow admin to create new accounts

  • admin can create new accounts, and assign a role for the new account
  • user can change their own password
  • admin can reset password or delete an account
  • Show "Delete User" on UI, for resources created by an account that was deleted

Delete deployment from UI

We should allow users to delete/stop a deployment in the web UI, perhaps with a button next to Update?


Postgres connection string with special characters

Hello bentoml team 👋

I am trying to build a Yatai production ready platform on AZURE.
The database is hosted on an external Azure Postgres Single server but I am unable to configure because the username and the password contain special characters.
For example with a password like special>#aec, I have the following error:

Error: migrate up db: cannot create migrate: parse "postgres://dbuser:special>": invalid port ":special>" after host
Usage:
  yatai-api-server serve [flags]

Flags:
  -c, --config string    (default "./yatai-config.dev.yaml")
  -h, --help            help for serve

Global Flags:
  -d, --debug   debug mode, output verbose output

error: migrate up db: cannot create migrate: parse "postgres://dbuser:special>": invalid port ":special>" after host

Is there a way to handle complex passwords?
NB: On Azure, the admin username of the database is something like admin@db-postgres-azdzazd, so the @ is not handled either.
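
A possible workaround, untested and only applicable if Yatai inserts the values into the URI verbatim: percent-encode the reserved characters before supplying the credentials (@ becomes %40, > becomes %3E, # becomes %23), so the example credentials would form a parseable URI (host left as a placeholder):

postgres://admin%40db-postgres-azdzazd:special%3E%23aec@<host>:5432/yatai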

Failed to deploy Yatai

Trying to deploy Yatai version 0.3.12 using helm chart.
After installing, we get this error in the console:

Error: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://yatai-ingress-controller-ingress-nginx-controller-admission.yatai-components.svc:443/networking/v1/ingresses?timeout=10s": service "yatai-ingress-controller-ingress-nginx-controller-admission" not found

Also, the yatai-components/yatai-yatai-deployment-operator-bdb69fd-9n8b8 pod keeps sitting in the ContainerCreating state. Checking the pod description shows:

Warning  FailedMount  119s (x23 over 63m)   kubelet  Unable to attach or mount volumes: unmounted volumes=[cert], unattached volumes=[cert kube-api-access-shb9b]: timed out waiting for the condition

Components, operators and builders resources not created when installing Yatai with postgresql.enabled=false

When I install Yatai using the following values:

postgresql:
  enabled: false

externalPostgresql:
  host: <my RDS host>
  port: 5432
  user: <my username>
  password: <my password>
  database: yatai

none of the resources that should be created in components, operators, and builders are created.
If I comment out postgresql.enabled and reinstall (after having deleted all Yatai resources), then all components get created.
I can then upgrade with postgresql.enabled = false.

I tried the operation many times and consistently hit the issue.

Can you confirm the issue? Am I missing something?

Cannot deploy with Yatai 0.2.1

I was using Yatai 0.1.4 successfully so far.
I upgraded to 0.2.1 in order to get the limits / requests fix (#187).
Here is how I did it:

  • uninstalled Yatai from the cluster using delete-yatai.sh
  • dropped my Yatai DB and recreated it (I'm using RDS)
  • reinstalled the Yatai 0.2.1 Helm chart
  • went to /setup?token=xxx for the admin setup
  • logged in and pushed my Bento

Since then, I can't deploy Bentos anymore, and constantly get this message in the "deployment events":

[2022-04-05 14:44:50] [BentoDeployment] [mytest] [GetBento] Fetching Bento mybento:4ctcgsfupga6svtp
[2022-04-05 14:44:50] [BentoDeployment] [mytest] [GetBento] Failed to fetch Bento mybento:4ctcgsfupga6svtp: DoJsonRequest Error: [GET]http://localhost:7777/api/v1/bento_repositories/mybento/bentos/4ctcgsfupga6svtp: Get "http://localhost:7777/api/v1/bento_repositories/mybento/bentos/4ctcgsfupga6svtp": dial tcp 127.0.0.1:7777: connect: connection refused

I also see the following log for the Yatai system pod, at the time I press "Submit" in the deployment panel:

2022/04/05 12:58:10 /home/runner/work/Yatai/Yatai/api-server/services/deployment.go:303 driver: bad connection
[2.078ms] [rows:0] UPDATE "deployment" SET "status"='deploying',"status_updated_at"='2022-04-05 12:58:10.71',"updated_at"='2022-04-05 12:58:10.71' WHERE id = 2 AND "deployment"."deleted_at" IS NULL

Does that ring a bell?
Thanks

Allow for disabling of Components

In the Admin Guide it states:

* **Yatai-components:**

    Yatai groups dependency services into components for easy management. These dependency services are deployed in the `yatai-components` namespace. Yatai installs the default deployment component after the service start.

    *Components and their managed services:*

    * *Deployment*: Nginx Ingress Controller, Minio, Docker Registry

    * *Logging*: Loki, Grafana

    * *Monitoring*: Prometheus, Grafana

This should not be done by Yatai/the operator; it should instead be done by the Helm chart that installs the operator.

Having it in the operator takes away a lot of control from users who may want to use their own Grafana, Prometheus, alternative logging systems, ingresses, etc.

Either provide a way to disable the creation of these components or, the better option, move them to the Helm chart.

Yatai UI possible bug

Hi

I'm not sure if this is how it is supposed to work, but the "Accessing Endpoint" button appears when the deployment is terminating/terminated, not when it is active/successful. As a user, the inverse display logic would be more useful.


feat: declarative deployment management for GitOps workflow on Yatai

This is something from our original design for Yatai that is still missing in the current implementation. We want to allow users to describe a deployment's detailed spec via a YAML file, which guarantees reproducing an identical model deployment on Kubernetes. This is especially useful for a GitOps workflow, where a user can test out a deployment in a test/dev environment, check the YAML file into a git repo, and apply the same deployment config to the production cluster.

  1. DevOps-centric workflow with kubectl and a Kubernetes CRD resource config:
kubectl apply -f ./bento_deployment.yaml
# bento_deployment.yaml
apiVersion: yatai.bentoml.org/v1beta1
kind: BentoDeployment
metadata:
  ...
spec:
  bento_tag: 'fraud_detector:dpijemevl6nlhlg6'
  autoscaling:
    minReplicas: 3
    maxReplicas: 20
  resources:
    limits:
      cpu: 500m
    requests:
      cpu: 200m
  runners:
    model_runner_a:
      autoscaling:
        minReplicas: 1
        maxReplicas: 5
      resources:
        limits:
          nvidia.com/gpu: 1
          cpu: 2000m
        ...

Note: we will not provide a bentoml yatai CLI command for creating/updating/deleting deployments for now; this is something we could provide in the future. The benefit is that if a team wants its ML scientists to manage deployments, DevOps doesn't need to worry about each ML scientist's k8s cluster access; they only manage Yatai's service role permissions & quota, and Yatai handles the API token management within the ML team.

bentoml yatai deploy -f ./my_deployment.yaml --cluster default --yatai-context default

Yatai customize DNS domain for internal components

Hello bentoml team 👋

My goal is to have a production-ready BentoML platform on Azure. I managed to get these features working:

  • external PG database on Azure (Postgres)
  • external Docker registry
  • external NGINX controller (deleted the one installed by default)
  • exposing Yatai at a custom URL with custom SSL

I still have one painful point concerning storage.
On Azure, I can't find a suitable option for storing data outside of the cluster (the MinIO gateway for Blob Storage is end-of-support: https://blog.min.io/deprecation-of-the-minio-gateway/).

If I use the default MinIO installation, there are settings that are incompatible with our security policies:

  1. The standard MinIO installation in yatai-components is exposed on a URL like *.apps.yatai.dev
  2. The ingress has no SSL

Even if I customize the NGINX settings after installation, there are still references to the old MinIO URL (when we push a local Bento to Yatai) that lead to errors.

Is it possible to customize the domain name (through an env var, for example) used for the NGINX exposure of internal components, and to add custom annotations to that exposure?

Support tolerations for internal components

Hi bento team,

Tolerations are supported only for yatai-server.
Internal components don't support tolerations, so we can't have a dedicated node pool for Yatai.
Is it possible to include this property for all deployments done by Yatai?

Thanks.

ERROR exporting to image

I deployed the default model (iris_clf) provided by BentoML to Yatai with the commands below, but the model status is failed. How can I fix it?

Model upload commands:

git clone https://github.com/bentoml/gallery.git && cd ./gallery/quickstart
pip install -r ./requirements.txt
python train.py
bentoml build
bentoml push iris_classifier:latest

model error log:

2022-07-24T10:06:34.094284541Z 
2022-07-24T10:06:34.215977057Z 
2022-07-24T10:06:34.215993849Z  => => transferring context: 2B                                            0.0s
2022-07-24T10:06:34.215996600Z  => [internal] load build definition from Dockerfile                       0.0s
2022-07-24T10:06:34.215999164Z  => => transferring dockerfile: 67B                                        0.0s
2022-07-24T10:06:34.216002336Z  => [internal] load build context                                          0.0s
2022-07-24T10:06:34.216005685Z  => => transferring context: 6.00kB                                        0.0s
2022-07-24T10:06:34.216009748Z  => [1/1] COPY . /model                                                    0.0s
2022-07-24T10:06:34.216013132Z  => exporting to image                                                     0.0s
2022-07-24T10:06:34.216016275Z  => => exporting layers                                                    0.0s
2022-07-24T10:06:34.232096742Z 
2022-07-24T10:06:34.232192336Z  => [internal] load .dockerignore                                          0.0s
2022-07-24T10:06:34.232199784Z  => => transferring context: 2B                                            0.0s
2022-07-24T10:06:34.232203245Z  => [internal] load build definition from Dockerfile                       0.0s
2022-07-24T10:06:34.232206504Z  => => transferring dockerfile: 67B                                        0.0s
2022-07-24T10:06:34.232209719Z  => [internal] load build context                                          0.0s
2022-07-24T10:06:34.232212781Z  => => transferring context: 6.00kB                                        0.0s
2022-07-24T10:06:34.232215739Z  => [1/1] COPY . /model                                                    0.0s
2022-07-24T10:06:34.232218992Z  => ERROR exporting to image                                               0.1s
2022-07-24T10:06:34.232222200Z  => => exporting layers                                                    0.0s
2022-07-24T10:06:34.232225356Z  => => exporting manifest sha256:569f9899458eafc642e494c0f52bf2c6018ec64a  0.0s
2022-07-24T10:06:34.232228490Z  => => exporting config sha256:3fd3a3435134a3249402ce659870d60be4956921ff  0.0s
2022-07-24T10:06:34.232232398Z 
2022-07-24T10:06:34.232283977Z  > exporting to image:
2022-07-24T10:06:34.232288879Z ------
2022-07-24T10:06:34.233997747Z error: failed to solve: invalid reference format


Yatai install command:

helm install yatai yatai/yatai \
  --set ingress.enabled=false \
  --set service.type=LoadBalancer \
  --set externalS3.enabled=true \
  --set externalS3.endpoint=$S3_ENDPOINT \
  --set externalS3.region=$MY_REGION \
  --set externalS3.bucketName=$BUCKET_NAME \
  --set externalS3.secure=true \
  --set externalS3.existingSecret=yatai-s3-credentials \
  --set externalS3.existingSecretAccessKeyKey=accessKeyId \
  --set externalS3.existingSecretSecretKeyKey=secretAccessKey \
  --set externalDockerRegistry.enabled=true \
  --set externalDockerRegistry.server=$ECR_ENDPOINT \
  --set externalDockerRegistry.username=AWS \
  --set externalDockerRegistry.secure=true \
  --set externalDockerRegistry.bentoRepositoryName=$BENTO_REPO \
  --set externalDockerRegistry.modelRepositoryName=$MODEL_REPO \
  --set externalDockerRegistry.existingSecret=yatai-docker-registry-credentials \
  --set externalDockerRegistry.existingSecretPasswordKey=password \
  -n yatai-system \
  --create-namespace

Is it a problem with the Docker runtime in kube? (kube version = 1.21)

Yatai UI not showing deployments done via kubectl command

I have performed a Yatai installation with external MinIO storage and an external private Docker registry.
I am able to upload Bentos and models and see them on the UI.
However, when I perform a Bento deployment using the kubectl command, I am unable to see it on the UI;
only if I create the deployment from the console can I see it reflected on the UI.

Below is the deployment file I am using:

apiVersion: serving.yatai.ai/v1alpha2
kind: BentoDeployment
metadata:
  name: f2
  namespace: yatai
spec:
  autoscaling:
    max_replicas: 1
    min_replicas: 1
  bento_tag: onnx_vad:vdrgpiqjxcuw2pko
  envs: []
  ingress:
    enabled: true
  resources:
    limits:
      cpu: 1000m
      memory: 1024Mi
    requests:
      cpu: 500m
      memory: 500Mi
  runners:
  - autoscaling:
      max_replicas: 1
      min_replicas: 1
    envs: []
    name: onnx_vad
    resources:
      limits:
        cpu: 1000m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 500Mi

Dashboard: error when creating Bento version

  • Push a Bento to Yatai
  • Log in to the dashboard
  • Open the pushed Bento
  • Create a version (Hit the Create button)
  • Fill in the required fields (version and date)
  • After confirmation, an error is displayed
  • Then trying to open the bento version consistently throws the same error


It seems that the call to:

curl -X 'GET' \
  'http://localhost:7777/api/v1/bento_repositories/iris_third_classifier/bentos/1.2' \
  -H 'accept: application/json'

returns a null manifest, whereas the dashboard assumes a manifest is present; see:

<div className={styles.value}>{bento.manifest.bentoml_version}</div>

Allow deletion of models, bentos, and deployments via yatai UI

Currently, it appears to be impossible to delete models, Bentos, or deployments from a running Yatai service. Over time this leads to visual clutter and makes it harder to interact with Yatai via the UI.

Ideally the user that pushed the model/bento to yatai (or created the deployment) should be allowed to delete them. Alternatively it could be limited to just admin accounts.

potential Yatai regression in 0.3.10

Issue:

Here is my test:
Install Yatai 0.3.8 with the local DB --> Push / Build / Deploy: OK
Upgrade the Helm release to 0.3.9 --> Push / Build / Deploy: OK
Upgrade the Helm release to 0.3.10 --> Push / Build / Deploy: OK
Then I wanted to use an external database on Azure, so I updated the Helm chart values and upgraded the release.
As this is a new DB, I needed to recreate a user + token.
Push / Build: OK
The problem is that I cannot create new deployments anymore; the status stays at building and no pods are created in the yatai namespace...
Here is the output of the controller:
W0705 17:14:22.142726       1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:yatai-components:yatai-yatai-deployment-operator" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
E0705 17:14:22.142762       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:yatai-components:yatai-yatai-deployment-operator" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
1.6570412863266902e+09	INFO	start cleaning up abandoned runner services	{"func": "doCleanUpAbandonedRunnerServices"}

Updates:

  1. created an issue: #266

Cannot push bento via CLI

I have installed Yatai in minikube.
When I try to push a Bento with bentoml push, I get the following error from the Yatai server:

Error: cli push failed: request failed with status code 400: {"error":"pre sign s3 upload url: get bucket yatai exist: Get \"https://yatai-minio-yatai-infra-cluster-10-109-170-163.apps.yatai.dev/yatai/?location=\": dial tcp: lookup yatai-minio-yatai-infra-cluster-10-109-170-163.apps.yatai.dev on 10.96.0.10:53: no such host"}

Looking at the Yatai UI, the model is successfully uploaded, but not the bento.

Grafana exposition

Hi bento team ✋

When I install the Monitoring component, it installs the Prometheus and Grafana stack, and an ingress is created on a domain owned by Yatai: http://grafana-yatai-infra-external-<IP>.apps.yatai.dev
In the web UI's monitoring panel, there is a link to the Grafana web UI that refers to another entrypoint, like https://<server>.<domain>/api/v1/clusters/default/grafana/?orgId=1

My question is: why do we need to expose Grafana on apps.yatai.dev when it is already exposed through the Yatai URL (and secured by our own annotations, versus the Yatai internal ingress for components, where we have no control)?

autoscaling not working because resources/requests are not specified when creating deployment

When I'm creating a new deployment from the Yatai UI, I can set up the autoscaling configuration; there's even a page asking for specific CPU/memory resources.

However, the deployment/pod that is created does not carry this information, which results in the error

the HPA was unable to compute the replica count: failed to get cpu utilization: missing request for cpu

and autoscaling not working. (The log line is from kubectl describe hpa.)

Support OIDC for user Management

This would avoid having to create users manually and would support non-Yatai-specific accounts. It's also more secure.

Generic feature set is:

  1. Get Groups for a user
  2. Assign permissions to a group instead of a user

docs: Add user documentation for using Yatai

  • CICD (covered in BentoML docs)
  • Architecture of distributed runners (P0)
  • Architecture of Yatai (P0)
  • Dependencies and explanations
  • Resource and replica configuration guide (P0)
  • Configure deployment with env vars
  • Guide for observability integration
  • Auto scaling guide
  • GPU support
  • Istio configuration (P0)

feat: support Azure blob storage for storing model files

Currently, only MinIO and AWS S3 are supported. Users can run Yatai in AKS (Azure Kubernetes Service), but they will be limited to either an in-cluster MinIO deployment or connecting to an S3 bucket, which is not ideal for teams that have all their infra on Azure. This feature request is to add support for Azure Blob Storage as the model/Bento storage backend, in addition to MinIO and S3.

Enabling external S3 doesn't disable Minio

When deploying Yatai in EKS with externalS3 enabled, it still installs MinIO, which claims 16(!) persistent volumes of 20G each. There doesn't seem to be a way to control this via the values file.

feat: support propagating liveness/readiness probe config to model deployment

A Bento deployment in Yatai currently gets a default liveness probe and readiness probe via BentoServer's corresponding endpoints. However, users may want to configure these further based on the type of workload (e.g. some workloads have longer initialization times; some may want a different timeout value).

This feature request proposes a deployment config in the advanced config section, allowing users to customize a deployment's liveness and readiness probes. It should allow the user to set the following values:

readiness_probe:
   initial_delay_seconds: ..
   timeout_seconds: ..
   period_seconds: ..
   success_threshold: ..
   failure_threshold: ..
liveness_probe:
   initial_delay_seconds: ..
   timeout_seconds: ..
   period_seconds: ..
   success_threshold: ..
   failure_threshold: ..
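
For reference, the native Kubernetes probe fields these snake_case keys would presumably map onto can be inspected with kubectl itself (a hedged pointer, not Yatai-specific):

kubectl explain pod.spec.containers.livenessProbe
kubectl explain pod.spec.containers.readinessProbe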

Failed to run Yatai server in on-premise K8S

Hello, bentoML team.

I've recently been trying to use BentoML and Yatai on our on-premise K8s cluster, but it failed because we don't have a LoadBalancer service on our cluster. Is there any guide or workaround for deploying Yatai on non-cloud K8s?

Thank you.

Following are a few error messages.

The error appears when I try to push a Bento to Yatai (yatai login succeeded):
์Šคํฌ๋ฆฐ์ƒท 2022-06-07 ์˜คํ›„ 3 33 16

And I found that bentoml push queries the pod named deployment-yatai-deployment-comp-operator under the yatai-operator namespace, which shows the following error: there is no externalIP on yatai-ingress-controller-ingress-nginx-controller.

2022-06-07T06:36:31.318Z	INFO	controller-runtime.manager.controller.deployment	getting Deployment ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.318Z	INFO	controller-runtime.manager.controller.deployment	Deployment getting successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.318Z	INFO	controller-runtime.manager.controller.deployment	creating namespace yatai-components ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.318Z	INFO	controller-runtime.manager.controller.deployment	namespace yatai-components creation successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.322Z	INFO	controller-runtime.manager.controller.deployment	Installing CertManagerComponent ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.322Z	INFO	controller-runtime.manager.controller.deployment	crd certificates.cert-manager.io already exists, so skipping install cert-manager	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.322Z	INFO	controller-runtime.manager.controller.deployment	Installed CertManagerComponent successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.325Z	INFO	controller-runtime.manager.controller.deployment	Installing YataiDeploymentOperatorComponent ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.326Z	INFO	controller-runtime.manager.controller.deployment	installing crd from file helm-charts/yatai-deployment-operator/crds/deployments.yaml ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.361Z	INFO	controller-runtime.manager.controller.deployment	crd bentodeployments.serving.yatai.ai updated successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.361Z	INFO	controller-runtime.manager.controller.deployment	getting helm release yatai ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.368Z	INFO	controller-runtime.manager.controller.deployment	found helm release yatai, status: deployed	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.369Z	INFO	controller-runtime.manager.controller.deployment	Installed YataiDeploymentOperatorComponent successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.373Z	INFO	controller-runtime.manager.controller.deployment	Installing CSIDriverImagePopulatorComponent ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.373Z	INFO	controller-runtime.manager.controller.deployment	getting helm release yatai-csi-driver-image-populator ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.376Z	INFO	controller-runtime.manager.controller.deployment	found helm release yatai-csi-driver-image-populator, status: deployed	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.377Z	INFO	controller-runtime.manager.controller.deployment	Installed CSIDriverImagePopulatorComponent successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.380Z	INFO	controller-runtime.manager.controller.deployment	Installing IngressControllerComponent ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.382Z	INFO	controller-runtime.manager.controller.deployment	getting helm release yatai-ingress-controller ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.390Z	INFO	controller-runtime.manager.controller.deployment	found helm release yatai-ingress-controller, status: failed	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.393Z	INFO	controller-runtime.manager.controller.deployment	Installed IngressControllerComponent successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.396Z	INFO	controller-runtime.manager.controller.deployment	Installing MinioComponent ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.396Z	INFO	controller-runtime.manager.controller.deployment	installing crd from file helm-charts/minio-operator/crds/minio.min.io_tenants.yaml ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.627Z	INFO	controller-runtime.manager.controller.deployment	crd tenants.minio.min.io updated successfully	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.627Z	INFO	controller-runtime.manager.controller.deployment	getting helm release yatai-minio ...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.639Z	INFO	controller-runtime.manager.controller.deployment	found helm release yatai-minio, status: failed	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.640Z	INFO	controller-runtime.manager.controller.deployment	getting ingress-controller service external ip...	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": ""}
2022-06-07T06:36:31.640Z	ERROR	controller-runtime.manager.controller.deployment	getting ingress-controller service external ip failed	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": "", "error": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!", "errorVerbose": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*IngressControllerComponent).getIngressControllerServiceIps\n\t/workspace/controllers/deployment_controller.go:294\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*MinioComponent).Install\n\t/workspace/controllers/deployment_controller.go:510\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile\n\t/workspace/controllers/deployment_controller.go:211\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile\n\t/workspace/controllers/deployment_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
github.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile
	/workspace/controllers/deployment_controller.go:211
github.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile
	/workspace/controllers/deployment_controller.go:126
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-06-07T06:36:31.641Z	ERROR	controller-runtime.manager.controller.deployment	Failed to install MinioComponent	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": "", "error": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!", "errorVerbose": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*IngressControllerComponent).getIngressControllerServiceIps\n\t/workspace/controllers/deployment_controller.go:294\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*MinioComponent).Install\n\t/workspace/controllers/deployment_controller.go:510\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile\n\t/workspace/controllers/deployment_controller.go:211\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile\n\t/workspace/controllers/deployment_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
github.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile
	/workspace/controllers/deployment_controller.go:126
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-06-07T06:36:31.649Z	ERROR	controller-runtime.manager.controller.deployment	Reconciler error	{"reconciler group": "component.yatai.ai", "reconciler kind": "Deployment", "name": "deployment", "namespace": "", "error": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!", "errorVerbose": "the external ip of service yatai-ingress-controller-ingress-nginx-controller on namespace yatai-components is empty!\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*IngressControllerComponent).getIngressControllerServiceIps\n\t/workspace/controllers/deployment_controller.go:294\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*MinioComponent).Install\n\t/workspace/controllers/deployment_controller.go:510\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).doReconcile\n\t/workspace/controllers/deployment_controller.go:211\ngithub.com/bentoml/yatai-deployment-comp-operator/controllers.(*DeploymentReconciler).Reconcile\n\t/workspace/controllers/deployment_controller.go:126\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214

External S3 fails for objects > 5 GB

Tried to do a bentoml push to a self-deployed Yatai, and I get this error:

<Error><Code>EntityTooLarge</Code><Message>Your proposed upload exceeds the maximum allowed size</Message><ProposedSize>5709791785</ProposedSize><MaxSize…

Support creating deployment that's only accessible in cluster

Add an option in BentoDeployment for users to choose whether they want to expose the service endpoint for external access.

Cluster-internal access should be the default behavior when creating a deployment via the CRD; the user should be able to explicitly opt into external exposure via a field in the spec or an annotation in the CRD resource YAML file, in which case the related ingress resource will be created.
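
With the CRD shape shown in the Quick Tour above, this maps naturally onto the existing ingress flag; a hedged sketch of a cluster-internal deployment (names reused from the earlier example):

kubectl apply -f - <<'EOF'
apiVersion: serving.yatai.ai/v2alpha1
kind: BentoDeployment
metadata:
    name: my-internal-deployment
    namespace: yatai
spec:
    bento: iris-classifier
    ingress:
        enabled: false  # no Ingress resource; the Service stays cluster-internal
EOF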

PostgreSQL SSL connection

Hello bentoml team 👋

I am trying to build a production-ready Yatai platform on Azure.
The database is hosted on an external Azure Postgres Single Server, but I am unable to configure the connection to the DB because of the SSL mode.
In our production environment, security policies only allow connections with SSL (sslmode=require), but the Postgres connection string can only use a non-SSL connection, as it seems to be hard-coded here:

uri := fmt.Sprintf("postgres://%s:%s@%s:%d/%s?sslmode=disable",

Is it possible to have an option to select the SSL mode for the PG connection?
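
Until that is configurable, one hedged way to confirm the database side accepts SSL connections (assumes psql is installed; placeholders left as placeholders):

psql "postgres://<user>:<password>@<host>:5432/yatai?sslmode=require" -c 'select 1'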

feat: Support defining the service account for a bentoDeployment

Currently, when a Bento is deployed via Yatai, it uses the default service account of the namespace. This prevents granular permissions for those Bentos to access S3/GCS buckets or other data stores, and can cause other network policy issues.

Providing a string field in the BentoDeployment spec for serviceAccountName or serviceAccount would be really useful.
