
KServe documentation

Welcome to the source file repository for our documentation on https://kserve.github.io/website/

Website

The KServe documentation website is built using Material for MkDocs.

View published documentation

View all KServe documentation and walk-through our code samples on the website.

The KServe website includes versioned docs for recent releases, the KServe blog, links to all community resources, as well as KServe governance and contributor guidelines.

Run website locally

For instructions, see KServe's MkDocs contributor guide.

Website source files

Source files for the documentation on the website are located within the /docs directory of this repo.

Documentation versions for KServe releases

Each release of the KServe docs (starting with 0.3) is available on the website, and its source files are stored in branches of this repo. Take a look at the release process for more information.

Contributing to docs

We're excited that you're interested in contributing to the KServe documentation! Check out the resources below to get started.

Getting started

If you want to contribute a fix or add new content to the documentation, you can navigate through the /docs directory of this repo or use the Edit this page pencil icon on each page of the website.

Before you can contribute, start by reading the KServe contributor guidelines and learning about our community and requirements. In addition to reading about how to contribute to the docs, take a moment to learn about the KServe code of conduct, governance, values, and the KServe working groups and committees.

KServe community and contributor guidelines

Source files for all KServe community and governance topics are located separately in the kserve/community repo.


Help and support

Your help and feedback are always welcome!

If you find an issue, let us know by clicking Create Issue on any of the website pages, or by opening an issue directly here in the repo.

If you have a question that you can't find an answer to, we would like to hear about that too. In addition to our docs, you can also reach out to the community for assistance.


website's Issues

Pod termination takes a very long time

/kind feature

Describe the solution you'd like
I have the following InferenceService:

# Imports for the snippet below (kserve Python SDK and Kubernetes client)
from kubernetes import client
from kubernetes.client import V1ResourceRequirements
from kserve import (constants, V1beta1InferenceService,
                    V1beta1InferenceServiceSpec, V1beta1PredictorSpec,
                    V1beta1TorchServeSpec)

port_inference = client.V1ContainerPort(7070, protocol='TCP', name='h2c')
port_management = client.V1ContainerPort(7071, protocol='TCP', name='h2c')
pytorch_predictor = V1beta1PredictorSpec(
    pytorch=V1beta1TorchServeSpec(
        runtime_version='0.5.3-gpu',
        storage_uri='pvc://torchserve-claim/models',
        resources=V1ResourceRequirements(
            requests={'cpu': '4000m', 'memory': '8Gi', 'nvidia.com/gpu': '1'},
            limits={'cpu': '4000m', 'memory': '16Gi', 'nvidia.com/gpu': '1'}
        ),
        ports=[port_inference]
    )
)
# service_name and namespace are defined elsewhere in the reporter's code
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                               kind=constants.KSERVE_KIND,
                               metadata=client.V1ObjectMeta(name=service_name, namespace=namespace),
                               spec=V1beta1InferenceServiceSpec(predictor=pytorch_predictor))

The service deploys quickly and the pod is ready to go in a few seconds.

kubectl get pods -n kubeflow-user-example-com
POD_BERT=$(kubectl get pods -n kubeflow-user-example-com | grep -Eo "(model-[_A-Za-z0-9-]+)")

When I delete the InferenceService, the pod takes a very long time to terminate - about five minutes!
What is the reason, and how can termination be sped up?

onnx example not working with triton

/kind bug

What steps did you take and what happened:
I closely followed https://github.com/kserve/kserve/blob/release-0.8/docs/samples/v1beta1/onnx/README.md
to test KServe with the provided ONNX model (storageUri: "gs://kfserving-examples/onnx/style"), as I want to use KServe with .onnx models.

The problem here is that the provided style model is a single .onnx file, which I suppose is what the ONNX runtime needed.
Now that the ONNX runtime has been replaced by Triton, this example no longer works, as Triton will not load from a single ONNX file (/mnt/models/model.onnx).

Log from the Triton server in kserve-container:


+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 14:04:33.356075 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 14:04:33.356082 1 server.cc:589]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0629 14:04:33.356195 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Triton requires the ONNX file to be arranged something like this:

my_model/
  -- my_model.pbtxt
  -- 1/
       -- my_model.onnx

After creating my own Triton-compatible model repository, Triton loaded it correctly:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "my_model"
spec:
  predictor:
    minReplicas: 0
    onnx:
      storageUri: "http://192.168.4.252:8000/my_model.tar.gz"
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 13:48:31.691292 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 13:48:31.691309 1 server.cc:589]
+------------+---------+--------+
| Model      | Version | Status |
+------------+---------+--------+
| my_model | 1       | READY  |
+------------+---------+--------+

I0629 13:48:31.691780 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

So I assume that the example needs to be updated.

Environment:

  • KFServing Version: 0.8
  • Kubernetes version: 1.22 (microk8s)

Docs about using nodeSelector

/kind feature

I am wondering if it is possible to set the nodeSelector option for TensorFlow serving (or any serving type other than custom).
I found some information about this in some issues, but they were not very specific.
If it is possible, it would be great to include an example of it in the docs.
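For illustration, a hedged sketch of what such an example might look like, assuming the predictor spec's pod-level nodeSelector field is used (the node label and storage URI below are placeholders):

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flowers-sample"
spec:
  predictor:
    nodeSelector:        # pod-level field on the predictor spec
      disktype: ssd      # placeholder node label; target nodes must carry it
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"   # placeholder model URI

Since nodeSelector sits at the pod level of the predictor spec rather than inside the framework-specific block, the same approach should apply to other serving types as well.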

Document kserve dependency version support

Describe the change you'd like to see
Document the dependency version matrix for Kubernetes/Knative/Istio. We won't be able to test every version combination, but we would like to give recommendations for the version sets that have been tested and certified.


Document user migration steps

/kind feature

Describe the solution you'd like
Document user migration steps from KFServing to KServe; this migration guide will be referenced on the blog.


Broken links in the kserve documentation

/kind bug

What steps did you take and what happened:
If you click the link for "production installation... Administrator's Guide", it sends you to a broken page.

What did you expect to happen:
A link to a page for the production installation.

Anything else you would like to add:
There are several broken or outdated links in the documentation.


Screen alignment issue in 0.8 and master versions of the doc

An extra empty sidebar appears on the right side of the home page. It reduces the available space for main content and messes up the alignment.

Expected Behavior

(screenshot: home page with correct alignment)

Actual Behavior

(screenshot: extra empty sidebar on the right)

Steps to Reproduce the Problem

Load the 0.8 or the master version of the document and navigate to the home page.

Consolidate CIFAR-10 Outlier Sample

Should keep one copy of the CIFAR-10 Outlier detection sample:

    Hi @rafvasq, thanks for the update! We also have the same website example [here](https://github.com/kserve/website/blob/main/docs/modelserving/detect/alibi_detect/cifar10_outlier.ipynb), should we keep one place instead of two copies?

Originally posted by @yuzisun in kserve/kserve#2472 (comment)

Run mkdocs gh-deploy before mike deploy on the Github workflow

Expected Behavior

In #70, 404.html was updated, so it was expected that we could see the error page properly.

Actual Behavior

The error page is still broken.

Steps to Reproduce the Problem

Visit a wrong URL such as https://kserve.github.io/website/0.7/admin/.

Additional Info

Proposal

Run the mkdocs gh-deploy command before we run mike deploy in the GitHub workflow, so that the contents at the site root stay in sync with the latest contents.
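A minimal sketch of the proposed ordering, assuming the docs are published from a shell step in the GitHub workflow (the flags and version arguments are illustrative):

mkdocs gh-deploy --force        # refresh the root contents (including 404.html) first
mike deploy 0.7 latest --push   # then publish the versioned docs as before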

Thanks!

404-page is broken

Hello, KServe team!
I found a bug with the 404 page!

Expected Behavior

When we try to access a wrong URL path, the 404 page should show properly.

Actual Behavior

The 404 page is broken; it does not load its assets (JS, CSS, images).

(screenshot of the broken 404 page)

Steps to Reproduce the Problem

Thanks!

Documentation Improvements for v0.10.0

What is changing? (Please include as many details as possible.)

We are working on the KServe 2023 Roadmap kserve/kserve#2526 and the v0.10.0 release, as well as preparing for our eventual v1.0 😄. For all of this, and to keep up with our changes, we need to improve our documentation and website!

Here are the current objectives for improving the KServe website documentation:

  • Unify the data plane v1 and v2 page formats (one has a table, the other has long-form text)
  • Improve the v2 page to be more succinct and clearly show what can be used, by splitting it into two pages: (1) one that tells the story of why and what changed, and (2) one that explains how to use v2 and what it provides
  • Document FastAPI updates for the model server
  • Document serving runtime spec changes
  • Clean up the examples in the kserve repo and unify them with the website's by creating one source of truth for example documentation
  • Update any out-of-date documentation and make sure the website as a whole is consistent and cohesive
  • Add monitoring setup documentation and documentation for the Knative serverless install #24
  • Implement spellcheck in the repo #82
  • Move the queue proxy extension documentation to the website

In what release will this happen (to the best of your knowledge)?

v0.10

Remove obsolete website document

Expected Behavior

The old links should no longer work and should ideally redirect to the new website URL.
E.g., the link https://kserve.github.io/website/get_started/ should redirect to https://kserve.github.io/website/latest/get_started/

Actual Behavior

It points to an old doc that applies only to the 0.7 version of KServe.


Add ServingRuntime docs

What is changing? (Please include as many details as possible.)

Introduce the ServingRuntime custom resource and a new model spec; this eliminates the need to change KServe controller code every time a new serving runtime is added.

How will this impact our users?

Users can still specify model frameworks the existing way; the new model spec is introduced to support existing serving runtimes as well as user-defined serving runtimes.

In what release will this happen (to the best of your knowledge)?

v0.8

Context

Link to associated PRs or issues from other repos here.

  1. Introduce serving runtime and new model spec
  2. Default serving runtime installations
  3. Auto serving runtime selection


Versioning the website docs


Update the doc for Kubernetes 1.25

Describe the change you'd like to see
Hi,

I have been using Kubernetes v1.25 (which is not in the Recommended Version Matrix) and tried to install the KServe "Quickstart" environment (for v0.9). It didn't work, since it requires an Istio version > 1.14.0. I am not sure whether it works with Istio v1.15.0, but I just set ISTIO_VERSION=1.16.0 in quick_install.sh, and everything worked with no issue. FYI.


LightGBM example fails

/kind bug

What steps did you take and what happened:

When I followed https://kserve.github.io/website/0.8/modelserving/v1beta1/lightgbm/, the request fails with {"error": "Input data must be 2 dimensional and non empty."}. I think I narrowed it down to lightgbm.Dataset failing to autodetect the feature names; the model appears to just have generic "Column_0", "Column_1", etc. feature names.

I got the example to work by changing this part:

iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = lgb.Dataset(X, label=y)

...to this:

iris = load_iris()
y = iris['target']
X = iris['data']
feature_name = iris['feature_names']
dtrain = lgb.Dataset(X, label=y, feature_name=feature_name)

Here's a reproduction:

import lightgbm as lgb
import pandas as pd
from sklearn.datasets import load_iris


def main():
    explicit_model = train(make_explicit_dataset())
    auto_model = train(make_auto_dataset())

    print("Trying explicit model")
    print(predict(explicit_model))
    print("\n")
    print("Trying auto model")
    print(predict(auto_model))


def train(dataset):
    params = {
        'objective':'multiclass',
        'metric':'softmax',
        'num_class': 3
    }
    return lgb.train(params=params, train_set=dataset)


def make_auto_dataset():
    iris = load_iris()
    y = iris['target']
    X = iris['data']
    return lgb.Dataset(X, label=y)


def make_explicit_dataset():
    iris = load_iris()
    y = iris['target']
    X = iris['data']
    feature_name = iris['feature_names']
    return lgb.Dataset(X, label=y, feature_name=feature_name)


def predict(model):
    request = {'sepal_width_(cm)': {0: 3.5}, 'petal_length_(cm)': {0: 1.4}, 'petal_width_(cm)': {0: 0.2},'sepal_length_(cm)': {0: 5.1} }

    # Simulate kserve's lgbserver behavior:
    df = pd.DataFrame(request, columns=model.feature_name())
    inputs = pd.concat([df], axis=0)
    return model.predict(inputs)


if __name__ == "__main__":
    main()

The output I got, with Python 3.7, scikit-learn == 1.0.1, lightgbm == 3.3.2:

Trying explicit model
[[9.99985204e-01 1.38238969e-05 9.72063744e-07]]


Trying auto model
Traceback (most recent call last):
  File "train.py", line 52, in <module>
    main()
  File "train.py", line 14, in main
    print(predict(auto_model))
  File "train.py", line 47, in predict
    return model.predict(inputs)
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 3540, in predict
    data_has_header, is_reshape)
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 820, in predict
    data = _data_from_pandas(data, None, None, self.pandas_categorical)[0]
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 566, in _data_from_pandas
    raise ValueError('Input data must be 2 dimensional and non empty.')
ValueError: Input data must be 2 dimensional and non empty.

Add a provision to show a version for master branch

Add master as a dropdown option for versions in website

While everyone adds new features and fixes bugs in the master branch of KServe, they would ideally like to update the documentation for it as well. However, since 0.7 is the default deployment on the website, every doc update currently lands in 0.7. But the 0.7 version of KServe is already released, and some of these documentation changes may not be applicable to it, only to a newer version or to master.

Add a master version to the dropdown and make that the target for new doc changes.
Retain the latest release version (0.7) as the default option in the dropdown.
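A minimal sketch of how this might look with mike, which the site already uses for versioning (the exact flags and aliases are assumptions):

mike deploy master --push    # publish the master docs as a selectable version
mike set-default 0.7 --push  # keep 0.7 as the version the site opens on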

Add a sample which explains how to deploy a model on an EKS cluster

/kind bug

What steps did you take and what happened:
I have installed Kubeflow 1.3. I could not figure out which version of KFServing it installs by default.
The following namespaces were created, which I believe are for KFServing:

knative-eventing            Active   151m
knative-serving             Active   152m

What did you expect to happen:
I tried to run the XGBoost sample. Everything worked fine, except that my SERVICE_HOSTNAME=$(kubectl get inferenceservice xgboost-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3) became xgboost-iris.default.example.com, which I believe is expected, but I did not create any domain for this service.

Can we add a sample that runs on a vanilla EKS cluster without worrying about the domain, or am I missing something?
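For context, a hedged sketch of how the sample is usually called without a real domain: send the request through the ingress gateway and pass the generated hostname in the Host header (variable names follow the getting started guide; the input file is a placeholder):

curl -v -H "Host: ${SERVICE_HOSTNAME}" \
  "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/xgboost-iris:predict" \
  -d @./iris-input.json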

Environment:
EKS cluster
K8s version: 1.9

  • Kubeflow version: 1.3

Update the REST/gRPC V2 API with load/unload endpoints

/kind feature

Describe the solution you'd like
There are new load/unload endpoints supported in MLServer and Triton. It would be good to have these formally documented in the API docs in the repo.
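For reference, a hedged sketch of the calls being referred to, following the v2 model repository extension as implemented by Triton and MLServer (host and model name are placeholders):

curl -X POST http://${SERVICE_HOST}/v2/repository/models/my-model/load     # load (or reload) the model
curl -X POST http://${SERVICE_HOST}/v2/repository/models/my-model/unload   # unload it again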

"Build the custom image with Buildpacks" section needs more info.

Describe the change you'd like to see
The "Build the custom image with Buildpacks" section might need an update. Currently, pack build --builder=heroku/buildpacks:20 ${DOCKER_USER}/custom-model:v1 won't work without specifying the correct Python version in runtime.txt: heroku/buildpacks:20 comes with python-3.10, but ray[serve]==1.10.0 and kserve==0.9.0 require python-3.9.

Additional context
An additional runtime.txt file containing python-3.9.13 is required to successfully build a Docker image for custom model serving.
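For reference, a sketch of the file described above, placed at the root of the build context before running pack build (the version string is taken from this report):

runtime.txt:
python-3.9.13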

TorchServe doc updates for 0.8

What is changing? (Please include as many details as possible.)

How will this impact our users?

TorchServe is now updated to 0.5.2

  • KServe migration: deprecate service_envelope in config.properties and use enable_envvars_config: true to enable the service envelope at runtime (see the sketch after this list).
  • KServe v2 REST protocol support
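A sketch of the corresponding config.properties entry, assuming the standard TorchServe properties-file syntax:

enable_envvars_config=true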

In what release will this happen (to the best of your knowledge)?

v0.8

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#1944
  2. kserve/kserve#1870


Update logger/batcher/autoscaler/canary examples

/kind feature

Describe the solution you'd like
Update the examples for API groups and KFServing name references in:
https://github.com/kubeflow/kfserving/tree/master/docs/samples/batcher
https://github.com/kubeflow/kfserving/tree/master/docs/samples/autoscaling
https://github.com/kubeflow/kfserving/tree/master/docs/samples/kafka
https://github.com/kubeflow/kfserving/tree/master/docs/samples/logger


KServe "Quickstart" environment script file is wrong

Expected Behavior

By executing the quick-install script referenced on the Getting Started page, users should be able to install KServe in their local environment.

The script referenced on the quick start page should be https://raw.githubusercontent.com/kserve/kserve/master/hack/quick_install.sh
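A sketch of the expected invocation with that URL, assuming the usual curl-pipe-to-bash pattern used for the quick install:

curl -s "https://raw.githubusercontent.com/kserve/kserve/master/hack/quick_install.sh" | bash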

Actual Behavior

The current link is https://raw.githubusercontent.com/kserve/kserve/release-0.7/hack/quick_install.sh, which refers to a local file for installing KServe (https://github.com/kserve/kserve/blob/9112a5d81395cf4b32cd9c9b35f10e030c23b77e/hack/quick_install.sh#L106).

It will only raise an error.

Steps to Reproduce the Problem

  1. Visit the links

Install information:

  • KServe Version: 0.7.0

Add ModelMesh documentation to website

Some information about ModelMesh should be added to the KServe website. We can probably make a modelmesh folder here for now with a few md files.

We need to determine later whether we can use the docs located at https://github.com/kserve/modelmesh-serving/tree/main/docs as the main source of the ModelMesh docs without having to duplicate a lot of them in both places, or whether MkDocs even supports that. Otherwise, we can just keep a subset of the docs on the website.

FYI @animeshsingh

No license statement

Expected Behavior

The website and the documentation it contains should carry a clear license statement.

Actual Behavior

There is no license statement present.

Recommendation

A common choice for this type of content is CC-BY (Creative Commons 'Attribution'), although CC-BY-SA ('Attribution' plus 'ShareAlike') can also be used if the authors want to ensure that the content remains under the same license when it is reused or derivative works are produced.

AWS IAM Role for Service Account

What is changing? (Please include as many details as possible.)

Adds explicit support for the AWS IRSA credential method for downloading models.

How will this impact our users?

More secure AWS credential management (no static or long-lived credentials needed in secrets).

In what release will this happen (to the best of your knowledge)?

v0.10.0

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#2113
  2. kserve/kserve#2373


kfctl v1.3.0 link broken for kubeflow deployment on Azure

/kind bug

What steps did you take and what happened:

  1. Kubeflow setup on Azure using the link https://www.kubeflow.org/docs/distributions/azure/deploy/install-kubeflow/
  2. The link is broken for v1.3: "Download the kfctl v1.3.0 release from the Kubeflow releases page."
     So I have installed what is available at https://github.com/kubeflow/kfctl/releases

What did you expect to happen:

I need to use KFServing with TorchServe on Azure. What should I do to make this work?

Document how to use STORAGE_URI, especially in the case of a custom framework

/kind feature

Describe the solution you'd like
Describe how to use the STORAGE_URI environment variable in the InferenceService schema with a custom framework.
It could be added under kfserving/docs/samples/custom/.
This feature is not documented. I would like to know if this feature is official and whether it is expected to remain available in the future.

Anything else you would like to add:
Important information about the feature:

  1. The container schema can have a STORAGE_URI environment variable; when used in an InferenceService, it provides the standard storage-initializer functionality.
  2. The container name has to be: kfserving-container.
  3. The storage mounts to /mnt/models.

This feature can be very beneficial if someone needs to create a custom docker image with specific dependencies. There was a case in my organization where the serialized model had such dependencies that it was impossible to extract them into a transformer.
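Putting the three points above together, a hedged sketch of what such a custom predictor might look like (the apiVersion follows the KServe group used elsewhere on this page; image and storage URI are placeholders):

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "custom-model"
spec:
  predictor:
    containers:
      - name: kfserving-container              # required container name (point 2)
        image: example.com/my-custom-model:v1  # placeholder custom image with the special dependencies
        env:
          - name: STORAGE_URI                  # triggers the standard storage initializer (point 1)
            value: "gs://my-bucket/my-model"   # placeholder; contents end up mounted at /mnt/models (point 3)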

STORAGE_URI feature basic locations:
https://github.com/kubeflow/kfserving/blob/master/pkg/apis/serving/v1alpha2/framework_custom.go#L30
https://github.com/kubeflow/kfserving/blob/master/pkg/apis/serving/v1beta1/predictor_custom.go#L57

other examples where STORAGE_URI is used:
https://github.com/kubeflow/kfserving/blob/master/docs/samples/custom/torchserve/torchserve-custom-pv.yaml
https://github.com/kubeflow/kfserving/blob/master/docs/samples/custom/torchserve/bert-sample/bert.yaml
https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1alpha2/triton/bert

I'd like to offer my help, but first I need an initial answer and maybe some instructions.

Write an AWS Cognito Guide

/kind feature

Describe the solution you'd like
There is no clear write-up on how to configure AWS Cognito for KFServing. The current
Kubeflow e2e guide does not cover KFServing configuration.

There is a good write-up for GCloud IAP.
That guide could be used as a model for the AWS Cognito one. This would go a long way toward keeping people from struggling with the setup for KFServing on AWS.

Recent issues on this:
kserve/kserve#1154
kubeflow/website#2378

Administration - Serverless installation guide includes two Istio installations

Knative should be installed before Istio.
Knative itself requires a networking layer; a better way is to tell the user to choose Istio as the networking layer for Knative, instead of treating Knative and Istio as two separate components.

Problem seen
If I install Istio before Knative, KServe simply doesn't work. It could be due to unsupported versions or the networking being configured in the wrong order.

Cert manager version should be restricted

Expected Behavior

Following https://kserve.github.io/website/admin/serverless/ should result in a successful install

Actual Behavior

Following https://kserve.github.io/website/admin/serverless/, I installed cert-manager at the latest 1.6 release (https://kserve.github.io/website/admin/serverless/#3-install-cert-manager). Then, when trying to install KServe (https://kserve.github.io/website/admin/serverless/#4-install-kserve), it failed with:

unable to recognize "https://github.com/kserve/kserve/releases/download/v0.7.0/kserve.yaml": no matches for kind "Certificate" in version "cert-manager.io/v1alpha2"
unable to recognize "https://github.com/kserve/kserve/releases/download/v0.7.0/kserve.yaml": no matches for kind "Issuer" in version "cert-manager.io/v1alpha2"

I had to reduce the cert-manager version to 1.3, and then everything worked.

Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.): k3d 1.20.1
  • KServe Version: 0.7.0

Add swagger-ui to dataplane ui

Describe the change you'd like to see

Add swagger-ui for the data plane API within the docs website.

This would allow users to preview endpoints and request/response objects in a familiar user interface.

Model framework version updates in 0.8

What is changing? (Please include as many details as possible.)

scikit-learn is updated to 1.0
xgboost is updated to 1.5

How will this impact our users?

This only affects users who do not pin runtimeVersion in the InferenceService YAML: in that case, a model trained with scikit-learn == 0.23.0 may not work with sklearnserver 0.8, which is now updated to scikit-learn == 1.0.
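For illustration, a hedged sketch of pinning the runtime version so a model trained against the older library keeps working (the version tag and storage URI are placeholders):

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-iris"
spec:
  predictor:
    sklearn:
      runtimeVersion: "v0.7.0"    # placeholder: pin to the previous sklearnserver image
      storageUri: "gs://example-bucket/models/sklearn/iris"   # placeholder model URI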

In what release will this happen (to the best of your knowledge)?

v0.8.0

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#1954


Add a provision to make changes across multiple versions of the document

Provision to add common changes, like a banner, across versions

Banners, popups, etc. should usually appear across multiple live versions of the documentation.
Currently this requires re-deploying or updating old docs, which makes it difficult because a new banner or other change has to be updated manually in each version of the docs.

Proposing a new solution to update common things like banners and popups across multiple document versions:

Deploy a common js file in the gh-pages repo.
Add the common js file to the docs.

Any future common doc changes would then only require changes in common.js.
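A minimal sketch of the second step, assuming MkDocs' extra_javascript hook is used to pull the shared file in (the path is hypothetical):

# mkdocs.yml excerpt
extra_javascript:
  - https://kserve.github.io/website/common.js   # shared banner/popup logic deployed to gh-pages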
