opendatahub-io / opendatahub-community
License: Apache License 2.0
Many of the references to Slack channels in the repo use the Slack invite link with a plain-text mention of the channel name. We should update all of these links to point directly to the Slack channel in the format
https://odh-io.slack.com/app_redirect?channel=<SLACK CHANNEL NAME>
This will allow any member of the Open Data Hub Slack workspace to go directly to the linked channel.
We should also consolidate any references to the odh-io Slack invite URL into a single location so that it can be updated quickly when the link expires. If we have a section in the README that contains the Slack invite, then all other references to the invite should point to that README section.
Hi,
Looking at https://opendatahub.io/docs/quick-installation/
I see step 4:
Verify the installation by viewing the Open Data Hub tab within the operator details. You should see opendatahub listed.
It is outdated.
The version that I am using is:
Open Data Hub Operator
1.6.0 provided by Open Data Hub
Item 7) in https://opendatahub.io/docs/quick-installation/ is super noisy.
It is important to switch to the namespace you created earlier. Installing in the "OpenShift Operators" namespace will result in a fatal error similar to "csv created in namespace with multiple operatorgroups, can't pick one automatically".
It can be merged into item 8) with a small change
Before:
To view the status of the Open Data Hub operator installation, find the Open Data Hub Operator under Operators -> Installed Operators (inside the namespace you created earlier). Once the STATUS field displays InstallSucceeded, you can proceed to create a new Open Data Hub deployment.
After:
To view the status of the Open Data Hub operator installation, find the Open Data Hub Operator under Operators -> Installed Operators. Be sure to select the `odh` namespace. Once the STATUS field displays InstallSucceeded, you can proceed to create a new Open Data Hub deployment.
If you create an instance in a namespace other than `odh`, a fatal error similar to "csv created in namespace with multiple operatorgroups, can't pick one automatically" will be raised.
Summary:
The decision to move to UWM as the monitoring stack for ML Serving is recorded here
As a part of this process we need to remove the model monitoring stack and make the necessary changes to have ML Serving metrics available via UWM.
Acceptance Criteria:
Hello Team,
I want to join the ODH Slack workspace. It says "Contact the workspace administrator at Open Data Hub for an invitation". Can anybody invite my email id "[email protected]"?
Thanks
This is a tracker for all the various bits we will need to track to complete the feature work to add Habana Accelerator support.
Karl testing edit, will link or paste requirements soon
This is a top level "tracker of trackers" for supporting versions for pipelines. We want to support versions in the same way that they are supported in Kubeflow Pipelines.
Users should be able to:
Red Hat Internal Additional requirements information can be found here
Currently the model serving capability in ODH is based on ModelMesh, which is designed for high-scale, high-density, and frequently-changing model use cases.
There are other scenarios where users might prefer single-model deployment and benefit from other features like scale-to-zero, revision management, tracing, etc.
The KServe model serving runtime is able to address such scenarios.
Single-model deployment is also the better candidate for serving LLM runtimes.
This component uses Knative and Istio as dependencies.
This is a tracker for bringing KServe into ODH as a Tier 1 component.
UX: (add link)
Lead: TBD
UI: (add link)
Lead: TBD
Dev: (add link)
Lead: TBD
QE: (add link)
Lead: TBD
Doc: (add link)
Lead: TBD
This is a tracker for all the various bits we will need to track to complete the feature work for incorporating Service Mesh into ODH.
Clicking link in "Slack on channel #sig-platform" from https://github.com/opendatahub-io/opendatahub-community/tree/master/sig-platform
shows
"This link is no longer active
To join this workspace, you’ll need to ask the person who originally invited you for a new link."
This is a tracker for all the various bits we will need to track to complete the feature work to complete Custom Notebook (BYON) Improvements - Part 1
Refresh the SIG and working group member lists and verify they are still current.
This is a tracker for all the various bits we will need to track to complete the feature work to add Operator Rework
Add requirements
Item 3)
Click Create Instance to create a new deployment. The default kfdef provided will deploy the latest release of the [ODH Core components](https://opendatahub.io/docs/tiered-components). If you accepted the default name, this will trigger the creation of an Open Data Hub kfdef object named opendatahub and start the rollout of the [ODH Core components](https://opendatahub.io/docs/tiered-components).
There are 5 "Create Instance" buttons; the one the user must select is Kf Def.
In the contributing.md for community governance, clarifying and defining the workflow for how to contribute a feature would be helpful to new users. For example a workflow such as 1) I have an idea/feature I want to add. 2) When do I contact the SIG to discuss. 3) Is presenting to the community for approval needed in addition to SIG approval.
The PR itself is where the discussions should be tracked. If it's an iterative or small update, contacting the SIG might not be necessary. Just working through the issue with the reviewers. We can build on this to make it more transparent as to how to get the right attention for any new features.
I am not sure if this is by design, but currently, out of the 6 default images in RHODS 1.28.1, only "minimal python" and "CUDA" do not have the Elyra configuration.
I can understand the rationale for the Minimal Image, but I would expect it to be there in the CUDA image.
Would it be possible to add the Elyra config to the CUDA image as well?
Model Serving
If my model is served externally, via a route, the URL provided for it is:
https://something.apps.cluster.code.p1.openshiftapps.com/v2/models/fraud-model/infer
If I turn off the external route, the new URL (rest) is given to me as :
http://modelmesh-serving.myproject:8008
I would expect the internal URL to end in the same way the external one does:
http://modelmesh-serving.myproject:8008/v2/models/fraud-model/infer
RHODS 1.28.1
I recently was testing with a curl command, and just swapped the Internal URL instead of the external one, and it did not work until I manually added /v2/models/fraud-model/infer
to the end of the internal URL.
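For illustration, the fix applied by hand above can be captured in a small helper that appends the KServe v2 REST inference path to whichever base URL is given (the host and model names here are hypothetical, matching the examples in this report):

```python
def inference_url(base_url: str, model_name: str) -> str:
    """Append the KServe v2 REST inference path to a serving endpoint."""
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"

# The external route already includes this path; the internal URL does not,
# so callers currently have to build it themselves:
internal = inference_url("http://modelmesh-serving.myproject:8008", "fraud-model")
# internal == "http://modelmesh-serving.myproject:8008/v2/models/fraud-model/infer"
```

The ask in this issue is essentially that the dashboard perform this concatenation itself, so the internal and external URLs are shaped the same way.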
Red Hat is dropping BlueJeans service for video conferencing and we need to update the relevant links in opendatahub-community and opendatahub.io.
This needs to be announced to the community in the March meeting and mailing list
Is your feature request related to a problem? Please describe.
Since OpenShift 4.6, I believe, there has been a new feature in both OKD and OCP called "Monitoring for User Defined Projects".
Enabling it on a cluster leads to all non-kube* and non-openshift* namespaces being monitored by a separate Prometheus in the openshift-user-workload-monitoring namespace. At the same time, application metrics time series from ServiceMonitors and PodMonitors, as well as kube-state-metrics container, pod, and PVC metrics, are available per namespace, nicely separated by namespace, with their own RBAC.
The only metrics that cannot be retrieved this way are node-exporter node-level metrics.
The thing is, Red Hat does not recommend mixing own implementations of prometheus operators (we did that on OCP 3.11 in the past and pre-OCP-4.6) with Monitoring for User Defined Projects.
"In OpenShift Container Platform 4.10 you must remove any custom Prometheus instances before enabling monitoring for user-defined projects".
Describe the solution you'd like
Q: Could you make the federation ServiceMonitor, the Prometheus instance, and the Prometheus Operator optional via an overlay in model-mesh? That way, all the metrics gathering would still be in there, while users who have Monitoring for User Defined Projects enabled could skip the Prometheus and cluster-metrics-federation parts.
With OpenShift Monitoring for User Defined Projects, the bringing-in / federation of cluster-level metrics from the kube-state-metrics exporter (pod container restarts, OOM, all that stuff) happens automatically at the namespace level. The only things not accessible are node-level (node exporter) metrics. Meaning I get such things as kube_pod_container_restarts without an explicit federation ServiceMonitor.
ClusterRoles that are available in namespace-level rolebindings are described here
An Observe section is also available in the Web Console GUI for all users who have at least the view ClusterRole on a project, as well as one of the ClusterRoles mentioned in the link above.
See screenshots of per-namespace query-window and alerts window here https://access.redhat.com/documentation/en-us/openshift_container_platform/4.9/html/building_applications/odc-monitoring-project-and-application-metrics-using-developer-perspective
The monitoring of the metrics from odh-model-controller ServiceMonitor works with Monitoring for User Defined Projects, too, by the way.
That is, the section with the custom monitoring implementation for model mesh could be removed from odh-core, as it is achieved with Monitoring for User Defined Projects.
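For reference, enabling Monitoring for User Defined Projects is a one-line cluster configuration change. This is the stock OpenShift mechanism the request refers to, not ODH-specific configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```

Once this is applied, a Prometheus instance in openshift-user-workload-monitoring scrapes ServiceMonitors and PodMonitors in user namespaces automatically, which is why the custom model-mesh Prometheus could become optional.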
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
Links to tracker issues for components planning updates in this
This is a tracker for all the various bits we will need to track to complete the feature work to incubate VS Code and RStudio.
Requirements
Individual Efforts
Dev: opendatahub-io/notebooks#73
Lead: @atheo89
QE: (add link)
Lead: @mwaykole
Doc: (add link)
Lead: TBD
This is a tracker for all the various bits we will need to track to complete the feature work to add Prompt tuning for FM models.
Add requirements
Describe the bug
Currently, even in release branches, imagestream tags have referencePolicy: Source set, meaning images are always pulled from the external registry source, even when an internal openshift registry is present.
Together with imagePullPolicy: Always, this decreases stability and introduces a potential point of failure.
Steps To Reproduce
Disallow traffic to quay.io in the proxy server referenced in the cluster config.
Scenario "external image" as described at https://itnext.io/variations-on-imagestreams-in-openshift-4-f8ee5e8be633
Q: "Does it work if the external registry is unreachable?"
E.g. during a temporary network outage or when the external registry is down?
No, because the Pod uses an external url.
Expected behavior
When an internal OpenShift registry is present, images should be cached in it, which is what happens with tag referencePolicy: Local. When it is not present, images should be pulled from the external registry.
Workaround (if any)
Manually modified imagestreams in odh-manifests before applying to cluster.
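As a sketch, the manual workaround amounts to switching each tag's reference policy; an imagestream tag edited this way might look like the following (the imagestream and image names are hypothetical):

```yaml
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: example-notebook
spec:
  tags:
  - name: "2023.1"
    from:
      kind: DockerImage
      name: quay.io/example-org/example-notebook:2023.1
    referencePolicy:
      type: Local   # pull through and serve via the internal registry
```

With type: Local, consumers reference the internal registry, so a temporary outage of the external registry no longer blocks pod starts.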
Open Data Hub Version
1.7.0
OpenShift Version
4.10.x on-prem OCP
Additional context
The correct way of dealing with the internal OpenShift registry being present or not is not to set referencePolicy: Source. The one exception is tag-notation references in tag.from.name in DEV environments. There, referencePolicy: Source can make sense, possibly also in combination with regularly updated digests instead of tags in from.name, as you currently do in the master branch.
Instead, use an image change trigger and let OpenShift resolve the correct location of images by means of that mechanism.
That requires PR-133 kubeflow notebook controller and PR-800 odh dashboard to be merged in.
Behavior of that mechanism:
Links to tracker issues for components planning updates in this
To standardize the minimum requirements for all git repos in the opendatahub-io org, we need to create repo templates for any new repos that will be initialized in the github.com/opendatahub-io organization
Template List:
I would assume the base requirements would be to have a
GitHub Project Template Documentation
To standardize the creation of Working Groups, we need a well-defined process & checklist for how SIG(s) propose a Working Group and for what is required when it is created.
An external volunteer has offered to assist in supporting ODH deployment on IBM Power Architecture. This tracker issue will list out all of the tasks related to this work. The goal is to provide the default ODH capabilities when running on IBM Power Architecture.
Additional images will be added as the work proceeds
This is a tracker for all the various bits we will need to track to complete the feature work for Edge PoC
The PoC will cover using ODH to build and push self-contained model servers to edge clusters that are managed by Open Cluster Manager (Red Hat Advanced Cluster Management). We will use ACM Observability to monitor the performance of the edge clusters and any running models.
The RHODS listing in Operator Hub is outdated and inaccurate for the latest version. @erwangranger and I have written a more relevant and engaging listing for the current RHODS, which will be linked. Suggestions and comments for the listing rewrite are appreciated!
https://docs.google.com/document/d/1pgXExt7LQBPhsANBaMh2sXe0Ay5FyDvAncszCeq7ZMQ/edit?usp=sharing
This is a tracker for all the various bits we will need to track to complete the feature work to integrate and support KServe/Caikit/TGIS for FM Serving
Add requirements
Is your feature request related to a problem? Please describe.
Our current CI test workflow verifies that ODH deploys successfully on OCP but not OKD. Since ODH is the community edition of RHODS, we need to start verifying that ODH deploys without issue on OKD to ensure that ODH works as a true upstream community project
Describe the solution you'd like
Add a CI test run for OKD to verify that everything works with the default OKD clusters
Describe alternatives you've considered
N/A
Additional context
N/A
Verify and document minimum requirements for deploying ODH core components on CRC.
Document any configurations required by the individual components to optimize resources.
This is a tracker for all the various bits we will need to track to complete the feature work to add Data Science Pipelines experiments.
Originally the odh-manifests repo was the source of truth for every component deployed in ODH, since we defaulted to the Kubeflow model of using a single forked GitOps mono-repo to customize and deploy all of the ODH components. Now that each ODH component has its own repo under opendatahub-io, the operator has been redesigned, and we are using the opendatahub-community/issues board as the initial issue-tracking board, tracking issues in odh-manifests has become obsolete and redundant.
We need to disable issues in odh-manifests and redirect all Open Data Hub issues to opendatahub-community/issues as the unified board. Any feedback (bugs, features, ...) on ODH deployment or installation should be tracked in opendatahub-operator/issues. Other issues related to component functionality will be redirected to opendatahub-community/issues or the affected component.
GitHub supports linking issues to pull requests across repositories in an org, so there is no loss of built-in GitHub issue automation when specifying that a PR in repository ABC Closes opendatahub-io/repository-XYZ#789.
This is a tracker for all the various bits we will need to track to complete the feature work to add Distributed workloads - support CodeFlare stack.
Add requirements
As a part of the kserve integration into ODH, we need to look into how monitoring and metrics work in kserve and consider how they will fit into the monitoring story for ODH
Support sending Slack notifications and alerts when a workflow finishes and/or for special events.
Suggestion: when a run of a pipeline finishes, automatically send an alert to a specific Slack channel. The alert can contain the name of the run/pipeline, the status, and/or other information.
Another use is sending a notification that a run is stuck, for example when the size of the logs is not growing, or based on other criteria.
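A minimal sketch of what such a notifier could do, posting to a Slack incoming webhook. The webhook URL and message fields here are assumptions for illustration, not an existing pipelines API:

```python
import json
import urllib.request

def build_run_alert(pipeline: str, run_id: str, status: str) -> dict:
    """Build a Slack message body for a finished (or stuck) pipeline run."""
    return {"text": f"Pipeline *{pipeline}* run `{run_id}` finished with status: {status}"}

def post_alert(webhook_url: str, payload: dict) -> int:
    """POST the payload to a Slack incoming webhook; returns the HTTP status code."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A "stuck run" alert would reuse build_run_alert with a different status string once whatever staleness criterion (e.g. log size not growing) is detected.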
This is a tracker for all the various bits we will need to track to complete the feature work to add Explainability Phase 1
Add requirements
DS Project UI has "Add server", "Add data connection", "Add cluster storage" but we still use "Create workbench".
I am proposing to rename it to "Add workbench" for consistency.
TBD
This is a tracker for all the various bits we will need to track to complete the feature work to complete Model Serving Metrics - Round 2
Add requirements
This is a tracker for all the various bits we will need to track to complete the feature work to add ability to configure storage classes
Add requirements
The Iter8 open source project (https://iter8.tools) can benefit ODH users by enabling various release engineering use-cases. The use-cases encompass automated traffic engineering for new versions of ML models (please see our MLOps SIG community meeting presentation on May 31st, 2023), and also metrics-based validation of ML models in production.
This is a tracker for bringing Iter8 into ODH.
Add requirements
UX: (add link)
Lead: TBD
UI: (add link)
Lead: TBD
Dev: (add link)
Lead: TBD
QE: (add link)
Lead: TBD
Doc: (add link)
Lead: TBD
Links to tracker issues for components planning updates in this
Hello,
Is it possible to install ODH in OKD? There seems to be an issue with:
Warning Failed 70m (x2 over 70m) kubelet Failed to pull image "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4bef31eb993feb6f1096b51b4876c65a6fb1f4401fee97fa4f4542b6b7c9bc46": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication
and:
Warning Failed 70m (x4 over 71m) kubelet Failed to pull image "registry.redhat.io/openshift4/ose-cli@sha256:25fef269ac6e7491cb8340119a9b473acbeb53bc6970ad029fdaae59c3d0ca61": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication
Is there a workaround to make OpenDataHub work in an OKD cluster?
Links to tracker issues for components planning updates in this
Due Date: July 19th
Airflow was made an optional Tier 2 part of ODH in the summer of 2022.
https://github.com/opendatahub-io-contrib/airflow-on-openshift
Recently, Elyra became a part of ODH via an overlay. Even more recently, Elyra itself was taken over by Red Hat (from IBM).
opendatahub-io/notebooks#58 (comment)
Since ODH has a top-tier focus on Kubeflow Pipelines, ODH wants to focus on Kubeflow Pipelines only in Elyra.
Elyra has for a long time had Airflow support in all sorts of ways
Airflow-specific operators
https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a
Generic pipelines, though Airflow 2.x support is still lacking; it will come, with some tweaks needed, e.g. for generic-pipeline-to-DAG rendering, since libraries have changed :-)
So it would be bad if the pipeline editor and runtime support for Airflow were removed. At least allow optionally enabling it via a ConfigMap or an ENV variable, based on this
Background:
We plan to use both: data science pipelines / Kubeflow Pipelines for pure ML development and Airflow for more of an ETL / data engineering set of tasks.