Open Data Hub Community

Welcome to the Open Data Hub community! This is the starting point for joining and contributing to the Open Data Hub community. To learn more about the project structure and organization, please refer to Governance information. The governance of this community is modeled after the Kubernetes project.

Communicating

The Open Data Hub community abides by the Open Data Hub Code of Conduct on all of the communication platforms that we moderate, listed below, with noted exceptions.

Here is an excerpt from the code of conduct:

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

  • Slack - Join us on Slack for discussions and questions at odh-io.slack.com (invite link).
  • Mailing List - Join the Open Data Hub mailing list [email protected] to keep up with the latest news and start discussions.
  • Community Meetings - We use Google Meet for our community group meetings and contributor programs.
  • Website - Documentation is published at opendatahub.io
  • YouTube - Videos of Open Data Hub and other related content can be seen on the AI/ML OpenShift YouTube channel.
  • Open Data Hub Blog - Read various blogs about Open Data Hub and its releases on the Open Data Hub news page.
  • Bug Reports - Any bugs with Open Data Hub can be reported on the issues page, where they will be reviewed and triaged by the relevant component owners.

Meeting Agenda and Notes

Meeting agenda can be found in the document "Open Data Hub Community Meeting Agenda". To add items to the agenda, please join our Open Data Hub Community group so you can edit the document and add your agenda items under the meeting date title.

Calendar of Meetings

To see upcoming Open Data Hub events and meetings, please add the Open Data Hub community meeting calendar to your own calendar.

ODH Community Meeting - Monthly on the third Monday @ 12pm EST

Governance

Open Data Hub has the following types of groups that are officially supported:

  • Steering Committee is the governing body of the Open Data Hub project tasked with taking on sensitive topics, providing strategic direction, decision-making and oversight.
  • Special Interest Groups (SIGs) are persistent open groups that focus on a part of the project. SIGs must have open and transparent proceedings. Anyone is welcome to participate and contribute provided they follow the Open Data Hub Code of Conduct. The purpose of a SIG is to own and develop a set of subprojects.
    • Subprojects - Each SIG can have a set of subprojects. These are smaller groups that can work independently. Some subprojects will be part of the main Open Data Hub deliverables while others will be more speculative.

See the full governance document for more details on these groups.

Contribute

As of April 2023, we have moved to main as the default branch for this repository.

A first step to contributing is to pick from the list of Open Data Hub SIGs. Start attending SIG meetings, join the Slack channel and subscribe to the mailing list. Subprojects in the SIGs will often have a set of "help wanted" issues that can help new contributors get involved.

The Contributor Guide provides help on how to get your ideas and bug fixes seen and accepted.

Membership

We encourage all contributors to become members. We aim to grow an active, healthy community of contributors, reviewers, and code owners. Learn more about requirements and responsibilities of membership in our Community Membership page.

opendatahub-community's People

Contributors

andrewballantyne, anishasthana, catrobson, danielezonca, dhirajsb, ezidav, gmfrasca, gregsheremeta, harshad16, heyselbi, jkoehler-redhat, jooho, lavlas, oshritf, rareddy, ruivieira, shgriffi, skonto, sriumcp, taneem-ibrahim, tteofili


opendatahub-community's Issues

Platform Near Edge Support - PoC

This is a tracker for all the various bits we will need to track to complete the feature work for the Edge PoC.

The PoC will cover using ODH to build and push self-contained model servers to edge clusters that are managed by Open Cluster Management (Red Hat Advanced Cluster Management). We will use ACM Observability to monitor the performance of the edge clusters and any running models.

Individual Efforts

Update all links to ODH Slack channels with direct links to the actual channels

Many of the references to Slack channels in the repo use the Slack invite link with a plain text mention of the channel. We should update all of these links to point directly to the Slack channel in the format

https://odh-io.slack.com/app_redirect?channel=<SLACK CHANNEL NAME>

This will allow any member of the Open Data Hub Slack workspace to go directly to the linked channel.

Also, we should consolidate any references to the odh-io Slack invite URL to a single location so that it can be updated quickly when the link expires. If we have a section in the README with the Slack invite, then all other references to the Slack invite should point to that README section.

User Workload Monitoring for ML Serving in ODH

The decision to move to UWM as the monitoring stack for ML Serving is recorded here.
As a part of this process we need to remove the model monitoring stack and make the necessary changes to have ML Serving metrics available via UWM.

Acceptance Criteria:

  • Finalize RBAC solution for UWM. Solution Document
  • Implement manifest/logic changes for the finalized RBAC solution
  • ODH-Dashboard changes to change the metrics backend from rhods-model-monitoring to User Workload Monitoring
  • Unit tests for UserWorkloadMonitoring
    • odh-manifests tests weirdly enough already use UWM to test metrics
    • verify trusty unit tests in odh-manifests are also using UserWorkloadMonitoring
  • QE test plan
    • Ensure QE is aware of the change and has the information needed to make a test plan
  • Documentation for UWM.
    • Document RBAC changes according to the finalized solution
    • Document that currently the max retention will be equal to the UWM retention
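For reference, per-namespace access to UWM is granted by binding one of the predefined monitoring ClusterRoles (monitoring-rules-view, monitoring-rules-edit, monitoring-edit) in the target project. A minimal sketch of such a binding, assuming placeholder user and namespace names; the finalized RBAC solution may look different:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: uwm-rules-view          # placeholder name
  namespace: opendatahub        # placeholder: the project whose metrics are viewed
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: monitoring-rules-view   # predefined UWM ClusterRole
subjects:
- kind: User
  apiGroup: rbac.authorization.k8s.io
  name: example-user            # placeholder
```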

OKD Support

Hello,
Is it possible to install ODH in OKD? There seems to be an issue with:

  Warning  Failed          70m (x2 over 70m)    kubelet            Failed to pull image "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4bef31eb993feb6f1096b51b4876c65a6fb1f4401fee97fa4f4542b6b7c9bc46": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication

and:

  Warning  Failed          70m (x4 over 71m)    kubelet            Failed to pull image "registry.redhat.io/openshift4/ose-cli@sha256:25fef269ac6e7491cb8340119a9b473acbeb53bc6970ad029fdaae59c3d0ca61": rpc error: code = Unknown desc = unable to retrieve auth token: invalid username/password: unauthorized: Please login to the Red Hat Registry using your Customer Portal credentials. Further instructions can be found here: https://access.redhat.com/RegistryAuthentication

Is there a workaround to make OpenDataHub work in an OKD cluster?
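One possible workaround, assuming you have Red Hat Customer Portal credentials: create a pull secret for registry.redhat.io in the install namespace and link it to the pulling service account. A sketch (the secret name and namespace are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: redhat-registry-pull    # placeholder name
  namespace: odh                # placeholder: the ODH install namespace
type: kubernetes.io/dockerconfigjson
data:
  # base64-encoded docker config containing registry.redhat.io credentials
  .dockerconfigjson: <BASE64_DOCKER_CONFIG>
```

followed by linking it for pulls, e.g. `oc secrets link default redhat-registry-pull --for=pull -n odh`.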

Explainability Phase 1

This is a tracker for all the various bits we will need to track to complete the feature work to add Explainability Phase 1

Requirements @keklundrh please add all the requirements here.

Add requirements

Individual Efforts

Support versions for data science pipelines

This is a top level "tracker of trackers" for supporting versions for pipelines. We want to support versions in the same way that they are supported in Kubeflow Pipelines.

Requirements

Users should be able to:

  • Add a new pipeline version
  • Access all versions for a pipeline
  • Do runs for any version including previous versions

Red Hat internal: additional requirements information can be found here.

Individual Efforts

Single Model Deployment (KServe)

Currently, model serving capability in ODH is based on ModelMesh, which is designed for high-scale, high-density and frequently-changing model use cases.
There are other scenarios where users might prefer single model deployment and benefit from other features like scale-to-zero, revision management, tracing, etc.
The KServe model serving runtime is able to address those scenarios.
Single model deployment is also the better candidate for serving LLM runtimes.

This component uses Knative and Istio as dependencies.

This is a tracker for bringing KServe into ODH as a Tier 1 component.

Requirements

  • Batching
  • Revision Management
  • Scale-to-zero
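For illustration, a minimal KServe InferenceService with scale-to-zero enabled; the name, model format, and storage URI are placeholders, and the fields follow the upstream serving.kserve.io/v1beta1 API:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-model           # placeholder
spec:
  predictor:
    minReplicas: 0              # scale-to-zero in serverless (Knative) mode
    model:
      modelFormat:
        name: sklearn           # placeholder model format
      storageUri: s3://example-bucket/model   # placeholder
```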

Individual Efforts

  • UX: (add link)
    • Lead: TBD
  • UI: (add link)
    • Lead: TBD
  • Dev: (add link)
    • Lead: TBD
  • QE: (add link)
    • Lead: TBD
  • Doc: (add link)
    • Lead: TBD

[Documentation] - Improve the documentation to install Open Data Hub - Item 7

Item 7 in https://opendatahub.io/docs/quick-installation/ is super noisy.

It is important to switch to the namespace you created earlier. Installing in the "OpenShift Operators" namespace will result in a fatal error similar to "csv created in namespace with multiple operatorgroups, can't pick one automatically".

It can be merged into item 8 with a small change.

Before:

To view the status of the Open Data Hub operator installation, find the Open Data Hub Operator under Operators -> Installed Operators (inside the namespace you created earlier). Once the STATUS field displays InstallSucceeded, you can proceed to create a new Open Data Hub deployment.

After:

To view the status of the Open Data Hub operator installation, find the Open Data Hub Operator under Operators -> Installed Operators. Be sure to select the `odh` namespace. Once the STATUS field displays InstallSucceeded, you can proceed to create a new Open Data Hub deployment.

If you create an instance that is not under `odh`, a fatal error similar to "csv created in namespace with multiple operatorgroups, can't pick one automatically" will be raised.

Summary:

  • Remove item 7
  • Change Item 8
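For context, the instance created here is a KfDef object in the namespace created earlier; in outline (the applications list is elided, the default kfdef fills it with the ODH Core components):

```yaml
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
  namespace: odh      # must be the namespace created earlier, not openshift-operators
spec:
  applications: []    # the default kfdef lists the ODH Core components here
```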

ODH 1.9 Release Tracker

Links to tracker issues for components planning updates in this release:

Test ODH core deployment on CRC

Verify and document minimum requirements for deploying ODH core components on CRC.

Document any configurations required by the individual components to optimize resources.

  • Verify CPU requirements
  • Verify Memory requirements

Add repository templates for projects, SIGs and WGs

To standardize the minimum requirements for all git repos in the opendatahub-io org, we need to create repo templates for any new repos that will be initialized in the github.com/opendatahub-io organization.

Template List:

  • New projects starting from scratch
  • Special Interest Groups
  • Working Groups

I would assume the base requirements would be to have:

  • Software License
  • README template
  • Standard templates for bug reports and feature requests
  • OWNERS file

GitHub Project Template Documentation

MLOps release engineering

The Iter8 open source project (https://iter8.tools) can benefit ODH users by enabling various release engineering use cases. These encompass automated traffic engineering for new versions of ML models (please see our MLOps SIG community meeting presentation on May 31st, 2023), and also metrics-based validation of ML models in production.

This is a tracker for bringing Iter8 into ODH.

Requirements

Add requirements

Individual Efforts

  • UX: (add link)
    • Lead: TBD
  • UI: (add link)
    • Lead: TBD
  • Dev: (add link)
    • Lead: TBD
  • QE: (add link)
    • Lead: TBD
  • Doc: (add link)
    • Lead: TBD

Elyra now part of ODH, but Airflow optional support needs to be there

Airflow was made an optional Tier 2 part of ODH in the summer of 2022.

https://github.com/opendatahub-io-contrib/airflow-on-openshift

Recently, Elyra became a part of ODH via an overlay. Even more recently, Elyra itself was taken over by Red Hat (from IBM).

opendatahub-io/notebooks#58 (comment)

Since ODH has a top-tier focus on Kubeflow Pipelines, ODH wants to focus only on Kubeflow Pipelines in Elyra.

Elyra has long had Airflow support in several forms:

Airflow-specific operators:

https://medium.com/ibm-data-ai/getting-started-with-apache-airflow-operators-in-elyra-aae882f80c4a

Generic pipelines:

https://medium.com/ibm-data-ai/automate-your-machine-learning-workflow-tasks-using-elyra-and-apache-airflow-adf297adc455

Airflow 2.x support is still lacking, but it will come; some tweaks are needed, e.g. for generic pipeline to DAG rendering, since libraries have changed. :-)

So it would be bad if the pipeline editor and runtime support for Airflow were removed. At least allow for optionally enabling it via a ConfigMap or an environment variable, based on this.

Background:

We plan to use both: data science pipelines / Kubeflow Pipelines for pure ML development and Airflow for more of an ETL / data engineering set of tasks.

Support for running ODH on IBM Power architecture

An external volunteer has offered to assist in supporting ODH deployment on IBM Power architecture. This tracker issue will list all of the tasks related to this work. The goal is to provide the default ODH capabilities when running on IBM Power architecture.

  • Provide a list of dependencies, versions, and git URLs for ODH
  • Provide the source files and image registry for rebuilds of the container images for Power Architecture
  • Provide an update to odh-manifests to deploy ODH components with Power Arch customizations
  • Provide documentation on opendatahub.io to outline the steps for customizing ODH for deployment for IBM Power Arch

Additional images will be added as the work proceeds.

Disable issues in the odh-manifests repo and redirect users to the odh-community issues repository

Originally, the odh-manifests repo was the source of truth for every component deployed in ODH, since we defaulted to the Kubeflow model of using a single forked GitOps mono-repo to customize and deploy all of the ODH components. Now that each ODH component has its own repo under opendatahub-io, the operator has been redesigned, and we are using the opendatahub-community/issues board as the initial issue tracking board, tracking issues in odh-manifests has become obsolete and redundant.

We need to disable issues in odh-manifests and redirect all Open Data Hub issues to opendatahub-community/issues as the unified tracker. Any feedback (bugs, features, ...) on ODH deployment or installation should be tracked in opendatahub-operator/issues. Other issues related to component functionality will be redirected to opendatahub-community/issues or the affected component.

Additional Info

GitHub supports linking issues to pull requests across repositories in an org, so there is no loss of built-in GitHub issue automation when specifying that a PR in repository ABC Closes opendatahub-io/repository-XYZ#789.

See Linking a pull request to an issue using a keyword

[Documentation] - Explicitly say which Provided API the user should use to Create Instance

Item 3)

Click Create Instance to create a new deployment. The default kfdef provided will deploy the latest release of the [ODH Core components](https://opendatahub.io/docs/tiered-components). If you accepted the default name, this will trigger the creation of an Open Data Hub kfdef object named opendatahub and start the rollout of the [ODH Core components](https://opendatahub.io/docs/tiered-components).

There are 5 "Create Instance" options.

The one that the user must select is KfDef.

ODH 1.10 Release Tracker

Links to tracker issues for components planning updates in this release:

  • ODH Operator
  • ODH Dashboard
  • ODH Notebook Controller
  • ODH Workbench Images
  • Data Science Pipelines
  • Model Serving
  • AI Explainability
    - N/A
  • Distributed Workloads
  • opendatahub.io
  • Release Notes
  • Component Blogs

the CUDA image should probably also have the Elyra components

I am not sure if this is by design, but currently, out of the 6 default images in RHODS 1.28.1, only "minimal python" and "CUDA" do not have the Elyra configuration (see the attached screenshot).

I can understand the rationale for the Minimal Image, but I would expect it to be there in the CUDA image.

Would it be possible to add the Elyra config to the CUDA image as well?

Define the process for creating a Working Group

To standardize the creation of Working Groups, we need a well-defined process and checklist for how a SIG (or SIGs) proposes a Working Group, and a checklist for what is required when it is created.

Distributed workloads - support CodeFlare stack

This is a tracker for all the various bits we will need to track to complete the feature work to add Distributed workloads - support CodeFlare stack.

Requirements

Add requirements

Individual Efforts

  • UX: (add link)
    • Lead: TBD
  • UI: (add link)
    • Lead: TBD
  • Dev: (add link)
    • Lead: TBD
  • QE: (add link)
    • Lead: TBD
  • Doc: (add link)
    • Lead: TBD

[Bug]: inconsistency between the URL provided for external and internal model serving.

ODH Component

Model Serving

Current Behavior

If my model is served externally, via a route, the URL provided for it is:

https://something.apps.cluster.code.p1.openshiftapps.com/v2/models/fraud-model/infer

If I turn off the external route, the new (REST) URL is given to me as:

http://modelmesh-serving.myproject:8008

Expected Behavior

I would expect the internal URL to end in the same way the external one does:

http://modelmesh-serving.myproject:8008/v2/models/fraud-model/infer

Steps To Reproduce

No response

Workaround (if any)

No response

What browsers are you seeing the problem on? (If applicable)

No response

Open Data Hub Version

RHODS 1.28.1

Anything else

I was recently testing with a curl command and just swapped in the internal URL instead of the external one; it did not work until I manually added /v2/models/fraud-model/infer to the end of the internal URL.
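Until the dashboard output is made consistent, the suffix can be appended programmatically. A minimal sketch; the base URLs and model name are just the examples from this report, and the path follows the KServe v2 REST convention:

```python
def inference_url(base_url: str, model_name: str) -> str:
    """Append the KServe v2 REST inference path to a service base URL.

    The same /v2/models/<name>/infer suffix is required whether the
    model is reached through the external route or the internal service.
    """
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"

# Internal URL as shown by the dashboard, with the suffix added:
print(inference_url("http://modelmesh-serving.myproject:8008", "fraud-model"))
```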

ODH Community Governance

  • Code of Conduct
  • Update Contributing Guide
  • Establish Steering Committee
  • Establish Special Interest Groups
  • Establish rules for Working Groups

Add CI runs to verify that ODH can be deployed successfully on OKD

Is your feature request related to a problem? Please describe.
Our current CI test workflow verifies that ODH deploys successfully on OCP but not on OKD. Since ODH is the community edition of RHODS, we need to start verifying that ODH deploys without issue on OKD, to ensure that ODH works as a true upstream community project.

Describe the solution you'd like
Add a CI test run for OKD to verify that everything works with the default OKD clusters

Describe alternatives you've considered
N/A

Additional context
N/A

ODH 1.8 Release

Links to tracker issues for components planning updates in this release:

Due Date: July 19th

Rethink Metrics monitoring Stack in ODH

Is your feature request related to a problem? Please describe.
Since OpenShift 4.6, I believe, there has been a feature in both OKD and OCP called "Monitoring for User Defined Projects".
Enabling it on a cluster leads to all non-kube* and non-openshift* namespaces being monitored by a separate Prometheus in the openshift-user-workload-monitoring namespace. At the same time, application metrics time series from ServiceMonitors and PodMonitors, as well as kube-state-metrics container, pod and PVC metrics, are available per namespace, nicely separated by namespace, with their own RBAC.

The only metrics that cannot be retrieved this way are node-exporter node-level metrics.

The thing is, Red Hat does not recommend mixing your own Prometheus operator deployments (we did that on OCP 3.11 in the past, and pre-OCP-4.6) with Monitoring for User Defined Projects.

"In OpenShift Container Platform 4.10 you must remove any custom Prometheus instances before enabling monitoring for user-defined projects".

https://docs.openshift.com/container-platform/4.10/monitoring/enabling-monitoring-for-user-defined-projects.html
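For reference, the linked documentation enables UWM with a single ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```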

Describe the solution you'd like


Q: Could you make the federation ServiceMonitor, Prometheus and Prometheus Operator optional via an overlay in model-mesh? That way, all of the metrics gathering would still be there, while users who have Monitoring for User Defined Projects enabled could skip the Prometheus and cluster metrics federation parts.

With OpenShift Monitoring for User Defined Projects, the federation of cluster-level metrics from the kube-state-metrics exporter (pod container restarts, OOM, all that stuff) happens automatically at the namespace level. The only metrics not accessible are node-level (node exporter) metrics. This means I get metrics such as kube_pod_container_restarts without an explicit federation ServiceMonitor.

ClusterRoles that are available for namespace-level RoleBindings are described here:

https://docs.openshift.com/container-platform/4.10/monitoring/enabling-monitoring-for-user-defined-projects.html#granting-users-permission-to-monitor-user-defined-projects_enabling-monitoring-for-user-defined-projects

An Observe section is also available in the web console for all users who have at least the view ClusterRole on a project, as well as one of the ClusterRoles mentioned in the link above.

See screenshots of per-namespace query-window and alerts window here https://access.redhat.com/documentation/en-us/openshift_container_platform/4.9/html/building_applications/odc-monitoring-project-and-application-metrics-using-developer-perspective

The monitoring of the metrics from the odh-model-controller ServiceMonitor works with Monitoring for User Defined Projects too, by the way.

That is, the section with the custom monitoring implementation for ModelMesh could be removed from odh-core, as the same result is achieved with Monitoring for User Defined Projects.

Describe alternatives you've considered
N/A

Additional context
N/A

Status Notifications of the workflow run

Support sending Slack notifications and alerts when a workflow finishes and/or for special events.

Suggestion: When a run of a pipeline finishes, automatically send an alert to a specific Slack channel. The alert can contain the name of the run/pipeline, the status, and/or other information.
Another usage is to send a notification that a run is stuck, for example when the size of the logs stops growing, or based on other criteria.
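A sketch of what such a notification could look like; the pipeline names, run IDs, and webhook URL are hypothetical, and Slack incoming webhooks accept a JSON body with a `text` field:

```python
import json
from urllib import request

def build_run_alert(pipeline: str, run_id: str, status: str) -> dict:
    """Build a Slack incoming-webhook payload describing a pipeline run."""
    icons = {"Succeeded": ":white_check_mark:", "Failed": ":x:", "Stuck": ":warning:"}
    icon = icons.get(status, ":information_source:")
    return {"text": f"{icon} Pipeline *{pipeline}* run `{run_id}`: *{status}*"}

def send_alert(webhook_url: str, payload: dict) -> None:
    """POST the payload to a Slack incoming webhook (hypothetical URL)."""
    req = request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)

payload = build_run_alert("fraud-detection", "run-42", "Succeeded")
print(payload["text"])
```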

Define the workflow for how to contribute.

In the contributing.md for community governance, clarifying and defining the workflow for how to contribute a feature would help new users. For example, a workflow such as: 1) I have an idea/feature I want to add. 2) When do I contact the SIG to discuss it? 3) Is presenting to the community for approval needed in addition to SIG approval?

The PR itself is where the discussions should be tracked. If it's an iterative or small update, contacting the SIG might not be necessary; just work through the issue with the reviewers. We can build on this to make it more transparent how to get the right attention for any new features.

Data Science Pipelines experiments

This is a tracker for all the various bits we will need to track to complete the feature work to add Data Science Pipelines experiments.

Requirements

  • Add Experiments Creation Modal to the Runs pages (Scheduled and Triggered)
  • Add Experiments toggle view for Scheduled and Triggered Runs, to toggle between the Run list and Experiment groupings
  • Add Edit and Delete Experiments to the Experiment Grouping
  • Add Experiments to Run Creation Page

Individual Efforts

change release branch imagestream tag referencePolicy to Local instead of Source

Describe the bug
Currently, even in release branches, imagestream tags have referencePolicy: Source set, meaning images are always pulled from the external registry, even when an internal OpenShift registry is present.
Together with imagePullPolicy: Always, this decreases stability and introduces a potential point of failure.

https://github.com/opendatahub-io/odh-manifests/blob/v1.7/notebook-images/base/jupyter-datascience-notebook-imagestream.yaml#L27
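The proposed change, sketched against an imagestream tag; the tag name and image reference are placeholders:

```yaml
tags:
- name: "1.7.0"                 # placeholder tag
  from:
    kind: DockerImage
    name: quay.io/opendatahub/jupyter-datascience-notebook:1.7.0   # placeholder
  referencePolicy:
    type: Local                 # cache via the internal registry; was: Source
```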

Steps To Reproduce

Disallow traffic to quay.io in the proxy server referenced in the cluster config. The scenario "external image" is described at https://itnext.io/variations-on-imagestreams-in-openshift-4-f8ee5e8be633:

Q: "Does it work if the external registry is unreachable?"

E.g. during a temporary network outage or when the external registry is down? No, because the Pod uses an external URL.

Expected behavior
When an internal OpenShift registry is present, images should be cached in it, which is what happens with tag referencePolicy: Local. When it is not present, images should be pulled from the external registry.

Workaround (if any)
Manually modified imagestreams in odh-manifests before applying to cluster.

Open Data Hub Version
1.7.0

OpenShift Version
4.10.x on-prem OCP

Additional context
The correct way of dealing with the internal OpenShift registry being present or not is to not set referencePolicy: Source. The one exception is tag-notation references in tag.from.name in DEV environments; there, referencePolicy: Source can make sense, possibly in combination with regularly-updated digests instead of tags in from.name, as you currently do in the master branch.

Instead, use an image change trigger and let OpenShift resolve the correct location of images by means of that mechanism.
That requires PR-133 (kubeflow notebook controller) and PR-800 (odh dashboard) to be merged in.

Behavior of that mechanism is shown in a screenshot attached to the original issue.

Support FM Prompt tuning using Codeflare/Distributed Workloads

This is a tracker for all the various bits we will need to track to complete the feature work to add Prompt tuning for FM models.

Requirements

Add requirements

Individual Efforts

  • UX: (add link)
    • Lead: TBD
  • UI: (add link)
    • Lead: TBD
  • Dev: (add link)
    • Lead: TBD
  • QE: (add link)
    • Lead: TBD
  • Doc: (add link)
    • Lead: TBD
