
Comments (45)

andreyvelich avatar andreyvelich commented on June 20, 2024 2

I strongly believe that the manifests repo should ONLY aggregate the manifests (which are authored in the other repos). There is no benefit to bringing code into the manifests aggregation repo. We would only create problems that make it harder to develop both the code and aggregate the manifests in a usable way.

Could you please explain what kind of problems we would have if we combined platform components and manifests in a single repo?

From my point of view, the benefit of combining manifests and platform components in a single repo is to simplify the process of making releases and triaging issues from our end users. E.g. the fewer GitHub repos we have, the fewer PRs and issues will stay abandoned.
Also, the manifests repo is only required for Kubeflow Platform, so combining them in a single repo makes sense to me.

Kubeflow "Workspaces":

@thesuperzapper Why do you want to include PVC Management and Tensorboards in kubeflow/workspaces?

from kubeflow.

james-jwu avatar james-jwu commented on June 20, 2024 2

Throwing in my 2c here. I think we are trying to solve a number of issues: engineering velocity, customer perception & experience, architectural "cleanliness", and ensuring maintainability of repos. IMO not all of these issues can be solved by repo organization, but maybe we can focus on the most important issues and find a compromise on the less important ones.

To Andrey's point about "we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project [1] and can be deployed as a standalone application.[2]", I agree with part [1]. On part [2], I think there is a chance of the "platform" growing too big, so it is not necessary to have a single "platform" repo. Rather, we could create repos for large feature areas. Today the only user-facing feature in "platform" is the central dashboard, so we could create a repo named as such. This will reduce confusion for customers when they try to file issues, etc.

For the common components (profile controller, etc.), there are several options: leave them with central dashboard, create a separate repo, or move some/all to the manifests repo. I think we can let the current maintainers of these components and the manifests repo decide. It may not result in the cleanest option from an architectural perspective, but I want to optimize the developer/maintainer workflow, because it's not like we have a large group of developers behind each component.

Another point I want to make is whether we should separate the workspaces and notebooks repos. IMO it may be a better product strategy to give customers confidence about product continuity from Notebooks to its next version. So maybe having workspaces developed in the notebooks repo is a better choice from this perspective.


StefanoFioravanzo avatar StefanoFioravanzo commented on June 20, 2024 2

I agree with @james-jwu on keeping workspaces in the same repo as notebooks. It's much better to provide continuity by keeping both workstreams in the same repo, even if versioning/releasing may be slightly more challenging at the beginning. It makes much more sense from a product perspective and gives much more clarity to users and contributors alike.


rimolive avatar rimolive commented on June 20, 2024 1

@kimwnasptd just to clarify: you are suggesting that the kubeflow/multi-tenancy repo would be under the Manifests WG responsibility, correct? We would probably need a "Control Plane WG" that is both responsible for kubeflow/manifests and kubeflow/multi-tenancy.

This is a great idea. It can be WG Manifests' responsibility, as they are already working on both.

It adds so much overhead as a software developer, maintainer, and reviewer. Just try it out yourself :-D. You will usually get lost in processes and talking, with less code and way too much communication and synchronization overhead.

Maybe the overhead comes from the fact that there are too many manual steps to accomplish this? If so, we can figure out a way to automate it and trigger it only when we need to cut releases for Kubeflow.


andreyvelich avatar andreyvelich commented on June 20, 2024 1

That's why my initial proposal was to name the repo which contains dashboard/profile-controller/kfam as kubeflow/dashboard. However, to make the name more flexible (e.g. allow admission-webhook to be included), and open the possibility of "branding" the frontend of Kubeflow as "Kubeflow Central", I quite like the repo name kubeflow/central.

As @juliusvonkohout said before (#7549 (comment)), most of the multi-tenancy stuff already lives in kubeflow/manifests, which is why we can move profile-controller, kfam, and admission-webhook to kubeflow/manifests.

I think we should gather feedback from folks in the community who are planning to maintain these components (e.g. @kubeflow/wg-manifests-leads @kubeflow/wg-training-leads).
Can we find folks in the community who can maintain these components?

Also, what do you think about this question: "Which repo would a user go to if they have issues with Profile Controller?"

My initial thought was to use separate branches of kubeflow/workspaces, but that just leads to more confusion, as branches aren't very visible, and makes it harder to "archive" Notebooks 1.0 once 2.0 is mature.

Why can't you use the same branches for Notebooks 1.0 and 2.0 in kubeflow/workspaces?

For example, when @kimwnasptd and team worked on the new Katib UI: https://github.com/kubeflow/katib/projects/1, we created a new directory new-ui and committed code there. For a while we were releasing 2 Katib UI images for old and new UI. After a while we deprecated the old Katib UI: kubeflow/katib#2179.


juliusvonkohout avatar juliusvonkohout commented on June 20, 2024 1

@andreyvelich since you asked me directly in the community call: yes, I am willing to maintain the components in the kubeflow/platform, kubeflow/workspaces split, and also in Matthew Wicks' kubeflow/dashboard, kubeflow/workspaces, kubeflow/admission, kubeflow/platform split, as well as something in between like kubeflow/manifests, kubeflow/workspaces, kubeflow/multi-tenancy.


thesuperzapper avatar thesuperzapper commented on June 20, 2024 1

Sure, we can start that. Let's discuss this tomorrow @kubeflow/kubeflow-steering-committee.

@kimwnasptd @thesuperzapper Do you want to use the kubeflow/notebooks repo in the future to develop Notebooks 2.0, a.k.a. Kubeflow Workspaces?

Yes, that's what I meant by my message in #7549 (comment).

Both the existing notebooks and new workspaces code will be on the same repo (and in the same branch).


andreyvelich avatar andreyvelich commented on June 20, 2024 1

@kubeflow/wg-notebooks-leads kubeflow/notebooks repo has been created: https://github.com/kubeflow/notebooks 🎉
Thank you for doing this @zijianjoy!


andreyvelich avatar andreyvelich commented on June 20, 2024

Thank you for opening this @thesuperzapper!
Please can you also add the suggestion that @kimwnasptd proposed here: kubeflow/internal-acls#618 (comment)


juliusvonkohout avatar juliusvonkohout commented on June 20, 2024

I would like to add @andreyvelich's option mentioned here: kubeflow/internal-acls#618 (comment)

My personal opinion based on Andrey's proposal is:

The more repositories we have, the more cumbersome development becomes. There is also the requirement from the KSC to split kubeflow/kubeflow. Having kubeflow/control-plane + kubeflow/workspaces and kubeflow/manifests adds too much development and synchronization overhead in my opinion. Moving the multi-user/multi-tenancy stuff into kubeflow/manifests, where most of the multi-tenancy stuff already lives, makes more sense to me. Splitting what we have to release together anyway across multiple repositories does not make much sense to me. KFAM, profile controller, etc. are tightly coupled with kubeflow/manifests. I am also fine with renaming kubeflow/manifests to kubeflow/control-plane or kubeflow/platform. But we need as few repositories as possible and a common place for the multi-user stuff. For example, notebooks, PVC viewer, and maybe some other things could stay in kubeflow/workspaces.


kimwnasptd avatar kimwnasptd commented on June 20, 2024

I like @juliusvonkohout's arguments. My proposal would likewise be to move multi-tenancy closer to manifests and the rest to kubeflow/workspaces (for now). As a next step afterwards, we would define what to do with those extra components (volume mgmt, tensorboard, poddefaults) and potentially move them away from workspaces.

@thesuperzapper has a good point that right now the manifests repo is laser-focused on providing a catalogue of manifests, and I agree it should stick to this. It would be confusing from a user's point of view to suddenly see multi-tenancy code in a manifests repository.

So my proposal is the following, after taking into consideration @juliusvonkohout @thesuperzapper and @andreyvelich's points:

  1. Change scope of kubeflow/manifests for handling multi-tenancy
  2. Have a new repo kubeflow/multi-tenancy that will be a subproject of manifests
  3. The kubeflow/multi-tenancy repo will contain
    1. Profiles and KFAM
    2. Central Dashboard (interacts with both KFAM and Profiles)
    3. Istio manifests
    4. OIDC manifests (oauth2-proxy and Dex)
  4. Have a kubeflow/workspaces repo which will contain the rest from kubeflow/kubeflow
    1. Notebooks controller and web app
    2. Example notebook servers
    3. Volumes web app and pvcviewer controller
    4. Tensorboards web app and controller
    5. PodDefaults
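
To make the split easier to scan, here is a rough sketch of the layout above as a lookup table. The two repo names come from the proposal; the component keys and the `repo_for` helper are hypothetical names made up for illustration, and nothing like this exists in any Kubeflow repo.

```python
# Hypothetical sketch of the proposed split. Repo names are taken from the
# proposal above; component keys and the helper are illustrative only.
PROPOSED_LAYOUT = {
    "kubeflow/multi-tenancy": [
        "profiles",
        "kfam",
        "central-dashboard",
        "istio-manifests",
        "oidc-manifests",  # oauth2-proxy and Dex
    ],
    "kubeflow/workspaces": [
        "notebooks-controller",
        "notebooks-web-app",
        "example-notebook-servers",
        "volumes-web-app",
        "pvcviewer-controller",
        "tensorboards-web-app",
        "tensorboards-controller",
        "poddefaults",
    ],
}


def repo_for(component: str) -> str:
    """Return the repo that would own `component` under this proposal."""
    for repo, components in PROPOSED_LAYOUT.items():
        if component in components:
            return repo
    raise KeyError(f"{component!r} is not assigned to any repo in this sketch")


if __name__ == "__main__":
    print(repo_for("kfam"))
    print(repo_for("poddefaults"))
```

A table like this would also give a direct answer to "which repo do I file an issue in for component X?", which came up earlier in the thread.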

Note that for multi-tenancy, I explicitly didn't mention new WGs. Although I believe this makes sense down the road, we can start with this being a subproject of manifests, since all the discussions are already happening there with @juliusvonkohout and me.


kimwnasptd avatar kimwnasptd commented on June 20, 2024

The above is to immediately unblock the effort of cleaning up kubeflow/kubeflow for now. The next goal should then be to settle the scope of what the Notebooks WG owns, and what is included in the kubeflow/workspaces repo.

Specifically, I would suggest we work out answers with the @kubeflow/kubeflow-steering-committee to the following:

  1. What do we do with components that aim to make it more streamlined to interact with K8s? (poddefaults, volumes web app for managing pvcs, pvcviewer controller)
  2. TensorBoard is an ML tool, but doesn't really fit to live under kubeflow/workspaces. Let's either
    1. Deprecate the controller and web app if it's not used a lot (should we have a survey? cc @StefanoFioravanzo)
    2. Have a new repo, in the future, for this and potentially other Data Visualisation tools

IMO answering the above will also help keep the kubeflow/workspaces repo cleaner, by focusing it more on notebooks/workspaces and not on other K8s functionalities or data visualisation tools.


andreyvelich avatar andreyvelich commented on June 20, 2024

cc @kubeflow/wg-pipeline-leads @kubeflow/wg-training-leads @kubeflow/wg-deployment-leads @kubeflow/wg-model-registry-leads @kubeflow/kubeflow-steering-committee for the feedback.


StefanoFioravanzo avatar StefanoFioravanzo commented on June 20, 2024

I agree with @kimwnasptd's proposal.

@kimwnasptd just to clarify: you are suggesting that the kubeflow/multi-tenancy repo would be under the Manifests WG responsibility, correct? We would probably need a "Control Plane WG" that is both responsible for kubeflow/manifests and kubeflow/multi-tenancy.

My 2 cents on:

What do we do with components that aim to make it more streamlined to interact with K8s?

These would all make sense as standalone projects, but they cannot stay outside of one of the existing working groups just yet. I am wondering why @juliusvonkohout suggests that we should have as few repos as possible. What is stopping us from having kubeflow/poddefaults and kubeflow/volume-management repos? These could still be Notebooks WG subprojects, but at least have a separate lifecycle that could promote more contributions.

TensorBoard is an ML tool, but doesn't really fit to live under kubeflow/workspaces.

Good observation. We don't have data points as to how popular this is. I don't think that this component should stay under the Notebooks WG. I think we should:

  1. Agree that indeed TF Controller cannot live under kubeflow/workspaces
  2. Decide if we are okay with having a kubeflow/tensorboard repo
  3. If NO (due to operational difficulties with having a separate repo, for whatever reason), then Deprecation is the only option
  4. If YES, then do a call to action with a deadline. If someone is willing to maintain this repo, then we can figure out how to do so.


juliusvonkohout avatar juliusvonkohout commented on June 20, 2024

These would all make sense as standalone projects, but they cannot stay outside of one of the existing working groups just yet. I am wondering why @juliusvonkohout suggests that we should have as few repos as possible. What is stopping us from having kubeflow/poddefaults and kubeflow/volume-management repos? These could still be Notebooks WG subprojects, but at least have a separate lifecycle that could promote more contributions.

It adds so much overhead as a software developer, maintainer, and reviewer. Just try it out yourself :-D. You will usually get lost in processes and talking, with less code and way too much communication and synchronization overhead.


thesuperzapper avatar thesuperzapper commented on June 20, 2024

Based on some discussions today I have updated my proposed "Option 3" above to suggest splitting the repo up into kubeflow/dashboard (could also be called kubeflow/platform), and kubeflow/workspaces.

The goal would be to build "Notebooks 2.0" in a separate branch of kubeflow/workspaces and eventually have it replace the need for the volumes and tensorboard controllers.


StefanoFioravanzo avatar StefanoFioravanzo commented on June 20, 2024

Maybe the overhead comes from the fact that there are too many manual steps to accomplish this? If so, we can figure out a way to automate it and trigger it only when we need to cut releases for Kubeflow.

Exactly! This seems like something that should not block the creation of new repos, but rather encourage us to find ways to remove barriers and simplify the process.


StefanoFioravanzo avatar StefanoFioravanzo commented on June 20, 2024

The goal would be to build "Notebooks 2.0" in a separate branch of kubeflow/workspaces and eventually have it replace the need for the volumes and tensorboard controllers

@thesuperzapper what do you mean by that?


andreyvelich avatar andreyvelich commented on June 20, 2024

In addition to @thesuperzapper's comment above (#7549 (comment)), I would like to add the following ideas based on our recent discussion.

I propose the idea that we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project and can be deployed as a standalone application.
For example: Kubeflow Notebooks, Kubeflow Pipelines, Kubeflow Katib, Kubeflow Model Registry, Kubeflow Spark Operator, and Kubeflow Training Operator.
Usually, those components can have their own release schedule.

Thus, from my perspective, to find a place for the "common" components (e.g. profile controller, central dashboard, TensorBoard, PVC Viewer) we should define a new entity called Kubeflow Platform, which provides a way to deploy everything together and requires those "common" components.
Until we identify clear user requirements for using those components as stand-alone applications, I am not sure we need to separate them.

That should help us explain more clearly how Kubeflow can be used:

  1. Install Kubeflow Platform from manifests.
  2. Install Kubeflow Platform from package distribution.
  3. Install Kubeflow components standalone.

Option 1: Short-term simple solution

Since we don't need to version these "common" components separately, move them to kubeflow/manifests and the Notebooks components to kubeflow/workspaces, as I mentioned before: kubeflow/internal-acls#618 (comment)

Option 2: Create kubeflow/platform for common components

Move all common components to kubeflow/platform and the Notebooks components to kubeflow/workspaces.

What do you think?


StefanoFioravanzo avatar StefanoFioravanzo commented on June 20, 2024

Option 2 seems to be the most future proof and avoids confusion


rimolive avatar rimolive commented on June 20, 2024

I'll go with Andrey's Option 2 as well. Having KF component code in kubeflow/manifests would muddy the stated objective of the manifests repo.


juliusvonkohout avatar juliusvonkohout commented on June 20, 2024

As far as I understand @andreyvelich, the second option just implies renaming kubeflow/manifests to kubeflow/platform, but still having the same content as in Option 1.

I am in favor of Option 1 with the renaming to kubeflow/platform.

Because I agree with: "I propose the idea that we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project and can be deployed as a standalone application.
For example: Kubeflow Notebooks, Kubeflow Pipelines, Kubeflow Katib, Kubeflow Model Registry, Kubeflow Spark Operator, and Kubeflow Training Operator.
Usually, those components can have their own release schedule."


andreyvelich avatar andreyvelich commented on June 20, 2024

the second option just implies renaming kubeflow/manifests to kubeflow/platform, but still having the same content as in Option one.

From my point of view this is the right approach, since the manifests repo is the superset of all Kubeflow components needed to deploy the Kubeflow Platform product. Also, these components are versioned the same as the manifests.


thesuperzapper avatar thesuperzapper commented on June 20, 2024

I strongly believe that the manifests repo should ONLY aggregate the manifests (which are authored in the other repos).

There is no benefit to bringing code into the manifests aggregation repo. We would only create problems that make it harder to develop both the code and aggregate the manifests in a usable way.


The three "components" (isolatable sections of Kubeflow) that live in kubeflow/kubeflow right now are:

  1. Kubeflow "Dashboard":
    • the central dashboard itself
    • profile controller
    • KFAM
    • (and probably the manifests to deploy Istio, dex, and oauth2-proxy)
  2. Kubeflow "Workspaces":
    • Kubeflow Notebooks (controller + UI + example images)
    • PVC Management (controller + UI)
    • Tensorboards (controller + UI)
  3. Kubeflow "Admission" (PodDefaults):
    • This one should either be part of dashboard or have its own repo.
    • It is used frequently in KFP, in addition to Notebooks, so should be able to be deployed separately.

Given that, I think we should create 2-3 new repos for these components, so they can be versioned on their own lifecycle:

  • kubeflow/dashboard
  • kubeflow/workspaces
  • kubeflow/admission (or this can live under the dashboard repo, to reduce the number of repos)


thesuperzapper avatar thesuperzapper commented on June 20, 2024

@james-jwu I 100% agree that the repos should be named by their "user-facing purpose".

That's why my initial proposal was to name the repo which contains dashboard/profile-controller/kfam as kubeflow/dashboard. However, to make the name more flexible (e.g. allow admission-webhook to be included), and open the possibility of "branding" the frontend of Kubeflow as "Kubeflow Central", I quite like the repo name kubeflow/central.

In any case, it's not possible to use the dashboard without profile-controller or kfam so I plan to keep them in the same repo as each other.


While we could develop notebooks/workspaces in the same repo, because we want to version/release them separately (and allow them to be deployed alongside each other), I think separate repos are cleaner.

My initial thought was to use separate branches of kubeflow/workspaces, but that just leads to more confusion, as branches aren't very visible, and makes it harder to "archive" Notebooks 1.0 once 2.0 is mature.


In any case, it's clear the next steps are to:

  1. Create a new kubeflow/workspaces repo (so we can start scaffolding the Notebooks 2.0 code ASAP)
  2. Create a new kubeflow/notebooks repo (so we can migrate the Notebooks 1.0 components to it)
  3. Continue to discuss the future location for dashboard/profile/kfam/poddefaults (and leave them in kubeflow/kubeflow until we have a decision).

If @kimwnasptd agrees on steps 1 and 2, what is the process to create those new repos @james-jwu?


thesuperzapper avatar thesuperzapper commented on June 20, 2024

Hey @andreyvelich @james-jwu @kimwnasptd, we discussed this in the Notebooks WG meeting today, and we are happy to keep notebooks and workspaces in the same kubeflow/notebooks repo.

Therefore, since we are all in agreement that at least kubeflow/notebooks needs to exist, can we please start the process to create that repository?


rimolive avatar rimolive commented on June 20, 2024

We need to work on a 1.9.0-rc0 release by the week of April 29th, so when separating notebooks from the main repo, make sure this doesn't affect or block cutting a notebooks release for 1.9.0-rc0.


andreyvelich avatar andreyvelich commented on June 20, 2024

Hey @andreyvelich @james-jwu @kimwnasptd, we discussed this in the Notebooks WG meeting today, and we are happy to keep notebooks and workspaces in the same kubeflow/notebooks repo.

Therefore, since we are all in agreement that at least kubeflow/notebooks needs to exist, can we please start the process to create that repository?

Sure, we can start that. Let's discuss this tomorrow @kubeflow/kubeflow-steering-committee.

@kimwnasptd @thesuperzapper Do you want to use the kubeflow/notebooks repo in the future to develop Notebooks 2.0, a.k.a. Kubeflow Workspaces?


andreyvelich avatar andreyvelich commented on June 20, 2024

We discussed this topic today during the KSC call, and we are happy to create the new kubeflow/notebooks repo to migrate the Notebooks-related components.
@kimwnasptd Can we please get confirmation from you as well, since you are a member of WG Notebooks?


kimwnasptd avatar kimwnasptd commented on June 20, 2024

@andreyvelich sounds good!


thesuperzapper avatar thesuperzapper commented on June 20, 2024

@james-jwu @zijianjoy thanks!

Can we please also:

  1. give GitHub write access to @thesuperzapper and @kimwnasptd on the new repo
    • (so we can approve GitHub Actions and cut tags)
    • we will probably need to raise a new PR in kubeflow/internal-acls for this
  2. ensure the branch protection rules are set up like kubeflow/kubeflow:
    • so that the main branch cannot be pushed to (so people can't bypass the bot)
    • so that DCO is required to pass under "Require status checks to pass before merging" (NOTE: the option for this won't come up until we raise a PR the first time).
  3. Add kubeflow/notebooks to the following config (so that people's PRs are not automatically self-approved):


thesuperzapper avatar thesuperzapper commented on June 20, 2024

I have raised a separate PR to give @kimwnasptd and @thesuperzapper write access to the new kubeflow/notebooks repo:


juliusvonkohout avatar juliusvonkohout commented on June 20, 2024

@andreyvelich what are the next steps planned then?


andreyvelich avatar andreyvelich commented on June 20, 2024

@james-jwu @zijianjoy Could you please let us know if you made the changes according to @thesuperzapper's comment: #7549 (comment)

@andreyvelich what are the next steps planned then?

The next steps are:

  1. @kubeflow/wg-notebooks-leads should transfer the Kubeflow Notebooks code to kubeflow/notebooks.

  2. Transfer Notebooks PRs and Issues to the new repo from kubeflow/kubeflow.

  3. We need to identify a WG that can take responsibility for maintaining the Kubeflow Platform control-plane components:

Maybe we can spend a few minutes on this in tomorrow's community call. cc @jbottum


thesuperzapper avatar thesuperzapper commented on June 20, 2024

@andreyvelich @james-jwu @zijianjoy I have raised a PR in the GoogleCloudPlatform/oss-test-infra repo to require self-approval for root-level approvers.

(So we don't have drive-by LGTMs accidentally merging PRs which are not ready).

GoogleCloudPlatform/oss-test-infra#2271


juliusvonkohout avatar juliusvonkohout commented on June 20, 2024

Regarding "Transfer Notebooks PRs and Issues to the new repo from kubeflow/kubeflow.", maybe our GSoC student @hansinikarunarathne can help with that, @rimolive.
