Comments (45)
I strongly believe that the manifests repo should ONLY aggregate the manifests (which are authored in the other repos). There is no benefit to bringing code into the manifests aggregation repo. We would only create problems that make it harder to develop both the code and aggregate the manifests in a usable way.
Can you please explain what kind of problems we will have if we combine platform components and manifests in a single repo?
From my point of view, the benefit of combining manifests and platform components into a single repo is that it simplifies the process of making releases and triaging issues from our end users. E.g., the fewer GitHub repos we have, the fewer PRs and issues will be left abandoned.
Also, the manifests repo is only required for Kubeflow Platform, so combining them in a single repo makes sense to me.
Kubeflow "Workspaces":
@thesuperzapper Why do you want to include PVC Management and Tensorboards in kubeflow/workspaces?
from kubeflow.
Throwing in my 2c here. I think we are trying to solve a number of issues: engineering velocity, customer perception & experience, architectural "cleanliness", and ensuring maintainability of repos. IMO not all of these issues can be solved by repo organization, but maybe we can focus on the most important issues and find compromises on the less important ones.
To Andrey's point about "we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project [1] and can be deployed as a standalone application.[2]", I agree with part [1]. On part [2], I think there is a chance of the "platform" growing too big, so it is not necessary to have a single "platform" repo. Rather, we could create repos for large feature areas. Today the only user-facing feature in "platform" is the central dashboard, so we could create a repo named as such. This will reduce confusion for customers when they try to file issues, etc.
For the common components (profile controller, etc.), there are several options: leave them with the central dashboard, create a separate repo, or move some/all of them to the manifests repo. I think we can let the current maintainers of these components and of the manifests repo decide. It may not result in the cleanest option from an architectural perspective, but I want to optimize the developer/maintainer workflow, because it's not like we have a large group of developers behind each component.
Another point I want to make is whether we should separate the workspace and notebook repos. IMO it may be a better product strategy to give customers confidence about product continuity from Notebooks to its next version. So maybe having workspaces developed in the notebooks repo is a better choice from this perspective.
from kubeflow.
I agree with @james-jwu on keeping workspaces in the same repo as notebooks. It's much better to provide continuity by keeping both workstreams in the same repo, even if versioning/releasing may be slightly more challenging at the beginning. It makes much more sense from a product perspective and gives much more clarity to users and contributors alike.
from kubeflow.
@kimwnasptd just to clarify: you are suggesting that the kubeflow/multi-tenancy repo would be under the Manifests WG responsibility, correct? We would probably need a "Control Plane WG" that is both responsible for kubeflow/manifests and kubeflow/multi-tenancy.
This is a great idea, it can be WG Manifests responsibility as they are already working on both.
It adds so much overhead for software developers, maintainers, and reviewers. Just try it out yourself :-D. You will usually get lost in processes and talking, with less code and way too much communication and synchronization overhead.
Maybe the overhead comes from the fact that there are too many manual steps to accomplish this? If so, we can figure out a way to automate it and trigger it only when we need to cut releases for Kubeflow.
from kubeflow.
That's why my initial proposal was to name the repo which contains dashboard/profile-controller/kfam as kubeflow/dashboard. However, to make the name more flexible (e.g. allow admission-webhook to be included), and open the possibility of "branding" the frontend of Kubeflow as "Kubeflow Central", I quite like the repo name kubeflow/central.
As @juliusvonkohout said before: #7549 (comment) most of the multi-tenancy stuff is already living in kubeflow/manifests, which is why we can move profile-controller, kfam, and admission-webhook to kubeflow/manifests.
I think we should gather feedback from folks in the community who are planning to maintain these components (e.g. @kubeflow/wg-manifests-leads @kubeflow/wg-training-leads).
Can we find folks in the community who can maintain these components?
Also, what do you think about this question: "Which repo would a user go to if they have issues with Profile Controller?"
My initial thought was to use separate branches of kubeflow/workspaces, but that just leads to more confusion, as branches aren't very visible, and makes it harder to "archive" Notebooks 1.0 once 2.0 is mature.
Why can't you use the same branches for Notebooks 1.0 and 2.0 in kubeflow/workspaces?
For example, when @kimwnasptd and team worked on the new Katib UI: https://github.com/kubeflow/katib/projects/1, we created a new directory new-ui and committed code there. For a while we were releasing 2 Katib UI images, for the old and new UI. After a while we deprecated the old Katib UI: kubeflow/katib#2179.
from kubeflow.
@andreyvelich since you asked me directly in the community call: Yes, I am willing to maintain the components in the kubeflow/platform, kubeflow/workspaces split, and also in Matthew Wicks' kubeflow/dashboard, kubeflow/workspaces, kubeflow/admission, kubeflow/platform split, as well as something in between like kubeflow/manifests, kubeflow/workspaces, kubeflow/multi-tenancy.
from kubeflow.
Sure, we can start that. Let's discuss this tomorrow @kubeflow/kubeflow-steering-committee.
@kimwnasptd @thesuperzapper Do you want to use the kubeflow/notebooks repo in the future to develop Notebooks 2.0, a.k.a. Kubeflow Workspaces?
Yes, that's what I meant by my message in #7549 (comment).
Both the existing notebooks and new workspaces code will be in the same repo (and in the same branch).
from kubeflow.
@kubeflow/wg-notebooks-leads The kubeflow/notebooks repo has been created: https://github.com/kubeflow/notebooks 🎉
Thank you for doing this @zijianjoy!
from kubeflow.
Thank you for opening this @thesuperzapper!
Please can you also add the suggestion that @kimwnasptd proposed here: kubeflow/internal-acls#618 (comment)
from kubeflow.
I would like to add @andreyvelich's option mentioned here: kubeflow/internal-acls#618 (comment)
My personal opinion based on Andrey's proposal is:
The more repositories we have, the more cumbersome development becomes. There is also the requirement from the KSC to split kubeflow/kubeflow. kubeflow/control-plane + kubeflow/workspaces and kubeflow/manifests adds too much development and synchronization overhead in my opinion. Moving multi-user/multi-tenancy stuff into kubeflow/manifests, where most of the multi-tenancy stuff already lives, makes more sense to me. Splitting what we have to release together anyway over multiple repositories does not make that much sense to me. KFAM, profile controller, etc. are tightly coupled with kubeflow/manifests. I am also fine with renaming kubeflow/manifests to kubeflow/control-plane or kubeflow/platform. But we need as few repositories as possible and a common place for multi-user stuff. For example, notebooks, PVC-viewer, and maybe some other things could stay in kubeflow/workspaces.
from kubeflow.
I like @juliusvonkohout's arguments. My proposal would also be to move multi-tenancy closer to manifests and the rest to kubeflow/workspaces (for now). And as next steps afterwards, to define what to do with those extra components (volume mgmt, tensorboard, poddefaults) and potentially move them away from workspaces.
@thesuperzapper has a good point that right now the manifests repo is laser-focused only on providing a catalogue of manifests, and I agree it should stick to this. It will be confusing from a user point of view to suddenly see multi-tenancy code in a manifests repository.
So my proposal is the following, after taking into consideration @juliusvonkohout, @thesuperzapper, and @andreyvelich's points:
- Change scope of kubeflow/manifests for handling multi-tenancy
- Have a new repo kubeflow/multi-tenancy that will be a subproject of manifests
- The kubeflow/multi-tenancy repo will contain
  - Profiles and KFAM
  - Central Dashboard (interacts with both KFAM and Profiles)
  - Istio manifests
  - OIDC manifests (oauth2-proxy and Dex)
- Have a kubeflow/workspaces repo which will contain the rest from kubeflow/kubeflow
  - Notebooks controller and web app
  - Example notebook servers
  - Volumes web app and pvcviewer controller
  - Tensorboards web app and controller
  - PodDefaults
Note that for multi-tenancy, I explicitly didn't mention new WGs. Although I believe this makes sense down the road, we can start with this being a subproject of manifests, since all the discussions are already happening there with @juliusvonkohout and me.
from kubeflow.
The above is to immediately unblock the effort of cleaning up kubeflow/kubeflow for now. Then the next goal should be to help with the scope of what the Notebooks WG owns, and what is included in the kubeflow/workspaces repo.
Specifically, I would suggest we think about and have answers with the @kubeflow/kubeflow-steering-committee on the following:
- What do we do with components that aim to make it more streamlined to interact with K8s? (poddefaults, volumes web app for managing pvcs, pvcviewer controller)
- TensorBoard is an ML tool, but doesn't really fit to live under kubeflow/workspaces. Let's either
  - Deprecate the controller and web app if it's not used a lot (should we have a survey? cc @StefanoFioravanzo)
  - Have a new repo, in the future, for this and potentially other Data Visualisation tools
IMO answering the above will also help the kubeflow/workspaces repo to be cleaner, by focusing it more on notebooks/workspaces and not on other K8s functionalities or data visualisation tools.
from kubeflow.
cc @kubeflow/wg-pipeline-leads @kubeflow/wg-training-leads @kubeflow/wg-deployment-leads @kubeflow/wg-model-registry-leads @kubeflow/kubeflow-steering-committee for the feedback.
from kubeflow.
I agree with @kimwnasptd's proposal.
@kimwnasptd just to clarify: you are suggesting that the kubeflow/multi-tenancy repo would be under the Manifests WG responsibility, correct? We would probably need a "Control Plane WG" that is both responsible for kubeflow/manifests and kubeflow/multi-tenancy.
My 2 cents on:
What do we do with components that aim to make it more streamlined to interact with K8s?
These would all make sense as standalone projects, but they cannot stay outside of one of the existing working groups just yet. I am wondering why @juliusvonkohout suggests that we should have as few repos as possible. What is stopping us from having kubeflow/poddefaults and kubeflow/volume-management repos? These could still be Notebooks WG subprojects, but at least have a separate lifecycle that could promote more contributions.
TensorBoard is an ML tool, but doesn't really fit to live under kubeflow/workspaces.
Good observation. We don't have data points as to how popular this is. I don't think that this component should stay under the Notebooks WG. I think we should:
- Agree that indeed TF Controller cannot live under kubeflow/workspaces
- Decide if we are okay with having a kubeflow/tensorboard repo
  - If NO (due to operational difficulties with having a separate repo, for whatever reason), then deprecation is the only option
  - If YES, then do a call to action with a deadline. If someone is willing to maintain this repo, then we can figure out how to do so.
from kubeflow.
These would all make sense as standalone projects, but they cannot stay outside of one of the existing working groups just yet. I am wondering why @juliusvonkohout suggests that we should have as few repos as possible. What is stopping us from having kubeflow/poddefaults and kubeflow/volume-management repos? These could still be Notebooks WG subprojects, but at least have a separate lifecycle that could promote more contributions.
It adds so much overhead for software developers, maintainers, and reviewers. Just try it out yourself :-D. You will usually get lost in processes and talking, with less code and way too much communication and synchronization overhead.
from kubeflow.
Based on some discussions today I have updated my proposed "Option 3" above to suggest splitting the repo up into kubeflow/dashboard (could also be called kubeflow/platform), and kubeflow/workspaces.
The goal would be to build "Notebooks 2.0" in a separate branch of kubeflow/workspaces and eventually have it replace the need for the volumes and tensorboard controllers.
from kubeflow.
Maybe the overhead comes from the fact that there are too many manual steps to accomplish this? If so, we can figure out a way to automate it and trigger it only when we need to cut releases for Kubeflow.
Exactly! This seems like something that should not block the creation of new repos, but rather encourage us to find ways to remove barriers and simplify the process.
from kubeflow.
The goal would be to build "Notebooks 2.0" in a separate branch of kubeflow/workspaces and eventually have it replace the need for the volumes and tensorboard controllers
@thesuperzapper what do you mean by that?
from kubeflow.
In addition to @thesuperzapper's comment above: #7549 (comment) I would like to add the following ideas based on our recent discussion.
I propose the idea that we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project and can be deployed as a standalone application.
For example: Kubeflow Notebooks, Kubeflow Pipelines, Kubeflow Katib, Kubeflow Model Registry, Kubeflow Spark Operator, and Kubeflow Training Operator.
Usually, those components can have their own release schedule.
Thus, from my perspective, to find a place for the "common" components (e.g. profile controller, central dashboard, TensorBoard, PVC Viewer) we should define a new entity called Kubeflow Platform, which provides a way to deploy all things together and which requires those "common" components.
Until we identify clear user requirements for when those components can be used as standalone applications, I am not sure we need to separate them.
That should help us explain more clearly how Kubeflow can be used:
- Install Kubeflow Platform from manifests.
- Install Kubeflow Platform from package distribution.
- Install Kubeflow components standalone.
Option 1: Short-term simple solution
Since we don't need to version these "common" components separately, move them to kubeflow/manifests and Notebooks components to kubeflow/workspaces, as I mentioned before: kubeflow/internal-acls#618 (comment)
kubeflow/kubeflow
- NONE
kubeflow/manifests
- access-management (KFAM, Auth)
- admission-webhook (PodDefaults)
- centraldashboard
- profile-controller
- crud-web-apps (UIs for: Volumes, Notebooks, Tensorboards)
- pvcviewer-controller
kubeflow/workspaces
- notebook-controller
- example-notebook-servers (pre-built Docker images for Notebooks)
- jupyter web app
Option 2: Create kubeflow/platform for common components
Move all common components to kubeflow/platform and Notebooks components to kubeflow/workspaces.
What do you think about it?
from kubeflow.
Option 2 seems to be the most future proof and avoids confusion
from kubeflow.
I'll go with Andrey's Option 2 as well. Having KF component code in kubeflow/manifests will not cohesively state the objective of the manifests repo.
from kubeflow.
As far as I understand, @andreyvelich, the second option just implies renaming kubeflow/manifests to kubeflow/platform, but still having the same content as in Option one.
I am in favor of option one with the renaming to kubeflow/platform.
Because I agree with "I propose the idea that we should create GitHub repos for Kubeflow components only when it makes sense to call the tool as an individual sub-project and can be deployed as a standalone application.
For example: Kubeflow Notebooks, Kubeflow Pipelines, Kubeflow Katib, Kubeflow Model Registry, Kubeflow Spark Operator, and Kubeflow Training Operator.
Usually, those components can have their own release schedule."
from kubeflow.
the second option just implies renaming kubeflow/manifests to kubeflow/platform, but still having the same content as in Option one.
From my point of view this is the right approach, since the manifests repo is the superset of all Kubeflow components needed to deploy the Kubeflow Platform product. Also, these components are versioned the same as manifests.
from kubeflow.
I strongly believe that the manifests repo should ONLY aggregate the manifests (which are authored in the other repos).
There is no benefit to bringing code into the manifests aggregation repo. We would only create problems that make it harder to develop both the code and aggregate the manifests in a usable way.
The three "components" (isolatable sections of Kubeflow) that live in kubeflow/kubeflow right now are:
- Kubeflow "Dashboard":
  - the central dashboard itself
  - profile controller
  - KFAM
  - (and probably the manifests to deploy Istio, dex, and oauth2-proxy)
- Kubeflow "Workspaces":
  - Kubeflow Notebooks (controller + UI + example images)
  - PVC Management (controller + UI)
  - Tensorboards (controller + UI)
- Kubeflow "Admission" (PodDefaults):
  - This one should either be part of dashboard or have its own repo.
  - It is used frequently in KFP, in addition to Notebooks, so should be able to be deployed separately.
Given that, I think we should create 2-3 new repos for these components, so they can be versioned on their own lifecycle:
- kubeflow/dashboard
- kubeflow/workspaces
- kubeflow/admission (or this can live under the dashboard repo, to reduce the number of repos)
from kubeflow.
@james-jwu I 100% agree that the repos should be named by their "user-facing purpose".
That's why my initial proposal was to name the repo which contains dashboard/profile-controller/kfam as kubeflow/dashboard. However, to make the name more flexible (e.g. allow admission-webhook to be included), and open the possibility of "branding" the frontend of Kubeflow as "Kubeflow Central", I quite like the repo name kubeflow/central.
In any case, it's not possible to use the dashboard without profile-controller or kfam, so I plan to keep them in the same repo as each other.
While we could develop notebooks/workspaces in the same repo, because we want to version/release them separately (and allow them to be deployed alongside each other), I think separate repos are cleaner.
My initial thought was to use separate branches of kubeflow/workspaces, but that just leads to more confusion, as branches aren't very visible, and makes it harder to "archive" Notebooks 1.0 once 2.0 is mature.
In any case, it's clear the next steps are to:
- Create a new kubeflow/workspaces repo (so we can start scaffolding the Notebooks 2.0 code ASAP)
- Create a new kubeflow/notebooks repo (so we can migrate the Notebooks 1.0 components to it)
- Continue to discuss the future location for dashboard/profile/kfam/poddefaults (and leave them in kubeflow/kubeflow until we have a decision).
If @kimwnasptd agrees on steps 1 and 2, what is the process to create those new repos @james-jwu?
from kubeflow.
Hey @andreyvelich @james-jwu @kimwnasptd, we discussed this in the Notebooks WG meeting today, and we are happy to keep notebooks and workspaces in the same kubeflow/notebooks repo.
Therefore, since we are all in agreement that at least kubeflow/notebooks needs to exist, can we please start the process to create that repository?
from kubeflow.
We need to work on a 1.9.0-rc0 release by the week of April 29th, so when separating notebooks from the main repo, make sure this doesn't affect or block cutting a notebooks release for 1.9.0-rc0.
from kubeflow.
Hey @andreyvelich @james-jwu @kimwnasptd, we discussed this in the Notebooks WG meeting today, and we are happy to keep notebooks and workspaces in the same kubeflow/notebooks repo.
Therefore, since we are all in agreement that at least kubeflow/notebooks needs to exist, can we please start the process to create that repository?
Sure, we can start that. Let's discuss this tomorrow @kubeflow/kubeflow-steering-committee.
@kimwnasptd @thesuperzapper Do you want to use the kubeflow/notebooks repo in the future to develop Notebooks 2.0, a.k.a. Kubeflow Workspaces?
from kubeflow.
We discussed this topic today during the KSC call, and we are happy to create this new repo kubeflow/notebooks to migrate Notebooks-related components.
@kimwnasptd Can we please get confirmation from you as well, since you are a member of WG Notebooks?
from kubeflow.
@andreyvelich sounds good!
from kubeflow.
@james-jwu @zijianjoy thanks!
Can we please also:
- give GitHub write access to @thesuperzapper and @kimwnasptd on the new repo
  - (so we can approve GitHub actions and cut tags)
  - we will probably need to raise a new PR in kubeflow/internal-acls for this
- ensure the branch protection rules are set up like kubeflow/kubeflow:
- Add kubeflow/notebooks to the following config (so that people's PRs are not automatically self-approved):
from kubeflow.
I have raised a separate PR to give @kimwnasptd and @thesuperzapper write access to the new kubeflow/notebooks repo:
from kubeflow.
@andreyvelich what are the next steps planned then?
from kubeflow.
@james-jwu @zijianjoy Can you please let us know if you made the changes according to @thesuperzapper's comment: #7549 (comment)
@andreyvelich what are the next steps planned then?
The next steps are:
- @kubeflow/wg-notebooks-leads should transfer Kubeflow Notebooks code to kubeflow/notebooks.
- Transfer Notebooks PRs and Issues to the new repo from kubeflow/kubeflow.
- We need to identify a WG that can take responsibility for maintaining the Kubeflow Platform control-plane components:
  - access-management (KFAM, Auth)
  - admission-webhook (PodDefaults)
  - centraldashboard
  - profile-controller
  - crud-web-apps (UIs for: Volumes, Notebooks, Tensorboards)
  - pvcviewer-controller
Maybe we can spend a few minutes in tomorrow's community call cc @jbottum
from kubeflow.
@andreyvelich @james-jwu @zijianjoy I have raised a PR in the GoogleCloudPlatform/oss-test-infra repo to require self-approval for root-level approvers.
(So we don't have drive-by LGTMs accidentally merging PRs which are not ready.)
GoogleCloudPlatform/oss-test-infra#2271
from kubeflow.
Regarding "Transfer Notebooks PRs and Issues to the new repo from kubeflow/kubeflow.": maybe our GSoC student @hansinikarunarathne can help with that, @rimolive.
from kubeflow.