Git Product home page Git Product logo

Comments (27)

ymmt2005 avatar ymmt2005 commented on July 21, 2024 1

Thank you.

This problem can be fixed. It's just not very easy.

Interestingly, we find the same error in our environments even though our Role resources are in Accurate managed namespaces and the accurate controller has the admin privilege for Roles.

This is caused because that Role has more privileges than ClusterRole admin.
Kubernetes API server has an additional check when editing Roles to prevent privilege escalation, so Roles are very special.
https://kubernetes.io/docs/reference/access-authn-authz/rbac/#restrictions-on-role-creation-or-update

Nevertheless, our Accurate has been working fine for us because the Roles that caused error logs were actually not intended to be propagated (if so, we'd have given necessary privileges).

Based on these, we are open to accepting a fix for this, but for us, this is not a high priority.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024 1

Not in this project, but we usually write a design doc when we work on something new.
Examples are in https://github.com/cybozu-go/moco/tree/main/docs/designdoc

You may do the same, or since this is just fixing a bug, you can also use this issue to
explain your idea for the fix.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024 1

What if that line is removed by default. And a feature gate (or flag) enables better performance, with the tradeoff of annotating additional resources in other namespaces?

I prefer a fix to skip that line if the namespace does not have labels for Accurate.
In that case, we need to reconcile all watched resources and check accurate.cybozu.com/propagate-generated annotation of their parents when an accurate label is added to a namespace.

from accurate.

zoetrope avatar zoetrope commented on July 21, 2024 1

@erikgb
Sorry for the wait. I will release it today.

from accurate.

zoetrope avatar zoetrope commented on July 21, 2024 1

released: https://github.com/cybozu-go/accurate/releases/tag/v1.2.0

from accurate.

erikgb avatar erikgb commented on July 21, 2024 1

@ymmt2005 @zoetrope The latest release (with the new feature gate enabled) seems to work very well in our POC cluster, which means we are ready to move forward. 🎉 To resolve this issue, I suggest to:

  • Deprecate the propagate-generate feature (mainly doc update)
  • Promote the DisablePropagateGenerated to Beta and enable it by default.

WDYT? If you are willing to accept PRs with these suggested changes, we should be able to remove the feature and clean up the code after a couple of releases.

from accurate.

zeroalphat avatar zeroalphat commented on July 21, 2024

@erikgb
Thanks for your feedback.
We will investigate it.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

This is fixable but not easy.
The accurate controller tries to add that annotation to record if the resource is generated from another resource, mainly for performance reasons.

If we skip this check, we have to check all watched resources when an Accurate label is added to a namespace.

With Accurate, namespaces are created dynamically by users, so statically granting permission only for a set of namespaces to the accurate controller wouldn't make sense, I guess. If we grant permissions for all namespaces, this issue is no longer a problem.

I want to mark this as by design.

from accurate.

erikgb avatar erikgb commented on July 21, 2024

I want to mark this as by design.

That would be very unfortunate and imply that Accurate is not something we can use in our clusters. And I would expect this to be problematic for a significant share of potential users of Accurate. Is this design mentioned somewhere in the docs?

By addressing, not to speak of modifying, resources in namespaces not marked as an Accurate-namespace, ref. https://cybozu-go.github.io/accurate/concepts.html#namespace-types, I think Accurate should be considered a holistic software product. Meaning that you cannot by any means migrate to/from Accurate. Your cluster must be bootstrapped with Accurate and deleted if you want to remove Accurate.

A temporary workaround could be introducing an operator config (regexp) for namespaces that Accurate should stay away from. But this would just be a minimum viable workaround and not a solution to the problem IMO.

I could be interested in participating in designing a fix for this, and performance is important. But not more important than doing the right thing. 😉

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

@erikgb
Let me ask a few questions.

  1. The accurate controller failed to modify some resources in your cluster, and errors were recorded. What was your problem with this behavior? Noisy error logs?
  2. When you give Accurate permissions only for limited namespaces, how will you give it permissions for new namespaces that your user creates via SubNamespace?

from accurate.

erikgb avatar erikgb commented on July 21, 2024
  1. The accurate controller failed to modify some resources in your cluster, and errors were recorded. What was your problem with this behavior? Noisy error logs?

Noisy error logs for sure. And since Accurate has RBAC in all namespaces I would have to check if the controller managed to add annotations to any resources of the watched kind. I can do that.

  1. When you give Accurate permissions only for limited namespaces, how will you give more permissions when a user creates a new SubNamespace?

I don't think we can/should limit the Accurate permissions to namespaces using RBAC. But IMO Accurate should never modify (or attempt to modify) resources in namespaces that are not annotated with an Accurate annotation. Could you please explain why this requirement is not possible to implement?

from accurate.

erikgb avatar erikgb commented on July 21, 2024

Based on these, we are open to accepting a fix for this, but for us, this is not a high priority.

@ymmt2005 Thanks for your feedback on this. I will discuss with my team to eventually make a contribution in this area. I would expect you to prefer a design proposal before this change is implemented in a PR. Do you have any guidelines on the structure of such designs in this project?

from accurate.

tenstad avatar tenstad commented on July 21, 2024

To make the issue a bit more precise, I suggest the following as a starting point for restricting the scope in which Accurate acts:

If neither a namespace nor any resources within it has an Accurate annotation, and no other Accurate annotations reference the namespace, Accurate should not act within the namespace.

Then I wonder:

  1. Is propagate-generated currently the onle feature causing Accurate to act outside of the scope stated above?

A very simple and temporary solution to the problem would then be allowing the feature to be disabed entirely.

  1. Would only annotating dependents with generate if their owner has the propagate-generated annotation restrict Accurate to the scope stated above?

To not have to check owners for all watched resources when Accurate is added to a namespace, cound the owners instead be watched, and their dependants be annotated generate whenever they are annotated propagate-generated? Similarly to configuring watched resource types, one would also configure which resource types can have propagate-generated. The
already existing watched list could be used, or a separate list could be added to the config.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

Is propagate-generated currently the onle feature causing Accurate to act outside of the scope stated above?

Yes, as far as I remember right now.

Would only annotating dependents with generate if their owner has the propagate-generated annotation restrict Accurate to the scope stated above?

I don't understand this question very well.

from accurate.

tenstad avatar tenstad commented on July 21, 2024

Could this line be removed:

ann[constants.AnnGenerated] = notGenerated

So that watched resources are only annotated with generate if they have an owner with the propagate-generated annotation.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

That annotation is added for better performance.
Since a resource is generated only once, we want to check the owner only once.
Without this annotation, Accurate would have to check the same resource many times.

from accurate.

tenstad avatar tenstad commented on July 21, 2024

I agree that the performance benefit is substantial! Could we still get around it somehow?

What if it was required to specify which resource types may be an owner (and have the propagate-generated annotaion)? So that owners are only looked up if they are of correct resource type? Could even configure owner resource types per resource type, to further limit lookups. E.g. {owners: {Secret: [Certificate, ExternalSecret]}}, meaning both Certificate and ExternalSecret can generate Secret.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

What if it was required to specify which resource types may be an owner (and have the propagate-generated annotaion)?

It's doable, but I don't know if it helps mitigate this reported issue.

from accurate.

tenstad avatar tenstad commented on July 21, 2024

In combination with removing the mentioned line, it could be a tradeoff between performance and resolving the issue.

Otherwise, I think we need some outside the box thinking to get this one resolved.

from accurate.

tenstad avatar tenstad commented on July 21, 2024

What if that line is removed by default. And a feature gate (or flag) enables better performance, with the tradeoff of annotating additional resources in other namespaces?

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

@erikgb @tenstad
Hello, I have some news.
It turned out that this feature, accurate.cybozu.com/propagate-generated annotation, is no longer used in our company.

So @zoetrope and I agreed that we could simply disable this feature by default or even remove it.
What do you think?

from accurate.

erikgb avatar erikgb commented on July 21, 2024

So @zoetrope and I agreed that we could simply disable this feature by default or even remove it.
What do you think?

@ymmt2005, this is really good news! 🎉 I think the end goal should be to remove the feature completely, as it will remove some critical code and make the Accurate operator even simpler and more understandable. It will definitely also simplify/improve RBAC and controller machinery. If I am not mistaken, controller-runtime will by default spin up an informer/cache (if it not already exist) for any client Get/List call. So the lookup of any owner resource can be more costly than anticipated.

How to move forward is really up to you. Our short-term goal would be to have a release of Accurate that allows us to run without this really problematic issue present. We are not prepared to promote Accurate beyond our POC/sandbox cluster without it. So as a start, I would suggest merging #92 and then cutting a release. How does that sound to you?

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

I would suggest merging #92 and then cutting a release. How does that sound to you?

I'm fine. @zoetrope what's your count?

from accurate.

erikgb avatar erikgb commented on July 21, 2024

@ymmt2005 Are you able to cut a new release soon? We would like to continue our testing of Accurate, and cannot proceed without the new feature gate introduced in #92.

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

I'm not doing that for a while. @zoetrope could you?

from accurate.

erikgb avatar erikgb commented on July 21, 2024

Thanks @zoetrope! 👍

from accurate.

ymmt2005 avatar ymmt2005 commented on July 21, 2024

I'm fine with your suggestions.

from accurate.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.