Comments (16)

jcaamano commented on September 28, 2024

I think we should answer these questions:

  1. What problem are we trying to solve, or how do we think this alternative is better than a config map?
  2. Under which name would resources requested through the NAD be advertised?
  3. How would different configuration parameters, like macvtap mode and resource capacity, be provided through the NAD?
  4. If we agree that we are using the NAD for something it was not originally designed for, and depending on the answers above, do we have any concerns about how future-proof this option is?
  5. The device plugin itself could be completely independent and could potentially be used without multus or the CNI, even if we don't see a use case right now. It is one thing to ship the DP and CNI together for convenience, and another to write code that ties them together.

from macvtap-cni.

jcaamano commented on September 28, 2024

Another option could be to have an operator that watches NADs (with specific annotations) and updates the config map. We would still need to tackle the config map updating process first and some of the above questions still stand.


maiqueb commented on September 28, 2024

I think we should answer these questions:

  1. What problem are we trying to solve, or how do we think this alternative is better than a config map?

Simplify the configuration of the macvtap interfaces.

Independently of how you see this, the whole business of the macvtap DS requiring the configmap to exist before the DS is deployed is just not OK. At least that must change.

Since I don't envision a single use case of the device plugin without multus, I think rallying all configuration around the NAD is a nice-to-have, since it simplifies the configuration - which is the objective.

  2. Under which name would resources requested through the NAD be advertised?

This is a good question.

We could add an attribute to the macvtap config - e.g. 'name' / 'resourceName'.

Or, if for some reason we don't want to add more attributes, we could possibly get away with something ugly like:

fmt.Sprintf("%s_%s", macvtapConfig.master, macvtapConfig.mode)
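As a runnable sketch, that derivation would look like the following (the struct and its fields are stand-ins for illustration; macvtap-cni's actual types may differ):

```go
package main

import "fmt"

// macvtapConfig is a hypothetical stand-in for the parsed CNI config;
// only the two fields used in the name derivation are shown.
type macvtapConfig struct {
	master string
	mode   string
}

// resourceName derives an advertised resource name from the master
// interface and the macvtap mode, as in the one-liner above.
func resourceName(c macvtapConfig) string {
	return fmt.Sprintf("%s_%s", c.master, c.mode)
}

func main() {
	fmt.Println(resourceName(macvtapConfig{master: "eth0", mode: "bridge"}))
	// prints "eth0_bridge"
}
```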
  3. How would different configuration parameters, like macvtap mode and resource capacity, be provided through the NAD?

I actually don't think 'capacity' is that important; we could just hardcode it. But OK, that's beside the point; it would go into the spec's 'config' attribute, as we had originally.

As they were initially, e.g.

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvtap0
  annotations:
    k8s.v1.cni.cncf.io/resourceName: macvtap.network.kubevirt.io/eth0
spec: 
  config: '{
      "cniVersion": "0.3.1",
      "name": "eth0",  # new attribute 
      "type": "macvtap",
      "master": "eth0",
      "mode": "bridge",
      "mtu": 1500
    }'
  4. If we agree that we are using the NAD for something it was not originally designed for, and depending on the answers above, do we have any concerns about how future-proof this option is?

I think you're twisting the words here a bit - at least making it sound worse than it actually is. We're using the NAD exactly for what it was designed for. But, as part of the macvtap configuration, we're exposing it as a host resource.
I don't think this is much different from when you use a NAD to create a linux bridge on the host (I understand you can use the linux-bridge CNI without multus).

  5. The device plugin itself could be completely independent and could potentially be used without multus or the CNI, even if we don't see a use case right now. It is one thing to ship the DP and CNI together for convenience, and another to write code that ties them together.

Right. No answer for this one. But I think the strength of this argument is directly proportional to the number of use cases without multus.


maiqueb commented on September 28, 2024

Another option could be to have an operator that watches NADs (with specific annotations) and updates the config map. We would still need to tackle the config map updating process first and some of the above questions still stand.

This is what we're proposing, but without the configmap indirection.

Since this approach 'hides' the configmap from the user, and allows for the k8s device plugin framework to remain independent from multus, I'm quite OK with it.

While I still don't see a reason for forcefully keeping the DP independent from multus (e.g. no use cases), making them closely coupled does sound like a poor choice.

Finally, the cost of the solution seems to be the same.


jcaamano commented on September 28, 2024

Independently of how you see this, the whole business of the macvtap DS requiring the configmap to exist before the DS is deployed is just not OK. At least that must change.

This should not be happening right now. You can create the config map after the DS and things should work after that.

  2. Under which name would resources requested through the NAD be advertised?

This is a good question.

We could add an attribute to the macvtap config - e.g. 'name' / 'resourceName'.

The question goes more in regards to:

  • the fact that we need to end up with the resource name annotation in the NAD, and whether we will be able to inject it before pods are scheduled or whether we will need this annotation from the get go.
  • are we thinking of a 1-to-1 mapping between NAD and resource?
  4. If we agree that we are using the NAD for something it was not originally designed for, and depending on the answers above, do we have any concerns about how future-proof this option is?

I think you're twisting the words here a bit - at least making it sound worse than it actually is. We're using the NAD exactly for what it was designed for. But, as part of the macvtap configuration, we're exposing it as a host resource.
I don't think this is much different from when you use a NAD to create a linux bridge on the host (I understand you can use the linux-bridge CNI without multus).

For example, the NAD could be expanded in the future to add additional CNI network parameters (vlan, etc...) that make little to no sense for the device plugin and that could end up needlessly exploding the number of different resources advertised. Unless we are talking about consolidating all the NADs into the minimum DP config set required, which would be a complication in itself, or compromising by assuming this is unlikely to happen. But I hope that we agree that the NAD is not meant to configure DPs, that DPs & CNIs are considered standard in kubernetes while NADs & multus are not, etc...


jcaamano commented on September 28, 2024

Another option could be to have an operator that watches NADs (with specific annotations) and updates the config map. We would still need to tackle the config map updating process first and some of the above questions still stand.

This is what we're proposing, but without the configmap indirection.

Since this approach 'hides' the configmap from the user, and allows for the k8s device plugin framework to remain independent from multus, I'm quite OK with it.

Ok, now I realize that we want to make it look like we never had to use a device plugin :P But we had. As much as I would have liked the DP not to be needed for macvtap-cni, I would not go as far as making NADs the final word on the config of the DP.

I don't see the config map as much of a problem. For me it's pretty simple & static configuration, at least for now, and maybe that tricks us into thinking we could rely solely on NADs. We talked about a use case on per-node policies, which would be something pretty DP specific. So the DP has the potential to evolve in ways that have nothing to do with the CNI, just because it is a DP, and I am not sure whether configuration through NADs is going to make this more or less complicated.

While I still don't see a reason for forcefully keeping the DP independent from multus (e.g. no use cases), making them closely coupled does sound like a poor choice.

Well, I wouldn't say forcefully. They are independent now, and why would we forego such a wonderful thing :P Not knowing about other use cases does not mean there aren't any. If I want to learn about them in the future, the better approach for me is to keep my ignorant mind open.

Anyway, this is the macvtap-cni, and NADs are a de facto standard, so it wouldn't be that bad. If nobody shares my concerns, I am ok to go along with it.


phoracek commented on September 28, 2024

I think we should answer these questions:

1. What problem are we trying to solve, or how do we think this alternative is better than a config map?

NAD is the de-facto standard for an admin to expose a network to users. It is integrated into management tools and is well known. A ConfigMap is a generic object: it would require more explanation to users and would make it harder for us to integrate into a UI. A ConfigMap is good for an admin adjusting a deployment, but here we are talking about dynamic day-2 control of available resources. Also, a NAD is created by an admin and specifies which network is available to users and how to connect to it. From that definition, it fits the DP object pretty well.

2. Under which name would resources requested through the NAD be advertised?

Can be NAD.Namespace/NAD.Name to make it unique and directly bound to the NAD.
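That scheme could be as simple as the following sketch (the helper name is made up; note also that Kubernetes extended resource names have their own syntax constraints, so the exact separator might need adjusting):

```go
package main

import "fmt"

// resourceNameFor derives a resource name from a NAD's namespace and
// name, as suggested above (hypothetical helper, not the macvtap-cni API).
func resourceNameFor(namespace, name string) string {
	return fmt.Sprintf("%s/%s", namespace, name)
}

func main() {
	fmt.Println(resourceNameFor("default", "macvtap0"))
	// prints "default/macvtap0"
}
```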

3. How would different configuration parameters, like macvtap mode and resource capacity, be provided through the NAD?

Although it is not as convenient as having a dedicated API type, this would need to be done through NAD annotations or the config JSON.

4. If we agree that we are using the NAD for something it was not originally designed for, and depending on the answers above, do we have any concerns about how future-proof this option is?

NAD is stupid. It just carries a JSON config. I'm not too afraid. But there is some risk.

Another option could be to have an operator that watches NADs (with specific annotations) and updates the config map. We would still need to tackle the config map updating process first and some of the above questions still stand.

I am all for keeping this project as simple as possible. Introducing a controller translating one object to another carries risks of races and would require us to deploy another Deployment handling this.

This should not be happening right now. You can create the config map after the DS and things should work after that.

I don't see the config map as much of a problem. For me it's pretty simple & static configuration, at least for now and that maybe tricks us into thinking we could rely solely on NADs.

It is not static though. And those simple config map options we have now are not production ready. We need to be able to deploy this plugin without any day-1 configuration. It should just be installed on the cluster. Only on day 2 would an administrator create configuration object X to expose selected NICs to users in namespace Y.

Unless we mount the ConfigMap as a volume (not as an env variable), it would fail. Solvable though.
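The volume-mount variant could look roughly like this in the DaemonSet pod spec (a sketch; the container, volume, and ConfigMap names are illustrative). Kubelet refreshes files projected from a ConfigMap volume when the ConfigMap changes (unless mounted via subPath), whereas environment variables are fixed when the container starts:

```yaml
spec:
  containers:
    - name: macvtap-dp
      volumeMounts:
        - name: dp-config
          mountPath: /etc/macvtap   # DP re-reads its config from here
  volumes:
    - name: dp-config
      configMap:
        name: macvtap-deviceplugin-config   # illustrative name
```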

For example, the NAD could be expanded in the future to add additional CNI network parameters (vlan, etc...) that make little to no sense for the device plugin and that could end up exploding the amount of different resources advertised needlessly. Unless we are talking about consolidating all the NADs into the minimum DP config set required, which would be a complication on itself, or compromising into thinking this is unlikely to happen. But I hope that we agree that the NAD is not meant to configure DPs, DP & CNIs are considered standard in kubernetes, NADs & multus are not, etc...

Fair point. I am all for simplicity, but having everything hidden in the NAD may be an abstraction issue, as you describe. We can look at the SR-IOV operator for reference. They have SriovNetwork (the NAD in our case) and SriovNetworkNodePolicy (our ConfigMap). Maybe it would be worth considering creating our own MacvtapNetworkNodePolicy and trying to list as many options as we can think of that would fit there. This should not be more difficult to implement than the NAD-only approach. We just need to define a new CRD MacvtapNetworkNodePolicy and set up a controller in the DP listing all of these objects and checking whether they match the node. Then we would have the following:

policy.yaml:

apiVersion: ...
kind: MacvtapNetworkNodePolicy
metadata:
  name: policy-1
  namespace: macvtap-cni
spec:
  nicSelector:
    ...
  nodeSelector:
    ...
  resourceName: RESOURCE_NAME

nad.yaml:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvtap0
  annotations:
    k8s.v1.cni.cncf.io/resourceName: macvtap.network.kubevirt.io/RESOURCE_NAME  # <---
spec: 
  config: '{
      "cniVersion": "0.3.1",
      "type": "macvtap",
      "mode": "bridge",
      "mtu": 1500
    }'

But note that by adding a new object we may make it more difficult to integrate this solution into existing management platforms. It may also be more challenging to secure (with the NAD, that has already been done for us).
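The node-matching piece of such a controller is small. A minimal sketch, assuming plain label-equality semantics for the policy's nodeSelector (function and parameter names are made up, not the real API):

```go
package main

import "fmt"

// policyMatchesNode checks whether every key/value pair in a policy's
// nodeSelector is present in the node's labels - the check a DP
// controller would run for a MacvtapNetworkNodePolicy-style object.
func policyMatchesNode(nodeSelector, nodeLabels map[string]string) bool {
	for k, v := range nodeSelector {
		if nodeLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	labels := map[string]string{"kubernetes.io/hostname": "node-1", "feature/macvtap": "true"}
	fmt.Println(policyMatchesNode(map[string]string{"feature/macvtap": "true"}, labels))
	// prints "true"
}
```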


jcaamano commented on September 28, 2024

And those simple config map options we have now are not production ready.

If the problem is that they are overly simplistic, I don't see how the NAD solves it. But I might have misunderstood this.

We need to be able to deploy this plugin without any day-1 configuration. It should be just installed on the cluster.

Agreed, but I don't see how this is the responsibility of the DP itself. There are surely other DPs around that are configurable, and that is not an intrinsic problem of a DP.

It is not static though.

A deployment has a finite number of interfaces and macvtap has a finite number of operating modes. Once initially set on a deployment, I don't see it changing much. NADs are day 2, but I am not convinced that the set of physical host interfaces on top of which you want to set up your macvtap networks is.

Also, NAD is created by admin and specifies which network is available to users and how to connect to it. From that definition, it fits pretty well for the DP object.

This might work for the simplest of the capabilities. But again, the NAD/CNI can have parameters that don't concern the DP and vice versa. The CNI side of things right now has nothing that is macvtap specific; it looks more like a network host-device plugin that is ready to work with multus. Who knows if this could evolve into, or be replaced by, something like that which is richer in configuring the network in different ways.

I am all for keeping this project as simple as possible. Introducing a controller translating one object to another carries risks of races and would require us to deploy another Deployment handling this.

I am not proposing a pattern that has not been used elsewhere. The SR-IOV operator, which we have already mentioned, automates the SR-IOV device plugin configuration. But it was just a suggestion. Probably the point I am making is for the automation to be optional and as independent as possible from the DP and the CNI.

Another suggestion I have is to enable an optional mode in the DP such that, when no configuration is provided, it exposes all interfaces in all modes, or in a default mode, so that resources <nic>.<mode>.macvtap.network.kubevirt.io are readily available from the get go.
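That "no configuration" default could be sketched as follows (NIC discovery is stubbed with a fixed slice; a real DP would enumerate host interfaces; the four mode strings are the standard macvtap operating modes):

```go
package main

import "fmt"

// defaultResourceNames builds the proposed default resource names
// <nic>.<mode>.macvtap.network.kubevirt.io for every NIC and every
// macvtap operating mode (illustrative sketch, not macvtap-cni code).
func defaultResourceNames(nics []string) []string {
	modes := []string{"bridge", "vepa", "private", "passthru"}
	var names []string
	for _, nic := range nics {
		for _, mode := range modes {
			names = append(names, fmt.Sprintf("%s.%s.macvtap.network.kubevirt.io", nic, mode))
		}
	}
	return names
}

func main() {
	// prints the four eth0.<mode>.macvtap.network.kubevirt.io names
	for _, n := range defaultResourceNames([]string{"eth0"}) {
		fmt.Println(n)
	}
}
```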

Concluding, I am clearly not convinced. But if my concerns are mine alone, would it seem reasonable to compromise on at least the following:

  • Make the configuration from NAD optional
  • Keep the explicit configuration option as well
  • Keep the implementation as independent as conveniently possible.


jcaamano commented on September 28, 2024

Another suggestion I have is to enable an optional mode in the DP such that, when no configuration is provided, it exposes all interfaces in all modes, or in a default mode, so that resources <nic>.<mode>.macvtap.network.kubevirt.io are readily available from the get go.

Relevant to this option, I submitted PR#527 to netlink to allow editing the operating mode from the CNI. That would make it possible to simplify the default resources to <nic>.macvtap.network.kubevirt.io


phoracek commented on September 28, 2024

A deployment has a finite number of interfaces and macvtap has a finite number of operating modes. Once initially set on a deployment, I don't see it changing much. NADs are day 2, but I am not convinced that the set of physical host interfaces on top of which you want to set up your macvtap networks is.

New nodes are added to the cluster and new NICs are plugged in. If you have an existing cluster and in a month decide to add an application requiring macvtap, you don't want to remove everything and start again.

This might work for the simplest of the capabilities. But again, the NAD/CNI can have parameters that don't concern the DP and vice versa. The CNI side of things right now has nothing that is macvtap specific; it looks more like a network host-device plugin that is ready to work with multus. Who knows if this could evolve into, or be replaced by, something like that which is richer in configuring the network in different ways.

I don't think the user-facing API should be dictated by the underlying implementation. As an administrator, I don't care at all who is consuming my objects. I want a comfortable API that makes sense to me from a logical point of view. This is not to support NAD-only or ConfigMaps; I would just prefer if we talked in terms of the user-facing API, not the underlying implementation.

Another suggestion I have is to enable an optional mode in the DP such that, when no configuration is provided, it exposes all interfaces in all modes, or in a default mode, so that resources <nic>.<mode>.macvtap.network.kubevirt.io are readily available from the get go.

I am really hesitant to add multiple modes before we have at least one of them done and working. We should pick one or the other and implement the full application on top of it. We can add more tweaks once that's done.

I'm sharing some of your concerns. This clearly needs more design, but outside of GitHub issues, which are not the easiest to track; maybe a design document would be a better approach, starting with agreement on basic requirements and building on top of that. Until then, I'm fine with the "expose everything" approach to unblock testing of this on KubeVirt. What do you think?


jcaamano commented on September 28, 2024

Until then, I'm fine with "expose everything" approach to unblock testing of this on KubeVirt. What do you think?

I am ok with this. I am unaware of how this is blocking testing; can you elaborate, or are we tracking this somewhere else?


phoracek commented on September 28, 2024

I am ok with this. I am unaware of how this is blocking testing; can you elaborate, or are we tracking this somewhere else?

We need to finish the MVP here, including automating the building of images. Then we have to attach that to cluster-network-addons-operator, and only then will we be able to finish the binding PR on kubevirt, including test automation.


maiqueb commented on September 28, 2024

Just so there's a way forward, could we agree on:

  • start a design document about the user-facing API
  • update the DP so that, if no explicit config is provided on startup, it exposes all host interfaces (in a single mode, no need for all of them)

The requirements for the user facing API would be:

  • Make the implicit configuration optional (from NAD / policy object / whatever)
  • Keep the explicit configuration option as well (explicit DP config would override the implicit one)
  • Keep the DP as independent from the CNI / multus as possible

A final comment; I like the proposal of exposing all host interfaces - because it doesn't require configuration and is dead simple - but I also feel that exposing all interfaces is exposing too much. Having said that, I understand it could be explicitly overridden via configuration... IDK, might be enough.


phoracek commented on September 28, 2024

Just so there's a way forward, could we agree on:

Works for me.

The requirements for the user facing API would be:

Let's argue about requirements on the design document :)

exposing all interfaces feels like exposing too much

We can expose all physical NICs and bondings. I think that is a reasonable limitation. "All" would mean exposing XXXX veth ifaces and other weirdness from the host. What troubles me is that with heterogeneous clusters, "eth1" on one node may be connected to a completely different network on another node.
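One heuristic for that filter: on Linux, physical devices expose a `device` symlink under `/sys/class/net/<iface>/`, and bond masters expose a `bonding` subdirectory. A sketch along those lines (an assumption about a reasonable check, not the macvtap-cni implementation):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isPhysicalOrBond reports whether an interface looks like a physical
// NIC (has a "device" symlink in sysfs) or a bond master (has a
// "bonding" directory). veth pairs, bridges, etc. have neither.
func isPhysicalOrBond(iface string) bool {
	base := filepath.Join("/sys/class/net", iface)
	if _, err := os.Stat(filepath.Join(base, "device")); err == nil {
		return true
	}
	if _, err := os.Stat(filepath.Join(base, "bonding")); err == nil {
		return true
	}
	return false
}

func main() {
	fmt.Println(isPhysicalOrBond("no-such-iface0"))
	// prints "false": a nonexistent interface has neither sysfs entry
}
```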

OpenShift ships the macvlan CNI by default; maybe we can look to their side for inspiration. In any case, starting with the assumption that a homogeneous cluster is used is fine with me.


maiqueb commented on September 28, 2024

Opened PR #10; we should move the discussion there.

I've started it with the minimal set of things that I think we're agreeing on.


jcaamano commented on September 28, 2024

Closing in favor of #10.

