
kubernetes-device-plugins's Introduction

This repository was archived

Please read #64.

Collection of Kubernetes Device Plugins


EARLY STAGE PROJECT

(yes, the logo is literally a photo of a whiteboard drawing)


This repository aims to group various device plugins by providing unified handling for common patterns.

Current Device Plugins

Note: When testing these device plugins, make sure to enable the DevicePlugins feature gate using kubelet's --feature-gates=DevicePlugins=true

Each plugin may have its own build instructions in the linked README.md.

Creating Device Plugin

See the Device Plugin Manager documentation at https://godoc.org/github.com/kubevirt/device-plugin-manager/pkg/dpm

Develop

Build all plugins:

make build

Build specific plugins:

make build-vfio
make build-network-bridge

Test all modules:

make test

Test specific modules:

make test-cmd-vfio

Deploy local Kubernetes cluster:

make cluster-up

Destroy local Kubernetes cluster:

make cluster-down

Build all device plugin images and push them to local Kubernetes cluster registry:

make cluster-sync

Build specific device plugin image and push it to local Kubernetes cluster registry:

make cluster-sync-network-bridge

Access cluster kubectl:

./cluster/kubectl.sh ...

Access cluster node via ssh:

./cluster/cli.sh node01

Run e2e tests (on running cluster):

make functests

kubernetes-device-plugins's People

Contributors

almusil, ansiwen, dominikholler, fabiand, gbenhaim, mpolednik, nickb937, phoracek, yuvalif

kubernetes-device-plugins's Issues

Allow building with Bazel

Bazel is a build system with enough introspection to speed up builds, make them fully reproducible, and scale well. The important part for us is that Kubernetes is trying Bazel, and so is KubeVirt. If we adopt Bazel, importing device plugins into other projects that use the same build system should be extremely simple and convenient.

Extra resources:

bridge plugin crashing in the cleaner code

This happens when running the new tests from KubeVirt (see PR kubevirt/kubevirt#1342).
Run:

FUNC_TEST_ARGS='-ginkgo.focus=vmi_bridge_test -ginkgo.regexScansFilePath' make functest

logs from plugin pod:

panic: failed to bind unix socket: address already in use

goroutine 15 [running]:
github.com/kubevirt/kubernetes-device-plugins/pkg/network/bridge.(*cleaner).Run(0xc42000c1a0)
	/home/travis/gopath/src/github.com/kubevirt/kubernetes-device-plugins/pkg/network/bridge/cleaner.go:74 +0x149
created by github.com/kubevirt/kubernetes-device-plugins/pkg/network/bridge.(*NetworkBridgeDevicePlugin).Start
	/home/travis/gopath/src/github.com/kubevirt/kubernetes-device-plugins/pkg/network/bridge/plugin.go:62 +0xd3

bridge: Interface is created after Pod processes started

The interface is moved to the Pod's network namespace asynchronously once that namespace is created. Nothing blocks the Pod's containers in the meantime, so the interface may become available only after the containers have started. If a container expects the secondary interface or IP to be there, it may fail.

If someone wants to use the bridge plugin reliably, a timeout for interface appearance must be used. I don't know of any simple way to solve this bug and block until the interfaces are ready.

expose OpenVSwitch as a network resource

The Linux bridge is great, but OvS is the new kid in the 'hood.

I think that an OvS bridge should be advertised as a new type of resource, because a pod may explicitly require OvS-only functionality (e.g. setting a specific VLAN ID on a port).
We may add another "flavor" of the OvS resource if the bridge is a userspace (DPDK) one.

Rename the repo to device-plugins

You might want to rename this repo to device-plugins or similar; then we could move it to the kubevirt repo and use it as the central repo for all KubeVirt-related device plugins.

Does kvm DP need /dev and /sys and /lib/modules

The kvm README states that host /sys and /lib/modules are needed. I doubt this :)

Host /dev should definitely not be passed in, as this is exactly what the DP is there to solve: to get something into the pod (/dev/kvm in this case) if it's not there.

/sys and /lib/modules - I don't see a reason. What are the reasons?

The plugins are missing tests

Currently, there are no tests for the plugins themselves and only a simple test for the manager. This should be our top priority, as having untested code handle devices in the system is extremely dangerous.

General ideas:

  • let's consider gomega/ginkgo for BDD (depends on setup/teardown complexity compared to gotest)
  • mock Kubelet's device plugin gRPC API to have properly tested manager code
  • have unit + functional tests of the plugins (subject to hw requirements)

align kubetron code with network bridge code

Use the findPodUID code from https://github.com/phoracek/kubetron/blob/master/pkg/deviceplugin/deviceplugin.go
to find the pod requesting the network.

Turn up the interfaces created by the plugin (both the one on the node and the one inside the pod), similarly to what is done here: https://github.com/phoracek/kubetron/blob/master/cmd/deviceplugin/attach-pod
Investigate the option of using netlink functions to do that in Go instead of invoking the attach-pod script. There may be an issue around modifying the internal interface.

Investigate a way to pass other parameters (e.g. MAC, IP, etc.) to configure the interface.

Failed to build vfio

I am trying to compile the vfio module, but got the following error:

cd cmd/vfio && go fmt && go vet && go install -v
vet: ./vfio.go:46:28: cannot use (pci.PCILister literal) (value of type pci.PCILister) as dpm.ListerInterface value in argument to dpm.NewManager: wrong type for method Discover
Makefile:20: recipe for target 'build-vfio' failed
make: *** [build-vfio] Error 2

Please help to check

tls error on log retrieval and exec (prevents debugging)

I managed to deploy the net plugin on minikube but failed to retrieve logs from the pod:

$ k get pods
NAME                                 READY     STATUS    RESTARTS   AGE
device-plugin-network-bridge-zv9m2   1/1       Running   0          4m
[fabiand@tee kubernetes-device-plugins (smallerThings)]$ k describe node minikube
Name:               minikube
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=minikube
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             <none>
CreationTimestamp:  Mon, 19 Feb 2018 21:37:40 +0100
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Mon, 19 Feb 2018 21:42:21 +0100   Mon, 19 Feb 2018 21:37:40 +0100   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Mon, 19 Feb 2018 21:42:21 +0100   Mon, 19 Feb 2018 21:37:40 +0100   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 19 Feb 2018 21:42:21 +0100   Mon, 19 Feb 2018 21:37:40 +0100   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Mon, 19 Feb 2018 21:42:21 +0100   Mon, 19 Feb 2018 21:37:50 +0100   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.39.49
  Hostname:    minikube
Capacity:
 alpha.kubernetes.io/nvidia-gpu:       0
 bridge.network.kubevirt.io/mybridge:  100
 cpu:                                  2
 ephemeral-storage:                    16058792Ki
 hugepages-2Mi:                        0
 memory:                               1951720Ki
 pods:                                 110
Allocatable:
 alpha.kubernetes.io/nvidia-gpu:       0
 bridge.network.kubevirt.io/mybridge:  100
 cpu:                                  2
 ephemeral-storage:                    14799782683
 hugepages-2Mi:                        0
 memory:                               1849320Ki
 pods:                                 110
System Info:
 Machine ID:                 1ddc6e1c70184bdfb45c4ae5e85a1a5f
 System UUID:                1DDC6E1C-7018-4BDF-B45C-4AE5E85A1A5F
 Boot ID:                    5f14d76a-99da-4375-a78d-3024c9f2960b
 Kernel Version:             4.9.64
 OS Image:                   Buildroot 2017.11
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.9.0
 Kubelet Version:            v1.9.0
 Kube-Proxy Version:         v1.9.0
ExternalID:                  minikube
Non-terminated Pods:         (5 in total)
  Namespace                  Name                                     CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                                     ------------  ----------  ---------------  -------------
  default                    device-plugin-network-bridge-zv9m2       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-addon-manager-minikube              5m (0%)       0 (0%)      50Mi (2%)        0 (0%)
  kube-system                kube-dns-54cccfbdf8-nzghz                260m (13%)    0 (0%)      110Mi (6%)       170Mi (9%)
  kube-system                kubernetes-dashboard-77d8b98585-kfg22    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                storage-provisioner                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  265m (13%)    0 (0%)      160Mi (8%)       170Mi (9%)
Events:
  Type     Reason                        Age                 From                  Message
  ----     ------                        ----                ----                  -------
  Normal   Starting                      17m                 kubelet, minikube     Starting kubelet.
  Normal   Starting                      17m                 kube-proxy, minikube  Starting kube-proxy.
  Warning  ImageGCFailed                 7m (x2 over 12m)    kubelet, minikube     failed to get imageFs info: unable to find data for container /
  Normal   NodeHasSufficientDisk         4m (x2 over 4m)     kubelet, minikube     Node minikube status is now: NodeHasSufficientDisk
  Normal   NodeHasSufficientMemory       4m (x2 over 4m)     kubelet, minikube     Node minikube status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure         4m (x2 over 4m)     kubelet, minikube     Node minikube status is now: NodeHasNoDiskPressure
  Normal   NodeAllocatableEnforced       4m                  kubelet, minikube     Updated Node Allocatable limit across pods
  Normal   NodeReady                     4m                  kubelet, minikube     Node minikube status is now: NodeReady
  Warning  FailedToStartNodeHealthcheck  18s (x18 over 17m)  kube-proxy, minikube  Failed to start node healthz on 0: listen tcp: address 0: missing port in address
[fabiand@tee kubernetes-device-plugins (smallerThings)]$ k logs device-plugin-network-bridge-zv9m2
Error from server: Get https://minikube:10250/containerLogs/default/device-plugin-network-bridge-zv9m2/device-plugin-network-bridge: remote error: tls: internal error

Auto-reload on configmap change.

We should add functionality to reload whenever the configmap is modified so that the pod doesn't need to be deleted and re-created. Definitely low priority and a quality of life improvement.

Insufficient Network Bridge

Once the device-plugin-network-bridge container restarts, no pods that consume a network bridge can be scheduled. Side note: any pods that were already up and running with bridges continue to work.

Name:         bridge-consumer-3
Namespace:    default
Node:         <none>
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"bridge-consumer-3","namespace":"default"},"spec":{"containers":[{"command":["/bin/...
Status:       Pending
IP:
Containers:
  fedora:
    Image:  fedora
    Port:   <none>
    Command:
      /bin/sleep
      999999
    Limits:
      bridge.network.kubevirt.io/mybr0:  1
    Requests:
      bridge.network.kubevirt.io/mybr0:  1
    Environment:                         <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-fsk5g (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-fsk5g:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-fsk5g
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  11s (x6 over 26s)  default-scheduler  0/1 nodes are available: 1 Insufficient bridge.network.kubevirt.io/mybr0.

Bridge DP waits forever for failed pods

When pod deployment fails in the middle (after resource was allocated), bridge device plugin waits for it to appear forever.

This problem is related to the temporary netns-access hack. Do we want to fix it?

[READ IF YOU USE THIS] This repository may get archived

Hello,

in KubeVirt we used to use kubernetes-device-plugins [1] to experiment
with device plugins to:

  • connect pods to Linux bridges
  • pass PCI devices to containers
  • expose kvm device to containers

The reality is that this repo was not touched (except for some small
adjustments) in the last 2 years. It was used only to host device plugins
used within kubevirt. Is there any reason not to make this
repository read-only? Anybody using it on this forum? If nobody votes
against it in the next two weeks, I will archive the repo (note that it won't
be deleted, just set to read-only and marked as discontinued).

Note that I don't want to archive the device-plugin-manager [1] which
seems to be vendored by folks.

Regards,
Petr

[1] https://github.com/kubevirt/device-plugin-manager

Use dpm

DPM was factored out; the plugins should now move to consume the external version.

Linux only builds

While working on a PR I noticed this doesn't compile on OSX. My way around this was to add a multi-stage build Dockerfile that builds the project inside a golang container.

FROM golang:1.10.3 as builder
WORKDIR /go/src/github.com/kubevirt/kubernetes-device-plugins
COPY kubernetes-device-plugins .
RUN cd cmd/network/bridge && go fmt && go vet && go install -v


FROM fedora:27
RUN dnf install -y iproute
COPY --from=builder /go/bin/bridge .
ENTRYPOINT [ "./bridge", "-v", "3", "-logtostderr"]

Furthermore, is there a reason fedora:27 is being used over, say, alpine? The image size could go down substantially, to ~7-20 MB.

Failed to launch pod with network

I attached a bridge to a pod:

apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    resources:
      limits:
        bridge.network.kubevirt.io/mybridge: 1

but it fails with:

2018-02-19 21:44:02 +0100 CET   2018-02-19 21:44:02 +0100 CET   1         demo-pod.1514d4e22fea75ad   Pod       spec.containers{nginx}   Normal    Pulled    kubelet, minikube   Successfully pulled image "nginx:latest"
2018-02-19 21:44:02 +0100 CET   2018-02-19 21:44:02 +0100 CET   1         demo-pod.1514d4e234b0687f   Pod       spec.containers{nginx}   Normal    Created   kubelet, minikube   Created container
2018-02-19 21:44:02 +0100 CET   2018-02-19 21:44:02 +0100 CET   1         demo-pod.1514d4e237b96a1c   Pod       spec.containers{nginx}   Warning   Failed    kubelet, minikube   Error: failed to start container "nginx": Error response from daemon: linux runtime spec devices: error gathering device information while adding custom device "/tmp/deviceplugin-network-bridge-fakedev": no such file or directory

I wonder where fakedev comes from.
