gpu-manager's Introduction

GPU Manager

GPU Manager is used for managing the nvidia GPU devices in Kubernetes cluster. It implements the DevicePlugin interface of Kubernetes. So it's compatible with 1.9+ of Kubernetes release version.

To compare with the combination solution of nvidia-docker and nvidia-k8s-plugin, GPU manager will use native runc without modification but nvidia solution does. Besides we also support metrics report without deploying new components.

To schedule a GPU payload correctly, GPU manager should work with gpu-quota-admission which is a kubernetes scheduler plugin.

GPU manager also supports the payload with fraction resource of GPU device such as 0.1 card or 100MiB gpu device memory. If you want this kind feature, please refer to vcuda-controller project.

How to deploy GPU Manager

GPU Manager is running as daemonset, and because of the RABC restriction and hydrid cluster, you need to do the following steps to make this daemonset run correctly.

service account and clusterrole

kubectl create sa gpu-manager -n kube-system
kubectl create clusterrolebinding gpu-manager-role --clusterrole=cluster-admin --serviceaccount=kube-system:gpu-manager

label node with nvidia-device-enable=enable

kubectl label node <node> nvidia-device-enable=enable

change gpu-manager.yaml and submit

change --incluster-mode from false to true, change image field to <your repository>/public/gpu-manager:latest, add serviceAccount filed to gpu-manager-role

Recommend Projects

ppomelo / gpu-manager Goto Github PK

gpu-manager's Introduction

GPU Manager

How to deploy GPU Manager

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent