UtilSched is a Kubernetes scheduler built on the Kubernetes scheduling framework. As a sub-project of Alnair, it works with the profiling module to schedule GPU tasks according to current cluster utilization.
Later versions will also support:
- Cooperating with the elastic-training module to alleviate race conditions during scale-up and scale-down.
- Cooperating with the fine-grained-sharing module to schedule GPU tasks at a granularity of less than one GPU.
- Make sure the Kubernetes cluster version is 1.17+.
- Deploy the UtilSched scheduler:
kubectl apply -f https://raw.githubusercontent.com/YHDING23/UtilSched/master/deploy/utilsched.yaml
- Check the scheduler status:
kubectl get pods -n kube-system
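The version and status checks in the installation steps above can be scripted. The following is a minimal sketch, not part of UtilSched itself: the function names `meets_min_version` and `scheduler_running` are hypothetical, and the pod-name pattern `utilsched` is an assumption based on the manifest file name. It uses the kubelet version of the first node as a stand-in for the cluster version.

```shell
#!/bin/sh
# Hypothetical helpers for the install steps; not shipped with UtilSched.

# Succeed if a version string such as "v1.19.4" or "1.17+" is 1.17 or later.
meets_min_version() {
    ver="${1#v}"                  # drop a leading "v"
    major="${ver%%.*}"            # text before the first dot
    rest="${ver#*.}"              # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the minor version
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 17 ]; }
}

# Read `kubectl get pods` output on stdin. Succeed only if at least one pod
# whose name contains the pattern $1 exists and all such pods are Running.
scheduler_running() {
    awk -v pat="$1" '
        NR > 1 && index($1, pat) { found = 1; if ($3 != "Running") bad = 1 }
        END { if (found && !bad) exit 0; exit 1 }
    '
}

# Example use against a live cluster (requires kubectl):
#   meets_min_version "$(kubectl get nodes \
#       -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')"
#   kubectl get pods -n kube-system | scheduler_running utilsched
```

Both functions report their result through the exit status, so they can be chained with `&&` in a deployment script. If the scheduler pod is named differently in your deployment, pass that name to `scheduler_running` instead.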
- Compile:
make local
- Build the docker image:
make build
- Clean the build files:
make clean