
scheduler-plugins's Introduction


Scheduler Plugins

Repository for out-of-tree scheduler plugins based on the scheduler framework.

This repo provides scheduler plugins that are exercised in large companies. These plugins can be vendored as Golang SDK libraries or used out of the box via the pre-built images or Helm charts. Additionally, this repo incorporates best practices and utilities for composing a high-quality scheduler plugin.

Install

Container images are available in the official scheduler-plugins k8s container registry. There are two images: one for the kube-scheduler and one for the controller. See the Compatibility Matrix section for the complete list of images.

docker pull registry.k8s.io/scheduler-plugins/kube-scheduler:$TAG
docker pull registry.k8s.io/scheduler-plugins/controller:$TAG

You can find instructions on how to install the release images here.

Plugins

The kube-scheduler binary includes the below list of plugins. They can be configured by creating one or more scheduler profiles.

Additionally, the kube-scheduler binary includes the below list of sample plugins. These plugins are not intended for use in production environments.

Compatibility Matrix

The below compatibility matrix shows the k8s client package (client-go, apimachinery, etc) versions that the scheduler-plugins are compiled with.

The minor version of the scheduler-plugins matches the minor version of the k8s client packages that it is compiled with. For example scheduler-plugins v0.18.x releases are built with k8s v1.18.x dependencies.

The scheduler-plugins patch versions come in two varieties (single digit or three digits). The single-digit patch versions (e.g., v0.18.9) exactly align with the k8s client package versions that the scheduler plugins are built with. The three-digit patch versions, which are built on demand (e.g., v0.18.800), are used to indicate that the k8s client package versions have not changed since the previous release, and that only scheduler-plugins code (features or bug fixes) was changed.

| Scheduler Plugins | Compiled With k8s Version | Container Image | Arch |
|---|---|---|---|
| v0.28.9 | v1.28.9 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.28.9 | AMD64, ARM64 |
| v0.27.8 | v1.27.8 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.27.8 | AMD64, ARM64 |
| v0.26.7 | v1.26.7 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.26.7 | AMD64, ARM64 |
| v0.25.12 | v1.25.12 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.25.12 | AMD64, ARM64 |
| v0.24.9 | v1.24.9 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.24.9 | AMD64, ARM64 |
| v0.23.10 | v1.23.10 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.23.10 | AMD64, ARM64 |
| v0.22.6 | v1.22.6 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.22.6 | AMD64, ARM64 |
| v0.21.6 | v1.21.6 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.21.6 | AMD64, ARM64 |
| v0.20.10 | v1.20.10 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.20.10 | AMD64, ARM64 |
| v0.19.9 | v1.19.9 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.19.9 | AMD64, ARM64 |
| v0.19.8 | v1.19.8 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.19.8 | AMD64, ARM64 |
| v0.18.9 | v1.18.9 | registry.k8s.io/scheduler-plugins/kube-scheduler:v0.18.9 | AMD64 |

| Controller | Compiled With k8s Version | Container Image | Arch |
|---|---|---|---|
| v0.28.9 | v1.28.9 | registry.k8s.io/scheduler-plugins/controller:v0.28.9 | AMD64, ARM64 |
| v0.27.8 | v1.27.8 | registry.k8s.io/scheduler-plugins/controller:v0.27.8 | AMD64, ARM64 |
| v0.26.7 | v1.26.7 | registry.k8s.io/scheduler-plugins/controller:v0.26.7 | AMD64, ARM64 |
| v0.25.12 | v1.25.12 | registry.k8s.io/scheduler-plugins/controller:v0.25.12 | AMD64, ARM64 |
| v0.24.9 | v1.24.9 | registry.k8s.io/scheduler-plugins/controller:v0.24.9 | AMD64, ARM64 |
| v0.23.10 | v1.23.10 | registry.k8s.io/scheduler-plugins/controller:v0.23.10 | AMD64, ARM64 |
| v0.22.6 | v1.22.6 | registry.k8s.io/scheduler-plugins/controller:v0.22.6 | AMD64, ARM64 |
| v0.21.6 | v1.21.6 | registry.k8s.io/scheduler-plugins/controller:v0.21.6 | AMD64, ARM64 |
| v0.20.10 | v1.20.10 | registry.k8s.io/scheduler-plugins/controller:v0.20.10 | AMD64, ARM64 |
| v0.19.9 | v1.19.9 | registry.k8s.io/scheduler-plugins/controller:v0.19.9 | AMD64, ARM64 |
| v0.19.8 | v1.19.8 | registry.k8s.io/scheduler-plugins/controller:v0.19.8 | AMD64, ARM64 |

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

You can reach the maintainers of this project at:

You can find instructions on how to build and run an out-of-tree plugin here.

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

scheduler-plugins's People

Contributors

alexeyperevalov, atantawi, cwdsuzhou, denkensk, everpeace, ffromani, fish-pro, gekko0114, googs1025, huang-wei, janeliul, jpedro1992, k8s-ci-robot, kunwuluan, lianghao208, my-git9, nayihz, piotrprokop, pravarag, salmanyam, sanposhiho, seanmalloy, suigh, swatisehgal, tal-or, tenzen-y, wangchen615, yibozhuang, yuanchen8911, zwpaper


scheduler-plugins's Issues

`make autogen` does not generate deepcopy and register code for new config args types

I am writing a scheduler plugin which needs to be configured with args. I defined the args type both in pkg/apis/config/types.go and pkg/apis/config/v1beta1/types.go and then ran make autogen. But nothing happened: no error, no generated code. Running hack/update-autogen.sh did not work either.

I did write the comment // +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object before the type definition.

❯ make autogen
hack/update-vendor.sh
hack/update-generated-openapi.sh
Generating Kubernetes OpenAPI

❯ ./hack/update-codegen.sh
Generating deepcopy funcs
Generating defaulters
Generating conversions
Generating deepcopy funcs
Generating clientset for scheduling:v1alpha1 at sigs.k8s.io/scheduler-plugins/pkg/generated/clientset
Generating listers for scheduling:v1alpha1 at sigs.k8s.io/scheduler-plugins/pkg/generated/listers
Generating informers for scheduling:v1alpha1 at sigs.k8s.io/scheduler-plugins/pkg/generated/informers
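
For reference, a minimal sketch of an annotated args type is below; the type name and field are hypothetical, not from this repo. The `+k8s:deepcopy-gen:interfaces` marker is what tells deepcopy-gen to emit a DeepCopyObject method for the type, assuming the generator is actually run over the package containing it.

```go
package config

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// MyPluginArgs holds arguments for a hypothetical plugin.
//
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
type MyPluginArgs struct {
	metav1.TypeMeta `json:",inline"`

	// Threshold is an example field.
	Threshold int32 `json:"threshold,omitempty"`
}
```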

Custom percentageOfNodesToScore in a PreFilter plugin

The scheduler option percentageOfNodesToScore controls how many nodes are checked when scheduling a pod. It has an important impact on scheduling performance.

To better balance the scheduling performance and quality to meet different scheduling needs of diverse workloads, an idea is to introduce a PreFilter plugin that updates the default global value if a custom threshold is specified through a Pod label.

This plugin sets the value of percentageOfNodesToScore according to the value associated with a label. For example,

parameter.scheduling.sigs.k8s.io/percentageOfNodesToScore: 10
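
For illustration only, here is a sketch of how such a label might be parsed from a Pod; the helper name is hypothetical, and actually applying the parsed value to the scheduler's options is exactly the part the framework does not expose today (see question 2 below).

```go
package main

import (
	"fmt"
	"strconv"

	v1 "k8s.io/api/core/v1"
)

const percentageLabel = "parameter.scheduling.sigs.k8s.io/percentageOfNodesToScore"

// percentageFromPod reads the proposed label and returns the custom threshold,
// or ok=false when the label is absent or out of range.
func percentageFromPod(pod *v1.Pod) (int32, bool) {
	raw, exists := pod.Labels[percentageLabel]
	if !exists {
		return 0, false
	}
	v, err := strconv.ParseInt(raw, 10, 32)
	if err != nil || v < 0 || v > 100 {
		return 0, false
	}
	return int32(v), true
}

func main() {
	pod := &v1.Pod{}
	pod.Labels = map[string]string{percentageLabel: "10"}
	if pct, ok := percentageFromPod(pod); ok {
		fmt.Printf("custom percentageOfNodesToScore: %d\n", pct)
	}
}
```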

We’d like to have your input and suggestions, particularly

  1. Is it a valid and useful feature?

  2. Is it possible to implement? A problem we notice is that the current scheduling framework does not provide a mechanism for plugins to access and update the scheduler options. Would it be possible to change the plugin APIs with an additional argument, e.g., a scheduler options pointer?

Thanks a lot!

Official Docker Container Image

Several other kubernetes-sigs projects have official container images in k8s.gcr.io container registry. It would be convenient for end users if the scheduler-plugins container image was available from the k8s.gcr.io container registry.

Autogen files are not updated on running `make autogen`

Using Go versions 1.14 and 1.15 on OSX, I am unable to generate codegen files for the types defined under pkg/apis/config/types.go and pkg/apis/config/v1beta1/types.go.

$: make autogen
hack/update-vendor.sh
hack/update-generated-openapi.sh
Generating Kubernetes OpenAPI

The command runs successfully but zz_generated* files are not updated.

Integration test needs to be improved

The integration test, especially the coscheduling test, is slow and easily causes flakes.

On my Mac, it takes 550s to finish:

⇒  hack/integration-test.sh
+++ [0629 17:17:27] Checking etcd is on PATH
+++ [0629 17:17:27] Starting etcd instance
etcd --advertise-client-urls http://127.0.0.1:2379 --data-dir /var/folders/d8/p4t60rzs7rs6b6vmppqmth300000gn/T/tmp.s95okkq8Yk --listen-client-urls http://127.0.0.1:2379 --debug > "/dev/null" 2>/dev/null
Waiting for etcd to come up.
+++ [0629 17:17:28] On try 2, etcd: : {"health":"true"}
{"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"2","raft_term":"2"}}+++ [0629 17:17:28] Running integration test cases
ok  	sigs.k8s.io/scheduler-plugins/test/integration	550.233s
+++ [0629 17:28:39] Cleaning up etcd
+++ [0629 17:28:39] Integration test cleanup complete

However, the CI env would time out after 10 mins. I would expect the test to finish within 2 mins.

/assign @denkensk

Register a new plugin

I am trying to build the entire Kubernetes tree with a custom scoring plugin added, named myallocated.go. I have registered my plugin through the app.WithPlugin function:

func main() {
	rand.Seed(time.Now().UnixNano())

	command := app.NewSchedulerCommand(
		app.WithPlugin(noderesources.MyAllocatedName, noderesources.NewMyAllocated),
	)
	// ...
}

Then, in algorithmprovider/registry.go, I enabled it as a scoring plugin with a weight of 2:

Score: &schedulerapi.PluginSet{
			Enabled: []schedulerapi.Plugin{
				{Name: noderesources.MyAllocatedName, Weight: 2},
				//{Name: noderesources.BalancedAllocationName, Weight: 1},
				{Name: imagelocality.Name, Weight: 1},
				{Name: interpodaffinity.Name, Weight: 1},
				{Name: noderesources.LeastAllocatedName, Weight: 1},
				{Name: nodeaffinity.Name, Weight: 1},
				{Name: nodepreferavoidpods.Name, Weight: 10000},
				// Weight is doubled because:
				// - This is a score coming from user preference.
				// - It makes its signal comparable to NodeResourcesLeastAllocated.
				{Name: podtopologyspread.Name, Weight: 2},
				{Name: tainttoleration.Name, Weight: 1},
			},
		},

Then I registered MyAllocatedArgs in apis/config/register.go:

```go
func addKnownTypes(scheme *runtime.Scheme) error {
	scheme.AddKnownTypes(SchemeGroupVersion,
		&KubeSchedulerConfiguration{},
		&Policy{},
		&DefaultPreemptionArgs{},
		&InterPodAffinityArgs{},
		&NodeLabelArgs{},
		&NodeResourcesFitArgs{},
		&PodTopologySpreadArgs{},
		&RequestedToCapacityRatioArgs{},
		&ServiceAffinityArgs{},
		&VolumeBindingArgs{},
		&NodeResourcesLeastAllocatedArgs{},
		&NodeResourcesMyAllocatedArgs{},
		&NodeResourcesMostAllocatedArgs{},
		&NodeAffinityArgs{},
	)
	scheme.AddKnownTypes(schema.GroupVersion{Group: "", Version: runtime.APIVersionInternal}, &Policy{})
	return nil
}
```

Then errors kept coming, so in apis/config/validation/validation_pluginargs.go I added:

```go
func ValidateNodeResourcesMyAllocatedArgs(args *config.NodeResourcesMyAllocatedArgs) error {
	return validateResources(args.Resources)
}
```

Also, in apis/config/v1beta1/defaults.go I added:

```go
func SetDefaults_NodeResourcesMyAllocatedArgs(obj *v1beta1.NodeResourcesMyAllocatedArgs) {
	if len(obj.Resources) == 0 {
		// If no resources are specified, use the default set.
		obj.Resources = append(obj.Resources, defaultResourceSpec...)
	}
}
```

And still, after all this, I get the following error:

cannot use &NodeResourcesMyAllocatedArgs literal (type *NodeResourcesMyAllocatedArgs) as type runtime.Object in argument to scheme.AddKnownTypes:
	*NodeResourcesMyAllocatedArgs does not implement runtime.Object (missing DeepCopyObject method)


And I noticed that after I put the following lines in apis/config/zz_generated.deepcopy.go:
```go
    func (in *NodeResourcesMyAllocatedArgs) DeepCopyInto(out *NodeResourcesMyAllocatedArgs) {
	*out = *in
	out.TypeMeta = in.TypeMeta
	if in.Resources != nil {
		in, out := &in.Resources, &out.Resources
		*out = make([]ResourceSpec, len(*in))
		copy(*out, *in)
	}
	return
}

// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NodeResourcesMyAllocatedArgs.
func (in *NodeResourcesMyAllocatedArgs) DeepCopy() *NodeResourcesMyAllocatedArgs {
	if in == nil {
		return nil
	}
	out := new(NodeResourcesMyAllocatedArgs)
	in.DeepCopyInto(out)
	return out
}

func (in *NodeResourcesMyAllocatedArgs) DeepCopyObject() runtime.Object {
	if c := in.DeepCopy(); c != nil {
		return c
	}
	return nil
}
```

But when deepcopy-gen runs again during make, the DeepCopyObject function I added gets deleted. What can I do?

Code for MyAllocated.go:

```go
package noderesources

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/kubernetes/pkg/scheduler/apis/config"
	"k8s.io/kubernetes/pkg/scheduler/apis/config/validation"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

// MyAllocated is a score plugin that favors nodes with fewer allocation requested resources based on requested resources.
type MyAllocated struct {
	handle framework.Handle
	resourceAllocationScorer
}

var _ = framework.ScorePlugin(&MyAllocated{})

// MyAllocatedName is the name of the plugin used in the plugin registry and configurations.
const MyAllocatedName = "NodeResourcesMyAllocated"

// Name returns name of the plugin. It is used in logs, etc.
func (la *MyAllocated) Name() string {
	return MyAllocatedName
}

// Score invoked at the score extension point.
func (la *MyAllocated) Score(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string) (int64, *framework.Status) {
	nodeInfo, err := la.handle.SnapshotSharedLister().NodeInfos().Get(nodeName)
	if err != nil {
		return 0, framework.NewStatus(framework.Error, fmt.Sprintf("getting node %q from Snapshot: %v", nodeName, err))
	}

	// la.score favors nodes with fewer requested resources.
	// It calculates the percentage of memory and CPU requested by pods scheduled on the node, and
	// prioritizes based on the minimum of the average of the fraction of requested to capacity.
	//
	// Details:
	// (cpu((capacity-sum(requested))*MaxNodeScore/capacity) + memory((capacity-sum(requested))*MaxNodeScore/capacity))/weightSum
	return la.score(pod, nodeInfo)
}

// ScoreExtensions of the Score plugin.
func (la *MyAllocated) ScoreExtensions() framework.ScoreExtensions {
	return nil
}

// NewMyAllocated initializes a new plugin and returns it.
func NewMyAllocated(laArgs runtime.Object, h framework.Handle) (framework.Plugin, error) {
	args, ok := laArgs.(*config.NodeResourcesMyAllocatedArgs)
	if !ok {
		return nil, fmt.Errorf("want args to be of type NodeResourcesMyAllocatedArgs sagapaw sou fwnaksa, got %T", laArgs)
	}

	if err := validation.ValidateNodeResourcesMyAllocatedArgs(args); err != nil {
		return nil, err
	}

	resToWeightMap := make(resourceToWeightMap)
	for _, resource := range (*args).Resources {
		resToWeightMap[v1.ResourceName(resource.Name)] = resource.Weight
	}

	return &MyAllocated{
		handle: h,
		resourceAllocationScorer: resourceAllocationScorer{
			Name:                MyAllocatedName,
			scorer:              myResourceScorer(resToWeightMap),
			resourceToWeightMap: resToWeightMap,
		},
	}, nil
}

func myResourceScorer(resToWeightMap resourceToWeightMap) func(resourceToValueMap, resourceToValueMap, bool, int, int) int64 {
	return func(requested, allocable resourceToValueMap, includeVolumes bool, requestedVolumes int, allocatableVolumes int) int64 {
		var nodeScore, weightSum int64
		for resource, weight := range resToWeightMap {
			resourceScore := myRequestedScore(requested[resource], allocable[resource])
			nodeScore += resourceScore * weight
			weightSum += weight
		}
		return nodeScore / weightSum
	}
}

// The unused capacity is calculated on a scale of 0-MaxNodeScore
// 0 being the lowest priority and `MaxNodeScore` being the highest.
// The more unused resources the higher the score is.
func myRequestedScore(requested, capacity int64) int64 {
	if capacity == 0 {
		return 0
	}
	if requested > capacity {
		return 0
	}

	return ((capacity - requested/2) * int64(framework.MaxNodeScore)) / capacity
}
```

can't pull the image from k8s.gcr.io/scheduler-plugins

$ docker pull k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.18.9
Error response from daemon: manifest for k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.18.9 not found: manifest unknown: Failed to fetch "v0.18.9" from request "/v2/scheduler-plugins/kube-scheduler/manifests/v0.18.9".

I can't pull the image from the address listed in https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/README.md

Is there anything that needs to be updated?

/kind support

Official images

It seems even after kubernetes/k8s.io#1346 gets merged, k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.18.9 is still unable to be downloaded:

root@wei-dev:~# docker pull k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.18.9
Error response from daemon: manifest for k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.18.9 not found: manifest unknown: Failed to fetch "v0.18.9" from request "/v2/scheduler-plugins/kube-scheduler/manifests/v0.18.9".

I'm not quite sure whether it's due to the missing folder "k8s-staging-scheduler-plugins" under https://github.com/kubernetes/k8s.io/tree/master/k8s.gcr.io/manifests.

ARM Container Images

I would like to run the scheduler-plugins on ARM (specifically ARM64) hardware. I'd like an official scheduler-plugins container image for this use case, in addition to the already provided x86_64 container image.

How do custom scheduler plugins work?

Hi, I was reading Kubernetes concepts on schedulers here

https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/

At the bottom it refers to this GitHub repo for more information on custom plugins for the extension points of the Kubernetes scheduler.

In addition to default plugins, you can also implement your own scheduling plugins and get them configured along with default plugins. You can visit scheduler-plugins for more details.

I am curious how that works here: are the scheduler plugins something you add to https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/scheduler.go at compile time, creating your own scheduler, or is there some way to add just custom plugins at runtime?

I am just curious about this as I have been looking into how the Kubernetes default scheduler works, but there seem to be two levels of customization: 1) write your own scheduler controller, or 2) add custom plugins at the extension points. I am a bit confused about how 2) is deployed at runtime, if at all. Thanks!
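
For what it's worth, framework plugins are compiled into a scheduler binary rather than loaded at runtime, so option 2 also happens at build time; the resulting binary is then typically deployed as a second scheduler or as a replacement for the default one. A minimal sketch of the usual registration pattern, assuming a hypothetical out-of-tree package `myplugin` that exposes a `Name` constant and a `New` factory (exact import paths vary by Kubernetes version):

```go
package main

import (
	"os"

	"k8s.io/kubernetes/cmd/kube-scheduler/app"

	// Hypothetical out-of-tree plugin package; replace with your own.
	"example.com/me/scheduler-plugins/pkg/myplugin"
)

func main() {
	// Register the custom plugin alongside the default in-tree plugins and
	// build/run this binary instead of the stock kube-scheduler.
	command := app.NewSchedulerCommand(
		app.WithPlugin(myplugin.Name, myplugin.New),
	)
	if err := command.Execute(); err != nil {
		os.Exit(1)
	}
}
```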

Write scheduler plugin in python

Hi,
I would like to write a plugin for extension points, like "Filter", "Permit", "PreBind" in python.

I expect that this should be feasible.

Can someone please point me to the relevant docs and a sample?

Thank you in advance!
--Renuka

Create CRD failed

I tried to apply manifests/coscheduling/crd.yaml and got this error:

error validating data: ValidationError(CustomResourceDefinition.spec): unknown field "validation" in io.k8s.apiextensions-apiserver.pkg.apis.apiextensions.v1.CustomResourceDefinitionSpec; if you choose to ignore these errors, turn validation off with --validate=false

It seems the validation field is available in apiextensions/v1beta1 but was removed from apiextensions/v1:
https://v1-18.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#customresourcedefinitionspec-v1-apiextensions-k8s-io
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#customresourcedefinitionspec-v1beta1-apiextensions-k8s-io

Is it the wrong way to use it?

TestPodGroupClean is flaky

TestPodGroupClean in pkg/coscheduling/coscheduling_test.go seems to be flaky:

⇒  go test ./pkg/coscheduling/... -run TestPodGroupClean -count 50
--- FAIL: TestPodGroupClean (1.00s)
    --- FAIL: TestPodGroupClean/pod_belongs_to_a_podGroup (1.00s)
        coscheduling_test.go:567: fail to gc PodGroup in coscheduling: pod1
--- FAIL: TestPodGroupClean (1.00s)
    --- FAIL: TestPodGroupClean/pod_belongs_to_a_podGroup (1.00s)
        coscheduling_test.go:567: fail to gc PodGroup in coscheduling: pod1
--- FAIL: TestPodGroupClean (1.00s)
    --- FAIL: TestPodGroupClean/pod_belongs_to_a_podGroup (1.00s)
        coscheduling_test.go:567: fail to gc PodGroup in coscheduling: pod1
--- FAIL: TestPodGroupClean (1.00s)
    --- FAIL: TestPodGroupClean/pod_belongs_to_a_podGroup (1.00s)
        coscheduling_test.go:567: fail to gc PodGroup in coscheduling: pod1
FAIL
FAIL	sigs.k8s.io/scheduler-plugins/pkg/coscheduling	13.811s
FAIL

Sample plugins to exercise each extension point

It would be nice to come up with a series of sample plugins to exercise each extension point, accompanied by a README guiding newcomers on how each one is built.

Generally, the use cases that the sample plugins try to solve can be hypothetical; however, we do hope each one is, or is part of, a real-world problem. (A minimal skeleton for one extension point is sketched after the list below.)

  • Sort - A QOS Sort plugin to sort the pods by their QOS level. #3
  • PreFilter
  • Filter
  • PostFilter - A cross-node preemption PostFilter plugin to try preempting pods on multiple nodes #56
  • PreScore
  • Score & NormalizeScore - A Score plugin to take (1) terminating Pods and (2) nominated Pods into accounts #103
  • Permit
  • Reserve
  • PreBind
  • Bind
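
As a starting point, a minimal skeleton for one extension point (Filter) might look like the sketch below. All names here are hypothetical, and the framework import path varies between Kubernetes versions.

```go
package sample

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

// Name is the plugin name used in the registry and in scheduler profiles.
const Name = "SampleFilter"

// SampleFilter filters out nodes carrying a hypothetical "example.com/excluded" label.
type SampleFilter struct {
	handle framework.Handle
}

var _ framework.FilterPlugin = &SampleFilter{}

// Name returns the name of the plugin.
func (s *SampleFilter) Name() string { return Name }

// Filter is invoked once per candidate node during the filter phase.
func (s *SampleFilter) Filter(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	node := nodeInfo.Node()
	if node == nil {
		return framework.NewStatus(framework.Error, "node not found")
	}
	if _, excluded := node.Labels["example.com/excluded"]; excluded {
		return framework.NewStatus(framework.Unschedulable, "node is excluded")
	}
	return nil
}

// New is the factory that would be registered via app.WithPlugin.
func New(_ runtime.Object, h framework.Handle) (framework.Plugin, error) {
	return &SampleFilter{handle: h}, nil
}
```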

Add Branch Name To Staging Registry Container Image Tags

From #77 (comment) ...

We might want to consider adding the branch name to container image tags in the staging registry.

  • if it's on release-v1.x.y, let's keep current logic (i.e., use the tag plus a random string)
  • if it's on the master branch, let's use "master" plus a random string - it's b/c we usually don't have a tag on master branch.

We could use the _PULL_BASE_REF variable from the cloudbuild.yaml to get the branch name. The Makefile would need to be updated. See the line in cloudbuild.yaml.

pod group min available is never updated

Pod group info is never updated, unless all pods are removed and GC happens.

Because of that, scenarios such as scaling up the pod group will not work.

Integration test that should work but fails:

		{
			name: "equal priority, pod group is scaled up",
			pods: []podInfo{
				{podName: "t8-p1-1", podGroupName: "pg8-1", minAvailable: "2", priority: midPriority, memReq: 50},
				{podName: "t8-p1-2", podGroupName: "pg8-1", minAvailable: "2", priority: midPriority, memReq: 50},
				{podName: "t8-p1-3", podGroupName: "pg8-1", minAvailable: "3", priority: midPriority, memReq: 50},
			},
			expectedPods: []string{"t8-p1-1", "t8-p1-2", "t8-p1-3"},
		},

How can I run these plugins in my k8s cluster?

The KubeSchedulerConfiguration resource seems to belong to the group "kubescheduler.config.k8s.io" and version "kubescheduler.config.k8s.io/v1beta1", but in k8s 1.19.2 there is neither that group nor that version. Do I need to apply a KubeSchedulerConfiguration CRD first? I can't find any related docs.

Coscheduling: When the resource is sufficient, the pod is always Pending

apiVersion: "batch/v1"
kind: "Job"
metadata:
namespace: test-ns
name: tf-work-1
spec:
template:
metadata:
labels:
pod-group.scheduling.sigs.k8s.io/name: tf-work
pod-group.scheduling.sigs.k8s.io/min-available: "2"
spec:
restartPolicy: "OnFailure"
containers:
- image: myimage
name: tf-work-1
command: [ "/bin/sh","-c","top -b" ]
resources:
limits:
cpu: "6"
memory: 8Gi
requests:
cpu: "6"
memory: 8Gi

apiVersion: "batch/v1"
kind: "Job"
metadata:
namespace: test-ns
name: tf-work-2
spec:
template:
metadata:
labels:
pod-group.scheduling.sigs.k8s.io/name: tf-work
pod-group.scheduling.sigs.k8s.io/min-available: "2"
spec:
restartPolicy: "OnFailure"
containers:
- image: myimage
name: tf-work-2
command: [ "/bin/sh","-c","top -b" ]
resources:
limits:
cpu: "6"
memory: 8Gi
requests:
cpu: "6"
memory: 8Gi

E1124 12:15:02.731405 1 factory.go:601] Error scheduling test-ns/tf-work-1-qbtz6: rejected by "coscheduling" at prefilter: less than minAvailable; retrying
E1124 12:15:02.735282 1 scheduler.go:627] error selecting node for pod: rejected by "coscheduling" at prefilter: less than minAvailable
E1124 12:15:02.735520 1 factory.go:601] Error scheduling test-ns/tf-work-1-qbtz6: rejected by "coscheduling" at prefilter: less than minAvailable; retrying
E1124 12:15:02.735564 1 scheduler.go:627] error selecting node for pod: rejected by "coscheduling" at prefilter: less than minAvailable
E1124 12:15:24.119796 1 factory.go:601] Error scheduling test-ns/tf-work-2-9t8jg: pod "tf-work-2-9t8jg" rejected while waiting at permit: rejected due to timeout after waiting 10s at plugin coscheduling; retrying
E1124 12:15:34.124308 1 factory.go:601] Error scheduling test-ns/tf-work-2-9t8jg: pod "tf-work-2-9t8jg" rejected while waiting at permit: rejected due to timeout after waiting 10s at plugin coscheduling; retrying
E1124 12:16:39.419128 1 factory.go:601] Error scheduling test-ns/tf-work-1-qbtz6: pod "tf-work-1-qbtz6" rejected while waiting at permit: rejected due to timeout after waiting 10s at plugin coscheduling; retrying

Add a new tag v0.19.0

@Huang-Wei The K8s API dep was already bumped to 1.19.0 in the master branch. Should we add a new tag v0.19.0? Otherwise, git describe and kube-scheduler --version just report the outdated information.

$ git describe
v0.18.800-54-gad9f407
$ kube-scheduler --version
I1029 18:05:37.788957   14935 registry.go:173] Registering SelectorSpread plugin
I1029 18:05:37.789067   14935 registry.go:173] Registering SelectorSpread plugin
Kubernetes v0.18.800

Add co-scheduling plugin based on scheduler framework

What would you like to be added:

Add co-scheduling plugin based on scheduler framework

Why is this needed:

Kubernetes has become a popular solution for orchestrating containerized workloads. Due to limitations of the Kubernetes scheduler, some offline workloads (ML/DL) are managed in a different system. To improve cluster utilization and operational efficiency, we'd like to treat Kubernetes as a unified management platform. But ML jobs are all-or-nothing: they require all tasks of a job to be scheduled at the same time. If a job starts only part of its tasks, those tasks will wait for the other tasks to be ready before beginning to work. In the worst case, all jobs are pending, leading to a deadlock. To solve this problem, co-scheduling is needed in the scheduler. The new scheduler framework makes this goal possible.

KEP:kubernetes/enhancements#1463

/assign
/cc @Huang-Wei

Controller logic to sync ElasticQuota

#73 doesn't cover the controller logic to sync ElasticQuota into etcd. It would be good to add controller logic covering that. And when it's available, it may be possible to prune the scheduling logic / data structures.

/assign @denkensk

Allow dynamic change of scheduler plugins in a profile

Currently the configuration of scheduler profiles, and of the plugins within them, is static for the lifetime of a scheduler run. This prevents dynamically changing plugins even when there are valid reasons to do so. For example, score plugins that depend on real-time metrics might need to be substituted with other plugins if metrics are unavailable for a long time. Similarly, based on the current state of a cluster, scheduler plugins could be changed dynamically for a future intended state.

Currently this can only be achieved with a scheduler restart after (re)configuring plugins for the given profile, which is not a good solution. The request is to add support for doing the same without disruption.

Custom Scoring Plugin

Hello everyone, I would like to create my own scoring plugin, for example myallocation.go, which favors nodes with the widest fraction of CPU and memory available. I've realized that all I have to do is register my plugin in algorithmprovider/register.go and then create my plugin, integrating it into the scheduling framework. Do I also have to write a myallocation_test.go before I compile the whole project and run it as a different scheduler? Is there another way? Is this the proper way? Lastly, it would be very useful if there were guides or advice on how to get the best out of the scheduling framework, enabling and disabling plugins simply and successfully.
Thank you in advance.

CoScheduling: invoke getOrCreatePodGroupInfo in Permit and Unreserve plugins

getOrCreatePodGroupInfo creates a new PodGroup or obtains the existing PodGroupInfo. It's required for Less and PreFilter, as a PodGroup may not exist when a Pod reaches those two phases. However, a PodGroup should already exist at Permit and Unreserve. If it does not (due to GC or PodGroup expiration?), a new PodGroup should not be created in those two phases. We may need a separate read-only func, getPodGroupInfo.

@denkensk @Huang-Wei WDYT?

Concerns for potential deadlocking Issue

I have experienced a similar problem when testing the coscheduling plugin with Kubemark.

I prepared 2 hollow nodes, each with 8 GPUs. Then I submitted several dozen jobs sequentially (60 jobs in total, submitted every 15 seconds, each running for around 30 seconds), each requiring 1-8 GPU cards. It worked fine while there were sufficient GPU resources and every pod in a job could get allocated without pending.

When there were not enough GPU resources for the incoming jobs, those jobs started pending, which is as expected. However, when the running jobs succeeded and released enough resources for the pending jobs, no pending jobs would start. All the jobs would still be pending even though there were enough resources for at least one of them, because 1-8 is always < 16.

I suspect this is a typical deadlocking situation that happens in the 'Permit' stage. And I indeed observed the interleaving of pods of different podgroups waiting in the logs.

It is reported in the design doc that the QueueSort plugin ensures Pods belonging to the same PodGroup are placed back to back to avoid deadlock (I am not sure whether this is also the method of the latest PodGroup CRD based coscheduling). However, the QueueSort plugin only has an effect on the ActiveQ.

Imagine a situation where some pods of a podgroup are in the Permit stage, while others are thrown into the BackoffQ or UnschedulableQ because scheduling failed due to insufficient resources. When the latter pods will be flushed back to the ActiveQ is kind of 'random'. When the pods in the Permit stage expire and requeue (or are thrown into the BackoffQ/UnschedulableQ), they may never get the chance to 'reunite' with their peers back-to-back in the ActiveQ. The result is that only a part of the pods of each podgroup is ever competing in the Reserve stage, and thus deadlock happens.

This is just a suspicion based on what I observed in my experiment and what I read about the coscheduling plugin and the kube-scheduler SchedulingQueue mechanism. I worry that the BackoffQ and UnschedulableQ mechanisms will somehow invalidate some of the assumptions the coscheduling plugin relies on.

Originally posted by @TomLan42 in #114 (comment)

Proposal: a generic QueueSort plugin

The default priority_sort plugin sorts pods by priority and then timestamp. coscheduling's QueueSort plugin extends the default priority_sort plugin by taking into account PodGroup information to ensure pods from the same PodGroup are sorted together. The current repo also includes a simple QoS plugin that uses the Pod QoS class to break a tie for pods with the same priority and timestamp.

As only one QueueSort plugin is allowed in a single scheduler executable, multiple configurations using different QueueSort plugins and hence multiple schedulers may be required in order to meet different scheduling needs. However, running multiple schedulers is challenging and undesirable in practice. A generic QueueSort that can meet different scheduling needs hence will be useful.

We propose a generic QueueSort plugin with advanced sorting features that can meet different scheduling needs (regular pod scheduling, gang scheduling, etc.) and be used by multiple scheduling profiles. An implementation will adapt the current coscheduling implementation by decoupling the QueueSort plugin code and creating a separate API and a generic QueueSort plugin.

1. Create a new API package for PodGroup-related information, which can be extended to support other PodGroup-based scheduling enhancements, e.g., re-using and sharing the node scores among a group of pods with homogeneous resource specifications. The bottom line is that coscheduling's PodGroup QueueSort plugin is such a useful feature that it can be used in other scenarios as well. Converting it into an independent and extensible plugin is beneficial.

api/types.go

package api

type PodGroupList struct {
... 
}
type PodGroupInfo struct {
...
}

func (podGroups *PodGroupList) GetPodGroupInfo(p *framework.PodInfo) (*PodGroupInfo, int) {
...
}

func GetPodGroupLabels(p *v1.Pod) (string, int, error) {
...
}
  2. Create a generic QueueSort plugin that takes into account Pod priority, timestamp, QoS class and PodGroup information in sorting.
    2.1. Compare the pods' priorities.
    2.2. Compare the timestamps of the initialization time of the PodGroups.
    2.3. Compare the QoS classes of regular pods, i.e., pods that do not belong to any PodGroup or whose minAvailable is 1.
    2.4. Compare the keys of the pods' PodGroups.

pkg/queuesort

package queuesort
func (gs *GenericSort) Less(podInfo1 *framework.PodInfo, podInfo2 *framework.PodInfo) bool {
    pod1 := podInfo1.Pod
    pod2 := podInfo2.Pod

    // compare by priority
    priority1 := pod.GetPodPriority(pod1)
    priority2 := pod.GetPodPriority(pod2)

    if priority1 != priority2 {
        return priority1 > priority2
    }

    // compare by timestamp
    pgInfo1,  min1 := gs.podGroups.GetPodGroupInfo(podInfo1)
    pgInfo2, min2 := gs.podGroups.GetPodGroupInfo(podInfo2)
    pgName1 := pgInfo1.Name
    pgName2 := pgInfo2.Name
    time1 := pgInfo1.Timestamp
    time2 := pgInfo2.Timestamp

    if !time1.Equal(time2) {
        return time1.Before(time2)
    }

    // compare by pod QoS class between two pods that do not belong to any podgroup : Guaranteed > Burstable > BestEffort
    if (pgName1 == "" || min1 <= 1) && (pgName2 == "" || min2 <= 1) {
        qos1 := v1qos.GetPodQOS(pod1)
        qos2 := v1qos.GetPodQOS(pod2)

        if qos1 != qos2 {
            if qos1 == v1.PodQOSGuaranteed {
                return true
            } else if qos1 == v1.PodQOSBurstable {
                return qos2 != v1.PodQOSGuaranteed
            } else {
                return qos2 == v1.PodQOSBestEffort
            }
        }
    }

    // compare by podGroup key
    key1 := fmt.Sprintf("%v/%v", podInfo1.Pod.Namespace, pgInfo1.Name)
    key2 := fmt.Sprintf("%v/%v", podInfo2.Pod.Namespace, pgInfo2.Name)
    return key1 < key2
}
  3. Modify the current coscheduling implementation by removing the PodGroup-related structures and functions that are now defined in the new API package, and use the API to obtain the PodGroup information.

Is this something useful? We'd really like feedback and suggestions from the community. We already have a prototype implementation.

coscheduling queue sort plugin starves pods

Currently the coscheduling plugin uses InitialAttemptTimestamp to compare pods of the same priority. If there are enough pods with an early InitialAttemptTimestamp that cannot be scheduled, then pods with a later InitialAttemptTimestamp get starved: the scheduler never attempts to schedule them. This is because the scheduler re-queues the "early" pods before the "later" pods are attempted. The normal scheduler uses the time when a pod was inserted into the queue, so this situation cannot occur.

The permit logic in coscheduling

I am using the coscheduling plugin, but I am a little confused about the permit logic.
The code is:

func (pgMgr *PodGroupManager) Permit(ctx context.Context, pod *corev1.Pod, nodeName string) (bool, error) {
	pgFullName, pg := pgMgr.GetPodGroup(pod)
	if pgFullName == "" {
		return true, util.ErrorNotMatched
	}
	if pg == nil {
		// once we have admission check, a Pod targeting non-exisitng PodGroup won't be created.
		// So here it just serves as a sanity check.
		return false, fmt.Errorf("PodGroup not found")
	}

	bound := pgMgr.calculateBoundPods(pg.Name, pg.Namespace)
	// The bound is calculated from the snapshot. The current pod does not exist in the snapshot during this scheduling cycle.
	ready := int32(bound)+1 >= pg.Spec.MinMember
	if ready {
		return true, nil
	}
	return false, util.ErrorWaiting
}

func (pgMgr *PodGroupManager) calculateBoundPods(podGroupName, namespace string) int {
	nodeInfos, err := pgMgr.snapshotSharedLister.NodeInfos().List()
	if err != nil {
		klog.Errorf("Cannot get nodeInfos from frameworkHandle: %v", err)
		return 0
	}
	var count int
	for _, nodeInfo := range nodeInfos {
		for _, podInfo := range nodeInfo.Pods {
			pod := podInfo.Pod
			if pod.Labels[util.PodGroupLabel] == podGroupName && pod.Namespace == namespace && pod.Spec.NodeName != "" {
				count++
			}
		}
	}

	return count
}

It calculates the number of pods that have been bound to nodes. If the number of bound pods (plus the current one) reaches the minMember of the PodGroup, the Permit plugin allows the subsequent scheduling process.
But let's say I set minMember in the PodGroup to 3 and the replicas of my deployment to 6. Wouldn't all the pods end up in Pending status, because none of them can pass the Permit plugin? (Since no pod in the podGroup ever gets bound; they are all stuck in the Permit stage.)
I am not sure if I understand the whole process correctly; can someone help?

Improve co-scheduling plugins based on the framework

After the coscheduling plugin is merged, the next phase is to improve the performance and configurability of coscheduling. This issue will track all TODOs in the coscheduling plugin and the scheduling framework.

Optimize performance

  • 1. expose pods status(assumed or actually scheduled) in the SharedLister of Framework
  • 2. get actually scheduled(bind successfully) account from the SharedLister in coscheduling plugins
    // TODO get actually scheduled(bind successfully) account from the SharedLister
    running := cs.calculateRunningPods(podGroupName, namespace)
    waiting := cs.calculateWaitingPods(podGroupName, namespace)
    current := running + waiting + 1
  • 3. get the total pods from the scheduler cache and queue instead of the hack manner
    func (cs *Coscheduling) calculateTotalPods(podGroupName, namespace string) int {
    // TODO get the total pods from the scheduler cache and queue instead of the hack manner
    selector := labels.Set{PodGroupName: podGroupName}.AsSelector()
    pods, err := cs.podLister.Pods(namespace).List(selector)
    if err != nil {
    klog.Error(err)
    return 0
    }
    return len(pods)
    }
  • 4. Efficient Requeuing of Unschedulable Pods kubernetes/kubernetes#87738
  • 5. Move the pods(rejected by Prefilter or Permit) back activeQ by the new mechanism

PodGroup optimization

  • 1. implement a timeout based gc for the PodGroupInfos map
    // PodGroupInfo is a wrapper to a PodGroup with additional information.
    // TODO implement a timeout based gc for the PodGroupInfos map
    type PodGroupInfo struct {
    name string
    // timestamp stores the timestamp of the initialization time of PodGroup.
    timestamp time.Time
    }
  • 2. keep a cache of pod group size to improve reading efficiency
    func (cs *Coscheduling) calculateWaitingPods(podGroupName, namespace string) int {
    waiting := 0
    // Calculate the waiting pods.
    // TODO keep a cache of podgroup size.
    cs.frameworkHandle.IterateOverWaitingPods(func(waitingPod framework.WaitingPod) {
    if waitingPod.GetPod().Labels[PodGroupName] == podGroupName && waitingPod.GetPod().Namespace == namespace {
    waiting++
    }
    })
    return waiting
    }

Waiting time Configurability

/cc @ahg-g @Huang-Wei @alculquicondor

Difference between the upstream scheduler main program and the scheduler-plugins one

Compared to the upstream kube-scheduler main program, the scheduler-plugins main program is missing the following code segment; I was wondering why it was removed. Thanks.

    // TODO: once we switch everything over to Cobra commands, we can go back to calling
    // utilflag.InitFlags() (by removing its pflag.Parse() call). For now, we have to set the
    // normalize func and add the go flag set by hand.
    pflag.CommandLine.SetNormalizeFunc(cliflag.WordSepNormalizeFunc)
    // utilflag.InitFlags()
    logs.InitLogs()
    defer logs.FlushLogs()

How can I apply the plugins on my own kubernetes machine installed by Kubeadm

Recently, I planned to design my own kube-scheduler. Considering that kubernetes-sigs has already designed scheduler-plugins based on the framework extension points, I just want to try the QoS plugin provided by the SIG. When applying the plugin, I had another question: since my Kubernetes environment wasn't built from source with make but installed by kubeadm, all of my components are running in containers.
Does that mean I should make the plugins first and then build an image to run them as a Pod? Should the Pod be deployed on the master machine or on a worker node?

How can I test the QoS plugin in my own k8s environment?

Since I successfully started the QoS plugin, I want to test whether the QoS-based selection works when the priorities of two pods are equal. So I prepared two files: one is the priority file to make sure that the following two pods' priorities are equal, the other is the pod file for testing, in which both pods are set to the same priority and the second pod is written with the Guaranteed QoS class.
priority.yaml

apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata: 
  name: priority-test
value: 1000000
globalDefault: false
description: "This priority class should be used for testing the qos-scheduling "

pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx1
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  priorityClassName: priority-test
  schedulerName: my-scheduler1
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx2
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        memory: "10Mi"
        cpu: "100m"
      limits:
        memory: "10Mi"
        cpu: "100m"
  priorityClassName: priority-test
  schedulerName: my-scheduler1

But it seems impossible to see which pod is scheduled first, because they look like they are being scheduled at the same time.
To test this plugin, how should I design my experiment? Or is there any other command to check the scheduling logs? For now, kubectl logs and kubectl describe do not show the scheduling sequence.

Unschedulable pods may cause a co-scheduling group to not be scheduled in time

The current PreFilter logic may add a pod to the unschedulable queue when there are not enough pods in the same podgroup. There is a chance that the pods in the unschedulable queue need a long time to dequeue (maybe >60s), which may cause the group to not be scheduled in time.

IMO, we can take one of the following actions:

  1. remove the logic about checking pod numbers
  2. add a Func to framework to solve this issue absolutely, e.g. MoveUnschedulablePodsToActiveQ

/cc @denkensk @yuanchen8911 @Huang-Wei

Does NodeResourcesLeastAllocatable + NodeResourcesMostAllocated == NodeResourcesLeastAvailable ?

If I enable both "NodeResourcesLeastAllocatable"(from the this repo) and "NodeResourcesMostAllocated"(from the main repo) in one profile, and configure them with same weights, have I gotten the equality of "NodeResourcesLeastAvailable"?

What I really want

A score plugin that favors nodes with fewer resources left.

Enabling either "NodeResourcesLeastAllocatable" or "NodeResourcesMostAllocated" alone does not meet my situation, because we may run pods of large differences in requested resources on nodes of also large differences in resources allocatable. I have tried to write another scheduler plugin just list the "NodeResourcesLeastAllocatable" but scores on available (allocatable - allocated) instead of allocatable resources. But then I realized that this plugin may not to be necessary if I can compose it from other plugins.

Add Capacity scheduling for ML/DL workloads based on scheduler framework

What would you like to be added:

Add Capacity scheduling for ML/DL workloads based on scheduler framework

Why is this needed:

There is increasing demand to use Kubernetes to manage batch workloads (ML/DL). In those cases, one challenge is to improve cluster utilization while ensuring that each user has a reasonable amount of resources. The problem can be partially addressed by the Kubernetes ResourceQuota. The native Kubernetes ResourceQuota API can be used to specify the maximum overall resource allocation per namespace. The quota enforcement is done through an admission check. A quota resource consumer (e.g., a Pod) cannot be created if the aggregated resource allocation exceeds the quota limit. The Kubernetes quota design has the following limitations:

  1. The quota resource usage is aggregated based on the resource configurations (e.g., Pod cpu/mem requests specified in the Pod spec). Although this mechanism can guarantee that the actual resource consumption will never exceed the ResourceQuota limit, it might lead to low resource utilization had the actual resource consumption been much smaller than the limit. i.e., lead to internal resource fragmentation.

  2. If we use ResourceQuota to strictly divide the resources of the cluster among all users to prevent clusters from running out of resources, it might also lead to low resource utilization because some users may have quota resources not being used.

Due to the above limitations, batch workloads (ML/DL) can't run in a Kubernetes cluster as efficiently as they do in other container orchestration platforms such as Yarn. In order to overcome the above limitations, we introduce a "Queue" concept, as used in the Yarn capacity scheduler, into Kubernetes. Basically, the "Queue" has the notions of "max" and "min", where the "min" is the minimum resources that are needed to ensure the basic functionality/performance of the consumers and the "max" specifies the upper bound of the resource consumption of the consumers. By introducing "min" and "max", Pod scheduling allows the following optimizations:

  1. The slack between "min" and "max" can help to tolerate runtime failures. For example, if a Pod that consumes the entire "min" fails to run, new Pods can still be created if the "max" has not been reached. When using Kubernetes resource quota, once the Pod that consumes the entire quota is created, no other Pods can be created even if the Pod fails to run (e.g., the Pod stucks at the image pulling phase).

  2. Improve overall resource utilization by allowing one queue user to "borrow" unused reserved "min" resources from other queues. A queue user's unused "min" resources can be used by other users, under the condition that there is a mechanism to guarantee that the "victim" user can consume their "min" resources whenever they need to. Typically, this is done by implementing preemption. (A toy sketch of this borrowing rule follows below.)
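
To make the "min"/"max" semantics above concrete, here is the toy sketch referenced in point 2: a simplified admission rule for a single resource, deliberately ignoring preemption and written only to illustrate the borrowing idea, not the actual capacity-scheduling implementation.

```go
package main

import "fmt"

// canAdmit illustrates the "min"/"max" semantics: a queue may keep admitting
// pods past its guaranteed "min" (borrowing other queues' unused resources) as
// long as it stays under its own "max" and the cluster as a whole has room.
func canAdmit(queueUsed, request, queueMax, clusterUsed, clusterCapacity int64) bool {
	return queueUsed+request <= queueMax && clusterUsed+request <= clusterCapacity
}

func main() {
	// Queue A: min=10, max=20; cluster capacity 30 with 12 units in use overall.
	// A already uses its full min of 10, yet it can still admit a pod asking for
	// 5 units, because both its max (20) and the cluster (30) leave room.
	fmt.Println(canAdmit(10, 5, 20, 12, 30)) // true
	// A cannot grow past its max, even if the cluster still has capacity.
	fmt.Println(canAdmit(18, 5, 20, 20, 30)) // false
}
```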

The success of “Queue” based capacity scheduling relies on a proper Pod preemption algorithm implementation. This KEP proposes the minimal scheduler extension to support the “Queue” based scheduling based on the scheduler framework.

KEP:https://docs.google.com/document/d/1ViujTXLP1XX3WKYUTk6u5LTdJ1sX-tVIw9_t9_mLpIc/edit?usp=sharing

/assign
/cc @Huang-Wei

Nginx pods are in Pending state while testing coscheduling.

Tested in a cluster with 1 master and 1 worker.
Below are the steps I did,

  1. Created kube-schedule-configuration.yml with the below contents:
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"   #scheduler.conf file from my kubernetes master
profiles:
- schedulerName: default-scheduler
  plugins:
    queueSort:
      enabled:
        - name: Coscheduling
      disabled:
        - name: "*"
    preFilter:
      enabled:
        - name: Coscheduling
    permit:
      enabled:
        - name: Coscheduling
    reserve:
      enabled:
        - name: Coscheduling
  2. Executed the below steps:
docker build -f ./build/scheduler/Dockerfile -t test/test-scheduler:v1 . #(kube-schedule-configuration.yml and /etc/kubernetes/scheduler.conf is inside docker image)
docker push test/test-scheduler:v1
  3. Followed the steps in this link to create a second scheduler (https://kubernetes.io/docs/tasks/extend-kubernetes/configure-multiple-schedulers/)
....
  spec:
      serviceAccountName: my-scheduler
      containers:
      - command:
        - /usr/local/bin/kube-scheduler
        - --address=0.0.0.0
        - --leader-elect=false
        - --scheduler-name=my-scheduler
        - --config=kube-schedule-configuration.yml
        image: test/test-scheduler:v1
.....
  4. Deployed the below ReplicaSet with the scheduler set to my-scheduler:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
        pod-group.scheduling.sigs.k8s.io/name: nginx
        pod-group.scheduling.sigs.k8s.io/min-available: "3"
    spec:
      schedulerName: my-scheduler
      containers:
      - name: nginx
        image: nginx
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi

  5. After deploying nginx, all pods are in Pending state. I couldn't get a single pod running with any plugin configuration I tried.

Coscheduling: uninstall error

I want to try to use this component offline, so I need to uninstall it and then install it again. But I had a problem with the uninstall process. Do I need to be connected to the Internet when I uninstall? I pushed the images (acs/kube-scheduler-update and acs/kube-scheduler) to my own Docker registry. Can I use it offline after that? Thanks, could you explain?

./helm uninstall ack-coscheduling -n kube-system

but I got this error:
Error: uninstallation completed with 1 error(s): timed out waiting for the condition

scheduler-rollback-8dfv5 0/1 Pending

message: '0/2 nodes are available: 1 node(s) didn''t match node selector, 1 node(s) didn''t match pod affinity/anti-affinity.'

This is the scheduler-rollback-8dfv5 pod:

  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: component
            operator: In
            values:
            - kube-scheduler
        topologyKey: kubernetes.io/hostname
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app.kubernetes.io/type
            operator: In
            values:
            - scheduler-rollback
        topologyKey: kubernetes.io/hostname

These are my master node's labels:

Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=tjtx-90-15.58os.org
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=

Coscheduling-Test problem

I want to try the Coscheduling plugin and test it in my own k8s environment, so I prepared 2 pods belonging to the same PodGroup, which requires a minAvailable of 3 pods. So normally, the 2 pods cannot run successfully, and when I add a new pod with the same configuration, the 3 pods can run successfully.
Here is my Pod file:

[root@master coscheduling-test]# cat pod.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: nginx1
  labels: 
    pod-group.scheduling.sigs.k8s.io/name: nginx
    pod-group.scheduling.sigs.k8s.io/min-available: 3
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  priorityClassName: priority-test
  schedulerName: my-scheduler1
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx2
  labels: 
    pod-group.scheduling.sigs.k8s.io/name: nginx
    pod-group.scheduling.sigs.k8s.io/min-available: 3
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  priorityClassName: priority-test
  schedulerName: my-scheduler1

To make sure the pods have the same priority, I created a priority class file for them:

[root@master coscheduling-test]# cat priorityClass.yaml 
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata: 
  name: priority-test
value: 1000000
globalDefault: false

But when I apply the pod file, I get this error:

[root@master coscheduling-test]# kubectl apply -f pod.yaml 
Error from server (BadRequest): error when creating "pod.yaml": Pod in version "v1" cannot be handled as a Pod: v1.Pod.Spec: v1.PodSpec.ObjectMeta: v1.ObjectMeta.Namespace: Name: Labels: ReadString: expects " or n, but found 3, error found in #10 byte of ...|ailable":3,"pod-grou|..., bigger context ...|"pod-group.scheduling.sigs.k8s.io/min-available":3,"pod-group.scheduling.sigs.k8s.io/name":"nginx"},|...
Error from server (BadRequest): error when creating "pod.yaml": Pod in version "v1" cannot be handled as a Pod: v1.Pod.Spec: v1.PodSpec.ObjectMeta: v1.ObjectMeta.Namespace: Name: Labels: ReadString: expects " or n, but found 3, error found in #10 byte of ...|ailable":3,"pod-grou|..., bigger context ...|"pod-group.scheduling.sigs.k8s.io/min-available":3,"pod-group.scheduling.sigs.k8s.io/name":"nginx"},|...

Where is the problem?

Scheduler Plugins applying problem: i/o timeout

Recently, I wanted to try the QoS plugin provided by kubernetes-sigs (https://github.com/kubernetes-sigs/scheduler-plugins).
Since my Kubernetes was installed by kubeadm, all the components are running in containers, so to run my scheduler with the QoS sort plugin I have to build a container for it. I ran make on the plugins source code and then ran docker build to create an image.
My Kubernetes version is v1.18, the OS is CentOS 7.7, and Docker is 19.03.
Then I wrote a YAML file to run the container as a Pod in my Kubernetes environment; the YAML file is below:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-scheduler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-scheduler-as-kube-scheduler
subjects:
- kind: ServiceAccount
  name: my-scheduler
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: my-scheduler
  apiGroup: rbac.authorization.k8s.io
--- 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: my-scheduler
rules:
- apiGroups:
  - ""
  - events.k8s.io
  resources:
  - events
  verbs:
  - create
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - endpoints
  verbs:
  - create
- apiGroups:
  - ""
  resourceNames:
  - my-scheduler1
  resources:
  - endpoints
  verbs:
  - delete
  - get
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - delete
  - get
  - list
  - watch
  - update
- apiGroups:
  - ""
  resources:
  - bindings
  - pods/binding
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - pods/status
  verbs:
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - replicationcontrollers
  - services
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  - extensions
  resources:
  - replicasets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  - persistentvolumes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - csinodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "" 
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - watch
  - update
- apiGroups:
  - events.k8s.io
  resources:
  - events
  verbs:
  - create
  - patch
  - update  
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    component: scheduler
    tier: control-plane
  name: my-scheduler1
  namespace: kube-system
spec:
  selector:
    matchLabels:
      component: scheduler
      tier: control-plane
  replicas: 1
  template:
    metadata:
      labels:
        component: scheduler
        tier: control-plane
        version: first
    spec:
      #nodeName: master
      #tolerations:
      #- key: node-role.kubernetes.io/master
      #  operator: Exists
      #  effect: NoSchedule
      serviceAccountName: my-scheduler     
      volumes:
      - name: myscheduler-config
        hostPath:
          path: /etc/kubernetes/
      containers:
      - command:
        - kube-scheduler
        - --config=/etc/kubernetes/qos-scheduler-config.yaml
        - --kubeconfig=scheduler.conf
        image: wenshiqi/my-scheduler:3.0
        imagePullPolicy: IfNotPresent
       # livenessProbe:
       #   httpGet:
       #     path: /healthz
       #     port: 10251
       #   initialDelaySeconds: 15
        name: kube-second-scheduler
       # readinessProbe:
       #   httpGet:
       #     path: /healthz
       #     port: 10251
        resources:
          requests:
            cpu: '50m'      
        volumeMounts: 
        - name: myscheduler-config
          mountPath: /etc/kubernetes/

The scheduler config file "/etc/kubernetes/qos-scheduler-config.yaml" is:

apiVersion: kubescheduler.config.k8s.io/v1alpha2
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"
profiles:
- schedulerName: my-scheduler1
  plugins:
    queueSort:
      enabled:
      - name: QOSSort
      disabled:
      - name: "*"

Finally, the container is running on a worker node (placed there by the default scheduler), but my scheduler cannot schedule anything, so I checked the pod's logs and got these errors:

[root@master myScheduler]# kubectl logs -n kube-system -f my-scheduler1-6d5d85c5bc-lxhj5
I0901 11:39:28.825380       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
I0901 11:39:28.825529       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
I0901 11:39:29.575313       1 serving.go:313] Generated self-signed cert in-memory
I0901 11:40:00.161880       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
I0901 11:40:00.161921       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
W0901 11:40:00.164614       1 authorization.go:47] Authorization is disabled
W0901 11:40:00.164635       1 authentication.go:40] Authentication is disabled
I0901 11:40:00.164652       1 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
I0901 11:40:00.168483       1 secure_serving.go:178] Serving securely on [::]:10259
I0901 11:40:00.169776       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0901 11:40:00.169804       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0901 11:40:00.169849       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0901 11:40:00.176315       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0901 11:40:00.176336       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0901 11:40:00.270184       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I0901 11:40:00.279253       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0901 11:40:30.171828       1 trace.go:116] Trace[869437882]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.170242712 +0000 UTC m=+31.412237087) (total time: 30.001547747s):
Trace[869437882]: [30.001547747s] [30.001547747s] END
E0901 11:40:30.171876       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Pod: Get "https://apiserver.demo:6443/api/v1/pods?fieldSelector=status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.172666       1 trace.go:116] Trace[686785895]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.171880789 +0000 UTC m=+31.413875173) (total time: 30.000754954s):
Trace[686785895]: [30.000754954s] [30.000754954s] END
E0901 11:40:30.172692       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.PersistentVolumeClaim: Get "https://apiserver.demo:6443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.173037       1 trace.go:116] Trace[793270814]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.172448254 +0000 UTC m=+31.414442639) (total time: 30.000567169s):
Trace[793270814]: [30.000567169s] [30.000567169s] END
E0901 11:40:30.173062       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.PersistentVolume: Get "https://apiserver.demo:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.175910       1 trace.go:116] Trace[1105604104]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.175535559 +0000 UTC m=+31.417529924) (total time: 30.000340806s):
Trace[1105604104]: [30.000340806s] [30.000340806s] END
E0901 11:40:30.175938       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Node: Get "https://apiserver.demo:6443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.176537       1 trace.go:116] Trace[1997089750]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.176243435 +0000 UTC m=+31.418237801) (total time: 30.00027694s):
Trace[1997089750]: [30.00027694s] [30.00027694s] END
E0901 11:40:30.176551       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.StorageClass: Get "https://apiserver.demo:6443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.176582       1 trace.go:116] Trace[84268955]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.17595965 +0000 UTC m=+31.417954035) (total time: 30.000613986s):
Trace[84268955]: [30.000613986s] [30.000613986s] END
E0901 11:40:30.176588       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1beta1.PodDisruptionBudget: Get "https://apiserver.demo:6443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.176615       1 trace.go:116] Trace[923666098]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.176143796 +0000 UTC m=+31.418138158) (total time: 30.000464036s):
Trace[923666098]: [30.000464036s] [30.000464036s] END
E0901 11:40:30.176620       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Service: Get "https://apiserver.demo:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:30.178669       1 trace.go:116] Trace[1354993487]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:00.172875536 +0000 UTC m=+31.414869910) (total time: 30.005732378s):
Trace[1354993487]: [30.005732378s] [30.005732378s] END
E0901 11:40:30.178689       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.CSINode: Get "https://apiserver.demo:6443/apis/storage.k8s.io/v1/csinodes?limit=500&resourceVersion=0": dial tcp: i/o timeout
I0901 11:40:50.186268       1 trace.go:116] Trace[832137528]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2020-09-01 11:40:30.182593902 +0000 UTC m=+61.424588275) (total time: 20.003647835s):
Trace[832137528]: [20.003647835s] [20.003647835s] END
E0901 11:40:50.186287       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.CSINode: Get "https://apiserver.demo:6443/apis/storage.k8s.io/v1/csinodes?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: read udp 10.100.196.131:34078->10.96.0.10:53: i/o timeout

I don't know why it keeps saying "i/o timeout". Where is the problem, and how can I test this? I am a new learner and am really confused. At first I thought maybe the pod shouldn't be scheduled on a worker node, so I forced it onto the control-plane (master) node, but I still got errors:

I0901 11:30:06.259089       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
I0901 11:30:06.284679       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
I0901 11:30:06.709720       1 serving.go:313] Generated self-signed cert in-memory
I0901 11:30:07.416726       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
I0901 11:30:07.416839       1 registry.go:150] Registering EvenPodsSpread predicate and priority function
W0901 11:30:07.421850       1 authorization.go:47] Authorization is disabled
W0901 11:30:07.421873       1 authentication.go:40] Authentication is disabled
I0901 11:30:07.421915       1 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
I0901 11:30:07.426621       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0901 11:30:07.426665       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0901 11:30:07.426794       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0901 11:30:07.426804       1 shared_informer.go:223] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0901 11:30:07.427155       1 secure_serving.go:178] Serving securely on [::]:10259
I0901 11:30:07.428427       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I0901 11:30:07.526958       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I0901 11:30:07.527042       1 shared_informer.go:230] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
E0901 11:30:07.708359       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1beta1.PodDisruptionBudget: Get "https://apiserver.demo:6443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.709050       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Pod: Get "https://apiserver.demo:6443/api/v1/pods?fieldSelector=status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.709203       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.CSINode: Get "https://apiserver.demo:6443/apis/storage.k8s.io/v1/csinodes?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.709288       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Node: Get "https://apiserver.demo:6443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.709445       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.StorageClass: Get "https://apiserver.demo:6443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.709996       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Service: Get "https://apiserver.demo:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.709960       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.PersistentVolume: Get "https://apiserver.demo:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.710145       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.PersistentVolumeClaim: Get "https://apiserver.demo:6443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host
E0901 11:30:07.827232       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.PersistentVolumeClaim: Get "https://apiserver.demo:6443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0": dial tcp: lookup apiserver.demo on 10.96.0.10:53: no such host

This time it shows "no such host"? On which node should my own scheduler pod be scheduled, and what is the problem with my Kubernetes environment? Thanks for any help.
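
One way to check the DNS resolution the scheduler is failing on (a sketch only; the hostname apiserver.demo is taken from the logs above, and the pod name and busybox image are illustrative assumptions) is to run a one-off pod and then read its output with kubectl logs dns-check:

# Sketch only: a throwaway pod that tries to resolve the API server hostname.
apiVersion: v1
kind: Pod
metadata:
  name: dns-check                               # illustrative name
spec:
  restartPolicy: Never
  containers:
  - name: dns-check
    image: busybox:1.28                         # any image with nslookup would do
    command: ["nslookup", "apiserver.demo"]     # the hostname the scheduler cannot resolve

If this lookup fails in the same way, the issue is likely cluster DNS resolution of apiserver.demo (for example, a name defined only in /etc/hosts on the nodes) rather than the scheduler plugin itself.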

User Documentation for deploying scheduler plugins

We need some docs to help users deploy the scheduler-plugins components.
The documentation needs to cover the following steps:

  1. How to deploy the scheduler/controller of scheduler-plugins in an existing k8s cluster
  2. How to replace the existing default scheduler
  3. Deploy the controller of scheduler-plugins
  4. Create the CRD for PodGroup
  5. How to verify that the scheduler starts successfully

Different people may deploy clusters in different ways.
We can take a cluster deployed by kubeadm as an example; a minimal configuration sketch follows below.
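
As a starting point for steps 1 and 2, here is a minimal sketch of a second-scheduler configuration that enables the Coscheduling plugin in its own profile (the apiVersion, kubeconfig path, profile name, and exact extension points are assumptions and may differ between releases; the release's own manifests are authoritative):

# Sketch only: field names and extension points vary across config API versions.
apiVersion: kubescheduler.config.k8s.io/v1alpha2
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"   # assumed path to a kubeconfig with scheduler permissions
profiles:
- schedulerName: scheduler-plugins               # pods opt in via spec.schedulerName
  plugins:
    queueSort:
      enabled:
      - name: Coscheduling
      disabled:
      - name: "*"
    preFilter:
      enabled:
      - name: Coscheduling
    permit:
      enabled:
      - name: Coscheduling

Pods that should be gang-scheduled would then set spec.schedulerName to this profile's name.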

/help
