k8snetworkplumbingwg / net-attach-def-admission-controller Goto Github PK
View Code? Open in Web Editor NEWAn admission controller to check resources as defined by the NPWG spec
License: Apache License 2.0
An admission controller to check resources as defined by the NPWG spec
License: Apache License 2.0
Currently we have a situation where we're creating the ValidatingWebhookConfiguration up front, and we're setting it up to listen for all pod creation events on all namespaces.
This leaves for a situation where there's a failure of the net-attach-def-admission-controller (for example, the pod gets killed and for some reason it doesn't come back -- say, a misconfiguration), and we deny any pods from being created cluster-wide.
In order to mitigate this failure, I propose that we dynamically create the ValidatingWebhookConfiguration for the isolation feature (this is the feature that listens to pod creation events) -- and we limit the ValidatingWebhookConfiguration to listen on specific namespace(s) -- only namespaces with NetworkAttachmentDefinitions defined within them. And we create one for each namespace, or modify the namespaces under which we gate pod creation.
Psuedocode (I have not trialed nor validated this, just stubbing in the "namespaces" key) in yaml for the filtering to specific namespaces (specifically note the last line):
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
name: net-attach-def-admission-controller-isolating-config
webhooks:
- name: net-attach-def-admission-controller-isolating-config.k8s.io
clientConfig:
service:
name: net-attach-def-admission-controller-service
namespace: ${NAMESPACE}
path: "/isolate"
caBundle: ${CA_BUNDLE}
rules:
- operations: [ "CREATE" ]
apiGroups: ["apps", ""]
apiVersions: ["v1"]
resources: ["pods"]
namespaces: ["foo","bar","quux"]
I envision that there will be two processes in order to dynamically create these ValidatingWebhookConfigurations.
Upside: This greatly mitigates a failure of the net-attach-def-admission-controller at scale for namespaces (or entire deployments) that do not have defined net-attach-defs.
Downside: The namespace isolation feature in Multus must still be used. As this alone will not be enough for security purposes.
@s1061123 -- btw, I'll pick this up when I'm back from PTO!
Big topic is unit testing (so we can make quick modifications and validate we didn't cause regressions), but, we also need some smaller topics like linting.
I have scenarios where I'll need to compile with golang 1.10. I'm currently trying to fish out the issue and make it compatible, this error does not occur with golang 1.11, but, appears to with 1.10.
$ go install ./...
# github.com/K8sNetworkPlumbingWG/net-attach-def-admission-controller/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json
vendor/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go:104:16: unknown field 'caseSensitive' in struct literal of type jsoniter.Config
(original subject: networks annotations must not refer to namespaced values?)
@s1061123 updated: As of the thread conversation, we clarified that we can improve as #54 (comment)
The net-attach-def-admission-controller is rejecting the creation of our replica set due to the following:
65s Warning FailedCreate replicaset/loam-b-v1-54b6459f86 Error creating: admission webhook "net-attach-def-admission-controller-isolating-config.k8s.io" denied the request: k8s.v1.cni.cncf.io/networks annotations must not refer to namespaced values (must use local namespace, i.e. must not contain a /), rejected: my-ns/loam-b-mgmt (namespace: my-ns)
Looking at other implementations (eg: multus-cni), it is allowed to have a namespace in the network annotation.
So why is the net-attach-def-admission-controller rejecting it?
Feng mentioned during a recent demo that one thing that might not be intuitive to users is how to fix an error that's caused by a JSON parsing error, with an error message such as:
error parsing configuration: invalid character ':' after top-level value
Two thoughts regarding this:
config:
field, please check that there's valid JSON within"[centos@kube-nonetwork-master net-attach-def-admission-controller]$ kubectl logs net-attach-def-admission-controller-server-765d66c78b-nstmt --namespace=kube-system
webhook flag redefined: log_dir
panic: webhook flag redefined: log_dir
goroutine 1 [running]:
flag.(*FlagSet).Var(0xc42003a120, 0x11e3fe0, 0xc420070d20, 0x1104d9b, 0x7, 0x11281a6, 0x2f)
/usr/local/go/src/flag/flag.go:810 +0x540
flag.(*FlagSet).StringVar(0xc42003a120, 0xc420070d20, 0x1104d9b, 0x7, 0x0, 0x0, 0x11281a6, 0x2f)
/usr/local/go/src/flag/flag.go:713 +0x8b
flag.(*FlagSet).String(0xc42003a120, 0x1104d9b, 0x7, 0x0, 0x0, 0x11281a6, 0x2f, 0xc420070d10)
/usr/local/go/src/flag/flag.go:726 +0x8b
flag.String(0x1104d9b, 0x7, 0x0, 0x0, 0x11281a6, 0x2f, 0xc4200e4800)
/usr/local/go/src/flag/flag.go:733 +0x69
./hack/webhook-deployment needs some attention
While testing updates to Multus, I encountered this one:
[centos@kube-singlehost-master multus-cni]$ cat <<EOF | kubectl create -f -
> apiVersion: v1
> kind: Pod
> metadata:
> name: samplepod
> annotations:
> k8s.v1.cni.cncf.io/networks: '{
> "name": "macvlan-conf",
> "default-route": ["192.168.2.1"]
> }'
> spec:
> containers:
> - name: samplepod
> command: ["/bin/bash", "-c", "trap : TERM INT; sleep infinity & wait"]
> image: dougbtv/centos-network
> EOF
Error from server: error when creating "STDIN": admission webhook "net-attach-def-admission-controller-isolating-config.k8s.io" denied the request: parsePodNetworkAnnotation: failed to parse pod Network Attachment Selection Annotation JSON format: json: cannot unmarshal object into Go value of type []*types.NetworkSelectionElement
Looks like we'll need to update this to account for that field.
E.g. to prevent:
Oct 27 14:58:01 worker-0 crio[2934]: time="2020-10-27 14:58:01.780170471Z" level=info msg="About to del CNI network multus-cni-network (type=multus)"
Oct 27 14:58:01 worker-0 crio[2934]: time="2020-10-27 14:58:01.865939294Z" level=error msg="Error deleting network: delegateDel: error invoking DelegateDel - \"egress-router\": error in getting result from DelNetwork: invalid version \"\": the version is empty"
I've ran into some issues before where I've got to remove the CSR to let it regen everything. There's probably a number of these we can fix up in the installer
Feedback from @dcbw
at the very least if the secret exists, bail
eg if we have a secret already, then we don't need to do any of the CSR stuff
it should perhaps still reconcile (and recrete) the webhook config and service, if they don't exist
Some commentary that I received regarding the admission controller:
- The self-approval at https://github.com/K8sNetworkPlumbingWG/net-attach-def-admission-controller/blob/develop/pkg/installer/installer.go#L99 looks concerning in the current installer. Does the controller run with cluster admin permissions to be able to do that? That's generally considered bad practice. If the installer is a separate binary with an extra privileged kubeconfig, it's fine.
- https://github.com/K8sNetworkPlumbingWG/net-attach-def-admission-controller/blob/develop/pkg/installer/installer.go#L175 โ pods are always in the legacy core API group, not in apps.
- I assume the webhook is running as a pod in the cluster. Its namespace should be excluded, compare https://github.com/K8sNetworkPlumbingWG/net-attach-def-admission-controller/blob/develop/vendor/k8s.io/api/admissionregistration/v1beta1/types.go#L193.
- Instead it uses
failurePolicy := arv1beta1.Ignore
which means the admission webhook logic will be ignored if the webhook is down. I cannot judge how security critical the webhook is, just be aware of it.
Additionally some references:
For the control plane cert rotation we are using https://github.com/openshift/library-go/tree/master/pkg/operator/certrotation. For service serving certs we have https://github.com/openshift/service-serving-cert-signer in OpenShift maintained by the auth team, not sure about the state of rotation in there though. But propably that would be the way to go.
Example, this is actually admitted:
I0805 14:32:25.409571 1 webhook.go:117] validating network config spec: { "cniVersion": "0.4.0", "type": "egress-router", "name": "egress-router-cni-nad", "ip": { "addresses": [ "fd2e:6f44:5dd8::64/64" ], "destinations": ["80 TCP 2607:f8b0:4004:808::200e","8080 TCP 2600:1408:20:c81::3831 80","8888 TCP 2001:420:1101:1::185 80"],
"gateway": "fe80::5054:ff:feb0:6560" }, "log_file": "/tmp/egress-router-log", "log_level": "debug" }
I0805 14:32:25.410002 1 webhook.go:141] spec is not a valid network config list: error parsing configuration list: no 'plugins' key - trying to parse into standalone config
I0805 14:32:25.410209 1 webhook.go:154] AdmissionReview request allowed: Network Attachment Definition '{"cniVersion":"0.4.0","ip":{"addresses":["fd2e:6f44:5dd8::64/64"],"destinations":["80 TCP 2607:f8b0:4004:808::200e","8080 TCP 2600:1408:20:c81::3831 80","8888 TCP 2001:420:1101:1::185 80"],"gateway":"fe80::5054:ff:feb0:6560"},"log_file":"/tmp/egress-router-log","log_level":"debug","name":"egress-router-cni-nad","type":"egress-router"}' is valid
So it shouldn't say error, just a notice (if anything at all)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.