
azure / azure-container-networking

Stars: 359 · Watchers: 36 · Forks: 228 · Size: 81.95 MB

Azure Container Networking Solutions for Linux and Windows Containers

License: MIT License

Go 95.61% Makefile 1.33% Shell 1.98% PowerShell 0.49% Dockerfile 0.34% Starlark 0.01% Python 0.09% C 0.14%
container-networking azure azure-container azure-container-service linux-containers windows-containers kubernetes-networking

azure-container-networking's Introduction

Microsoft Azure Container Networking


Overview

This repository contains container networking services and plugins for Linux and Windows containers running on Azure:

The azure-vnet network plugins connect containers to your Azure VNET, to take advantage of Azure SDN capabilities. The azure-vnet-ipam IPAM plugins provide address management functionality for container IP addresses allocated from Azure VNET address space.
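
In CNI terms, the two plugins are typically paired in a single network configuration. A minimal sketch, mirroring the configs quoted in the issues further down this page:

{
  "cniVersion": "0.3.0",
  "name": "azure",
  "type": "azure-vnet",
  "mode": "bridge",
  "bridge": "azure0",
  "ipam": {
    "type": "azure-vnet-ipam"
  }
}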

The plugins are supported on Microsoft Azure and are offered as part of Azure Kubernetes Service (AKS), as well as for individual Azure IaaS VMs. For Kubernetes clusters created by aks-engine, both plugins are deployed and configured automatically on Linux and Windows nodes by default.

The next generation of the Azure CNI Plugin is powered by Cilium. Learn more at Azure CNI Powered by Cilium.

Documentation

See Documentation for more information and examples.

Build

This repository builds on Windows and Linux. To get the latest version, build the plugins directly from source:

$ git clone https://github.com/Azure/azure-container-networking
$ cd azure-container-networking
$ make all-binaries
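
The binaries land in per-platform subdirectories of output/ (the same paths that appear in the build logs quoted in the issues below); for example, the Linux CNI plugin is written to output/linux_amd64/cni/azure-vnet, so a quick sanity check is:

$ ls output/linux_amd64/cni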

Then follow the instructions for the plugin in Documentation.

Contributions

Contributions in the form of bug reports, feature requests and PRs are always welcome.

Please follow these steps before submitting a PR:

  • Create an issue describing the bug or feature request.
  • Clone the repository and create a topic branch.
  • Make changes, adding new tests for new functionality.
  • Submit a PR.

License

See LICENSE.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

azure-container-networking's People

Contributors

aegal, aggarwal0009, ashvindeodhar, behzad-mir, camrynl, csfmomo, dependabot[bot], huntergregory, jaer-tsun, jpayne3506, jungukcho, kmurudi, matmerr, nddq, neaggarwms, ninzavivek, ofiliz, paulyufan2, pjohnst5, qxbytes, ramiro-gamarra, rbtr, rjdenney, sharmasushant, tamilmani1989, thatmattlong, timraymond, vakalapa, vipul-21, zetaozhuang


azure-container-networking's Issues

Kubernetes Network Policies not enforced

Is this a request for help?:

Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue

Which release version?:

azure-npm:v0.0.3

Which component (CNI/IPAM/CNM/CNS):

Network Policy Manager

Which Operating System (Linux/Windows):

Linux - AKS

Which Orchestrator and version (e.g. Kubernetes, Docker)

Kubernetes

What happened:

Deployed https://github.com/Azure/acs-engine/blob/master/parts/k8s/addons/kubernetesmasteraddons-azure-npm-daemonset.yaml and tried https://github.com/ahmetb/kubernetes-network-policy-recipes/blob/master/04-deny-traffic-from-other-namespaces.md. The pod is still accessible from other namespaces. Samples using pod selectors do seem to work.

What you expected to happen:

Traffic to be blocked from outside the namespace.

How to reproduce it (as minimally and precisely as possible):

Deploy AKS with advanced networking. Deploy https://github.com/Azure/acs-engine/blob/master/parts/k8s/addons/kubernetesmasteraddons-azure-npm-daemonset.yaml. Follow https://github.com/ahmetb/kubernetes-network-policy-recipes/blob/master/04-deny-traffic-from-other-namespaces.md.
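
For reference, the recipe being tested boils down to a policy like the following (a sketch of the deny-from-other-namespaces recipe; the metadata names are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces
  namespace: default
spec:
  podSelector:
    matchLabels: {}       # select every pod in this namespace
  ingress:
  - from:
    - podSelector: {}     # allow ingress only from pods in the same namespace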

Make CNI Network Monitor part of CNI

Is this a request for help?:
NO

Is this an ISSUE or FEATURE REQUEST? (choose one):
FEATURE REQUEST

Which component (CNI/IPAM/CNM/CNS):
CNI

Which Operating System (Linux/Windows):
Linux

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

azure-cni returns unparseable output on IP address exhaustion, causes infinite retry loop

Is this an ISSUE or FEATURE REQUEST? (choose one): Issue


Which release version?: 1.0.7


Which component (CNI/IPAM/CNM/CNS): CNI


Which Operating System (Linux/Windows): Windows Server version 1803


When the IP range for a node is exhausted, the kubelet needs to be able to parse the CNI output correctly to determine the error. On IP exhaustion, the pod should be evicted and scheduled on another node. Since the output cannot be parsed, the kubelet goes into a loop trying to schedule the container forever. If too many pods are doing this, it can make the node unresponsive.

Logs from kubelet:

E0712 22:28:39.344476    2872 cni.go:260] Error adding network: netplugin failed but error parsing its diagnostic message "{\n    \"cniVersion\": \"0.3.0\",\n    \"interfaces\":
 [\n        {\n            \"name\": \"eth0\"\n        }\n    ],\n    \"dns\": {}\n}{\n    \"code\": 100,\n    \"msg\": \"Failed to allocate address: Failed to delegate: Failed
to allocate address: No available addresses\"\n}": invalid character '{' after top-level value
E0712 22:28:39.344476    2872 cni_windows.go:49] error while adding to cni network: netplugin failed but error parsing its diagnostic message "{\n    \"cniVersion\": \"0.3.0\",\
n    \"interfaces\": [\n        {\n            \"name\": \"eth0\"\n        }\n    ],\n    \"dns\": {}\n}{\n    \"code\": 100,\n    \"msg\": \"Failed to allocate address: Failed
to delegate: Failed to allocate address: No available addresses\"\n}": invalid character '{' after top-level value
W0712 22:28:39.344476    2872 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "iis-1803-688b96c57b-rqlxs_def
ault": netplugin failed but error parsing its diagnostic message "{\n    \"cniVersion\": \"0.3.0\",\n    \"interfaces\": [\n        {\n            \"name\": \"eth0\"\n        }\
n    ],\n    \"dns\": {}\n}{\n    \"code\": 100,\n    \"msg\": \"Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses\"\n}": invali
d character '{' after top-level value
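
The diagnostic message above is two JSON objects concatenated: the interim ADD result followed by the error object. Per the CNI spec, a failing plugin should emit a single JSON error object on stdout, roughly like this (a sketch built from the fields visible in the log):

{
    "cniVersion": "0.3.0",
    "code": 100,
    "msg": "Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses"
}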

Steps to reproduce:

  • Deploy a 2-node Windows cluster using acs-engine v0.19.2
  • Cordon one of the nodes to force all to be scheduled on 1 node
  • Create an IIS deployment with kubectl apply -f https://gist.github.com/PatrickLang/0df013d20d32eb98bc57456c4f73461a
  • Attempt to scale it to 30 kubectl scale deploy/iis-1803 --replicas=30
  • Uncordon the other node
  • Run kubectl get pod -o wide to watch whether any fail over to the other node. They won't.
  • Look at c:\k\kubelet.err.log on the node

Add PR build jobs to test the code before merge

Is this a request for help? : Yes

Is this an ISSUE or FEATURE REQUEST? (choose one): ISSUE

Which release version?: all future versions

Which component (CNI/IPAM/CNM/CNS): all

Which Operating System (Linux/Windows): all

Which Orchestrator and version (e.g. Kubernetes, Docker): N/A

What happened: On PRs to this repo, there are no tests run automatically.

What you expected to happen: PR build jobs to verify the code of the PR before merge.

How to reproduce it (as minimally and precisely as possible): Open a PR to this repo.

Anything else we need to know

Having PR build jobs will help prove that new code works as expected and prevent regressions. This repo has code that is expected to run on both Windows and Linux. For contributors who only develop on one of those systems, the PR build jobs will also help double-check that the code works on the other system.

Basic tests:

  • unit tests (go test)
  • formatting (gofmt)
  • code coverage

More sophisticated tests (for bonus points):

  • stress tests
  • fuzz testing

You can use AppVeyor to run a Windows build job for PRs. Microsoft has an example setup for the hcsshim repo: https://github.com/Microsoft/hcsshim/blob/master/appveyor.yml

You can use CircleCI or Travis to set up a Linux build job for PRs. For example, cni uses Travis:
https://github.com/containernetworking/cni/blob/master/.travis.yml
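
A minimal Linux job along those lines might look like the following (a sketch, not a tested config; the Go version is illustrative):

language: go
go:
  - "1.10.x"
script:
  - test -z "$(gofmt -l .)"   # fail if any file is not gofmt-clean
  - go build ./...
  - go test ./...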

Using CNM on the docker, the containers can not be reached

Hi,

I created a docker network with azure-vnet-plugin and created a few containers (Ubuntu, nginx); they can't be reached from a VM in the same VNET.

The VM IP: 10.1.0.4
The alpine container IP: 10.1.0.7
The test VM IP: 10.1.0.6

steven@myubuntutest:~$ sudo docker run -it --net=azure alpine
/ # ifconfig
eth0 Link encap:Ethernet HWaddr 5A:E0:6A:AA:15:50
inet addr:10.1.0.7 Bcast:10.1.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
/ # ping 10.1.0.6
PING 10.1.0.6 (10.1.0.6): 56 data bytes

From the VM 10.1.0.6, the alpine container cannot be reached either, even though they are in the same VNET. I tried nginx; same issue. Any idea or advice?

More information below:
vnet-plugin version: 1.0.1

steven@myubuntutest:~$ docker --version
Docker version 17.12.0-ce, build c97c6d6
steven@myubuntutest:~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
316fa9ad9510 azure azure-vnet local

Thanks a lot!

run CNM sample with Docker on Azure, can't run docker

Hi,
When I follow the example to create an Ubuntu VM, install docker and the vnet plugin, and run the sample:
docker run -it --rm --net=azure ubuntu:latest /bin/bash

It shows:
docker: Error response from daemon: No address returned.

The docker network creation is successful:
$ sudo docker network create --driver=azure-vnet --ipam-driver=azure-vnet --subnet=10.1.0.0/24 azure
316fa9ad951081a6b7553b7f51b967e50aa5cdbc3b9124761ed5bd47cf5ab352

Is there anything I'm missing here?

Create new custom chain for ebtables

Is this an ISSUE or FEATURE REQUEST? (choose one):
FEATURE REQUEST

Which component (CNI/IPAM/CNM/CNS):
CNI/CNS

Which Operating System (Linux/Windows):
Linux

This requires CNI network monitor to start monitoring the new chain.
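
A custom chain of the kind requested would be created and wired up roughly as follows (a sketch; the chain name AZURE-PREROUTING is illustrative):

ebtables -t nat -N AZURE-PREROUTING                 # create the user-defined chain
ebtables -t nat -A PREROUTING -j AZURE-PREROUTING   # divert PREROUTING traffic into it
ebtables -t nat -L AZURE-PREROUTING                 # its rules can now be listed and monitored in isolation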

The project is not vendored

Is this a request for help?:

Not really


Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue


Which release version?:
All versions

Which component (CNI/IPAM/CNM/CNS):

All components


The dependencies used to build the binaries are not checked into this repository.
For reproducible builds across forks and over time, it would be helpful to have the dependencies (/ libraries) vendored and checked into this repository.
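
With today's Go toolchain the ask amounts to the following (a sketch; this repository predates Go modules, so at the time it would have meant a tool such as dep or govendor instead):

go mod init github.com/Azure/azure-container-networking
go mod tidy     # resolve and pin dependency versions in go.mod / go.sum
go mod vendor   # copy the pinned dependencies into ./vendor for check-in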

will vnet plugin support multi-subnet

Is this a request for help?:
yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
FEATURE REQUEST

Which release version?:
1.0.4

Which component (CNI/IPAM/CNM/CNS):
CNI/IPAM

Will the vnet plugin support multiple subnets in a k8s cluster with acs-engine and AKS?
Do we have a roadmap?

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

build fails

make all-binaries
go build -v -o output/linux_amd64/azure-cnm-plugin -ldflags "-X main.version=v0.7-2-gf768da9-dirty -s -w" cnm/plugin/*.go
github.com/Azure/azure-container-networking/log
github.com/Azure/azure-container-networking/store
github.com/Azure/azure-container-networking/platform

github.com/Azure/azure-container-networking/log

log/logger.go:44: logger.SetTarget undefined (type Logger has no field or method SetTarget)
log/stdapi.go:19: stdLog.SetTarget undefined (type *Logger has no field or method SetTarget)
make: *** [output/linux_amd64/azure-cnm-plugin] Error 2

ClusterIPs should also be assigned by Azure SDN

Only the pod IP space is assigned by the Azure SDN, but there should also be a mechanism to assign the service CIDR space from the Azure SDN. This would allow the flexibility to place load balancers within the Azure SDN, whether for Azure-internal communication from within Kubernetes or for external services via the LoadBalancer type.

Intermittent "Store is locked vnet" on azure-cni 1.0.7

Is this an ISSUE or FEATURE REQUEST? (choose one): issue


Which release version?: 1.0.7


Which component (CNI/IPAM/CNM/CNS): CNI


Which Operating System (Linux/Windows): Windows (maybe Linux too)


For windows: provide output of "$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion" 10.0.17134.112


Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes v1.11


What happened:

I'm getting this intermittently in the logs with no apparent failures. Pods still start eventually, and I'm not out of IPs:

2018/07/11 22:47:05 [cni] Timed out on locking store, err:Store is locked.
2018/07/11 22:47:05 Failed to initialize key-value store of network plugin, err:Store is locked.

What you expected to happen:

No errors from CNI


How to reproduce it (as minimally and precisely as possible):
Deploy k8s cluster using acs-engine v0.19.2
Deploy 16 pods to a Windows node (https://raw.githubusercontent.com/PatrickLang/Windows-K8s-Samples/master/iis/iis-1803.yaml)


cc @CecileRobertMichon

Seeing "index out of range" error

I am seeing the following errors on a few nodes.

Plugin Version: 0.91

config:


{
  "cniVersion": "0.3.1",
  "name": "azure",
  "type": "azure-vnet",
  "master": "eth0",
  "bridge": "azure0",
   "mode": "bridge",
  "logLevel": "info",
  "ipam": {
    "type": "azure-vnet-ipam",
    "environment": "azure"
  }
}
2017/12/16 12:40:52 [cni-net] ADD command completed with result:IP:[{Version:4 Interface:<nil> Address:{IP:10.170.128.108 Mask:ffff8000} Gateway:10.170.128.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.170.128.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:<nil>.
2017/12/16 12:40:52 [cni] Recovered panic: runtime error: index out of range goroutine 1 [running]:
github.com/Azure/azure-container-networking/cni.(*Plugin).Execute.func1(0xc42011fef0)
	/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0xbc
panic(0x674a60, 0x804490)
	/usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/Azure/azure-container-networking/network.(*networkManager).addBridgeRules(0xc420014500, 0xc42001e900, 0xc420019280, 0xc420011660, 0x6, 0xc420011740, 0x6, 0x0, 0xf)
	/go/src/github.com/Azure/azure-container-networking/network/network_linux.go:155 +0x4eb
github.com/Azure/azure-container-networking/network.(*networkManager).connectExternalInterface(0xc420014500, 0xc42001e900, 0xc420084d80, 0x0, 0x0)
	/go/src/github.com/Azure/azure-container-networking/network/network_linux.go:255 +0x542
github.com/Azure/azure-container-networking/network.(*networkManager).newNetworkImpl(0xc420014500, 0xc420084d80, 0xc42001e900, 0x5, 0x82a160, 0xc42011f8a0)
	/go/src/github.com/Azure/azure-container-networking/network/network_linux.go:36 +0xa2
github.com/Azure/azure-container-networking/network.(*networkManager).newNetwork(0xc420014500, 0xc420084d80, 0x0, 0x0, 0x0)
	/go/src/github.com/Azure/azure-container-networking/network/network.go:149 +0x1f3
github.com/Azure/azure-container-networking/network.(*networkManager).CreateNetwork(0xc420014500, 0xc420084d80, 0x0, 0x0)
	/go/src/github.com/Azure/azure-container-networking/network/manager.go:163 +0x95
github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add(0xc4200d6620, 0xc42001c230, 0x0, 0x0)
	/go/src/github.com/Azure/azure-container-networking/cni/network/network.go:200 +0x18aa
github.com/Azure/azure-container-networking/cni.(PluginApi).Add-fm(0xc42001c230, 0xc4200114a6, 0x5)
	/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:112 +0x39
github.com/containernetworking/cni/pkg/skel.(*dispatcher).checkVersionAndCall(0xc420018a00, 0xc42001c230, 0x7e1060, 0xc420017620, 0xc42005feb0, 0x0, 0xc420020000)
	/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:167 +0x19f
github.com/containernetworking/cni/pkg/skel.(*dispatcher).pluginMain(0xc420018a00, 0xc42005feb0, 0xc42005fe98, 0x7e1060, 0xc420017620, 0xc42005fe68)
	/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:178 +0x2f8
github.com/containernetworking/cni/pkg/skel.PluginMainWithError(0xc42005feb0, 0xc42005fe98, 0x7e1060, 0xc420017620, 0xc420017620)
	/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:215 +0xed
github.com/Azure/azure-container-networking/cni.(*Plugin).Execute(0xc42000e040, 0x7e0f20, 0xc4200d6620, 0x0, 0x0)
	/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:112 +0x127
main.main()
	/go/src/github.com/Azure/azure-container-networking/cni/network/plugin/main.go:35 +0x328

2017/12/16 12:40:52 [cni-net] Plugin stopped.

cni: panic when testing standalone with cnitool

cnitool uses a very short ContainerID, which causes a panic in the plugin. This is a minor nit and should not matter in production, but standalone validation with cnitool fails.

https://github.com/containernetworking/cni/blob/master/cnitool/cni.go#L57

	rt := &libcni.RuntimeConf{
		ContainerID: "cni",
		NetNS:       netns,
		IfName:      "eth0",
	}

A bounds check is needed here
https://github.com/Azure/azure-container-networking/blob/master/cni/network/network.go#L89

// GetEndpointID returns a unique endpoint ID based on the CNI args.
func (plugin *netPlugin) getEndpointID(args *cniSkel.CmdArgs) string {
	return args.ContainerID[:8] + "-" + args.IfName
}
panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add(0xc4200d0560, 0xc420060070, 0x3, 0x0)
        /home/tester/go/src/github.com/Azure/azure-container-networking/cni/network/network.go:142 +0x1982
github.com/Azure/azure-container-networking/cni.(PluginApi).Add-fm(0xc420060070, 0xc4200696d0, 0x5)
        /home/tester/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0x39
github.com/containernetworking/cni/pkg/skel.(*dispatcher).checkVersionAndCall(0xc42006ee80, 0xc420060070, 0x7da060, 0xc42006d7d0, 0xc42004deb0, 0x0, 0x0)
        /home/tester/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:162 +0x19f
github.com/containernetworking/cni/pkg/skel.(*dispatcher).pluginMain(0xc42006ee80, 0xc42004deb0, 0xc42004de98, 0x7da060, 0xc42006d7d0, 0x0)
        /home/tester/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:173 +0x2f8
github.com/containernetworking/cni/pkg/skel.PluginMainWithError(0xc42004deb0, 0xc42004de98, 0x7da060, 0xc42006d7d0, 0xc42006d7d0)
        /home/tester/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:210 +0xed
github.com/Azure/azure-container-networking/cni.(*Plugin).Execute(0xc420080030, 0x7d9f20, 0xc4200d0560, 0x0, 0xc4200642d0)
        /home/tester/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0xd6
main.main()
        /home/tester/go/src/github.com/Azure/azure-container-networking/cni/network/plugin/main.go:35 +0x328
netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
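
A minimal guard along the lines suggested above might look like this (a sketch, not the project's actual fix):

// getEndpointID returns a unique endpoint ID based on the CNI args,
// tolerating container IDs shorter than 8 characters.
func (plugin *netPlugin) getEndpointID(args *cniSkel.CmdArgs) string {
	containerID := args.ContainerID
	if len(containerID) > 8 {
		containerID = containerID[:8]
	}
	return containerID + "-" + args.IfName
}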

Azure IPAM allocates subnet per pod on the node if subnet is not specified in network config

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
I tried on azure v1.0.2

Which component (CNI/IPAM/CNM/CNS):
CNI IPAM

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

What happened:
As explained in PR #105, Azure IPAM fails to allocate IPs from the second pod onwards on a node if the subnet in the network config is not set.

What you expected to happen:
Azure should allocate a pool if one is not already allocated, and then allocate an IP for every pod from that pool.

How to reproduce it (as minimally and precisely as possible):

  1. Create a vnet 172.17.0.0/16
  2. Create 3 subnets in the vnet 172.17.1.0/24, 172.17.2.0/24 , 172.17.3.0/24

[Screenshot: the three subnets created in the vnet]

  3. Create 3 Linux VMs on Azure and attach VM1 to the default subnet, VM2 to ucp-subnet, and VM3 to linux-subnet.
    The Azure metadata server on each of the VMs now has access to the /24 subnet it is attached to.
  4. Install Kubernetes using kubeadm.
  5. Now that Kubernetes is running, copy the azure-vnet-ipam binary to the CNI directory.
  6. Deploy the Calico CNI and point the ipam to azure-vnet-ipam. Here is the calico.yaml I used:
    https://gist.github.com/abhi/b752cc662297c12b9780c0297d16181b
    Note the only change in this yaml from the standard calico yaml is that the ipam is set to azure-vnet-ipam and the address pool for calico is set to the main address pool 172.17.0.0/16 (though this is not relevant in my setup).
  7. Now that the CNI and IPAM plugins are deployed, Kubernetes is ready for deployment.
kubectl get nodes
NAME           STATUS     ROLES     AGE       VERSION
abhi-cali-1    Ready      master    49m       v1.10.0
abhi-cali-2   NotReady   <none>    48m       v1.10.0
abhi-cali-3    Ready      <none>    48m       v1.10.0
  8. Deploy pods: kubectl run curl-image6 --image=sequenceiq/alpine-curl --replicas=10 -- sleep 900000
kubectl get pods -o wide
NAME                           READY     STATUS              RESTARTS   AGE       IP            NODE
curl-image6-55f998f9cb-29b64   0/1       ContainerCreating   0          16m       <none>        abhi-cali-3
curl-image6-55f998f9cb-4n8g8   0/1       ContainerCreating   0          16m       <none>        abhi-cali-3
curl-image6-55f998f9cb-9q9qx   0/1       ContainerCreating   0          16m       <none>        abhi-cali-2
curl-image6-55f998f9cb-dfw8f   0/1       ContainerCreating   0          16m       <none>        abhi-cali-2
curl-image6-55f998f9cb-fsgh6   0/1       ContainerCreating   0          16m       <none>        abhi-cali-3
curl-image6-55f998f9cb-fwwq4   0/1       ContainerCreating   0          16m       <none>        abhi-cali-3
curl-image6-55f998f9cb-mlh5s   0/1       ContainerCreating   0          16m       <none>        abhi-cali-2
curl-image6-55f998f9cb-p9zff   1/1       Running             0          16m       172.17.2.80   abhi-cali-3
curl-image6-55f998f9cb-w5t2d   1/1       Running             0          16m       172.17.0.62   abhi-cali-2
curl-image6-55f998f9cb-zt77x   0/1       ContainerCreating   0          16m       <none>        abhi-cali-2

Logs from Azure IPAM on one of the nodes where the second pod failed:

2018/04/02 21:36:46 [cni-ipam] Plugin started.
2018/04/02 21:36:46 [cni-ipam] Processing ADD command with args {ContainerID:0aafa74c66f31f16254a86429e79b4ed0cb7f61a05bffa99ea31239baabc022a Netns:/proc/105794/ns/net IfName:eth0 Args:IgnoreUnknown=1;IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=curl-image6-55f998f9cb-mlh5s;K8S_POD_INFRA_CONTAINER_ID=0aafa74c66f31f16254a86429e79b4ed0cb7f61a05bffa99ea31239baabc022a Path:/opt/calico/bin:/opt/cni/bin}.
2018/04/02 21:36:46 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:k8s-pod-network Type:calico Mode: Master: Bridge: LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet: Address: QueryInterval:}}.
2018/04/02 21:36:46 [ipam] Starting source azure.
2018/04/02 21:36:46 [ipam] Refreshing address source.
2018/04/02 21:36:46 [ipam] Save succeeded.
2018/04/02 21:36:46 [ipam] Requesting pool with poolId: options:map[azure.interface.name:] v6:false.
2018/04/02 21:36:46 [ipam] Checking pool 172.17.0.0/24.
2018/04/02 21:36:46 [ipam] Pool is in use.
2018/04/02 21:36:46 [ipam] Pool request completed with pool:<nil> err:No available address pools.
2018/04/02 21:36:46 [azure-vnet-ipam] Failed to allocate pool: No available address pools.
2018/04/02 21:36:46 [cni-ipam] ADD command completed with result:<nil> err:Failed to allocate pool: No available address pools.
2018/04/02 21:36:46 [cni-ipam] Plugin stopped.

Anything else we need to know:
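One workaround implied by the report is to pin the node's subnet explicitly in the ipam section of the network config, so the plugin requests addresses from that specific pool instead of trying to claim a fresh pool per pod. A sketch; the subnet value is one of the subnets from the repro above and would differ per node, and the exact JSON key is an assumption based on the config structs visible in the logs:

"ipam": {
    "type": "azure-vnet-ipam",
    "subnet": "172.17.1.0/24"
}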

Hairpin-mode not correctly set on azveth interfaces

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
0.3.0

Which component (CNI/IPAM/CNM/CNS):
CNI (I think)

Which Operating System (Linux/Windows):
Linux 4.13.0-1018-azure #21-Ubuntu SMP Thu May 17 13:58:38 UTC 2018 x86_64

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes 1.9.7, Docker 17.05.*

What happened:
As described on the Kubernetes website: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#a-pod-cannot-reach-itself-via-service-ip

Hairpin-mode is correctly set to hairpin-veth, but all azveth interfaces are created with hairpin_mode off.

What you expected to happen:
When running with --hairpin-mode=hairpin-veth, all azveth interface are created with hairpin_mode on.

How to reproduce it (as minimally and precisely as possible):
Roll out an ACS-Engine cluster with 'azure' networking and check the azveth interfaces for hairpin_mode (in my experience, the hairpin-veth option is the default so there is no need to set it; check with 'journalctl -u kubelet.service').

Anything else we need to know:
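To inspect and flip the setting by hand (a sketch; the bridge and interface names follow the azure0/azveth convention described above, and azvethXXXX is a placeholder):

# show hairpin_mode for every port attached to the azure0 bridge
for p in /sys/class/net/azure0/brif/*; do
  echo "$(basename $p): $(cat $p/hairpin_mode)"
done

# enable hairpin on a single veth with iproute2
bridge link set dev azvethXXXX hairpin on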

k8s w/ Azure CNI agent node lost InternalIP

Tracking this here:

Azure/acs-engine#3503

Here are kubelet logs from the affected node:

Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 File not exist /var/run/AzureCNITelemetry.json
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 GetReport state file didn't exist. Setting flag to true
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 "Start Flag true CniSucceeded false Name CNI Version v1.0.7 ErrorMessage  vnet []
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]:                                 Context AzureCNI SubContext "
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 OrchestratorDetails &{kubectl controls }
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 OSDetails &{linux "16.04.4 LTS (Xenial Xerus)" 4.15.0-1013-azure ubuntu }
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 SystemDetails &{6946 3562 0 198471 194949 2 }
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 InterfaceDetails &{Primary 10.239.0.0/16 10.239.0.35 00:0d:3a:f9:9c:9d azure0 30 0 }
Jul 18 07:44:58 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:44:58 BridgeDetails <nil>
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 SendReport failed due to [Azure CNI] HTTP Post returned statuscode 500
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 "Start Flag true CniSucceeded true Name CNI Version v1.0.7 ErrorMessage  vnet []
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]:                                 Context AzureCNI SubContext "
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 OrchestratorDetails &{kubectl controls }
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 OSDetails &{linux "16.04.4 LTS (Xenial Xerus)" 4.15.0-1013-azure ubuntu }
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 SystemDetails &{6946 3562 0 198471 194949 2 }
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 InterfaceDetails &{Primary 10.239.0.0/16 10.239.0.35 00:0d:3a:f9:9c:9d azure0 30 0 }
Jul 18 07:45:00 k8s-agentmd-34767551-2 kubelet[8431]: 2018/07/18 07:45:00 BridgeDetails <nil>
Jul 18 07:45:02 k8s-agentmd-34767551-2 kubelet[8431]: W0718 07:45:02.692082    8431 docker_sandbox.go:372] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "ngin
Jul 18 07:45:02 k8s-agentmd-34767551-2 kubelet[8431]:  with error: exit status 1

Do the above logs suggest that some error from Azure CNI caused the node to lose its InternalIP?

While the node was in this state, we were able to manually query the instance metadata service, e.g.,

`curl -H Metadata:true "http://169.254.169.254/metadata/instance?api-version=2017-08-01"`.

Restarting kubelet on the affected node fixed this.

Any ideas?

CNI NetworkPolicy support

I have installed a Kubernetes cluster that makes use of the azure vnet and ipam CNI plugins. However, they don't seem to support the NetworkPolicy functionality in order to manage accessibility between containers through kubernetes configuration. Am I right?

Are we supposed to use another tool like Calico for this NetworkPolicy functionality, do they work well together?

Thank you.

Containers in the CNM managed space can't be reached by loadbalancer, can't access internet

After the issues described in #78, I decided to go with a simple approach and use only one subnet. Everything works as expected; the only issue is that the container isn't reachable by the Azure load balancer and also can't access the internet.

Docker hosts:

  • subnet 10.4.70.0/24
  • docker-1: ip 10.4.70.201 (ips .11, .12, .13, .14) as secondary ips in azure portal
  • docker-2: ip 10.4.70.202 (ips .21, .22, .23, .24) as secondary ips in azure portal

What works:

  • Docker containers on both hosts can connect to each other.
  • Hosts on different subnets (e.g. 10.4.69.0/24) can connect to the host ip (.201) and also to the containers
  • Load balancer can connect to the host ip (.201)
  • Docker container has connectivity to other subnets
  • DNS resolution inside docker container

What doesn't work:

  • Load balancer can't connect to services running on the container ips (.11,...)
  • Docker containers have no connectivity to the internet

POD cannot connect to itself via service ip

With the azure network plugin, a pod listening on a TCP port (say 8080), and a service fronting that pod:

The pod CAN connect back to itself via localhost:8080
The pod CAN connect back to itself via <pod-ip>:8080
The pod CAN NOT connect back to itself via <service-ip>:8080
Point 3 is a problem: this should be possible.

Other notes:

The node itself, as well as other pods in the system, CAN connect correctly to <pod-ip>:8080.
Point 3 works correctly with the kubenet network plugin, it is the azure network plugin that is at fault.

Windows container not pulling DNS server from VNET

Is this a request for help?:

Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue

Which release version?:

1.0.4

Which component (CNI/IPAM/CNM/CNS):

CNM

Which Orchestrator and version (e.g. Kubernetes, Docker)
Docker (Windows with Containers)

What happened:
The container receives an IP address, subnet mask, and gateway from the Azure Virtual Network, but not DNS servers.

What you expected to happen:
Expected to have DNS servers set on vEthernet interface.

How to reproduce it (as minimally and precisely as possible):

  1. Create Azure VM running Windows Server 2016 Datacenter with Containers image.
  2. Execute azure-vnet-plugin.exe and create docker network named azure
  3. Create "ip-config" in Azure Portal for NIC associated with Azure VM.
  4. Create windowsservercore container using --network=azure
  5. Connect to container PowerShell (docker exec -i containername powershell.exe)
  6. Issue Get-DNSClientServerAddress command and note that IPv4 DNS servers are blank

Anything else we need to know:
I have not seen any other configuration in the Azure ip-config settings that would allow me to specify DNS settings, so I would assume the virtual network's DNS servers would be pulled. The ip-config created in step 3 sets the address that the container pulls from the VNET.

Thanks for any help.

Will ipvlan and hardware offloading be used for Linux container networking in Azure?

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
Feature request

Which release version?:


Which component (CNI/IPAM/CNM/CNS):
CNI

The current version uses a Linux bridge to forward Ethernet packets, but there are many implementations that can do this. Do we have a benchmark of Linux bridge vs. ipvlan L2, macvtap, etc.? I saw some ipvlan code in the repo, so why don't we use it?

Does Azure have any plan to use the NIC to do some offloading to improve container networking performance?

Which Orchestrator and version (e.g. Kubernetes, Docker)
K8s cni

What happened:
Ebtables and software bridge

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

Azure CNI IP for Windows Node addresses exhausted

Is this a request for help?: YES


Is this an ISSUE or FEATURE REQUEST? (choose one): ISSUE

Which release version?: v1.0.2


Which component (CNI/IPAM/CNM/CNS): CNI / IPAM


Which Operating System (Linux/Windows): Windows 10.0.16299.371 (WinBuild.160101.0800)


Hi there!

I'm having this problem only on my Windows node (custom VNET): running a Jenkins slave build on microsoft/nanoserver:1709 fails with "failed to create sandbox".

Windows was deployed via acs-engine v0.15.0

Looking at the kubelet logs.

E0524 15:12:16.688837    4628 cni.go:259] Error adding network: Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses
E0524 15:12:16.688837    4628 cni.go:227] Error while adding to cni network: Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses
E0524 15:12:18.421038    4628 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "jenkins-slave-lh941-hwt5j_default" network: Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses
E0524 15:12:18.421038    4628 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "jenkins-slave-lh941-hwt5j_default(0bd1a2a0-5f64-11e8-bfb5-000d3af4a99e)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "jenkins-slave-lh941-hwt5j_default" network: Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses

..
..

E0524 15:51:19.835524    4628 kuberuntime_gc.go:153] Failed to stop sandbox "175039d06f33221e340c8e5946f5271efb02f915169d36032a61385a8c496b46" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "jenkins-slave-hwhjq-1wdz1_default" network: Failed to delete endpoint: HNS failed with error : The network was not found.
2018/05/24 15:51:20 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
2018/05/24 15:51:20 "Start Flag false Name CNI Version v1.0.2 ErrorMessage Failed to delete endpoint: HNS failed with error : The network was not found.  vnet []
                                Context  SubContext "
2018/05/24 15:51:20 OrchestratorDetails &{  kubectl command failed due to exit status 1}
2018/05/24 15:51:20 OSDetails &{windows    }
2018/05/24 15:51:20 SystemDetails &{0 0 0 0 0 0 }
2018/05/24 15:51:20 InterfaceDetails &{Primary 172.20.200.0/24 172.20.200.65 00:0d:3a:f4:60:6a vEthernet (Ethernet 3) 30 0 }
2018/05/24 15:51:20 BridgeDetails <nil>
2018/05/24 15:51:20 Send telemetry success 200
2018/05/24 15:51:20 SetReportState succeeded
E0524 15:51:20.365059    4628 cni.go:277] Error deleting network: Failed to delete endpoint: HNS failed with error : The network was not found.
E0524 15:51:20.365689    4628 remote_runtime.go:115] StopPodSandbox "b32575b25eaa5d59f713c85ca4626f5ebf934fa7b639873aaa52bc41443d152b" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "nuget-nuget-64c46c8845-6q7zb_default" network: Failed to delete endpoint: HNS failed with error : The network was not found.
E0524 15:51:20.365689    4628 kuberuntime_gc.go:153] Failed to stop sandbox "b32575b25eaa5d59f713c85ca4626f5ebf934fa7b639873aaa52bc41443d152b" before removing: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "nuget-nuget-64c46c8845-6q7zb_default" network: Failed to delete endpoint: HNS failed with error : The network was not found.
E0524 15:51:20.434680    4628 kubelet_network.go:225] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": executable file not found in %PATH%:
2018/05/24 15:51:20 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
2018/05/24 15:51:20 "Start Flag false Name CNI Version v1.0.2 ErrorMessage Failed to delete endpoint: HNS failed with error : The network was not found.  vnet []

I think this might be related to this issue?

#76

Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes

What happened: IP Addresses exhausted

What you expected to happen: Old IP Addresses to be removed, and new ones assigned

How to reproduce it (as minimally and precisely as possible):

  • Windows 10.0.16299.371 (WinBuild.160101.0800)
  • Azure CNI VNET v1.0.2
  • Believe this can be reproduced simply by starting and stopping many containers

Anything else we need to know: Happy to provide more information ^_^

Cannot compile with the latest code from https://github.com/containernetworking/cni

Is this a request for help?:
YES

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
The latest code

Which component (CNI/IPAM/CNM/CNS):
CNI

Which Operating System (Linux/Windows):
Fedora

For Linux: Include Distro and kernel version using "uname -a"
Linux localhost.localdomain 4.16.16-300.fc28.x86_64 #1 SMP Sun Jun 17 03:02:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

For windows: provide output of "$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion"


Which Orchestrator and version (e.g. Kubernetes, Docker)


What happened:

  1. go get -d github.com/Azure/azure-container-networking
  2. go get -d github.com/containernetworking/cni
  3. make all-binaries
    I got the following errors:
    chmod 0755 output/linux_amd64/cnm/azure-vnet-plugin
    cd output/linux_amd64/cnm && tar -czvf azure-vnet-cnm-linux-amd64-v1.0.6-3-ge29adde.tgz azure-vnet-plugin
    azure-vnet-plugin
    chown 1000:1000 output/linux_amd64/cnm/azure-vnet-cnm-linux-amd64-v1.0.6-3-ge29adde.tgz
    go build -v -o output/linux_amd64/cni/azure-vnet -ldflags "-X main.version=v1.0.6-3-ge29adde -s -w" cni/network/plugin/*.go
    github.com/Azure/azure-container-networking/cni

github.com/Azure/azure-container-networking/cni

cni/plugin.go:107:35: not enough arguments in call to invoke.DelegateAdd
have (string, []byte)
want (string, []byte, invoke.Exec)
cni/plugin.go:129:29: not enough arguments in call to invoke.DelegateDel
have (string, []byte)
want (string, []byte, invoke.Exec)
make: *** [Makefile:106: output/linux_amd64/cni/azure-vnet] Error 2

What you expected to happen:

Compile the code successfully

How to reproduce it (as minimally and precisely as possible):


Anything else we need to know:
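Because the dependencies are not vendored (see "The project is not vendored" above), the usual workaround is to pin the CNI checkout to a tag contemporary with this repository rather than building against master. A sketch, where the exact tag to use is an assumption:

cd $GOPATH/src/github.com/containernetworking/cni
git checkout v0.6.0    # a release predating the extra invoke.Exec parameter (assumption)
cd $GOPATH/src/github.com/Azure/azure-container-networking
make all-binaries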


dirPath is empty, nothing to create / plugin not found

Is this a request for help?:

Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue

Which release version?:

1.0.4

Which component (CNI/IPAM/CNM/CNS):

CNM

Which Orchestrator and version (e.g. Kubernetes, Docker)
Docker on Windows Server 2016 Enterprise with Containers (IaaS VM)

What happened:
Receiving an error when launching azure-vnet-plugin.exe, after which I am unable to create a docker network.

What you expected to happen:
Unsure what the output is supposed to be; expected no errors and successful docker network creation.

How to reproduce it (as minimally and precisely as possible):
azure-vnet.log

azure-vnet.json.txt
Note: renamed azure-vnet.json to azure-vnet.json.txt to allow attaching it to this issue.

  • Create Windows Server Enterprise with Containers Virtual Machine in Azure.

  • Create data disk and move docker ProgramData and storage to data disk

  • Execute azure-vnet-plugin.exe as Administrator, from either OS disk or data disk

    • Receive an error: "dirPath is empty, nothing to create." The plugin begins listening on TCP 48080. azure-vnet.json is created in same directory as EXE. azure-vnet.txt is created in same directory also.
    • Move azure-vnet.json to E:\ProgramData\docker\plugins directory but receive same error.
  • Execute this command: C:\Windows\system32>docker network create --driver=azure-vnet --ipam-driver=aure-vnet --subnet=10.16.8.0/22 azure

    • Receive this error: Error response from daemon: plugin not found

Anything else we need to know:
The Docker daemon is executing from a non-standard location, but this is not causing any other issues. This may be operator error; I am unsure where the plugin should be stored.

CNI plugin crashes with nil panic

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
Issue

Which release version?:
1.0.4

Which component (CNI/IPAM/CNM/CNS):


Which Operating System (Linux/Windows):


For Linux: Include Distro and kernel version using "uname -a"


For windows: provide output of "$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion"


Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

AKS customer's CNI Azure plugin fails to assign IP address for pods

This is where we fail:
// Call into IPAM plugin to allocate an address for the endpoint.
nwCfg.Ipam.Subnet = subnetPrefix
result, err = plugin.DelegateAdd(nwCfg.Ipam.Type, nwCfg)
if err != nil {
	err = plugin.Errorf("Failed to allocate address: %v", err)
	return err
}

KUBELET_CONFIG=--address=0.0.0.0 --allow-privileged=true --anonymous-auth=false --authorization-mode=Webhook --azure-container-registry-config=/etc/kubernetes/azure.json --cadvisor-port=0 --cgroups-per-qos=true --client-ca-file=/etc/kubernetes/certs/ca.crt --cloud-config=/etc/kubernetes/azure.json --cloud-provider=azure --cluster-dns=10.129.128.4 --cluster-domain=cluster.local --enforce-node-allocatable=pods --event-qps=0 --eviction-hard=memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5% --feature-gates= --image-gc-high-threshold=85 --image-gc-low-threshold=80 --image-pull-progress-deadline=30m --keep-terminated-pod-volumes=false --kubeconfig=/var/lib/kubelet/kubeconfig --max-pods=30 --network-plugin=cni --node-status-update-frequency=10s --non-masquerade-cidr=10.128.16.0/20 --pod-infra-container-image=k8s-gcrio.azureedge.net/pause-amd64:3.1 --pod-manifest-path=/etc/kubernetes/manifests
KUBELET_IMAGE=k8s-gcrio.azureedge.net/hyperkube-amd64:v1.9.6
KUBELET_REGISTER_SCHEDULABLE=true
KUBELET_NODE_LABELS=kubernetes.io/role=agent,agentpool=agentpool,storageprofile=managed,storagetier=Premium_LRS,kubernetes.azure.com/cluster=MC_nexus-sandbox-dev-rg_nexus-sandbox-dev-cluster_westeurope

root@aks-agentpool-66686873-8:/etc/cni/net.d# cat 10-azure.conflist
{
  "cniVersion": "0.3.0",
  "name": "azure",
  "plugins": [
    {
      "type": "azure-vnet",
      "mode": "bridge",
      "bridge": "azure0",
      "ipam": {
        "type": "azure-vnet-ipam"
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      },
      "snat": true
    }
  ]
}

The AKS customer cannot deploy pods because the CNI plugin is crashing:

rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nas-publish-1530001800-c4fsz_nexus-dev" network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: E0626 08:31:00.012250 16604 kuberuntime_manager.go:647] createPodSandbox for pod "nas-publish-1530001800-c4fsz_nexus-dev(1e2ff9eb-791b-11e8-8c9a-d6939f04c1f3)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nas-publish-1530001800-c4fsz_nexus-dev" network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: E0626 08:31:00.012849 16604 pod_workers.go:186] Error syncing pod 1e2ff9eb-791b-11e8-8c9a-d6939f04c1f3 ("nas-publish-1530001800-c4fsz_nexus-dev(1e2ff9eb-791b-11e8-8c9a-d6939f04c1f3)"), skipping: failed to "CreatePodSandbox" for "nas-publish-1530001800-c4fsz_nexus-dev(1e2ff9eb-791b-11e8-8c9a-d6939f04c1f3)" with CreatePodSandboxError: "CreatePodSandbox for pod "nas-publish-1530001800-c4fsz_nexus-dev(1e2ff9eb-791b-11e8-8c9a-d6939f04c1f3)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "nas-publish-1530001800-c4fsz_nexus-dev" network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input"
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 "Start Flag false CniSucceeded false Name CNI Version v1.0.4-1-gf0f090e ErrorMessage runtime error: invalid memory address or nil pointer dereference; goroutine 1 [running]:
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni.(*Plugin).Execute.func1(0xc420211ec0)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0xbc
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: panic(0x69b3c0, 0x83ed90)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /usr/local/go/src/runtime/panic.go:489 +0x2cf
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add.func1(0xc420211910, 0xc4201f2850, 0xc4202118d0, 0xc4202118f0, 0xc4200c09c0)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/network/network.go:150 +0x9f
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add(0xc4200c09c0, 0xc4201f2850, 0x818c20, 0xc4201007e0)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/network/network.go:285 +0x13b7
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni.(PluginApi).Add-fm(0xc4201f2850, 0xc420187fd6, 0x5)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/plugin.go:112 +0x39
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/containernetworking/cni/pkg/skel.(*dispatcher).checkVersionAndCall(0xc420177000, 0xc4201f2850, 0x81b220, 0xc4200ed680, 0xc420211e80, 0x0, 0xc420102040)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/containernetworking/cni/pkg/skel/skel.go:168 +0x19f
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/containernetworking/cni/pkg/skel.(*dispatcher).pluginMain(0xc420177000, 0xc420211e80, 0xc420211e68, 0x81b220, 0xc4200ed680, 0xc420211e38)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/containernetworking/cni/pkg/skel/skel.go:199 +0x384
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/containernetworking/cni/pkg/skel.PluginMainWithError(0xc420211e80, 0xc420211e68, 0x81b220, 0xc4200ed680, 0xc4200ed680)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/containernetworking/cni/pkg/skel/skel.go:236 +0xed
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni.(*Plugin).Execute(0xc42000e080, 0x81b0e0, 0xc4200c09c0, 0x0, 0x0)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/plugin.go:112 +0x127
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: main.main()
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/network/plugin/main.go:93 +0x4d4
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: vnet []
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: Context AzureCNI SubContext "
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 OrchestratorDetails &{Kubernetes v1.9.6
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: }
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 OSDetails &{linux 16.04.4 4.13.0-1018-azure Ubuntu }
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 SystemDetails &{32147 12944 0 29715 14942 4 }
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 InterfaceDetails &{ 0 0 xml decode failed due to expected element type but have }
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: 2018/06/26 08:31:00 BridgeDetails
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: E0626 08:31:00.217683 16604 cni.go:259] Error adding network: runtime error: invalid memory address or nil pointer dereference; goroutine 1 [running]:
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni.(*Plugin).Execute.func1(0xc420211ec0)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0xbc
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: panic(0x69b3c0, 0x83ed90)
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: /usr/local/go/src/runtime/panic.go:489 +0x2cf
Jun 26 08:31:00 aks-agentpool-66686873-8 kubelet[16604]: github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add.func1(0xc420211910

vnet log:
2018/06/26 09:31:36 [cni-net] Processing ADD command with args {ContainerID:cb38e2330422821950138d84dc16ca355412b724fbfa492deabed9d81524f051 Netns:/proc/29051/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=nexus-dev;K8S_POD_NAME=nexus-ui-db5c9656d-9t2ps;K8S_POD_INFRA_CONTAINER_ID=cb38e2330422821950138d84dc16ca355412b724fbfa492deabed9d81524f051 Path:/opt/azure-vnet/bin:/opt/cni/bin}.
2018/06/26 09:31:36 [cni-net] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet: Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/26 09:31:36 [cni-net] Found network azure with subnet 10.128.16.0/20.
2018/06/26 09:31:36 [cni] Calling plugin azure-vnet-ipam ADD nwCfg:&{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/26 09:31:36 [cni] Plugin azure-vnet-ipam returned result:, err:Failed to allocate address: No available addresses.
2018/06/26 09:31:36 [azure-vnet] Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses.
2018/06/26 09:31:36 [cni] Recovered panic: runtime error: invalid memory address or nil pointer dereference goroutine 1 [running]:
github.com/Azure/azure-container-networking/cni.(*Plugin).Execute.func1(0xc420153ec0)
/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0xbc
panic(0x69b3c0, 0x83ed90)
/usr/local/go/src/runtime/panic.go:489 +0x2cf
github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add.func1(0xc420153910, 0xc4201b0000, 0xc4201538d0, 0xc4201538f0, 0xc42015ca20)
/go/src/github.com/Azure/azure-container-networking/cni/network/network.go:150 +0x9f
github.com/Azure/azure-container-networking/cni/network.(*netPlugin).Add(0xc42015ca20, 0xc4201b0000, 0x818c20, 0xc4200eeff0)
/go/src/github.com/Azure/azure-container-networking/cni/network/network.go:285 +0x13b7
github.com/Azure/azure-container-networking/cni.(PluginApi).Add-fm(0xc4201b0000, 0xc420142290, 0x5)
/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:112 +0x39
github.com/containernetworking/cni/pkg/skel.(*dispatcher).checkVersionAndCall(0xc42013c400, 0xc4201b0000, 0x81b220, 0xc4200ee270, 0xc420153e80, 0x0, 0xc42013c200)
/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:168 +0x19f
github.com/containernetworking/cni/pkg/skel.(*dispatcher).pluginMain(0xc42013c400, 0xc420153e80, 0xc420153e68, 0x81b220, 0xc4200ee270, 0xc420153e38)
/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:199 +0x384
github.com/containernetworking/cni/pkg/skel.PluginMainWithError(0xc420153e80, 0xc420153e68, 0x81b220, 0xc4200ee270, 0xc4200ee270)
/go/src/github.com/containernetworking/cni/pkg/skel/skel.go:236 +0xed
github.com/Azure/azure-container-networking/cni.(*Plugin).Execute(0xc42000e108, 0x81b0e0, 0xc42015ca20, 0x0, 0x0)
/go/src/github.com/Azure/azure-container-networking/cni/plugin.go:112 +0x127
main.main()
/go/src/github.com/Azure/azure-container-networking/cni/network/plugin/main.go:93 +0x4d4

ipam logs:
2018/06/26 08:12:25 [ipam] Starting source azure.
2018/06/26 08:12:25 [ipam] Refreshing address source.
2018/06/26 08:12:25 [ipam] Save succeeded.
2018/06/26 08:12:25 [ipam] Requesting address with address: options:map[].
2018/06/26 08:12:25 [ipam] Address request completed with address: err:.
2018/06/26 08:12:25 [azure-vnet-ipam] Failed to allocate address: No available addresses.
2018/06/26 08:12:25 [cni-ipam] ADD command completed with result: err:Failed to allocate address: No available addresses.
2018/06/26 08:12:25 [cni-ipam] Plugin stopped.
2018/06/26 08:12:27 [cni-ipam] Plugin azure-vnet-ipam version v1.0.4-1-gf0f090e.
2018/06/26 08:12:27 [cni-ipam] Running on Linux version 4.13.0-1018-azure (buildd@lcy01-amd64-014) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)) #21-Ubuntu SMP Thu May 17 13:58:38 UTC 2018
2018/06/26 08:12:27 [ipam] Store timestamp is 2018-06-26 08:12:25.4999318 +0000 UTC.
2018/06/26 08:12:27 [ipam] Restored state, &{Version:v1.0.4-1-gf0f090e TimeStamp:2018-06-26 08:12:25.5050892 +0000 UTC AddrSpaces:map[local:0xc4200751d0] store:0xc420074db0 source: netApi: Mutex:{state:0 sema:0}}
2018/06/26 08:12:27 [cni-ipam] Plugin started.
2018/06/26 08:12:27 [cni-ipam] Processing ADD command with args {ContainerID:64d3d3a709384e177afd13ae65bd456294582b036896ad0e3e425f8b209be75c Netns:/proc/61869/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=nexus-dev;K8S_POD_NAME=esb-central-5c9bbbccb7-l2rnx;K8S_POD_INFRA_CONTAINER_ID=64d3d3a709384e177afd13ae65bd456294582b036896ad0e3e425f8b209be75c Path:/opt/azure-vnet/bin:/opt/cni/bin}.
2018/06/26 08:12:27 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/26 08:12:27 [ipam] Starting source azure.
2018/06/26 08:12:27 [ipam] Refreshing address source.
2018/06/26 08:12:27 [ipam] Save succeeded.
2018/06/26 08:12:27 [ipam] Requesting address with address: options:map[].
2018/06/26 08:12:27 [ipam] Address request completed with address: err:.
2018/06/26 08:12:27 [azure-vnet-ipam] Failed to allocate address: No available addresses.
2018/06/26 08:12:27 [cni-ipam] ADD command completed with result: err:Failed to allocate address: No available addresses.

Created by: Dinor Geler [DIGELER]. Pods get stuck in ContainerCreating state:
outgoing-messages-service-7fc659fb84-rn2ht 0/1 ContainerCreating 0 6m aks-agentpool-66686873-5

On the aks-agentpool-66686873-5 node there are 28 pods while the limit is 30, so assigning an IP address should not be a problem, yet azure-vnet-ipam.log contains entries reporting a lack of free IP addresses (see the diagnostic sketch after the log excerpt):

2018/06/22 13:22:25 [cni-ipam] Plugin started.
2018/06/22 13:22:25 [cni-ipam] Processing ADD command with args {ContainerID:d18c6829adb1015ef677f2319d33157637a37b80612fa7474a643b374432f72b Netns:/proc/81095/ns/net IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=nexus-dev;K8S_POD_NAME=outgoing-messages-service-7fc659fb84-rn2ht;K8S_POD_INFRA_CONTAINER_ID=d18c6829adb1015ef677f2319d33157637a37b80612fa7474a643b374432f72b Path:/opt/azure-vnet/bin:/opt/cni/bin}.
2018/06/22 13:22:25 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address: QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/22 13:22:25 [ipam] Starting source azure.
2018/06/22 13:22:25 [ipam] Refreshing address source.
2018/06/22 13:22:25 [ipam] Save succeeded.
2018/06/22 13:22:25 [ipam] Requesting address with address: options:map[].
2018/06/22 13:22:25 [ipam] Address request completed with address: err:.
2018/06/22 13:22:25 [azure-vnet-ipam] Failed to allocate address: No available addresses.
2018/06/22 13:22:25 [cni-ipam] ADD command completed with result: err:Failed to allocate address: No available addresses.
2018/06/22 13:22:25 [cni-ipam] Plugin stopped.
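Since the store restore at 08:12:27 above shows state round-tripping cleanly, one way to see where the addresses went is to inspect the IPAM state file directly. A diagnostic sketch, assuming the default Linux state path /var/run/azure-vnet-ipam.json and the InUse field used by the ipam package (verify both against your version):

$ sudo grep -o '"InUse":true' /var/run/azure-vnet-ipam.json | wc -l    # addresses IPAM believes are in use
$ kubectl get pods --all-namespaces -o wide | grep aks-agentpool-66686873-5 | wc -l    # pods actually on the node

If the first count is pinned at the pool size while the node hosts fewer pods, the store is leaking allocations rather than the subnet being exhausted.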
[Problem start date and time]
Thu, 21 Jun 2018 22:00:00 GMT

2018/06/26 09:31:36 [cni] Recovered panic: runtime error: invalid memory address or nil pointer dereference goroutine 1 [running]:
github.com/Azure/azure-container-networking/cni.(*Plugin).Execute.func1(0xc420153ec0)
 /go/src/github.com/Azure/azure-container-networking/cni/plugin.go:94 +0xbc
panic(0x69b3c0, 0x83ed90)

Failed to start container: FailedSync

I created the ACS cluster with the JSON below:

{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "kubernetesConfig": {
        "networkPolicy": "azure"
      }
    },
    "masterProfile": {
      "count": 1,
      "dnsPrefix": "kubernetescni",
      "vmSize": "Standard_D2_v2"
    },
    "agentPoolProfiles": [
      {
        "name": "agentpool",
        "count": 2,
        "vmSize": "Standard_D2_v2",
        "availabilityProfile": "AvailabilitySet"
      }
    ],
    "linuxProfile": {
      "adminUsername": "chenyl",
      "ssh": {
        "publicKeys": [
          {
            "keyData": "xxxxxxxxxxx"
          }
        ]
      }
    },
    "servicePrincipalProfile": {
      "servicePrincipalClientID": "xxxxxxxxxxx",
      "servicePrincipalClientSecret": "xxxxxxxxxxx"
    }
  }
}

Then I ran the command kubectl run nginx --image nginx.
However, the container fails to start:

Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message


1m 1m 1 default-scheduler Normal Scheduled Successfully assigned nginx-2371676037-2qbpj to k8s-agentpool-40914460-0
53s 53s 1 kubelet, k8s-agentpool-40914460-0 Warning FailedSync Error syncing pod, skipping: failed to "KillPodSandbox" for "6cdc07e9-9c51-11e7-9b21-000d3aa09b41" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "nginx-2371676037-2qbpj_default" network: CNI failed to retrieve network namespace path: Error: No such container: a3b7efc4338a4132d832cc122936380bc7e0c67b6a8eba46dce5029115f76141"

51s 51s 1 kubelet, k8s-agentpool-40914460-0 Warning FailedSync Error syncing pod, skipping: failed to "KillPodSandbox" for "6cdc07e9-9c51-11e7-9b21-000d3aa09b41" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "_" network: CNI failed to retrieve network namespace path: Error: No such container: 38b690d6457f9c0bd707f79d0d8bee6f85abe596bde2f74e6a386e83e2069c1a"

1m 1s 39 kubelet, k8s-agentpool-40914460-0 Warning FailedSync Error syncing pod, skipping: failed to "CreatePodSandbox" for "nginx-2371676037-2qbpj_default(6cdc07e9-9c51-11e7-9b21-000d3aa09b41)" with CreatePodSandboxError: "CreatePodSandbox for pod "nginx-2371676037-2qbpj_default(6cdc07e9-9c51-11e7-9b21-000d3aa09b41)" failed: rpc error: code = 2 desc = NetworkPlugin cni failed to set up pod "nginx-2371676037-2qbpj_default" network: failed to find plugin "loopback" in path [/opt/loopback/bin /opt/cni/bin]"

1m 0s 41 kubelet, k8s-agentpool-40914460-0 Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
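The last event points at a missing loopback binary rather than the Azure plugin itself. A possible fix, assuming the standard CNI reference-plugin release layout (the version and URL are illustrative, not prescribed by this repo):

$ curl -sSL https://github.com/containernetworking/plugins/releases/download/v0.7.1/cni-plugins-amd64-v0.7.1.tgz | sudo tar -xz -C /opt/cni/bin ./loopback
$ ls -l /opt/cni/bin/loopback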

Ingress controller cannot reach itself (Azure-VNet, ACS)

Hi,
We recently stood up a new Kubernetes cluster with acs-engine, after previously using the ACS service in the Azure CLI. The new cluster has an issue where ingress controllers (we're using the stock nginx ingress controller) cannot make requests to themselves, i.e. cannot reach their own public IP.

The only difference I've found between the old clusters and the new one is that this one automatically got the Azure-VNet CNI, whereas the previous ones had no CNI configuration at all (which I assume means kubenet?).

This is a problem because external auth causes the controller to make a request to its own domain. This traffic seems to be dropped, and the requests time out.

It is clearly related to pod networking, since the container host can reach the ingress properly:

# On the node itself
k8s-agentpool1-28163751-2:~$ curl -vIsm 1 13.95.23.109
* Rebuilt URL to: 13.95.23.109/
*   Trying 13.95.23.109...
* Connected to 13.95.23.109 (13.95.23.109) port 80 (#0)
> HEAD / HTTP/1.1
> Host: 13.95.23.109
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 [...]
# Exec into the container
k8s-agentpool1-28163751-2:~$ docker exec 43b4450a0f8c curl -vIsm 1 13.95.23.109
* Rebuilt URL to: 13.95.23.109/
*   Trying 13.95.23.109...
* TCP_NODELAY set
* Connection timed out after 1001 milliseconds
* Curl_http_done: called premature == 1
* stopped the pause stream!
* Closing connection 0
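One hedged way to narrow this down: if the controller can reach its own ClusterIP while the public IP times out, the drop is on the hairpin path back through the load balancer rather than in pod networking generally. A sketch, with the service name as a placeholder:

# 'nginx-ingress-controller' is a placeholder -- substitute your service name
$ kubectl get svc nginx-ingress-controller -o wide
# repeat the curl from inside the container against the ClusterIP instead of 13.95.23.109
$ docker exec 43b4450a0f8c curl -vIsm 1 <cluster-ip>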

Is this a known limitation of azure-vnet, or a misconfiguration?

Azure IPAM not reloading store

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
v1.0.6

Which component (CNI/IPAM/CNM/CNS):
IPAM

Which Operating System (Linux/Windows):
Both

Regressed by change #161.

How to build this?

Is this a request for help?:
YES

Is this an ISSUE or FEATURE REQUEST? (choose one):


Which release version?:
master

Which component (CNI/IPAM/CNM/CNS):


Which Operating System (Linux/Windows):

Centos 7 / Linux localhost 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Which Orchestrator and version (e.g. Kubernetes, Docker)

What happened:
make all-binaries fails

What you expected to happen:

make all-binaries succeeds

How to reproduce it (as minimally and precisely as possible):

$ git clone https://github.com/Azure/azure-container-networking
$ cd azure-container-networking
$ make all-binaries

Anything else we need to know:

[user@localhost ~]$ git clone https://github.com/Azure/azure-container-networking
Cloning into 'azure-container-networking'...
remote: Counting objects: 2482, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 2482 (delta 0), reused 2 (delta 0), pack-reused 2476
Receiving objects: 100% (2482/2482), 4.33 MiB | 3.73 MiB/s, done.
Resolving deltas: 100% (1448/1448), done.
cd azure-container-networking/
[user@localhost azure-container-networking]$ make all-binaries
go build -v -o output/linux_amd64/cnm/azure-vnet-plugin -ldflags "-X main.version=v1.0.5-1-g89555ac -s -w" cnm/plugin/*.go
cnm/plugin/main.go:12:2: cannot find package "github.com/Azure/azure-container-networking/cnm/ipam" in any of:
        /usr/lib/golang/src/github.com/Azure/azure-container-networking/cnm/ipam (from $GOROOT)
        /home/user/go/src/github.com/Azure/azure-container-networking/cnm/ipam (from $GOPATH)
cnm/plugin/main.go:13:2: cannot find package "github.com/Azure/azure-container-networking/cnm/network" in any of:
        /usr/lib/golang/src/github.com/Azure/azure-container-networking/cnm/network (from $GOROOT)
        /home/user/go/src/github.com/Azure/azure-container-networking/cnm/network (from $GOPATH)
cnm/plugin/main.go:14:2: cannot find package "github.com/Azure/azure-container-networking/common" in any of:
        /usr/lib/golang/src/github.com/Azure/azure-container-networking/common (from $GOROOT)
        /home/user/go/src/github.com/Azure/azure-container-networking/common (from $GOPATH)
cnm/plugin/main.go:15:2: cannot find package "github.com/Azure/azure-container-networking/log" in any of:
        /usr/lib/golang/src/github.com/Azure/azure-container-networking/log (from $GOROOT)
        /home/user/go/src/github.com/Azure/azure-container-networking/log (from $GOPATH)
cnm/plugin/main.go:16:2: cannot find package "github.com/Azure/azure-container-networking/platform" in any of:
        /usr/lib/golang/src/github.com/Azure/azure-container-networking/platform (from $GOROOT)
        /home/user/go/src/github.com/Azure/azure-container-networking/platform (from $GOPATH)
cnm/plugin/main.go:17:2: cannot find package "github.com/Azure/azure-container-networking/store" in any of:
        /usr/lib/golang/src/github.com/Azure/azure-container-networking/store (from $GOROOT)
        /home/user/go/src/github.com/Azure/azure-container-networking/store (from $GOPATH)
make: *** [output/linux_amd64/cnm/azure-vnet-plugin] Error 1
[user@localhost azure-container-networking]$ go get github.com/Azure/azure-container-networking
package github.com/Azure/azure-container-networking: no Go files in /home/user/go/src/github.com/Azure/azure-container-networking
[user@localhost ]$ cd /home/user/go/src/github.com/Azure/azure-container-networking
make all-binaries
go build -v -o output/linux_amd64/cnm/azure-vnet-plugin -ldflags "-X main.version=v1.0.5-1-g89555ac -s -w" cnm/plugin/*.go
netlink/ip.go:9:2: cannot find package "golang.org/x/sys/unix" in any of:
        /usr/lib/golang/src/golang.org/x/sys/unix (from $GOROOT)
        /home/user/go/src/golang.org/x/sys/unix (from $GOPATH)
make: *** [output/linux_amd64/cnm/azure-vnet-plugin] Error 1
[user@localhost azure-container-networking]$ go get golang.org/x/sys/unix
make all-binaries
go build -v -o output/linux_amd64/cnm/azure-vnet-plugin -ldflags "-X main.version=v1.0.5-1-g89555ac -s -w" cnm/plugin/*.go
github.com/Azure/azure-container-networking/platform
github.com/Azure/azure-container-networking/log
github.com/Azure/azure-container-networking/store
github.com/Azure/azure-container-networking/common
github.com/Azure/azure-container-networking/cnm
github.com/Azure/azure-container-networking/ipam
github.com/Azure/azure-container-networking/cnm/ipam
github.com/Azure/azure-container-networking/ebtables
github.com/Azure/azure-container-networking/netlink
github.com/Azure/azure-container-networking/network/policy
github.com/Azure/azure-container-networking/network
github.com/Azure/azure-container-networking/cnm/network
command-line-arguments
chmod 0755 output/linux_amd64/cnm/azure-vnet-plugin
cd output/linux_amd64/cnm && tar -czvf azure-vnet-cnm-linux-amd64-v1.0.5-1-g89555ac.tgz azure-vnet-plugin
azure-vnet-plugin
chown 1000:1000 output/linux_amd64/cnm/azure-vnet-cnm-linux-amd64-v1.0.5-1-g89555ac.tgz
go build -v -o output/linux_amd64/cni/azure-vnet -ldflags "-X main.version=v1.0.5-1-g89555ac -s -w" cni/network/plugin/*.go
cni/plugin.go:16:2: cannot find package "github.com/containernetworking/cni/pkg/invoke" in any of:
        /usr/lib/golang/src/github.com/containernetworking/cni/pkg/invoke (from $GOROOT)
        /home/user/go/src/github.com/containernetworking/cni/pkg/invoke (from $GOPATH)
cni/cni.go:7:2: cannot find package "github.com/containernetworking/cni/pkg/skel" in any of:
        /usr/lib/golang/src/github.com/containernetworking/cni/pkg/skel (from $GOROOT)
        /home/user/go/src/github.com/containernetworking/cni/pkg/skel (from $GOPATH)
cni/internal.go:10:2: cannot find package "github.com/containernetworking/cni/pkg/types" in any of:
        /usr/lib/golang/src/github.com/containernetworking/cni/pkg/types (from $GOROOT)
        /home/user/go/src/github.com/containernetworking/cni/pkg/types (from $GOPATH)
cni/plugin.go:19:2: cannot find package "github.com/containernetworking/cni/pkg/types/current" in any of:
        /usr/lib/golang/src/github.com/containernetworking/cni/pkg/types/current (from $GOROOT)
        /home/user/go/src/github.com/containernetworking/cni/pkg/types/current (from $GOPATH)
cni/plugin.go:20:2: cannot find package "github.com/containernetworking/cni/pkg/version" in any of:
        /usr/lib/golang/src/github.com/containernetworking/cni/pkg/version (from $GOROOT)
        /home/user/go/src/github.com/containernetworking/cni/pkg/version (from $GOPATH)
make: *** [output/linux_amd64/cni/azure-vnet] Error 1
[user@localhost azure-container-networking]$ go get github.com/containernetworking/cni/pkg/invoke
[user@localhost  azure-container-networking]$ go get github.com/containernetworking/cni/pkg/skel
[user@localhost  azure-container-networking]$ make all-binaries
chmod 0755 output/linux_amd64/cnm/azure-vnet-plugin
cd output/linux_amd64/cnm && tar -czvf azure-vnet-cnm-linux-amd64-v1.0.5-1-g89555ac.tgz azure-vnet-plugin
azure-vnet-plugin
chown 1000:1000 output/linux_amd64/cnm/azure-vnet-cnm-linux-amd64-v1.0.5-1-g89555ac.tgz
go build -v -o output/linux_amd64/cni/azure-vnet -ldflags "-X main.version=v1.0.5-1-g89555ac -s -w" cni/network/plugin/*.go
github.com/Azure/azure-container-networking/platform
github.com/Azure/azure-container-networking/log
github.com/Azure/azure-container-networking/store
github.com/Azure/azure-container-networking/common
github.com/Azure/azure-container-networking/network/policy
github.com/Azure/azure-container-networking/cni
# github.com/Azure/azure-container-networking/cni
cni/plugin.go:112:39: not enough arguments in call to skel.PluginMainWithError
        have (func(*skel.CmdArgs) error, func(*skel.CmdArgs) error, version.PluginInfo)
        want (func(*skel.CmdArgs) error, func(*skel.CmdArgs) error, func(*skel.CmdArgs) error, version.PluginInfo)
github.com/Azure/azure-container-networking/ebtables
github.com/Azure/azure-container-networking/netlink
github.com/Azure/azure-container-networking/network
github.com/Azure/azure-container-networking/telemetry
make: *** [output/linux_amd64/cni/azure-vnet] Error 2

How do I get this to build?
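The transcript shows two separate problems: the tree was first cloned outside $GOPATH (so the repo's own packages were unresolvable), and go get then pulled containernetworking/cni at master, whose skel.PluginMainWithError gained an extra command handler. A sketch of a pre-modules build that avoids both; the CNI tag is an assumption, so match it against whatever this release actually expects:

$ mkdir -p $GOPATH/src/github.com/Azure
$ cd $GOPATH/src/github.com/Azure
$ git clone https://github.com/Azure/azure-container-networking
$ cd azure-container-networking
$ go get -d ./... || true                 # fetch dependencies
$ git -C $GOPATH/src/github.com/containernetworking/cni checkout v0.6.0    # pin to a compatible tag (assumption)
$ make all-binaries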

DNS search suffix list is incomplete on Windows

Is this an ISSUE or FEATURE REQUEST? (choose one): Issue


Which release version?: 1.0.7


Which component (CNI/IPAM/CNM/CNS): CNI


Which Operating System (Linux/Windows): Windows Server version 1803 (17134)


Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes v1.11.1


What happened:

Windows pods still only get the DNS search suffix for their own namespace. They should also have svc.cluster.local and cluster.local so that services in other namespaces can be resolved.

Here's /etc/resolv.conf on Linux

kubectl exec -i busybox-sleep cat /etc/resolv.conf
nameserver 10.0.0.10
search default.svc.cluster.local svc.cluster.local cluster.local g5hqkyfpy0nuzckf3igv405gkg.xx.internal.cloudapp.net
options ndots:5

Windows should have the same DNS search suffix results, but it only has the current namespace:

$ kubectl exec iis-1803-688b96c57b-s6657 ipconfig

Windows IP Configuration


Ethernet adapter vEthernet (f6667ee9-eth0):

   Connection-specific DNS Suffix  . : default.svc.cluster.local
   Link-local IPv6 Address . . . . . : fe80::8193:c234:bebb:93db%27
   IPv4 Address. . . . . . . . . . . : 10.240.0.136
   Subnet Mask . . . . . . . . . . . : 255.240.0.0
   Default Gateway . . . . . . . . . : 10.240.0.1

How to reproduce it (as minimally and precisely as possible):

  1. Deploy a cluster with acs-engine 0.19.3 or newer. 0.20.0 preferred
  2. kubectl apply -f https://gist.githubusercontent.com/PatrickLang/0df013d20d32eb98bc57456c4f73461a/raw/052470dc51c9d3582e85e370d2fb1adca78345af/iis-1803-healthcheck.yaml
  3. kubectl get pod - find the pod once it's running
  4. kubectl exec podname ipconfig

Anything else we need to know:

This is somewhat related to #146. That PR added the pod's namespace, but not the others. If you need more info on what needs to change, please check with @dineshgovindasamy and @daschott


Pods don't have connectivity to external networks

Is this a request for help?: yes


Is this an ISSUE or FEATURE REQUEST? (choose one): issue


Which release version?: v1.0.3


Which component (CNI/IPAM/CNM/CNS): CNI


Which Orchestrator and version (e.g. Kubernetes, Docker) Kubernetes v1.10.1 and containerd 1.1.0-rc.2

What happened:
Pod, node, master network: 10.1.0.0/16

  • ✅ node to pods: OK
  • ✅ master to pods: OK
  • ✅ pod to nodes: OK
  • ✅ pod to master: OK
  • ✅ pod to pod: OK
  • ❌ pod to any remote network (non-pod, non-node/master): no connectivity. This can be external hosts on the internet (e.g. 4.2.2.1) or the Kubernetes service network, which in my cluster is 10.3.0.0/24

I can see the outbound pod packets on the Azure bridge interface on the node:

ubuntu@node1:~$ sudo tcpdump -n host 10.1.1.63
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:15:02.450024 IP 10.1.1.63.35878 > 10.3.0.1.443: Flags [S], seq 3742509075, win 29200, options [mss 1460,sackOK,TS val 3312626107 ecr 0,nop,wscale 7], length 0
18:21:54.973662 IP 10.1.1.63 > 4.2.2.1: ICMP echo request, id 80, seq 1, length 64
18:21:55.999938 IP 10.1.1.63 > 4.2.2.1: ICMP echo request, id 80, seq 2, length 64
18:21:57.023913 IP 10.1.1.63 > 4.2.2.1: ICMP echo request, id 80, seq 3, length 64
18:21:58.047939 IP 10.1.1.63 > 4.2.2.1: ICMP echo request, id 80, seq 4, length 64
18:22:00.063929 ARP, Request who-has 10.1.0.1 tell 10.1.1.63, length 28
18:22:01.087965 ARP, Request who-has 10.1.0.1 tell 10.1.1.63, length 28
18:22:02.111919 ARP, Request who-has 10.1.0.1 tell 10.1.1.63, length 28

What you expected to happen:

  • remote connectivity works

How to reproduce it (as minimally and precisely as possible): try curl'ing the Kubernetes service IP:

  1. exec into a pod
  2. try pinging or connecting to any host not on the pod network:
# pod IP is 10.1.1.63
root@nettools2-5669f4cd85-k948n:/# curl -I -k https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: Connection timed out

root@nettools2-5669f4cd85-k948n:/# ping 4.2.2.1
PING 4.2.2.1 (4.2.2.1) 56(84) bytes of data.
^C
--- 4.2.2.1 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3074ms

Anything else we need to know:

# CNI config:
{
   "cniVersion":"0.3.0",
   "name":"azure",
   "plugins":[
      {
         "type":"azure-vnet",
         "mode":"bridge",
         "bridge":"azure0",
         "ipam":{
            "type":"azure-vnet-ipam"
         }
      },
      {
         "type":"portmap",
         "capabilities":{
            "portMappings":true
         },
         "snat":true
      }
   ]
}
# iptables rules
ubuntu@node1:/var/log$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-EXTERNAL-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
KUBE-FORWARD  all  --  anywhere             anywhere             /* kubernetes forwarding rules */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere

Chain KUBE-EXTERNAL-SERVICES (1 references)
target     prot opt source               destination

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  10.1.0.0/16          anywhere             /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             10.1.0.0/16          /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

Chain KUBE-SERVICES (1 references)
target     prot opt source               destination
REJECT     udp  --  anywhere             10.3.0.10            /* kube-system/kube-dns:dns has no endpoints */ udp dpt:domain reject-with icmp-port-unreachable
REJECT     tcp  --  anywhere             10.3.0.10            /* kube-system/kube-dns:dns-tcp has no endpoints */ tcp dpt:domain reject-with icmp-port-unreachable
REJECT     tcp  --  anywhere             10.3.0.11            /* kube-system/kubernetes-dashboard: has no endpoints */ tcp dpt:https reject-with icmp-port-unreachable
# ebtables NAT rules

ubuntu@node1:/var/log$ sudo /sbin/ebtables -t nat --list
Bridge table: nat

Bridge chain: PREROUTING, entries: 12, policy: ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.1.0.224 -j arpreply --arpreply-mac 0:d:3a:6:2c:a6
-p ARP -i eth0 --arp-op Reply -j dnat --to-dst ff:ff:ff:ff:ff:ff --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.1.0.227 -j arpreply --arpreply-mac ee:19:c0:4f:fd:15
-p IPv4 -i eth0 --ip-dst 10.1.0.227 -j dnat --to-dst ee:19:c0:4f:fd:15 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.1.0.246 -j arpreply --arpreply-mac 12:49:d2:3c:3d:7b
-p IPv4 -i eth0 --ip-dst 10.1.0.246 -j dnat --to-dst 12:49:d2:3c:3d:7b --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.1.0.252 -j arpreply --arpreply-mac 52:71:2a:e3:eb:b7
-p IPv4 -i eth0 --ip-dst 10.1.0.252 -j dnat --to-dst 52:71:2a:e3:eb:b7 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.1.0.228 -j arpreply --arpreply-mac da:bf:a4:b0:97:27
-p IPv4 -i eth0 --ip-dst 10.1.0.228 -j dnat --to-dst da:bf:a4:b0:97:27 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.1.1.63 -j arpreply --arpreply-mac 6e:d:eb:76:f5:9
-p IPv4 -i eth0 --ip-dst 10.1.1.63 -j dnat --to-dst 6e:d:eb:76:f5:9 --dnat-target ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
-s Unicast -o eth0 -j snat --to-src 0:d:3a:6:2c:a6 --snat-arp --snat-target ACCEPT
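Note that iptables -L above only lists the filter table; outbound SNAT for bridged pod traffic would live in the nat table. A diagnostic sketch, not a confirmed root cause:

$ sudo iptables -t nat -L POSTROUTING -n -v    # is pod traffic leaving eth0 being masqueraded?
$ sysctl net.ipv4.ip_forward                   # should be 1 for bridged pod traffic to route out

The unanswered ARP requests for 10.1.0.1 in the capture are also worth comparing with the ebtables arpreply rules above, none of which cover 10.1.0.1.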

Wrong external interface detection if bridge already configured

I have a Kubernetes cluster with the Azure VNET CNI enabled, and after a kubelet restart I see the following errors for new pods.

The CNI improperly detects azure0 as the external interface when it should be eth0.

azure-vnet.log

  1 2017/11/15 10:52:20 [net] Setting link azure0 state down.
  2 2017/11/15 10:52:20 [net] Setting link azure0 master azure0.
  3 2017/11/15 10:52:20 [netlink] Received {NlMsghdr:{Len:60 Type:2 Flags:0 Seq:4 Pid:67602} data:[216 255 255 255 40 0 0 0 19 0 5 0 4 0 0 0 18 8 1 0 0 0 19 0 4 0 0 0 1 0 0 0 255 255 255 255 8 0     10 0
  4  4 0 0 0] payload:[]}, err=too many levels of symbolic links
  5 2017/11/15 10:52:20 [net] Connecting interface azure0 completed with err:too many levels of symbolic links.
  6 2017/11/15 10:52:20 [net] Failed to create network azure, err:too many levels of symbolic links.
  7 2017/11/15 10:52:20 [azure-vnet] Failed to create network: too many levels of symbolic links.

ip addr output

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc mq master azure0 state UP group default qlen 1000
    link/ether 00:0d:3a:29:c9:1a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::20d:3aff:fe29:c91a/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:1c:a4:f8:57 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
4: azure0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:0d:3a:29:c9:1a brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.7/24 brd 10.0.2.255 scope global azure0
       valid_lft forever preferred_lft forever
    inet6 fe80::20d:3aff:fe29:c91a/64 scope link tentative 
       valid_lft forever preferred_lft forever
5: azveth460f010@if6: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master azure0 state UP group default qlen 1000
    link/ether ce:86:6d:99:03:99 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::cc86:6dff:fe99:399/64 scope link 
       valid_lft forever preferred_lft forever
9: azveth11a7c76@if10: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master azure0 state UP group default qlen 1000
    link/ether 8a:ed:ca:70:85:83 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::88ed:caff:fe70:8583/64 scope link 
       valid_lft forever preferred_lft forever
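What the log shows: with azure0 already holding the node IP, the plugin selects azure0 itself as the external interface and then tries to enslave it to itself ("Setting link azure0 master azure0"), which netlink rejects with ELOOP ("too many levels of symbolic links"). A hedged way to check what the plugin will see on the next ADD (the state path is an assumption for Linux builds):

$ cat /var/run/azure-vnet.json    # persisted network state; if it was wiped, interface detection starts over
$ ip -br addr show eth0 azure0    # only azure0 carries an IPv4 address here, which biases detection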

DNS search suffix not set for namespace on Windows pods

Is this an ISSUE or FEATURE REQUEST? (choose one):

Issue

Which release version?:

v1.0.4-1-gf0f090e

Which component (CNI/IPAM/CNM/CNS):

CNI

Which Operating System (Linux/Windows):

Windows Server version 1803

Which Orchestrator and version (e.g. Kubernetes, Docker)

Kubernetes

What happened:

The DNS suffix is not set based on the pod's namespace.

What you expected to happen:

On Linux, we see the right resolution order.

/etc # cat resolv.conf
nameserver 10.0.0.10
search default.svc.cluster.local svc.cluster.local cluster.local lje1130gi5iehkdbexza4ptcba.xx.internal.cloudapp.net

Windows only has svc.cluster.local

How to reproduce it (as minimally and precisely as possible):

Deploy a Windows cluster with acs-engine v0.17

Plugin fails with "Store is locked" and "network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input"

Is this a request for help?:
Issue/Bug, request for confirmation of fix in later versions.

Is this an ISSUE or FEATURE REQUEST? (choose one):

ISSUE

Which release version?:

v1.0.2

Which component (CNI/IPAM/CNM/CNS):

CNI & IPAM

Which Operating System (Linux/Windows):

Linux (Ubuntu 16.04) 64bit
Linux k8s-agentpool1-62076056-0 4.15.0-1014-azure #14~16.04.1-Ubuntu SMP Thu Jun 14 15:42:55 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

We observed that our pods could not start or terminate on one of our development clusters since moving from acs to acs-engine:

store-weu-acc1     downloadservice-345209866-0q1rx                 0/1       ContainerCreating   0          2d
store-weu-acc1     downloadservice-345209866-3mf0h                 0/1       ContainerCreating   0          1d
store-weu-acc1     downloadservice-345209866-6g1s9                 1/1       Terminating         0          2d
store-weu-acc1     downloadservice-345209866-9lsft                 1/1       Terminating         0          2d

After looking through the logs we found the journal for kubelet had the following errors:

Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 Going to send Telemetry report to hostnetagent http://169.254.169.254/machine/plugins?comp=netagent&type=cnireport
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 "Start Flag false Name CNI Version v1.0.2 ErrorMessage Store is locked vnet []
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]:                                 Context  SubContext "
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 OrchestratorDetails &{Kubernetes v1.7.16
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]:  }
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 OSDetails &{linux 16.04.4 4.15.0-1013-azure Ubuntu }
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 SystemDetails &{6946 3192 0 29715 6375 2 }
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 InterfaceDetails &{Primary 10.240.0.0/12 10.240.2.2 00:0d:3a:29:83:17 azure0 127 0 }
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 BridgeDetails <nil>
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 Send telemetry success 200
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: 2018/07/02 11:27:25 SetReportState succeeded
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: E0702 11:27:25.437772   95278 cni.go:312] Error deleting network: netplugin failed but error parsing its diagnostic message "": unexpected end of JS
ON input
Jul 02 11:27:25 k8s-agentpool1-62076056-0 docker[79932]: time="2018-07-02T11:27:25.438398234Z" level=error msg="Handler for POST /v1.26/containers/f3fd5317fafcb25f515b0e2052120c6eb843540d2d3d262bc5ebaa67de2
62040/stop returned error: Container f3fd5317fafcb25f515b0e2052120c6eb843540d2d3d262bc5ebaa67de262040 is already stopped"
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: E0702 11:27:25.441303   95278 remote_runtime.go:114] StopPodSandbox "f3fd5317fafcb25f515b0e2052120c6eb843540d2d3d262bc5ebaa67de262040" from runtime 
service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "heapster-186967039-cqmjk_kube-system" network: netplugin failed but error parsing its diagnostic message "": unexpected e
nd of JSON input
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: E0702 11:27:25.446164   95278 kuberuntime_gc.go:156] Failed to stop sandbox "f3fd5317fafcb25f515b0e2052120c6eb843540d2d3d262bc5ebaa67de262040" befor
e removing: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "heapster-186967039-cqmjk_kube-system" network: netplugin failed but error parsing its diagnostic message "": unexpected end o
f JSON input
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: W0702 11:27:25.448217   95278 docker_sandbox.go:342] failed to read pod IP from plugin/docker: NetworkPlugin cni failed on the status hook for pod "
heapster-186967039-cqmjk_kube-system": CNI failed to retrieve network namespace path: Cannot find network namespace for the terminated container "b9baf578c7996418ebad86e062b0d515f7ce2efa4a3e0e12a624c9c410a6
6d73"
Jul 02 11:27:25 k8s-agentpool1-62076056-0 kubelet[95278]: W0702 11:27:25.448875   95278 cni.go:258] CNI failed to retrieve network namespace path: Cannot find network namespace for the terminated container 
"b9baf578c7996418ebad86e062b0d515f7ce2efa4a3e0e12a624c9c410a66d73"

And in the CNI logs (forgive the discrepancy in timestamps; the issue was also occurring at the same time as the journal errors, but this was the exact time the lock issue became apparent):

2018/07/02 08:16:42 [cni-net] Plugin azure-vnet version v1.0.2.
2018/07/02 08:16:42 [cni-net] Running on Linux version 4.15.0-1013-azure (buildd@lcy01-amd64-006) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)) #13~16.04.2-Ubuntu SMP Wed May 30 01:39:27 UTC 2018
2018/07/02 08:16:42 [net] Network interface: {Index:1 MTU:65536 Name:lo HardwareAddr: Flags:up|loopback} with IP addresses: [127.0.0.1/8 ::1/128]
2018/07/02 08:16:42 [net] Network interface: {Index:2 MTU:1500 Name:eth0 HardwareAddr:00:0d:3a:29:83:17 Flags:up|broadcast} with IP addresses: [fe80::20d:3aff:fe29:8317/64]
2018/07/02 08:16:42 [net] Network interface: {Index:3 MTU:1500 Name:docker0 HardwareAddr:02:42:ea:72:50:50 Flags:up|broadcast|multicast} with IP addresses: [172.17.0.1/16]
2018/07/02 08:16:42 [net] Network interface: {Index:4 MTU:1500 Name:azure0 HardwareAddr:00:0d:3a:29:83:17 Flags:up|broadcast|multicast} with IP addresses: [10.240.2.2/12 fe80::20d:3aff:fe29:8317/64]
2018/07/02 08:16:42 [net] Network interface: {Index:5 MTU:1500 Name:azvethd3b5ed9 HardwareAddr:f2:29:d3:58:12:9f Flags:up|broadcast} with IP addresses: [fe80::f029:d3ff:fe58:129f/64]
2018/07/02 08:16:42 [net] Store timestamp is 2018-06-29 22:55:32.841685859 +0000 UTC.
2018/07/02 08:16:42 [net] Restored state, &{Version:v1.0.2 TimeStamp:2018-06-29 22:55:32.848144792 +0000 UTC ExternalInterfaces:map[eth0:0xc420136000] store:0xc420016ed0 Mutex:{state:0 sema:0}}
2018/07/02 08:16:42 [cni-net] Plugin started.
2018/07/02 08:16:42 [cni-net] Processing DEL command with args {ContainerID:ee4e979f2a289c49e6bc5b5d83744763461036664405a51196394bd93812ba29 Netns: IfName:eth0 Args:IgnoreUnknown=1;K8S_POD_NAMESPACE=sstore-weu-dev;K8S_POD_NAME=ui-3042035097-1lwcf;K8S_POD_INFRA_CONTAINER_ID=ee4e979f2a289c49e6bc5b5d83744763461036664405a51196394bd93812ba29 Path:/opt/azure-vnet/bin:/opt/cni/bin}.
2018/07/02 08:16:42 [cni-net] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet: Address: QueryInterval:}}.
2018/07/02 08:16:42 [azure-vnet] Failed to query endpoint: Endpoint not found.
2018/07/02 08:16:42 [cni-net] DEL command completed with err:<nil>.
2018/07/02 08:16:42 [cni-net] Plugin stopped.
2018/07/02 08:16:42 [cni-net] Plugin azure-vnet version v1.0.2.
2018/07/02 08:16:42 [cni-net] Running on Linux version 4.15.0-1013-azure (buildd@lcy01-amd64-006) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)) #13~16.04.2-Ubuntu SMP Wed May 30 01:39:27 UTC 2018
2018/07/02 08:16:42 [net] Network interface: {Index:1 MTU:65536 Name:lo HardwareAddr: Flags:up|loopback} with IP addresses: [127.0.0.1/8 ::1/128]
2018/07/02 08:16:42 [net] Network interface: {Index:2 MTU:1500 Name:eth0 HardwareAddr:00:0d:3a:29:83:17 Flags:up|broadcast} with IP addresses: [fe80::20d:3aff:fe29:8317/64]
2018/07/02 08:16:42 [net] Network interface: {Index:3 MTU:1500 Name:docker0 HardwareAddr:02:42:ea:72:50:50 Flags:up|broadcast|multicast} with IP addresses: [172.17.0.1/16]
2018/07/02 08:16:42 [net] Network interface: {Index:4 MTU:1500 Name:azure0 HardwareAddr:00:0d:3a:29:83:17 Flags:up|broadcast|multicast} with IP addresses: [10.240.2.2/12 fe80::20d:3aff:fe29:8317/64]
2018/07/02 08:16:42 [net] Network interface: {Index:5 MTU:1500 Name:azvethd3b5ed9 HardwareAddr:f2:29:d3:58:12:9f Flags:up|broadcast} with IP addresses: [fe80::f029:d3ff:fe58:129f/64]
2018/07/02 08:16:42 [net] Store timestamp is 2018-06-29 22:55:32.841685859 +0000 UTC.
2018/07/02 08:16:42 [net] Restored state, &{Version:v1.0.2 TimeStamp:2018-06-29 22:55:32.848144792 +0000 UTC ExternalInterfaces:map[eth0:0xc420136000] store:0xc420016ed0 Mutex:{state:0 sema:0}}
2018/07/02 08:16:42 [cni-net] Plugin started.
2018/07/02 08:25:26 [cni] Timed out on locking store, err:Store is locked.
2018/07/02 08:25:26 [cni-net] Failed to initialize base plugin, err:Store is locked.
2018/07/02 08:25:26 Failed to start network plugin, err:Store is locked.
2018/07/02 08:25:26 Report plugin error
2018/07/02 08:25:29 [cni] Timed out on locking store, err:Store is locked.
2018/07/02 08:25:29 [cni-net] Failed to initialize base plugin, err:Store is locked.
2018/07/02 08:25:29 Failed to start network plugin, err:Store is locked.
2018/07/02 08:25:29 Report plugin error
2018/07/02 08:25:29 [cni] Timed out on locking store, err:Store is locked.
2018/07/02 08:25:29 [cni-net] Failed to initialize base plugin, err:Store is locked.
2018/07/02 08:25:29 Failed to start network plugin, err:Store is locked.
2018/07/02 08:25:29 Report plugin error

This error continues until we restart the node.

Is this something an upgrade may address? I looked in the release notes for versions >1.0.2 but did not see anything regarding bug fixes for "Store is locked" or similar.
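For what it's worth, a less drastic check than a node restart is to look for a stale lock next to the state files; the ".lock" suffix is an assumption from the store implementation, so verify before deleting anything:

$ ls -l /var/run/azure-vnet.json.lock /var/run/azure-vnet-ipam.json.lock
$ sudo fuser -v /var/run/azure-vnet.json.lock    # is any live process actually holding it?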

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes 1.7.16 (built with acs-engine 0.16.2)
Docker 1.13.1

What happened:

CNI Failed with "err:Store is locked."

What you expected to happen:

CNI plugin to ADD & DEL interfaces as needed.

How to reproduce it (as minimally and precisely as possible):

Build a Kubernetes cluster using acs-engine 0.16.2 with K8s version '1.7'.
Custom build parameters: max-pods=110 and "ipAddressCount": 128 for masters and nodes.

Anything else we need to know:

The cluster had been built and running for 41 days, but this issue only appeared on Monday 2nd July. The cluster has also had intermittent faults with memory exhaustion and has required a cluster restart to recover.

ERROR: failed to allocate gateway (x.x.x.x): No address returned

My Docker host has two interfaces, and I'm using the CNM driver:

  • eth0 with 10.4.70.11/24
  • eth1 with 10.4.71.201 (and in the azure vnet also 10.4.71.11, 12, 13, 14,... assigned)

When starting a Docker container, a bridge named azure3 is created on eth1 along with an interface for the container. The container gets an IP from the available IPs, and I can reach the container from the host and from hosts in 10.4.70.0/24. So far so good.

However, I can't reach the container from another subnet (and the load balancer can't either). The problem is that the default gateway of the host is 10.4.70.1, which is bound to eth0. So incoming packets arrive via the 10.4.71.0/24 subnet on eth1, but the host tries to send the replies back out on eth0.

So I tried to pass the default gateway for the Docker container (docker-compose.yml):

   ...
    networks:
      crate:
        ipv4_address: 10.4.71.11

networks:
  crate:
    driver: azure-vnet
    ipam:
      driver: azure-vnet
      config:
        - subnet: 10.4.71.0/24
          gateway: 10.4.71.1

This yields an error: ERROR: failed to allocate gateway (10.4.71.1): No address returned.
The log file shows:

Nov 08 18:25:27 docker-1 azure-vnet-plugin[12111]: 2017/11/08 18:25:27 [ipam] Requesting address with address:10.4.71.1 options:map[azure.address.type:gateway azure.address.id:].
Nov 08 18:25:27 docker-1 azure-vnet-plugin[12111]: 2017/11/08 18:25:27 [ipam] Address request completed with address:<nil> err:Address not found.
Nov 08 18:25:27 docker-1 azure-vnet-plugin[12111]: 2017/11/08 18:25:27 [azure-vnet-ipam] Sent *cnm.errorResponse &{Err:Address not found}.

The second approach I tried was modifying the host's routing table so that traffic originating from the 10.4.71.0/24 IPs is sent over eth1. I created a second route table that sends traffic from those IPs back to that gateway:

ip route add 10.4.71.0/24 dev azure3 src 10.4.71.201 table 2
ip route add default via 10.4.71.1 dev azure3 table 2
ip rule add from 10.4.71.201/32 table 2
ip rule add to 10.4.71.201/32 table 2

After that, the main IP (10.4.71.201) is reachable from other subnets, but the IPs inside the Docker containers are still not reachable.

Another problem is that azure-vnet-plugin also wipes this routing table, so I need to create it again.
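As a stop-gap, the rules can be reasserted whenever the plugin wipes them. A workaround sketch, not a supported fix (the script name and path are made up):

$ cat <<'EOF' | sudo tee /usr/local/bin/restore-eth1-routes.sh
#!/bin/sh
# reassert the policy routes from above; 'replace' makes the route lines idempotent
ip route replace 10.4.71.0/24 dev azure3 src 10.4.71.201 table 2
ip route replace default via 10.4.71.1 dev azure3 table 2
ip rule list | grep -q 'from 10.4.71.201' || ip rule add from 10.4.71.201/32 table 2
ip rule list | grep -q 'to 10.4.71.201'   || ip rule add to 10.4.71.201/32 table 2
EOF
$ sudo chmod +x /usr/local/bin/restore-eth1-routes.sh
# run it from cron or a systemd timer until the plugin behavior is understood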

What's the best/correct/working approach to get CNM working with two interfaces? How should routing be set up properly?

Failed to set up network

Is this a request for help?:


ISSUE

Which release version?:
1.0.8


Which component (CNI/IPAM/CNM/CNS):
CNI


Which Operating System (Linux/Windows):
Linux


For Linux: Include Distro and kernel version using "uname -a"
Linux kubetest-worker0 4.15.0-1013-azure #13-Ubuntu SMP Thu May 24 14:42:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

For windows: provide output of "$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion"


Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

What happened:
Can't spin up a container. I receive the following error message in journalctl:

Jul 24 18:20:36 kubetest-worker0 kubelet[1147]: E0724 18:20:36.959668 1147 pod_workers.go:186] Error syncing pod ce7c5890-8f6d-11e8-b231-000d3ab3eb34 ("kube-dns-598d7bf7d4-hn2g4_kube-system(ce7c5890-8f6d-11e8-b231-000d3ab3eb34)"), skipping: failed to "CreatePodSandbox" for "kube-dns-598d7bf7d4-hn2g4_kube-system(ce7c5890-8f6d-11e8-b231-000d3ab3eb34)" with CreatePodSandboxError: "CreatePodSandbox for pod "kube-dns-598d7bf7d4-hn2g4_kube-system(ce7c5890-8f6d-11e8-b231-000d3ab3eb34)" failed: rpc error: code = Unknown desc = failed to setup network for sandbox "51f9b733e40b8d12e5a0c5b4825303adbe54376f4820f8bcff4dafa938afe31c": incompatible CNI versions; config is "0.3.1", plugin supports ["0.1.0" "0.2.0"]"

andrei@kubetest-worker0:~$ cat /etc/cni/net.d/10-azure.conflist
{
"cniVersion":"0.3.0",
"name":"azure",
"plugins":[
{
"type":"azure-vnet",
"mode":"bridge",
"bridge":"azure0",
"ipam":{
"type":"azure-vnet-ipam"
}
},
{
"type":"portmap",
"capabilities":{
"portMappings":true
},
"snat":true
}
]
}
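The sandbox error names spec version 0.3.1 while this conflist pins 0.3.0, so either another config file or an old binary is being picked up. A diagnostic sketch (paths taken from the config above):

$ ls -l /etc/cni/net.d/    # any other .conf/.conflist that kubelet might load first?
$ ls -l /opt/cni/bin/      # are the reference plugins (portmap, loopback) old enough to predate spec 0.3.x?
# If the reference plugins only advertise 0.1.0/0.2.0, updating them from a newer
# containernetworking/plugins release -- or lowering "cniVersion" to one every
# plugin supports -- should clear the error.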

container stuck on creation

Anything else we need to know:


mac " 12:34:56:78:9a:bc" in tunnel mode of CNI

Is this a request for help?:
yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
ALL

Which component (CNI/IPAM/CNM/CNS):
CNI

In the k8s CNI plugin, using tunnel mode, with 2 nodes and 4 pods:
azureuser@k8s-master-34676867-0:~$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-7c87f569d-2222p 1/1 Running 0 30s 10.240.0.81 k8s-agentpool1-34676867-0
nginx-7c87f569d-lhh7p 1/1 Running 0 30s 10.240.0.43 k8s-agentpool1-34676867-1
nginx-7c87f569d-nhh4t 1/1 Running 0 30s 10.240.0.86 k8s-agentpool1-34676867-0
nginx-7c87f569d-zndr2 1/1 Running 0 30s 10.240.0.55 k8s-agentpool1-34676867-1

the ebtables rules look like this:
root@k8s-agentpool1-34676867-1:~# ebtables -t nat -L
Bridge table: nat

Bridge chain: PREROUTING, entries: 11, policy: ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.240.0.34 -j arpreply --arpreply-mac 0:d:3a:30:47:42
-p ARP -i eth0 --arp-op Reply -j dnat --to-dst ff:ff:ff:ff:ff:ff --dnat-target ACCEPT
-i az+ -j dnat --to-dst 12:34:56:78:9a:bc --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.240.0.45 -j arpreply --arpreply-mac 12:34:56:78:9a:bc
-p IPv4 -i eth0 --ip-dst 10.240.0.45 -j dnat --to-dst da:63:e7:2e:d0:be --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.240.0.37 -j arpreply --arpreply-mac 12:34:56:78:9a:bc
-p IPv4 -i eth0 --ip-dst 10.240.0.37 -j dnat --to-dst ce:b9:fd:c5:36:79 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.240.0.43 -j arpreply --arpreply-mac 12:34:56:78:9a:bc
-p IPv4 -i eth0 --ip-dst 10.240.0.43 -j dnat --to-dst 3e:90:86:28:e5:39 --dnat-target ACCEPT
-p ARP --arp-op Request --arp-ip-dst 10.240.0.55 -j arpreply --arpreply-mac 12:34:56:78:9a:bc
-p IPv4 -i eth0 --ip-dst 10.240.0.55 -j dnat --to-dst 86:f9:28:da:36:27 --dnat-target ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: POSTROUTING, entries: 1, policy: ACCEPT
-s Unicast -o eth0 -j snat --to-src 0:d:3a:30:47:42 --snat-arp --snat-target ACCEPT

All traffic between pods on the same host is DNATed to 12:34:56:78:9a:bc:
-i az+ -j dnat --to-dst 12:34:56:78:9a:bc --dnat-target ACCEPT

and all the ARP replies for pod addresses also carry 12:34:56:78:9a:bc.

What is 12:34:56:78:9a:bc? Is it a special device in the Azure VFP?

Which Orchestrator and version (e.g. Kubernetes, Docker)

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

IPAM plugin on AKS assigns duplicate IPs

Is this a request for help?: YES


Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
AKS latest version with cni

Which component (CNI/IPAM/CNM/CNS):

---IPAM

Which Operating System (Linux/Windows):
linux

For Linux: Include Distro and kernel version using "uname -a"


For windows: provide output of "$(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion"


Which Orchestrator and version (e.g. Kubernetes, Docker)

---Kubernetes

What happened:

---We see IPAM handing out duplicate IPs

What you expected to happen:

---Unique IPs

How to reproduce it (as minimally and precisely as possible):

---We created a new cluster and see this behavior

Anything else we need to know:


from the cluster

[cin@l004 ~]$ kubectl get pods -o=wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system azureproxy-6496d6f4c6-qkk86 0/1 CrashLoopBackOff 749 2d 10.128.16.133 aks-agentpool-66686873-5
kube-system heapster-5f7c7df649-84zhx 2/2 Running 9 3d 10.128.16.50 aks-agentpool-66686873-4
kube-system kube-dns-v20-7c556f89c5-d8xgt 3/3 Running 3 5d 10.128.16.87 aks-agentpool-66686873-2
kube-system kube-dns-v20-7c556f89c5-zdm54 3/3 Running 3 5d 10.128.16.21 aks-agentpool-66686873-3
kube-system kube-proxy-28576 1/1 Running 1 5d 10.128.16.66 aks-agentpool-66686873-2
kube-system kube-proxy-c2tc9 1/1 Running 1 5d 10.128.16.97 aks-agentpool-66686873-1
kube-system kube-proxy-hr7wh 1/1 Running 1 5d 10.128.16.159 aks-agentpool-66686873-0
kube-system kube-proxy-j5w2s 1/1 Running 2 5d 10.128.16.252 aks-agentpool-66686873-8
kube-system kube-proxy-n28vl 1/1 Running 2 5d 10.128.16.35 aks-agentpool-66686873-4
kube-system kube-proxy-qj7x8 1/1 Running 1 5d 10.128.16.128 aks-agentpool-66686873-5
kube-system kube-proxy-qxx66 1/1 Running 1 5d 10.128.16.4 aks-agentpool-66686873-3
kube-system kube-proxy-tqhd2 1/1 Running 2 5d 10.128.16.190 aks-agentpool-66686873-7
kube-system kube-state-metrics-599b4df445-4tk49 2/2 Running 3 5d 10.128.16.163 aks-agentpool-66686873-0
kube-system kube-svc-redirect-8crml 1/1 Running 653 5d 10.128.16.190 aks-agentpool-66686873-7
kube-system kube-svc-redirect-hnvbd 0/1 CrashLoopBackOff 676 5d 10.128.16.159 aks-agentpool-66686873-0
kube-system kube-svc-redirect-jdsd5 1/1 Running 293 5d 10.128.16.128 aks-agentpool-66686873-5
kube-system kube-svc-redirect-ldvbd 1/1 Running 678 5d 10.128.16.97 aks-agentpool-66686873-1
kube-system kube-svc-redirect-prgnk 0/1 CrashLoopBackOff 673 5d 10.128.16.35 aks-agentpool-66686873-4
kube-system kube-svc-redirect-vm886 0/1 CrashLoopBackOff 676 5d 10.128.16.66 aks-agentpool-66686873-2
kube-system kube-svc-redirect-z6cwh 0/1 CrashLoopBackOff 677 5d 10.128.16.252 aks-agentpool-66686873-8
kube-system kube-svc-redirect-z9cxf 1/1 Running 680 5d 10.128.16.4 aks-agentpool-66686873-3
kube-system kubernetes-dashboard-d876d4cdf-z7rtl 0/1 CrashLoopBackOff 636 2d 10.128.16.75 aks-agentpool-66686873-2
kube-system omsagent-cxsfg 1/1 Running 0 3d 10.128.16.178 aks-agentpool-66686873-0
kube-system omsagent-dc52f 0/1 CrashLoopBackOff 893 3d 10.128.17.0 aks-agentpool-66686873-8
kube-system omsagent-nb7pj 1/1 Running 0 3d 10.128.16.31 aks-agentpool-66686873-3
kube-system omsagent-rs-6c5d58b44c-kd884 1/1 Running 895 3d 10.128.16.45 aks-agentpool-66686873-4
kube-system omsagent-t2qqj 0/1 CrashLoopBackOff 895 3d 10.128.16.41 aks-agentpool-66686873-4
kube-system omsagent-t99t2 1/1 Running 411 3d 10.128.16.153 aks-agentpool-66686873-5
kube-system omsagent-twkfr 1/1 Running 179 3d 10.128.16.199 aks-agentpool-66686873-7
kube-system omsagent-ws9wn 1/1 Running 0 3d 10.128.16.125 aks-agentpool-66686873-1
kube-system omsagent-xzwlx 1/1 Running 0 3d 10.128.16.75 aks-agentpool-66686873-2
kube-system tiller-deploy-6dcb966466-mnnrf 1/1 Running 150 2d 10.128.16.131 aks-agentpool-66686873-5
kube-system tunnelfront-74c8d6658c-pb9qf 1/1 Running 266 2d 10.128.16.134 aks-agentpool-66686873-5
nexus-dev-monitoring grafana-687699bc96-wn79p 0/1 Running 107 2d 10.128.16.151 aks-agentpool-66686873-5
nexus-dev-monitoring prometheus-alertmanager-85d944f874-hthh7 1/2 Running 4 5d 10.128.17.9 aks-agentpool-66686873-8
nexus-dev-monitoring prometheus-blackbox-exporter-5856c85b8-cpv6b 1/1 Running 1 5d 10.128.16.103 aks-agentpool-66686873-1
nexus-dev-monitoring prometheus-kube-state-metrics-566669df8c-zzfdw 1/1 Running 3 5d 10.128.16.19 aks-agentpool-66686873-3
nexus-dev-monitoring prometheus-node-exporter-8lj6z 1/1 Running 1 5d 10.128.16.66 aks-agentpool-66686873-2
nexus-dev-monitoring prometheus-node-exporter-flhnh 1/1 Running 1 5d 10.128.16.4 aks-agentpool-66686873-3
nexus-dev-monitoring prometheus-node-exporter-gqlk7 1/1 Running 1 5d 10.128.16.159 aks-agentpool-66686873-0
nexus-dev-monitoring prometheus-node-exporter-l52zq 1/1 Running 2 5d 10.128.16.252 aks-agentpool-66686873-8
nexus-dev-monitoring prometheus-node-exporter-nrm6n 1/1 Running 1 5d 10.128.16.128 aks-agentpool-66686873-5
nexus-dev-monitoring prometheus-node-exporter-t959r 1/1 Running 2 5d 10.128.16.190 aks-agentpool-66686873-7
nexus-dev-monitoring prometheus-node-exporter-w8mcl 1/1 Running 2 5d 10.128.16.35 aks-agentpool-66686873-4
nexus-dev-monitoring prometheus-node-exporter-whdgt 1/1 Running 1 5d 10.128.16.97 aks-agentpool-66686873-1
nexus-dev-monitoring prometheus-pushgateway-585bdfdcd5-lm2fx 1/1 Running 1 5d 10.128.16.172 aks-agentpool-66686873-0
nexus-dev-monitoring prometheus-server-79d4f794cc-qqn4j 2/2 Running 324 5d 10.128.16.119 aks-agentpool-66686873-1
nexus-dev api-gateway-5796bf487-5r2zs 0/1 CrashLoopBackOff 1252 5d 10.128.16.45 aks-agentpool-66686873-4
nexus-dev api-gateway-5796bf487-tk9j5 0/1 Running 764 5d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev api-server-8579dbf5cb-9tqhr 0/2 Init:CrashLoopBackOff 1015 3d 10.128.16.103 aks-agentpool-66686873-1
nexus-dev api-server-8579dbf5cb-cnbn6 0/2 Init:CrashLoopBackOff 883 3d 10.128.16.67 aks-agentpool-66686873-2
nexus-dev assignments-service-7f8b8d4c4f-6zv7m 2/2 Running 0 2d 10.128.16.44 aks-agentpool-66686873-4
nexus-dev auth-server-76754f9cf5-ftm95 0/1 Running 134 2d 10.128.16.133 aks-agentpool-66686873-5
nexus-dev basic-sftp-f7479796d-tq549 1/1 Running 2 5d 10.128.16.253 aks-agentpool-66686873-8
nexus-dev caseflow-unity-7745d7dbfc-q2hh2 2/2 Running 0 4d 10.128.16.166 aks-agentpool-66686873-0
nexus-dev caseflow-unity2-868f8d4665-s84ff 2/2 Running 0 4d 10.128.16.9 aks-agentpool-66686873-3
nexus-dev caseflow-unity3-6f94558bdf-6svsb 0/2 Init:CrashLoopBackOff 1218 4d 10.128.17.25 aks-agentpool-66686873-8
nexus-dev caseflow-unity3-7b4f6c875f-xm6z4 2/2 Running 2 5d 10.128.16.14 aks-agentpool-66686873-3
nexus-dev cassandra-0 2/2 Running 2 5d 10.128.16.163 aks-agentpool-66686873-0
nexus-dev cassandra-1 2/2 Running 2 5d 10.128.16.100 aks-agentpool-66686873-1
nexus-dev cassandra-2 1/2 Running 857 5d 10.128.17.10 aks-agentpool-66686873-8
nexus-dev citizen-registry-5dbf5945cd-rqsdk 1/1 Running 23 4d 10.128.16.51 aks-agentpool-66686873-4
nexus-dev digital-letter-microservice-7744554b78-gnk78 1/2 CrashLoopBackOff 792 5d 10.128.17.0 aks-agentpool-66686873-8
nexus-dev dst-microservice-65bfb998b7-lqbkk 0/1 CrashLoopBackOff 486 2d 10.128.16.58 aks-agentpool-66686873-4
nexus-dev email-delivery-service-c5c567d6-dlnnr 1/1 Running 8 5d 10.128.16.61 aks-agentpool-66686873-4
nexus-dev eop-microservice-67786f85cb-72sf2 2/2 Running 0 2d 10.128.16.13 aks-agentpool-66686873-3
nexus-dev eop-microservice-8659c5474-x6mkl 0/2 ContainerCreating 0 1m aks-agentpool-66686873-5
nexus-dev eop-ui-54595468f4-jm779 1/1 Running 0 2d 10.128.16.43 aks-agentpool-66686873-4
nexus-dev esb-central-5cf9c64d65-xmmzr 0/1 Terminating 0 3d 10.128.16.141 aks-agentpool-66686873-5
nexus-dev external-dns-ccb88ddb4-r6qjn 1/1 Running 2 5d 10.128.16.255 aks-agentpool-66686873-8
nexus-dev file-conversion-clear-directories-1530514800-qqx84 1/1 Running 0 33s 10.128.16.194 aks-agentpool-66686873-7
nexus-dev file-conversion-microservice-6c8675865b-rjg4r 1/1 Running 0 2d 10.128.16.24 aks-agentpool-66686873-3
nexus-dev finance-microservice-79685d496b-4krdc 0/1 Init:0/1 371 3d 10.128.16.143 aks-agentpool-66686873-5
nexus-dev finance-microservice-845b897594-5xcdt 0/1 Running 668 3d 10.128.16.49 aks-agentpool-66686873-4
nexus-dev flower-chart-service-646cd899f6-rq82l 1/1 Running 0 2d 10.128.16.64 aks-agentpool-66686873-4
nexus-dev fmk-microservice-54569445b-qb92d 0/1 CrashLoopBackOff 940 3d 10.128.16.49 aks-agentpool-66686873-4
nexus-dev fmk-microservice-654c4868f9-4n2ss 0/1 Running 625 5d 10.128.16.138 aks-agentpool-66686873-5
nexus-dev fmk-microservice-654c4868f9-k59tp 0/1 CrashLoopBackOff 519 2d 10.128.16.255 aks-agentpool-66686873-8
nexus-dev formbuilder-6d69fcccbd-8w7g6 1/1 Running 110 5d 10.128.16.150 aks-agentpool-66686873-5
nexus-dev fs3-classifications-77fb87b96b-tptl8 1/1 Running 0 2d 10.128.16.7 aks-agentpool-66686873-3
nexus-dev grant-offer-portal-microservice-57d567b5d8-l4bw2 0/1 RunContainerError 744 2d 10.128.16.129 aks-agentpool-66686873-5
nexus-dev hcl-catalog-ccfffcbd8-8sk5v 1/1 Running 595 5d 10.128.16.152 aks-agentpool-66686873-5
nexus-dev hcl-catalog-ccfffcbd8-xh7r7 1/1 Running 68 5d 10.128.16.39 aks-agentpool-66686873-4
nexus-dev hcl-logistics-ui-77fbc67f85-jbwxj 1/1 Running 0 2d 10.128.16.47 aks-agentpool-66686873-4
nexus-dev hcl-logistics-ui-77fbc67f85-p7bgc 1/1 Running 1 5d 10.128.16.147 aks-agentpool-66686873-5
nexus-dev hcl-matching-ui-7f6b958bcb-r7skf 1/1 Running 1 5d 10.128.16.150 aks-agentpool-66686873-5
nexus-dev hcl-matching-ui-7f6b958bcb-v27wq 1/1 Running 1 5d 10.128.16.133 aks-agentpool-66686873-5
nexus-dev hcl-search-elasticsearch-client-895968fdb-cdxp5 1/1 Running 1 5d 10.128.16.112 aks-agentpool-66686873-1
nexus-dev hcl-search-elasticsearch-data-0 1/1 Running 22 5d 10.128.16.42 aks-agentpool-66686873-4
nexus-dev hcl-search-elasticsearch-kibana-5d4b595d95-d7sff 1/1 Running 1 5d 10.128.16.101 aks-agentpool-66686873-1
nexus-dev hcl-search-elasticsearch-master-0 1/1 Running 11 5d 10.128.16.63 aks-agentpool-66686873-4
nexus-dev html2pdf-microservice-57647f95-9dxms 0/1 Running 281 2d 10.128.16.134 aks-agentpool-66686873-5
nexus-dev internal-ingress-controller-nginx-plus-ingress-controller-78rvr 1/2 Running 6 5d 10.128.16.68 aks-agentpool-66686873-2
nexus-dev kafka-kafka-0 1/1 Running 4 5d 10.128.16.104 aks-agentpool-66686873-1
nexus-dev kafka-kafka-1 0/1 CrashLoopBackOff 1218 5d 10.128.16.19 aks-agentpool-66686873-3
nexus-dev kafka-zookeeper-0 1/1 Running 1 5d 10.128.16.32 aks-agentpool-66686873-3
nexus-dev kafka-zookeeper-1 1/1 Running 2 5d 10.128.16.41 aks-agentpool-66686873-4
nexus-dev kafka-zookeeper-2 1/1 Running 1 5d 10.128.16.72 aks-agentpool-66686873-2
nexus-dev keycloak-9c8ff78f9-68fqn 2/2 Running 1 2d 10.128.16.74 aks-agentpool-66686873-2
nexus-dev kmd-care-citizen-updater-6ddbdbf976-hh4t5 1/1 Running 0 2d 10.128.16.51 aks-agentpool-66686873-4
nexus-dev letter-layout-toolkit-5bd49585d-sr566 0/1 Running 415 2d 10.128.16.41 aks-agentpool-66686873-4
nexus-dev logviewer-auth-proxy-547f8bd6f6-q6xh8 0/1 Running 548 5d 10.128.16.129 aks-agentpool-66686873-5
nexus-dev logviewer-curator-1530234120-tm588 0/1 CrashLoopBackOff 465 2d 10.128.16.67 aks-agentpool-66686873-2
nexus-dev logviewer-curator-1530320520-llwv7 1/1 Running 339 2d 10.128.16.76 aks-agentpool-66686873-2
nexus-dev logviewer-curator-1530406920-rvv8q 0/1 CrashLoopBackOff 329 1d 10.128.16.215 aks-agentpool-66686873-7
nexus-dev logviewer-curator-1530493320-ls5hr 0/1 CrashLoopBackOff 69 6h 10.128.16.208 aks-agentpool-66686873-7
nexus-dev logviewer-elasticsearch-client-797f78855-8ht2p 0/1 Running 1307 5d 10.128.16.55 aks-agentpool-66686873-4
nexus-dev logviewer-elasticsearch-client-797f78855-smf4f 0/1 CrashLoopBackOff 1377 5d 10.128.16.81 aks-agentpool-66686873-2
nexus-dev logviewer-elasticsearch-data-0 1/1 Running 1 5d 10.128.16.154 aks-agentpool-66686873-5
nexus-dev logviewer-elasticsearch-data-1 1/1 Running 2 5d 10.128.16.46 aks-agentpool-66686873-4
nexus-dev logviewer-elasticsearch-master-0 1/1 Running 1 5d 10.128.16.8 aks-agentpool-66686873-3
nexus-dev logviewer-elasticsearch-master-1 1/1 Running 1 5d 10.128.16.177 aks-agentpool-66686873-0
nexus-dev logviewer-elasticsearch-master-2 1/1 Running 1 5d 10.128.16.83 aks-agentpool-66686873-2
nexus-dev logviewer-fluentd-6546579d87-r6tpb 1/1 Running 1 5d 10.128.16.176 aks-agentpool-66686873-0
nexus-dev logviewer-kibana-586d8b69cf-dfvk9 1/1 Running 1 5d 10.128.16.34 aks-agentpool-66686873-3
nexus-dev medcom-microservice-687bd86b58-kbnsl 0/1 CrashLoopBackOff 1196 4d 10.128.16.50 aks-agentpool-66686873-4
nexus-dev mobile-disentangler-microservice-66bd6f6695-gjcj7 1/1 Running 2 5d 10.128.17.3 aks-agentpool-66686873-8
nexus-dev mobile-disentangler-microservice-66bd6f6695-wft7w 0/1 CrashLoopBackOff 727 5d 10.128.17.13 aks-agentpool-66686873-8
nexus-dev mobile-workforce-management-7db94f5b97-vh55c 0/1 Running 1239 5d 10.128.17.10 aks-agentpool-66686873-8
nexus-dev mobility-aids-cf6c9f6cd-czmrf 0/1 Running 1380 5d 10.128.16.46 aks-agentpool-66686873-4
nexus-dev mock-cpr-service-platform-7fbbbdcd5f-xhbvm 1/1 Running 2 5d 10.128.16.36 aks-agentpool-66686873-4
nexus-dev mysql-proxy-apiserver-58fdccf6f-tk4hr 1/1 Running 958 5d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev mysql-proxy-apiserver-dbaas-5ff9b4f49d-vg5b5 1/1 Running 1 5d 10.128.16.10 aks-agentpool-66686873-3
nexus-dev mysql-proxy-microservices-5c6f9997-lp68f 0/1 CrashLoopBackOff 1591 5d 10.128.16.45 aks-agentpool-66686873-4
nexus-dev mysql-proxy-microservices-dbaas-6c587cd888-whpfb 1/1 Running 1 5d 10.128.16.33 aks-agentpool-66686873-3
nexus-dev nas-microservice-77845784cf-skrzj 0/1 Running 223 2d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev nas-microservice-7ff6c79c95-n4rj7 0/1 CrashLoopBackOff 419 2d 10.128.16.58 aks-agentpool-66686873-4
nexus-dev nas-publish-1530515100-zjqsb 0/1 ContainerCreating 0 23s aks-agentpool-66686873-7
nexus-dev nexus-offline-data-service-576dcc477c-fx5mp 0/1 Running 101 4d 10.128.16.142 aks-agentpool-66686873-5
nexus-dev nexus-offline-data-service-576dcc477c-k6lzg 0/1 CrashLoopBackOff 807 2d 10.128.16.253 aks-agentpool-66686873-8
nexus-dev nexus-queue-rabbitmq-ha-0 1/1 Running 6 5d 10.128.16.182 aks-agentpool-66686873-0
nexus-dev nexus-queue-rabbitmq-ha-1 0/1 Running 13 5d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev nexus-queue-rabbitmq-ha-2 1/1 Running 5 5d 10.128.16.11 aks-agentpool-66686873-3
nexus-dev nexus-ui-68d88fb65f-6lc9f 1/1 Running 0 2h 10.128.16.191 aks-agentpool-66686873-7
nexus-dev nexus-ui-68d88fb65f-l9zpv 0/1 ContainerCreating 0 2h aks-agentpool-66686873-5
nexus-dev nexus-ui-core-bdf6479cf-v6jcz 1/1 Running 0 2d 10.128.16.52 aks-agentpool-66686873-4
nexus-dev nginx-plus-ingress-controller-5c4f8f877f-qvhgn 1/2 Running 8 5d 10.128.16.69 aks-agentpool-66686873-2
nexus-dev nginx-plus-ingress-controller-5c4f8f877f-scvsm 1/2 CrashLoopBackOff 1551 5d 10.128.16.16 aks-agentpool-66686873-3
nexus-dev outgoing-messages-service-7fc659fb84-wsgf9 1/1 Running 3 5d 10.128.17.24 aks-agentpool-66686873-8
nexus-dev professional-shift-microservice-7ff9954fc8-msn7n 1/1 Running 184 2d 10.128.17.13 aks-agentpool-66686873-8
nexus-dev push-to-exchange-microservice-7b74d4cc6c-rcffr 1/1 Running 14 5d 10.128.17.9 aks-agentpool-66686873-8
nexus-dev qampo-microservice-b6b6f6944-8kwgg 1/1 Running 20 2d 10.128.16.220 aks-agentpool-66686873-7
nexus-dev smdb-microservice-7bc66b8b9d-6njn7 1/1 Running 92 2d 10.128.16.209 aks-agentpool-66686873-7
nexus-dev sms-service-6c6d7bdf87-d9tjf 0/1 CrashLoopBackOff 1465 5d 10.128.16.129 aks-agentpool-66686873-5
nexus-dev sms-service-6c6d7bdf87-vbg2f 1/1 Running 3 5d 10.128.16.253 aks-agentpool-66686873-8
nexus-dev user-data-access-577675f6d6-c9jwd 0/1 Running 698 5d 10.128.16.129 aks-agentpool-66686873-5

I’ve checked the ipam logs for address 10.128.16.130, and this address is being allocated and released by Azure IPAM over and over, despite the fact that it’s already in use by other pods:

2018/06/29 15:15:39 [ipam] Address request completed with address:10.128.16.130/20 err:.
2018/06/29 15:15:39 [cni-ipam] Allocated address 10.128.16.130/20.
2018/06/29 15:15:39 [cni-ipam] ADD command completed with result:IP:[{Version:4 Interface: Address:{IP:10.128.16.130 Mask:fffff000} Gateway:10.128.16.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.128.16.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:.
2018/06/29 15:16:10 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address:10.128.16.130 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/29 15:16:11 [ipam] Releasing address with address:10.128.16.130 options:map[].
2018/06/29 15:16:11 [ipam] Address release completed with address:10.128.16.130 err:.
2018/06/29 15:16:21 [ipam] Address request completed with address:10.128.16.130/20 err:.
2018/06/29 15:16:21 [cni-ipam] Allocated address 10.128.16.130/20.
2018/06/29 15:16:21 [cni-ipam] ADD command completed with result:IP:[{Version:4 Interface: Address:{IP:10.128.16.130 Mask:fffff000} Gateway:10.128.16.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.128.16.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:.
2018/06/29 15:16:53 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address:10.128.16.130 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/29 15:16:53 [ipam] Releasing address with address:10.128.16.130 options:map[].
2018/06/29 15:16:53 [ipam] Address release completed with address:10.128.16.130 err:.
2018/06/29 16:05:06 [ipam] Address request completed with address:10.128.16.130/20 err:.
2018/06/29 16:05:06 [cni-ipam] Allocated address 10.128.16.130/20.
2018/06/29 16:05:06 [cni-ipam] ADD command completed with result:IP:[{Version:4 Interface: Address:{IP:10.128.16.130 Mask:fffff000} Gateway:10.128.16.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.128.16.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:.
2018/06/29 16:06:34 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address:10.128.16.130 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/29 16:06:35 [ipam] Releasing address with address:10.128.16.130 options:map[].
2018/06/29 16:06:35 [ipam] Address release completed with address:10.128.16.130 err:.
2018/06/29 16:17:18 [ipam] Address request completed with address:10.128.16.130/20 err:.
2018/06/29 16:17:18 [cni-ipam] Allocated address 10.128.16.130/20.
2018/06/29 16:17:18 [cni-ipam] ADD command completed with result:IP:[{Version:4 Interface: Address:{IP:10.128.16.130 Mask:fffff000} Gateway:10.128.16.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.128.16.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:.
2018/06/29 16:17:44 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address:10.128.16.130 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/29 16:17:44 [ipam] Releasing address with address:10.128.16.130 options:map[].
2018/06/29 16:17:44 [ipam] Address release completed with address:10.128.16.130 err:.
2018/06/30 03:00:37 [ipam] Address request completed with address:10.128.16.130/20 err:.
2018/06/30 03:00:37 [cni-ipam] Allocated address 10.128.16.130/20.
2018/06/30 03:00:37 [cni-ipam] ADD command completed with result:IP:[{Version:4 Interface: Address:{IP:10.128.16.130 Mask:fffff000} Gateway:10.128.16.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.128.16.1}], DNS:{Nameservers:[168.63.129.16] Domain: Search:[] Options:[]} err:.
2018/06/30 03:05:08 [cni-ipam] Read network configuration &{CNIVersion:0.3.0 Name:azure Type:azure-vnet Mode:bridge Master: Bridge:azure0 LogLevel: LogTarget: Ipam:{Type:azure-vnet-ipam Environment: AddrSpace: Subnet:10.128.16.0/20 Address:10.128.16.130 QueryInterval:} DNS:{Nameservers:[] Domain: Search:[] Options:[]} AdditionalArgs:[]}.
2018/06/30 03:05:09 [ipam] Releasing address with address:10.128.16.130 options:map[].
2018/06/30 03:05:09 [ipam] Address release completed with address:10.128.16.130 err:.

[cin@l004 .kube]$ kubectl get pods -o=wide --all-namespaces | grep aks-agentpool-66686873-5 | grep 130
nexus-dev api-gateway-5796bf487-tk9j5 0/1 Running 764 5d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev mysql-proxy-apiserver-58fdccf6f-tk4hr 1/1 Running 958 5d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev nas-microservice-77845784cf-skrzj 0/1 Running 223 2d 10.128.16.130 aks-agentpool-66686873-5
nexus-dev nexus-queue-rabbitmq-ha-1 0/1 Running 13 5d 10.128.16.130 aks-agentpool-66686873-5
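
For reference, the duplication above can also be confirmed programmatically rather than by grepping kubectl output. This is a hedged sketch, assuming a recent client-go and a kubeconfig in the default location; it is a diagnostic helper, not part of the plugin:

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load ~/.kube/config and build a clientset.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List pods in all namespaces and group them by assigned pod IP.
	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	byIP := map[string][]string{}
	for _, p := range pods.Items {
		if ip := p.Status.PodIP; ip != "" {
			byIP[ip] = append(byIP[ip], p.Namespace+"/"+p.Name)
		}
	}

	// Any IP held by more than one pod reflects the duplication shown above.
	for ip, names := range byIP {
		if len(names) > 1 {
			fmt.Println(ip, names)
		}
	}
}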

CNM, --ipam-query-url switch missing from cmd

Is this a request for help?:
Yes

Which release version?:
v1.0.3

Which component (CNI/IPAM/CNM/CNS):
CNM

Which environment (Azure/MAS)
MAS

Which Operating System (Linux/Windows):
Windows, 10.0.17134.1 (WinBuild.160101.0800)

Which Orchestrator and version (e.g. Kubernetes, Docker)
Docker CE 18.03.1-ce

What happened:
Not able to get subnets from MAS, because the plugin queries the hardcoded http://169.254.169.254:6642/ListNetwork address and there is no cmd option to specify a different one.

What you expected to happen:
To be able to specify the correct URL using an --ipam-query-url cmd switch.

How to reproduce it (as minimally and precisely as possible):

  • Look at docs/cnm.md and see that there is no --ipam-query-url switch available.
  • Look at ipam/mas.go and see that the hardcoded masQueryUrl is used when nothing else is specified.

Anything else we need to know:
I have already implemented a fix for this in #140, but it is waiting for review. A sketch of the requested switch is shown below.
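
As a hedged illustration only — the flag name follows this issue's proposal, and the default value is the hardcoded masQueryUrl cited above; this is neither the plugin's actual wiring nor the contents of #140:

package main

import (
	"flag"
	"fmt"
)

// Hardcoded default cited from ipam/mas.go in this issue.
const defaultMasQueryUrl = "http://169.254.169.254:6642/ListNetwork"

func main() {
	// Allow the MAS query URL to be overridden on the command line.
	queryUrl := flag.String("ipam-query-url", defaultMasQueryUrl,
		"URL to query for MAS network information")
	flag.Parse()

	// A real plugin would hand this URL to its MAS address source.
	fmt.Println("using IPAM query URL:", *queryUrl)
}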

CNI plugins show "index out of range" on CentOS 7.4 for Kubernetes 1.8.9

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

Which release version?:
v1.0.3

Which component (CNI/IPAM/CNM/CNS):
CNI

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes v1.8.9, CentOS 7.4

What happened:

The CNI plugin does not work after installation; the error from /var/log/messages is shown below:

Mar 31 10:24:17 localhost kubelet: 2018/03/31 10:24:17 File not exist /var/run/AzureCNITelemetry.json
Mar 31 10:24:17 localhost kubelet: 2018/03/31 10:24:17 GetReport state file didn't exist. Setting flag to true
Mar 31 10:24:17 localhost kubelet: panic: runtime error: index out of range
Mar 31 10:24:17 localhost kubelet: goroutine 1 [running]:
Mar 31 10:24:17 localhost kubelet: github.com/Azure/azure-container-networking/telemetry.(*Report).GetOSDetails(0xc4200d0640)
Mar 31 10:24:17 localhost kubelet: /go/src/github.com/Azure/azure-container-networking/telemetry/telemetry_linux.go:112 +0x55c
Mar 31 10:24:17 localhost kubelet: github.com/Azure/azure-container-networking/telemetry.(*ReportManager).GetReport(0xc420012340, 0x6ea639, 0x3, 0x70d874, 0x6)
Mar 31 10:24:17 localhost kubelet: /go/src/github.com/Azure/azure-container-networking/telemetry/telemetry.go:138 +0xa3
Mar 31 10:24:17 localhost kubelet: main.main()
Mar 31 10:24:17 localhost kubelet: /go/src/github.com/Azure/azure-container-networking/cni/network/plugin/main.go:65 +0x6e6
Mar 31 10:24:17 localhost kubelet: E0331 10:24:17.369675 103320 cni.go:319] Error deleting network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
Mar 31 10:24:17 localhost dockerd-current: time="2018-03-31T10:24:17.370528075Z" level=error msg="Handler for POST /v1.26/containers/f2846470d4734c61bee35b9e76bcff3ca9d704ff0b7320398ebe3da2a2a26163/stop returned error: Container f2846470d4734c61bee35b9e76bcff3ca9d704ff0b7320398ebe3da2a2a26163 is already stopped"
Mar 31 10:24:17 localhost kubelet: E0331 10:24:17.371366 103320 remote_runtime.go:115] StopPodSandbox "f2846470d4734c61bee35b9e76bcff3ca9d704ff0b7320398ebe3da2a2a26163" from runtime service failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "kube-dns-545bc4bfd4-pzc4c_kube-system" network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
Mar 31 10:24:17 localhost kubelet: E0331 10:24:17.371439 103320 kuberuntime_manager.go:784] Failed to stop sandbox {"docker" "f2846470d4734c61bee35b9e76bcff3ca9d704ff0b7320398ebe3da2a2a26163"}
Mar 31 10:24:17 localhost kubelet: E0331 10:24:17.371537 103320 kuberuntime_manager.go:584] killPodWithSyncResult failed: failed to "KillPodSandbox" for "6b0cd455-34c6-11e8-a328-000d3a1ab23a" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "kube-dns-545bc4bfd4-pzc4c_kube-system" network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input"
Mar 31 10:24:17 localhost kubelet: E0331 10:24:17.371587 103320 pod_workers.go:182] Error syncing pod 6b0cd455-34c6-11e8-a328-000d3a1ab23a ("kube-dns-545bc4bfd4-pzc4c_kube-system(6b0cd455-34c6-11e8-a328-000d3a1ab23a)"), skipping: failed to "KillPodSandbox" for "6b0cd455-34c6-11e8-a328-000d3a1ab23a" with KillPodSandboxError: "rpc error: code = Unknown desc = NetworkPlugin cni failed to teardown pod "kube-dns-545bc4bfd4-pzc4c_kube-system" network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input"

What you expected to happen:
The CNI for kubernetes should work properly on CentOS 7.X

How to reproduce it (as minimally and precisely as possible):

  1. Find a CentOS 7.4 VM
  2. Install kubeadm and Kubernetes 1.8.9
  3. Install the Azure CNI plugins for Kubernetes on CentOS
  4. Check /var/log/messages, kubectl describe nodes
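
The stack trace points at GetOSDetails in telemetry/telemetry_linux.go:112, which suggests the code indexes the result of splitting an OS release string without checking its length. As a hedged illustration — the file path and field layout here are assumptions, not the plugin's actual code — the defensive pattern looks like this:

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	// Assumed source of OS details; CentOS formats release files
	// differently from Ubuntu, which is one way a blind index into
	// a split result can go out of range.
	f, err := os.Open("/etc/os-release")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		parts := strings.SplitN(scanner.Text(), "=", 2)
		if len(parts) != 2 {
			continue // guard before indexing instead of panicking
		}
		fmt.Printf("%s -> %s\n", parts[0], strings.Trim(parts[1], `"`))
	}
}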

Anything else we need to know:

kube-dns should be first and only DNS server for Windows pods

Is this an ISSUE or FEATURE REQUEST? (choose one): Issue


Which release version?: v1.0.3


Which component (CNI/IPAM/CNM/CNS): CNI


Which Operating System (Linux/Windows):

Windows Server version 1803


Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

What happened:

168.x.x.x was first in the DNS server list, so DNS names within the k8s cluster could not be resolved.

What you expected to happen:

On Linux, 10.0.0.10 is the first and only server, as shown below. The same should be true on Windows.

/etc # cat resolv.conf
nameserver 10.0.0.10
search default.svc.cluster.local svc.cluster.local cluster.local lje1130gi5iehkdbexza4ptcba.xx.internal.cloudapp.net

How to reproduce it (as minimally and precisely as possible):
Deploy a Windows cluster using acs-engine v0.17

Default mode of Microsoft Azure Container Networking plugin

Is this a request for help?:
yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
issue

Which release version?:
1.0.2

Which component (CNI/IPAM/CNM/CNS):
cni

The configuration generated by acs-engine simply copies the file https://github.com/Azure/azure-container-networking/blob/master/cni/azure-linux.conflist, which defines the mode as bridge.
This confuses me, because the docs at https://github.com/Azure/azure-container-networking/blob/master/docs/network.md mention that tunnel is the default. A sketch of how these two statements could both hold follows.
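
To make the question concrete, here is a hedged sketch of how an explicit "mode" in the conflist would win over a documented default: the mode is taken from the conflist when set, and a compiled-in default would only apply when the field is omitted. Field names mirror the NetworkConfig dumps seen in the ipam logs earlier on this page; the fallback behavior is an assumption, not a quote of the plugin source:

package main

import (
	"encoding/json"
	"fmt"
)

type netConf struct {
	CNIVersion string `json:"cniVersion"`
	Name       string `json:"name"`
	Type       string `json:"type"`
	Mode       string `json:"mode"`
}

func main() {
	// The shipped conflist sets "mode": "bridge" explicitly, so the
	// documented default never comes into play for acs-engine clusters.
	conflist := []byte(`{"cniVersion":"0.3.0","name":"azure","type":"azure-vnet","mode":"bridge"}`)
	var cfg netConf
	if err := json.Unmarshal(conflist, &cfg); err != nil {
		panic(err)
	}
	if cfg.Mode == "" {
		cfg.Mode = "tunnel" // assumed documented default when nothing is specified
	}
	fmt.Println("effective mode:", cfg.Mode)
}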

Which Orchestrator and version (e.g. Kubernetes, Docker)

What happened:
If I do nothing, the default mode is bridge.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

Errors while configuring Azure CNI with DC/OS

Hi,

I tried to set up the CNI plugin for Azure using Mesosphere DC/OS. For that, I adapted the example from the following guide: https://github.com/Azure/azure-container-networking/blob/master/docs/cni.md

The binaries used are from this package: https://github.com/Azure/azure-container-networking/releases/download/v0.91/azure-vnet-cni-linux-amd64-v0.91.tgz

  • The files azure-vnet and azure-vnet-ipam are placed in the directory /opt/mesosphere/active/cni/
  • The configuration azure-vnet.cni is placed under /opt/mesosphere/etc/dcos/network/cni/ and has the following content:
{
  "cniVersion": "0.2.0",
  "name": "azure",
  "type": "azure-vnet",
  "master": "eth0",
  "bridge": "azure0",
  "logLevel": "debug",
  "ipam": {
    "type": "azure-vnet-ipam",
    "environment": "azure"
  }
}

The CNI plugin gets loaded, but it seems that it's not able to allocate IP addresses from the local/private interface eth0:

/var/log/azure-vnet.log:

2017/12/04 15:45:53 [cni] Calling plugin azure-vnet-ipam ADD nwCfg:&{CNIVersion:0.2.0 Name:azure Type:azure-vnet Mode: Master:eth0 Bridge:azure0 LogLevel:debug LogTarget: Ipam:{Type:azure-vnet-ipam Environment:azure AddrSpace: Subnet: Address: QueryInterval:}}.
2017/12/04 15:45:54 [cni] Plugin azure-vnet-ipam returned result:<nil>, err:Failed to allocate address: No available addresses.
2017/12/04 15:45:54 [azure-vnet] Failed to allocate pool: Failed to delegate: Failed to allocate address: No available addresses.
2017/12/04 15:45:54 [cni-net] ADD command completed with result:<nil> err:Failed to allocate pool: Failed to delegate: Failed to allocate address: No available addresses.
2017/12/04 15:45:54 [cni-net] Plugin stopped.

/var/log/azure-vnet-ipam.log:

2017/12/04 15:45:54 [ipam] Refreshing address source.
2017/12/04 15:45:54 [ipam] Requesting address with address: options:map[azure.address.id:efd73ce3-eth0].
2017/12/04 15:45:54 [ipam] Address request completed with address:<nil> err:<nil>.
2017/12/04 15:45:54 [azure-vnet-ipam] Failed to allocate address: No available addresses.

Full logs are attached to this issue.

I'm running CentOS 7.4 machines with DC/OS 1.10.2 on Azure. Is there anything else I need to configure, either directly on the nodes or within the Azure Portal, to get this running? Any hints appreciated.

Regards,
Jan

CNI/IPAM plugin on Ubuntu 17.10 breaks DNS

Is this a request for help?:

yes

Is this an ISSUE or FEATURE REQUEST? (choose one):

ISSUE

Which release version?:

v1.0.3

Which component (CNI/IPAM/CNM/CNS):

CNI/IPAM

Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes

What happened:
Activating the CNI plugin on Ubuntu 17.10 breaks name resolution (DNS). The switch from eth0 to azure0 leaves DNS in an unusable state.

What you expected to happen:
DNS should continue to work

How to reproduce it (as minimally and precisely as possible):

  1. Provision a Kubernetes cluster with an Ubuntu 17.10 node and CNI/IPAM configured
  2. Schedule a pod
  3. Pod starts to run and errors with ErrImagePull due to being unable to resolve docker.io to pull images

At this point, the node must be rebooted for DNS to work again.

Anything else we need to know:
Ubuntu 17.10 uses systemd-resolved as a caching stub nameserver. More info here: http://manpages.ubuntu.com/manpages/xenial/man8/systemd-resolved.service.8.html

cc @lachie83

Azure CNI IP addresses exhausted, old pod IPs not being reclaimed?

I have a Kubernetes 1.7.5 cluster which has run out of IPs to allocate, even though I only have about 15 pods running. Scheduling a new deployment fails. The events on the pod to be scheduled are:

default   2017-09-28 03:57:02 -0400 EDT   2017-09-28 03:57:02 -0400 EDT   1         hello-4059723819-8s35v   Pod       spec.containers{hello}   Normal    Pulled    kubelet, k8s-agentpool1-18117938-2   Successfully pulled image "myregistry.azurecr.io/mybiz/hello"
default   2017-09-28 03:57:02 -0400 EDT   2017-09-28 03:57:02 -0400 EDT   1         hello-4059723819-8s35v   Pod       spec.containers{hello}   Normal    Created   kubelet, k8s-agentpool1-18117938-2   Created container
default   2017-09-28 03:57:03 -0400 EDT   2017-09-28 03:57:03 -0400 EDT   1         hello-4059723819-8s35v   Pod       spec.containers{hello}   Normal    Started   kubelet, k8s-agentpool1-18117938-2   Started container
default   2017-09-28 03:57:13 -0400 EDT   2017-09-28 03:57:01 -0400 EDT   2         hello-4059723819-tj043   Pod                 Warning   FailedSync   kubelet, k8s-agentpool1-18117938-3   Error syncing pod
default   2017-09-28 03:57:13 -0400 EDT   2017-09-28 03:57:02 -0400 EDT   2         hello-4059723819-tj043   Pod                 Normal    SandboxChanged   kubelet, k8s-agentpool1-18117938-3   Pod sandbox changed, it will be killed and re-created.
default   2017-09-28 03:57:24 -0400 EDT   2017-09-28 03:57:01 -0400 EDT   3         hello-4059723819-tj043   Pod                 Warning   FailedSync   kubelet, k8s-agentpool1-18117938-3   Error syncing pod
default   2017-09-28 03:57:25 -0400 EDT   2017-09-28 03:57:02 -0400 EDT   3         hello-4059723819-tj043   Pod                 Normal    SandboxChanged   kubelet, k8s-agentpool1-18117938-3   Pod sandbox changed, it will be killed and re-created.
[...]

The dashboard looks like this:

[dashboard screenshot omitted]

Eventually, the dashboard shows the error:

Error: failed to start container "hello": Error response from daemon: {"message":"cannot join network of a non running container: 7e95918c6b546714ae20f12349efcc6b4b5b9c1e84b5505cf907807efd57525c"}

The kubelet log on the node shows that an IP address cannot be allocated due to "No available addresses":

E0928 20:54:01.733682    1750 pod_workers.go:182] Error syncing pod 65127a94-a425-11e7-8d64-000d3af4357e ("hello-4059723819-xx16n_default(65127a94-a425-11e7-8d64-000d3af4357e)"), skipping: failed to "CreatePodSandbox" for "hello-4059723819-xx16n_default(65127a94-a425-11e7-8d64-000d3af4357e)" with CreatePodSandboxError: "CreatePodSandbox for pod \"hello-4059723819-xx16n_default(65127a94-a425-11e7-8d64-000d3af4357e)\" failed: rpc error: code = 2 desc = NetworkPlugin cni failed to set up pod \"hello-4059723819-xx16n_default\" network: Failed to allocate address: Failed to delegate: Failed to allocate address: No available addresses"

I did have a cron job running on this cluster every minute, which would have created (and destroyed) a lot of pods in a short amount of time.

Is this perhaps a bug in Azure CNI not reclaiming IP addresses from terminated containers?

How to reproduce it (as minimally and precisely as possible):

I believe this can be reproduced simply by starting and stopping many containers, such as by enabling the batch/v2alpha1 API and running a CronJob every few seconds.
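
Under the same assumptions as the CronJob idea, here is a hedged churn-repro sketch using client-go (the pod name and image are placeholders, and a kubeconfig in the default location is assumed): repeatedly create and delete a short-lived pod, then check whether the node's pool of available IPs shrinks over time.

package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "ip-churn-test"},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{
				{Name: "pause", Image: "k8s.gcr.io/pause:3.1"},
			},
		},
	}

	// Each cycle allocates one pod IP and should release it on delete;
	// if IPs leak, later cycles eventually fail with "No available addresses".
	for i := 0; i < 50; i++ {
		if _, err := cs.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{}); err != nil {
			panic(err)
		}
		time.Sleep(10 * time.Second) // let the sandbox come up
		if err := cs.CoreV1().Pods("default").Delete(context.TODO(), "ip-churn-test", metav1.DeleteOptions{}); err != nil {
			panic(err)
		}
		time.Sleep(10 * time.Second) // let teardown finish before re-creating
		fmt.Println("cycle", i, "done")
	}
}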

Anything else we need to know?:
No

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.6", GitCommit:"4bc5e7f9a6c25dc4c03d4d656f2cefd21540e28c", GitTreeState:"clean", BuildDate:"2017-09-14T06:55:55Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T08:56:23Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Azure with Azure CNI networking plugin
  • OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
  • Kernel (e.g. uname -a): Linux k8s-master-18117938-0 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12 14:59:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: ACS-Engine
  • Others:
    Restarting the master has no effect

This is essentially a copy of upstream bug kubernetes/kubernetes#53226.
