
ip-masq-agent's Introduction

ip-masq-agent

The ip-masq-agent configures iptables rules to MASQUERADE traffic sent to destinations outside the link-local range (optional, enabled by default) and outside any additional configured IP ranges.

It creates an iptables chain called IP-MASQ-AGENT, which contains match rules for link local (169.254.0.0/16) and each of the user-specified IP ranges. It also creates a rule in POSTROUTING that jumps to this chain for any traffic not bound for a LOCAL destination.

IPs that match the rules (except for the final rule) in IP-MASQ-AGENT are not subject to MASQUERADE via the IP-MASQ-AGENT chain (they RETURN early from the chain). The final rule in the IP-MASQ-AGENT chain will MASQUERADE any non-LOCAL traffic.

RETURN in IP-MASQ-AGENT resumes rule processing at the next rule in the calling chain, POSTROUTING. Take care not to create additional rules in POSTROUTING that cause packets bound for your configured ranges to undergo MASQUERADE.
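
For illustration, with the default RFC 1918 configuration the resulting nat-table rules look roughly like this (a sketch in iptables -S form; rule comments are omitted and exact output varies by agent version):

-A POSTROUTING -m addrtype ! --dst-type LOCAL -j IP-MASQ-AGENT
-A IP-MASQ-AGENT -d 169.254.0.0/16 -m addrtype ! --dst-type LOCAL -j RETURN
-A IP-MASQ-AGENT -d 10.0.0.0/8 -m addrtype ! --dst-type LOCAL -j RETURN
-A IP-MASQ-AGENT -d 172.16.0.0/12 -m addrtype ! --dst-type LOCAL -j RETURN
-A IP-MASQ-AGENT -d 192.168.0.0/16 -m addrtype ! --dst-type LOCAL -j RETURN
-A IP-MASQ-AGENT -m addrtype ! --dst-type LOCAL -j MASQUERADE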

Launching the agent as a DaemonSet

This repo includes an example yaml file that can be used to launch the ip-masq-agent as a DaemonSet in a Kubernetes cluster.

kubectl create -f ip-masq-agent.yaml

The spec in ip-masq-agent.yaml specifies the kube-system namespace for the DaemonSet Pods.

Configuring the agent

Important: You should not attempt to run this agent in a cluster where the Kubelet is also configuring a non-masquerade CIDR. You can pass --non-masquerade-cidr=0.0.0.0/0 to the Kubelet to nullify its rule, which will prevent the Kubelet from interfering with this agent.

By default, the agent is configured to treat the three private IP ranges specified by RFC 1918 as non-masquerade CIDRs. These ranges are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. To change this behavior, see the flags section below. The agent will also treat link-local (169.254.0.0/16) as a non-masquerade CIDR by default.

By default, the agent is configured to reload its configuration from the /etc/config/ip-masq-agent file in its container every 60 seconds.

The agent configuration file should be written in yaml or json syntax, and may contain five optional keys (an example config follows the list):

  • nonMasqueradeCIDRs []string: A list of strings in CIDR notation that specify the non-masquerade ranges.
  • cidrLimit int: Maximum number of nonMasqueradeCIDRs entries allowed. 64 by default.
  • masqLinkLocal bool: Whether to masquerade traffic to 169.254.0.0/16. False by default.
  • masqLinkLocalIPv6 bool: Whether to masquerade traffic to fe80::/10. False by default.
  • resyncInterval string: The interval at which the agent attempts to reload config from disk. The syntax is any format accepted by Go's time.ParseDuration function.
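
For example, a config that sets all five keys could look like this (values are illustrative):

nonMasqueradeCIDRs:
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
cidrLimit: 64
masqLinkLocal: false
masqLinkLocalIPv6: false
resyncInterval: 60s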

The agent will look for a config file in its container at /etc/config/ip-masq-agent. This file can be provided via a ConfigMap, plumbed into the container via a ConfigMapVolumeSource. As a result, the agent can be reconfigured in a live cluster by creating or editing this ConfigMap.
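
A minimal sketch of how that plumbing usually looks in the DaemonSet Pod spec; the volume, key, and path names here are assumptions, so treat ip-masq-agent.yaml in this repo as the authoritative version:

      containers:
      - name: ip-masq-agent
        volumeMounts:
        - name: config
          mountPath: /etc/config
      volumes:
      - name: config
        configMap:
          name: ip-masq-agent
          optional: true
          items:
          - key: config
            path: ip-masq-agent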

This repo includes a directory-representation of a ConfigMap that can configure the agent (the agent-config directory). To use this directory to create the ConfigMap in your cluster:

kubectl create configmap ip-masq-agent --from-file=agent-config --namespace=kube-system

Note that we created the ConfigMap in the same namespace as the DaemonSet Pods, and named the ConfigMap to match the spec in ip-masq-agent.yaml. This is necessary for the ConfigMap to appear in the Pods' filesystems.
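
If the agent-config directory changes later, the ConfigMap can be regenerated and re-applied in place (older kubectl versions use --dry-run instead of --dry-run=client):

kubectl create configmap ip-masq-agent --from-file=agent-config --namespace=kube-system --dry-run=client -o yaml | kubectl apply -f -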

Agent Flags

The agent accepts three flags, which may be specified in the yaml file; an example of passing them as container args follows the flag descriptions.

masq-chain : The name of the iptables chain to use. Defaults to IP-MASQ-AGENT.

nomasq-all-reserved-ranges : Whether to treat all RFC reserved ranges as non-masquerade destinations when the configmap is empty. The default is false. When false, the agent will masquerade traffic to every destination except the ranges reserved by RFC 1918 (namely 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16). When true, the agent will masquerade traffic to every destination that is not reserved by an RFC; the full list of non-masquerade ranges in that case is 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 100.64.0.0/10, 192.0.0.0/24, 192.0.2.0/24, 192.88.99.0/24, 198.18.0.0/15, 198.51.100.0/24, 203.0.113.0/24, and 240.0.0.0/4. Note, however, that this list is overridden by specifying the nonMasqueradeCIDRs key in the agent configmap.

enable-ipv6 : Whether to configure ip6tables rules. By default enable-ipv6 is false.
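
These flags are passed as container args in the DaemonSet spec, for example (a sketch; the image tag is illustrative and the flags shown are all optional):

      containers:
      - name: ip-masq-agent
        image: registry.k8s.io/networking/ip-masq-agent:v2.9.0
        args:
        - --masq-chain=IP-MASQ-AGENT
        - --nomasq-all-reserved-ranges
        - --enable-ipv6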

Rationale

(from the incubator proposal)

This agent solves the problem of configuring the CIDR ranges for non-masquerade in a cluster (via iptables rules). Today, this is accomplished by passing a --non-masquerade-cidr flag to the Kubelet, which only allows one CIDR to be configured as non-masquerade. RFC 1918, however, defines three ranges (10/8, 172.16/12, 192.168/16) for the private IP address space.

Some users will want to communicate between these ranges without masquerade - for instance, if an organization's existing network uses the 10/8 range, they may wish to run their cluster and Pods in 192.168/16 to avoid IP conflicts. They will also want these Pods to be able to communicate efficiently (no masquerade) with each other and with their existing network resources in 10/8. This requires that every node in their cluster skips masquerade for both ranges.

We are trying to eliminate networking code from the Kubelet, so rather than extend the Kubelet to accept multiple CIDRs, ip-masq-agent allows you to run a DaemonSet that configures a list of CIDRs as non-masquerade.

Releasing

See RELEASE.

Developing

Clone the repo to $GOPATH/src/k8s.io/ip-masq-agent.
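
For example (the repository has lived under both kubernetes-incubator and kubernetes-sigs on GitHub, so substitute whichever location is current, or your own fork):

mkdir -p $GOPATH/src/k8s.io
git clone https://github.com/kubernetes-sigs/ip-masq-agent.git $GOPATH/src/k8s.io/ip-masq-agent
cd $GOPATH/src/k8s.io/ip-masq-agent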

The build tooling is based on thockin/go-build-template.

Run make or make build to compile the ip-masq-agent. This will use a Docker image to build the agent, with the current directory volume-mounted into place. This will store incremental state for the fastest possible build. Run make all-build to build for all architectures.

Run make test to run the unit tests.

Run make container to build the container image. It will calculate the image tag based on the most recent git tag, and whether the repo is "dirty" since that tag (see make version). Run make all-container to build containers for all architectures.

Run make push to push the container image to REGISTRY. Run make all-push to push the container images for all architectures.
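
Because the tooling is based on thockin/go-build-template, the target registry can typically be overridden on the make command line (a sketch; check the Makefile for the exact variable name):

make container REGISTRY=example.com/my-registry
make push REGISTRY=example.com/my-registry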

Run make clean to clean up.

ip-masq-agent's People

Contributors

aramase, bentheelder, bowei, dependabot[bot], grayluck, ixdy, jbeda, jingyuanliang, k8s-ci-robot, kundan2707, ldx, mbssaiakhil, mrhohn, mtaufen, nikhita, ramkumar-k, satyasm, sozercan, spiffxp, stonith, thockin, varunmar, xmudrii, yuwenma, zhuxiaow0


ip-masq-agent's Issues

Take the /run/xtables.lock when updating iptables

E1018 07:00:59.411818 1 ip-masq-agent.go:146] error syncing masquerade rules: exit status 1 (iptables-restore: line 7 failed

We can avoid stepping on other applications that also write iptables rules by taking the lock (like other applications do)
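
One way to do this is to invoke iptables and iptables-restore with their --wait flag, which blocks on /run/xtables.lock instead of failing. Illustrative invocations only (not the agent's current code), and the flag requires a sufficiently recent iptables in the image:

iptables --wait 5 -t nat -C POSTROUTING -m addrtype ! --dst-type LOCAL -j IP-MASQ-AGENT
iptables-restore --wait 5 --noflush < /tmp/ip-masq-agent-rules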

Lint failures

Running golangci-lint: FAIL
cmd/ip-masq-agent/ip-masq-agent.go:274:24: Error return value of `m.iptables.EnsureChain` is not checked (errcheck)
	m.iptables.EnsureChain(utiliptables.TableNAT, masqChain)
	                      ^
cmd/ip-masq-agent/ip-masq-agent_test.go:35:10: Error return value of `flag.Set` is not checked (errcheck)
	flag.Set("logtostderr", "false")
	        ^
cmd/ip-masq-agent/ip-masq-agent_test.go:36:10: Error return value of `flag.Set` is not checked (errcheck)
	flag.Set("masq-chain", "IP-MASQ-AGENT")
	        ^
cmd/ip-masq-agent/ip-masq-agent_test.go:255:11: Error return value of `flag.Set` is not checked (errcheck)
		flag.Set("enable-ipv6", "true")
		        ^
cmd/ip-masq-agent/ip-masq-agent_test.go:354:19: Error return value of `m.syncMasqRules` is not checked (errcheck)
			m.syncMasqRules()
			               ^
cmd/ip-masq-agent/ip-masq-agent_test.go:360:17: Error return value of `fipt.SaveInto` is not checked (errcheck)
			fipt.SaveInto("nat", buf)
			             ^
cmd/ip-masq-agent/ip-masq-agent_test.go:428:23: Error return value of `m.syncMasqRulesIPv6` is not checked (errcheck)
			m.syncMasqRulesIPv6()
			                   ^
cmd/ip-masq-agent/ip-masq-agent_test.go:434:18: Error return value of `fipt6.SaveInto` is not checked (errcheck)
			fipt6.SaveInto("nat", buf)
			              ^
cmd/ip-masq-agent/ip-masq-agent_test.go:361:7: S1030: should use buf.String() instead of string(buf.Bytes()) (gosimple)
			if string(buf.Bytes()) != string(tt.want) {
			   ^
cmd/ip-masq-agent/ip-masq-agent_test.go:362:49: S1030: should use buf.String() instead of string(buf.Bytes()) (gosimple)
				t.Errorf("syncMasqRules wrote %q, want %q", string(buf.Bytes()), tt.want)
				                                            ^
cmd/ip-masq-agent/ip-masq-agent_test.go:435:7: S1030: should use buf.String() instead of string(buf.Bytes()) (gosimple)
			if string(buf.Bytes()) != tt.want {
			   ^
cmd/ip-masq-agent/testing/fakefs/fakefs.go:21:2: SA1019: "io/ioutil" has been deprecated since Go 1.19: As of Go 1.16, the same functionality is now provided by package io or package os, and those implementations should be preferred in new code. See the specific function documentation for details. (staticcheck)
	"io/ioutil"
	^

make: *** [Makefile:372: lint] Error 1

[FEATURE] Implement masquerade CIDRs

Let me preface this by saying I realize there may be reasons this feature has already been considered and rejected, or never considered at all, but I can't find a trace of that and don't want to assume anything.

Context:

  • Kubernetes 1.7.2
  • Flannel overlay networking that is not meshed with networking outside the cluster
  • Nodes can NAT in broad ways via core network configuration, and managing that configuration in an effort to control what pods can NAT to is not tenable

I can think of a number of reasons this feature could be useful, but I'll describe my most straightforward use case.

I want to allow pods to masquerade to an explicit list of CIDRs. If that range of CIDRs is small and/or can be collapsed easily, using the existing nonMasqueradeCIDR list causes me to think a little bit backwards about what I'm trying to accomplish, but it's doable relatively easily.

However, if the CIDRs to which I want to allow masquerade are large in number and/or cannot be collapsed into a very small set of CIDRs, I run into two problems:

  • The number of CIDRs I have to provide to ip-masq-agent is greater than the limit of 64 the code enforces
  • Having even only 64 CIDRs in nonMasqueradeCIDRs makes the config and resulting iptables rules fairly impossible to understand at a glance. The question "what can pods masquerade to?" cannot be easily answered.

For more context on the use case, I would use the feature by creating an explicit list of masqueradeCIDRs, and a single entry of 0.0.0.0/0 for nonMasqueradeCIDRs, effectively disabling the end of chain MASQUERADE catch-all.
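
For concreteness, such a config could look something like this; masqueradeCIDRs is a hypothetical key that does not exist in the agent today:

nonMasqueradeCIDRs:
  - 0.0.0.0/0
masqueradeCIDRs:
  - 203.0.113.0/24
  - 198.51.100.0/24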

A pull request implementing this feature has already been opened.

Thanks for your consideration.

ip-masq-agent conflict with cni rule

k8s version: v1.13.2
host version: centos 7.3
behavior: ip-masq-agent does not work; IPs in the 10.0.0.0/8 range still get SNATed. Is there anything wrong with my ip-masq-agent settings?

iptables:

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
CNI-1fbf321be92bf70398b61947  all  --  10.107.64.64/26      0.0.0.0/0            /* name: "cbr0" id: "8f8c0a517344e3ebf1e09399dda1d552b90a9474dcd5e30a4f598590e9c12ed5" */
CNI-e3cd2dddf7bb8b63f393b425  all  --  10.107.64.64/26      0.0.0.0/0            /* name: "cbr0" id: "55ff0c4ad643d52fb70be88585b3ab152f508b67350cf10b5880bfe306dc0274" */
IP-MASQ-AGENT  all  --  0.0.0.0/0            0.0.0.0/0            /* ip-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom IP-MASQ-AGENT chain */ ADDRTYPE match dst-type !LOCAL

Chain CNI-1fbf321be92bf70398b61947 (1 references)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            10.107.64.64/26      /* name: "cbr0" id: "8f8c0a517344e3ebf1e09399dda1d552b90a9474dcd5e30a4f598590e9c12ed5" */
MASQUERADE  all  --  0.0.0.0/0           !224.0.0.0/4          /* name: "cbr0" id: "8f8c0a517344e3ebf1e09399dda1d552b90a9474dcd5e30a4f598590e9c12ed5" */

Chain CNI-2ee8d1d3c08248c9b227f9af (0 references)
target     prot opt source               destination

Chain CNI-30f721968232f9c23ba9544b (0 references)
target     prot opt source               destination

Chain CNI-6fe6f9122a41e4c9a2957bdc (0 references)
target     prot opt source               destination

Chain CNI-7bb5b1a56865087732f6db6a (0 references)
target     prot opt source               destination

Chain CNI-e3cd2dddf7bb8b63f393b425 (1 references)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            10.107.64.64/26      /* name: "cbr0" id: "55ff0c4ad643d52fb70be88585b3ab152f508b67350cf10b5880bfe306dc0274" */
MASQUERADE  all  --  0.0.0.0/0           !224.0.0.0/4          /* name: "cbr0" id: "55ff0c4ad643d52fb70be88585b3ab152f508b67350cf10b5880bfe306dc0274" */

Chain DOCKER (2 references)
target     prot opt source               destination

Chain IP-MASQ-AGENT (1 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            169.254.0.0/16       /* ip-masq-agent: cluster-local traffic should not be subject to MASQUERADE */ ADDRTYPE match dst-type !LOCAL
RETURN     all  --  0.0.0.0/0            10.0.0.0/8           /* ip-masq-agent: cluster-local traffic should not be subject to MASQUERADE */ ADDRTYPE match dst-type !LOCAL
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* ip-masq-agent: outbound traffic should be subject to MASQUERADE (this match must come after cluster-local CIDR matches) */ ADDRTYPE match dst-type !LOCAL

Create a SECURITY_CONTACTS file.

As per the email sent to kubernetes-dev[1], please create a SECURITY_CONTACTS
file.

The template for the file can be found in the kubernetes-template repository[2].
A description for the file is in the steering-committee docs[3], you might need
to search that page for "Security Contacts".

Please feel free to ping me on the PR when you make it, otherwise I will see when
you close this issue. :)

Thanks so much, let me know if you have any questions.

(This issue was generated from a tool, apologies for any weirdness.)

[1] https://groups.google.com/forum/#!topic/kubernetes-dev/codeiIoQ6QE
[2] https://github.com/kubernetes/kubernetes-template-project/blob/master/SECURITY_CONTACTS
[3] https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance-template-short.md

Add exceptions in nonMasqueradeCIDRs

Hi,

Would it be possible to implement an exception clause for nonMasqueradeCIDRs that would take a range or a list of IP ranges?
Something like:

nonMasqueradeCIDRs:
  - 10.0.0.0/8
  except:
      - 10.100.10.0/24

So that exceptionally, requests to 10.100.10.0/24 would be masqueraded and the rest of requests to 10.0.0.0/8 would not be masqueraded.

make test is broken at head

Thanks to @Ramkumar-K for noticing this - we found that the ip-masq-agent unit tests are broken with the error below:

$ make test
Running tests:
# k8s.io/ip-masq-agent/cmd/ip-masq-agent [k8s.io/ip-masq-agent/cmd/ip-masq-agent.test]
cmd/ip-masq-agent/ip-masq-agent_test.go:322:19: fipt.Lines undefined (type *"k8s.io/kubernetes/pkg/util/iptables/testing".FakeIPTables has no field or method Lines)
cmd/ip-masq-agent/ip-masq-agent_test.go:323:61: fipt.Lines undefined (type *"k8s.io/kubernetes/pkg/util/iptables/testing".FakeIPTables has no field or method Lines)
cmd/ip-masq-agent/ip-masq-agent_test.go:384:20: fipt6.Lines undefined (type *"k8s.io/kubernetes/pkg/util/iptables/testing".FakeIPTables has no field or method Lines)
cmd/ip-masq-agent/ip-masq-agent_test.go:385:66: fipt6.Lines undefined (type *"k8s.io/kubernetes/pkg/util/iptables/testing".FakeIPTables has no field or method Lines)
?       k8s.io/ip-masq-agent/cmd/ip-masq-agent/testing/fakefs   [no test files]
?       k8s.io/ip-masq-agent/pkg/version        [no test files]
FAIL    k8s.io/ip-masq-agent/cmd/ip-masq-agent [build failed]
FAIL
make: *** [Makefile:349: test] Error 1

This seems to be introduced because of an upstream change in Kubernetes, where iptables/testing/fake.go got revamped: kubernetes/kubernetes#109844

cc @jingyuanliang

Is there a way to set a minimum log level?

Hello,

Just curious if there is any chance of disabling the informational logs (we get them every minute, and by default the Datadog Kubernetes agent treats them as errors).

Cheers
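
One possible mitigation, assuming the agent exposes the standard glog/klog logging flags (the repository's tests set logtostderr), is to raise the stderr threshold via the DaemonSet args. This is a hedged sketch, not a documented configuration:

        args:
        - --logtostderr=false
        - --stderrthreshold=ERROR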

Ignore node taints by default ( on GKE )

Hi,

Apologies if this isn't helpful, this issue is related to the configuration ip-masq-agent runs with under addon-manager in GKE. I'm having a tough time getting the GCP support guys to understand the problem, so I thought I'd post it here too, in the hope that the right people see it. (Case no 0016000000MB1EI/U-14664368 ) - If there's a better place to post this, please let me know.

My clusters run with a 172.16.x.x internal address range, and communicate with remote datacenters over a VPN link. If I don't have working ip-masq-agent (or equivalent), k8s cannot serve traffic to my remote datacenters.

The enforced config for ip-masq-agent on GKE is not to ignore node taints. This means that as soon as I use node taints, that node no longer runs ip-masq-agent, and may not be able to serve traffic correctly.

I can't think of a reason for this behavior - you either need ip-masq-agent running, or you don't. It should ignore node taints, like many other k8s critical services.

eg. Compare the tolerations for fluentd and ip-masq-agent on GKE

fluentd

  tolerations:
  - effect: NoSchedule
    key: node.alpha.kubernetes.io/ismaster
  - effect: NoExecute
    operator: Exists
  - effect: NoSchedule
    operator: Exists

ip-masq-agent

  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
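
A toleration block like the following (a sketch mirroring the fluentd example above) would let the DaemonSet schedule onto tainted nodes:

  tolerations:
  - effect: NoExecute
    operator: Exists
  - effect: NoSchedule
    operator: Exists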

thanks

James M

Update base image

The last three Kubernetes releases (1.9 - 1.11) use Debian Stretch as the base image, which contains iptables 1.6:

# docker run --rm gcr.io/google_containers/hyperkube:v1.8.15 iptables --version
iptables v1.4.21

# docker run --rm gcr.io/google_containers/hyperkube:v1.9.10 iptables --version
iptables v1.6.0

# docker run --rm gcr.io/google_containers/hyperkube:v1.10.6 iptables --version
iptables v1.6.0

# docker run --rm gcr.io/google_containers/hyperkube:v1.11.1 iptables --version
iptables v1.6.0

Latest ip-masq-agent image:

# docker run --rm --entrypoint=iptables gcr.io/google-containers/ip-masq-agent-amd64:v2.0.1 --version
iptables v1.4.21

iptables 1.4 and 1.6 use different locking mechanisms (flock() on /run/xtables.lock for 1.6, bind() on a unix socket for 1.4), so they cannot be mixed on one k8s node. See also: kubernetes/kubernetes#46103.

Please update base image to match current kube-proxy images.

Newer container image is missing on gcr.io

The docker image used by the Pod on the reference DaemonSet manifest is missing:
% docker pull gcr.io/k8s-staging-networking/ip-masq-agent:v2.7.0
Error response from daemon: manifest for gcr.io/k8s-staging-networking/ip-masq-agent:v2.7.0 not found: manifest unknown: Failed to fetch "v2.7.0" from request "/v2/k8s-staging-networking/ip-masq-agent/manifests/v2.7.0".

ipv6 option not honored

The enable-ipv6 feature did not work when added to the config map. When I viewed the container logs, it parsed all the ConfigMap contents except enable-ipv6. The only way I could get it to pass IPv6 traffic is to change cmd/ip-masq-agent/ip-masq-agent.go line 53 and explicitly set enable-ipv6 to true. This way it becomes the default.

Does ip-masq-agent intentionally crash on errors?

https://github.com/kubernetes-incubator/ip-masq-agent/blob/480a627/cmd/ip-masq-agent/ip-masq-agent.go#L117

It seems like ip-masq-agent is designed to exit non zero when any error occurs. For example without #22 ip-masq-agent often crashes on busy nodes with an obtuse iptables-restore error. With #22 it will crash if it can't get the iptables lock:

E0922 00:30:24.709933       1 ip-masq-agent.go:145] error syncing masquerade rules: failed to ensure that nat chain IP-MASQ-AGENT jumps to MASQUERADE: error checking rule: exit status 4: Another app is currently holding the xtables lock. Stopped waiting after 5s.
failed to ensure that nat chain IP-MASQ-AGENT jumps to MASQUERADE: error checking rule: exit status 4: Another app is currently holding the xtables lock. Stopped waiting after 5s.

Other tools such as kube-proxy seem to log about this and move on with their lives. Is ip-masq-agent designed to crash and restart; i.e. should we consider frequent ip-masq-agent container restarts normal behaviour? I'm wondering whether I should attempt to 'fix' ip-masq-agent or simply not alert when it restarts too frequently.

Release confusion

👋 I noticed that there's a 2.10.0 tag, but the latest GitHub release and images seem to be 2.9.x.

Proposal: Read a second ConfigMap for cloud provider usage

Problem:

AKS needs to continuously reconcile the NonMasqueradeCIDRs to stay in sync with all of the CIDRs in the VNET used by a cluster. Otherwise, internal traffic ends up getting improperly masqueraded.

However, AKS cannot reconcile the existing ConfigMap because it is also used for scenarios where custom ranges have been manually written, such as in the case of a peered VNET.

GKE also faces the same problem, forcing manual updating of this ConfigMap instead of being able to keep it continuously updated:

If you use ip-masq-agent configured with the nonMasqueradeCIDRs parameter, you must update the nonMasqueradeCIDRs to include all Pod CIDR ranges.

Proposal:

Add a bool flag --enable-cloud-config that, when true, extends the behavior of the syncConfig function to read in a second ConfigMap with the same data keys.

When this second ConfigMap is read, its nonMasqueradeCIDRs list is merged with the nonMasqueradeCIDRs from the first. Additional options such as masqLinkLocal could be ignored or OR'd with the first.

The second ConfigMap's purpose will be for cloud providers to reconcile and keep in-sync with virtual network CIDRs, while the first ConfigMap will remain as a way for custom ranges to be added.
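
A sketch of what the cloud-provider ConfigMap could look like under this proposal; the name, flag, and CIDR are hypothetical:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent-cloud   # hypothetical second ConfigMap, reconciled by the cloud provider when --enable-cloud-config is set
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
      - 10.240.0.0/16   # example VNET/VPC range kept in sync automatically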

--non-masquerade-cidr to disable SNAT but still recording in nf_conntrack?

I have set --non-masquerade-cidr to disable SNAT for a certain IP CIDR, and iptables generated the RETURN rules in the nat table as expected.

 iptables -t nat -nvL|grep  <ip cidr that disable SNAT>

    100     100 RETURN     all  --  *      *       <ip cidr that disable SNAT>       0.0.0.0/0            /* ip-masq-agent: local traffic is not subject to MASQUERADE */

But I can still see connection records in the nf_conntrack table:

conntrack -L -o extended | grep <ip cidr that disable SNAT>
...
...

Does that mean that even though the connections don't go through SNAT (they match the RETURN rules in the nat table), they can still be recorded in the nf_conntrack table?

ipv6 egress traffic not working

Hi,
Running a k8s cluster with Calico as the CNI. We have a use case where we need to connect to an IPv6 device from a pod configured with IPv4.

  • Added ip-masq-agent to masquerade daemonset

  • It created the iptables rules for masquerading on the host machine

  • Able to ping the ipv6 device from the container of the ip-masq-agent

  • Unable to reach the ipv6 device from the container of other workloads in the same cluster. Getting connect: Network is unreachable when running ping6

kubernetes logs required

In a security breach event, several containers could be running behind the primary IP of the node:

hostnetwork true
nodeport

Do we then have some way to check logs and see which container actually originated the traffic behind the node IP?

Multiple Connections with same source port

Hi,
I've some GKE clusters with ip-masq-agent, without a custom configuration, that translate connections to public IPs while keeping the source port of the pod connection.

In my context, where I open a lot of connections and have an implicit egress proxy that uses a REDIRECT iptables rule to be "transparent", this behaviour creates issues with connections that exit from the same node and have the same source pod port, because my iptables rule translates them to the same destination and destination port.

I'm trying to set up the egress proxy with a TPROXY iptables rule to prevent this issue, but I'd like to ask whether it is possible to randomize the source port of NAT'ed connections to prevent the problem definitively.

Thanks

Question: Does the daemon need to run continuosly?

If my cluster's POD CIDR does not change during the lifetime of the node, do I need to run this masq agent continuously?

As in, if I manually (or using shell scripts) configure the relevant iptables MASQUERADE rules, will that serve the purpose of this ip-masq-agent?

Basically, if the /etc/config/ip-masq-agent file on my node never changes, does ip-masq-agent need to be a daemon, or can it just be a job that runs once when the node starts for the first time?

Or is my understanding of this agent flawed?

the agent masquerades IPs unexpectedly: should it masquerade incoming traffic?

@roy-work described it here (Azure/AKS#2076) and I've just seen the same thing.

In a private GKE cluster (1.19.10-gke.1600) we'd set up a config to aggressively masquerade everything:

nonMasqueradeCIDRs:
masqLinkLocal: true

And then while digging into something I observed that traffic which should (by virtue of my internal TCP load balancer and externalTrafficPolicy: Local service) be logged by the NGINX ingress controller with the actual client source IPs was instead being logged with IPs from the pod subnet.

Since these clients are elsewhere in private CIDR ranges, the default list of non-masquerade CIDRs wouldn't work. Instead I modified my config like so:

nonMasqueradeCIDRs:
  # pod CIDR
  - 172.16.0.0/12

  # our node CIDR (including the ingress controller's internal TCP load balancer IP)
  - 10..../22

# back to default
masqLinkLocal: false

And as @roy-work described that fixed my problem and got NGINX the right IPs.

Expected

I thought that setting an empty nonMasqueradeCIDRs list would cause source IP masquerading for outbound connections (from my workloads), but leave inbound connections unmodified.

There's no pod-to-pod traffic in this cluster that would suffer. And with externalTrafficPolicy: Local, inbound connections shouldn't be pod-to-pod...

Actual

In fact it seems that nonMasqueradeCIDRs (and masqLinkLocal) are meant to be comprehensive to the point where inbound external traffic also has its source IPs modified!

A suggestion

Of course if this behavior is not by design this is a bug. But if this is intentional can we clarify the docs?

For example, where the user guide (https://kubernetes.io/docs/tasks/administer-cluster/ip-masq-agent/) says

nonMasqueradeCIDRs: A list of strings in CIDR notation that specify the non-masquerade ranges.

It would be nice to explain what a "masquerade range" is. Perhaps say this:

nonMasqueradeCIDRs: A list of strings in CIDR notation that specify the non-masquerade ranges. These indicate destination IPs for which the agent won't masquerade source IPs.

At the moment I don't know where the docs are defined or I'd submit a pull req. now...

Needs a better solution to " --enable-ipv6"

Why isn't there a flag for this in the ConfigMap, or why isn't IPv6 enabled automatically when it is detected?

apiVersion: v1
data:
  config: |
    masqLinkLocal: false
    masqLinkLocalIPv6: true
    enableIPv6 : true  <--- something like this is missing
    resyncInterval: 5s
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system

You always need to change the DaemonSet manually by adding:

      - args:
        - --masq-chain=IP-MASQ
        - --enable-ipv6

Everywhere I look around the repos, the repo is cloned just to add these arguments.

The alternative, patching it in afterwards, is a different kind of workaround.

support SNAT to custom IP

Hi,

I have a use case around rtp traffic where packets are either received or sent by a k8s pod. Incoming traffic lands on a load-balancer that forwards all ports to a single k8s node. Outbound internet traffic is currently masqueraded. This leads to an asymmetric traffic flow where the in- and outbound IPs are not the same. This is causing issues as asymmetric traffic is not accepted by some SIP trunk providers.

A more general problem statement would be that someone needs a static IP address for outbound traffic from k8s workloads, maybe because they need to set firewalls or whitelists on the other end.

Since we already have an LB that forwards all traffic to the node, a simple fix is to replace the iptables MASQUERADE nat rule with an SNAT rule to the load-balancer's IP. That way, all egress traffic uses the load-balancers IP as the source, and because of the load-balancer, return traffic is still routed to the correct place.
This is trivial in the case of a backend with a single node, but a similar pattern can also be used with multiple nodes and multiple LBs.
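
For reference, the change amounts to swapping the nat target on the catch-all rule at the end of the chain; the load-balancer IP below is illustrative:

# today: MASQUERADE picks the source IP from the outgoing interface
iptables -t nat -A IP-MASQ-AGENT -m addrtype ! --dst-type LOCAL -j MASQUERADE
# proposed: always rewrite the source to the load-balancer's IP
iptables -t nat -A IP-MASQ-AGENT -m addrtype ! --dst-type LOCAL -j SNAT --to-source 203.0.113.10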

I wrote and tested a small patch that adds a new optional parameter that can be set to an SNAT ip address.

Curious to hear what others think about this approach, and whether there is interest in including it in the agent.

Add support for not-masquerading all RFC reserved ranges, not just RFC1918

Context: GCE considers all RFC reserved IPv4 ranges to be internal to a VPC. In order to correctly route to any possible endpoint inside a VPC, these ranges should not be masqueraded.

This feature request is to add an option for ip-masq-agent to use the entire RFC reserved range as the default nonMasqueradeCidrs value when the configMap does not specify particular values.

IP-MASQ-AGENT rule location in POSTROUTING chain is wrong - openshift

The agent puts the IP-MASQ-AGENT rule at the end of the POSTROUTING chain.
In OpenShift it has no effect, because the OPENSHIFT-MASQUERADE rule is already at the beginning of the chain and matches first.

This is how the chain looks in OpenShift:

Chain POSTROUTING (policy ACCEPT 42 packets, 4260 bytes)
    pkts      bytes target     prot opt in     out     source               destination         
   34337  3071902 OPENSHIFT-MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* rules for masquerading OpenShift traffic */
   32964  2975002 KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
       0        0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           
       0        0 MASQUERADE  all  --  *      tun0    127.0.0.0/8          0.0.0.0/0            /* SNAT for localhost access to hostports */
    9545   931580 IP-MASQ-AGENT  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* ip-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom IP-MASQ-

updateStrategy for clusters running since pre v1.5.

Hi,

I have no idea if this is the right place for this kind of issue, nor if this is the place for a fix, but here goes;

We are seeing alerts according to this rule for ip-masq-agent on two of our (older) clusters, but not on our newer clusters.

We noticed that these two older clusters (currently running v1.22.12-gke.500) have .spec.updateStrategy.type set to OnDelete while the other clusters have RollingUpdate.

According to the docs, OnDelete was the default pre k8s 1.5. And I think our two older clusters may have been around since then, which would also explain why these clusters still have this old default setting.

The ip-masq-agent.yaml manifest in this repo doesn't specify any updateStrategy,
and I figured this may be the source of the issue; others with long-running clusters may face the same problem.
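
For clusters in this state, explicitly setting the strategy on the DaemonSet should bring them in line with newer defaults, e.g.:

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1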

I don't want to poke in the kube-system namespace manually, but this perhaps would be the occasion where it would be justified.

Error syncing masquerade rules

I've recently observed my ip-masq-agent containers restarting fairly frequently due to errors syncing masquerade rules. It's hard to determine exactly what is wrong with the rules given the limited log output. I've observed this in both v2.0.1 and v2.1.1 of ip-masq-agent.

I0914 22:40:52.590226       1 ip-masq-agent.go:180] config file found at "/etc/config/ip-masq-agent"
I0914 22:40:52.591164       1 ip-masq-agent.go:167] using config: {"nonMasqueradeCIDRs":["172.16.0.0/13","192.168.0.0/16"],"masqLinkLocal":false,"resyncInterval":60000000000}
I0914 22:40:52.591195       1 iptables.go:361] running iptables -N [IP-MASQ-AGENT -t nat]
I0914 22:40:52.599954       1 iptables.go:361] running iptables -C [POSTROUTING -t nat -m comment --comment ip-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom IP-MASQ-AGENT chain -m addrtype ! --dst-type LOCAL -j IP-MASQ-AGENT]
I0914 22:40:52.608243       1 iptables.go:338] running iptables-restore [--noflush]
E0914 22:40:52.622691       1 ip-masq-agent.go:145] error syncing masquerade rules: exit status 1 (iptables-restore: line 7 failed
)
exit status 1 (iptables-restore: line 7 failed
)

I wonder if perhaps I'm being hit by iptables locking issues as described in kubernetes/kubernetes#44895? I notice ip-masq-agent is pinned to a version of util/iptables from before that PR was merged, and thus (I believe) won't use any iptables locking.

Error: Unable to redirect iptables binaries in v2.9.2

What

I've tried to use the latest v2.9.2 image, but I get an error when the agent tries to execute iptables commands.

v.2.9.2 errors:

$ docker run -it --entrypoint "/usr/sbin/iptables" --rm registry.k8s.io/networking/ip-masq-agent:v2.9.2 --version
Unable to redirect iptables binaries. (Are you running in an unprivileged pod?)
ERROR: No valid subcommand given.
Valid subcommands:
[...]

Also tried with --privileged, same result.

v2.9.0 works:

$ docker run -it --entrypoint "/usr/sbin/iptables" --rm registry.k8s.io/networking/ip-masq-agent:v2.9.0 --version
iptables v1.8.7 (nf_tables)

Question regarding masquerading all, lets say SSH only, egress traffic to a lot of destinations as sourced from limited number of fixed IPs

I apologize if this is not the correct place to ask this question, but I didn't see directions on how or where to ask questions; maybe I missed them. Also, I have a limited understanding of iptables, routing, and masquerading.

Like a lot of people on the internet, I have a need to change the source IPs on our outbound SSH connections because the remote SSHD has restrictions by IP address.

Assuming our cluster has, let's say, 3 nodes with a fixed IP (Nodes A, B, C), but there are more nodes (let's say 20) in the cluster, is there a way for all the outgoing SSH connections from all the pods (not on Nodes A, B, C) to have the fixed IP from Nodes A, B, or C?

I am not sure if I am asking this question correctly, but I am hoping that someone would say "Oh yeah, that's easy! You forward all your port 22 traffic to Nodes A, B, or C round-robin (or pick one node), and then set up some kind of MASQ rule on Nodes A, B, and C to forward it to the remote destination from the ethernet device with the fixed IP." or something ...

I am assuming that using a limited number of fixed public IPs might be better than adding a fixed public IP to every node in the cluster. Also, I am thinking that this might be better than using SSH proxy hops as each SSHD connection would use up a decent chunk of memory/resources on the proxy hop. We make a lot of outbound SSH calls.

Any advice would be greatly appreciated and thank you in advance for your help!
