Comments (9)
I've gotten this to work by adding a rule like this for every active service endpoint:
iptables -t nat -A POSTROUTING -s $ENDPOINT_IP -d $ENDPOINT_IP -m ipvs --vaddr $SVC_IP -j SNAT --to-source $SVC_IP
The result is that loopback traffic back to the pod is seen as coming from the service IP -- a possible drawback? Also, assuming lots and lots of services and/or endpoints, would the number of rules be too many?
If this looks reasonable I will submit a patch to implement it.
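A minimal sketch of how the per-endpoint rule above could be generated for every local endpoint of a service. The function names and the example addresses are hypothetical, and the functions only print the iptables commands rather than running them, so the rule count can be inspected first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the SNAT rule for one endpoint.
# The rule itself is the one quoted above; only the wrapper is new.
hairpin_snat_rule() {
  local endpoint_ip="$1" svc_ip="$2"
  printf 'iptables -t nat -A POSTROUTING -s %s -d %s -m ipvs --vaddr %s -j SNAT --to-source %s\n' \
    "$endpoint_ip" "$endpoint_ip" "$svc_ip" "$svc_ip"
}

# Hypothetical helper: print one rule per endpoint of a service.
# Usage: hairpin_snat_rules SVC_IP ENDPOINT_IP...
hairpin_snat_rules() {
  local svc_ip="$1"
  shift
  local endpoint_ip
  for endpoint_ip in "$@"; do
    hairpin_snat_rule "$endpoint_ip" "$svc_ip"
  done
}

# Example with made-up addresses: one service IP, two local endpoints.
hairpin_snat_rules 10.96.0.10 10.244.1.5 10.244.1.6
```

The rule count grows as (services x local endpoints), which is the scaling concern raised above.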
from kube-router.
@bzub this seems like a reasonable solution.
"The result is that loopback traffic back to the pod is seen as coming from the service IP -- a possible drawback?"
Yeah, it has to be either the node IP or the service IP. Either one will cause problems with network policies that permit traffic only from a certain set of pod IPs. But there is not much that can be done about it.
"Assuming lots and lots of services and/or endpoints, would the number of rules be too many?"
We will only be adding rules for the pods running on the node. Still, it can lead to a lot of rules.
We might need to check how kube-proxy addresses this problem. Please take a look at kube-proxy. If there is no better alternative, please go ahead with the PR.
I believe kube-proxy and kubelet delegate this to the CNI plugin when CNI is used. I was able to get the "bridge" CNI plugin to set the sysctl hairpin_mode to 1 for all the veth devices, but that wasn't enough. I will look into what Kubernetes does in "kubenet" mode, which is the only option other than CNI. I believe that sets hairpin-mode to "promiscuous bridge", but when I tested setting the kube-bridge interface to promiscuous mode it broke things. So I'll have to look further.
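For reference, a rough sketch of what enabling hairpin mode on every port of a bridge looks like when done by hand through sysfs. The function name is hypothetical and the optional second argument (an alternate sysfs root) exists only so the function can be tried against a scratch directory. As noted above, this setting alone was not enough here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set hairpin_mode=1 on every port attached to a bridge.
# Usage: set_hairpin BRIDGE [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
set_hairpin() {
  local bridge="$1" sysfs="${2:-/sys}"
  local port_path port mode_file
  # Each entry under <bridge>/brif/ is named after an attached port (e.g. a veth).
  for port_path in "$sysfs/class/net/$bridge/brif"/*; do
    [ -e "$port_path" ] || continue   # glob matched nothing
    port="$(basename "$port_path")"
    mode_file="$sysfs/class/net/$port/brport/hairpin_mode"
    if [ -w "$mode_file" ]; then
      echo 1 > "$mode_file"
    fi
  done
  return 0
}

# On a real node this would need root, e.g.:
#   set_hairpin kube-bridge
```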
@bzub yes, it seems it's offloaded to the CNI plugin. I'm not sure about the merit of this use case.
Please go ahead with your current fix, but add a '--hairpin-mode' flag like kubelet has (perhaps for kubenet). That way anyone who wants this can enable the flag and accept the performance hit, if there is any.
OK, so the SNAT rule should only affect hairpin traffic going back to the endpoint that used its own service, so network policy issues would only affect that communication. I'm thinking I'll make a new flag like kube-router --hairpin-mode=true to enable the rule for all endpoints, and also allow the user to set a service annotation like service-proxy.kube-router.io/hairpin-mode=true if they only want this enabled on a subset of services.
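The decision described here can be sketched as a small predicate: hairpin handling applies to a service if the global flag is on, or if that service carries the annotation. The function name and argument order are hypothetical; only the flag and annotation names come from the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if hairpin handling applies to a service.
# $1: value of the global --hairpin-mode flag ("true" or "false")
# $2: value of the service-proxy.kube-router.io/hairpin-mode annotation, or empty
hairpin_enabled() {
  local global_flag="$1" annotation="${2:-}"
  [ "$global_flag" = "true" ] || [ "$annotation" = "true" ]
}

# Example: flag off, annotation set on this one service.
if hairpin_enabled false true; then
  echo "hairpin rule wanted for this service"
fi
```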
Yes, please keep the flag 'kube-router --hairpin-mode=true'. As I mentioned in the comment above, kubelet has this as well.
Update: I am pursuing a solution I've tested that manipulates the network stack namespace within the pods that need hairpin. Being able to manipulate IP interfaces, routes, and even IPVS within pods is extremely powerful and could be a way to provide other features in kube-router. The biggest drawback is that we must access /proc on the host, which could be a security issue if the kube-router pod is compromised. There are ways to minimize what accesses /proc, for example running a program or script dedicated to exposing only what we need from /proc and having kube-router consume from that.
For this issue (hairpin) the result will be that a pod's traffic to its own service/port will never leave the pod, and the source IP will be preserved. This can be enhanced later to load-balance between all service endpoints for hairpin traffic.
Basic command that will be implemented per hairpin-enabled pod:
nsenter --net=/host/proc/$PID_OF_PAUSE_CONTAINER/ns/net iptables -t nat -A OUTPUT -d $SERVICE_IP -j REDIRECT
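A small sketch of how that per-pod command could be assembled. The function name is hypothetical, and it only prints the command instead of running it. It uses -A to append the rule, which is an assumption about the intended operation; the /host/proc path follows the command above and assumes the host's /proc is mounted there inside the kube-router pod.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the command that adds the REDIRECT
# rule inside a pod's network namespace.
# $1: PID of the pod's pause container, as seen under the host's /proc
# $2: service IP whose traffic should stay inside the pod
pod_redirect_cmd() {
  local pause_pid="$1" svc_ip="$2"
  printf 'nsenter --net=/host/proc/%s/ns/net iptables -t nat -A OUTPUT -d %s -j REDIRECT\n' \
    "$pause_pid" "$svc_ip"
}

# Example with a made-up PID and service IP.
pod_redirect_cmd 12345 10.96.0.10
```

Printing the command first makes it easy to review exactly which namespace and service IP would be touched before anything is changed inside a pod.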
@bzub, "manipulates the network stack namespace within the pods that need hairpin.", how do we know pods that need hairpin?
"This can be enhanced later to load-balance between all service endpoints for hairpin traffic."
I am not clear on this either. Please elaborate.
@bzub, "manipulates the network stack namespace within the pods that need hairpin.", how do we know pods that need hairpin?
We can still use a --hairpin-mode=true flag and then all pods that are service endpoints will get this iptables rule inside the pod. Or if the service has the annotation, same result for only those endpoints.
"This can be enhanced later to load-balance between all service endpoints for hairpin traffic."
I figured the best goal would be for hairpin traffic to be load-balanced the same as outside client traffic. With my current solution I have only gotten hairpin traffic to stay inside the pod, so it doesn't go to other endpoints. In the future it should balance between itself and other endpoints.
I will most likely get this implemented today so we can try it out.