Comments (12)

aauren commented on June 27, 2024

kube-router only touches its own ipsets. You can see this from the delete logic here: https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/netpol/network_policy_controller.go#L709-L714

The only way I could see this happening is if you got incredibly lucky and somehow consistently managed to add your ipset in the microseconds between when kube-router saves the ipsets and when it restores them (see the logic above). But for this to happen consistently seems very, very unlikely unless something else is also going wrong on the node and causing a severe performance regression.
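
To make that window concrete, here is a minimal Go sketch of the save/mutate/restore cycle; the function names are mine for illustration, not kube-router's actual API:

```go
// Minimal sketch (assumed names, not kube-router's real API) of the
// save/mutate/restore cycle described above. Anything an outside process
// adds to an ipset between saveAll and restoreAll gets overwritten, which
// is why hitting that window consistently would take remarkable luck.
package main

import (
	"bytes"
	"os/exec"
)

// saveAll snapshots every ipset on the node via `ipset save`.
func saveAll() ([]byte, error) {
	return exec.Command("ipset", "save").Output()
}

// restoreAll writes a previously captured snapshot back; -exist suppresses
// errors for sets that already exist. (The real controller rebuilds sets
// atomically via a temporary set and `swap`, as the logs below show.)
func restoreAll(snapshot []byte) error {
	cmd := exec.Command("ipset", "-exist", "restore")
	cmd.Stdin = bytes.NewReader(snapshot)
	return cmd.Run()
}

func main() {
	snapshot, err := saveAll() // window opens: node state captured
	if err != nil {
		panic(err)
	}
	// ... kube-router mutates only its own sets within the snapshot ...
	if err := restoreAll(snapshot); err != nil { // window closes
		panic(err)
	}
}
```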

Additionally, I tested this on my own cluster just now by creating an ipset and then forcing a network policy sync and seeing that the ipset survived.

Maybe this is something that k3s is doing? If you want to be extra sure, you can increase the logging verbosity to level 3 and see the entire output of the ipset restore that kube-router performs (https://github.com/cloudnativelabs/kube-router/blob/master/pkg/utils/ipset.go#L595).

shouya commented on June 27, 2024

I'd like to increase the log verbosity and inspect the full ipset restore string, but I couldn't find a way to inject -v3 into kube-router's launch options via the k3s agent. Is there an environment variable or other means by which I can modify kube-router's launch parameters?

aauren commented on June 27, 2024

To be honest, I'm not really familiar with k3s's options as I don't use it myself.

shouya commented on June 27, 2024

I just dialed up the verbosity of the k3s agent and found that works too. Here's the captured ipset restore string:

May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.534687 1291507 ipset.go:564] ipset restore looks like: create TMP-XRYGEMNBG5CVZI2Y hash:net family inet hashsize 1024 maxelem 65536 bucketsize 12
May 24 23:12:01 scout k3s[1291507]: flush TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: add TMP-XRYGEMNBG5CVZI2Y 1.2.3.4
May 24 23:12:01 scout k3s[1291507]: create DIRECT_SERVER hash:net family inet hashsize 1024 maxelem 65536 bucketsize 12
May 24 23:12:01 scout k3s[1291507]: swap TMP-XRYGEMNBG5CVZI2Y DIRECT_SERVER
May 24 23:12:01 scout k3s[1291507]: flush TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: add TMP-XRYGEMNBG5CVZI2Y 1.2.3.4
May 24 23:12:01 scout k3s[1291507]: create PROXIED_SERVER_12345 hash:net family inet hashsize 1024 maxelem 65536 bucketsize 12
May 24 23:12:01 scout k3s[1291507]: swap TMP-XRYGEMNBG5CVZI2Y PROXIED_SERVER_12345
May 24 23:12:01 scout k3s[1291507]: flush TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: destroy TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.561481 1291507 policy.go:177] Restoring IPv4 ipset took 26.9803ms

I see it uses a temporary ipset TMP-XRYGEMNBG5CVZI2Y to reset the contents of my two ipsets to the state they were in before the netpol controller launched. Neither of those set names (DIRECT_SERVER and PROXIED_SERVER_12345) is owned by kube-router as specified here. As a result, the entry 5.6.7.8 that I added later is gone from these ipsets.

shouya commented on June 27, 2024

The k3s netpol code looks suspicious to me in its Run function. Can you confirm whether the call to set.Save() prior to controller initialization could be causing this?

aauren commented on June 27, 2024

I'm glad that you were able to figure out how to increase the logging verbosity for kube-router within k3s, that is definitely helpful.

So yes, I can confirm that it is expected for you to see your ipset in the output of kube-router's ipset restore function. This is because it reads all ipsets during the save and then modifies only its own.
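
For illustration, here is a rough sketch of what a save-parsing step does, with names of my own choosing rather than kube-router's actual ParseIPSetSave signature: every set in the `ipset save` output gets grouped, owned or not, which is why DIRECT_SERVER and PROXIED_SERVER_12345 appear in the snapshot at all.

```go
// Sketch of parsing `ipset save` output (assumed helper, not kube-router's
// actual ParseIPSetSave): every set on the node ends up in the map, whether
// kube-router owns it or not.
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseSave maps each set name to the save lines (create/add) that define it.
func parseSave(save string) map[string][]string {
	sets := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(save))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		// Both "create NAME ..." and "add NAME ..." name the set second.
		if fields[0] == "create" || fields[0] == "add" {
			sets[fields[1]] = append(sets[fields[1]], sc.Text())
		}
	}
	return sets
}

func main() {
	save := "create DIRECT_SERVER hash:net family inet\nadd DIRECT_SERVER 1.2.3.4\n"
	for name, lines := range parseSave(save) {
		fmt.Printf("%s: %d line(s)\n", name, len(lines))
	}
}
```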

What is unexpected is that it would revert a set to a previous value.

Following through the logic of how kube-router encounters and mutates ipsets in the network policy controller, I can't see any obvious place where it would be reverting user ipset information.

One thing I did think of while reviewing the code: if the code block at https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/netpol/network_policy_controller.go#L229-L243 were excluded from your build of k3s, it might cause issues for you, because saved data would have been added into the ipset object during one run and then possibly restored in the next at https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/netpol/policy.go#L149-L151.

I can see that upstream they have this block in their code because they link against kube-router v2.1.2, which has this fix in it. But maybe the version of k3s that you're running uses an older version of kube-router?

shouya commented on June 27, 2024

Thank you for the detailed explanation. I think the initial call to ipset.Save() originates from k3s code, not from inside kube-router. This logic on its own could explain how the ipset state captured at startup gets restored over and over.

Another confusing thing from the execsnoop reports is that the last step does seem to involve an ipset save execution from somewhere, which seems absurd given that we found no place that calls ipset.Save() within either k3s's or kube-router's normal operating logic.

Let me include the full log, which contains hints about the calling lines in the source files:

May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.435020 1291507 health_controller.go:91] Received heartbeat from MC
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.446371 1291507 network_policy_controller.go:201] Requesting periodic sync of iptables to reflect network policies
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.446463 1291507 network_policy_controller.go:216] Full sync request queue was empty so a full sync request was successfully sent
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.446504 1291507 network_policy_controller.go:193] Received request for a full sync, processing
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.446542 1291507 network_policy_controller.go:241] Starting sync of iptables with version: 1716592321446527320
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.447598 1291507 health_controller.go:91] Received heartbeat from NPC
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.451486 1291507 network_policy_controller.go:449] Ensuring jump to chain KUBE-ROUTER-FORWARD from FORWARD at position 1
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.465615 1291507 network_policy_controller.go:449] Ensuring jump to chain KUBE-ROUTER-OUTPUT from OUTPUT at position 1
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.478754 1291507 network_policy_controller.go:449] Ensuring jump to chain KUBE-ROUTER-INPUT from INPUT at position 1
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.486890 1291507 network_policy_controller.go:463] Allow traffic to ingress towards Cluster IP Range: 10.43.0.0/16 for family: IPv4
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.496057 1291507 network_policy_controller.go:481] Allow TCP traffic to ingress towards node port range: 30000:32767 for family: IPv4
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.504866 1291507 network_policy_controller.go:494] Allow UDP traffic to ingress towards node port range: 30000:32767 for family: IPv4
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.534369 1291507 network_policy_controller.go:270] Saving IPv4 iptables rules took 8.56443ms
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.534433 1291507 policy.go:83] Attempting to attain ipset mutex lock
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.534453 1291507 policy.go:85] Attained ipset mutex lock, continuing...
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.534687 1291507 ipset.go:564] ipset restore looks like: create TMP-XRYGEMNBG5CVZI2Y hash:net family inet hashsize 1024 maxelem 65536 bucketsize 12
May 24 23:12:01 scout k3s[1291507]: flush TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: add TMP-XRYGEMNBG5CVZI2Y 1.2.3.4
May 24 23:12:01 scout k3s[1291507]: create DIRECT_SERVER hash:net family inet hashsize 1024 maxelem 65536 bucketsize 12
May 24 23:12:01 scout k3s[1291507]: swap TMP-XRYGEMNBG5CVZI2Y DIRECT_SERVER
May 24 23:12:01 scout k3s[1291507]: flush TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: add TMP-XRYGEMNBG5CVZI2Y 1.2.3.4
May 24 23:12:01 scout k3s[1291507]: create PROXIED_SERVER_12345 hash:net family inet hashsize 1024 maxelem 65536 bucketsize 12
May 24 23:12:01 scout k3s[1291507]: swap TMP-XRYGEMNBG5CVZI2Y PROXIED_SERVER_12345
May 24 23:12:01 scout k3s[1291507]: flush TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: destroy TMP-XRYGEMNBG5CVZI2Y
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.561481 1291507 policy.go:177] Restoring IPv4 ipset took 26.9803ms
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.561544 1291507 policy.go:184] Iptables chains in the filter table are synchronized with the network policies.
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.561564 1291507 policy.go:88] Returned ipset mutex lock
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.561579 1291507 policy.go:80] Syncing network policy chains took 27.151674ms
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.593673 1291507 network_policy_controller.go:310] Restoring IPv4 iptables rules took 15.060754ms
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.593736 1291507 network_policy_controller.go:699] Attempting to attain ipset mutex lock
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.593767 1291507 network_policy_controller.go:701] Attained ipset mutex lock, continuing...
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.597505 1291507 network_policy_controller.go:704] Returned ipset mutex lock
May 24 23:12:01 scout k3s[1291507]: I0524 23:12:01.597569 1291507 network_policy_controller.go:238] sync iptables took 151.020789ms

I'm inclined to believe the k3s version I'm running doesn't actually link against kube-router v2.1: my log shows "ipset restore looks like: ...", which matches the log format in v2.0, whereas in v2.1 the format was updated to include "ipv6?". Would v2.0 explain what we're seeing here?

aauren commented on June 27, 2024

I haven't gone in-depth on what changed between versions for this one, but I believe that v2.0 could explain it. Maybe try building k3s from HEAD and see if the problem goes away?

shouya commented on June 27, 2024

If my understanding is correct, the problem won't go away even if I rebuild k3s from HEAD, because the initial call to ipset.Save() here on startup will still save the ipset state at that time. Any subsequent ipset.Restore() is then going to essentially revert any external changes made after startup.
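
As a minimal sketch of this failure mode, using hypothetical types that stand in for the real ipset handler: a snapshot taken once at startup and replayed on every sync silently drops anything added in between.

```go
// Hypothetical stand-in for the real ipset handler, illustrating why a
// startup snapshot plus periodic restores reverts external changes.
package main

import "fmt"

type ipsetHandler struct {
	saved map[string][]string // snapshot captured by Save
}

// Save snapshots the current contents of every set on the node.
func (h *ipsetHandler) Save(node map[string][]string) {
	h.saved = make(map[string][]string, len(node))
	for name, entries := range node {
		h.saved[name] = append([]string(nil), entries...)
	}
}

// Restore replays the snapshot, clobbering anything added since Save.
func (h *ipsetHandler) Restore(node map[string][]string) {
	for name, entries := range h.saved {
		node[name] = append([]string(nil), entries...)
	}
}

func main() {
	node := map[string][]string{"DIRECT_SERVER": {"1.2.3.4"}}
	h := &ipsetHandler{}

	h.Save(node) // k3s startup: snapshot taken once

	// External change made after startup:
	node["DIRECT_SERVER"] = append(node["DIRECT_SERVER"], "5.6.7.8")

	h.Restore(node)                    // periodic sync replays the snapshot
	fmt.Println(node["DIRECT_SERVER"]) // [1.2.3.4] -- 5.6.7.8 is gone
}
```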

To stop ipset.Restore() from interfering with sets not created by kube-router, I'm hoping for one of two changes:

  • First, we can eliminate all calls to ipset.Save(), both at startup and in the periodic sync subroutine. This is what we have been hunting for so far. If it's confirmed that there is no place ipset.Save() can be invoked during a sync, I can file a bug report on k3s so the startup call can also be removed.
  • Alternatively, we can add a filter in ipset.Save() so that it only handles sets related to kube-router, probably in the ParseIPSetSave function (see the sketch after this list).

Either option would be sufficient to solve the problem I encountered. I'm personally inclined toward the second, because it seems logically more correct for kube-router to save only what it creates.
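
Here is a sketch of what the second option could look like; the prefix list is a guess of mine and would need to match kube-router's actual naming constants. The idea is to filter the parsed snapshot down to kube-router-owned sets before it is stored, so a later Restore can never touch foreign sets like DIRECT_SERVER.

```go
// Sketch of filtering a parsed ipset snapshot down to kube-router's own
// sets. The prefixes are illustrative; the real ownership check would use
// the constants kube-router names its netpol ipsets with.
package main

import (
	"fmt"
	"strings"
)

var kubeRouterPrefixes = []string{"KUBE-SRC-", "KUBE-DST-", "KUBE-NWPLCY-"}

func ownedByKubeRouter(name string) bool {
	for _, p := range kubeRouterPrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

// filterOwnedSets keeps only kube-router's own sets in a parsed snapshot,
// so that Restore never writes to sets created by other tools.
func filterOwnedSets(sets map[string][]string) map[string][]string {
	owned := make(map[string][]string)
	for name, entries := range sets {
		if ownedByKubeRouter(name) {
			owned[name] = entries
		}
	}
	return owned
}

func main() {
	parsed := map[string][]string{
		"DIRECT_SERVER":   {"1.2.3.4"},
		"KUBE-SRC-ABCDEF": {"10.0.0.1"},
	}
	fmt.Println(filterOwnedSets(parsed)) // only KUBE-SRC-ABCDEF survives
}
```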

aauren commented on June 27, 2024

I don't believe that first save does anything. Saving the ipset only puts info in the struct, but that struct doesn't get used again. I'm nearly positive that the problem you're experiencing is due to an old version of kube-router being used.

shouya commented on June 27, 2024

Good. Let me try upgrading my k3s and report back to you.

shouya commented on June 27, 2024

I tried upgrading to the latest k3s version (v1.30.0+k3s1) and still see the same problem. The ipset restore log line is also still in the old format. Upon closer inspection, I found that their go.sum specifies their own fork of kube-router. Let me file a bug report with the k3s developers. Thank you for your time helping me look into this issue.
