pmtud's Introduction

Path MTU daemon

With ECMP enabled, ICMP messages are mostly routed to the wrong server. To fix that, let's broadcast the ICMP messages that we think are worth it to every machine in the colo.

$ ./pmtud --help

Usage:

    pmtud [options]

Path MTU Daemon captures and broadcasts ICMP messages related to
MTU detection. It listens on an interface, waiting for ICMP messages
(IPv4 type 3 code 4 or IPv6 type 2 code 0) and it forwards them
verbatim to the broadcast ethernet address.

Options:

  --iface              Network interface to listen on
  --src-rate           Pps limit from single source (default=1.0 pps)
  --iface-rate         Pps limit to send on a single interface (default=10.0 pps)
  --verbose            Print forwarded packets on screen
  --dry-run            Don't inject packets, just dry run
  --cpu                Pin to particular cpu
  --ports              Forward only ICMP packets with payload
                       containing L4 source port on this list
                       (comma separated)
  --help               Print this message

Example:

    pmtud --iface=eth2 --src-rate=1.0 --iface-rate=10.0
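
The --src-rate and --iface-rate options are packets-per-second limits. One common way to enforce such a limit is a token bucket; the following is a minimal sketch of that technique, not code taken from pmtud. The names token_bucket, bucket_init and bucket_allow, and the burst parameter, are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical token bucket; pmtud's own limiter may work differently. */
struct token_bucket {
    double rate;    /* tokens (packets) added per second */
    double burst;   /* maximum number of tokens the bucket holds */
    double tokens;  /* tokens currently available */
    double last;    /* time of the last refill, in seconds */
};

void bucket_init(struct token_bucket *b, double rate, double burst, double now)
{
    b->rate = rate;
    b->burst = burst;
    b->tokens = burst;  /* start full so the first packet passes */
    b->last = now;
}

/* Returns true if one packet may be sent at time `now` (seconds). */
bool bucket_allow(struct token_bucket *b, double now)
{
    double elapsed = now - b->last;

    if (elapsed > 0) {
        /* Refill in proportion to elapsed time, capped at the burst size. */
        b->tokens += elapsed * b->rate;
        if (b->tokens > b->burst)
            b->tokens = b->burst;
        b->last = now;
    }
    if (b->tokens >= 1.0) {
        b->tokens -= 1.0;
        return true;
    }
    return false;
}
```

With this sketch, one bucket per source address at rate 1.0 would model --src-rate=1.0, and a single bucket at rate 10.0 would model --iface-rate=10.0.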

Once again, it listens waiting for packets matching:

((icmp and icmp[0] == 3 and icmp[1] == 4) or
  (icmp6 and ip6[40+0] == 2 and ip6[40+1] == 0)) and
 (ether dst not ff:ff:ff:ff:ff:ff)

The packets must also have an appropriate length. Matching packets are forwarded to the Ethernet broadcast address ff:ff:ff:ff:ff:ff.
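
For readers who prefer code to filter syntax, the same match can be sketched in C over a raw Ethernet frame. This is an illustration only, assuming untagged Ethernet (no VLAN header) and, like the ip6[40+0] offsets in the filter, no IPv6 extension headers. The function name is_pmtu_icmp is hypothetical, and the sketch checks only the minimum length needed to read the fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ETH_HDR_LEN = 14, IPV4_MIN_HDR_LEN = 20, IPV6_HDR_LEN = 40 };

/* Returns 1 if the frame matches the filter above, 0 otherwise. */
int is_pmtu_icmp(const uint8_t *frame, size_t len)
{
    static const uint8_t bcast[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

    if (len < ETH_HDR_LEN)
        return 0;
    /* ether dst not ff:ff:ff:ff:ff:ff: skip frames that are already broadcast */
    if (memcmp(frame, bcast, sizeof(bcast)) == 0)
        return 0;

    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    const uint8_t *ip = frame + ETH_HDR_LEN;
    size_t iplen = len - ETH_HDR_LEN;

    if (ethertype == 0x0800) {                      /* IPv4 */
        if (iplen < IPV4_MIN_HDR_LEN || (ip[0] >> 4) != 4)
            return 0;
        size_t ihl = (size_t)(ip[0] & 0x0f) * 4;    /* header length in bytes */
        if (ihl < IPV4_MIN_HDR_LEN || iplen < ihl + 2)
            return 0;
        if (ip[9] != 1)                             /* protocol 1 = ICMP */
            return 0;
        if ((((ip[6] & 0x1f) << 8) | ip[7]) != 0)   /* not the first fragment */
            return 0;
        /* icmp[0] == 3 and icmp[1] == 4: fragmentation needed */
        return ip[ihl] == 3 && ip[ihl + 1] == 4;
    }
    if (ethertype == 0x86dd) {                      /* IPv6 */
        if (iplen < IPV6_HDR_LEN + 2)
            return 0;
        if (ip[6] != 58)                            /* next header 58 = ICMPv6 */
            return 0;
        /* ip6[40+0] == 2 and ip6[40+1] == 0: packet too big */
        return ip[IPV6_HDR_LEN] == 2 && ip[IPV6_HDR_LEN + 1] == 0;
    }
    return 0;
}
```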

To debug use tcpdump:

sudo tcpdump -s0 -e -ni eth0 '((icmp and icmp[0] == 3 and icmp[1] == 4) or
                               (icmp6 and ip6[40+0] == 2 and ip6[40+1] == 0))'

To build type:

git submodule update --init --recursive
make

To test run it in dry-run and verbose mode:

sudo ./pmtud --iface=eth0 --dry-run -v -v -v

If you want to use NFLOG interface:

iptables -I INPUT -i lo -p icmp -m icmp --icmp-type 3/4 -j NFLOG --nflog-group 33
ip6tables -I INPUT -i lo -p icmpv6 -m icmpv6 --icmpv6-type 2/0 -j NFLOG --nflog-group 33

You can add -m pkttype ! --pkt-type broadcast to be even more specific. Then to use the NFLOG api run:

sudo ./pmtud --iface=eth0 --dry-run -v -v -v --nflog 33

This will cause pmtud to listen for packets from NFLOG and use eth0 to broadcast them if necessary. Debug by listing this /proc file:

cat /proc/net/netfilter/nfnetlink_log
33  32781     0 2 65535      0  1

Where columns read:

  • nflog group number of a given queue (16 bits)
  • peer portid: most likely the pid of process
  • number of messages buffered on the kernel side
  • copy mode: 2 for full packet copy
  • copy range: max packet size
  • flush timeout in 1/100th of a second
  • use count
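
When scripting this check, the seven columns can be read with a single sscanf call. The following is a small sketch, assuming one whitespace-separated line per group as in the sample output above. The struct nflog_entry and the function parse_nflog_line are hypothetical names, not part of pmtud.

```c
#include <stdio.h>

/* One row of /proc/net/netfilter/nfnetlink_log, in column order. */
struct nflog_entry {
    unsigned group;          /* nflog group number */
    unsigned peer_portid;    /* most likely the pid of the listening process */
    unsigned queued;         /* messages buffered on the kernel side */
    unsigned copy_mode;      /* 2 means full packet copy */
    unsigned copy_range;     /* max packet size */
    unsigned flush_timeout;  /* in 1/100th of a second */
    unsigned use_count;
};

/* Returns 0 on success, -1 if the line does not contain seven numbers. */
int parse_nflog_line(const char *line, struct nflog_entry *e)
{
    int n = sscanf(line, "%u %u %u %u %u %u %u",
                   &e->group, &e->peer_portid, &e->queued,
                   &e->copy_mode, &e->copy_range,
                   &e->flush_timeout, &e->use_count);
    return n == 7 ? 0 : -1;
}
```

For the sample line above, this fills in group 33, peer portid 32781 and copy mode 2.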

pmtud's People

Contributors

kerolasa, majek, martintopholm, niku64

pmtud's Issues

Issues when running with pcap_next_ex

We were considering running pmtud, but needed to expand it to cover L3
use cases. During this work I discovered that handle_packet wasn't called
for every received packet. This appears to be the case for v0.6.

I suspected this was related to the interaction between the nonblocking
pcap interface and the uevent system. So I tried running the main loop as
pcap_loop instead, which resulted in all packets being handled (and
rate limited).

My test setup comprises a workstation (client) with curl, a router (r1)
running Debian, a subnet br1 with two Busybox nodes (n1, n2), and a
subnet br2 with two more Busybox nodes (n3, n4). The link from r1 to
the client is restricted to MTU 600. See also pmtud-test-setup.svg.

In this case a service address is routed to n2, unless the traffic is
TCP, in which case it is routed to n1 using fwmarks. The client retrieves
a PNG file from the service address, which triggers an ICMP unreachable
from r1 to n2.

The HTTP transfer was successful, but tcpdump on n2 shows 5 received
icmp-unreach packets, while only one "10.0.4.1 transmitting mtu=600
sport=-1" line was logged. See case1-tcpdumps.txt.

When I modified the source to use pcap_loop, all packets were handled
(in case2 there are 5 of them: 2 forwarded and 3 rate limited). See
case2-tcpdumps.txt.

pmtud-test-setup.pdf
case1-tcpdumps.txt
case2-tcpdumps.txt

nflog.c: nflog_unbind_pf called instead of nflog_bind_pf

Just a quick issue I noticed while browsing the code. This would easily go undetected, because it appears that nflog_bind_pf() ignores its second parameter and always binds for both AF_INET and AF_INET6. Thus the "nflog_bind_pf(n->h, AF_INET6)" call a few lines further down hides the bug.

--- src/nflog.c.orig    2016-01-07 12:53:09.000000000 -0600
+++ src/nflog.c 2016-01-07 12:53:39.933119500 -0600
@@ -81,7 +81,7 @@
         PFATAL("nflog_unbind_pf(AF_INET6)");
     }

-    r = nflog_unbind_pf(n->h, AF_INET);
+    r = nflog_bind_pf(n->h, AF_INET);
     if (r < 0) {
         PFATAL("nflog_bind_pf(AF_INET)");
     }

New release?

I see at least 2 fixes which might be useful to get out into a tagged release
