
sbulb's Introduction

udploadbalancer

A UDP load-balancer prototype using bcc (XDP/BPF).

usage: sbulb [-h] -vs VIRTUAL_SERVER
             (-rs REAL_SERVER [REAL_SERVER ...] | -cfg CONFIG_FILE) -p PORT
             [PORT ...] [-d {0,1,2,3,4}]
             [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,TRACE}] [-mp MAX_PORTS]
             [-mrs MAX_REALSERVERS] [-ma MAX_ASSOCIATIONS]
             ifnet

positional arguments:
  ifnet                 network interface to load balance (e.g. eth0)

optional arguments:
  -h, --help            show this help message and exit
  -vs VIRTUAL_SERVER, --virtual_server VIRTUAL_SERVER
                        <Required> Virtual server address (e.g. 10.40.0.1)
  -rs REAL_SERVER [REAL_SERVER ...], --real_server REAL_SERVER [REAL_SERVER ...]
                        <Required> Real server address(es) (e.g. 10.40.0.2 10.40.0.3)
  -cfg CONFIG_FILE, --config_file CONFIG_FILE
                        <Required> a path to a file containing real server address(es). 
                        File will be polled each second for modification and configuration
                        updated dynamically. A file content example :
                        
                        [Real Servers]
                        10.0.0.4
                        10.0.0.2
                        10.0.0.6
                        
  -p PORT [PORT ...], --port PORT [PORT ...]
                        <Required> UDP port(s) to load balance
  -d {0,1,2,3,4}, --debug {0,1,2,3,4}
                        Use to set bpf verbosity, 0 is minimal. (default: 0)
  -l {CRITICAL,ERROR,WARNING,INFO,DEBUG,TRACE}, --loglevel {CRITICAL,ERROR,WARNING,INFO,DEBUG,TRACE}
                        Use to set logging verbosity. (default: ERROR)
  -mp MAX_PORTS, --max_ports MAX_PORTS
                        Set the maximum number of port to load balance. (default: 16)
  -mrs MAX_REALSERVERS, --max_realservers MAX_REALSERVERS
                        Set the maximum number of real servers. (default: 32)
  -ma MAX_ASSOCIATIONS, --max_associations MAX_ASSOCIATIONS
                        Set the maximum number of associations. (default: 1048576)
                        This defined the maximum number of foreign peers supported at the same time.

E.g.: sudo python3 -m sbulb eth0 -vs 10.188.7.99 -rs 10.188.100.163 10.188.100.230 -p 5683 5684

Behavior

This load balancer can be considered a Layer-4 NAT load-balancer, as it only modifies IP addresses.

For ingress traffic:

  • we look up a clientip:port/realserverip association.
  • if one exists, we rewrite the destination address (destination NAT), replacing the virtual IP address with the real server's IP address.
  • if none exists, we pick a real server, create a new association, and apply destination NAT as above.

For egress traffic:

  • we look up a clientip:port/realserverip association.
  • if one exists and the packet comes from the associated real server, we rewrite the source address (source NAT), replacing the real server's address with the virtual server's IP address.
  • if one exists and the packet comes from a "not associated" real server, we drop the packet.
  • if none exists, we create a new association using the source IP address (the real server's IP address) and replace the source address with the virtual server's IP address (source NAT).

We keep each association in a large LRU map for as long as possible: the oldest association is removed only when the LRU map is full and a new association must be created.

The algorithm used is a simple round-robin.
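
The ingress and egress rules above can be modelled in a few lines of user-space Python. This is only a sketch, not sbulb's BPF code: the class `AssociationTable` and its method names are hypothetical, an `OrderedDict` stands in for the kernel LRU map, and refreshing an entry on every use is an assumption about how "oldest" is decided.

```python
from collections import OrderedDict
from itertools import cycle


class AssociationTable:
    """Toy model of the clientip:port -> real server association map."""

    def __init__(self, real_servers, max_associations):
        self._next_server = cycle(list(real_servers))  # round-robin state
        self._lru = OrderedDict()                      # (client_ip, port) -> real server
        self._max = max_associations

    def _store(self, client, server):
        # Insert or refresh the entry, then evict the oldest one if the map is full.
        self._lru[client] = server
        self._lru.move_to_end(client)
        if len(self._lru) > self._max:
            self._lru.popitem(last=False)

    def ingress(self, client):
        """Return the real server that replaces the virtual IP (destination NAT)."""
        server = self._lru.get(client)
        if server is None:
            server = next(self._next_server)  # no association yet: pick one
        self._store(client, server)
        return server

    def egress(self, server, client):
        """Return True if the packet is source-NATed, False if it is dropped."""
        known = self._lru.get(client)
        if known is not None and known != server:
            return False  # packet comes from a "not associated" real server
        self._store(client, server)
        return True
```

With two real servers and room for two associations, a third client pushes the least recently used association out of the map, which is the only way an association disappears in this model.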

⚠️ All packets from the real servers to the client must go through the UDP load-balancer machine/director (as with LVS-NAT).

Why create a new load balancer ?

In a cluster, good practice is generally to share state between server instances, but sometimes some state cannot be shared.
For example, in a cluster of servers that cannot share DTLS connections, you want to always send packets from a given client to the same server, to limit the number of handshakes.
To do that you need a long-lived association between the client and the server, but most UDP load balancers are designed around ephemeral associations. Most of the time the association lifetime can be configured and set to a large value, but here, thanks to the LRU map, we keep the association for as long as we can.

The other point is server-initiated communication. We want a server to be able to initiate communication exactly as if it had been initiated by a client, meaning the same association table is used.

Limitation

This is a simple load-balancer, so it has some limitations:

  • All traffic (ingress and egress) should be handled by the same network interface.
  • All traffic should go to the same ethernet gateway (which is the case most of the time).
  • Does not support IP fragmentation.
  • Does not support IP packets with header options for now, meaning the IP header length (IHL, counted in 32-bit words) must be 5.
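
The last two limitations correspond to fields of the IPv4 header. As an illustration only (this is not sbulb's BPF code, and the function name `is_supported_ipv4` is hypothetical), such a check could look like this on a raw header:

```python
import struct


def is_supported_ipv4(header: bytes) -> bool:
    """Return True for an IPv4 header without options that is not a fragment."""
    if len(header) < 20:
        return False
    (version_ihl,) = struct.unpack_from("!B", header, 0)
    (flags_frag,) = struct.unpack_from("!H", header, 6)
    version = version_ihl >> 4
    ihl = version_ihl & 0x0F                   # header length in 32-bit words
    more_fragments = bool(flags_frag & 0x2000)  # MF flag
    fragment_offset = flags_frag & 0x1FFF
    return version == 4 and ihl == 5 and not more_fragments and fragment_offset == 0
```

A header with IHL 5 and the "don't fragment" flag set passes; any header with options (IHL above 5), the MF flag, or a non-zero fragment offset is rejected.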

Requirements & dependencies

You need :

  • a recent Linux kernel able to run XDP/BPF code (currently tested with the 4.19.x package).
  • bcc installed (currently tested with v0.8: package python3-bpfcc on Debian).
  • linux-headers installed, so that bcc can compile BPF code.

Usage with systemd

Sbulb supports the sd_notify(3) mechanism, but does not require systemd or any systemd library to run. This allows sbulb to notify systemd when it is ready to accept connections. To use this feature, you can write a service like this:

[Unit]
Description=UDP Load Balancer
Wants=network-online.target
[Service]
Type=notify
NotifyAccess=all
Environment=PYTHONUNBUFFERED=1
ExecStart=/usr/bin/python3 -m sbulb args...
[Install]
WantedBy=multi-user.target
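
For readers curious how readiness can be signalled without any systemd library, here is a minimal sketch of the sd_notify(3) protocol: a datagram containing `READY=1` is sent to the Unix socket named by the `NOTIFY_SOCKET` environment variable. The function below is an illustration and not necessarily how sbulb implements it.

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send a state string to systemd; return False when not run under systemd."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    address = path
    if path.startswith("@"):
        # A leading "@" designates a Linux abstract-namespace socket.
        address = b"\0" + path[1:].encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), address)
    return True
```

When the process is started outside systemd, `NOTIFY_SOCKET` is unset and the call is a no-op, which is why no systemd dependency is needed.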

Performance

See our wiki page about that.

Unit Tests

To launch unit tests :

sudo python3 -m unittest                                   # all tests
sudo python3 -m unittest sbulb.tests.IPv4TestCase          # only 1 test case
sudo python3 -m unittest sbulb.tests.IPv4TestCase.test_lru # only 1 test

Tests need bcc v0.14 (python3-bpfcc) and scapy (python3-scapy).

XDP/Bpf

Why XDP/BPF? See "Why is the kernel community replacing iptables with BPF?".

Read about XDP/BPF: "Dive into BPF: a list of reading material".

Inspirations :

Documentations :

Thesis :

sbulb's People

Contributors

oliwer, sbernard31

sbulb's Issues

Improve logging

This issue centralizes all possible improvements to the current logging facilities (#22):

  • use the Python logging facilities instead of print.
  • the logging level should also affect the Python output (not only the BPF one).
  • currently the logs only show MAC/IP; it may make more sense to show MAC/IP:port
    when possible.
  • add syslog support, maybe using SysLogHandler.
  • BPF logs could show whether a packet is DROP, PASS or TX.
  • use macros for BPF logging instead of if statements to improve performance? We should first find a way to measure performance!
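
As a rough illustration of the first four points, the standard `logging` module can carry a custom TRACE level and an optional syslog handler. The logger name `"sbulb"`, the numeric value chosen for TRACE and the `setup_logging` helper are assumptions made for this sketch.

```python
import logging
import logging.handlers

TRACE = 5  # assumed value, below DEBUG (10)
logging.addLevelName(TRACE, "TRACE")


def setup_logging(level_name="ERROR", syslog_address=None):
    """Configure the 'sbulb' logger for stderr, or syslog when an address is given."""
    logger = logging.getLogger("sbulb")
    logger.setLevel(level_name)  # accepts registered level names, e.g. "TRACE"
    if syslog_address:
        handler = logging.handlers.SysLogHandler(address=syslog_address)
    else:
        handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.handlers[:] = [handler]  # replace handlers left by a previous call
    return logger
```

A call such as `setup_logging("TRACE").log(TRACE, "packet from %s:%d", "10.0.0.2", 5683)` would then print the message with the IP:port form suggested above.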

Handle TTL field from IP header.

The IP header contains a TTL field (see RFC 791):

This field indicates the maximum time the datagram is allowed to
remain in the internet system. If this field contains the value
zero, then the datagram must be destroyed. This field is modified
in internet header processing. The time is measured in units of
seconds, but since every module that processes a datagram must
decrease the TTL by at least one even if it process the datagram in
less than a second, the TTL must be thought of only as an upper
bound on the time a datagram may exist. The intention is to cause
undeliverable datagrams to be discarded, and to bound the maximum
datagram lifetime.

The Wikipedia explanation may be clearer:

The time-to-live value can be thought of as an upper bound on the time that an IP datagram can exist in an Internet system. The TTL field is set by the sender of the datagram, and reduced by every router on the route to its destination. If the TTL field reaches zero before the datagram arrives at its destination, then the datagram is discarded and an Internet Control Message Protocol (ICMP) error datagram (11 - Time Exceeded) is sent back to the sender. The purpose of the TTL field is to avoid a situation in which an undeliverable datagram keeps circulating on an Internet system, and such a system eventually becoming swamped by such "immortals".

We can consider sbulb a router, so if we want to be a good internet citizen we should update the TTL field and discard the packet when needed. This part should be easy to implement.

As for sending an ICMP packet, I don't know whether this is easy. Maybe we can just let the Linux kernel do it (return XDP_PASS)?
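
To make the proposal concrete, here is a user-space sketch of what "update the TTL field and discard the packet when needed" means for a 20-byte IPv4 header. It is not code from sbulb, the function names are hypothetical, and it recomputes the whole header checksum for simplicity.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's complement sum of the header's 16-bit words (RFC 791)."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decrement_ttl(header: bytes):
    """Return the header with TTL - 1, or None if the datagram must be discarded."""
    ttl = header[8]
    if ttl <= 1:
        return None  # TTL would reach zero: discard (ICMP Time Exceeded not handled here)
    updated = bytearray(header)
    updated[8] = ttl - 1
    updated[10:12] = b"\x00\x00"  # clear the checksum field before recomputing
    struct.pack_into("!H", updated, 10, ipv4_checksum(bytes(updated)))
    return bytes(updated)
```

A valid header checksums to 0 when the checksum field is included, which gives a quick way to verify the result.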

About the calculation of udp checksum

Hello, in your code the UDP checksum is only recalculated for the modified target IP. If I change the source ip:port of the packet and also modify the destination ip:port, how can I compute the checksum efficiently? My current implementation uses a for loop to calculate the checksum over the whole packet, but it feels very inefficient.
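
One common answer to this kind of question is the incremental update of RFC 1624: instead of summing the whole packet again, only the changed 16-bit words are folded into the existing checksum (HC' = ~(~HC + ~m + m')). The Python sketch below only illustrates the arithmetic on an arbitrary buffer; the function names are hypothetical, the changed fields are assumed to be 16-bit aligned, and the UDP pseudo-header is left out for brevity.

```python
import struct


def fold(total: int) -> int:
    """Fold carries back into 16 bits (one's complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(data: bytes) -> int:
    """Full one's complement checksum, used here only as a reference."""
    if len(data) % 2:
        data += b"\x00"
    return ~fold(sum(w for (w,) in struct.iter_unpack("!H", data))) & 0xFFFF


def update_checksum(old_sum: int, old_field: bytes, new_field: bytes) -> int:
    """RFC 1624, eqn. 3, applied to every 16-bit word that changed."""
    total = ~old_sum & 0xFFFF
    pairs = zip(struct.iter_unpack("!H", old_field), struct.iter_unpack("!H", new_field))
    for (old,), (new,) in pairs:
        total += (~old & 0xFFFF) + new
    return ~fold(total) & 0xFFFF
```

Changing both addresses and both ports then costs a handful of additions, whatever the payload size.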

Notify systemd if sbulb managed to start successfully

When systemd launches a daemon (such as ulb.py), it does not wait to see whether an error occurred after the process was launched. Thus, even if the BPF code compilation failed, systemd will start the dependent services as if everything was OK.

To solve this issue, ulb.py simply needs to call /bin/systemd-notify --ready to signal it is ready to accept connections.

problem of bcc

I have this issue even though I have installed bcc on Ubuntu:
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/home/alla/Bureau/loadbalancer/sbulb/sbulb/__init__.py", line 2, in <module>
    from bcc import BPF
ImportError: cannot import name 'BPF'

Have a first simple way to log

Logs will simply be printed to standard output.
4 log levels: NONE / ERROR / DEBUG / TRACE

NONE : nothing printed
ERROR : for unexpected states
DEBUG : ERROR + all unmodified packets.
TRACE : DEBUG + all modified packets.

(cannot be changed without a restart)

Cleaning code

Remove unused code, clean up warnings, add some documentation, fix TODOs ...
(#1 is related to this issue)

Verify behavior description of egress traffic

From README.md:

For egress traffic :

  • we search if we have an clientip:port/realserverip association.
  • if yes, we drop the packet (as this server is not associated to this clientip:port)
  • if no, we create a new association using the source IP address(real server IP address) and replacing source address(source NAT) by the virtual server ip address.

My feeling is that the drop in the second case causes trouble. Doesn't that depend on which server sends the packet? If it is the "associated one", then the packet should pass, as in the third case.

Being able to change configuration without restart

The idea would be to :

  1. pass a file as a parameter.
  2. watch for file modification.
  3. update the configuration accordingly.

The targeted settings are mainly those about real servers, so that real server redeployment can be handled.
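
The three steps can be sketched with the standard library, using the `[Real Servers]` file format shown in the usage text. This is an illustration under assumptions, not sbulb's implementation: `read_real_servers` and `ConfigWatcher` are hypothetical names, and change detection relies on the file's modification time.

```python
import configparser
import os


def read_real_servers(path):
    """Parse the '[Real Servers]' section: one address per line, no values."""
    # Only '=' is a delimiter, so ':' in addresses is not treated as a separator.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.read(path)
    return list(parser["Real Servers"])


class ConfigWatcher:
    """Report the real server list whenever the file's mtime changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None

    def poll(self):
        """Return the new server list if the file changed since last poll, else None."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self._mtime:
            return None
        self._mtime = mtime
        return read_real_servers(self.path)
```

Calling `poll()` once per second from the main loop and pushing any non-`None` result into the real server map would give the dynamic update described above.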

Add support to IPv6

The idea is to use either IPv4 or IPv6 for the real and virtual servers, but not both at the same time.
Meaning that you cannot have a virtual IPv4 address with real server IPv6 addresses, or vice versa.

From "round-robin" to "random-pick" ?

Currently the algorithm used to dispatch traffic is very simple.
It is just a circular rotation:

  1. For the 1st association we take the 1st real server,
  2. Then for the 2nd association we take the 2nd real server,
  3. After the last real server we go back to the beginning.

To do that we need a state: the last (or next) real server used.

The idea of doing a "random pick" was raised. This way we could remove the state.
We could investigate this.

Maybe we can use the bpf_get_prandom_u32 helper function.
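
The difference between the two strategies is easy to see in a small Python sketch. Everything here is illustrative: `RoundRobinPicker` and `random_pick` are hypothetical names, and `random.getrandbits(32)` merely stands in for the 32-bit value that bpf_get_prandom_u32 would return in BPF code.

```python
import random


class RoundRobinPicker:
    """Circular rotation: needs to remember the next index between picks."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.next_index = 0  # the state a random pick would make unnecessary

    def pick(self):
        server = self.servers[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.servers)
        return server


def random_pick(servers, rand32=lambda: random.getrandbits(32)):
    """Stateless pick: map a 32-bit pseudo-random number onto the server list."""
    return servers[rand32() % len(servers)]
```

Round-robin guarantees an even spread over consecutive associations, while the random pick only approaches it on average; that trade-off is part of what would need investigating.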
