
adaptive-network-slicing's Issues

controller: QoS queue setting is rather slow

Calls from QoSManager to the app that actually implements QoS (name to be looked up) generally become slow after the first three calls per endpoint, as shown below.

This makes the setup process, and even the later adaptations, destructively slow. The delay can also lead to operating on stale data -- e.g. setting queues on a switch that has just disconnected from the controller -- so exceptions may occur as well.

127.0.0.1 - - [23/Apr/2020 08:00:15] "PUT /v1.0/conf/switches/0000000000000002/ovsdb_addr HTTP/1.1" 201 144 0.004762
(24939) accepted ('127.0.0.1', 46658)
127.0.0.1 - - [23/Apr/2020 08:00:15] "POST /qos/rules/0000000000000002 HTTP/1.1" 200 247 0.001050
(24939) accepted ('127.0.0.1', 46664)
127.0.0.1 - - [23/Apr/2020 08:00:16] "POST /qos/rules/0000000000000002 HTTP/1.1" 200 247 0.000786
(24939) accepted ('127.0.0.1', 46668)
127.0.0.1 - - [23/Apr/2020 08:00:16] "POST /qos/rules/0000000000000002 HTTP/1.1" 200 247 0.000779
(24939) accepted ('127.0.0.1', 46672)
127.0.0.1 - - [23/Apr/2020 08:00:18] "POST /qos/queue/0000000000000002 HTTP/1.1" 200 387 2.120427
(24939) accepted ('127.0.0.1', 46680)
127.0.0.1 - - [23/Apr/2020 08:00:20] "POST /qos/queue/0000000000000002 HTTP/1.1" 200 387 2.135312
(24939) accepted ('127.0.0.1', 46688)
127.0.0.1 - - [23/Apr/2020 08:00:20] "PUT /v1.0/conf/switches/0000000000000003/ovsdb_addr HTTP/1.1" 201 144 0.000313
(24939) accepted ('127.0.0.1', 46694)
127.0.0.1 - - [23/Apr/2020 08:00:20] "POST /qos/rules/0000000000000003 HTTP/1.1" 200 247 0.000935
(24939) accepted ('127.0.0.1', 46700)
127.0.0.1 - - [23/Apr/2020 08:00:20] "POST /qos/rules/0000000000000003 HTTP/1.1" 200 247 0.000673
(24939) accepted ('127.0.0.1', 46704)
127.0.0.1 - - [23/Apr/2020 08:00:20] "POST /qos/rules/0000000000000003 HTTP/1.1" 200 247 0.000914
(24939) accepted ('127.0.0.1', 46708)
127.0.0.1 - - [23/Apr/2020 08:00:22] "POST /qos/queue/0000000000000003 HTTP/1.1" 200 387 2.170584
(24939) accepted ('127.0.0.1', 46716)
127.0.0.1 - - [23/Apr/2020 08:00:24] "POST /qos/queue/0000000000000003 HTTP/1.1" 200 387 2.096152

The solution may be closely related to #3.
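
Until the root cause is found, one mitigation could be to issue the ~2-second queue calls for different switches concurrently, so that total setup time is bounded by the slowest call rather than their sum. A rough sketch, assuming Ryu's default REST port and the documented rest_qos request format (switch IDs and queue values are illustrative):

import concurrent.futures

import requests

def set_queue(switch_id, queue_conf):
    # POST one queue configuration to the rest_qos app, as in the log above.
    url = "http://127.0.0.1:8080/qos/queue/%s" % switch_id
    return requests.post(url, json=queue_conf, timeout=10)

# Calls to the *same* switch should stay sequential; only fan out across switches.
switch_ids = ["0000000000000002", "0000000000000003"]
queue_conf = {"port_name": "s1-eth1", "type": "linux-htb",
              "queues": [{"max_rate": "5000000"}]}

with concurrent.futures.ThreadPoolExecutor() as pool:
    responses = list(pool.map(lambda s: set_queue(s, queue_conf), switch_ids))

This only hides the latency, of course; it does not explain it.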

controller: Separate measurement scope from QoS setting scope

As measurements are done per switch, adaptation time grows with the number of switches along a flow's path. This is because a switch only receives the traffic let through by the previous switch, so it does not see the same average values yet; its averages still need to grow before it gets adapted as well.

If we accept that controlling the network as a whole, rather than separate routes, is enough -- which I'd say it is for the thesis -- then separating the measurement scope from the configuration scope can solve this problem.
Luckily, switch_id can be 'all' in the API calls to the rest_qos application, so network-scope rule setting should be rather easy.

ryu/app/rest_qos.py:
47: # =============================
48: #          REST API
49: # =============================
50: #
51: #  Note: specify switch and vlan group, as follows.
52: #   {switch-id} : 'all' or switchID
53: #   {vlan-id}   : 'all' or vlanID

This could also significantly decrease the number of calls needed to apply modifications to the switches.
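
For illustration, a network-scope rule call might then look like this (a sketch assuming Ryu's default REST port; body format as documented for rest_qos, with illustrative values):

import requests

# One request installs the rule on every switch registered with rest_qos,
# instead of one request per datapath.
rule = {"match": {"nw_dst": "10.0.0.11", "nw_proto": "UDP", "tp_dst": "5001"},
        "actions": {"queue": "1"}}
resp = requests.post("http://127.0.0.1:8080/qos/rules/all", json=rule)
print(resp.json())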

Controller:log: Present switch names instead of dpids

Comprehensible naming is of no use if the log prints this:

        datapath   ipv4-dst udp-dst avg-speed (Mb/s) current limit (Mb/s) original limit (Mb/s)
---------------- ---------- ------- ---------------- -------------------- --------------------
0000000000000001  10.0.0.11    5001             0.00                 1.25                 5.00
0000000000000002  10.0.0.11    5001             0.00                 1.25                 5.00
0000000000000003  10.0.0.11    5001             0.00                 1.25                 5.00
0000000000000004  10.0.0.11    5001             0.00                 1.25                 5.00
0000000000000005  10.0.0.11    5001             0.00                 1.25                 5.00
0000000000000001  10.0.0.12    5002             0.00                 6.25                25.00
0000000000000002  10.0.0.12    5002             0.00                 6.25                25.00
0000000000000003  10.0.0.12    5002             0.00                 6.25                25.00
0000000000000004  10.0.0.12    5002             0.00                 6.25                25.00
0000000000000005  10.0.0.12    5002             0.00                 6.25                25.00
0000000000000001  10.0.0.13    5003             0.00                 3.75                15.00
0000000000000002  10.0.0.13    5003             0.00                 3.75                15.00
0000000000000003  10.0.0.13    5003             0.00                 3.75                15.00
0000000000000004  10.0.0.13    5003             0.00                 3.75                15.00
0000000000000005  10.0.0.13    5003             0.00                 3.75                15.00

Having names instead of datapath IDs would make it significantly easier to connect the log output to a specific experiment and topology.
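
A simple way to get there would be a dpid-to-name lookup fed from the experiment's topology description. A sketch (the names are made up for illustration):

# Hypothetical mapping for the topology used in the experiments.
SWITCH_NAMES = {
    0x0000000000000001: "s1-core",
    0x0000000000000002: "s2-b",
    0x0000000000000003: "s3-c",
}

def switch_name(dpid):
    # Fall back to the raw dpid so unknown switches stay identifiable.
    return SWITCH_NAMES.get(dpid, "%016x" % dpid)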

Implement simulation output converter

The output of Mininet and my experiment scripts is not suitable for generating graphs directly. There must be a way to convert it to a suitable format without human interaction.

The output must be in a format that can be used directly to create graphs and include them in the paper.
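
If the experiments keep using iperf, its built-in CSV mode (iperf ... -y C) may remove most of the parsing work. Failing that, the converter could be a small regex pass over the captured output; a sketch, assuming iperf2's human-readable interval lines (the regex would need adjusting for other tools):

import csv
import re
import sys

# Matches interval lines such as:
# [  3]  0.0- 1.0 sec  1.25 MBytes  10.5 Mbits/sec
LINE = re.compile(r"\[\s*\d+\]\s+([\d.]+)-\s*([\d.]+)\s+sec\s+"
                  r"([\d.]+)\s+\w+\s+([\d.]+)\s+(\w+)/sec")

writer = csv.writer(sys.stdout)
writer.writerow(["start_s", "end_s", "rate", "rate_unit"])
for line in sys.stdin:
    m = LINE.search(line)
    if m:
        writer.writerow([m.group(1), m.group(2), m.group(4), m.group(5)])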

Mininet: Implement mininet scripts to perform more automated functionality testing

This lies in my stash as WIP, not working for some reason. It may be a good starting point:

diff --git a/slicing/mininet/mn-slicing.py b/slicing/mininet/mn-slicing.py
old mode 100644
new mode 100755
index 5b20cbc..6d8f645
--- a/slicing/mininet/mn-slicing.py
+++ b/slicing/mininet/mn-slicing.py
@@ -1,4 +1,10 @@
+#!/usr/bin/env python2
+
+from mininet.node import RemoteController
 from mininet.topo import Topo
+from mininet.net import Mininet
+from mininet.log import setLogLevel
+from mininet.cli import CLI
 
 
 class SlicingTopo(Topo):
@@ -23,3 +29,26 @@ class SlicingTopo(Topo):
 
 
 topos = {'slicingtopo': (lambda: SlicingTopo())}
+
+
+def run_process(node, cmd):
+    # type: (object, str) -> int
+    node.cmd(cmd + ' &')  # background it; a blocking cmd() would leave $! empty
+    return int(node.cmd('echo $!'))
+
+
+if __name__ == '__main__':
+    # Tell mininet to print useful information
+    setLogLevel('info')
+    "Create and test a simple network"
+    topo = SlicingTopo()
+    net = Mininet(topo, controller=RemoteController('c0', ip='192.0.2.1', port=6653))
+    net.start()
+    h1 = net.get('h1')
+    h2 = net.get('h2')
+    server_pids = []
+    h1.cmd("ping -c 3 10.0.0.1")  # bounded; without -c the ping never returns
+    # for x in [5001, 5002, 5003]:
+    #     server_pids.append(run_process(h1, "iperf -s -u -i 1 -p %d" % x))
+#    CLI(net)
+    net.stop()
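
Independently of the script above, the topology alone can already be exercised through Mininet's custom topology hook, e.g. sudo mn --custom slicing/mininet/mn-slicing.py --topo slicingtopo --controller remote, which may help isolate whether the problem is in the topology or in the __main__ block.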

Update READMEs based on the changes

The new READMEs should reflect the architectural and purpose-related changes so that they make sense when submitted as part of the master's thesis.

controller: Adapt logging to the multi-switch setup

Logging was designed with a single switch in mind, and it is sometimes confusing or incomplete.
Until the whole project becomes flow-centric (it is currently switch-centric), logging needs to be adjusted to show which switch each message refers to.
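
Until then, a cheap option could be a per-datapath logging.LoggerAdapter so that every record carries its switch. A sketch (switch_name from the naming issue above could replace the raw dpid used here):

import logging

logging.basicConfig(format="[%(switch)s] %(levelname)s: %(message)s",
                    level=logging.INFO)  # assumes all records go through the adapter

def get_switch_logger(dpid):
    # Prefix every record with the switch it talks about.
    base = logging.getLogger("adapting_monitor")
    return logging.LoggerAdapter(base, {"switch": "%016x" % dpid})

log = get_switch_logger(0x1)
log.info("queue limits adapted")  # -> [0000000000000001] INFO: queue limits adapted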

Consider setting default iperf client bandwidth to 40Mbps

With 30Mbps, it is not obvious enough how adaptation affects the different participants. As ue1 and ue2 do not share any outgoing ports with ue3 along the paths from b and c, the path to ue1 and ue2 only carries 60Mbps, which does not cause a really significant drop in either of their received data rates. The few Mbps lost on both sides that show up in the log are not much more than what is lost anyway due to the imperfect implementation of the bandwidth-limiting solution.

With 40Mbps outgoing traffic, ue1 and ue2 sometimes experience under 20Mbps incoming traffic, which makes the effect obvious. The slicing then shows: ue2 keeps its granted 25Mbps, which is the whole point.

Default bandwidth must be updated in all experiments, not only in the baseline.
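
For reference, the client invocation would only change in the bandwidth flag, along the lines of iperf -c 10.0.0.11 -u -p 5001 -b 40M -t 60 (destination and port taken from the log table above; the remaining flags are illustrative).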

Controller: Implement clearing rules from switches

Problem

When I restart the controller with a new flow configuration, the old flows remain in the switches, so I get stats for flows that are no longer managed by the QoSManager. That causes an exception like this:

AdaptingMonitor13: Exception occurred during handler processing. Backtrace from offending handler [_flow_stats_reply_handler] servicing event [EventOFPFlowStatsReply] follows.
Traceback (most recent call last):
  File "/home/ecklm/Projects/Diploma/implementation/controller/.venv/lib/python3.8/site-packages/ryu/base/app_manager.py", line 290, in _event_loop
    handler(ev)
  File "/home/ecklm/Projects/Diploma/implementation/controller/adapting_monitor_13.py", line 133, in _flow_stats_reply_handler
    self.qos_managers[dpid].adapt_queues(self.stats[dpid].export_avg_speeds_bps())
  File "/home/ecklm/Projects/Diploma/implementation/controller/adapting_monitor_13.py", line 195, in adapt_queues
    unused_candidates = [k for k, v in flowstats.items() if v < self.FLOWS_INIT_LIMITS[k][0] / 2]
  File "/home/ecklm/Projects/Diploma/implementation/controller/adapting_monitor_13.py", line 195, in <listcomp>
    unused_candidates = [k for k, v in flowstats.items() if v < self.FLOWS_INIT_LIMITS[k][0] / 2]
KeyError: FlowId(ipv4_dst='10.0.0.1', udp_dst=5009)

For now, I intentionally don't catch this exception so that I don't suppress the problem.

Ideal solution

Ideally, deprecated flow rules should be cleared out from the switches.
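
In Ryu, one option would be to wipe the tables whenever a datapath (re)connects, before the current flow set is installed. A sketch using the standard OpenFlow 1.3 delete-all pattern (where exactly to hook it into adapting_monitor_13.py is left open):

def clear_flows(datapath):
    # Delete every flow entry from every table of the given switch.
    ofproto = datapath.ofproto
    parser = datapath.ofproto_parser
    mod = parser.OFPFlowMod(datapath=datapath,
                            table_id=ofproto.OFPTT_ALL,
                            command=ofproto.OFPFC_DELETE,
                            out_port=ofproto.OFPP_ANY,
                            out_group=ofproto.OFPG_ANY,
                            match=parser.OFPMatch())  # empty match = all flows
    datapath.send_msg(mod)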

Controller: Implement diff-serv-based slicing

Hopefully, DiffServ-based slicing will be flexible enough to work with topologies more complex than one switch and a few hosts.

This is not obvious, though; the use-case does not necessarily benefit from simplifying the technical details of flow setting.
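
On the OpenFlow side at least, the mechanics look simple: match on the DSCP codepoint and map it to a queue, so per-destination 5-tuple rules disappear. A sketch for Ryu with OpenFlow 1.3 (DSCP value, queue ID, and the use of the NORMAL port are illustrative):

def install_dscp_slice(datapath, dscp, queue_id, priority=10):
    # Send traffic carrying the given DSCP codepoint to a dedicated queue.
    ofproto = datapath.ofproto
    parser = datapath.ofproto_parser
    match = parser.OFPMatch(eth_type=0x0800, ip_dscp=dscp)  # IPv4 + DSCP
    actions = [parser.OFPActionSetQueue(queue_id),
               parser.OFPActionOutput(ofproto.OFPP_NORMAL)]
    inst = [parser.OFPInstructionActions(ofproto.OFPIT_APPLY_ACTIONS, actions)]
    datapath.send_msg(parser.OFPFlowMod(datapath=datapath, priority=priority,
                                        match=match, instructions=inst))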

Controller: Integrate some configuration parser

A config file format should be selected and its parsing implemented in the controller.
Preferably YAML or INI.

As a first step, extract parameters such as time_step, QoSManager.LIMIT_STEP and others which do not require touching the internals of classes.
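
A minimal sketch of the YAML variant (file name and default values are made up; the keys mirror the parameters mentioned above):

import yaml  # PyYAML

DEFAULTS = {"time_step": 5, "limit_step": 0.25}  # illustrative defaults

def load_config(path="controller.yml"):
    # Example file contents:
    #   time_step: 5      # seconds between stats requests
    #   limit_step: 0.25  # maps to QoSManager.LIMIT_STEP
    with open(path) as f:
        conf = yaml.safe_load(f) or {}
    merged = dict(DEFAULTS)
    merged.update(conf)
    return merged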

Controller: Distribute code across different files

adapting_monitor_13.py is becoming bloated. Now that configurability is coming, some components need to be separated more cleanly from the others (e.g. the FlowStat class can and should be completely independent of the rest).

This step should force more correct configurability and data/functionality separation.
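
One possible split (file names are suggestions only):

controller/
    adapting_monitor_13.py  -- the Ryu app: event handlers and wiring only
    flow_stat.py            -- FlowStat, independent of the rest
    qos_manager.py          -- QoSManager and its REST calls
    config.py               -- configuration loading (see the config parser issue)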

mininet: Find an automated traffic pattern generator

The traffic patterns (roughly) defined in the thesis paper should be automated to the maximum possible extent for the experiments to be repeatable and uniform for different controller parameters.

Requirements

  • Timestamping
  • Have a preconfigured playbook for when and how much egress traffic should change
  • Traffic type (protocol and port) should be selectable
  • Output easy to save and use as CSV (for graph generation later on)
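
A sketch of how a playbook-driven generator might look, to be run from the Mininet script on the sending host (the playbook format and bandwidth values are illustrative; iperf's -y C flag covers the CSV requirement):

import time

# Playbook: (seconds since start, destination, UDP port, target bandwidth).
PLAYBOOK = [
    (0,  "10.0.0.11", 5001, "5M"),
    (30, "10.0.0.11", 5001, "40M"),
    (60, "10.0.0.12", 5002, "25M"),
]

def run_playbook(node, playbook, step_duration=30):
    # Replay the playbook on a Mininet host, timestamping each step.
    start = time.time()
    for offset, dst, port, bw in playbook:
        time.sleep(max(0, start + offset - time.time()))
        print("%.1f starting %s:%d at %s" % (time.time() - start, dst, port, bw))
        node.cmd("iperf -c %s -u -p %d -b %s -t %d -y C >> traffic.csv &"
                 % (dst, port, bw, step_duration))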
