pesos / heiko

A fancy load balancer for light weight devices

License: Apache License 2.0

Python 97.84% Dockerfile 2.16%

heiko's People

sreya-01 madhavjivrajani raghavroy145 soham4abc rjdp

heiko's Issues

Logs follow argument takes too much CPU

Describe the bug

When heiko logs is used with the follow (-f) parameter, it utilizes 100% of a CPU thread.

To Reproduce
Steps to reproduce the behavior:

1. Set up a heiko app and start it with heiko start --name something
2. Check logs with heiko logs -f --name something
3. Open your favourite system monitor tool (example: htop, top, bpytop)
4. Notice heiko going brr

Expected behavior
Following the logs shouldn't take so much CPU. Tools like tail -f do it without much CPU usage.

Desktop (please complete the following information):

OS: Fedora 33 KDE
heiko Version: master
Python version: 3.9

Also noticed on a Raspberry Pi 4

Additional context

The function which is responsible for following logs is here:

heiko/heiko/cli.py

Lines 122 to 143 in 29c726f

    def follow(file) -> Iterator[str]:
        """Yields each line in a file as they are written
        (or yields when more than 5 characters have been written)

        :param file: file handle to read
        :type file: file handle
        :yield: a newline terminated line or a string of 5 characters
        :rtype: Iterator[str]
        """
        line = ""
        i = 0
        while True:
            tmp = file.readline()
            if tmp is not None:
                line += tmp
                i += 1
                if line.endswith("\n") or i >= 5:
                    yield line
                    line = ""
                    i = 0
            else:
                time.sleep(0.1)

Which is used here:

heiko/heiko/cli.py

Lines 213 to 216 in 29c726f

    if args.follow:
        # follow log as it is written
        for line in follow(f):
            print(line, end="")
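A likely cause: in Python, readline() on a text file returns an empty string at end of file, never None, so the "tmp is not None" branch is always taken and time.sleep(0.1) is never reached. The following is a minimal sketch of one possible fix, not the project's actual patch. It assumes that yielding only complete lines is acceptable (the "flush after 5 reads" behaviour is dropped), and the poll_interval parameter is a name introduced here for illustration.

```python
import time
from typing import Iterator, TextIO


def follow(file: TextIO, poll_interval: float = 0.1) -> Iterator[str]:
    """Yield each newline-terminated line as it is appended to the file."""
    buffer = ""
    while True:
        chunk = file.readline()
        if not chunk:
            # readline() returns "" (not None) at end of file, so this is
            # where the loop must sleep instead of spinning on the CPU.
            time.sleep(poll_interval)
            continue
        buffer += chunk
        if buffer.endswith("\n"):
            # Only hand out complete lines; a partial line stays buffered
            # until the writer finishes it.
            yield buffer
            buffer = ""
```

With this shape the loop wakes up roughly ten times per second while idle, which is closer to how tail -f behaves than a busy loop.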

Missing rsync install for test user in Dockerfile

When heiko is run with the test user using the docker-networks.py file, logs show an error saying that rsync is not found.

This should be pretty straightforward to fix:

RUN echo yabe | sudo -S apt-get install rsync -y

Eliminate use of binaries to get node info

Hey, so I was trying to understand how heiko is getting node information like CPU and RAM usage and things like that.

I noticed that in a lot of places it makes use of binaries such as lscpu and free. That's a really convenient way to do it, but I recently found out that in quite a few cases, binaries like lscpu aren't installed by default, so that might cause an issue. Another improvement heiko might be able to achieve: since it's going to be checking for these details multiple times, you'll have to load the binary into memory at least once. Instead, you could just read the info directly from the sources that binaries like lscpu and free themselves use.

So for example, for lscpu, the info you need can be read from /proc/cpuinfo and /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq

For free you can get the info from /proc/meminfo
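As a rough illustration of that idea, the sketch below parses the text of /proc/meminfo and /proc/cpuinfo directly. The function names parse_meminfo and count_cpus are hypothetical and not part of heiko. The functions take the file contents as a string, on the assumption that heiko would fetch them from the node (for example with cat over its existing connection) rather than open local files.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo text into {field: value}.

    Values are in kB for fields that carry a unit (e.g. MemTotal).
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def count_cpus(text: str) -> int:
    """Count logical CPUs by counting 'processor' entries in /proc/cpuinfo."""
    return sum(
        1
        for line in text.splitlines()
        if line.partition(":")[0].strip() == "processor"
    )
```

A caller could then compute used memory as MemTotal minus MemAvailable without spawning free at all.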

Simplify the interface of NodeDetails by making implementation private

The Problem

If you look into utils/load.py, specifically the NodeDetails class, you'll notice that all of the class functions are public but the only function that is actually useful to a user of the class is getDetails. Similarly, some of the class members (ram_pattern and load_pattern) have no use outside of the class, but they are made public.

In Python, by default, all members and functions are public. Variables and functions can be made "private" by prefixing the name with an underscore. (Note: nothing is truly private in Python. These variables can still be accessed outside the class, and the responsibility of using them in the expected way rests on the user. This is different from languages like Java or C++, where private members are enforced by the compiler.)

Why do you need this?

Let's see this with a short story. You, a (first time) contributor to heiko, are working on a new scheduler for heiko (you don't need to know what that means at the moment, but if you're interested, you can find a basic scheduler for heiko implemented here). You need to get details of a node to make some decision (say, to decide which node to run a process on). You make an instance of the class NodeDetails:

node_deets = NodeDetails(node)

Now, to get the actual details from it, you type node_deets. and wait for your editor's code completer to suggest the methods of the class. You see this:

For now you only need the CPU details, so you use getCpuDetails and your editor nicely shows the documentation of it:

So now you need to make an asyncssh connection object, whatever that is. Also, it only returns the output of a command which, in this case, happens to be quite big (try running lscpu -J if you have a Linux system and see for yourself). Maybe you notice the parseCpuInfo function, which gives only what you need. Maybe you don't notice it and write your own parsing function (yikes, now if something is changed inside the getCpuDetails function, you'll have to change this too!). All of this for something that should have been abstracted away in the class. This is exactly what the getDetails function does: it hides the asyncssh connection object, the calls to the helper functions, etc. Note that you are not wrong in this case; there were too many functions and variables to go through. It was not easy to do the right thing. Or in other words, it was easy to do the wrong thing.

The getCpuDetails and parseCpuInfo functions are internal functions used by getDetails. Here, getDetails is the only interface function and the rest are implementation details. Everything outside the class should only use the interface, and it is the responsibility of the writer of the class to not change this interface (for example, by adding more parameters, changing the return type, etc.). The implementation, on the other hand, can change without affecting the interface (here is a short example of implementation vs. interface in C++).

I also highly recommend watching this amazing talk on designing interfaces - The Most Important Design Guideline by Scott Meyers (don't be fooled by the "lecture" in the title, the talk is quite interesting and you might find it relatable too).

The Solution

Phew, that was a large wall of text. All of this just to say:

Rename all of the implementation functions and members to prefix them with an underscore.
- ram_pattern to _ram_pattern
- load_pattern to _load_pattern
- getNodeRam to _getNodeRam
- getCpuUsage ....you get the idea
- getCpuDetails
- parserRam
- parseLoad
- parseCpuInfo
Rename in all the places these functions and members are used - all of this should be inside the class!
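To make the intended shape concrete, here is a small sketch of a class with one public method and underscore-prefixed internals. It is a simplified stand-in and not heiko's actual NodeDetails: the class name NodeDetailsSketch, the constructor argument, and the regular expression are assumptions made for illustration, and the real class talks to the node over asyncssh instead of receiving command output as a string.

```python
import re


class NodeDetailsSketch:
    # Implementation detail: the leading underscore marks it as private
    # by convention, so code completion and readers skip over it.
    _load_pattern = re.compile(r"load average: ([\d.]+)")

    def __init__(self, uptime_output: str):
        # Hypothetical input: the text printed by the uptime command.
        self._uptime_output = uptime_output

    def _parseLoad(self) -> float:
        """Internal helper: extract the 1-minute load average."""
        match = self._load_pattern.search(self._uptime_output)
        if match is None:
            raise ValueError("could not find a load average in the output")
        return float(match.group(1))

    def getDetails(self) -> dict:
        """The only method callers are expected to use."""
        return {"load": self._parseLoad()}
```

A caller only ever writes NodeDetailsSketch(output).getDetails(), so the parsing helper and the pattern can be renamed or rewritten later without breaking any code outside the class.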

Documentation specifies incorrect min python-version to install

The Requirements section specifies 3.6 as the minimum version, but when installing with 3.6 from git, the following error is noticed

The documentation in the Contributing section specifies that 3.7 and above is needed for it to work

[FEATURE REQ] check if PID actually exists when reading pidfile

Description

When the master unexpectedly shuts off (a power cut, for example), the pidfile isn't removed after it is rebooted, which leads heiko to think that the daemon is already running (which it isn't).

Possible Fix

heiko/heiko/daemon.py

Lines 71 to 78 in 4874dce

    @property
    def pid(self):
        try:
            with open(self.pidfile, "r") as pf:
                pid = int(pf.read().strip())
        except IOError:
            pid = None
        return pid

Here, after reading the PID, heiko can check whether a process with that PID exists using psutil (or any of the alternative methods to check if a PID exists). The PID can be set to None if the process does not exist.
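One way this check might look is sketched below. It uses the standard-library alternative os.kill(pid, 0) instead of psutil so that the example has no extra dependency. With psutil installed, psutil.pid_exists(pid) could replace the helper. The function names pid_exists and read_live_pid are hypothetical, and the signal-0 trick is assumed to run on a POSIX system such as Linux (it must not be used on Windows, where signal 0 has a different meaning).

```python
import os
from typing import Optional


def pid_exists(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only; nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def read_live_pid(pidfile: str) -> Optional[int]:
    """Read a PID from pidfile; return None if it is missing, invalid, or stale."""
    try:
        with open(pidfile, "r") as pf:
            pid = int(pf.read().strip())
    except (OSError, ValueError):
        return None
    return pid if pid_exists(pid) else None
```

Note that a PID can be reused by an unrelated process after a reboot, so a stricter check might also compare the process name or start time.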

Make the docker-networks.py script run as user instead of root.

As of now, the script spawns containers that run as root by default. On the other hand, heiko's target devices don't have access to any level of elevated permissions. It would be nice to make it run as a regular user, to make simulating and perhaps testing a lot more realistic.

Missing required dependency in `setup.py`

The dependency psutil is missing from the list of required installs in setup.py, causing heiko to throw an exception when it is installed/built with pip install . or pip install heiko.
