heiko's People

Contributors

madhavjivrajani, psiayn, samyak2

heiko's Issues

Logs follow argument takes too much CPU

Describe the bug

When heiko logs is used with the follow (-f) parameter, it utilizes 100% of a CPU thread.

To Reproduce
Steps to reproduce the behavior:

  1. Set up a heiko app and start it with heiko start --name something
  2. Check logs with heiko logs -f --name something
  3. Open your favourite system monitor tool (example: htop, top, bpytop)
  4. Notice heiko going brr

Expected behavior
Following the logs shouldn't take so much CPU; tools like tail -f do the same job with very little CPU usage.

Desktop (please complete the following information):

  • OS: Fedora 33 KDE
  • heiko Version: master
  • Python version: 3.9

Also noticed on a Raspberry Pi 4

Additional context

The function which is responsible for following logs is here:

heiko/heiko/cli.py

Lines 122 to 143 in 29c726f

def follow(file) -> Iterator[str]:
    """Yields each line in a file as they are written
    (or yields when more than 5 characters have been written)
    :param file: file handle to read
    :type file: file handle
    :yield: a newline terminated line or a string of 5 characters
    :rtype: Iterator[str]
    """
    line = ""
    i = 0
    while True:
        tmp = file.readline()
        if tmp is not None:
            line += tmp
            i += 1
            if line.endswith("\n") or i >= 5:
                yield line
                line = ""
                i = 0
        else:
            time.sleep(0.1)

Which is used here:

heiko/heiko/cli.py

Lines 213 to 216 in 29c726f

if args.follow:
    # follow log as it is written
    for line in follow(f):
        print(line, end="")
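
A likely cause, going by the quoted function: for a regular text file, readline() returns an empty string at end of file, never None, so the `tmp is not None` check is always true and the time.sleep(0.1) branch is never reached. The sketch below is one possible fix, not the project's actual patch. It tests the truthiness of the returned chunk instead and sleeps when nothing new was read. The poll_interval parameter is an assumption added for illustration, and this version drops the "yield after 5 reads" behaviour, yielding only complete lines.

```python
import time
from typing import Iterator, TextIO


def follow(file: TextIO, poll_interval: float = 0.1) -> Iterator[str]:
    """Yield newline-terminated lines as they are appended to an open file."""
    line = ""
    while True:
        chunk = file.readline()
        if chunk:
            # Got data; it may be a partial line if the writer has not
            # finished the line yet, so accumulate until a newline arrives.
            line += chunk
            if line.endswith("\n"):
                yield line
                line = ""
        else:
            # readline() returned "" (end of file for now): sleep instead of
            # spinning, which is what keeps CPU usage low.
            time.sleep(poll_interval)
```

With this shape the loop wakes up roughly ten times a second while the log is idle, instead of calling readline() continuously.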

Missing rsync install for test user in Dockerfile

When heiko is run with the test user using the docker-networks.py file, logs show an error saying that rsync is not found.

This should be pretty straightforward to fix:

RUN echo yabe | sudo -S apt-get install rsync -y

Eliminate use of binaries to get node info

Hey, so I was trying to understand how heiko is getting node information like CPU and RAM usage and things like that.

I noticed that in a lot of places it makes use of binaries such as lscpu and free. That's a really convenient way to do it, but I recently found out that in quite a few cases binaries like lscpu aren't installed by default, which might cause an issue. Another possible improvement: since heiko is going to check these details multiple times, it has to load the binary into memory at least once. Instead, you could read the info directly from the sources that binaries like lscpu and free themselves use.

So for example, for lscpu, the info you need can be read from /proc/cpuinfo and /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq

For free you can get the info from /proc/meminfo
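
As a rough illustration of the idea (not heiko's code), the contents of /proc/meminfo can be parsed with a few lines of Python. The function names parse_meminfo and read_meminfo are made up for this sketch. The parser takes plain text, so it does not matter how the text was fetched from the node, for example by reading the file over an existing SSH connection.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported by the kernel in kB; a few (such as the
    HugePages_* counters) are plain counts with no unit.
    """
    info: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path: str = "/proc/meminfo") -> Dict[str, int]:
    """Read and parse the local meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On a Linux machine, read_meminfo()["MemTotal"] and read_meminfo()["MemAvailable"] give the kind of totals that free prints, without spawning a process.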

Simplify the interface of NodeDetails by making implementation private

The Problem

If you look into utils/load.py, specifically the NodeDetails class, you'll notice that all of the class functions are public but the only function that is actually useful to a user of the class is getDetails. Similarly, some of the class members (ram_pattern and load_pattern) have no use outside of the class, but they are made public.

In Python, by default, all members and functions are public. Variables and functions can be made "private" by prefixing the name with an underscore. (Note: nothing is truly private in Python. These variables can still be accessed outside the class, and the responsibility of using them in the expected way is on the user. This is different from languages like Java or C++, where private variables are truly private.)

Why do you need this?

Let's see this with a short story. You, a (first time) contributor to heiko, are working on a new scheduler for heiko (you don't need to know what that means atm, but if you're interested, you can find a basic scheduler for heiko implemented here). You need to get details of a node to make some decision (say, to decide which node to run a process on). You make an instance of the class NodeDetails:

node_deets = NodeDetails(node)

Now, to get the actual details from it, you type node_deets. and wait for your editor's code completer to suggest the methods of the class. You see this:
[image: the editor's completion list for node_deets]

For now you only need the CPU details, so you use getCpuDetails and your editor nicely shows the documentation of it:
[image: the editor's documentation popup for getCpuDetails]

So now you need to make an asyncssh connection object, whatever that is. Also, it only returns the output of a command which, in this case, happens to be quite big (try running lscpu -J if you have a Linux system and see for yourself). Maybe you notice the parseCpuInfo function, which gives only what you need. Maybe you don't notice it and write your own parsing function (yikes, now if something is changed inside the getCpuDetails function, you'll have to change this too!). All of this for something that should have been abstracted away in the class. This is exactly what is done in the getDetails function, which hides the asyncssh connection object, the calls to the other functions, etc. Note that you are not wrong in this case: there were too many functions and variables to go through. It was not easy to do the right thing. Or in other words, it was easy to do the wrong thing.

getCpuDetails and parseCpuInfo are internal functions used by getDetails. Here, getDetails is the only interface function and the rest are implementation details. Everything outside the class should only use the interface, and it is the responsibility of the writer of the class not to change this interface (for example, by adding more parameters or changing the return type). The implementation, on the other hand, can change without affecting the interface. (here is a short example of implementation vs interface in C++).

I also highly recommend watching this amazing talk on designing interfaces - The Most Important Design Guideline by Scott Meyers (don't be fooled by the "lecture" in the title, the talk is quite interesting and you might find it relatable too).

The Solution

Phew, that was a large wall of text. All of this just to say:

  • Rename all of the implementation functions and members to prefix them with an underscore.
    • ram_pattern to _ram_pattern
    • load_pattern to _load_pattern
    • getNodeRam to _getNodeRam
    • getCpuUsage ....you get the idea
    • getCpuDetails
    • parserRam
    • parseLoad
    • parseCpuInfo
  • Rename in all the places these functions and members are used - all of this should be inside the class!
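
To make the renaming concrete, here is a heavily trimmed, hypothetical stand-in for the class. It is not the real code from utils/load.py: the regular expression, the canned command output, the _parseRam helper and the synchronous getDetails are all assumptions made so the sketch runs on its own. Only the naming convention is the point.

```python
import re


class NodeDetails:
    """Hypothetical, trimmed-down stand-in for the class in utils/load.py."""

    # Leading underscore: implementation detail, not part of the interface.
    _ram_pattern = re.compile(r"MemTotal:\s+(\d+)")

    def __init__(self, node):
        self.node = node

    def _getNodeRam(self) -> str:
        # Placeholder for running a command on the node; returns canned text.
        return "MemTotal:       16384 kB"

    def _parseRam(self, output: str) -> int:
        # Extract the number from the command output, 0 if it is missing.
        match = self._ram_pattern.search(output)
        return int(match.group(1)) if match else 0

    def getDetails(self) -> dict:
        """The only method callers are expected to use."""
        return {"ram_kb": self._parseRam(self._getNodeRam())}


# Names that do not start with an underscore, i.e. what a user of the class
# is meant to see and what many completion tools list first.
public = [name for name in dir(NodeDetails("node-1")) if not name.startswith("_")]
```

After the rename, the public surface of the sketch is just getDetails and the node attribute; everything else is still reachable, but its name signals that it may change.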

[FEATURE REQ] check if PID actually exists when reading pidfile

Description

When the master unexpectedly shuts off (a power cut, for example), the pidfile is still there after it is rebooted, which leads heiko to think that the daemon is already running (which it isn't).

Possible Fix

heiko/heiko/daemon.py

Lines 71 to 78 in 4874dce

@property
def pid(self):
    try:
        with open(self.pidfile, "r") as pf:
            pid = int(pf.read().strip())
    except IOError:
        pid = None
    return pid

Here, after reading the PID, it can be checked if a process with that PID exists using psutil (or any of the alternative methods to check if a PID exists). PID can be set to None if the process does not exist.
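
A minimal sketch of that check, using one of the alternative methods rather than psutil: on POSIX systems, sending signal 0 with os.kill performs only the existence and permission checks and delivers no signal. The names pid_exists and read_live_pid are made up for this example, and treating a non-numeric pidfile as "no daemon" is an extra assumption. psutil.pid_exists(pid) could replace the helper if the psutil dependency is preferred.

```python
import os
from typing import Optional


def pid_exists(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is sent
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def read_live_pid(pidfile: str) -> Optional[int]:
    """Read a PID from pidfile; return None if missing, invalid or stale."""
    try:
        with open(pidfile, "r") as pf:
            pid = int(pf.read().strip())
    except (IOError, ValueError):
        return None
    return pid if pid_exists(pid) else None
```

One caveat with any such check: after a reboot the stale PID may by now belong to an unrelated process, so a stricter version would also compare the process name or start time.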

Make the docker-networks.py script run as user instead of root.

As of now, the script spawns containers that run as root by default. heiko's target devices, on the other hand, don't have access to any level of elevated permissions. It would be nice to make it run as a regular user, so that simulating and perhaps testing is a lot more realistic.

Missing required dependency in `setup.py`

The dependency psutil is missing from the list of required installs in setup.py, causing it to throw an exception when heiko is installed/built with pip install . or pip install heiko.
