plajjan / bgworker



License: MIT License

Python 89.14% Makefile 10.86%

bgworker's Introduction

Hi there 👋

🔭 I’m currently working on building a distributed programming language, Acton, besides also working on network automation stuff at Deutsche Telekom and Cisco.
🌱 I’m currently learning how to build a distributed programming language ;)
👯 I’m looking to collaborate on building a distributed programming language. You ever wanted a fast high level language that natively and seamlessly works across multiple computers offering fault tolerance and immortal applications?
🤔 I’m looking for help with writing the Acton standard library! Unhappy with the batteries in Python? Dislike the Go choice of stdlib? Ever dreamt of Write an Acton library today! :)
💬 Ask me about, or strike up a conversation on, how we should all develop software natively for cloud environments using next-generation distributed programming languages. I'm also quite proficient in YANG, Acton, network management and automation, and distributed programming languages with orthogonal persistence.
📫 How to reach me: X/plajjan, kristian circly-a spritelink d0t net

bgworker's People

Contributors

Stargazers

Watchers

Forkers

jabelk jinlongwukong

bgworker's Issues

Actually react to HA events

Silly me: I implemented a thread that listens to the notification API in NCS and subscribes to HA events, and we put those events on the main supervisor queue, but the supervisor doesn't actually react to them. FIX IT!!!

Do efficient monitoring of child process

Today the supervisor thread waits on its main queue with a timeout of 1 second, after which it checks that the child process is alive and then goes back to reading the queue, repeating ad infinitum.

We try to avoid busy waiting everywhere we can. Once a second isn't particularly busy but still, I want it to go away.

Instead of calling .is_alive() on the child process, we could detect that it is dead through some other means, something that is waitable. What is waitable?

We could open a pipe to the child. If the child dies, the pipe goes away which is something select() should react to.

Or we can maybe have an alarm signal handler?
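A minimal sketch of the pipe idea, not the project's actual code. It assumes Python 3 and the POSIX-only "fork" start method; child_main, start_child and child_has_exited are hypothetical names. The child simply holds the write end of a pipe open, so the read end reports EOF (and thus becomes ready) when the child goes away.

```python
import multiprocessing as mp
import time
from multiprocessing.connection import wait


def child_main(lifeline, run_for):
    # Hypothetical worker. It never writes to `lifeline`; merely holding the
    # write end open is what tells the parent it is still alive.
    time.sleep(run_for)


def start_child(run_for=0.5):
    # Assumption: the "fork" start method (POSIX only) keeps the sketch short.
    ctx = mp.get_context("fork")
    reader, writer = ctx.Pipe(duplex=False)
    child = ctx.Process(target=child_main, args=(writer, run_for))
    child.start()
    # The parent must close its own copy of the write end, otherwise the
    # pipe never reports EOF when the child exits.
    writer.close()
    return child, reader


def child_has_exited(reader, timeout=None):
    # Blocks without polling; the read end becomes ready on EOF.
    return bool(wait([reader], timeout))
```

On Python 3.3 and later, Process.sentinel is documented as a handle that becomes ready when the process ends, so passing child.sentinel to multiprocessing.connection.wait() may remove the need for the extra pipe altogether.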

Honour python-vm specific logging level

The log config listener I implemented only subscribes to the global logging level config, not the one specific to each python-vm. Fix!

Reduce supervisor threads

We have multiple threads for doing various things. The key to implementing all of this efficiently is to find a way to listen for an event in a waitable fashion. We don't want busy looping for reacting to events. Each thread listening to one type of event must also be able to listen for a 'stop' request so we can quickly stop all threads. Thus the propagation of the stop message and the primary event we are monitoring must happen through the same waitable concept.

The main supervisor thread is using a queue, so it can efficiently block and wait for messages on the queue. The stop method of the supervisor thread sends an 'exit' message on the queue and the other things the supervisor is reacting to, like configuration events or the HA events, all come in on the queue. We only need to listen to this one queue and can thus do that in an efficient way.
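The single-queue pattern described above can be sketched roughly like this. Only the 'exit' message comes from the description; the other message kinds, the (kind, payload) tuple format and the handler table are assumptions made for illustration.

```python
import queue


def supervisor_loop(q, handlers):
    """Block on one queue and dispatch messages until 'exit' arrives."""
    handled = []
    while True:
        kind, payload = q.get()  # blocks without a timeout, so no polling
        if kind == "exit":
            return handled
        handler = handlers.get(kind)
        if handler is not None:
            handler(payload)
            handled.append(kind)


def demo():
    # Hypothetical state and message names, for illustration only.
    state = {"enabled": False, "ha_mode": None}
    handlers = {
        "enabled": lambda value: state.update(enabled=value),
        "ha-event": lambda mode: state.update(ha_mode=mode),
    }
    q = queue.Queue()
    q.put(("enabled", True))
    q.put(("ha-event", "master"))
    q.put(("exit", None))
    return supervisor_loop(q, handlers), state
```

Because stop requests and regular events arrive on the same queue, a single blocking get() is enough to react to either.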

The HA listener is listening on a socket, which isn't waitable in the same way as the queue. We can however select() on this socket, and I added the WaitableEvent class, which does (roughly) the same thing as a threading semaphore but is implemented in a way (exposing a file descriptor) that lets us use select() to wait on both of them.
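For illustration, a pipe-backed event of this kind might look like the sketch below. This is not the WaitableEvent from background_process.py, just a minimal version of the same idea; the method names and the wait_for_data_or_stop helper are assumptions.

```python
import os
import select
import socket


class WaitableEvent:
    """Event-like flag backed by a pipe so it can be passed to select()."""

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()

    def fileno(self):
        # select() accepts any object that has a fileno() method.
        return self._read_fd

    def is_set(self):
        readable, _, _ = select.select([self._read_fd], [], [], 0)
        return bool(readable)

    def set(self):
        if not self.is_set():
            os.write(self._write_fd, b"1")  # bytes, as Python 3 requires

    def clear(self):
        if self.is_set():
            os.read(self._read_fd, 1)

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)


def wait_for_data_or_stop(sock, stop_event, timeout=None):
    """Return 'stop', 'data' or None (timeout); 'stop' wins if both fire."""
    readable, _, _ = select.select([sock, stop_event], [], [], timeout)
    if stop_event in readable:
        return "stop"
    if sock in readable:
        return "data"
    return None
```

A listener thread would call wait_for_data_or_stop() in its loop, and the thread's stop() method would call set() on the event to wake it up immediately.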

I recently learned that multiprocessing.Queue is select()able by digging out the underlying transport and selecting on that. We could thus possibly reduce the number of threads by putting more things on the same select loop.
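A rough sketch of what that could look like on POSIX. Note that q._reader is a private CPython implementation detail rather than a documented API, so this is an assumption that may break between versions; stop_fd stands in for the read end of whatever stop mechanism is used.

```python
import multiprocessing as mp
import os
import select


def wait_for_queue_or_stop(q, stop_fd, timeout=None):
    """Return 'stop', 'message' or None (timeout)."""
    # q._reader is the Connection the queue reads from (private attribute).
    readable, _, _ = select.select([q._reader, stop_fd], [], [], timeout)
    if stop_fd in readable:
        return "stop"
    if q._reader in readable:
        # Data is waiting in the pipe; the caller should now call q.get().
        return "message"
    return None
```

The queue's put() hands data to a feeder thread, so the reader only turns readable once that thread has written to the pipe; the select() call simply blocks until then.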

Probably worth a try since it could reduce the amount of code we have.

py3 compatibility issue for home-grown WaitableEvent

Pretty sure this is due to py3isms. I wrote the code on py2.

ERROR> 03-Jul-2019::16:24:28.54 bgworker ComponentThread:main: - Traceback (most recent call last):
  File "/home/kll/ncs-4.7.4.2/src/ncs/pyapi/ncs_pyvm/ncsthreads.py", line 173, in run
    self.main._run()
  File "/home/kll/ncs-4.7.4.2/src/ncs/pyapi/ncs/application.py", line 218, in _run
    self.teardown()
  File "/home/kll/ncs-4.7.4.2/ncs-run/state/packages-in-use/1/bgworker/python/bgworker/main.py", line 35, in teardown
    self.p.stop()
  File "/home/kll/ncs-4.7.4.2/ncs-run/state/packages-in-use/1/bgworker/python/bgworker/background_process.py", line 126, in stop
    self.ha_event_listener.stop()
  File "/home/kll/ncs-4.7.4.2/ncs-run/state/packages-in-use/1/bgworker/python/bgworker/background_process.py", line 235, in stop
    self.exit_flag.set()
  File "/home/kll/ncs-4.7.4.2/ncs-run/state/packages-in-use/1/bgworker/python/bgworker/background_process.py", line 263, in set
    os.write(self._write_fd, '1')
TypeError: a bytes-like object is required, not 'str'
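The TypeError above comes from passing a str to os.write(), which on Python 3 only accepts bytes-like objects. A likely fix, sketched here with a hypothetical wake() helper rather than the actual set() method, is to write a bytes literal, which also works on Python 2.

```python
import os


def wake(write_fd):
    # On Python 3, os.write() requires a bytes-like object; passing the
    # str '1' raises the TypeError shown in the traceback.
    return os.write(write_fd, b"1")
```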

React to HA enabled/disabled

While we react to HA mode changes like becoming master, HA as a whole can be enabled or disabled, and we only read this once on startup. Should we not listen to this too?

Add helper for handling configuration of background worker process?

So I believe when writing a background worker, it's rather likely there are some configuration options for it. Naturally we want to describe those in a YANG model and have them show up in CDB. Naturally we want the background worker to react to them immediately (requires a CDB subscriber). Naturally we don't want to bore the developer of a background worker with the mundane details of implementing such a subscriber.

background_process already listens to CDB changes, first and foremost to the enabled leaf for the background worker, which controls whether the background worker child process should run or not. This value, or a change to it, is consumed by the supervisor, which kills the child process if it should be stopped or starts it if it should be running. We don't pass it as configuration to the background worker child process. Other configuration options, which actually affect the behaviour of the worker process, obviously need to be passed. We already have a second CDB subscriber which listens to changes to the python-vm logging level and notifies a thread in the child process that then sets the appropriate logging level. While the log level configuration is passed to the child process, it is passed to a thread that we injected rather than to the bg function that the developer of the background worker implements.

How should we go about this?

Alternative 1: Leave it to the bg function implementer

We do nothing and let the implementer of the bg function set up their own CDB subscriber.

Alternative 2: Send over queue to bg function

We implement a queue, and the supervisor process accepts input parameters specifying which configuration paths are interesting. The supervisor will create a CDB subscriber that subscribes to those config paths and then send any updates across the queue, which should then be emptied by the bg function.

This might impose limitations on how a bg function must be written, like we might force the use of a while loop around this queue. Maybe the bg function implementer wants their freedom?

It should be noted that multiprocessing queues can be select()ed upon, which gives the implementer more options for avoiding busy waiting.
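To make Alternative 2 concrete, here is one possible shape for a bg function that drains such a queue. It is only a sketch: the (path, value) message format, the "stop" marker and the tick counter standing in for real work are all assumptions.

```python
import multiprocessing as mp
import queue


def bg_function(config_q, initial_period=0.05):
    """Do periodic work, applying config updates as they arrive."""
    period = initial_period
    ticks = 0
    while True:
        try:
            # Blocking get with a timeout doubles as the sleep between
            # runs, so there is no busy waiting.
            path, value = config_q.get(timeout=period)
        except queue.Empty:
            ticks += 1  # placeholder for the actual periodic work
            continue
        if path == "stop":  # hypothetical stop marker
            return period, ticks
        if path == "/bgworker/period":
            period = value
```

This does force the while loop around the queue that the text above worries about; an implementer who wants more freedom could instead select() on the queue alongside other file descriptors.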

Alternative 3: background_process injects config listener thread and magically updates values

As per above, the supervisor establishes a queue and CDB subscriber, sending updates over the queue but the handling of those updates in the child process is done by a thread that background_process injects. What the bg function developer defines is a mapping between YANG paths and Python variables, something like:

config_map = {
    '/bgworker/period': cfg.period,
    '/bgworker/foo': cfg.foo
}

So we have a cfg object that is being updated by the config listener thread and we can read it from the main background worker function. Variable updates are atomic, thus threadsafe... I think, except for when the underlying type is 64 bit!? So let's not have that, heh. Not sure how we could have locks here. Maybe the cfg object could be magical, because we don't want the bg function developer to think about locks.
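One way the "magical" cfg object could hide its locking is sketched below. Everything here is hypothetical: the Config class, the config_listener function, and the choice to map YANG paths to attribute names (strings) instead of to cfg.period directly, since the latter would capture the current value rather than a reference. As far as I know, a plain attribute rebind in CPython is already atomic under the GIL regardless of the value's size, so the lock mostly shows where locking could live if several values ever need to change together.

```python
import queue
import threading


class Config:
    """Attribute-style config holder that takes a lock on every access."""

    def __init__(self, **defaults):
        # Bypass our own __setattr__ for the two internal attributes.
        object.__setattr__(self, "_lock", threading.Lock())
        object.__setattr__(self, "_values", dict(defaults))

    def __getattr__(self, name):
        # Only called for names not found normally, i.e. config values.
        with self._lock:
            try:
                return self._values[name]
            except KeyError:
                raise AttributeError(name) from None

    def __setattr__(self, name, value):
        with self._lock:
            self._values[name] = value


def config_listener(cfg, config_map, updates):
    """Injected thread body: apply (path, value) updates until None."""
    while True:
        item = updates.get()
        if item is None:  # hypothetical stop marker
            return
        path, value = item
        name = config_map.get(path)
        if name is not None:
            setattr(cfg, name, value)
```

The bg function would then just read cfg.period wherever it needs the current value, without seeing the lock.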

Thoughts? @mzagozen

Reduce CDB subscribers

Not sure what the overhead of a CDB subscriber is. I imagine the cost depends more on how many paths we subscribe to than on how many separate subscribers there are.

Now we use two, one for the background worker enabled config leaf and another for python-vm logging level. They could be combined, but is it better?