bluesky-pods's Introduction

Pods for bluesky(-adaptive)

This repository contains docker-compose yaml and Containerfile files that stand up a pod attempting to mimic the full beamline / remote compute model (as we want to run at NSLS-II). The intent is to provide a realistic environment for local development.

Run the pod

cd compose/acq-pod
podman-compose --in-pod true up

To get a bluesky terminal in this pod run

bash launch_bluesky.sh

From inside bsui, a set of simulated devices is available, as are databases and tiled servers. Data can be accessed via databroker, a tiled profile, or a remote tiled server.

RE(scan([det], motor, -10,10,10)) # will produce a liveplot and data
db[1] # or db[uid] can be used to access the data
from tiled.client import from_profile, from_uri
c = from_profile("MAD")
c[1] # or c[uid]
c = from_uri("http://tld:8000/")
c[1] # or c[uid]

Setting the environment variable BLUESKY_PROFILE_DIR to an IPython profile directory allows you to use a custom profile in the launch_bluesky.sh script or in the Queue Server container. In both cases, the RunEngine (RE), databroker, and Kafka subscriptions must be initialized in the startup profile.
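For example, to point launch_bluesky.sh at a custom profile (the path below is hypothetical; the profile must do the RE/databroker/Kafka setup described above):

export BLUESKY_PROFILE_DIR=$HOME/my-beamline-profile
bash launch_bluesky.sh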

To get a default QT gui for the queue server run

bash launch_bluesky.sh bluesky queue-monitor

On a Mac, XQuartz is required to display the output of the Best Effort Callback.

There are a jupyterlab instance, a tiled instance, and a Queueserver HTTP API instance running in the pod, all proxied via nginx. If the pod is running, http://localhost:11973 will provide links to each.
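A quick check from the host that the proxy is answering (assuming curl is available):

curl -s http://localhost:11973/ | head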

Terms

  • image : The binary blob that can be run as a container
  • container : A running image. You can have many containers running the same image simultaneously. As part of starting the container you can pass in environment variables and mount directories from the host into the container (read-only or read/write)
  • pod : A collection of running containers that share a conceptual local network. When the pod is created you can control which ports are visible to the host machine. When using podman-compose, the other containers can be accessed via DNS by their names (a rough illustration follows this list).
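A rough illustration of these terms with plain podman (the pod name, image names, and port below are made up for the example and are not taken from this repo's compose files):

# create a pod; only port 8080 on the host is forwarded to port 80 inside the pod
podman pod create --name demo-pod -p 8080:80
# start two containers from existing images inside the pod; they share the
# pod's network namespace, so they can talk to each other without exposing more ports
podman run -d --pod demo-pod --name web docker.io/library/nginx
podman run -d --pod demo-pod --name sidecar -e GREETING=hello docker.io/library/alpine sleep 3600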

Contents

Get podman

Podman and buildah are packaged for many Linux distributions; refer to the official installation guide for instructions on installing podman, and install buildah in exactly the same fashion.

You will also need podman-compose.

Enable "rootless" usage

Unlike Docker, podman and buildah can be used without elevated privileges (i.e. without root or a docker group). Podman only needs access to a range of uids and gids to run processes in the container as a range of different "users". Enable that like so:

sudo usermod --add-subuids 200000-201000 --add-subgids 200000-201000 $USER
podman system migrate

For additional details and troubleshooting, see the rootless tutorial.
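To double-check that the ranges were registered, something like the following should list the new entries:

grep $USER /etc/subuid /etc/subgid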

Configure for display over SSH

If the machine where you will be running podman is one you are connected to via SSH, then you will need to configure the SSH daemon to accept X11 connections routed through podman, specifically connections to the machine's IP address rather than to localhost.

Add this line to /etc/ssh/sshd_config.

X11UseLocalhost no
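The SSH daemon has to re-read its configuration before this takes effect; on most systemd-based distributions that is something like the following (the service may be named ssh rather than sshd on Debian/Ubuntu):

sudo systemctl restart sshd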

If podman is running on the machine you are sitting in front of, or if you would like to run in "headless" mode, no action is required.

Repository Contents

Other examples

bluesky-pods's People

Contributors

cjtitus, cryos, danielballan, dmgav, gwbischof, jacobfilik, junaishima, klauer, maffettone, stuartcampbell, tacaswell

bluesky-pods's Issues

Error when running launch_bluesky_headless.sh

I followed all of the instructions to build and launch everything; when running bash launch_bluesky_headless.sh I see the following error:

mhanwell@unobtanium ~/src/bluesky-pods (main) $ bash launch_bluesky_headless.sh
+ '[' '' '!=' '' ']'
+ imagename=bluesky
++ pwd
+ podman run --pod acquisition -ti --rm -v /home/mhanwell/src/bluesky-pods:/app -w /app -v ./bluesky_config/ipython:/usr/local/share/ipython -v ./bluesky_config/databroker:/usr/local/share/intake -v ./bluesky_config/happi:/usr/local/share/happi -e EPICS_CA_ADDR_LIST=10.0.2.255 -e EPICS_CA_AUTO_ADDR_LIST=no bluesky ipython3 --ipython-dir=/usr/local/share/ipython
Python 3.8.5 (default, Aug 12 2020, 00:00:00) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.
[TerminalIPythonApp] WARNING | Unknown error in handling startup files:
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/usr/lib/python3.8/site-packages/IPython/core/shellapp.py in _exec_file(self, fname, shell_futures)
    335                     else:
    336                         # default to python, even without extension
--> 337                         self.shell.safe_execfile(full_filename,
    338                                                  self.shell.user_ns,
    339                                                  shell_futures=shell_futures,

/usr/lib/python3.8/site-packages/IPython/core/interactiveshell.py in safe_execfile(self, fname, exit_ignore, raise_exceptions, shell_futures, *where)
   2718             try:
   2719                 glob, loc = (where + (None, ))[:2]
-> 2720                 py3compat.execfile(
   2721                     fname, glob, loc,
   2722                     self.compile if shell_futures else None)

/usr/lib/python3.8/site-packages/IPython/utils/py3compat.py in execfile(fname, glob, loc, compiler)
    166     with open(fname, 'rb') as f:
    167         compiler = compiler or compile
--> 168         exec(compiler(f.read(), fname, 'exec'), glob, loc)
    169 
    170 # Refactor print statements in doctests.

/usr/local/share/ipython/profile_default/startup/00-base.py in <module>
     19 from bluesky_adaptive.per_start import adaptive_plan
     20 
---> 21 from bluesky_queueserver.plan import configure_plan
     22 
     23 import databroker

ModuleNotFoundError: No module named 'bluesky_queueserver.plan'

These pods are too big and should be split up

This is a longer-term concern and is not on the critical path for our MVP.

Just to record some thoughts from separate conversations I've had with @stuartcampbell and @tacaswell:

We currently have a couple of very large pods. This meets our needs at present and works quite well, so I'm in no great hurry to change it, but I think we should consider restructuring it in the future.

  1. We like to use tools the way they are meant to be used. We are abusing the pod abstraction here by stuffing so many services into one pod. To zeroth order, a pod should have one container. To first order, pods can have a container and additional containers running support services whose data stays local to the pod (i.e. worker processes, an nginx proxy). Large services like MongoDB or Kafka whose data is of interest to multiple other services should in general get their own pods. Some external validation on that opinion:

    The primary purpose of a multi-container Pod is to support co-located, co-managed helper processes for a primary application.

    Source: https://www.mirantis.com/blog/multi-container-pods-and-container-communication-in-kubernetes/

    You might say our current structure is using "pod" to mean "private network" rather than "pod".

  2. When pods are small, restarting one is less disruptive to the rest of the system.

  3. If we ever want to build something comparable in production (which is still very much an open question) we will definitely not want large pods because we'll want the large services on dedicated nodes. To maintain correspondence between dev and production, we should use a more idiomatic pod structure.

One possible pod grouping (a rough command-level sketch follows the list):

  • Beamline Pod: {IOCs + queue server parts + local redis}
  • Message Broker Pod: {zookeeper + kafka}
  • Database Pod: {mongo + mongoconsumer}
  • Data Broker Server Pod: {databroker server + nginx}
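As a very rough command-level sketch of that grouping (pod names and published ports are illustrative assumptions, not values taken from this repo):

podman pod create --name beamline                        # IOCs + queue server parts + local redis
podman pod create --name message-broker -p 9092:9092     # zookeeper + kafka
podman pod create --name database                        # mongo + mongoconsumer, not published to the host
podman pod create --name databroker-server -p 8000:8000  # databroker server behind nginx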

Pull low-level steps into bash scripts separate from buildah

Notes on a conversation with @MikeHart85:

It would be nice if these components were usable as:

  1. Bash scripts that you can run directly inside a VM, such as a CI VM, or even on a local machine if you want.
  2. Containers
  3. Pods

In particular, encoding the low-level steps as

buildah run ...
buildah run ...
buildah run ...

does not seem to add anything compared to a single

buildah run ... bash_script_with_the_actual_steps.sh
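For example, a hypothetical build script (the script name is invented for illustration) could be copied in and executed in one step, leaving the same script usable directly on a VM or CI machine:

ctr=$(buildah from fedora)
buildah copy "$ctr" scripts/install_bluesky_stack.sh /tmp/
buildah run "$ctr" -- bash /tmp/install_bluesky_stack.sh
buildah commit "$ctr" bluesky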

Mount git repositories to enable development using pods

In previous work I have used docker-compose with mounted git repositories to facilitate development, and I imagine a similar workflow could be achieved with podman. In the development of either the bluesky or databroker stacks I imagine the need for:

  • Server running in one container, with reload enabled for development
  • Client running in another container with reload enabled for development
  • A proxy server (probably NGINX) to combine the server and client behind a single port

There would be a (probably default) production version where it would not enable reload, and would simply build the web application into an optimized static bundle using everything within the container/image.
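A minimal sketch of the server half with plain podman, reusing the acquisition pod and bluesky image from this repo (the checkout path and the reload-capable entry point are hypothetical placeholders):

podman run --pod acquisition --rm -ti \
    -v $HOME/src/my-server:/src/my-server \
    bluesky \
    bash -c "pip install -e /src/my-server && uvicorn my_server.app:app --reload --host 0.0.0.0"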

Documentation should detail contents more clearly

We should at least add a table to the readme of what containers are expected to be spun up, how they interact with each other, and what images they depend on. I can imagine the majority of applications will just execute the bash script and poke around, but a few may want to do some repurposing or extension. Even walking through the compose file, it took a bit of time for me to understand what was getting spun up, and from where.

Split up the compose file and add variations

There appear to be include (https://docs.docker.com/compose/multiple-compose-files/include/) and extends (https://docs.docker.com/compose/multiple-compose-files/extends/) directives in the vocabulary, and a way to merge (https://docs.docker.com/compose/multiple-compose-files/merge/) compose files when invoking up (or automagically by putting semantics in filenames).

This needs a bit of investigation, but my current thinking is that we want to use include so that we can have shared compose files for:

  • core data services (mongo, postgres, kafka, tiled, jlab)
  • EPICS services (archiver, saverestore, ...)
  • set(s) of IOCs
  • bluesky / qs / CSS configuration

So that we can have a couple of variations {just ophyd.sim, a bunch of mock caproto IOCs, ADsim + motorsim, beamline-analog, blackhole-IOC, ...} without having to copy-paste a lot of yaml.
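As a sketch of the merge-on-invocation variant (file names are invented here, and this assumes podman-compose merges multiple -f files the way docker compose does), a "simulated devices only" variation might look like:

podman-compose -f core-data.yaml -f epics-services.yaml -f iocs-sim.yaml --in-pod true up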
