elektrobit / flake-pilot
Registration/Control utility for applications launched through a runtime engine, e.g. containers
License: MIT License
oci-pilot and oci-register should provide man pages
A registration attempt of the form
oci-ctl register --app /usr/bin/python --container foo
will fail with an error message if /usr/bin/python already exists on the system. This is good behavior, but we should allow an option that force-deletes the existing file and allows registration of a containerized version.
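One possible shape for this (a sketch; the --force flag is hypothetical and not an existing option):
oci-ctl register --app /usr/bin/python --container foo --force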
On slower hardware the maximum number of retries in firecracker resume mode is not sufficient and should be increased.
resume instance testing on low-performance machines (e.g. Raspberry Pi) succeeds
As of today one can call
flake-ctl firecracker pull --kis-image URI
However, a KIS image is a KIWI-specific image type that provides a tarball with initrd, kernel and rootfs. It would be good to also allow:
flake-ctl firecracker pull --rootfs URI --kernel URI [--initrd URI]
That would offer additional flexibility, and we could also fetch and test the kernels provided by firecracker.
NorthStar is, like podman, a container engine. It uses a different format for its containers and also implements different mechanics for starting and managing them. In order to support users who prefer NorthStar over podman, we would like to add support for this engine.
northstar-pilot
exists and is able to start up a NorthStar compatible container, run an application inside, and work comparably to the existing podman engine implementation
flake-ctl
As of today the user id to run the engine is specified in the flake setup as follows:
container:
runtime:
runas: root
There is no option to specify the user on the command line.
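One possible shape (a sketch; the --runas flag is hypothetical and mirrors the runas key above):
oci-ctl register --container foo --app /usr/bin/myapp --runas root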
When running a firecracker flake without firecracker installed you get
launching flake
before being sent back to your original shell, instead of a proper error message.
For the podman engine, podman-pilot supports provisioning of a container with data from a delta container, arbitrary include files, or additional layer containers. The same concept can be applied to the VM based firecracker engine. The provisioning will require overlay_size to be configured. If this is the case, all provision data can be synced into the overlay, which would implement for VMs the same concept that we use for containers.
I also suggest allowing containers to be used for provisioning the VM. This has the advantage that the same delta or layer containers can be used for running a container or a VM.
At the moment the metadata files that store information about running app instances live in /tmp/flakes. However, this location is a tmpfs and gets lost on reboot, power-cycle, etc. For non-resume instances this is not a problem, but for resume instances it prevents them from being correctly resumed. I suggest moving the location to the registry storage area, e.g.
/var/lib/containers/storage/tmp
flake metadata is persistently stored
Unlike read-only Canonical Snaps (squashfs), Flakes have two modes:
Read-only
Just like snaps. Data may be stored somewhere outside.
Read-write
Data may also be stored outside, but the Flake's state itself can be modified independently.
When a Flake has finished its runtime, the logical default is to remove it. This prevents pollution with useless instance images. This is ideal for software that runs in a container but looks like native software. So far, so good: same behaviour as with Snaps.
However, what should happen if a Flake is internally modified and the user wants that state persisted for a certain period of time? An example: a development environment for a different hardware architecture with Git code inside, or a deployed system?
Read-write Flakes are the exception rather than the mainstream, because the data can still be stored outside with no harm to the flow, and immutable is still better. However, some cases might require a read-write Flake, like a confined package build to ensure nothing wrong is linked into the process. Such a Flake can then be used for further debugging of a failing package, for example.
The expectation is that a user who is using Flakes (not podman directly!) is automatically helped to clean up the mess left behind by dangling container instances.
Of course, manual cleanup is possible. But this path should not be required, because the main purpose of Flakes is to hide the OCI engine in the first place.
engine.yml is all "up to the user". That is OK as long as a packager knows what he is doing and is 100% sure he is not doing anything wrong by not adding --rm (remove after quit) to the options. Probably we need engine wrappers which always add cleanup by default, unless it is explicitly turned off in the same engine.yml configuration, like:
# Explicitly say the Flake is mutable. If omitted, "immutable" by default.
type: mutable

# The usual runtime config
runtime:
  podman:
    ...
For the podman engine, --rm should always be removed. If found in a config, a flake should quit with a mis-packaging error. Example behaviour of a Flake-delivered Python (mock-up):
$ python
Packaging error: Flake is mutable, but you ask to remove its instance on finish.
Either define it immutable or do not explicitly remove it.
Read-write instances bring the problem of "always at least one instance left" (for reuse). So the problem is not just a user launching one read-write Flake 1000 times, but also a user launching 1000 different read-write Flakes once each.
A possible solution to prevent bloating the system is to make the user look after his Flakes. For example, a config of the Pilot itself, say in /etc/oci-pilot.conf, with quota limits, something like:
# Limit instances per user
instances-per-user: 100

# Purge dangling instances that were not accessed for more than 60 days
instances-timeout: 60 d
In case the user wants the 101st, oci-pilot should abort the flake and suggest cleaning up the instances (with either podman directly or other ways). Practically no read-write Flake will be launch-performance critical. Therefore, every time any read-write (!) Flake is launched, it should also check for outdated instances and pre-purge them first, and only then launch the target Flake. On a very messy system this will impact performance from time to time, but that's OK.
Read-only Flakes, which are equivalent to Snaps, should skip this check to save performance.
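A minimal sketch of the pre-purge check, assuming instance metadata lives in a directory and "last access" can be approximated by file modification time; the function name and location are hypothetical:

use std::fs;
use std::path::PathBuf;
use std::time::{Duration, SystemTime};

// Collect instances that were not touched within the timeout.
// `dir` would be the instance metadata area, e.g. /tmp/flakes
fn outdated_instances(dir: &str, timeout_days: u64) -> std::io::Result<Vec<PathBuf>> {
    let limit = Duration::from_secs(timeout_days * 24 * 3600);
    let now = SystemTime::now();
    let mut stale = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let modified = entry.metadata()?.modified()?;
        if now.duration_since(modified).unwrap_or_default() > limit {
            stale.push(entry.path());
        }
    }
    Ok(stale)
}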
This is sort of podman ps already... The use-case of this would be very limited. Possible solutions include a flake option that operates on instances. In case of the flake option, each flake would have it wrapped before any other options, e.g.:
$ aarch64-dev-env flake --help
new=[NAME] Launch a new instance. Optionally specify a name.
resume=<NAME> Resume an existing instance explicitly by a precise name.
args ... Add arguments that are native to the application
Calling aarch64-dev-env alone will just create one instance of a read-write Flake. Quitting it and calling it again will re-attach to that instance.
In case the user wants another instance of this, then something like:
$ aarch64-dev-env flake new
However, this leaves the question of which flake to re-attach to on the next run. One way is to specify Podman's name (or ID):
$ aarch64-dev-env
Multiple Flake instances detected. Which one to re-attach?
(1) Stinky_Sniffer
(2) Bombastic_Bummer
Enter a number: _
Or this way:
$ aarch64-dev-env flake resume=Stinky_Sniffer
This can be used in scripts, since new can take a name, like:
$ aarch64-dev-env flake new=Junky_Jack
...and then resume=Junky_Jack later in the script.
A limitation of this approach is that when explicitly using the flake argument, a more complex arguments interface is needed, where e.g. args would separate the arguments native to the Flake. I.e. python -c 'print("Hi")' would be done like:
would be done like:
$ python flake new=Pimpy_Punk args -c 'print("Hi")'
The above would call a read-write'able Python Flake with the usual arguments in a new instance Pimpy_Punk, also visible under podman ps.
A command writes, and the data is transferred through the serial console and the VM layer to the caller. However, the call in the VM has already finished and the reboot sequence kicks in. The VM could be exited before all data has been transferred through the serial console.
How can we make sure /dev/console has no pending data prior to reboot?
There is currently no man page for /usr/bin/firecracker-service
man page exists
To handle the network between guest and host, firecracker offers the following documentation:
As of today firecracker supports networking through tap devices only. The virtio-net model supported by qemu is currently not supported. This comes with the issue that the host network must be prepared for firecracker VMs to work. firecracker will create/delete a tap device if the config lists:
"network-interfaces": [
{
"iface_id": "eth0",
"guest_mac": "AA:FC:00:00:00:01",
"host_dev_name": "tap0"
}
],
but it does not care about the actual connection and routing of data on the host. This means the address assignment, as well as any NAT or bridge configuration, remains the responsibility of the host's owner. For us and our use case this might be an issue, because we aim for a transparent process which does not require the user to manage the host network before networking is available in the guest. For the container based approach, podman and CNI automatically care for the network capabilities inside of the container. That level of comfort currently does not exist with firecracker.
I personally think it should not become a responsibility of the flake-pilot project to prepare the host for firecracker networking. So we should at least mention it in the documentation / man pages and adapt as firecracker evolves. An example of the manual setup is sketched below.
This is mostly for conversation and agreement on how far we go in terms of networking.
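For reference, the manual host preparation firecracker currently expects looks roughly like this (standard Linux tooling, not flake-pilot functionality; device names and addresses are assumptions):
ip tuntap add tap0 mode tap
ip addr add 172.16.0.1/24 dev tap0
ip link set tap0 up
# enable forwarding and NAT via the host's uplink (here assumed to be eth0)
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE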
At the moment the oci-registry tool exists as part of an example image description. The tool allows preparing the read-write part of a podman (OCI) registry below /var/lib/containers/storage. It also allows switching between different read-write storage areas and is handy in combination with flake-pilot. Thus, instead of maintaining the tool as part of an image description, it would fit into this project and could be delivered as a package.
/usr/bin/oci-registry is provided as part of a package
Along with the effort to support the firecracker engine, we need to implement the pilot launcher to actually run the firecracker VM and the app matching the app registration. The following features need to be implemented in the launcher; not all of them are required for the acceptance criteria:
/var/lib/firecracker/storage
create the firecracker.json file from registration data and caller arguments on each call
use the firecracker-service tool
/var/lib/firecracker/images/
Allow to create OCI layers on top of an existing OCI KIWI based image description.
Layers can be used in two ways:
@schaefi the helper/update_changelog.py seems copied from here:
https://github.com/OSInside/kiwi/commits/master/helper/update_changelog.py
However, the history of that file also doesn't look original to that project, so it is not clear whether the updater is GPL v3 or not. Nevertheless, since it comes from KIWI, which is under a different license, this tool's licence needs to be properly mentioned.
Could you please fix that?
Alternatively, since it is a tool on its own, you could change its licence to a permissive one (i.e. MIT or BSD) and allow more freedom, so it would be useful to a bigger audience (as you wish).
As of today the runtime options passed to podman can be added in the flake file as follows:
runtime:
podman:
- setting
- setting
- setting
There is no way to specify runtime settings on the command line during oci-ctl register ...
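One possible shape (a sketch; the --opt flag is hypothetical and not an existing oci-ctl option; one occurrence per setting):
oci-ctl register --container foo --app /usr/bin/foo --opt '--privileged' --opt '-ti'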
The implementation of the yaml writer in oci-ctl has room for improvement. The initial version, done by myself, could be implemented far more elegantly. With regard to the additional elements/sub-elements that will be written according to the information provided at app registration time, the yaml writer code should be refactored.
As of today the resume mode can be configured in the flake file directly via
runtime:
resume: true
There is no way to do this on the command line via oci-ctl register ...
Code is prepared for implementation of the actual register sub-command
On the first run, the delta-app is always auto-provisioned. That can cause an unexpected delay in app startup, and a panicking user vigorously hitting ^C^C^C^C... will mess up everything.
A solution would be the following:
If a Flake requires root access, it will instead do something like:
Error: short-name "elektrobit/amd64_sdk" did not resolve to an alias and no unqualified-search registries are defined in "/etc/containers/registries.conf"
Calling this with sudo ... solves the issue. Probably we have to do the following:
If the Flake was not called with sudo, it should call itself with sudo. This could be set as an option in a corresponding Flake YAML config.
If sudo failed (user denied), then gracefully tell the user: "sorry, you need to have sudo access, ask your admin".
Or do not use sudo at all and just "crash", gracefully wrapping the output, if there is no access.
If the container provides a default app registration, allow to read it with
flake-ctl podman register --container foo --info
This will look up foo.yaml inside of the foo container and allows providing a default registration for this container.
Right now the config in both pilots is parsed into a generic YAML node that is evaluated on the fly. This may cause destructive actions to be performed before the program terminates due to a faulty config.
By deserializing the config into a fixed struct, the pilot can quit on startup if the config is invalid. This will also prevent errors due to typos and make the config more discoverable for new contributors.
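A minimal sketch of the idea, assuming serde/serde_yaml; the struct layout follows the flake file examples in this document and is not the pilot's actual code:

use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct FlakeConfig {
    container: Option<Container>,
}

#[derive(Debug, Deserialize)]
struct Container {
    runtime: Option<Runtime>,
}

#[derive(Debug, Deserialize)]
struct Runtime {
    runas: Option<String>,
    resume: Option<bool>,
    attach: Option<bool>,
    podman: Option<Vec<String>>,
}

// Quits with a descriptive error on startup instead of failing mid-run
fn load_config(yaml: &str) -> Result<FlakeConfig, serde_yaml::Error> {
    serde_yaml::from_str(yaml)
}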
As we add more and more engines, we have to finalise the structure.
Since this is "Flakes", let's regroup to two categories:
That is, rename all helpers oci-ctl/-reg/-fufu into flake-ctl/-reg/-fufu, and rename the pilots into <engine>-pilot, which would "hardlink" itself to a runtime. For example, oci-pilot is implemented to run Podman, so it would make sense to have it actually as podman-pilot, explicitly showing which engine is behind it. In the future, there would be northstar-pilot, firecracker-pilot and so on.
Names would be longer, but:
Allow to build a flake as a Debian package, either from an existing OCI tar image or via kiwi from an image description.
Along with the effort to add support for the firecracker engine, an app registration for the upcoming firecracker-pilot is required. The app registration should look similar to flake-ctl podman register, e.g. the following call:
flake-ctl firecracker register --vm NAME --app APPNAME ... [--overlay-storage ... --mem-size ... --cache-type ... --cpu-count ... ]
/var/lib/firecracker/images/NAME
/usr/share/flakes/APPNAME.yaml
Currently oci-ctl can resolve dependencies between two containers/layers. For example, if I have a Python application that depends on a specific interpreter, say 3.10, I can build a delta container that contains only my bits and then use the interpreter from python-3-10-basesystem; oci-ctl handles the layers, lets me register "my-cmd", and makes it appear to the user as if "my-cmd" were installed on the host system, i.e. the containers are transparent.
There is another use-case where my application may also depend on something I know to be part of the host system; the package manager is a good example. It would be great if oci-ctl could also layer the proper parts of the host OS into the view of the file system for "my-cmd". In the flakes file this could be handled with additional directives such as:
hostsystem:
- zypper
oci-ctl could inspect the package database to determine which bits and pieces need to be visible to "my-cmd" to make this work.
The approach can be thought of in the way open linking works: I can link a C application such that the linker doesn't complain at link time when a specific library is not found; the resolution is deferred to run time.
Allow to include files as part of the container provisioning, e.g.
include:
tar:
- tar-archive-to-include
file:
- file/to/include
...and_so_on
tar
Specifying runtime options is done like this:
runtime:
podman:
--volume: /var/tmp:/var/tmp
--volume: /usr/share/kiwi/nautilos:/usr/share/kiwi/nautilos
--privileged:
--rm:
-ti:
However, this concept does not work if options of the same name can be specified multiple times, as is the case for podman's --volume option. The reason is that in the above concept option names are handled as hash keys and would overwrite each other if specified multiple times.
Thus this should be fixed such that runtime options can be set as list as follows:
runtime:
podman:
- --volume /var/tmp:/var/tmp
- --volume /usr/share/kiwi/nautilos:/usr/share/kiwi/nautilos
- --privileged
- --rm
- -ti
At the moment the podman engine is the only runtime we support. Assuming we want to support more engines, e.g. firecracker, in the future, the command line interface needs to be extended. One example:
oci-ctl register --container joe --app /usr/bin/joe --base basesystem
should also have an engine-specific caller variant:
oci-ctl podman register --container joe --app /usr/bin/joe --base basesystem
The existing caller semantics must not change for compatibility reasons and continue to default to the podman engine.
Code is prepared for implementation of the actual remove sub-command.
This is divided into two operation modes
Currently only the /removed file information from the last container (the app container) in the chain is taken into account. However, all eventually existing /removed files from all containers used to provision the instance need to be taken into account.
no existing /removed file gets ignored
In a flake setup that uses the resume feature, the container ID for this instance is looked up according to the exact command line of the application. Example:
oci-ctl register --container foo --app /usr/bin/aws --resume true
aws --help
The call of the app is connected to the container ID by its .cid file:
/tmp/flakes/aws--help.cid
Resuming the container instance currently only happens if the exact same command is called again. However, if the command line changes, e.g. aws ec2 help, a new instance is created.
It might be better to only store the base app name and resume this instance for any call of this application.
The .cid file name would then change to:
/tmp/flakes/aws.cid
aws ec2 help resumes the existing instance
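A minimal sketch of the proposed lookup, deriving the .cid path from the registered application name only; the helper is hypothetical:

use std::path::Path;

// Any invocation of the app maps to the same instance file,
// regardless of the call arguments
fn cid_file(app: &str) -> String {
    let base = Path::new(app)
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or(app);
    format!("/tmp/flakes/{}.cid", base)
}

// cid_file("/usr/bin/aws") == "/tmp/flakes/aws.cid"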
As of now, the file name passed to oci-ctl build-deb --oci ... must use the .docker.tar extension.
This is actually pretty misleading because we are dealing with OCI containers only. The extension should be .oci.tar instead.
As of today delta containers can be provisioned against one base container and optional host dependencies. However, creating deltas of deltas introduces additional layers that need to be taken into account for provisioning the container. A possible example could look like this:
oci-ctl register --container aws-cli --app /usr/bin/aws --base basesystem --layer basepython --layer ...
Multiple additional container layers can be used for provisioning a container instance
There is currently no man page for /usr/sbin/sci
man page exists
As we plan to use firecracker as the launcher for applications, the process to run commands involves the startup of the VM through firecracker and, next to it, the startup of the application. As of today the output of the command is captured through the console setting of the VM, console=ttyS0, which transfers all output to the caller. This works, but it also includes other messages not related to the actual command call. To keep the bootup of the VM as silent as possible we currently use the kernel options loglevel=0 quiet, which eliminates most of the unwanted messages, but surely not all of them, and it also depends on the kernel image used whether there are more unrelated messages. This opens the question of how we can separate the command output from the VM startup messages in a generic way.
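One generic approach would be to bracket the command output with unique markers printed by the guest-side launcher and filter on the host. This is a sketch of the idea, not current flake-pilot behaviour:

// Keep only the text between a start and an end marker that the
// guest prints around the actual command output
fn filter_output(console: &str, start: &str, end: &str) -> Option<String> {
    let begin = console.find(start)? + start.len();
    let stop = begin + console[begin..].find(end)?;
    Some(console[begin..stop].trim().to_string())
}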
The following code from firecracker-pilot/src/firecracker.rs can be improved as described in the fixme note:
pub fn get_exec_port() -> u32 {
/*!
Create random execution port
!*/
let mut random = rand::thread_rng();
// FIXME: A more stable version
// should check for already running socket connections
// and if the same number is used for an already running one
// another rand should be called
let exec_port = random.gen_range(49200..60000);
exec_port
}
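A sketch of what the fixme suggests, assuming the pilot can supply the set of ports already in use by running instances (that bookkeeping does not exist yet):

use std::collections::HashSet;
use rand::Rng;

pub fn get_exec_port(used_ports: &HashSet<u32>) -> u32 {
    let mut random = rand::thread_rng();
    loop {
        let exec_port = random.gen_range(49200..60000);
        // retry until we hit a port no running instance uses
        if !used_ports.contains(&exec_port) {
            return exec_port;
        }
    }
}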
sci is a statically compiled binary to be used with any rootfs. When using an app registration for firecracker in resume mode, sci calls socat in VSOCK-CONNECT mode to establish a connection, which creates a dependency on socat in the guest. We could prevent that by adding our own implementation that runs the command in the same way socat does; a sketch follows below.
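A minimal sketch of a direct AF_VSOCK connection using the libc crate, which would remove the socat dependency; CID and port handling are assumptions:

use std::os::unix::io::FromRawFd;
use std::os::unix::net::UnixStream;

fn vsock_connect(cid: u32, port: u32) -> std::io::Result<UnixStream> {
    unsafe {
        let fd = libc::socket(libc::AF_VSOCK, libc::SOCK_STREAM, 0);
        if fd < 0 {
            return Err(std::io::Error::last_os_error());
        }
        let mut addr: libc::sockaddr_vm = std::mem::zeroed();
        addr.svm_family = libc::AF_VSOCK as libc::sa_family_t;
        addr.svm_cid = cid;
        addr.svm_port = port;
        if libc::connect(
            fd,
            &addr as *const libc::sockaddr_vm as *const libc::sockaddr,
            std::mem::size_of::<libc::sockaddr_vm>() as libc::socklen_t,
        ) != 0 {
            libc::close(fd);
            return Err(std::io::Error::last_os_error());
        }
        // the raw fd is a stream socket, so UnixStream provides Read/Write
        Ok(UnixStream::from_raw_fd(fd))
    }
}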
To allow different paths for the application on the host and inside of the container, the symlink structure in oci-register has been changed. This change must be adapted in oci-pilot once oci-register has merged the commit implementing the registration.
Along with the firecracker support, and to allow --resume in the flake-ctl firecracker register ... registration, we need a service running inside of the VM which accepts commands from the host and executes them inside of the VM. We named this command sce (service command execution).
sci support for reading and executing commands through vsock is implemented
flake-ctl firecracker register supports the --resume option
firecracker-pilot supports command handling through a vsock connection
When using firecracker apps in resume mode, any quoting gets eaten while the command is transferred through the vsock.
For example:
mybash -c "'ls -l'"
does not work in resume mode but works in standard mode.
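One way to make the transfer quoting-safe would be to send the argv vector in a structured form instead of a flat string, e.g. NUL-separated; this is a sketch of the idea, not the current wire protocol:

// host side: join arguments with NUL, which cannot occur inside an argument
fn encode_argv(argv: &[String]) -> Vec<u8> {
    argv.join("\0").into_bytes()
}

// guest side: split again without any shell interpretation
fn decode_argv(wire: &[u8]) -> Vec<String> {
    String::from_utf8_lossy(wire)
        .split('\0')
        .map(str::to_string)
        .collect()
}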
Allow to list registered applications
As of today the setting to attach to the app, if it is running, can be configured in the flake file directly via
runtime:
attach: true
There is no way to do this on the command line via oci-ctl register ...
It seems firecracker has no sharing concept at the moment. We should at least add this to the documentation unless there is a better solution available.
At the moment one package provides all pilots and the ctl tool. This should be changed to have dedicated pilot packages (flake-pilot-podman, flake-pilot-firecracker) and a base package flake-pilot.
As part of the effort to support the firecracker engine, we need to add support for pulling firecracker-ready images onto the target system. This should be done via:
flake-ctl firecracker pull --rootfs-image ... --kernel-image ... [--initrd-image ...]
flake-ctl firecracker pull --kis-image ...
The data for a firecracker image should be organized below /var/lib/firecracker/images
flake-ctl firecracker pull implementation exists
flake-ctl firecracker pull can at least be used together with the --kis-image provided here: https://build.opensuse.org/package/show/home:marcus.schaefer:delta_containers/firecracker_base_leap_system
FireCracker is a project which aims to run KVM based virtual machines in a fast way. We would like to add support for a firecracker-pilot to allow registration and running of applications inside of FireCracker based virtual machines.
firecracker-pilot exists and is able to start up a FireCracker compatible image, run an application inside, e.g. via init="/the/app", redirect input/output channels to the caller, e.g. via serial console, and gracefully handle the shutdown of the VM such that calling the app feels like a native application on the system
flake-ctl