lupine-linux's Introduction

Lupine: Linux in Unikernel Clothes

NOTE

The container image for building the kernel (Linux 4.0) is no longer available, but you can use an older Ubuntu image to build it. We use Alpine-variant applications, which use musl libc, for our benchmarks. PRs are welcome!
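Since the original a74731248/linuxbuild image has disappeared from Docker Hub, one workaround is to build a local replacement from an older Ubuntu base. This is only a sketch: the Ubuntu release and package list are assumptions (Linux 4.0 needs a GCC old enough to compile it), not the original image's contents.

```shell
# Sketch of a replacement for the missing linuxbuild image.
# Ubuntu 16.04 and the package list are assumptions; adjust as needed.
docker build -t linuxbuild:latest - <<'EOF'
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y \
    build-essential bc bison flex libssl-dev kmod cpio
EOF
```

With the local `linuxbuild:latest` tag in place, the build scripts below should find the image without pulling.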

Paper

@inproceedings{10.1145/3342195.3387526,
  author = {Kuo, Hsuan-Chi and Williams, Dan and Koller, Ricardo and Mohan, Sibin},
  title = {A Linux in Unikernel Clothing},
  year = {2020},
  isbn = {9781450368827},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3342195.3387526},
  doi = {10.1145/3342195.3387526},
  booktitle = {Proceedings of the Fifteenth European Conference on Computer Systems},
  articleno = {Article 11},
  numpages = {15},
  location = {Heraklion, Greece},
  series = {EuroSys ’20}
}

In this project, we manually configure the Linux kernel so that it becomes as small as ~4 MB. We aim to show that it has the following unikernel properties:

  • no mode switching (by KML)
  • specialization (by Kconfig)
  • size (by evaluation)
  • efficiency in terms of boot time and memory density (by evaluation)

Contribution

  • a combination of existing Linux configuration specialization and KML for a first-step Linux unikernel
  • an evaluation of what unikernel properties are achieved
  • a discussion of common unikernel tradeoffs (e.g., no smp, no fork) and their impact
  • highlighting next steps for Linux specialization / cross-domain optimization

Setup

  1. Clone project: git clone https://github.com/hckuo/Lupine-Linux.git

  2. Update submodule: git submodule update --init

  3. Pull a docker image that contains the environment for building the Linux 4.0 kernel, and tag it linuxbuild:latest. Run the commands below:

    • docker pull a74731248/linuxbuild:latest
    • docker tag a74731248/linuxbuild:latest linuxbuild:latest
  4. Run make in the load_entropy directory to generate the load_entropy binary

  5. Build the Lupine unikernel of your interest by following one of the below steps:

    • run sh scripts/build-kernels.sh
    • run sh scripts/build-with-configs.sh configs/<specific_config> <app_config>
      e.g., sh scripts/build-with-configs.sh configs/lupine-djw-kml.config configs/apps/nginx.config
  6. Build the rootfs: sh image2rootfs.sh <app> <tag> ext2. The tag must be alpine, because Lupine uses musl libc rather than glibc. Example: sh image2rootfs.sh nginx alpine ext2

Run Lupine-Linux

In Firecracker

  • run sh firecrackerd.sh in the first shell
  • run sh firecracker-run.sh <path_to_kernel> <path_to_rootfs> init=<init_script> in a second shell; init can be /bin/sh or a script you want to run after Lupine boots

In QEMU

  • Make sure to build your kernel with the following options turned on:
    • CONFIG_PCI=y
    • CONFIG_VIRTIO_PCI=y
  • Once built, you can run it with the following command: qemu-system-x86_64 -no-reboot -kernel <path_to_kernel> -drive "file=<path_to_root_fs>,format=raw,if=none,id=hd0" -nographic -nodefaults -serial stdio -device virtio-blk-pci,id=blk0,drive=hd0,scsi=off -append "panic=-1 console=ttyS0 root=/dev/vda rw loglevel=15 nokaslr init=<init_script>"
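Since the QEMU invocation above is long, it may be easier to keep it in a small wrapper script. The script below is only a convenience sketch (not part of the repository); the kernel, rootfs, and init paths are arguments you must supply.

```shell
# Write a small wrapper for the QEMU command line above.
# run-qemu.sh is a hypothetical helper, not a repo script.
cat > run-qemu.sh <<'EOF'
#!/bin/sh
# usage: ./run-qemu.sh <path_to_kernel> <path_to_rootfs> [init_script]
KERNEL=$1
ROOTFS=$2
INIT=${3:-/bin/sh}
exec qemu-system-x86_64 -no-reboot -nographic -nodefaults \
  -kernel "$KERNEL" \
  -drive "file=$ROOTFS,format=raw,if=none,id=hd0" \
  -device virtio-blk-pci,id=blk0,drive=hd0,scsi=off \
  -serial stdio \
  -append "panic=-1 console=ttyS0 root=/dev/vda rw loglevel=15 nokaslr init=$INIT"
EOF
chmod +x run-qemu.sh
```

Then boot with, e.g., ./run-qemu.sh vmlinux nginx.ext2 /bin/sh.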

Files

scripts
|-- build-kernels.sh (build all kernels for helloworld, redis and nginx for all variants)
|-- firecrackerd.sh (wrapper for the firecracker daemon)
|-- firecracker-run.sh (wrapper for the firecracker client)
|-- image2rootfs.sh (create userspace root fs from docker image)
|-- firecracker-lz4bench-run.sh (runs lz4 bench as a firecracker microvm lupine+kml+mmio)
`-- run-helper.sh (shared variables and helper functions)

These scripts should be executed from the repository root.

Takeaways:

  • don't rewrite Linux, unikernel people!
  • know what unikernel benefits you really care about and what you give up to get them
  • future work should be on app manifests, LTO, etc.

Going forward beyond unikernel restrictions

  • smp vs. non-smp
    • smp gives speed benefits, esp when CPU bound
    • may cause size overhead (different config)?
  • fork vs. non-fork
    • same as SMP; may just add context switches (shown in a microbenchmark)
  • (how many applications use fork?)
  • dynamic linking vs. static
  • with KML unikernel properties "gracefully degrade"

lupine-linux's People

Contributors

a74731248, djwillia, hckuo2, hlef, ricarkol, vaibhavkulkarni

lupine-linux's Issues

Broken with recent Redis releases

It seems that Lupine doesn't work with recent Redis releases. The most recent release I managed to run with the currently available configuration is 5.0.4 (using the redis:5.0.4-alpine Docker container image).

Relating lupine `configs/*` to paper

Apologies for being obtuse. It is not clear to me which config files in this repo are lupine-base and which are lupine-general (I expect there will be kml/nokml variants)?

more lupinification for apps

  • is union enough for the new microvm config?
  • relationship between config size and image size (really should be found with make randomconfig)

KML kernel on Firecracker fails to start

KML kernel on Firecracker fails to start with the following message:

EXT2-fs (vda): error: ext2_lookup: deleted inode referenced: 1228803
cp: can't stat '/trusted/nginx': I/O error

The root cause is the cp command in guest_start.sh, where all the respective executables get copied to /trusted.

Any idea why we need this? Why not run directly from /usr/sbin?

CDF for docker hub stars

One question we face is "how many popular apps is enough" to make points like "one config is good enough for everything". If we plot a histogram of the number of stars, we can visualize how much popularity we are covering. If we plot it as a CDF, we could make claims like "10 apps cover 80% of the stars on Docker Hub".
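The proposed CDF can be prototyped with standard tools: sort applications by stars and print the cumulative share covered by the top-N. The data below is purely illustrative (made-up star counts, not real Docker Hub numbers).

```shell
# Illustrative sketch of the "top-N apps cover X% of stars" CDF.
# The star counts are fabricated sample data.
cat > /tmp/stars.txt <<'EOF'
nginx 9000
redis 7000
mysql 5000
alpine 3000
hello-world 1000
EOF
sort -k2 -rn /tmp/stars.txt |
awk '{ s[NR] = $2; total += $2 }
     END {
       cum = 0
       for (i = 1; i <= NR; i++) {
         cum += s[i]
         printf "top-%d apps cover %.0f%% of stars\n", i, 100 * cum / total
       }
     }'
```

On real data scraped from Docker Hub, the same pipeline would directly support claims of the form "10 apps cover 80% of the stars".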

image size numbers

  • lupine-base (lupine-djw-kml)
  • lupine-base-nokml (lupine-djw-nokml)
  • hermitux is missing
  • need lupine-tiny for this

Not sure if we want/need to show both kml and nokml here
Maybe all of the lupine derivatives (e.g., nginx, redis) should show up in the graph as error bars (e.g., to show max and min)

threading experiment for beyond unikernels

we want to see pthreads vs. green threads

this will look bad, but starts to answer the question "what do you lose when you give up a unikernel design principle", in this case single context

we can put it in context of the tradeoff: unikernels have traded generality, SMP, etc. for this

nokml config fails to build when CONFIG_PARAVIRT is set

Although the idea behind Lupine is to use a KML-patched kernel, I suspect having a nokml kernel for performance comparison would be good. But setting CONFIG_PARAVIRT results in a failed build with the following error message:

arch/x86/kernel/paravirt.c:361:14: error: 'native_load_tls' undeclared here (not in a function)
.load_tls = native_load_tls,

This needs to be fixed, IMO.

limitations section

So far, we have a discussion on immutable infrastructure and how the dynamic nature of Lupine makes that a bit harder than something like mirage.

The rest of the section needs to be filled out. Here's what we have as notes so far:

Unikernel benefits not covered:

  • other language-based analyses
  • running on a unikernel monitor
  • multiple layered implementations of the same abstraction

build time optimization (cross domain)

  • compile time (limited for C)
  • link time (menuconfig does both compile time and link time)
    e.g., inline crap from the kernel into application code -> can't do any of it unless the application build is also changed -> llvm?

repository docker.io/linuxbuild not found: does not exist or no pull access

When I try to run build-kernels.sh, I get the error "repository docker.io/linuxbuild not found: does not exist or no pull access", because Docker is unable to find the image linuxbuild:latest locally and tries to pull docker.io/library/linuxbuild.
I also tried to pull linuxbuild:latest separately with docker pull linuxbuild:latest, but it fails with the same error.
Was that image removed from Docker Hub?

Reproducing table & figure data

I'm finding it very difficult to know where to start to reproduce the data behind some of the tables and figures.
I expected something like:

  • scripts/tables/4_data.sh
  • scripts/figures/5_data.sh

etc. etc.

boot numbers

The current boot numbers are not fair between the Linux/firecracker systems and the ukvm/uhyve systems because the latter numbers include teardown time. We should make sure that all are using the same measurement methodology. We have two options:

  • make ukvm/uhyve signal when boot happens (like in fc)
  • include teardown for firecracker as well

It seems that the first option would be best, as fc will likely have worse teardown than ukvm (and we don't really want to focus on the monitor!)

Lupine-tiny

"space/time tradeoffs"
< CONFIG_BASE_FULL=y
< CONFIG_BASE_SMALL=0
< CONFIG_SPARSEMEM_VMEMMAP=y
< CONFIG_TRANSPARENT_HUGEPAGE=y
< CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
< CONFIG_ARCH_RANDOM=y

Missing KML-enabled musl libc

Hi,
In the paper there are references to a KML-enabled musl libc. There are also references to it in the code within the repository. Can this modified code or the patch to the library be provided?

Thanks,
Andy

more boot time numbers

  • lupine-djw-nokml now has paravirt so will be good to see
  • lupine-base (lupine-djw-kml) should also be shown
  • hermitux is missing

need experiment for degree of specialization

I'm almost finished writing up the section on specialization through kernel configuration. For the application-specific specialization, I think it would be nice to evaluate what the configuration looks like for different applications (how different it is). The goal would be to see whether one "general" Lupine configuration is good enough, rather than needing to mess with manifests and things like that.

If we had applications that made sense that didn't need a ton of functionality, that would seem the way to argue for specialization, but with the usual suspects (nginx, redis, etc), I expect things are pretty much going to be the same. Probably it would be best to come up with a "representative list", even if that list is things that don't run on the competitors and use that.

In terms of what to measure: 1) total # configuration options, 2) system calls available in the kernel, 3) kernel binary size, ?) anything else??

@hckuo @ricarkol any ideas on what the representative list should be?

(by the way, the representative list can also be used to roughly show how "compatible" other systems really are)
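Metrics (1) and (3) above are easy to script. The snippet below is a hypothetical sketch of metric (1): count enabled options in a config and list the options that differ between two configs. The file names and contents are made up for illustration, not actual Lupine configs.

```shell
# Illustrative comparison of two kernel configs (fabricated contents).
cat > /tmp/a.config <<'EOF'
CONFIG_PCI=y
CONFIG_NET=y
CONFIG_SMP=y
EOF
cat > /tmp/b.config <<'EOF'
CONFIG_PCI=y
CONFIG_NET=y
CONFIG_FUSE_FS=y
EOF
# total enabled options per config
echo "enabled in a: $(grep -c '^CONFIG_' /tmp/a.config)"
echo "enabled in b: $(grep -c '^CONFIG_' /tmp/b.config)"
# options present only in b (comm needs sorted input)
sort /tmp/a.config > /tmp/a.sorted
sort /tmp/b.config > /tmp/b.sorted
comm -13 /tmp/a.sorted /tmp/b.sorted
```

Running the same diff across per-app configs would show directly how close a "general" Lupine configuration is to each specialized one.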

complexity comparison

one of unikernel proponents' stated advantages has to do with complexity... is this measurable? I would expect things like Hermitux (which is immature and lacks completeness) to look really good by this metric

threats to validity

how do we know app-specific configs are complete?
how do we know manual categorization is correct?

Recompute numbers with union config

We seem to be trying to/on the verge of making a point that the "union config" is in fact better than (or just as good as) having specialization via Kconfig. We need to know how the union config does across all the metrics.
