lupine-linux's Introduction

Lupine: Linux in Unikernel Clothes

NOTE

The container image for building the kernel (Linux 4.0) is no longer available, but you can use an older Ubuntu image to build it. We use Alpine-variant applications, which use musl libc, for our benchmarks. PRs are welcome!
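Since the original a74731248/linuxbuild image has disappeared from Docker Hub, one workaround is to build a local replacement from an older Ubuntu base. This is only a sketch: the Ubuntu release and package list are assumptions (Linux 4.0 needs a GCC old enough to compile it), not the original image's contents.

```shell
# Sketch of a replacement for the missing linuxbuild image.
# Ubuntu 16.04 and the package list are assumptions; adjust as needed.
docker build -t linuxbuild:latest - <<'EOF'
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y \
    build-essential bc bison flex libssl-dev kmod cpio
EOF
```

With the local `linuxbuild:latest` tag in place, the build scripts below should find the image without pulling.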

Paper

@inproceedings{10.1145/3342195.3387526,
  author = {Kuo, Hsuan-Chi and Williams, Dan and Koller, Ricardo and Mohan, Sibin},
  title = {A Linux in Unikernel Clothing},
  year = {2020},
  isbn = {9781450368827},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3342195.3387526},
  doi = {10.1145/3342195.3387526},
  booktitle = {Proceedings of the Fifteenth European Conference on Computer Systems},
  articleno = {Article 11},
  numpages = {15},
  location = {Heraklion, Greece},
  series = {EuroSys ’20}
}

In this project, we manually configure the Linux kernel so that it becomes as small as ~4 MB. We aim to show that it has the following unikernel properties:

  • no mode switching (by KML)
  • specialization (by Kconfig)
  • size (by evaluation)
  • efficiency in terms of boot time and memory density (by evaluation)

Contribution

  • a combination of existing Linux configuration specialization and KML for a first-step Linux unikernel
  • an evaluation of what unikernel properties are achieved
  • a discussion of common unikernel tradeoffs (e.g., no smp, no fork) and their impact
  • highlighting next steps for Linux specialization / cross-domain optimization

Setup

  1. Clone project: git clone https://github.com/hckuo/Lupine-Linux.git

  2. Update submodule: git submodule update --init

  3. Pull a docker image that contains the environment for building the Linux 4.0 kernel, and tag it linuxbuild:latest. Run the commands below:

    • docker pull a74731248/linuxbuild:latest
    • docker tag a74731248/linuxbuild:latest linuxbuild:latest
  4. Run make in the load_entropy directory to generate the load_entropy binary

  5. Build the Lupine unikernel of your interest by following one of the below steps:

    • run sh scripts/build-kernels.sh
    • run sh scripts/build-with-configs.sh configs/<specific_config> <app_config>
      e.g., sh scripts/build-with-configs.sh configs/lupine-djw-kml.config configs/apps/nginx.config
  6. Build the rootfs: sh image2rootfs.sh <app> <tag> ext2. The tag must be alpine, because Lupine uses musl libc rather than glibc. Example: sh image2rootfs.sh nginx alpine ext2

Run Lupine-Linux

In Firecracker

  • run sh firecrackerd.sh in the first shell
  • run sh firecracker-run.sh <path_to_kernel> <path_to_rootfs> init=<init_script> in a second shell; init can be /bin/sh or a script you want to run after Lupine boots

In QEMU

  • Make sure to build your kernel with the following options turned on:
    • CONFIG_PCI=y
    • CONFIG_VIRTIO_PCI=y
  • Once built, you can run it with the following command: qemu-system-x86_64 -no-reboot -kernel <path_to_kernel> -drive "file=<path_to_root_fs>,format=raw,if=none,id=hd0" -nographic -nodefaults -serial stdio -device virtio-blk-pci,id=blk0,drive=hd0,scsi=off -append "panic=-1 console=ttyS0 root=/dev/vda rw loglevel=15 nokaslr init=<init_script>"
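Since the QEMU invocation above is long, it may be easier to keep it in a small wrapper script. The script below is only a convenience sketch (not part of the repository); the kernel, rootfs, and init paths are arguments you must supply.

```shell
# Write a small wrapper for the QEMU command line above.
# run-qemu.sh is a hypothetical helper, not a repo script.
cat > run-qemu.sh <<'EOF'
#!/bin/sh
# usage: ./run-qemu.sh <path_to_kernel> <path_to_rootfs> [init_script]
KERNEL=$1
ROOTFS=$2
INIT=${3:-/bin/sh}
exec qemu-system-x86_64 -no-reboot -nographic -nodefaults \
  -kernel "$KERNEL" \
  -drive "file=$ROOTFS,format=raw,if=none,id=hd0" \
  -device virtio-blk-pci,id=blk0,drive=hd0,scsi=off \
  -serial stdio \
  -append "panic=-1 console=ttyS0 root=/dev/vda rw loglevel=15 nokaslr init=$INIT"
EOF
chmod +x run-qemu.sh
```

Then boot with, e.g., ./run-qemu.sh vmlinux nginx.ext2 /bin/sh.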

Files

scripts
|-- build-kernels.sh (build all kernels for helloworld, redis and nginx for all variants)
|-- firecrackerd.sh (wrapper for the firecracker daemon)
|-- firecracker-run.sh (wrapper for the firecracker client)
|-- image2rootfs.sh (create userspace root fs from docker image)
|-- firecracker-lz4bench-run.sh (runs lz4 bench as a firecracker microvm lupine+kml+mmio)
`-- run-helper.sh (shared variables and helper functions)

These scripts should be executed from the repository root.

Takeaways:

  • don't rewrite Linux, unikernel people!
  • know what unikernel benefits you really care about and what you give up to get them
  • future work should be on app manifests, LTO, etc.

Going forward beyond unikernel restrictions

  • smp vs. non-smp
    • smp gives speed benefits, esp when CPU bound
    • may cause size overhead (different config)?
  • fork vs. non-fork
    • same as SMP; may just add context switches (shown in a microbenchmark)
  • (how many applications use fork?)
  • dynamic linking vs. static
  • with KML unikernel properties "gracefully degrade"

lupine-linux's People

Contributors

a74731248, djwillia, hckuo2, hlef, ricarkol, vaibhavkulkarni

lupine-linux's Issues

Broken with recent Redis releases

It seems that Lupine doesn't work with recent Redis releases. The most recent release I managed to run with the currently available configuration is 5.0.4 (using the redis:5.0.4-alpine Docker container image).

Relating lupine `configs/*` to paper

Apologies for being obtuse. It is not clear to me which config files in this repo are lupine-base and which are lupine-general (I expect there will be kml/nokml variants)?

more lupinification for apps

  • is union enough for the new microvm config?
  • relationship between config size and image size (really should be found with make randomconfig)

KML kernel on Firecracker fails to start

KML kernel on Firecracker fails to start with the following message:

EXT2-fs (vda): error: ext2_lookup: deleted inode referenced: 1228803
cp: can't stat '/trusted/nginx': I/O error

The root cause is the cp command in guest_start.sh, where all the respective executables get copied to /trusted.

Any idea why we need this? Why not run directly from /usr/sbin?

CDF for docker hub stars

One question we face is "how many popular apps is enough" to make points like "one config is good enough for everything". If we plot a histogram of the number of stars, we can visualize how much popularity we are covering. If we plot it as a CDF, we could make claims like "10 apps cover 80% of the stars on Docker Hub".
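The proposed CDF can be prototyped with standard tools: sort applications by stars and print the cumulative share covered by the top-N. The data below is purely illustrative (made-up star counts, not real Docker Hub numbers).

```shell
# Illustrative sketch of the "top-N apps cover X% of stars" CDF.
# The star counts are fabricated sample data.
cat > /tmp/stars.txt <<'EOF'
nginx 9000
redis 7000
mysql 5000
alpine 3000
hello-world 1000
EOF
sort -k2 -rn /tmp/stars.txt |
awk '{ s[NR] = $2; total += $2 }
     END {
       cum = 0
       for (i = 1; i <= NR; i++) {
         cum += s[i]
         printf "top-%d apps cover %.0f%% of stars\n", i, 100 * cum / total
       }
     }'
```

On real data scraped from Docker Hub, the same pipeline would directly support claims of the form "10 apps cover 80% of the stars".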

image size numbers

  • lupine-base (lupine-djw-kml)
  • lupine-base-nokml (lupine-djw-nokml)
  • hermitux is missing
  • need lupine-tiny for this

Not sure if we want/need to show both kml and nokml here
Maybe all of the lupine derivatives (e.g., nginx, redis) should show up in the graph as error bars (e.g., to show max and min)

threading experiment for beyond unikernels

we want to see pthreads vs. green threads

this will look bad, but starts to answer the question "what do you lose when you give up a unikernel design principle", in this case single context

we can put it in context of the tradeoff: unikernels have traded generality, SMP, etc. for this

nokml config fails to build when CONFIG_PARAVIRT is set

Although the idea behind Lupine is to use a KML-patched kernel, I suspect having a nokml kernel for performance comparison would be good. But setting CONFIG_PARAVIRT results in a failed build with the following error message:

arch/x86/kernel/paravirt.c:361:14: error: 'native_load_tls' undeclared here (not in a function)
.load_tls = native_load_tls,

This needs to be fixed, IMO.

limitations section

So far, we have a discussion on immutable infrastructure and how the dynamic nature of Lupine makes that a bit harder than something like mirage.

The rest of the section needs to be filled out. Here's what we have as notes so far:

Unikernel benefits not covered:

  • other language-based analyses
  • running on a unikernel monitor
  • multiple layered implementations of the same abstraction

build time optimization (cross domain)

  • compile time (limited for C)
  • link time (menuconfig does both compile time and link time)
    e.g., inline crap from the kernel into application code -> can't do any of it unless the application build is also changed -> llvm?

repository docker.io/linuxbuild not found: does not exist or no pull access

When I try to run build-kernels.sh, I get the error "repository docker.io/linuxbuild not found: does not exist or no pull access", because Docker is unable to find the image linuxbuild:latest locally and tries to pull docker.io/library/linuxbuild.
I also tried to pull linuxbuild:latest separately with docker pull linuxbuild:latest, but it fails with the same error.
Was that image removed from Docker Hub?

Reproducing table & figure data

I'm finding it very difficult to know where to start to reproduce the data behind some of the tables and figures.
I expected something like:

  • scripts/tables/4_data.sh
  • scripts/figures/5_data.sh

etc. etc.

boot numbers

The current boot numbers are not fair between the Linux/firecracker systems and the ukvm/uhyve systems because the latter numbers include teardown time. We should make sure that all are using the same measurement methodology. We have two options:

  • make ukvm/uhyve signal when boot happens (like in fc)
  • include teardown for firecracker as well

It seems that the first option would be best, as fc will likely have worse teardown than ukvm (and we don't really want to focus on the monitor!)

Lupine-tiny

"space/time tradeoffs"
< CONFIG_BASE_FULL=y
< CONFIG_BASE_SMALL=0
< CONFIG_SPARSEMEM_VMEMMAP=y
< CONFIG_TRANSPARENT_HUGEPAGE=y
< CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
< CONFIG_ARCH_RANDOM=y

Missing KML-enabled musl libc

Hi,
In the paper there are references to a KML-enabled musl libc. There are also references to it in the code within the repository. Can this modified code or the patch to the library be provided?

Thanks,
Andy

more boot time numbers

  • lupine-djw-nokml now has paravirt so will be good to see
  • lupine-base (lupine-djw-kml) should also be shown
  • hermitux is missing

need experiment for degree of specialization

I'm almost finished writing up the section on specialization through kernel configuration. For the application-specific specialization, I think it would be nice to evaluate what the configuration looks like for different applications (how different it is). The goal would be to see whether one "general" Lupine configuration is good enough, rather than needing to mess with manifests and things like that.

If we had applications that made sense that didn't need a ton of functionality, that would seem the way to argue for specialization, but with the usual suspects (nginx, redis, etc), I expect things are pretty much going to be the same. Probably it would be best to come up with a "representative list", even if that list is things that don't run on the competitors and use that.

In terms of what to measure: 1) total # configuration options, 2) system calls available in the kernel, 3) kernel binary size, ?) anything else??

@hckuo @ricarkol any ideas on what the representative list should be?

(by the way, the representative list can also be used to roughly show how "compatible" other systems really are)
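Metrics (1) and (3) above are easy to script. The snippet below is a hypothetical sketch of metric (1): count enabled options in a config and list the options that differ between two configs. The file names and contents are made up for illustration, not actual Lupine configs.

```shell
# Illustrative comparison of two kernel configs (fabricated contents).
cat > /tmp/a.config <<'EOF'
CONFIG_PCI=y
CONFIG_NET=y
CONFIG_SMP=y
EOF
cat > /tmp/b.config <<'EOF'
CONFIG_PCI=y
CONFIG_NET=y
CONFIG_FUSE_FS=y
EOF
# total enabled options per config
echo "enabled in a: $(grep -c '^CONFIG_' /tmp/a.config)"
echo "enabled in b: $(grep -c '^CONFIG_' /tmp/b.config)"
# options present only in b (comm needs sorted input)
sort /tmp/a.config > /tmp/a.sorted
sort /tmp/b.config > /tmp/b.sorted
comm -13 /tmp/a.sorted /tmp/b.sorted
```

Running the same diff across per-app configs would show directly how close a "general" Lupine configuration is to each specialized one.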

complexity comparison

one of unikernel proponents' stated advantages has to do with complexity... is this measurable? I would expect things like Hermitux (which is immature and lacks completeness) to look really good by this metric

threats to validity

how do we know app-specific configs are complete?
how do we know manual categorization is correct?

Recompute numbers with union config

We seem to be trying to/on the verge of making a point that the "union config" is in fact better than (or just as good as) having specialization via Kconfig. We need to know how the union config does across all the metrics.
