tailhook / lithos
Process supervisor that supports Linux containers
Home Page: http://lithos.readthedocs.org
License: MIT License
Currently the error just says "can't create cgroup". We need to know which cgroup can't be created.
I still don’t understand the high-level concept of Lithos’ configuration. Why is the configuration of a container split into three configs (sandbox, process and container), and specifically why are sandbox and process not a single config? What’s the idea behind this and how is it supposed to be used? Could you please provide an example of a non-trivial setup that demonstrates this?
In #8 (comment) you wrote:
Well, technically we have a concept of sandboxes just for that. I.e. you allow users to upload images to certain directory configured in a sandbox. And arrange some script to update their /etc/lithos/processes/<sandbox-name>.yaml. This allows them to freely update their images, add and remove services, but they can't escape a sandbox.
This makes sense, except for one thing: cgroup limits are configured only in the container config (inside the image). Cgroup limits are exactly the thing I’d like to enforce for users and not let them change.
The traceback from a release build is useless, but:
thread '<main>' panicked at 'called `Option::unwrap()` on a `None` value', ../src/libcore/option.rs:330
stack backtrace:
1: 0x560e5ea4c3d0 - sys::backtrace::tracing::imp::write::h3675b4f0ca767761Xcv
2: 0x560e5ea4f8fb - panicking::default_handler::_$u7b$$u7b$closure$u7d$$u7d$::closure.44519
3: 0x560e5ea4f568 - panicking::default_handler::h18faf4fbd296d909lSz
4: 0x560e5ea3c94c - sys_common::unwind::begin_unwind_inner::hfb5d07d6e405c6bbg1t
5: 0x560e5ea3cdd8 - sys_common::unwind::begin_unwind_fmt::h8b491a76ae84af35m0t
6: 0x560e5ea4b981 - rust_begin_unwind
7: 0x560e5ea7f61f - panicking::panic_fmt::h98b8cbb286f5298alcM
8: 0x560e5ea7f8f8 - panicking::panic::h4265c0105caa1121SaM
9: 0x560e5e910006 - run::hb0fe367667823cd6nab
10: 0x560e5e96ca05 - main::h163a2cd1736181e3R3b
11: 0x560e5ea4f1c4 - sys_common::unwind::try::try_fn::h14622312129452522850
12: 0x560e5ea4b90b - __rust_try
13: 0x560e5ea4ec5b - rt::lang_start::h0ba42f7a8c46a626rKz
14: 0x7f6749f94f44 - __libc_start_main
15: 0x560e5e8fab78 - <unknown>
Sometimes we want to pass some values to the running process in a very lightweight way at runtime. Use cases include:
Lithos side:
log-level: warn
Application's task:
The expected cost of reading a variable is a single atomic load instruction, so you can check the variable in every logging instruction. Even Rust can afford it.
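To make the cost argument concrete, here is a minimal sketch of the application side, assuming the variable ends up in a process-wide atomic. The `Level` enum, `set_level` and `enabled` are hypothetical names for illustration only; how Lithos would actually deliver the value to the process is not specified here, so the static below is just a stand-in.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical levels; the numeric encoding is an assumption of this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Level {
    Error = 0,
    Warn = 1,
    Info = 2,
    Debug = 3,
}

// Stand-in for the runtime variable. In the proposal the supervisor would
// change it from outside; here it is a plain process-local static.
static LOG_LEVEL: AtomicU8 = AtomicU8::new(Level::Warn as u8);

/// Called by whatever receives the update (hypothetical hook).
pub fn set_level(level: Level) {
    LOG_LEVEL.store(level as u8, Ordering::Relaxed);
}

/// The hot path: one relaxed atomic load and a compare.
#[inline]
pub fn enabled(level: Level) -> bool {
    level as u8 <= LOG_LEVEL.load(Ordering::Relaxed)
}

pub fn log(level: Level, msg: &str) {
    if enabled(level) {
        eprintln!("[{:?}] {}", level, msg);
    }
}

fn main() {
    log(Level::Debug, "not printed at the default warn level");
    set_level(Level::Debug);
    log(Level::Debug, "printed after the level is raised");
}
```

Relaxed ordering is enough in this sketch because the flag does not guard any other data; a log line emitted slightly before or after the change is harmless.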
Surely, this is not for full-blown configs. We think that this feature is needed for things that are:
On the other hand, most things that traditionally were implemented via signals might use this mechanism, because in most cases signal handlers do exactly the same thing: set some global flag and continue work until normal code flow checks the flag
SIGUSR1. The downsides are:
EINTR handling still has issues in many languages and environments
% sudo ./example_configs.sh
[sudo] password for a:
Copying examples/py into the system
WARNING: This Command will remove /etc/lithos from the system
... hopefully you run this in a virtual machine
... but let you think for 10 seconds
10 \r9 \r8 \r7 \r6 \r5 \r4 \r3 \r2 \r1 \r0 \rOkay proceeding...
building file list ... done
sent 188 bytes received 11 bytes 398.00 bytes/sec
total size is 309 speedup is 1.55
Config /home/a/proj/lithos/vagga.yaml cannot be read: Error parsing config /home/a/proj/lithos/vagga.yaml: /home/a/proj/lithos/vagga.yaml:89:7: Parse Error: Expected scalar, sequence or mapping, got Anchor
Tested with both vagga master and 0.6.2.
While trying to set up a bridged network on Alpine Linux v3.7, the initialization procedure setup_network fails while checking interface availability with arping, with the following error:
Fatal error running "container/container.0": arping failed: exited with code 1
The full log of the container is attached below:
[2018-06-19T11:46:53Z][DEBUG]src/mount.rs:148: Making private "/"
[2018-06-19T11:46:53Z][DEBUG]src/mount.rs:132: Remount readonly: "/run/lithos/mnt"
[2018-06-19T11:46:53Z][INFO] [container/container.0] Starting container
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:106: Running "ip" "link" "add" "li_ca02f9_0001" "type" "veth" "peer" "name" "li-ca02f9-0001"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:119: Running "ip" "link" "set" "dev" "li_ca02f9_0001" "netns" "/proc/28231/fd/6"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:131: Running "brctl" "addif" "br0" "li_ca02f9_0001"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:142: Running "ip" "link" "set" "li_ca02f9_0001" "up"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:155: Running "ip" "link" "set" "lo" "up"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:168: Running "ip" "addr" "add" "10.0.0.1/24" "dev" "li-ca02f9-0001"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:179: Running "ip" "link" "set" "li-ca02f9-0001" "up"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:191: Running "ip" "route" "add" "default" "via" "10.0.0.1"
[2018-06-19T11:46:53Z][DEBUG]src/bin/lithos_knot/setup_network.rs:203: Running "arping" "-U" "10.0.0.1" "-c1"
[2018-06-19T11:46:53Z][ERROR] Fatal error running "container/container.0": arping failed: exited with code 1
If the lithos_tree process is run with strace as follows, it gives an indication of why this is happening:
$ strace -f /usr/bin/lithos_tree
... unrelated output omitted ...
[pid 32223] execve("/usr/bin/arping", ["/usr/bin/arping", "-U", "10.0.0.1", "-c1"], 0x55d76cb2d0a0 /* 1 var */) = 0
... unrelated output omitted ...
[pid 32223] writev(2, [{iov_base="", iov_len=0}, {iov_base="arping: Too many args on command"..., iov_len=61}], 2arping: Too many args on command line. Expected at most one.
) = 61
[pid 32223] exit_group(1) = ?
[pid 32223] +++ exited with 1 +++
The lithos configurations used are:
master config:
log_level: debug
devfs-dir: /dev
sandbox config:
allow-users: [0, 1, 100]
allow-groups: [0, 101]
image-dir: /var/lib/lithos/images
writable-paths:
  /var/log/container: /data
  /var/lib/container: /log
bridged-network:
  bridge: br0
  network: 10.0.0.0/24
  default_gateway: 10.0.0.1
process config:
container:
  image: container
  config: /config/container.yaml
  ip-addresses: [10.0.0.1]
It turns out that Alpine uses a distribution of arping (specifically this one by Thomas Habets) which is stricter about argument order than the arping distributions available in other (more mainstream) Linux distros, where it is usually provided by the iputils package.
Simple reordering of arguments is sufficient to get it to work on both implementations of arping (I have tested only on Fedora and Alpine).
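For illustration, a minimal sketch of the reordered call, assuming the working order is "all options first, destination last" (the failing call in the strace output above was `-U 10.0.0.1 -c1`). It uses `std::process::Command` rather than the `unshare` crate that lithos itself uses, and `arping_args` and `run_arping` are hypothetical helper names.

```rust
use std::net::IpAddr;
use std::process::Command;

/// Builds the argument list with all options first and the destination last.
pub fn arping_args(ip: IpAddr) -> Vec<String> {
    vec!["-U".to_string(), "-c1".to_string(), ip.to_string()]
}

/// Runs arping; the binary path is the one seen in the strace output.
pub fn run_arping(ip: IpAddr) -> std::io::Result<bool> {
    let status = Command::new("/usr/bin/arping")
        .args(arping_args(ip))
        .status()?;
    Ok(status.success())
}

fn main() {
    let ip: IpAddr = "10.0.0.1".parse().expect("valid IP literal");
    // Prints: arping -U -c1 10.0.0.1
    println!("arping {}", arping_args(ip).join(" "));
    // run_arping(ip) is not called here: it needs arping installed and
    // enough privileges to send ARP packets.
}
```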
Some programs (e.g. java) support starting on demand from the xinetd daemon, which binds a socket on behalf of the program and passes it to the program as file descriptor 0, i.e. STDIN.
Unfortunately fd: 0 cannot be used in Lithos:
tcp-ports:
  8080:
    fd: 0
    host: 0.0.0.0
    reuse-addr: true
$ lithos_tree
thread 'main' panicked at 'Stdio file descriptors must be configured with respective methods instead of passing fd 0 to `file_descritor()`', .../github.com-1ecc6299db9ec823/unshare-0.2.0/src/fds.rs:33:13
The unshare crate explicitly checks if the given fd is greater than 2 and panics when it’s not (see fds.rs:32).
Could you please make it work even for fd 0?
With the bridged_network setup, while checking veth interface availability in setup_network, the arping call fails with a timeout, as shown by the following output of the lithos_tree process:
--- 10.0.0.1 statistics ---
1 packets transmitted, 0 packets received, 100% unanswered (0 extra)
Fatal error: arping failed: exited with code 1
ARPING 10.0.0.1
Timeout
--- 10.0.0.1 statistics ---
1 packets transmitted, 0 packets received, 100% unanswered (0 extra)
Fatal error: arping failed: exited with code 1
ARPING 10.0.0.1
Timeout
TCP dump from host (bridge interface) shows no ARP response being sent:
tcpdump -i br0 -en "icmp or arp"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:15:05.564340 4a:41:e6:41:9f:a9 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 58: Request who-has 10.0.0.1 (ff:ff:ff:ff:ff:ff) tell 10.0.0.1, length 44
15:15:06.900662 82:84:a5:0c:48:b1 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 58: Request who-has 10.0.0.1 (ff:ff:ff:ff:ff:ff) tell 10.0.0.1, length 44
15:15:08.253697 ce:91:ed:2d:a7:19 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 58: Request who-has 10.0.0.1 (ff:ff:ff:ff:ff:ff) tell 10.0.0.1, length 44
15:15:09.583953 ba:3f:bc:a6:ac:e8 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 58: Request who-has 10.0.0.1 (ff:ff:ff:ff:ff:ff) tell 10.0.0.1, length 44
15:15:10.930642 3a:f1:e9:d9:3f:23 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 58: Request who-has 10.0.0.1 (ff:ff:ff:ff:ff:ff) tell 10.0.0.1, length 44
^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
bridge info on host:
br0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:10.0.0.200 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::1440:b0ff:fec7:5926/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:1047 errors:0 dropped:0 overruns:0 frame:0
TX packets:727 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:62207 (60.7 KiB) TX bytes:59939 (58.5 KiB)
veth info in the lithos container:
nsenter -n -p --target 1647 /bin/sh
ifconfig -a
li-ca02f9-0001 Link encap:Ethernet HWaddr 5A:23:45:F5:F9:B4
inet addr:10.0.0.1 Bcast:0.0.0.0 Mask:255.255.255.0
inet6 addr: fe80::5823:45ff:fef5:f9b4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:956 (956.0 B) TX bytes:760 (760.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:16 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:1376 (1.3 KiB) TX bytes:1376 (1.3 KiB)
I have trouble understanding what this check is trying to accomplish, because:
That means that the veth device with the 10.0.0.1 address assigned (in the container) effectively sends an ARP broadcast asking "Tell me who has 10.0.0.1" to the bridge (located on the host) to which the other end of the veth pair is attached and, as expected, receives no answer. The only one that could answer is the local interface itself, which sends the link broadcast; note that in this case arping behaves differently than e.g. ping, which allows pinging a local interface.
I suspect that the arping call should be performed in the host (parent) namespace instead of the child namespace, so that it sends ARP from the bridge to the veth and not the other way around, as happens now. It may also need a specific bridge interface selected: what if different hosts with the 10.0.0.1 address are reachable from multiple NICs of the host?
Please see attached patch for better understanding of my proposal:
diff --git a/src/bin/lithos_knot/setup_network.rs b/src/bin/lithos_knot/setup_network.rs
index f8df4a6..db9227d 100644
--- a/src/bin/lithos_knot/setup_network.rs
+++ b/src/bin/lithos_knot/setup_network.rs
@@ -193,23 +193,27 @@ fn _setup_bridged(sandbox: &SandboxConfig, _child: &ChildInstance, ip: IpAddr)
Ok(s) if s.success() => {}
Ok(s) => bail!("ip route failed: {}", s),
Err(e) => bail!("ip route failed: {}", e),
}
}
+ setns(parent_ns.as_raw_fd(), CloneFlags::CLONE_NEWNET)?;
+
let mut cmd = unshare::Command::new("/usr/bin/arping");
cmd.arg("-U");
cmd.arg("-c1");
cmd.arg(&format!("{}", ip));
debug!("Running {}", cmd.display(&Style::short()));
match cmd.status() {
Ok(s) if s.success() => {}
Ok(s) => bail!("arping failed: {}", s),
Err(e) => bail!("arping failed: {}", e),
}
+ setns(my_ns.as_raw_fd(), CloneFlags::CLONE_NEWNET)?;
+
Ok(())
}
fn _setup_isolated(_sandbox: &SandboxConfig, _child: &ChildInstance)
-> Result<(), Error>
{
I have observed this error on Alpine v3.7 after PR #15 was applied.
Thank you for any ideas or opinions on this.
BTW many thanks to you and the other contributors for the awesome containerization stuff (not just lithos). It saves us a lot of time and sanity by not having to deal with Docker :).
According to the documentation of Container Configuration there are only three types of variables – TcpPort, Choice and Name – and all used variables must be declared. I’ve also checked the sources and it seems that really only these three types are accepted.
That’s very restrictive. I’d like to declare e.g. a variable for the base URI of the application and use it in environ to pass it into the container as an environment variable or via arguments. Is this a wrong usage of the variables?
Hi,
it seems that Lithos is designed for running services that may be isolated using namespaces, cgroups, capabilities(?), i.e. running them inside “containers.” I wonder, can I also use it to run a “full OS” (Alpine Linux to be specific) with a traditional init system (OpenRC) and multiple services, including an OpenSSH server for users to connect to? Like with LXC, which I currently use. Theoretically it should be possible, but it seems that it’s really not designed for such a use case (?), so are there any limitations or design decisions that make it a really bad idea?
In the past we have relied on keeping secrets in the filesystem and mounting them as:
volumes:
  /secrets: !Readonly /secrets
This is fine for smaller installations but has the following downsides:
/secrets dir (i.e. is the config there is up to date)
Add the following to sandbox config:
secrets-private-key: /etc/some/file
Add the following to container config:
secret-environ:
  SOME_SECRET: I7OO1RBTRYk+oAZ6n/dhRMCDXwgW
The value of SOME_SECRET is the actual value encrypted with a public key.
When a process is started, lithos will decrypt those values and pass them as environment variables to the process.
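A rough sketch of that startup step, under loud assumptions: `resolve_secret_environ` is a hypothetical helper, and the private-key operation is passed in as a closure because no cipher or key format is fixed yet. `toy_decrypt` is only a placeholder (it reverses the string) so the example runs with the standard library alone; it is not cryptography.

```rust
use std::collections::BTreeMap;

/// Resolves every `secret-environ` entry through `decrypt` and returns the
/// plain environment. `decrypt` stands in for the private-key operation,
/// which this sketch deliberately does not implement.
pub fn resolve_secret_environ<F>(
    secret_environ: &BTreeMap<String, String>,
    decrypt: F,
) -> Result<BTreeMap<String, String>, String>
where
    F: Fn(&str) -> Result<String, String>,
{
    let mut environ = BTreeMap::new();
    for (name, encrypted) in secret_environ {
        // Fail the whole start if any single secret cannot be decrypted.
        let plain = decrypt(encrypted)
            .map_err(|e| format!("can't decrypt {}: {}", name, e))?;
        environ.insert(name.clone(), plain);
    }
    Ok(environ)
}

/// Toy placeholder, NOT real cryptography: it just reverses the string.
pub fn toy_decrypt(value: &str) -> Result<String, String> {
    if value.is_empty() {
        return Err("empty value".to_string());
    }
    Ok(value.chars().rev().collect())
}

fn main() {
    let mut secret_environ = BTreeMap::new();
    secret_environ.insert("SOME_SECRET".to_string(), "terces".to_string());
    match resolve_secret_environ(&secret_environ, toy_decrypt) {
        // A supervisor could hand this map to the child process, for
        // example via std::process::Command::envs; here only the names
        // are printed so that no secret value reaches the output.
        Ok(environ) => {
            for name in environ.keys() {
                println!("{} resolved", name);
            }
        }
        Err(e) => eprintln!("{}", e),
    }
}
```

Failing on the first undecryptable value is a design choice of this sketch; starting the process with a missing secret seems worse than not starting it at all.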
Upsides:
Downsides:
We may also add the following to sandbox config:
environ-secrets-file: /encrypted/secret.file # file is encrypted
And the following to container config:
external-secret-environ: [VAR1, VAR2]
This would allow changing the environ more dynamically, but it means /encrypted/secret.file has to be deployed separately.
Thoughts?
/cc @popravich, @anti-social, @jirutka, #13
Metrics which we can track: