google / nsjail

A lightweight process isolation tool that utilizes Linux namespaces, cgroups, rlimits and seccomp-bpf syscall filters, leveraging the Kafel BPF language for enhanced security.

Home Page: https://nsjail.dev

License: Apache License 2.0


nsjail's Introduction


This is NOT an official Google product.


Overview

NsJail is a process isolation tool for Linux. It uses the Linux namespace subsystem, resource limits, and the seccomp-bpf syscall filters of the Linux kernel.

It can help you with (among other things):

  • Isolating networking services (e.g. web, time, DNS) from the rest of the OS
  • Hosting computer security challenges (so-called CTFs)
  • Containing invasive syscall-level OS fuzzers

Features:


What forms of isolation does it provide

  1. Linux namespaces: UTS (hostname), MOUNT (chroot), PID (separate PID tree), IPC, NET (separate networking context), USER, CGROUPS
  2. FS constraints: chroot(), pivot_root(), RO-remounting, custom /proc and tmpfs mount points
  3. Resource limits (wall-time/CPU time limits, VM/mem address space limits, etc.)
  4. Programmable seccomp-bpf syscall filters (through the kafel language)
  5. Cloned and isolated Ethernet interfaces
  6. Cgroups for memory and PID utilization control

Which use-cases are supported

Isolation of network services (inetd style)

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

  • Server:
 $ ./nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
  • Client:
 $ nc 127.0.0.1 9000
 / $ ifconfig
 / $ ifconfig -a
 lo    Link encap:Local Loopback
       LOOPBACK  MTU:65536  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0
       TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
       RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 3 99999    {busybox} ps wuax
 / $

Isolation with access to a private, cloned interface (requires root/setuid)

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

$ sudo ./nsjail --user 9999 --group 9999 --macvlan_iface eth0 --chroot /chroot/ -Mo --macvlan_vs_ip 192.168.0.44 --macvlan_vs_nm 255.255.255.0 --macvlan_vs_gw 192.168.0.1 -- /bin/sh -i
/ $ id
uid=9999 gid=9999
/ $ ip addr sh
1: lo:  mtu 65536 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: vs:  mtu 1500 qdisc noqueue 
    link/ether ca:a2:69:21:33:66 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.44/24 brd 192.168.0.255 scope global vs
       valid_lft forever preferred_lft forever
    inet6 fe80::c8a2:69ff:fe21:cd66/64 scope link 
       valid_lft forever preferred_lft forever
/ $ nc 217.146.165.209 80
GET / HTTP/1.0

HTTP/1.0 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: https://www.google.ch/?gfe_rd=cr&ei=cEzWVrG2CeTI8ge88ofwDA
Content-Length: 258
Date: Wed, 02 Mar 2016 02:14:08 GMT

...
...
/ $ 

Isolation of local processes

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

 $ ./nsjail -Mo --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
 / $ ifconfig -a
 lo    Link encap:Local Loopback
       LOOPBACK  MTU:65536  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0
       TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
       RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
 / $ id
 uid=99999 gid=99999
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 4 99999    {busybox} ps wuax
 / $ exit
 $

Isolation of local processes (and re-running them, if necessary)

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

 $ ./nsjail -Mr --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
 BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
 Enter 'help' for a list of built-in commands.
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 2 99999    {busybox} ps wuax
 / $ exit
 BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
 Enter 'help' for a list of built-in commands.
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 2 99999    {busybox} ps wuax
 / $

Bash in a minimal file-system with uid==0 and access to /dev/urandom only

$ ./nsjail -Mo --user 0 --group 99999 -R /bin/ -R /lib -R /lib64/ -R /usr/ -R /sbin/ -T /dev -R /dev/urandom --keep_caps -- /bin/bash -i
[2017-05-24T17:08:02+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:08:02+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/bin/bash', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:true, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/usr/' dst:'/usr/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/sbin/' dst:'/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/dev' type:'tmpfs' flags:0 options:'size=4194304' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/dev/urandom' dst:'/dev/urandom' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:08:02+0200] Uid map: inside_uid:0 outside_uid:69664
[2017-05-24T17:08:02+0200] Gid map: inside_gid:99999 outside_gid:5000
[2017-05-24T17:08:02+0200] Executing '/bin/bash' for '[STANDALONE_MODE]'
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
bash-4.3# ls -l
total 28
drwxr-xr-x   2 65534 65534  4096 May 15 14:04 bin
drwxrwxrwt   2     0 99999    60 May 24 15:08 dev
drwxr-xr-x  28 65534 65534  4096 May 15 14:10 lib
drwxr-xr-x   2 65534 65534  4096 May 15 13:56 lib64
dr-xr-xr-x 391 65534 65534     0 May 24 15:08 proc
drwxr-xr-x   2 65534 65534 12288 May 15 14:16 sbin
drwxr-xr-x  17 65534 65534  4096 May 15 13:58 usr
bash-4.3# id
uid=0 gid=99999 groups=65534,99999
bash-4.3# exit
exit
[2017-05-24T17:08:05+0200] PID: 129839 exited with status: 0, (PIDs left: 0)

/usr/bin/find in a minimal file-system (only /usr/bin/find accessible from /usr/bin)

$ ./nsjail -Mo --user 99999 --group 99999 -R /lib/x86_64-linux-gnu/ -R /lib/x86_64-linux-gnu -R /lib64 -R /usr/bin/find -R /dev/urandom --keep_caps -- /usr/bin/find / | wc -l
[2017-05-24T17:04:37+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:04:37+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/usr/bin/find', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:true, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:04:37+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib/x86_64-linux-gnu/' dst:'/lib/x86_64-linux-gnu/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib/x86_64-linux-gnu' dst:'/lib/x86_64-linux-gnu' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib64' dst:'/lib64' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/usr/bin/find' dst:'/usr/bin/find' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:04:37+0200] Mount point: src:'/dev/urandom' dst:'/dev/urandom' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:04:37+0200] Uid map: inside_uid:99999 outside_uid:69664
[2017-05-24T17:04:37+0200] Gid map: inside_gid:99999 outside_gid:5000
[2017-05-24T17:04:37+0200] Executing '/usr/bin/find' for '[STANDALONE_MODE]'
/usr/bin/find: `/proc/tty/driver': Permission denied
2289
[2017-05-24T17:04:37+0200] PID: 129525 exited with status: 1, (PIDs left: 0)

Using /etc/subuid

$ tail -n1 /etc/subuid
user:10000000:1
$ ./nsjail -R /lib -R /lib64/ -R /usr/lib -R /usr/bin/ -R /usr/sbin/ -R /bin/ -R /sbin/ -R /dev/null -U 0:10000000:1 -u 0 -R /tmp/ -T /tmp/ -- /bin/ls -l /usr/
[2017-05-24T17:12:31+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:12:31+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/bin/ls', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/lib' dst:'/usr/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/bin/' dst:'/usr/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/sbin/' dst:'/usr/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/sbin/' dst:'/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/dev/null' dst:'/dev/null' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:12:31+0200] Mount point: src:'/tmp/' dst:'/tmp/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/tmp/' type:'tmpfs' flags:0 options:'size=4194304' isDir:True
[2017-05-24T17:12:31+0200] Uid map: inside_uid:0 outside_uid:69664
[2017-05-24T17:12:31+0200] Gid map: inside_gid:5000 outside_gid:5000
[2017-05-24T17:12:31+0200] Newuid mapping: inside_uid:'0' outside_uid:'10000000' count:'1'
[2017-05-24T17:12:31+0200] Executing '/bin/ls' for '[STANDALONE_MODE]'
total 120
drwxr-xr-x   5 65534 65534 77824 May 24 12:25 bin
drwxr-xr-x 210 65534 65534 20480 May 22 16:11 lib
drwxr-xr-x   4 65534 65534 20480 May 24 00:24 sbin
[2017-05-24T17:12:31+0200] PID: 130841 exited with status: 0, (PIDs left: 0)

Even more constrained shell (with seccomp-bpf policies)

$ ./nsjail --chroot / --seccomp_string 'ALLOW { write, execve, brk, access, mmap, open, openat, newfstat, close, read, mprotect, arch_prctl, munmap, getuid, getgid, getpid, rt_sigaction, geteuid, getppid, getcwd, getegid, ioctl, fcntl, newstat, clone, wait4, rt_sigreturn, exit_group } DEFAULT KILL' -- /bin/sh -i
[2017-01-15T21:53:08+0100] Mode: STANDALONE_ONCE
[2017-01-15T21:53:08+0100] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, uid:(ns:1000, global:1000), gid:(ns:1000, global:1000), time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-01-15T21:53:08+0100] Mount point: src:'/' dst:'/' type:'' flags:0x5001 options:''
[2017-01-15T21:53:08+0100] Mount point: src:'(null)' dst:'/proc' type:'proc' flags:0x0 options:''
[2017-01-15T21:53:08+0100] PID: 18873 about to execute '/bin/sh' for [STANDALONE_MODE]
/bin/sh: 0: can't access tty; job control turned off
$ set
IFS='
'
OPTIND='1'
PATH='/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
PPID='0'
PS1='$ '
PS2='> '
PS4='+ '
PWD='/'
$ id
Bad system call
$ exit
[2017-01-15T21:53:17+0100] PID: 18873 exited with status: 159, (PIDs left: 0)

Configuration file

You will also find all examples in the configs directory.


config.proto contains ProtoBuf schema for nsjail's configuration format.


You can examine an example config file in configs/bash-with-fake-geteuid.cfg.

Usage:

$ ./nsjail --config configs/bash-with-fake-geteuid.cfg

You can also override certain options with command-line options. Here, the executed binary (/bin/bash) is overridden with /usr/bin/id, yet options from configs/bash-with-fake-geteuid.cfg still apply:

$ ./nsjail --config configs/bash-with-fake-geteuid.cfg -- /usr/bin/id
...
[INSIDE-JAIL]: id
uid=999999 gid=999998 euid=4294965959 groups=999998,65534
[INSIDE-JAIL]: exit
[2017-05-27T18:45:40+0200] PID: 16579 exited with status: 0, (PIDs left: 0)

You might also want to try using configs/home-documents-with-xorg-no-net.cfg.

$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/evince /user/Documents/doc.pdf
$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/geeqie /user/Documents/
$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/gv /user/Documents/doc.pdf
$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/mupdf /user/Documents/doc.pdf

The configs/firefox-with-net.cfg config file will allow you to run firefox inside a sandboxed environment:

$ ./nsjail --config configs/firefox-with-net.cfg

A more complex setup, which utilizes virtualized (cloned) Ethernet interfaces (to separate it from the main network namespace), can be found in configs/firefox-with-cloned-net.cfg. Remember to change relevant UIDs and Ethernet interface names before use.

As using cloned Ethernet interfaces (MACVLAN) requires root privileges, you'll have to run it under sudo:

$ sudo ./nsjail --config configs/firefox-with-cloned-net.cfg

More info

The command-line options should be self-explanatory, while the proto-buf config options are described in config.proto

./nsjail --help
 Usage: ./nsjail [options] -- path_to_command [args]
 Options:
  --help|-h 
 	Help plz..
  --mode|-M VALUE
 	Execution mode (default: 'o' [MODE_STANDALONE_ONCE]):
	l: Wait for connections on a TCP port (specified with --port) [MODE_LISTEN_TCP]
	o: Launch a single process on the console using clone/execve [MODE_STANDALONE_ONCE]
	e: Launch a single process on the console using execve [MODE_STANDALONE_EXECVE]
	r: Launch a single process on the console with clone/execve, keep doing it forever [MODE_STANDALONE_RERUN]
  --config|-C VALUE
 	Configuration file in the config.proto ProtoBuf format (see configs/ directory for examples)
  --exec_file|-x VALUE
 	File to exec (default: argv[0])
  --execute_fd 
 	Use execveat() to execute a file-descriptor instead of executing the binary path. In such case argv[0]/exec_file denotes a file path before mount namespacing
  --chroot|-c VALUE
 	Directory containing / of the jail (default: none)
  --no_pivotroot 
 	When creating a mount namespace, use mount(MS_MOVE) and chroot rather than pivot_root. Usefull when pivot_root is disallowed (e.g. initramfs). Note: escapable is some configuration
  --rw 
 	Mount chroot dir (/) R/W (default: R/O)
  --user|-u VALUE
 	Username/uid of processes inside the jail (default: your current uid). You can also use inside_ns_uid:outside_ns_uid:count convention here. Can be specified multiple times
  --group|-g VALUE
 	Groupname/gid of processes inside the jail (default: your current gid). You can also use inside_ns_gid:global_ns_gid:count convention here. Can be specified multiple times
  --hostname|-H VALUE
 	UTS name (hostname) of the jail (default: 'NSJAIL')
  --cwd|-D VALUE
 	Directory in the namespace the process will run (default: '/')
  --port|-p VALUE
 	TCP port to bind to (enables MODE_LISTEN_TCP) (default: 0)
  --bindhost VALUE
 	IP address to bind the port to (only in [MODE_LISTEN_TCP]), (default: '::')
  --max_conns VALUE
 	Maximum number of connections across all IPs (only in [MODE_LISTEN_TCP]), (default: 0 (unlimited))
  --max_conns_per_ip|-i VALUE
 	Maximum number of connections per one IP (only in [MODE_LISTEN_TCP]), (default: 0 (unlimited))
  --log|-l VALUE
 	Log file (default: use log_fd)
  --log_fd|-L VALUE
 	Log FD (default: 2)
  --time_limit|-t VALUE
 	Maximum time that a jail can exist, in seconds (default: 600)
  --max_cpus VALUE
 	Maximum number of CPUs a single jailed process can use (default: 0 'no limit')
  --daemon|-d 
 	Daemonize after start
  --verbose|-v 
 	Verbose output
  --quiet|-q 
 	Log warning and more important messages only
  --really_quiet|-Q 
 	Log fatal messages only
  --keep_env|-e 
 	Pass all environment variables to the child process (default: all envars are cleared)
  --env|-E VALUE
 	Additional environment variable (can be used multiple times). If the envar doesn't contain '=' (e.g. just the 'DISPLAY' string), the current envar value will be used
  --keep_caps 
 	Don't drop any capabilities
  --cap VALUE
 	Retain this capability, e.g. CAP_PTRACE (can be specified multiple times)
  --silent 
 	Redirect child process' fd:0/1/2 to /dev/null
  --stderr_to_null 
 	Redirect child process' fd:2 (STDERR_FILENO) to /dev/null
  --skip_setsid 
 	Don't call setsid(), allows for terminal signal handling in the sandboxed process. Dangerous
  --pass_fd VALUE
 	Don't close this FD before executing the child process (can be specified multiple times), by default: 0/1/2 are kept open
  --disable_no_new_privs 
 	Don't set the prctl(NO_NEW_PRIVS, 1) (DANGEROUS)
  --rlimit_as VALUE
 	RLIMIT_AS in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 4096)
  --rlimit_core VALUE
 	RLIMIT_CORE in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 0)
  --rlimit_cpu VALUE
 	RLIMIT_CPU, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 600)
  --rlimit_fsize VALUE
 	RLIMIT_FSIZE in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 1)
  --rlimit_nofile VALUE
 	RLIMIT_NOFILE, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 32)
  --rlimit_nproc VALUE
 	RLIMIT_NPROC, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
  --rlimit_stack VALUE
 	RLIMIT_STACK in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
  --rlimit_memlock VALUE
 	RLIMIT_MEMLOCK in KB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
  --rlimit_rtprio VALUE
 	RLIMIT_RTPRIO, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
  --rlimit_msgqueue VALUE
 	RLIMIT_MSGQUEUE in bytes, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
  --disable_rlimits 
 	Disable all rlimits, default to limits set by parent
  --persona_addr_compat_layout 
 	personality(ADDR_COMPAT_LAYOUT)
  --persona_mmap_page_zero 
 	personality(MMAP_PAGE_ZERO)
  --persona_read_implies_exec 
 	personality(READ_IMPLIES_EXEC)
  --persona_addr_limit_3gb 
 	personality(ADDR_LIMIT_3GB)
  --persona_addr_no_randomize 
 	personality(ADDR_NO_RANDOMIZE)
  --disable_clone_newnet|-N 
 	Don't use CLONE_NEWNET. Enable global networking inside the jail
  --disable_clone_newuser 
 	Don't use CLONE_NEWUSER. Requires euid==0
  --disable_clone_newns 
 	Don't use CLONE_NEWNS
  --disable_clone_newpid 
 	Don't use CLONE_NEWPID
  --disable_clone_newipc 
 	Don't use CLONE_NEWIPC
  --disable_clone_newuts 
 	Don't use CLONE_NEWUTS
  --disable_clone_newcgroup 
 	Don't use CLONE_NEWCGROUP. Might be required for kernel versions < 4.6
  --enable_clone_newtime 
 	Use CLONE_NEWTIME. Supported with kernel versions >= 5.3
  --uid_mapping|-U VALUE
 	Add a custom uid mapping of the form inside_uid:outside_uid:count. Setting this requires newuidmap (set-uid) to be present
  --gid_mapping|-G VALUE
 	Add a custom gid mapping of the form inside_gid:outside_gid:count. Setting this requires newgidmap (set-uid) to be present
  --bindmount_ro|-R VALUE
 	List of mountpoints to be mounted --bind (ro) inside the container. Can be specified multiple times. Supports 'source' syntax, or 'source:dest'
  --bindmount|-B VALUE
 	List of mountpoints to be mounted --bind (rw) inside the container. Can be specified multiple times. Supports 'source' syntax, or 'source:dest'
  --tmpfsmount|-T VALUE
 	List of mountpoints to be mounted as tmpfs (R/W) inside the container. Can be specified multiple times. Supports 'dest' syntax. Alternatively, use '-m none:dest:tmpfs:size=8388608'
  --mount|-m VALUE
 	Arbitrary mount, format src:dst:fs_type:options
  --symlink|-s VALUE
 	Symlink, format src:dst
  --disable_proc 
 	Disable mounting procfs in the jail
  --proc_path VALUE
 	Path used to mount procfs (default: '/proc')
  --proc_rw 
 	Is procfs mounted as R/W (default: R/O)
  --seccomp_policy|-P VALUE
 	Path to file containing seccomp-bpf policy (see kafel/)
  --seccomp_string VALUE
 	String with kafel seccomp-bpf policy (see kafel/)
  --seccomp_log 
 	Use SECCOMP_FILTER_FLAG_LOG. Log all actions except SECCOMP_RET_ALLOW). Supported since kernel version 4.14
  --nice_level VALUE
 	Set jailed process niceness (-20 is highest -priority, 19 is lowest). By default, set to 19
  --cgroup_mem_max VALUE
 	Maximum number of bytes to use in the group (default: '0' - disabled)
  --cgroup_mem_memsw_max VALUE
 	Maximum number of memory+swap bytes to use (default: '0' - disabled)
  --cgroup_mem_swap_max VALUE
 	Maximum number of swap bytes to use (default: '-1' - disabled)
  --cgroup_mem_mount VALUE
 	Location of memory cgroup FS (default: '/sys/fs/cgroup/memory')
  --cgroup_mem_parent VALUE
 	Which pre-existing memory cgroup to use as a parent (default: 'NSJAIL')
  --cgroup_pids_max VALUE
 	Maximum number of pids in a cgroup (default: '0' - disabled)
  --cgroup_pids_mount VALUE
 	Location of pids cgroup FS (default: '/sys/fs/cgroup/pids')
  --cgroup_pids_parent VALUE
 	Which pre-existing pids cgroup to use as a parent (default: 'NSJAIL')
  --cgroup_net_cls_classid VALUE
 	Class identifier of network packets in the group (default: '0' - disabled)
  --cgroup_net_cls_mount VALUE
 	Location of net_cls cgroup FS (default: '/sys/fs/cgroup/net_cls')
  --cgroup_net_cls_parent VALUE
 	Which pre-existing net_cls cgroup to use as a parent (default: 'NSJAIL')
  --cgroup_cpu_ms_per_sec VALUE
 	Number of milliseconds of CPU time per second that the process group can use (default: '0' - no limit)
  --cgroup_cpu_mount VALUE
 	Location of cpu cgroup FS (default: '/sys/fs/cgroup/cpu')
  --cgroup_cpu_parent VALUE
 	Which pre-existing cpu cgroup to use as a parent (default: 'NSJAIL')
  --cgroupv2_mount VALUE
 	Location of cgroupv2 directory (default: '/sys/fs/cgroup')
  --use_cgroupv2 
 	Use cgroup v2
  --detect_cgroupv2 
 	Use cgroupv2, if it is available. (Specify instead of use_cgroupv2)
  --iface_no_lo 
 	Don't bring the 'lo' interface up
  --iface_own VALUE
 	Move this existing network interface into the new NET namespace. Can be specified multiple times
  --macvlan_iface|-I VALUE
 	Interface which will be cloned (MACVLAN) and put inside the subprocess' namespace as 'vs'
  --macvlan_vs_ip VALUE
 	IP of the 'vs' interface (e.g. "192.168.0.1")
  --macvlan_vs_nm VALUE
 	Netmask of the 'vs' interface (e.g. "255.255.255.0")
  --macvlan_vs_gw VALUE
 	Default GW for the 'vs' interface (e.g. "192.168.0.1")
  --macvlan_vs_ma VALUE
 	MAC-address of the 'vs' interface (e.g. "ba:ad:ba:be:45:00")
  --macvlan_vs_mo VALUE
 	Mode of the 'vs' interface. Can be either 'private', 'vepa', 'bridge' or 'passthru' (default: 'private')
  --disable_tsc 
 	Disable rdtsc and rdtscp instructions. WARNING: To make it effective, you also need to forbid `prctl(PR_SET_TSC, PR_TSC_ENABLE, ...)` in seccomp rules! (x86 and x86_64 only). Dynamic binaries produced by GCC seem to rely on RDTSC, but static ones should work.
  --forward_signals 
 	Forward fatal signals to the child process instead of always using SIGKILL.
 
 Examples: 
  Wait on a port 31337 for connections, and run /bin/sh
   nsjail -Ml --port 31337 --chroot / -- /bin/sh -i
  Re-run echo command as a sub-process
   nsjail -Mr --chroot / -- /bin/echo "ABC"
  Run echo command once only, as a sub-process
   nsjail -Mo --chroot / -- /bin/echo "ABC"
  Execute echo command directly, without a supervising process
   nsjail -Me --chroot / --disable_proc -- /bin/echo "ABC"

Launching in Docker

To launch nsjail in a Docker container, clone the repository and build the Docker image:

docker build -t nsjailcontainer .

This builds an image containing nsjail and kafel.

From now on, you can either use it in another Dockerfile (FROM nsjailcontainer) or run it directly:

docker run --privileged --rm -it nsjailcontainer nsjail --user 99999 --group 99999 --disable_proc --chroot / --time_limit 30 /bin/bash

Contact

nsjail's Issues

Alternatives to macvlan for network virtualization?

From my very coarse understanding of macvlan, it doesn't seem possible to use it if the adapter is virtualized or otherwise doesn't allow multiple MAC addresses (e.g. on VMware or a cloud hosting provider).

Either I misunderstand how to configure macvlan in these scenarios (and I hope I do!), or I need something else: is there an alternative to macvlan for forwarding network packets to the jail?

quoting in arg / doc request

Hi,

I have looked at https://github.com/google/nsjail/blob/master/config.proto for a moment, but it seems my issue is a bit more basic than what it explains.

I'm trying to start an apache instance for testing, but would like to pass some more arguments to it.
This seems to be a problem.

exec_bin {
        path: "/usr/sbin/httpd"
        arg : "-C 'Include /etc/apache2/sysconfig.d/loadmodule.conf' -C 'Include /etc/apache2/sysconfig.d/global.conf' -c 'Include /etc/apache2/sysconfig.d/include.conf' -f /etc/apache2/httpd.conf -X -T"   
}

Here -C Include /path/spec needs to be passed safely to the Apache process.
I've tried a few ways of quoting it and in various ways they end up passed wrong.

I've looked through all the examples in configs/ and unfortunately no one had an example case with a more complex arg than, say, "--hello-this-is-all-we-pass".

Should I happen to figure this out, I'll add it myself, but so far it looks like a no ;-)

The limit max_conns_per_ip is broken

nsjail/net.c

Lines 165 to 178 in e2529ce

unsigned int cnt = 0;
struct pids_t* p;
TAILQ_FOREACH(p, &nsjconf->pids, pointers) {
    if (memcmp(addr.sin6_addr.s6_addr, p->remote_addr.sin6_addr.s6_addr,
            sizeof(*p->remote_addr.sin6_addr.s6_addr)) == 0) {
        cnt++;
    }
}
if (cnt >= nsjconf->max_conns_per_ip) {
    LOG_W("Rejecting connection from '%s', max_conns_per_ip limit reached: %u", cs_addr,
        nsjconf->max_conns_per_ip);
    return false;
}

There is a redundant dereference operator in the sizeof(...) argument of memcmp, so it compares only the first byte of each address and rejects new connections too aggressively.

The relevant structures:

struct sockaddr_in6 {
    sa_family_t     sin6_family;   /* AF_INET6 */
    in_port_t       sin6_port;     /* port number */
    uint32_t        sin6_flowinfo; /* IPv6 flow information */
    struct in6_addr sin6_addr;     /* IPv6 address */
    uint32_t        sin6_scope_id; /* Scope ID (new in 2.4) */
};

struct in6_addr {
    unsigned char   s6_addr[16];   /* IPv6 address */
};

Feature request: rlimit max setting in config

One of my primary uses of nsjail is for isolated build containers. These require quite a lot of resources, so I often set some or most of the rlimit values to 'max' from the command line. It would be handy to be able to do this via the config file.

Init process created even when clone_newpid is false

Currently, when an init process is created in pidInitNs, it does this whenever the mode is execve, even when clone_newpid is false. This is problematic since, in that case, the new process will actually now be a child of the execve'd process, and when that process dies, the "init" process is reparented to 1, and becomes a dangling process that never exits.

I believe the line at

nsjail/pid.c

Line 36 in 152d6d6

if (nsjconf->mode != MODE_STANDALONE_EXECVE) {
should be modified to read:
if (nsjconf->mode != MODE_STANDALONE_EXECVE || !nsjconf->clone_newpid) {

That way, an init process would not be created if the mode is execve but clone_newpid is false.

--disable_clone_newuser ignored

Heya I'm trying to run this command

root@42c59e018224:/# nsjail -Mo --chroot /chroot/ --disable_clone_newcgroup --disable_clone_newuser -E LANG=en_US.UTF-8 -R/usr -R/lib -R/lib64 --user nobody --group nogroup --time_limit 2 --disable_proc --iface_no_lo --cgroup_mem_max 50000 --cgroup_pids_max 1 --quiet -- /usr/bin/python3.6 -ISq -c "print('test')" 

but I run into the following error message:

[2018-05-30T12:33:13+0000] [E][16] void subproc::runChild(nsjconf_t*, int, int, int)():441 clone(flags=CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWNET|SIGCHLD) failed. You probably need root privileges if your system doesn't support CLONE_NEWUSER. Alternatively, you might want to recompile your kernel with support for namespaces or check the setting of the kernel.unprivileged_userns_clone sysctl: Operation not permitted

It appears the --disable_clone_newuser is being ignored?

Termination of nsjail leaves cgroup entries in /sys/fs/cgroup/*/NSJAIL/NSJAIL.*

To reproduce, start nsjail in STANDALONE_ONCE mode with an active cgroup and terminate the nsjail session by pressing Ctrl+C (or by sending a termination signal to the nsjail process):

$ ./nsjail -Mo --chroot / --cgroup_pids_max 100 -v -- /bin/bash                   
[2018-01-02T23:08:33.823345+0100] Mode: STANDALONE_ONCE
[2018-01-02T23:08:33.823469+0100] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/bash', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:true, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, max_cpus:0
[2018-01-02T23:08:33.823555+0100] Mount point: src:'/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0 options:'' isDir:true
[2018-01-02T23:08:33.823599+0100] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2018-01-02T23:08:33.823647+0100] Uid map: inside_uid:1000 outside_uid:1000 count:1 newuidmap:false
[2018-01-02T23:08:33.823688+0100] Gid map: inside_gid:1000 outside_gid:1000 count:1 newgidmap:false
[2018-01-02T23:08:33.823737+0100] [D][31842] nsjailSetSigHandler():57 Setting sighandler for signal SIGINT (2)
[2018-01-02T23:08:33.823804+0100] [D][31842] nsjailSetSigHandler():57 Setting sighandler for signal SIGQUIT (3)
[2018-01-02T23:08:33.823863+0100] [D][31842] nsjailSetSigHandler():57 Setting sighandler for signal SIGUSR1 (10)
[2018-01-02T23:08:33.823920+0100] [D][31842] nsjailSetSigHandler():57 Setting sighandler for signal SIGALRM (14)
[2018-01-02T23:08:33.823975+0100] [D][31842] nsjailSetSigHandler():57 Setting sighandler for signal SIGCHLD (17)
[2018-01-02T23:08:33.824031+0100] [D][31842] nsjailSetSigHandler():57 Setting sighandler for signal SIGTERM (15)
[2018-01-02T23:08:33.824110+0100] [D][31842] subprocRunChild():455 Creating new process with clone flags:CLONE_NEWNS|CLONE_NEWCGROUP|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET|SIGCHLD
[2018-01-02T23:08:33.824208+0100] [D][31842] subprocClone():418 Cloning process with flags:CLONE_NEWNS|CLONE_NEWCGROUP|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET|SIGCHLD
[2018-01-02T23:08:33.827163+0100] [D][31842] subprocAdd():210 Added pid '31843' with start time '1514930913' to the queue for IP: '[STANDALONE_MODE]'
[2018-01-02T23:08:33.827275+0100] [D][31842] cgroupInitNsFromParentPids():90 Create '/sys/fs/cgroup/pids/NSJAIL/NSJAIL.31843' for PID=31843
[2018-01-02T23:08:33.827396+0100] [D][31842] cgroupInitNsFromParentPids():100 Setting '/sys/fs/cgroup/pids/NSJAIL/NSJAIL.31843/pids.max' to '100'
[2018-01-02T23:08:33.827477+0100] [D][31842] utilWriteBufToFile():137 Written '3' bytes to '/sys/fs/cgroup/pids/NSJAIL/NSJAIL.31843/pids.max'
[2018-01-02T23:08:33.827536+0100] [D][31842] cgroupInitNsFromParentPids():109 Adding PID='31843' to '/sys/fs/cgroup/pids/NSJAIL/NSJAIL.31843/tasks'
[2018-01-02T23:08:33.827622+0100] [D][31842] utilWriteBufToFile():137 Written '5' bytes to '/sys/fs/cgroup/pids/NSJAIL/NSJAIL.31843/tasks'
[2018-01-02T23:08:33.827702+0100] [D][31842] utilWriteBufToFile():137 Written '4' bytes to '/proc/31843/setgroups'
[2018-01-02T23:08:33.827769+0100] [D][31842] userGidMapSelf():145 Writing '1000 1000 1
' to '/proc/31843/gid_map'
[2018-01-02T23:08:33.827838+0100] [D][31842] utilWriteBufToFile():137 Written '12' bytes to '/proc/31843/gid_map'
[2018-01-02T23:08:33.827912+0100] [D][31842] userUidMapSelf():117 Writing '1000 1000 1
' to '/proc/31843/uid_map'
[2018-01-02T23:08:33.827980+0100] [D][31842] utilWriteBufToFile():137 Written '12' bytes to '/proc/31843/uid_map'
[2018-01-02T23:08:33.828057+0100] [D][1] userInitNsFromChild():291 setgroups(0, NULL)
[2018-01-02T23:08:33.828198+0100] [D][1] userInitNsFromChild():294 setgroups(NULL) failed: Operation not permitted
[2018-01-02T23:08:33.828264+0100] [D][1] userSetResGid():48 setresgid(1000)
[2018-01-02T23:08:33.828313+0100] [D][1] userSetResUid():64 setresuid(1000)
[2018-01-02T23:08:33.828386+0100] [D][1] mountMkdirAndTest():281 Created accessible directory in '/run/user/1000/nsjail.root'
[2018-01-02T23:08:33.828554+0100] [D][1] mountMkdirAndTest():281 Created accessible directory in '/run/user/1000/nsjail.tmp'
[2018-01-02T23:08:33.828699+0100] [D][1] mountMount():124 Mounting 'src:'/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0 options:'' isDir:true'
[2018-01-02T23:08:33.828979+0100] [D][1] mountMount():124 Mounting 'src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true'
[2018-01-02T23:08:33.869151+0100] [D][1] mountRemountRO():263 Re-mounting R/O '/' (flags:MS_RDONLY|MS_REMOUNT|MS_BIND|MS_RELATIME|0)
[2018-01-02T23:08:33.869254+0100] [D][1] mountRemountRO():263 Re-mounting R/O '/proc' (flags:MS_RDONLY|MS_REMOUNT|MS_BIND|MS_RELATIME|0)
[2018-01-02T23:08:33.869557+0100] [D][1] utsInitNs():34 Setting hostname to 'NSJAIL'
[2018-01-02T23:08:33.869659+0100] [D][1] capsInitNs():234 Adding the following capabilities to the inheritable set:
[2018-01-02T23:08:33.869971+0100] [D][1] capsInitNs():257 Dropped the following capabilities from the bounding set: CAP_CHOWN CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_FSETID CAP_KILL CAP_SETGID CAP_SETUID CAP_SETPCAP CAP_LINUX_IMMUTABLE CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_ADMIN CAP_NET_RAW CAP_IPC_LOCK CAP_IPC_OWNER CAP_SYS_MODULE CAP_SYS_RAWIO CAP_SYS_CHROOT CAP_SYS_PTRACE CAP_SYS_PACCT CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_NICE CAP_SYS_RESOURCE CAP_SYS_TIME CAP_SYS_TTY_CONFIG CAP_MKNOD CAP_LEASE CAP_AUDIT_WRITE CAP_AUDIT_CONTROL CAP_SETFCAP CAP_MAC_OVERRIDE CAP_MAC_ADMIN CAP_SYSLOG CAP_WAKE_ALARM CAP_BLOCK_SUSPEND CAP_AUDIT_READ
[2018-01-02T23:08:33.870038+0100] [D][1] capsInitNs():271 Added the following capabilities to the ambient set:
[2018-01-02T23:08:33.870082+0100] [D][1] cpuInit():65 No max_cpus limit set
[2018-01-02T23:08:33.870249+0100] [D][1] containMakeFdsCOEProc():220 FD=0 will be passed to the child process
[2018-01-02T23:08:33.870287+0100] [D][1] containMakeFdsCOEProc():220 FD=1 will be passed to the child process
[2018-01-02T23:08:33.870315+0100] [D][1] containMakeFdsCOEProc():220 FD=2 will be passed to the child process
[2018-01-02T23:08:33.870342+0100] [D][1] containMakeFdsCOEProc():227 FD=3 will be closed before execve()
[2018-01-02T23:08:33.870367+0100] [D][1] containMakeFdsCOEProc():227 FD=4 will be closed before execve()
[2018-01-02T23:08:33.870393+0100] [D][1] containMakeFdsCOEProc():227 FD=5 will be closed before execve()
[2018-01-02T23:08:33.870435+0100] Executing '/bin/bash' for '[STANDALONE_MODE]'
[2018-01-02T23:08:33.870491+0100] [D][1] subprocNewProc():172  Arg[0]: '/bin/bash'
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
bob@NSJAIL:/$ <PRESS CTRL+C HERE>
[2018-01-02T23:08:58.486616+0100] Server stops due to fatal signal (2) caught. Exiting

Observed behavior: empty cgroup NSJAIL.31843 is not removed from the tree:

$ ls /sys/fs/cgroup/pids/NSJAIL/
cgroup.clone_children  cgroup.procs  notify_on_release  NSJAIL.31843  pids.current  pids.events  pids.max  tasks

Expected behavior: the NSJAIL.31843 node is removed from the cgroup tree.
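
An empty cgroup is removed by calling rmdir on its directory, so the expected cleanup could look roughly like the sketch below. It is not the nsjail implementation: the helper name remove_jail_cgroup is hypothetical, and the NSJAIL.<pid> naming is taken only from the log above.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: removes <parent>/NSJAIL.<pid>.
 * Returns 0 if the directory was removed or did not exist, -1 otherwise. */
int remove_jail_cgroup(const char *parent, pid_t pid) {
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/NSJAIL.%d", parent, (int)pid);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    /* rmdir() only succeeds on a cgroup that no longer contains any tasks. */
    if (rmdir(path) == -1 && errno != ENOENT) {
        return -1;
    }
    return 0;
}
```

For the session above, the call would be remove_jail_cgroup("/sys/fs/cgroup/pids/NSJAIL", 31843), made once the jailed process has been reaped, including on the path taken after a fatal signal.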

Can't run as different user when --disable_clone_newuser is passed in

When running this utility with user namespaces enabled, generally you'll have the outside UID/GID be an unprivileged user, and the inside UID/GID be root. This is needed for nsjail, since the namespace user needs the capability to mount the root tmpfs for the chroot'd environment (in /dev/shm/nsjail.root).

However, if we don't want to run with clone_newuser (maybe because we don't want to expose that attack surface, or our kernel doesn't support it), the mount() operation will fail. This is because the containUserNs() helper is called first, and simply sets our uid/gid if user namespaces are disabled. We are dropping our privileges too early for anything else important to take place (like the mount() operations in containInitMountNs).

bool containContain(struct nsjconf_t * nsjconf)
{
    if (containUserNs(nsjconf) == false) { /* <-- uid/gid dropped here */
        return false;
    }
    if (containInitPidNs(nsjconf) == false) {
        return false;
    }
    if (containInitMountNs(nsjconf) == false) {
        return false;
    }
    if (containInitNetNs(nsjconf) == false) {
        return false;
    }
    if (containInitUtsNs(nsjconf) == false) {
        return false;
    }
    if (containInitCgroupNs() == false) {
        return false;
    }
    if (containDropPrivs(nsjconf) == false) {
        return false;
    }
    /* -----> containUserNs() should be moved here. <------ */
/* */
    /* As non-root */
    if (containCPU(nsjconf) == false) {
        return false;
    }
    if (containSetLimits(nsjconf) == false) {
        return false;
    }
    if (containPrepareEnv(nsjconf) == false) {
        return false;
    }
    if (containMakeFdsCOE(nsjconf) == false) {
        return false;
    }
    return true;
}

Let's try running nsjail -u <non_root> -g <non_root> -R /bin -R /sbin -R /lib -R /lib64 -R /usr -c /tmp/user --disable_clone_newuser -- /bin/bash

Sure enough, running this without the patch we get:

....
[2017-07-24T19:04:01-0700] [D][1] mountMkdirAndTest():255 Created accessible directory in '/dev/shm/nsjail.root'
[2017-07-24T19:04:01-0700] [E][1] mountInitNsInternal():312 mount('/dev/shm/nsjail.root', 'tmpfs'): Operation not permitted
....

Running with the patch, we get:

....
[2017-07-24T19:04:56-0700] [D][1] mountMkdirAndTest():255 Created accessible directory in '/dev/shm/nsjail.root'
[2017-07-24T19:04:56-0700] [D][1] mountMkdirAndTest():255 Created accessible directory in '/dev/shm/nsjail.tmp'
[2017-07-24T19:04:56-0700] [D][1] mountMount():125 mounting 'src:'/tmp/user' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true'
....

Let me know if you have any questions or if the proposed patch introduces other issues.

-Nathan

Confusing cpu cgroup desc.

--cgroup_cpu_ms_per_sec VALUE
	Number of us that the process group can use per second (default: '0' - disabled)

ms_per_sec implies milliseconds, but the description says 'us'. Does 'us' mean microseconds here, which would contradict the option name?
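
For reference, the CFS bandwidth controller files cpu.cfs_period_us and cpu.cfs_quota_us take values in microseconds, so an option expressed in milliseconds per second has to be multiplied by 1000 before it is written there. A minimal sketch of that conversion, assuming a one-second period; the helper name and the CFS_PERIOD_US constant are illustrative and not taken from nsjail:

```c
#include <stdint.h>

/* Assumed period of one second, expressed in microseconds. */
#define CFS_PERIOD_US 1000000ULL

/* Hypothetical helper: converts "milliseconds of CPU time per second" into a
 * CFS quota in microseconds for a one-second period. */
uint64_t cpu_ms_per_sec_to_quota_us(uint64_t ms_per_sec) {
    return ms_per_sec * 1000ULL;
}
```

Under these assumptions, a value of 500 means half a CPU: a quota of 500000 us out of every 1000000 us period.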

Invalid chroot path causes loop

I didn't see this reported, which is kind of odd. Starting nsjail with an invalid chroot path causes it to spin in a loop. I'm also not sure where /run/user comes from; shouldn't the correct path be either /var/run or /run/nsjail?

example (/chroot is intentionally bogus):

[root@localhost nsjail]# ./nsjail -Mr --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
[2018-07-13T20:18:53-0400] Mode: STANDALONE_RERUN
[2018-07-13T20:18:53-0400] Jail parameters: hostname:'NSJAIL', chroot:'/chroot/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:true, keep_caps:false, disable_no_new_privs:false, max_cpus:0
[2018-07-13T20:18:53-0400] Mount point: src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-07-13T20:18:53-0400] Mount point: src:'' dst:'/proc' flags:'MS_RDONLY' type:'proc' options:'' is_dir:true
[2018-07-13T20:18:53-0400] Uid map: inside_uid:99999 outside_uid:0 count:1 newuidmap:false
[2018-07-13T20:18:53-0400] [W][13526] void cmdline::logParams(nsjconf_t*)():236 Process will be UID/EUID=0 in the global user namespace, and will have user root-level access to files
[2018-07-13T20:18:53-0400] Gid map: inside_gid:99999 outside_gid:0 count:1 newgidmap:false
[2018-07-13T20:18:53-0400] [W][13526] void cmdline::logParams(nsjconf_t*)():247 Process will be GID/EGID=0 in the global user namespace, and will have group root-level access to files
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13527 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13529 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13530 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13531 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13532 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13533 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:53-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:53-0400] PID: 13534 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:54-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:54-0400] PID: 13537 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:54-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:54-0400] PID: 13540 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:54-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:54-0400] PID: 13541 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:54-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:54-0400] PID: 13542 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:54-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:54-0400] PID: 13543 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:54-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:54-0400] PID: 13544 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:55-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:55-0400] PID: 13545 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
[2018-07-13T20:18:55-0400] [W][1] bool mnt::mountPt(mount_t*, const char*, const char*)():204 mount('src:'/chroot/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true') src:'/chroot/' dstpath:'/run/user//nsjail.0.root//' failed: No such file or directory
[2018-07-13T20:18:55-0400] PID: 13548 ([STANDALONE MODE]) exited with status: 255, (PIDs left: 0)
^C[2018-07-13T20:18:55-0400] Server stops due to fatal signal (2) caught. Exiting
[root@localhost nsjail]# ^C
[root@localhost nsjail]# 
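
The log shows Mode: STANDALONE_RERUN (-Mr), which relaunches the process every time it exits, so a mount failure that can never succeed repeats indefinitely. One way to fail fast would be a pre-flight check on the chroot path before the first launch. The sketch below is only an illustration; the helper name chroot_path_usable is hypothetical and nothing here is taken from the nsjail source.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical pre-flight check: true only if the path exists and is a
 * directory, i.e. it could serve as the source of the root bind mount. */
bool chroot_path_usable(const char *path) {
    struct stat st;
    if (stat(path, &st) == -1) {
        return false; /* missing path, permission problem, etc. */
    }
    return S_ISDIR(st.st_mode);
}
```

A caller could run this once at startup and exit with an error instead of entering the rerun loop when it returns false.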

tmpfs size is not limited by default, `--tmpfs_size` argument/`cmdlineTmpfsSz` var are unused

For example:

$ ./nsjail -Mo -v --chroot /mnt/media/public/nsjail/ubuntu_16.04_extra/ --tmpfs_size 10000 -T /tempdir -- /bin/bash
[2018-01-08T01:24:01.491134+0100] Mode: STANDALONE_ONCE
[2018-01-08T01:24:01.491271+0100] Jail parameters: hostname:'NSJAIL', chroot:'/mnt/media/public/nsjail/ubuntu_16.04_extra/', process:'/bin/bash', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:true, keep_caps:false, tmpfs_size:10000, disable_no_new_privs:false, max_cpus:0
[2018-01-08T01:24:01.491318+0100] Mount point: src:'/mnt/media/public/nsjail/ubuntu_16.04_extra/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0 options:'' isDir:true
[2018-01-08T01:24:01.491342+0100] Mount point: src:'[NULL]' dst:'/tempdir' type:'tmpfs' flags:0 options:'' isDir:true
[2018-01-08T01:24:01.491365+0100] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2018-01-08T01:24:01.491384+0100] Uid map: inside_uid:1000 outside_uid:1000 count:1 newuidmap:false
[2018-01-08T01:24:01.491403+0100] Gid map: inside_gid:1000 outside_gid:1000 count:1 newgidmap:false
[2018-01-08T01:24:01.491423+0100] [D][11948] nsjailSetSigHandler():57 Setting sighandler for signal SIGINT (2)
[2018-01-08T01:24:01.491454+0100] [D][11948] nsjailSetSigHandler():57 Setting sighandler for signal SIGQUIT (3)
[2018-01-08T01:24:01.491480+0100] [D][11948] nsjailSetSigHandler():57 Setting sighandler for signal SIGUSR1 (10)
[2018-01-08T01:24:01.491505+0100] [D][11948] nsjailSetSigHandler():57 Setting sighandler for signal SIGALRM (14)
[2018-01-08T01:24:01.491530+0100] [D][11948] nsjailSetSigHandler():57 Setting sighandler for signal SIGCHLD (17)
[2018-01-08T01:24:01.491555+0100] [D][11948] nsjailSetSigHandler():57 Setting sighandler for signal SIGTERM (15)
[2018-01-08T01:24:01.491596+0100] [D][11948] subprocRunChild():455 Creating new process with clone flags:CLONE_NEWNS|CLONE_NEWCGROUP|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET|SIGCHLD
[2018-01-08T01:24:01.491651+0100] [D][11948] subprocClone():418 Cloning process with flags:CLONE_NEWNS|CLONE_NEWCGROUP|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET|SIGCHLD
[2018-01-08T01:24:01.493758+0100] [D][11948] subprocAdd():210 Added pid '11949' with start time '1515371041' to the queue for IP: '[STANDALONE_MODE]'
[2018-01-08T01:24:01.493824+0100] [D][11948] utilWriteBufToFile():137 Written '4' bytes to '/proc/11949/setgroups'
[2018-01-08T01:24:01.493866+0100] [D][11948] userGidMapSelf():145 Writing '1000 1000 1
' to '/proc/11949/gid_map'
[2018-01-08T01:24:01.493901+0100] [D][11948] utilWriteBufToFile():137 Written '12' bytes to '/proc/11949/gid_map'
[2018-01-08T01:24:01.493935+0100] [D][11948] userUidMapSelf():117 Writing '1000 1000 1
' to '/proc/11949/uid_map'
[2018-01-08T01:24:01.493965+0100] [D][11948] utilWriteBufToFile():137 Written '12' bytes to '/proc/11949/uid_map'
[2018-01-08T01:24:01.494022+0100] [D][1] userInitNsFromChild():291 setgroups(0, NULL)
[2018-01-08T01:24:01.494161+0100] [D][1] userInitNsFromChild():294 setgroups(NULL) failed: Operation not permitted
[2018-01-08T01:24:01.494203+0100] [D][1] userSetResGid():48 setresgid(1000)
[2018-01-08T01:24:01.494232+0100] [D][1] userSetResUid():64 setresuid(1000)
[2018-01-08T01:24:01.494284+0100] [D][1] mountMkdirAndTest():281 Created accessible directory in '/run/user/1000/nsjail.root'
[2018-01-08T01:24:01.494412+0100] [D][1] mountMkdirAndTest():281 Created accessible directory in '/run/user/1000/nsjail.tmp'
[2018-01-08T01:24:01.494519+0100] [D][1] mountMount():124 Mounting 'src:'/mnt/media/public/nsjail/ubuntu_16.04_extra/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0 options:'' isDir:true'
[2018-01-08T01:24:01.494627+0100] [D][1] mountMount():124 Mounting 'src:'[NULL]' dst:'/tempdir' type:'tmpfs' flags:0 options:'' isDir:true'
[2018-01-08T01:24:01.494759+0100] [D][1] mountMount():124 Mounting 'src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true'
[2018-01-08T01:24:01.544649+0100] [D][1] mountRemountRO():263 Re-mounting R/O '/' (flags:MS_RDONLY|MS_REMOUNT|MS_BIND|MS_RELATIME|0)
[2018-01-08T01:24:01.544801+0100] [D][1] mountRemountRO():263 Re-mounting R/O '/proc' (flags:MS_RDONLY|MS_REMOUNT|MS_BIND|MS_RELATIME|0)
[2018-01-08T01:24:01.545106+0100] [D][1] utsInitNs():34 Setting hostname to 'NSJAIL'
[2018-01-08T01:24:01.545198+0100] [D][1] capsInitNs():234 Adding the following capabilities to the inheritable set:
[2018-01-08T01:24:01.545357+0100] [D][1] capsInitNs():257 Dropped the following capabilities from the bounding set: CAP_CHOWN CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_FSETID CAP_KILL CAP_SETGID CAP_SETUID CAP_SETPCAP CAP_LINUX_IMMUTABLE CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_ADMIN CAP_NET_RAW CAP_IPC_LOCK CAP_IPC_OWNER CAP_SYS_MODULE CAP_SYS_RAWIO CAP_SYS_CHROOT CAP_SYS_PTRACE CAP_SYS_PACCT CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_NICE CAP_SYS_RESOURCE CAP_SYS_TIME CAP_SYS_TTY_CONFIG CAP_MKNOD CAP_LEASE CAP_AUDIT_WRITE CAP_AUDIT_CONTROL CAP_SETFCAP CAP_MAC_OVERRIDE CAP_MAC_ADMIN CAP_SYSLOG CAP_WAKE_ALARM CAP_BLOCK_SUSPEND CAP_AUDIT_READ
[2018-01-08T01:24:01.545397+0100] [D][1] capsInitNs():271 Added the following capabilities to the ambient set:
[2018-01-08T01:24:01.545428+0100] [D][1] cpuInit():65 No max_cpus limit set
[2018-01-08T01:24:01.545602+0100] [D][1] containMakeFdsCOEProc():220 FD=0 will be passed to the child process
[2018-01-08T01:24:01.545648+0100] [D][1] containMakeFdsCOEProc():220 FD=1 will be passed to the child process
[2018-01-08T01:24:01.545672+0100] [D][1] containMakeFdsCOEProc():220 FD=2 will be passed to the child process
[2018-01-08T01:24:01.545697+0100] [D][1] containMakeFdsCOEProc():227 FD=3 will be closed before execve()
[2018-01-08T01:24:01.545721+0100] [D][1] containMakeFdsCOEProc():227 FD=4 will be closed before execve()
[2018-01-08T01:24:01.545746+0100] [D][1] containMakeFdsCOEProc():227 FD=5 will be closed before execve()
[2018-01-08T01:24:01.545796+0100] Executing '/bin/bash' for '[STANDALONE_MODE]'
[2018-01-08T01:24:01.545817+0100] [D][1] subprocNewProc():172  Arg[0]: '/bin/bash'
user@NSJAIL:/$ mount
/dev/mapper/seagate-oldmedia on / type ext4 (ro,relatime,commit=600,data=ordered)
none on /tempdir type tmpfs (rw,relatime,uid=1000,gid=1000)
none on /proc type proc (ro,relatime)
user@NSJAIL:/$ exit
[2018-01-08T01:24:09.292774+0100] PID: 11949 ([STANDALONE_MODE]) exited with status: 0, (PIDs left: 0)
[2018-01-08T01:24:09.292910+0100] [D][11948] subprocRemove():218 Removing pid '11949' from the queue (IP:'[STANDALONE_MODE]', start time:'2018-01-08T01:24:01+0100')

Notice that there is no size=... in the /tempdir mount options.
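
tmpfs takes its limit through a size= option in the data argument of mount(2), so honouring --tmpfs_size would involve building such a string for each tmpfs mount. A minimal sketch of that step, assuming the size is given in bytes; the helper name tmpfs_size_option is hypothetical and this is not the nsjail code:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats the tmpfs "size=<bytes>" mount option into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
int tmpfs_size_option(char *buf, size_t cap, size_t size_bytes) {
    int n = snprintf(buf, cap, "size=%zu", size_bytes);
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    return 0;
}
```

For the command above, the resulting string would be "size=10000", and a mount set up that way would list a size= entry among the /tempdir mount options.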

Error re-mounting chroot in RO

Running the following command fails when re-mounting read-only, with mountRemountRO():228 mount('/', flags:MS_RDONLY|MS_REMOUNT|MS_NOATIME|0): Operation not permitted:

$ nsjail --chroot / -- /bin/sh -i
[2017-07-06T13:32:26+0000] Mode: STANDALONE_ONCE
[2017-07-06T13:32:26+0000] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false
[2017-07-06T13:32:26+0000] Mount point: src:'/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-07-06T13:32:26+0000] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2017-07-06T13:32:26+0000] Uid map: inside_uid:109 outside_uid:109 count:1 newuidmap:false
[2017-07-06T13:32:26+0000] Gid map: inside_gid:117 outside_gid:117 count:1 newgidmap:false
[2017-07-06T13:32:26+0000] [W][1] mountRemountRO():228 mount('/', flags:MS_RDONLY|MS_REMOUNT|MS_NOATIME|0): Operation not permitted
[2017-07-06T13:32:26+0000] PID: 16959 exited with status: 1, (PIDs left: 0)

Changing the mount options from

/dev/dm-0 / ext4 rw,noatime,errors=remount-ro,data=ordered 0 0

to

/dev/dm-0 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0

fixes the issue.

cgroup cpu configurations are not parsed

I was wondering why the cgroup cpu configurations were not being picked up. It looks like cgroup_cpu_ms_per_sec, cgroup_cpu_mount and cgroup_cpu_parent are not set in config.cc: configParseInternal.

nsjail + systemd

Usage question:

What's the best way to integrate nsjail with systemd?
systemd has some features that overlap with nsjail and could conflict.
Ideas:

  • nsjail as setuid, capabilities?
  • systemd unit running as root
  • don't use systemd security features at all
  • ....

is_root_rw deprecated, but required?

This could quite easily be user error, but I am finding that unless I use the deprecated configuration option is_root_rw: true, my mount namespace is read-only, even with a bind mount with dst: "/" and rw: true.

setresgid(): Function not implemented

root@LEDE:~# nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
[2017-10-15T10:02:43+0000] Mode: LISTEN_TCP
[2017-10-15T10:02:43+0000] Jail parameters: hostname:'NSJAIL', chroot:'/chroot/', process:'/bin/sh', bind:[::]:9000, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, max_cpus:0
[2017-10-15T10:02:43+0000] Mount point: src:'/chroot/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-10-15T10:02:43+0000] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2017-10-15T10:02:43+0000] Uid map: inside_uid:99999 outside_uid:0 count:1 newuidmap:false
[2017-10-15T10:02:43+0000] [W][1232] cmdlineLogParams():243 Process will be UID/EUID=0 in the global user namespace, and will have user root-level access to files
[2017-10-15T10:02:43+0000] Gid map: inside_gid:99999 outside_gid:0 count:1 newgidmap:false
[2017-10-15T10:02:43+0000] [W][1232] cmdlineLogParams():252 Process will be GID/EGID=0 in the global user namespace, and will have group root-level access to files
[2017-10-15T10:02:43+0000] Listening on [::]:9000
[2017-10-15T10:02:51+0000] New connection from: [::ffff:192.168.1.245]:56302 on: [::ffff:192.168.1.52]:9000
[2017-10-15T10:02:51+0000] [E][1] userInitNsFromChild():271 setresgid(99999): Function not implemented
[2017-10-15T10:02:52+0000] PID: 1233 ([::ffff:192.168.1.245]:56302) exited with status: 255, (PIDs left: 0)

I get the setresgid(x): Function not implemented error when I try other examples too, regardless of whether I use the --user/--group arguments.

I'm running nsjail on a Raspberry Pi 2 (ARM) running LEDE (OpenWrt).

Create a simple tool for policies definitions

I couldn't find a simple tool that defines the syscalls, permissions, etc. of a program.
I created this simple script, based on strace and grep, that gets a list of all the unique syscalls a binary makes at run time.
This way, one can easily whitelist some flows of a program without disassembling it.
I'd be happy to create something bigger than that: a solution for nsjail+firejail policy definitions.

https://github.com/avilum/syscalls
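
The core step of such a tool, pulling the syscall name out of one line of strace output, can be sketched as follows. This is not the linked script: the helper name strace_syscall_name is hypothetical, and de-duplicating the names is left out.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies the syscall name from one line of strace output.
 * A line such as "openat(AT_FDCWD, ...) = 3" yields "openat".
 * Returns 0 on success, -1 if the line does not start with an identifier
 * followed by '(' or if the output buffer is too small. */
int strace_syscall_name(const char *line, char *out, size_t cap) {
    size_t n = 0;
    while (isalnum((unsigned char)line[n]) || line[n] == '_') {
        n++;
    }
    if (n == 0 || line[n] != '(' || n >= cap) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = line[i];
    }
    out[n] = '\0';
    return 0;
}
```

Lines that are not syscalls, such as signal notices or the final exit message, are rejected because they do not start with an identifier followed by an opening parenthesis.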

build on CentOS 7 with ProtoBuf support

I opened this issue just for documentation purposes.

If you try to compile under CentOS with ProtoBuf support, you will get this error:

cc -o nsjail nsjail.o cmdline.o config.o contain.o log.o cgroup.o mount.o net.o pid.o sandbox.o subproc.o user.o util.o uts.o config.pb-c.o kafel/libkafel.a protobuf-c-text/protobuf-c-text/.libs/libprotobuf-c-text.a -Wl,-z,now -Wl,-z,relro -pie -Wl,-z,noexecstack -lprotobuf-c
/bin/ld: protobuf-c-text/protobuf-c-text/.libs/libprotobuf-c-text.a(protobuf_c_text_libprotobuf_c_text_la-generate.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
protobuf-c-text/protobuf-c-text/.libs/libprotobuf-c-text.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make: *** [nsjail] Error 1

To solve this issue, run:

cd protobuf-c-text/ ; export CFLAGS=-fPIC; ./autogen.sh; make

Then you can go back to the nsjail directory and build as usual.

Mapping UIDs fails

Edit: I just realized the 2.7 tag from the release page works just fine.

Version: Current master (122f251)

I'm trying to use nsjail to hop into a chroot while remapping UIDs, so I can be root in the chroot, but have a high, unprivileged UID outside.

nsjail -v -Mo --chroot /alpine/ -u 0:100000:65536 -g 0:100000:65536 --rw -- /bin/sh

Without remapping UIDs this works fine. If I try to remap UIDs, though, newuidmap fails with EPERM while trying to write to uid_map. I've confirmed this by replacing newuidmap with a wrapper that runs it under strace.

In uidGidMap we first try to write to it ourselves, and then execute newuidmap, which tries to write to it again. (Same goes for gid map)

According to the man page for user namespaces:

   After the creation of a new user namespace, the uid_map file of one
   of the processes in the namespace may be written to once to define
   the mapping of user IDs in the new user namespace.  An attempt to
   write more than once to a uid_map file in a user namespace fails with
   the error EPERM.  Similar rules apply for gid_map files.

I assume that's a bug, and the correct behaviour would be that only one of those two writes needs to succeed?
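The "only one of the two needs to succeed" behaviour asked about above can be sketched as follows. This is an illustration only, not nsjail's code: `WriteOnceMap` is a toy stand-in for a uid_map file that rejects a second write with EPERM, and `set_id_map`, `write_direct` and `run_helper` are hypothetical names.

```python
import errno
from typing import Callable, Optional


class WriteOnceMap:
    """Toy stand-in for a uid_map file: the first write wins, later ones get EPERM."""

    def __init__(self) -> None:
        self.value: Optional[str] = None

    def write(self, mapping: str) -> None:
        if self.value is not None:
            raise PermissionError(errno.EPERM, "map already written")
        self.value = mapping


def set_id_map(write_direct: Callable[[], None], run_helper: Callable[[], None]) -> str:
    """Try the direct write first; fall back to the helper only if that fails."""
    try:
        write_direct()
        return "direct"
    except OSError:
        pass
    run_helper()  # an error from the helper propagates to the caller
    return "helper"


# Case 1: the direct write succeeds, so the helper is never run.
id_map = WriteOnceMap()
first = set_id_map(
    lambda: id_map.write("0 100000 65536"),
    lambda: id_map.write("0 100000 65536"),
)


# Case 2: the direct write is refused, so the helper performs the single write.
def denied() -> None:
    raise PermissionError(errno.EPERM, "direct write not permitted")


other_map = WriteOnceMap()
second = set_id_map(denied, lambda: other_map.write("0 100000 65536"))
print(first, second)
```

In both cases the map is written exactly once, which is what the quoted man page passage requires.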

Subprocess exit status is 255, but nsjail exits with 55

root@ubuntu-14-04-f878f8544-6l8bf:/# /opt/goma/bin/nsjail --chroot / --rw --disable_proc -- /bin/ls
[2018-06-06T06:25:12+0000] Mode: STANDALONE_ONCE
[2018-06-06T06:25:12+0000] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/ls', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, max_cpus:0
[2018-06-06T06:25:12+0000] Mount point: src:'/' dst:'/' type:'' flags:MS_BIND|MS_REC|0 options:'' isDir:true
[2018-06-06T06:25:12+0000] Uid map: inside_uid:0 outside_uid:0 count:1 newuidmap:false
[2018-06-06T06:25:12+0000] [W][65] cmdlineLogParams():243 Process will be UID/EUID=0 in the global user namespace, and will have user root-level access to files
[2018-06-06T06:25:12+0000] Gid map: inside_gid:0 outside_gid:0 count:1 newgidmap:false
[2018-06-06T06:25:12+0000] [W][65] cmdlineLogParams():252 Process will be GID/EGID=0 in the global user namespace, and will have group root-level access to files
[2018-06-06T06:25:12+0000] [E][1] mountInitNsInternal():315 mount('/dev/shm/nsjail.root', 'tmpfs'): Permission denied
[2018-06-06T06:25:12+0000] PID: 66 ([STANDALONE_MODE]) exited with status: 255, (PIDs left: 0)
root@ubuntu-14-04-f878f8544-6l8bf:/# echo $?
55

nsjail + java/tomcat

context: trying nsjail under CentOS 7 with java/tomcat package from yum repos

How can I run java/tomcat under nsjail?
I've tried several options, but the JVM can't reserve memory:
Could not reserve enough space for 262144KB object heap
I've tried:
/usr/sbin/nsjail --user tomcat:tomcat --group tomcat:tomcat -R /etc/alternatives -R /etc/tomcat -R /etc/sysconfig/tomcat -R /etc/java -R /bin/ -R /lib -R /lib64/ -R /usr/ -T /dev -R /dev/urandom -T /tmp -B /usr/share/tomcat -B /var/log/tomcat/ -e -- /usr/libexec/tomcat/server start

and
/usr/sbin/nsjail --user tomcat:tomcat --group tomcat:tomcat -R /etc/alternatives -R /etc/tomcat -R /etc/sysconfig/tomcat -R /etc/java -R /bin/ -R /lib -R /lib64/ -R /usr/ -T /dev -R /dev/urandom -T /tmp -B /usr/share/tomcat -B /var/log/tomcat/ -e -- /bin/java

and
/usr/sbin/nsjail --user tomcat:tomcat --group tomcat:tomcat -R /etc/alternatives -R /etc/tomcat -R /etc/sysconfig/tomcat -R /etc/java -R /bin/ -R /lib -R /lib64/ -R /usr/ -T /dev -R /dev/urandom -T /tmp -B /usr/share/tomcat -B /var/log/tomcat/ -e -- /bin/strace /bin/java

[Screenshot, 2017-05-20 at 16:12:31]

[Screenshot, 2017-05-20 at 16:15:25]

Creating symbolic link within jail?

Hi
I'm trying to create a symbolic link from /dev/fd to /proc/self/fd inside the jail, so that process substitution works in bash. /proc itself is mounted as procfs.

One solution is to mount a read-write tmpfs on /dev and create the symbolic link before bash starts.
Is this something that can be configured through the mount options? Any suggestions?
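The workaround described above (a writable /dev with the link created before the shell starts) could look roughly like this. It is a minimal sketch that runs against a scratch directory standing in for the jail's /dev tmpfs; `ensure_dev_fd_link` is a hypothetical helper name, not part of nsjail.

```python
import os
import tempfile


def ensure_dev_fd_link(dev_dir: str) -> str:
    """Create <dev_dir>/fd -> /proc/self/fd unless the link already exists."""
    link_path = os.path.join(dev_dir, "fd")
    if not os.path.islink(link_path):
        os.symlink("/proc/self/fd", link_path)
    return link_path


# Demo in a scratch directory standing in for the jail's writable /dev.
with tempfile.TemporaryDirectory() as scratch_dev:
    link = ensure_dev_fd_link(scratch_dev)
    target = os.readlink(link)
print(target)
```

Inside a real jail the same call would be made on /dev itself, before bash is executed.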

cap_get_flag: Invalid argument

Since the last release 1.5, I get the following error:

[2017-07-25T15:24:17+0000] [F][1] capsGetCap():135 cap_get_flag(id=37, type=2): Invalid argument

On a Debian system:

$ uname -a
Linux gozy-01-int 4.9.0-0.bpo.3-amd64 #1 SMP Debian 4.9.30-2~bpo8+1 (2017-06-14) x86_64 GNU/Linux

I guess it is related to CAP_AUDIT_READ (id=37). I'm not sure whether I'm missing something or nsjail is doing something wrong. Thanks for the help.

Here is the actual call:

# nsjail --chroot / -- /bin/sh -i
[2017-07-25T16:04:44+0000] Mode: STANDALONE_ONCE
[2017-07-25T16:04:44+0000] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, max_cpus:0
[2017-07-25T16:04:44+0000] Mount point: src:'/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-07-25T16:04:44+0000] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2017-07-25T16:04:44+0000] Uid map: inside_uid:0 outside_uid:0 count:1 newuidmap:false
[2017-07-25T16:04:44+0000] [W][12329] cmdlineLogParams():242 Process will be UID/EUID=0 in the global user namespace
[2017-07-25T16:04:44+0000] Gid map: inside_gid:0 outside_gid:0 count:1 newgidmap:false
[2017-07-25T16:04:44+0000] [W][12329] cmdlineLogParams():250 Process will be GID/EGID=0 in the global user namespace
[2017-07-25T16:04:44+0000] [W][1] mountMount():202 mount('src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true') src:'none' dst:'/dev/shm/nsjail.root//proc' failed: Operation not permitted
[2017-07-25T16:04:45+0000] [F][1] capsGetCap():135 cap_get_flag(id=37, type=2): Invalid argument
[2017-07-25T16:04:45+0000] PID: 12330 ([STANDALONE_MODE]) exited with status: 1, (PIDs left: 0)

Ability to use additional File Descriptors on spawn

We've got a situation where additional custom data is being passed from the child process on FD 3.

It seems when we try to spawn the same application using nsjail, there is no way to open up the additional File Descriptor we need.

Is there any easy way we can fix this? I found some references in the code, but it seems mostly hard-coded to only work with FDs 0, 1 and 2.

Thank you.
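For reference, the general mechanism at play (an extra descriptor has to be explicitly kept open across exec, otherwise the child never sees it) can be sketched with Python's `subprocess`. This is an illustration under assumptions, not nsjail code: the child here is a tiny Python one-liner, and the descriptor number is whatever `os.pipe()` returns rather than a fixed FD 3.

```python
import os
import subprocess
import sys

# Child program: writes its payload to the descriptor number given in argv[1].
CHILD_CODE = "import os, sys; os.write(int(sys.argv[1]), sys.argv[2].encode())"


def read_side_channel(payload: str) -> bytes:
    """Spawn a child that reports `payload` on an extra inherited descriptor."""
    read_end, write_end = os.pipe()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(write_end), payload],
            pass_fds=(write_end,),  # keep this descriptor open across exec
        )
    finally:
        os.close(write_end)  # the parent only reads; closing lets read() see EOF
    with os.fdopen(read_end, "rb") as reader:
        data = reader.read()
    proc.wait()
    return data


received = read_side_channel("custom data")
print(received)
```

Without `pass_fds`, `Popen` closes every descriptor above 2 in the child by default, which mirrors the "only 0, 1, 2" behaviour described in this issue.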

Symlinks not working

I am running CentOS 7 and I have a mount namespace with mount point /usr/local/lib64, bind ro. When I run my executable any libraries it tries to reference that are symlinks in that directory fail - file not found. The symlinks point to files in the same directory (as you’d expect here). Am I missing something?

Mapping files or folders with annoying names

If one wants to map a file or folder from the command line, the path cannot contain a colon character anywhere, because the colon is assumed to be the delimiter separating the source and destination.

For example, -R /tmp/foo:bar:/foobar will understandably not work.

While one workaround would be to use a configuration file, it would be great if it were still possible to handle these cases on the command line. For example, maybe there could be a set of two-argument options which take in a source and dest separately, so the syntax would look like --bindmount_ro2 /tmp/foo:bar /foobar
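The ambiguity, and why a two-argument option avoids it, can be shown with a small sketch. Both functions are hypothetical: `parse_single_arg` assumes the first colon is treated as the separator, which may differ from what nsjail actually does, and `parse_two_args` models the proposed `--bindmount_ro2`-style option.

```python
from typing import Tuple


def parse_single_arg(spec: str) -> Tuple[str, str]:
    """Split 'src:dst' at the first colon; with no colon, dst defaults to src."""
    src, sep, dst = spec.partition(":")
    return (src, dst) if sep else (src, src)


def parse_two_args(src: str, dst: str) -> Tuple[str, str]:
    """Hypothetical two-argument form: nothing to split, so colons are harmless."""
    return (src, dst)


single = parse_single_arg("/tmp/foo:bar:/foobar")
double = parse_two_args("/tmp/foo:bar", "/foobar")
print(single)  # the source path is cut short at its own colon
print(double)  # source and destination survive intact
```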

mountMount fails: Operation not permitted

Finally, after a few changes, I managed to run nsjail on MIPS.

root@LEDE:/# uname -an
Linux LEDE 4.9.58 #0 SMP Thu Oct 26 15:22:52 2017 mips GNU/Linux

There were many warnings during linking, but it compiled!

/home/pandora/repositories/lede-hamid/staging_dir/toolchain-mipsel_24kc_gcc-5.5.0_musl/lib/gcc/mipsel-openwrt-linux-musl/5.5.0/../../../../mipsel-openwrt-linux-musl/bin/ld: kafel/libkafel.a(libkafel.o): Can't find matching LO16 reloc against `kafel_ctxt_clean' for R_MIPS_GOT16 at 0x164 in section `.text'
/home/pandora/repositories/lede-hamid/staging_dir/toolchain-mipsel_24kc_gcc-5.5.0_musl/lib/gcc/mipsel-openwrt-linux-musl/5.5.0/../../../../mipsel-openwrt-linux-musl/bin/ld: kafel/libkafel.a(libkafel.o): Can't find matching LO16 reloc against `kafel_yylex_init' for R_MIPS_GOT16 at 0x178 in section `.text'

(repeated warnings omitted)

I tried to fix it by removing relro and other flags from LDFLAGS but could not resolve it.

LDFLAGS += -Wl,-z,now -Wl,-z,relro -pie -Wl,-z,noexecstack -lpthread $(shell pkg-config --libs protobuf)

Sadly, I eventually got the same error as on the Raspberry Pi (ARM): Operation not permitted ...

root@LEDE:/# nsjail -Mo --chroot / -- /bin/sh -i 
[2017-10-26T17:02:28+0000] Mode: STANDALONE_ONCE
[2017-10-26T17:02:28+0000] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, max_cpus:0
[2017-10-26T17:02:28+0000] Mount point: src:'/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0 options:'' isDir:true
[2017-10-26T17:02:28+0000] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2017-10-26T17:02:28+0000] Uid map: inside_uid:0 outside_uid:0 count:1 newuidmap:false
[2017-10-26T17:02:28+0000] [W][2302] cmdlineLogParams():247 Process will be UID/EUID=0 in the global user namespace, and will have user root-level access to files
[2017-10-26T17:02:28+0000] Gid map: inside_gid:0 outside_gid:0 count:1 newgidmap:false
[2017-10-26T17:02:28+0000] [W][2302] cmdlineLogParams():258 Process will be GID/EGID=0 in the global user namespace, and will have group root-level access to files
[2017-10-26T17:02:28+0000] [W][1] mountMount():207 mount('src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true') src:'none' dst:'/tmp/nsjail.root//proc' failed: Operation not permitted
[2017-10-26T17:02:28+0000] PID: 2303 ([STANDALONE_MODE]) exited with status: 255, (PIDs left: 0)
root@LEDE:/# 

I used the latest code from the LEDE, kafel and nsjail repositories (as of 26 Oct) and documented the procedure to compile and run the result on QEMU here

error: unknown type name '__rlim64_t'

I'm cross-compiling nsjail for a 32-bit architecture (MIPS), but the compiler complains about some 64-bit-specific symbols in these files: cmdline.c, cmdline.h, nsjail.h.

Manual page for nsjail

I'm preparing a package of nsjail for CRUX linux and wish to include at least a basic manual page for usage.

I have worked up a manual page, starting from the output of help2man and then doing a good deal of cleanup. I think it's at a usable stage.

If including it here is of interest, I can do a PR.
Content can be previewed at https://github.com/jvvv/nsjail/blob/master/nsjail.1.

nsjail + httpd config

My httpd + PHP config example:

name: "APACHE"
mode: ONCE
hostname: "test.server.com"

rlimit_as: 1024
rlimit_cpu: 1000
rlimit_fsize: 1024
rlimit_nofile: 16

clone_newnet: false
keep_caps: true

uidmap {
	inside_id: "apache"
	outside_id: ""
	count: 1
}

gidmap {
	inside_id: "apache"
	outside_id: ""
	count: 1
}

mount {
	src: "/etc/httpd"
	dst: "/etc/httpd"
	is_bind: true
}
mount {
	src: "/etc/php.d"
	dst: "/etc/php.d"
	is_bind: true
}
mount {
	src: "/etc/ld.so.cache"
	dst: "/etc/ld.so.cache"
	is_bind: true
}
mount {
	src: "/etc/passwd"
	dst: "/etc/passwd"
	is_bind: true
}
mount {
	src: "/etc/group"
	dst: "/etc/group"
	is_bind: true
}
mount {
	src: "/etc/hosts"
	dst: "/etc/hosts"
	is_bind: true
}

mount {
	src: "/etc/mime.types"
	dst: "/etc/mime.types"
	is_bind: true
}

mount {
	src: "/etc/localtime"
	dst: "/etc/localtime"
	is_bind: true
}

mount {
	src: "/sbin/httpd"
	dst: "/sbin/httpd"
	is_bind: true
}



mount {
	src: "/lib64"
	dst: "/lib64"
	is_bind: true
}


mount {
	src: "/usr"
	dst: "/usr"
	is_bind: true
}

mount {
	src: "/var/www/html"
	dst: "/var/www/html"
	is_bind: true
	rw: true
}

mount {
	src: "/var/log/httpd"
	dst: "/var/log/httpd"
	is_bind: true
	rw: true
}

mount {
	dst: "/tmp"
	fstype: "tmpfs"
	rw: true
	is_bind: false
}

mount {
	dst: "/run/httpd"
	fstype: "tmpfs"
	rw: true
	is_bind: false
}

mount {
	src: "/dev/urandom"
	dst: "/dev/urandom"
	is_bind: true
	rw: true
}

mount {
	dst: "/dev/shm"
	fstype: "tmpfs"
	rw: true
	is_bind: false
}

mount {
	dst: "/proc"
	fstype: "proc"
}

seccomp_string: "
	POLICY example {
		KILL {
			ptrace,
			process_vm_readv,
			process_vm_writev
		}
	}
	USE example DEFAULT ALLOW
"

questions:

  • How can I keep just one capability, for example a network one? I changed the httpd config to listen on port 1025 so that I can drop all caps.
  • With "clone_newnet: true", how do I attach a network device? The namespace doesn't appear in "ip netns list".

Cleanup mounts if initialization fails

Currently, nsjail mounts by default into /dev/shm/nsjail.root and /dev/shm/nsjail.root/proc. If you specify other mounts with -R e.g. -R /lib or other similar options, we mount even more. If any of the other initialization routines fail after these have been mounted, we end up with a bunch of stale mount points. If you keep re-running nsjail (and it fails somewhere in initialization post-mount), you'll soon end up with hundreds of identical mount entries! Of course if it fails too late, we won't have the privileges to unmount, but there should at least be some attempt at cleanup.
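The kind of best-effort cleanup being requested can be sketched as a rollback: remember what was mounted and undo it in reverse order if a later step fails. This is an illustration only; `setup_jail` and its callable parameters are hypothetical, and the `mount`/`umount` callables here are plain list operations standing in for the real syscalls.

```python
from typing import Callable, List


def setup_jail(
    targets: List[str],
    mount: Callable[[str], None],
    umount: Callable[[str], None],
    later_steps: List[Callable[[], None]],
) -> None:
    """Mount all targets, run later init steps, and roll back mounts on failure."""
    done: List[str] = []
    try:
        for target in targets:
            mount(target)
            done.append(target)
        for step in later_steps:
            step()
    except Exception:
        # Undo in reverse order so nested mount points go first.
        for target in reversed(done):
            try:
                umount(target)
            except OSError:
                pass  # best effort: privileges may already have been dropped
        raise


active: List[str] = []  # stands in for the kernel's mount table


def failing_step() -> None:
    raise RuntimeError("initialization failed after mounting")


try:
    setup_jail(
        ["/dev/shm/nsjail.root", "/dev/shm/nsjail.root/proc"],
        active.append,
        active.remove,
        [failing_step],
    )
except RuntimeError as exc:
    outcome = str(exc)
print(active, outcome)
```

After the failed run the stand-in mount table is empty again, so repeated failures would not accumulate stale entries.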

nsjail/kafel for other architectures

I'm trying to cross-compile nsjail to run it on OpenWRT/LEDE, and I ended up with an unsupported-architecture error from Kafel:
https://travis-ci.org/openwrt/packages/builds/282223552#L2385
I found this instructive commit by @rlc2 for adding a new architecture to kafel:
google/kafel@47cfef4
I tried to follow a similar method by creating a template with all (possible) syscall prototypes, with TEMPLATE_NUM as the placeholder for the syscall number. I fixed the names of a few syscalls by finding the variant of the syscall name used on other architectures and adding it to the template (e.g. adding newX, _X, X2 for syscall X). But what are we supposed to do with the syscalls that are unique to the architecture? Can we simply ignore them?

python3 parser.py 
Syscall not found in the template: cachectl
Syscall not found in the template: cacheflush
Syscall not found in the template: pkey_alloc
Syscall not found in the template: pkey_free
Syscall not found in the template: pkey_mprotect
Syscall not found in the template: statx
Syscall not found in the template: sysmips
Syscall not found in the template: timerfd

Please also take a look at these files if my explanations are not clear:
https://github.com/ebadi/syscalls-table/blob/master/mips64_syscalls.c
https://github.com/ebadi/syscalls-table/blob/master/template_syscalls.c
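The template approach described above can be sketched in a few lines. This is not the linked parser.py: `fill_template`, the entry format and the syscall numbers are all made up for illustration. The sketch collects syscalls that have no template entry instead of dropping them silently, so the decision about architecture-specific syscalls stays explicit.

```python
from typing import Dict, List, Tuple

PLACEHOLDER = "TEMPLATE_NUM"


def fill_template(
    template: Dict[str, str], arch_numbers: Dict[str, int]
) -> Tuple[List[str], List[str]]:
    """Return (filled entries ordered by number, names missing from the template)."""
    filled: List[str] = []
    missing: List[str] = []
    for name, number in sorted(arch_numbers.items(), key=lambda item: item[1]):
        entry = template.get(name)
        if entry is None:
            missing.append(name)
        else:
            filled.append(entry.replace(PLACEHOLDER, str(number)))
    return filled, missing


# Hypothetical entry format; the real template uses full syscall prototypes.
template = {
    "read": '{"read", TEMPLATE_NUM}',
    "write": '{"write", TEMPLATE_NUM}',
}
# Illustrative numbers only, not real syscall numbers for any architecture.
arch_numbers = {"read": 1, "write": 2, "sysmips": 3}

filled, missing = fill_template(template, arch_numbers)
for name in missing:
    print("Syscall not found in the template:", name)
```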

Sample use cases on Ubuntu 16.04 (x86-64)

The sample use cases are not working on my Ubuntu 16.04 (x86-64).

./nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
[2017-06-14T11:25:15+0200] Mode: LISTEN_TCP
[2017-06-14T11:25:15+0200] Jail parameters: hostname:'NSJAIL', chroot:'/chroot/', process:'/bin/sh', bind:[::]:9000, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false
[2017-06-14T11:25:15+0200] Mount point: src:'/chroot/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-06-14T11:25:15+0200] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2017-06-14T11:25:15+0200] Uid map: inside_uid:99999 outside_uid:1000 count:1 newuidmap:false
[2017-06-14T11:25:15+0200] Gid map: inside_gid:99999 outside_gid:1000 count:1 newgidmap:false
[2017-06-14T11:25:15+0200] Listening on [::]:9000
[2017-06-14T11:25:20+0200] New connection from: [::ffff:127.0.0.1]:40942 on: [::ffff:127.0.0.1]:9000
[2017-06-14T11:25:20+0200] Executing '/bin/sh' for '[::ffff:127.0.0.1]:40942'
[2017-06-14T11:25:20+0200] [E][1] subprocNewProc():180 execve('/bin/sh') failed: No such file or directory
[2017-06-14T11:25:20+0200] PID: 30539 exited with status: 1, (PIDs left: 0)

It works fine when I change it to:

./nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999  -R /bin/ -R /lib -R /lib64/  -- /bin/sh -i
[2017-06-14T11:29:49+0200] Mode: LISTEN_TCP
[2017-06-14T11:29:49+0200] Jail parameters: hostname:'NSJAIL', chroot:'/chroot/', process:'/bin/sh', bind:[::]:9000, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false
[2017-06-14T11:29:49+0200] Mount point: src:'/chroot/' dst:'/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-06-14T11:29:49+0200] Mount point: src:'[NULL]' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:true
[2017-06-14T11:29:49+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-06-14T11:29:49+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-06-14T11:29:49+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:true
[2017-06-14T11:29:49+0200] Uid map: inside_uid:99999 outside_uid:1000 count:1 newuidmap:false
[2017-06-14T11:29:49+0200] Gid map: inside_gid:99999 outside_gid:1000 count:1 newgidmap:false
[2017-06-14T11:29:49+0200] Listening on [::]:9000
[2017-06-14T11:29:57+0200] New connection from: [::ffff:127.0.0.1]:40972 on: [::ffff:127.0.0.1]:9000
[2017-06-14T11:29:57+0200] Executing '/bin/sh' for '[::ffff:127.0.0.1]:40972'

I assume access to /bin is needed to run the sh executable, and the library files under /lib* are then needed for it to execute successfully.

Compilation failure (in kafel?) with GCC 8.1

make -C kafel
make[1]: Entering directory '/build/nsjail/src/nsjail-2.7/kafel'
Makefile:27: warning: overriding recipe for target 'test'
build/Makefile.mk:41: warning: ignoring old recipe for target 'test'
make -C src PROJECT_ROOT=../
make[2]: Entering directory '/build/nsjail/src/nsjail-2.7/kafel/src'
bison parser.y
flex lexer.l
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o context.o context.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o codegen.o codegen.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o expression.o expression.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o policy.o policy.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o range_rules.o range_rules.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscall.o syscall.c
bison parser.y
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscalls/amd64_syscalls.o syscalls/amd64_syscalls.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscalls/i386_syscalls.o syscalls/i386_syscalls.c
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  caps.cc -o caps.o
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscalls/aarch64_syscalls.o syscalls/aarch64_syscalls.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscalls/mipso32_syscalls.o syscalls/mipso32_syscalls.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscalls/mips64_syscalls.o syscalls/mips64_syscalls.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o syscalls/arm_syscalls.o syscalls/arm_syscalls.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -D_FORTIFY_SOURCE=2  -c -o kafel.o kafel.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -Wno-error -D_FORTIFY_SOURCE=2  -c -o lexer.o lexer.c
cc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -std=gnu11 -Iinclude -Wall -Wextra -Werror -O2 -fPIC -fvisibility=hidden -std=gnu11 -I../include -Wall -Wextra -Werror -O2 -Wno-error -D_FORTIFY_SOURCE=2  -c -o parser.o parser.c
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  cgroup.cc -o cgroup.o
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  cmdline.cc -o cmdline.o
protoc --cpp_out=. config.proto
protoc --cpp_out=. config.proto
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  contain.cc -o contain.o
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  cpu.cc -o cpu.o
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  logs.cc -o logs.o
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  mnt.cc -o mnt.o
cgroup.cc: In function ‘bool cgroup::initNsFromParentMem(nsjconf_t*, pid_t)’:
cgroup.cc:54:33: error: ‘/memory.limit_in_bytes’ directive output may be truncated writing 22 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/memory.limit_in_bytes", mem_cgroup_path);
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:54:10: note: ‘snprintf’ output between 23 and 4118 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/memory.limit_in_bytes", mem_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:65:33: error: ‘/memory.oom_control’ directive output may be truncated writing 19 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/memory.oom_control", mem_cgroup_path);
                                 ^~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:65:10: note: ‘snprintf’ output between 20 and 4115 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/memory.oom_control", mem_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:73:33: error: ‘/tasks’ directive output may be truncated writing 6 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/tasks", mem_cgroup_path);
                                 ^~~~~~~~~~
cgroup.cc:73:10: note: ‘snprintf’ output between 7 and 4102 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/tasks", mem_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  net.cc -o net.o
cgroup.cc: In function ‘bool cgroup::initNsFromParentNetCls(nsjconf_t*, pid_t)’:
cgroup.cc:137:33: error: ‘/net_cls.classid’ directive output may be truncated writing 16 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/net_cls.classid", net_cls_cgroup_path);
                                 ^~~~~~~~~~~~~~~~~~~~
cgroup.cc:137:10: note: ‘snprintf’ output between 17 and 4112 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/net_cls.classid", net_cls_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:146:33: error: ‘/tasks’ directive output may be truncated writing 6 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/tasks", net_cls_cgroup_path);
                                 ^~~~~~~~~~
cgroup.cc:146:10: note: ‘snprintf’ output between 7 and 4102 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/tasks", net_cls_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc: In function ‘bool cgroup::initNsFromParentCpu(nsjconf_t*, pid_t)’:
cgroup.cc:174:33: error: ‘/cpu.cfs_quota_us’ directive output may be truncated writing 17 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/cpu.cfs_quota_us", cpu_cgroup_path);
                                 ^~~~~~~~~~~~~~~~~~~~~
cgroup.cc:174:10: note: ‘snprintf’ output between 18 and 4113 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/cpu.cfs_quota_us", cpu_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:183:33: error: ‘/cpu.cfs_period_us’ directive output may be truncated writing 18 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/cpu.cfs_period_us", cpu_cgroup_path);
                                 ^~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:183:10: note: ‘snprintf’ output between 19 and 4114 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/cpu.cfs_period_us", cpu_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:192:33: error: ‘/tasks’ directive output may be truncated writing 6 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/tasks", cpu_cgroup_path);
                                 ^~~~~~~~~~
cgroup.cc:192:10: note: ‘snprintf’ output between 7 and 4102 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/tasks", cpu_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc: In function ‘bool cgroup::initNsFromParentPids(nsjconf_t*, pid_t)’:
cgroup.cc:99:33: error: ‘/pids.max’ directive output may be truncated writing 9 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/pids.max", pids_cgroup_path);
                                 ^~~~~~~~~~~~~
cgroup.cc:99:10: note: ‘snprintf’ output between 10 and 4105 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/pids.max", pids_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cgroup.cc:108:33: error: ‘/tasks’ directive output may be truncated writing 6 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
  snprintf(fname, sizeof(fname), "%s/tasks", pids_cgroup_path);
                                 ^~~~~~~~~~
cgroup.cc:108:10: note: ‘snprintf’ output between 7 and 4102 bytes into a destination of size 4096
  snprintf(fname, sizeof(fname), "%s/tasks", pids_cgroup_path);
  ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
g++ -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O2 -c -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -Wformat -Wformat=2 -Wformat-security -fPIE -Wno-format-nonliteral -Wall -Wextra -Werror -Ikafel/include -pthread  -std=c++11 -fno-exceptions -Wno-unused -Wno-unused-parameter -DNSJAIL_NL3_WITH_MACVLAN -I/usr/include/libnl3  nsjail.cc -o nsjail.o
cc -Wl,-soname,../libkafel.so.1 -shared kafel.o context.o codegen.o expression.o policy.o range_rules.o syscall.o lexer.o parser.o syscalls/amd64_syscalls.o syscalls/i386_syscalls.o syscalls/aarch64_syscalls.o syscalls/mipso32_syscalls.o syscalls/mips64_syscalls.o syscalls/arm_syscalls.o -o ../libkafel.so
cc1plus: all warnings being treated as errors
make: *** [Makefile:63: cgroup.o] Error 1
make: *** Waiting for unfinished jobs....
ld -r kafel.o context.o codegen.o expression.o policy.o range_rules.o syscall.o lexer.o parser.o syscalls/amd64_syscalls.o syscalls/i386_syscalls.o syscalls/aarch64_syscalls.o syscalls/mipso32_syscalls.o syscalls/mips64_syscalls.o syscalls/arm_syscalls.o -o libkafel_r.o
objcopy --localize-hidden libkafel_r.o libkafel.o
rm -f libkafel_r.o
ar rcs ../libkafel.a libkafel.o
rm -f libkafel.o
make[2]: Leaving directory '/build/nsjail/src/nsjail-2.7/kafel/src'
make[1]: Leaving directory '/build/nsjail/src/nsjail-2.7/kafel'

Not able to run simple script with node inside nsjail

Hi,
While running a Node.js script inside nsjail, execution fails with the following error.

Contents of test.js:

console.log('test');

Verbose output follows:

# ./nsjail/nsjail -Me  -c /  -v  --keep_caps -- node test.js
[2018-05-20T19:21:12+0000] Mode: STANDALONE_EXECVE
[2018-05-20T19:21:12+0000] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'node', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:true, keep_caps:true, disable_no_new_privs:false, max_cpus:0
[2018-05-20T19:21:12+0000] Mount point: src:'/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0' type:'' options:'' is_dir:true
[2018-05-20T19:21:12+0000] Mount point: src:'' dst:'/proc' flags:'MS_RDONLY|0' type:'proc' options:'' is_dir:true
[2018-05-20T19:21:12+0000] Uid map: inside_uid:0 outside_uid:0 count:1 newuidmap:false
[2018-05-20T19:21:12+0000] [W][13820] void cmdline::logParams(nsjconf_t*)():254 Process will be UID/EUID=0 in the global user namespace, and will have user root-level access to files
[2018-05-20T19:21:12+0000] Gid map: inside_gid:0 outside_gid:0 count:1 newgidmap:false
[2018-05-20T19:21:12+0000] [W][13820] void cmdline::logParams(nsjconf_t*)():265 Process will be GID/EGID=0 in the global user namespace, and will have group root-level access to files
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGINT (2)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGQUIT (3)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGUSR1 (10)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGALRM (14)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGCHLD (17)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGTERM (15)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGTTIN (21)
[2018-05-20T19:21:12+0000] [D][13820] bool nsjailSetSigHandler(int)():58 Setting sighandler for signal SIGTTOU (22)
[2018-05-20T19:21:12+0000] [D][13820] void subproc::runChild(nsjconf_t*, int, int, int)():405 Entering namespace with flags:CLONE_NEWNS|CLONE_NEWCGROUP|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET|SIGUNKNOWN(0)
[2018-05-20T19:21:12+0000] [D][13820] bool util::writeBufToFile(const char*, const void*, size_t, int)():109 Written '4' bytes to '/proc/13820/setgroups'
[2018-05-20T19:21:12+0000] [D][13820] bool user::gidMapSelf(nsjconf_t*, pid_t)():146 Writing '0 0 1
' to '/proc/13820/gid_map'
[2018-05-20T19:21:12+0000] [D][13820] bool util::writeBufToFile(const char*, const void*, size_t, int)():109 Written '6' bytes to '/proc/13820/gid_map'
[2018-05-20T19:21:12+0000] [D][13820] bool user::uidMapSelf(nsjconf_t*, pid_t)():118 Writing '0 0 1
' to '/proc/13820/uid_map'
[2018-05-20T19:21:12+0000] [D][13820] bool util::writeBufToFile(const char*, const void*, size_t, int)():109 Written '6' bytes to '/proc/13820/uid_map'
[2018-05-20T19:21:12+0000] [D][13820] bool user::initNsFromChild(nsjconf_t*)():240 setgroups(0, NULL)
[2018-05-20T19:21:12+0000] [D][13820] bool user::initNsFromChild(nsjconf_t*)():243 setgroups(NULL) failed: Operation not permitted
[2018-05-20T19:21:12+0000] [D][13820] bool user::setResGid(gid_t)():49 setresgid(0)
[2018-05-20T19:21:12+0000] [D][13820] bool user::setResUid(uid_t)():65 setresuid(0)
[2018-05-20T19:21:12+0000] [D][13820] bool pid::initNs(nsjconf_t*)():44 Creating a dummy 'init' process
[2018-05-20T19:21:12+0000] [D][13820] pid_t subproc::cloneProc(uintptr_t)():480 Cloning process with flags:CLONE_FS|SIGUNKNOWN(0)
[2018-05-20T19:21:12+0000] [D][13820] pid_t subproc::cloneProc(uintptr_t)():480 Cloning process with flags:CLONE_FS|SIGCHLD
[2018-05-20T19:21:12+0000] [D][2] bool mnt::mkdirAndTest(const char*)():270 Couldn't create '/run/user/0/nsjail.root' directory: No such file or directory
[2018-05-20T19:21:12+0000] [D][2] bool mnt::mkdirAndTest(const char*)():277 Created accessible directory in '/tmp/nsjail.root'
[2018-05-20T19:21:12+0000] [D][2] bool mnt::mkdirAndTest(const char*)():270 Couldn't create '/run/user/0/nsjail.tmp' directory: No such file or directory
[2018-05-20T19:21:12+0000] [D][2] bool mnt::mkdirAndTest(const char*)():277 Created accessible directory in '/tmp/nsjail.tmp'
[2018-05-20T19:21:12+0000] [D][2] bool mnt::mountPt(mount_t*, const char*, const char*)():124 Mounting 'src:'/' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE|0' type:'' options:'' is_dir:true'
[2018-05-20T19:21:12+0000] [D][2] bool mnt::mountPt(mount_t*, const char*, const char*)():124 Mounting 'src:'' dst:'/proc' flags:'MS_RDONLY|0' type:'proc' options:'' is_dir:true'
[2018-05-20T19:21:12+0000] [D][2] bool mnt::remountRO(const mount_t&)():259 Re-mounting R/O '/' (flags:MS_RDONLY|MS_REMOUNT|MS_BIND|MS_RELATIME|0)
[2018-05-20T19:21:12+0000] [D][2] bool mnt::remountRO(const mount_t&)():259 Re-mounting R/O '/proc' (flags:MS_RDONLY|MS_NODEV|MS_REMOUNT|MS_BIND|MS_RELATIME|0)
[2018-05-20T19:21:12+0000] [D][13820] bool uts::initNs(nsjconf_t*)():36 Setting hostname to 'NSJAIL'
[2018-05-20T19:21:12+0000] [D][13820] bool caps::initNsKeepCaps(cap_user_data_t)():180 Adding the following capabilities to the inheritable set: CAP_CHOWN CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_FSETID CAP_KILL CAP_SETGID CAP_SETUID CAP_SETPCAP CAP_LINUX_IMMUTABLE CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_ADMIN CAP_NET_RAW CAP_IPC_LOCK CAP_IPC_OWNER CAP_SYS_MODULE CAP_SYS_RAWIO CAP_SYS_CHROOT CAP_SYS_PTRACE CAP_SYS_PACCT CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_NICE CAP_SYS_RESOURCE CAP_SYS_TIME CAP_SYS_TTY_CONFIG CAP_MKNOD CAP_LEASE CAP_AUDIT_WRITE CAP_AUDIT_CONTROL CAP_SETFCAP CAP_MAC_OVERRIDE CAP_MAC_ADMIN CAP_SYSLOG CAP_WAKE_ALARM CAP_BLOCK_SUSPEND
[2018-05-20T19:21:12+0000] [D][13820] bool caps::initNsKeepCaps(cap_user_data_t)():199 Added the following capabilities to the ambient set: CAP_CHOWN CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_FSETID CAP_KILL CAP_SETGID CAP_SETUID CAP_SETPCAP CAP_LINUX_IMMUTABLE CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_ADMIN CAP_NET_RAW CAP_IPC_LOCK CAP_IPC_OWNER CAP_SYS_MODULE CAP_SYS_RAWIO CAP_SYS_CHROOT CAP_SYS_PTRACE CAP_SYS_PACCT CAP_SYS_ADMIN CAP_SYS_BOOT CAP_SYS_NICE CAP_SYS_RESOURCE CAP_SYS_TIME CAP_SYS_TTY_CONFIG CAP_MKNOD CAP_LEASE CAP_AUDIT_WRITE CAP_AUDIT_CONTROL CAP_SETFCAP CAP_MAC_OVERRIDE CAP_MAC_ADMIN CAP_SYSLOG CAP_WAKE_ALARM CAP_BLOCK_SUSPEND
[2018-05-20T19:21:12+0000] [D][13820] bool cpu::initCpu(nsjconf_t*)():67 No max_cpus limit set
[2018-05-20T19:21:12+0000] [D][13820] bool contain::containMakeFdsCOEProc(nsjconf_t*)():194 open('/proc/self/fd', O_DIRECTORY|O_RDONLY|O_CLOEXEC): No such file or directory
[2018-05-20T19:21:12+0000] [D][13820] bool contain::containMakeFdsCOENaive(nsjconf_t*)():176 FD=0 will be passed to the child process
[2018-05-20T19:21:12+0000] [D][13820] bool contain::containMakeFdsCOENaive(nsjconf_t*)():176 FD=1 will be passed to the child process
[2018-05-20T19:21:12+0000] [D][13820] bool contain::containMakeFdsCOENaive(nsjconf_t*)():176 FD=2 will be passed to the child process
[2018-05-20T19:21:12+0000] Executing '/var/vcap/packages/node/bin/node' for '[STANDALONE MODE]'
[2018-05-20T19:21:12+0000] [D][13820] int subproc::subprocNewProc(nsjconf_t*, int, int, int, int)():175  Arg: '/var/vcap/packages/node/bin/node'
[2018-05-20T19:21:12+0000] [D][13820] int subproc::subprocNewProc(nsjconf_t*, int, int, int, int)():175  Arg: '-r'
[2018-05-20T19:21:12+0000] [D][13820] int subproc::subprocNewProc(nsjconf_t*, int, int, int, int)():175  Arg: 'test.js'
node[13820]: pthread_create: Invalid argument

#
# Fatal error in heap setup
# Allocation failed - process out of memory
#

Illegal instruction

nsjail can't execute a dynamically linked 32-bit binary inside a chroot

Hi, I'm currently preparing a simple binary challenge for myself as a learning exercise. When I try to isolate a simple x86 binary, nsjail fails with the error No such file or directory.

Linux Distro: Arch Linux

Here is a simple C code:

#include <stdio.h>
int main()
{
    printf("hello");
    return 0;
}

I compiled this with GCC for two architectures, x86 and x86-64.

x86:
GCC compile: gcc test.c -o test_x86 -m32
Nsjail: ./nsjail -Mo --user 2000 --group 2000 -R /lib -R /lib64/ -R /usr/ -R /bin/ -R /dev/null --chroot /home/shahril/Desktop/nsjail-report/RELEASE --keep_caps -- /test_x86
Error: execve('/test_x86') failed: No such file or directory

[2018-10-03T17:07:27+0800] Mode: STANDALONE_ONCE
[2018-10-03T17:07:27+0800] Jail parameters: hostname:'NSJAIL', chroot:'/home/shahril/Desktop/nsjail-report/RELEASE', process:'/test_x86', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:true, keep_caps:true, disable_no_new_privs:false, max_cpus:0
[2018-10-03T17:07:27+0800] Mount point: src:'/home/shahril/Desktop/nsjail-report/RELEASE' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:07:27+0800] Mount point: src:'/lib' dst:'/lib' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:07:27+0800] Mount point: src:'/lib64/' dst:'/lib64/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:07:27+0800] Mount point: src:'/usr/' dst:'/usr/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:07:27+0800] Mount point: src:'/bin/' dst:'/bin/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:07:27+0800] Mount point: src:'/dev/null' dst:'/dev/null' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:false
[2018-10-03T17:07:27+0800] Mount point: src:'' dst:'/proc' flags:'MS_RDONLY' type:'proc' options:'' is_dir:true
[2018-10-03T17:07:27+0800] Uid map: inside_uid:2000 outside_uid:1000 count:1 newuidmap:false
[2018-10-03T17:07:27+0800] Gid map: inside_gid:2000 outside_gid:1000 count:1 newgidmap:false
[2018-10-03T17:07:27+0800] Executing '/test_x86' for '[STANDALONE MODE]'
[2018-10-03T17:07:27+0800] [E][1] void subproc::subprocNewProc(nsjconf_t*, int, int, int, int)():194 execve('/test_x86') failed: No such file or directory
[2018-10-03T17:07:27+0800] [F][1] bool subproc::runChild(nsjconf_t*, int, int, int)():431 Launching child process failed
[2018-10-03T17:07:27+0800] [W][20878] bool subproc::runChild(nsjconf_t*, int, int, int)():459 Received error message from the child process before it has been executed
[2018-10-03T17:07:27+0800] [E][20878] int nsjail::standaloneMode(nsjconf_t*)():146 Couldn't launch the child process

x86-64:
GCC compile: gcc test.c -o test_x64
Nsjail: ./nsjail -Mo --user 2000 --group 2000 -R /lib -R /lib64/ -R /usr/ -R /bin/ -R /dev/null --chroot /home/shahril/Desktop/nsjail-report/RELEASE --keep_caps -- /test_x64
Error: No error appears

[2018-10-03T17:08:37+0800] Mode: STANDALONE_ONCE
[2018-10-03T17:08:37+0800] Jail parameters: hostname:'NSJAIL', chroot:'/home/shahril/Desktop/nsjail-report/RELEASE', process:'/test_x64', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:true, keep_caps:true, disable_no_new_privs:false, max_cpus:0
[2018-10-03T17:08:37+0800] Mount point: src:'/home/shahril/Desktop/nsjail-report/RELEASE' dst:'/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:08:37+0800] Mount point: src:'/lib' dst:'/lib' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:08:37+0800] Mount point: src:'/lib64/' dst:'/lib64/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:08:37+0800] Mount point: src:'/usr/' dst:'/usr/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:08:37+0800] Mount point: src:'/bin/' dst:'/bin/' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:true
[2018-10-03T17:08:37+0800] Mount point: src:'/dev/null' dst:'/dev/null' flags:'MS_RDONLY|MS_BIND|MS_REC|MS_PRIVATE' type:'' options:'' is_dir:false
[2018-10-03T17:08:37+0800] Mount point: src:'' dst:'/proc' flags:'MS_RDONLY' type:'proc' options:'' is_dir:true
[2018-10-03T17:08:37+0800] Uid map: inside_uid:2000 outside_uid:1000 count:1 newuidmap:false
[2018-10-03T17:08:37+0800] Gid map: inside_gid:2000 outside_gid:1000 count:1 newgidmap:false
[2018-10-03T17:08:37+0800] Executing '/test_x64' for '[STANDALONE MODE]'
test[2018-10-03T17:08:38+0800] PID: 20911 ([STANDALONE MODE]) exited with status: 0, (PIDs left: 0)

As you can see above, running the 32-bit binary fails with execve(): No such file or directory.

However, when I compile the 32-bit binary statically, it runs fine without any No such file or directory error.

While searching online, I found this article: https://www.cert.pl/en/news/single/technical-aspects-of-ctf-contest-organization/
It says: "Installation of lib32z1 is necessary if the service to be attacked is a binary built in 32 bit mode (-m32). In case of lack of this library we could receive a misleading error message /app/pwn_me – no such file or directory."

So, any pointers on how to fix this problem?
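
One way to investigate the "misleading error message" quoted above: execve() returns No such file or directory not only when the binary itself is missing, but also when the ELF interpreter (the dynamic loader named in the binary's PT_INTERP header) does not exist at that path. The C sketch below reads that path from an ELF file so it can be checked against what is visible inside the chroot. The helper name elf_interp_path is hypothetical, and the code assumes the file's byte order matches the host (true for x86 and x86-64 binaries on an x86-64 machine).

```c
#define _GNU_SOURCE
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the PT_INTERP path of an ELF file into out.
 * Returns 1 if an interpreter was found, 0 if the file has none
 * (e.g. a static binary), -1 on I/O error or if the file is not ELF.
 * Assumes the ELF byte order equals the host byte order. */
int elf_interp_path(const char *file, char *out, size_t out_len) {
    unsigned char ident[EI_NIDENT];
    uint64_t phoff = 0;
    unsigned phnum = 0, phentsize = 0;
    int ret = -1;

    int fd = open(file, O_RDONLY | O_CLOEXEC);
    if (fd < 0) return -1;
    if (out_len == 0) goto done;
    if (pread(fd, ident, sizeof(ident), 0) != (ssize_t)sizeof(ident)) goto done;
    if (memcmp(ident, ELFMAG, SELFMAG) != 0) goto done;

    /* The ELF header layout depends on the class (32-bit or 64-bit). */
    if (ident[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr eh;
        if (pread(fd, &eh, sizeof(eh), 0) != (ssize_t)sizeof(eh)) goto done;
        phoff = eh.e_phoff; phnum = eh.e_phnum; phentsize = eh.e_phentsize;
    } else if (ident[EI_CLASS] == ELFCLASS64) {
        Elf64_Ehdr eh;
        if (pread(fd, &eh, sizeof(eh), 0) != (ssize_t)sizeof(eh)) goto done;
        phoff = eh.e_phoff; phnum = eh.e_phnum; phentsize = eh.e_phentsize;
    } else {
        goto done;
    }

    ret = 0; /* valid ELF; no interpreter seen yet */
    for (unsigned i = 0; i < phnum; i++) {
        off_t at = (off_t)(phoff + (uint64_t)i * phentsize);
        uint32_t type;
        uint64_t offset, filesz;
        if (ident[EI_CLASS] == ELFCLASS32) {
            Elf32_Phdr ph;
            if (pread(fd, &ph, sizeof(ph), at) != (ssize_t)sizeof(ph)) { ret = -1; break; }
            type = ph.p_type; offset = ph.p_offset; filesz = ph.p_filesz;
        } else {
            Elf64_Phdr ph;
            if (pread(fd, &ph, sizeof(ph), at) != (ssize_t)sizeof(ph)) { ret = -1; break; }
            type = ph.p_type; offset = ph.p_offset; filesz = ph.p_filesz;
        }
        if (type != PT_INTERP) continue;
        /* Copy at most out_len - 1 bytes and always NUL-terminate. */
        size_t n = filesz < out_len - 1 ? (size_t)filesz : out_len - 1;
        ssize_t got = pread(fd, out, n, (off_t)offset);
        if (got < 0) { ret = -1; break; }
        out[got] = '\0';
        ret = 1;
        break;
    }
done:
    close(fd);
    return ret;
}
```

Calling elf_interp_path("test_x86", buf, sizeof(buf)) on the host and then checking whether the reported path exists under the --chroot directory (including the -R bind mounts) would show whether the loader is what execve() cannot find. On glibc systems a dynamically linked 32-bit x86 binary typically names /lib/ld-linux.so.2; a statically linked binary has no PT_INTERP entry, which is consistent with the static build working.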

Why does the theme change so often?

Hi, I am not joking, but the theme changes over the past few months are a little annoying. I am a developer trying to build things with nsjail. While I am getting used to this tool I check the official site often, but every time I open it I am stunned, because the site keeps changing its theme while the content stays almost unchanged.

Are you unsure which theme fits? Personally, among those you have tried, I recommend jekyll-theme-slate the most. I am sorry for this non-productive issue, but changing the theme is unproductive, too.

nsjail after fork() on existing process

Is it possible to apply nsjail to an existing process via the C API? Similar to how e.g. chroot(2) or setrlimit(2) in Linux apply restrictions to the current process?

My use case is that I have a fairly complicated application, in which I want to run a piece of untrusted code in a temporary fork with some restrictions (without execv()), and when it's done have the parent process kill off the entire child's process tree.
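
The fork-restrict-kill pattern described above can be sketched with plain Linux calls. This is not an nsjail API, and it covers only a small subset of what nsjail does (no namespaces, no seccomp filter). The names run_restricted and demo_task are hypothetical, and the CPU-time limit is just one example of a restriction applied to the current process in the style of setrlimit(2).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the untrusted code; returns an exit status. */
int demo_task(void) {
    return 7;
}

/* Hypothetical helper: run fn() in a forked child (no execve) with a
 * CPU-time limit and no_new_privs set, then kill whatever is left of the
 * child's process group. Returns the child's exit status, or -1 on error
 * or if the child was terminated by a signal. */
int run_restricted(int (*fn)(void), rlim_t cpu_seconds) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        struct rlimit rl = {.rlim_cur = cpu_seconds, .rlim_max = cpu_seconds};
        /* Own process group, so the parent can signal the whole tree.
         * Descendants that call setsid()/setpgid() would escape this. */
        if (setpgid(0, 0) != 0) _exit(127);
        if (setrlimit(RLIMIT_CPU, &rl) != 0) _exit(127);
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) _exit(127);
        _exit(fn());
    }

    /* Parent: set the group as well, in case it runs before the child does. */
    (void)setpgid(pid, pid);

    /* Wait without reaping, so the PID (and group ID) cannot be reused
     * before the remaining group members are killed. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0) return -1;
    (void)kill(-pid, SIGKILL);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_restricted(demo_task, 1) forks, applies the limits in the child, runs demo_task() there, and returns its exit status to the parent. A process group is a weaker containment boundary than a PID namespace, which is one reason a tool like nsjail uses CLONE_NEWPID instead.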

Missing dependency rule for config.pb.h

Commit 5c2d985 leaves config.pb.h without a rule to generate it. Even though the $(SRCS_PB) rule does the necessary work, the last dependency line:
config.pb.o: config.pb.h
will not be satisfied.

Adding the header to the SRCS_PB definition:
SRCS_PB = $(SRCS_PROTO:.proto=.pb.cc) $(SRCS_PROTO:.proto=.pb.h)
allows for successful compilation.

A different fix could be to add a separate variable for the header:
HDRS_PB = $(SRCS_PROTO:.proto=.pb.h)
then, later, an empty rule for dependency resolution:
$(HDRS_PB): $(SRCS_PROTO)

As a side note, I think 'clear' is a typo for 'clean' on the .PHONY target line in the Makefile.

nsjail on openwrt [cross compiling]

Hello,
I asked this question in the OpenWrt community but didn't get any answer.
To reproduce, follow the basic instructions for "using the OpenWrt SDK":

  • install any dependent packages mentioned at the beginning of the Makefile
  • clone this repository into the OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/package/ directory.

Then simply run make or make package/nsjail/compile in the SDK root:

CFLAGS="-Os -pipe -fno-caller-saves -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result  -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/usr/include -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/include -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/usr/include -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/include " CXXFLAGS="-Os -pipe -fno-caller-saves -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result  -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/usr/include -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/include -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/usr/include -I/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/include " LDFLAGS="-L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/usr/lib -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/lib -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/usr/lib 
-L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/lib " make -j1 -C /home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/build_dir/target-x86_64_uClibc-0.9.33.2/nsjail-1.0/. AR="x86_64-openwrt-linux-uclibc-gcc-ar" AS="ccache_cc -c -Os -pipe -fno-caller-saves -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result" LD=x86_64-openwrt-linux-uclibc-ld NM="x86_64-openwrt-linux-uclibc-gcc-nm" CC="ccache_cc" GCC="ccache_cc" CXX="ccache_cxx" RANLIB="x86_64-openwrt-linux-uclibc-gcc-ranlib" STRIP=x86_64-openwrt-linux-uclibc-strip OBJCOPY=x86_64-openwrt-linux-uclibc-objcopy OBJDUMP=x86_64-openwrt-linux-uclibc-objdump SIZE=x86_64-openwrt-linux-uclibc-size CROSS="x86_64-openwrt-linux-uclibc-" ARCH="x86_64" USE_NL3=no;  
make[4]: Entering directory '/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/build_dir/target-x86_64_uClibc-0.9.33.2/nsjail-1.0'
ccache_cxx -o nsjail nsjail.o caps.o cmdline.o contain.o log.o cgroup.o mount.o net.o pid.o sandbox.o subproc.o user.o util.o uts.o cpu.o config.o config.pb.o kafel/libkafel.a -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/usr/lib -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/lib -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/usr/lib -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/toolchain-x86_64_gcc-4.8-linaro_uClibc-0.9.33.2/lib  -Wl,-z,now -Wl,-z,relro -pie -Wl,-z,noexecstack -lpthread -lcap -L/home/laptopx/Downloads/OpenWrt-SDK-15.05.1-x86-64_gcc-4.8-linaro_uClibc-0.9.33.2.Linux-x86_64/staging_dir/target-x86_64_uClibc-0.9.33.2/usr/lib -lprotobuf -pthread -lpthread
cmdline.o: In function `cmdlineParseRLimit':
cmdline.c:(.text+0x7be): undefined reference to `prlimit64'
contain.o: In function `containContain':
contain.c:(.text+0x21f): undefined reference to `prlimit64'
contain.c:(.text+0x249): undefined reference to `prlimit64'
contain.c:(.text+0x270): undefined reference to `prlimit64'
contain.c:(.text+0x29a): undefined reference to `prlimit64'
contain.o:contain.c:(.text+0x2c7): more undefined references to `prlimit64' follow
mount.o: In function `mountInitNsInternal.part.2':
mount.c:(.text+0x877): undefined reference to `mkostemp'
util.o: In function `utilSigName':
util.c:(.text+0x841): undefined reference to `sys_sigabbrev'
collect2: error: ld returned 1 exit status
Makefile:71: recipe for target 'nsjail' failed
make[4]: *** [nsjail] Error 1

Unable to set nosuid or nodev options on tmpfs mount using config file

When I try to mount /tmp in a container like this:

mount {
  dst: "/tmp"
  fstype: "tmpfs"
  rw: true
  options: "nosuid,nodev"
}

I get an Invalid argument error:

[2018-01-08T00:40:27.513440+0100] [D][1] mountMount():124 Mounting 'src:'[NULL]' dst:'/tmp' type:'tmpfs' flags:0 options:'nosuid' isDir:true'
[2018-01-08T00:40:27.513516+0100] [W][1] mountMount():205 mount('src:'[NULL]' dst:'/tmp' type:'tmpfs' flags:0 options:'nosuid' isDir:true') src:'none' dst:'/tmp/nsjail.root//tmp' failed: Invalid argument

My guess is that to enable nosuid, nodev, and noexec in the mount syscall, they must be passed in the mountflags argument as MS_NOSUID, MS_NODEV, and MS_NOEXEC, rather than in the filesystem-specific options string.

A harder and less straightforward, but more user-friendly, solution would be to parse the options and set the appropriate flags, as the mount command-line utility does.
An easier solution is to add a mountflags field to the MountPt config proto message and require those flags (MS_NOSUID and others) to be set there.
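
The first ("parse the options") approach could look roughly like the C sketch below. It is an illustration only, not nsjail's code: the helper name split_mount_options is hypothetical, and only the three options named above are recognized. Known option names become MS_* bits, and everything else is kept as the data string for the filesystem.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper: split a comma-separated option string into MS_*
 * flag bits (returned) and the remaining filesystem-specific options
 * (written to data). data_len must be at least 1. Option strings longer
 * than the internal 255-byte copy are truncated. */
unsigned long split_mount_options(const char *opts, char *data, size_t data_len) {
    static const struct {
        const char *name;
        unsigned long flag;
    } known[] = {
        {"nosuid", MS_NOSUID},
        {"nodev", MS_NODEV},
        {"noexec", MS_NOEXEC},
    };
    unsigned long flags = 0;
    char copy[256];
    char *save = NULL;

    data[0] = '\0';
    snprintf(copy, sizeof(copy), "%s", opts); /* strtok_r modifies its input */

    for (char *tok = strtok_r(copy, ",", &save); tok != NULL;
         tok = strtok_r(NULL, ",", &save)) {
        int matched = 0;
        for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
            if (strcmp(tok, known[i].name) == 0) {
                flags |= known[i].flag;
                matched = 1;
                break;
            }
        }
        if (!matched) {
            /* Unknown to us: pass it through to the filesystem as data. */
            size_t used = strlen(data);
            snprintf(data + used, data_len - used, "%s%s", used ? "," : "", tok);
        }
    }
    return flags;
}
```

With the config above, split_mount_options("nosuid,nodev", data, sizeof(data)) yields MS_NOSUID | MS_NODEV and an empty data string, which could then be passed as the mountflags and data arguments of mount(2) for the tmpfs mount.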

Compile Error

Linux ubuntu 4.10.0-28-generic x86_64
g++ 5.4.0 20160609

In file included from config.cc:41:0:
log.h:46:15: error: ‘namespace log { }’ redeclared as different kind of symbol
namespace log {
^
In file included from /usr/include/features.h:367:0,
from /usr/include/fcntl.h:25,
from config.cc:22:
/usr/include/x86_64-linux-gnu/bits/mathcalls.h:109:1: note: previous declaration ‘double log(double)’
__MATHCALL_VEC (log,, (Mdouble __x));
