
enroot's Introduction

ENROOT

A simple, yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.

Enroot can be thought of as an enhanced unprivileged chroot(1). It uses the same underlying technologies as containers but removes much of the isolation they inherently provide while preserving filesystem separation.

This approach is generally preferred in high-performance or virtualized environments where portability and reproducibility are important, but extra isolation is not warranted.

Enroot is also similar to other tools like proot(1) or fakeroot(1) but instead relies on more recent features from the Linux kernel (i.e. user and mount namespaces), and provides facilities to import well known container image formats (e.g. Docker).
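
The two kernel features just mentioned can be probed directly with util-linux's unshare(1). This is only a sanity check under the assumption that unshare is installed; it is not part of enroot:

```shell
# Probe whether this kernel permits unprivileged user + mount namespaces,
# the two features enroot builds on (-U: user ns, -m: mount ns,
# --map-root-user: map the caller to root inside the new user ns).
if unshare --user --mount --map-root-user true 2>/dev/null; then
    echo "unprivileged user+mount namespaces: supported"
else
    echo "unprivileged user+mount namespaces: unsupported"
fi
```

A kernel that fails this probe will also fail `enroot start`, so it is a quick way to separate kernel configuration problems from enroot ones.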

Usage example:

# Import and start an Ubuntu image from DockerHub
$ enroot import docker://ubuntu
$ enroot create ubuntu.sqsh
$ enroot start ubuntu

Key Concepts

  • Adheres to the KISS principle and Unix philosophy
  • Standalone (no daemon)
  • Fully unprivileged and multi-user capable (no setuid binary, cgroup inheritance, per-user configuration/container store...)
  • Easy to use (simple image format, scriptable, root remapping...)
  • Little to no isolation (no performance overhead, simplifies HPC deployments)
  • Entirely composable and extensible (system-wide and user-specific configurations)
  • Fast Docker image import (3x to 5x speedup on large images)
  • Built-in GPU support with libnvidia-container
  • Facilitates collaboration and development workflows (bundles, in-memory containers...)

Documentation

  1. Requirements
  2. Installation
  3. Image format
  4. Configuration
  5. Standard Hooks
  6. Usage

Copyright and License

This project is released under the Apache License 2.0.

Issues and Contributing

Reporting Security Issues

When reporting a security issue, do not create an issue or file a pull request.
Instead, disclose the issue responsibly by sending an email to psirt<at>nvidia.com.

enroot's People

Contributors

3xx0, aavbsouza, doctaweeks, flx42, jfolz, jlbutler, krono, ltalirz, lukeyeager, martialblog, mrbr-github, susumuota


enroot's Issues

enroot update container using package manager

Is it possible to import ubuntu-18.04 docker image to enroot and then install some packages when running the enroot container?
If I do
apt install vim
I get dpkg: error: requested operation requires superuser privilege

If I do
sudo apt install vim
I get
bash: sudo: command not found

One solution would be to go back and add basic packages to my Docker image. Is there an enroot-specific way I can do this?
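
There is an enroot-specific route suggested by enroot's own start options (`--root` to be remapped to root inside the container, `--rw` for a writable root filesystem); the following is an untested sketch, assuming a container named ubuntu already exists:

```shell
# Untested sketch: remap to root (--root) inside a writable (--rw) container,
# then run the package manager. Guarded so it is a no-op without enroot.
if command -v enroot >/dev/null 2>&1; then
    enroot start --root --rw ubuntu sh -c 'apt-get update && apt-get install -y vim'
else
    echo "enroot not installed; skipping"
fi
```

With root remapping, apt runs as (remapped) root inside the user namespace, so no real superuser privilege or setuid helper is involved.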

Enroot usage

Could you please share a basic example, from scratch, of how to use enroot with a Docker container, usable by a user without root privileges?

I want to use TensorFlow and PyTorch Docker containers with enroot.

enroot-unshare: failed to disable SSBD mitigation: Numerical result out of range

Hi All,

I am not able to start a container on Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-20-generic x86_64). I have tried configuring every possible value for the kernel parameter "spec_store_bypass_disable", but always get the same error. I have no problem on CentOS 7.6.

https://github.com/tyhicks/ssbd-tools/blob/master/README.md

Error log:
0: slurmstepd-gn03: pyxis: importing docker image ...
0: slurmstepd-gn03: pyxis: creating container filesystem ...
0: slurmstepd-gn03: pyxis: starting container ...
0: slurmstepd-gn03: error: pyxis: container start failed with error code: 1
0: slurmstepd-gn03: error: pyxis: printing contents of log file ...
0: slurmstepd-gn03: error: pyxis: enroot-unshare: failed to disable SSBD mitigation: Numerical result out of range
0: slurmstepd-gn03: error: spank: required plugin spank_pyxis.so: task_init_privileged() failed with rc=-1

Thanks for any suggestions.
R.

enroot exits with "0" even when the CLI contains invalid options

[root@rcn04 ~]# /usr/bin/enroot start --mount /home/lcui:/home/lcui --mount /scratch/lsf_addon/lcui/lsf1011:/scratch/lsf_addon/lcui/lsf1011 --mount /tmp:/tmp --aaa value --rw ubuntu hostname
Usage: enroot start [options] [--] NAME|IMAGE [COMMAND] [ARG...]

Start a container and invoke the command script within its root filesystem.
Command and arguments are passed to the script as input parameters.

In the absence of a command script and if a command was given, it will be executed directly.
Otherwise, an interactive shell will be started within the container.

Options:
-c, --conf CONFIG Specify a configuration script to run before the container starts
-e, --env KEY[=VAL] Export an environment variable inside the container
--rc SCRIPT Override the command script inside the container
-r, --root Ask to be remapped to root inside the container
-w, --rw Make the container root filesystem writable
-m, --mount FSTAB Perform a mount from the host inside the container (colon-separated)
[root@rcn04 ~]# echo $?
0
^^^^^^^^
Should it be non-zero?

Here is result for podman and docker.
root@ucn01:# docker run -asdf ubuntu hostname
invalid argument "sdf" for "-a, --attach" flag: valid streams are STDIN, STDOUT and STDERR
See 'docker run --help'.
root@ucn01:
# echo $?
125
^^^^^^^

[root@rcn04 ~]# podman run --aaaa centos hostname
Error: unknown flag: --aaaa
[root@rcn04 ~]# echo $?
125
^^^^^
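
Until the exit status is fixed, a defensive wrapper can mimic the docker/podman behaviour shown above by treating a printed usage banner as a failure regardless of the status code. `run_checked` and the 125 code are our own conventions here, not enroot's:

```shell
# Treat any run that prints the usage banner as a failure, even when the
# command itself exits 0. "$@" stands in for the real enroot invocation.
run_checked() {
    out=$("$@" 2>&1); status=$?
    case $out in
        Usage:*) return 125 ;;  # mimic docker/podman's usage-error code
    esac
    return $status
}

run_checked echo hello && echo "ok"                       # prints: ok
run_checked sh -c 'echo "Usage: enroot start"; exit 0' \
    || echo "caught usage error: $?"                      # prints: caught usage error: 125
```

This is only a stopgap for batch scripts; it cannot distinguish a genuine usage request (`enroot start` with no arguments) from an invalid-flag error.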

enroot-mksquashovlfs: failed to mount overlay "Invalid argument"

Hi,

Problem

When I run enroot import docker://ubuntu:18.04 I get the following error:

$ enroot import docker://ubuntu:18.04 
[INFO] Querying registry for permission grant
[INFO] Authenticating with user: <anonymous>
[INFO] Authentication succeeded
[INFO] Fetching image manifest list
[INFO] Fetching image manifest
[INFO] Found all layers in cache
[INFO] Extracting image layers...

100% 4:0=0s d7c3167c320d7a820935f54cf4290890ea19567da496ecf774e8773b35d5f065                                                                                                                                                                                                      

[INFO] Converting whiteouts...

100% 4:0=0s d7c3167c320d7a820935f54cf4290890ea19567da496ecf774e8773b35d5f065                                                                                                                                                                                                      

[INFO] Creating squashfs filesystem...

enroot-mksquashovlfs: failed to mount overlay: 0:1:2:3:4: Invalid argument

System setup

I ran the enroot-check --verify program, which returns that the CONFIG_OVERLAY_FS is set correctly. It's stated in the documentation that this is "required in order to import Docker images or use enroot-mksquashovlfs".

$ ./enroot-check_3.1.0_x86_64.run --verify
Kernel version:

Linux version 3.10.0-1062.12.1.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Thu Dec 12 06:44:49 EST 2019

Kernel configuration:

CONFIG_NAMESPACES                 : OK
CONFIG_USER_NS                    : OK
CONFIG_SECCOMP_FILTER             : OK
CONFIG_OVERLAY_FS                 : OK (module)
CONFIG_X86_VSYSCALL_EMULATION     : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_EMULATE           : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_NATIVE            : KO (required if glibc <= 2.13)

Kernel command line:

namespace.unpriv_enable=1         : OK
user_namespace.enable=1           : OK
vsyscall=native                   : KO (required if glibc <= 2.13)
vsyscall=emulate                  : KO (required if glibc <= 2.13)

Kernel parameters:

user.max_user_namespaces          : OK
user.max_mnt_namespaces           : OK

Extra packages:

nvidia-container-cli              : KO (required for GPU support)

However, in journalctl -f I see jul 06 16:25:48 tcn1189.bullx kernel: overlayfs: filesystem on '0' not supported.

I've used the following filesystems:

  • NFS (fails, unless ENROOT_TEMP_PATH is set to local SSD)
  • Lustre (fails, unless ENROOT_TEMP_PATH is set to local SSD)
  • Local SSD with EXT4 (works)

My enroot.conf:

ENROOT_RUNTIME_PATH         /scratch/nodespecific/tcn1189/enroot/runtime
ENROOT_CACHE_PATH           /scratch/nodespecific/tcn1189/enroot/cache
ENROOT_DATA_PATH            /scratch/nodespecific/tcn1189/enroot/data
ENROOT_TEMP_PATH            /scratch/nodespecific/tcn1189/enroot/temp

Which I changed to point to directories on NFS, Lustre, and the local SSD with ext4. All return the same error except for the local SSD. Clearly something is still missing; the root cause is most probably the kernel: overlayfs: filesystem on '0' not supported message (from journalctl). Any suggestions?

It does work in all cases when I change ENROOT_TEMP_PATH to point to the local disk, but due to storage needs this would ideally also be Lustre.

NFS

10.200.208.13:/mnt/nfshome3 on /nfs/home3 type nfs (rw,nosuid,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.200.208.13,mountvers=3,mountport=50395,mountproto=udp,local_lock=none,addr=10.200.208.13)

Lustre

10.200.200.80@o2ib,10.201.200.80@o2ib1,10.200.200.81@o2ib,10.201.200.81@o2ib1:/lustre3 on /scratch type lustre (rw,nosuid,flock,lazystatfs)

Local SSD

/dev/sda1 on / type ext4 (rw,relatime,data=ordered)
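
Pending a proper fix, the observations above suggest keeping only ENROOT_TEMP_PATH off the networked filesystems, since that is where enroot-mksquashovlfs mounts the overlay during import. A sketch of the corresponding enroot.conf (the local SSD path is hypothetical; overlayfs upper/work directories are generally not supported on NFS or Lustre):

```
ENROOT_RUNTIME_PATH  /scratch/nodespecific/tcn1189/enroot/runtime
ENROOT_CACHE_PATH    /scratch/nodespecific/tcn1189/enroot/cache
ENROOT_DATA_PATH     /scratch/nodespecific/tcn1189/enroot/data
# Keep the temp path on local ext4:
ENROOT_TEMP_PATH     /local-ssd/enroot/temp
```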

PPC64LE support?

Having access to an IBM AC922 with NVIDIA GPUs, we would like to make use of enroot with pyxis/Slurm for simple and secure containers.

Is this possible?

Attempting AWS ECR integration

Hi Guys,

Thanks for your help over on NVIDIA/pyxis#34.

I'm attempting to configure enroot on an EC2 instance within my VPC to point at our ECR, but am running into an issue I'm struggling to debug. I'm assuming it is something I have misconfigured but cannot see what...

I have confirmed the following:

  • the aws-cli is installed.
  • that the instance has sufficient IAM role privileges to access our registry.
  • that aws ecr get-login-password returns a token.
  • that I can pull an image via the docker cli using the ecr-credential-helper.

(Note: I have obfuscated the ECR url)

I have created a readable credentials file at /etc/enroot/.credentials:

machine 1234.dkr.ecr.eu-west-2.amazonaws.com login AWS password $(aws ecr get-login-password)

when I call the following command I get a 404 not found error:

ENROOT_CONFIG_PATH=/etc/enroot enroot import --output bob.sqsh 'docker://1234.dkr.ecr.eu-west-2.amazonaws.com/my-repo'
[INFO] Querying registry for permission grant
[INFO] Authenticating with user: AWS
[ERROR] URL https://1234.dkr.ecr.eu-west-2.amazonaws.com/ returned error code: 404 Not Found

I hacked in a bit of additional logging to see what was going on:

docker::_authenticate() {
    local -r user="$1" registry="$2" url="$3"
    local realm= token= req_params=() resp_headers=

    # Query the registry to see if we're authorized.
    common::log INFO "Querying registry for permission grant"

    resp_headers=$(CURL_IGNORE=401 common::curl "${curl_opts[@]}" -I ${req_params[@]+"${req_params[@]}"} -- "${url}")
    common::log INFO "curl command opts: ${curl_opts[*]} ${url}"
    common::log INFO "resp_headers: ${resp_headers}"

    # If we don't need to authenticate, we're done.
    if ! grep -qi '^www-authenticate:' <<< "${resp_headers}"; then
        common::log INFO "Permission granted"
        return
    fi

    # Otherwise, craft a new token request from the WWW-Authenticate header.
    printf "%s" "${resp_headers}" | awk -F '="|",' '(tolower($1) ~ "^www-authenticate:"){
        sub(/"\r/, "", $0)
        print $2
        for (i=3; i<=NF; i+=2) print "--data-urlencode\n" $i"="$(i+1)
    }' | { common::read -r realm; readarray -t req_params; }

    if [ -z "${realm}" ]; then
        common::err "Could not parse authentication realm from ${url}"
    fi

    # If a user was specified, lookup his credentials.
    common::log INFO "Authenticating with user: ${user:-<anonymous>}"
    if [ -n "${user}" ]; then
        if grep -qs "machine[[:space:]]\+${registry}[[:space:]]\+login[[:space:]]\+${user}" "${creds_file}"; then
            common::log INFO "Using credentials from file: ${creds_file}"
            exec {fd}< <(common::evalnetrc "${creds_file}" 2> /dev/null)
            req_params+=("--netrc-file" "/proc/self/fd/${fd}")
        else
            req_params+=("-u" "${user}")
        fi
    fi

    # Request a new token.
    common::log INFO "Fetching new token"
    common::curl "${curl_opts[@]}" -G ${req_params[@]+"${req_params[@]}"} -- "${realm}" \
      | common::jq -r '.token? // .access_token? // empty' \
      | common::read -r token
    common::log INFO "Fetching new token - complete"

    [ -v fd ] && exec {fd}>&-

    # Store the new token.
    if [ -n "${token}" ]; then
        mkdir -m 0700 -p "${token_dir}"
        (umask 077 && printf 'header "Authorization: Bearer %s"' "${token}" > "${token_dir}/${registry}.$$")
        common::log INFO "Authentication succeeded"
    fi
}

and got the following output:

[INFO] Querying registry for permission grant
[INFO] curl command opts: --proto =https --retry 0 --connect-timeout 30 --max-time 0 -SsL https://1234.dkr.ecr.eu-west-2.amazonaws.com/v2/my-repo/manifests/latest
[INFO] resp_headers: HTTP/1.1 401 Unauthorized
Docker-Distribution-Api-Version: registry/2.0
Www-Authenticate: Basic realm="https://1234.dkr.ecr.eu-west-2.amazonaws.com/",service="ecr.amazonaws.com"
Date: Wed, 17 Feb 2021 14:14:11 GMT
Content-Length: 15
Content-Type: text/plain; charset=utf-8

[INFO] Authenticating with user: AWS
[INFO] Using credentials from file: /etc/enroot/.credentials
[INFO] Fetching new token
[ERROR] Could not process JSON input -r .token? // .access_token? // empty
[ERROR] URL --proto =https --retry 0 --connect-timeout 30 --max-time 0 -SsL -G --data-urlencode service=ecr.amazonaws.com --netrc-file /proc/self/fd/10 -- https://1234.dkr.ecr.eu-west-2.amazonaws.com/ returned error code: 404 Not Found

If I attempt to use the docker daemon instead, the following command works as expected:

enroot import --output bob.sqsh 'dockerd://1234.dkr.ecr.eu-west-2.amazonaws.com/my-repo'

I feel like I am missing something simple...

I appreciate that AWS is not your primary target but any suggestions would be well received!
Best
Jon
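
One behaviour visible in the excerpt above: the credentials file is passed through common::evalnetrc, i.e. it appears to be evaluated by the shell, so the $(aws ecr get-login-password) substitution is expanded at request time rather than when the file is written. A stand-in demonstration of that expansion (dummy command in place of the aws call; evalnetrc's exact implementation is an assumption here):

```shell
# The line is stored with the command substitution unexpanded...
creds_line='machine example.com login AWS password $(echo dummy-token)'

# ...and a shell evaluation, as common::evalnetrc appears to do, expands it:
eval "printf '%s\n' \"${creds_line}\""
# prints: machine example.com login AWS password dummy-token
```

So the token in the credentials file is fresh on every import; the 404 above comes later, from the token request against the Basic-auth realm ECR advertises.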

HPC installation in every computing node?

Hi !

my name is Stefano Elefante and I am working at the Institute of Science and Technology Austria.
I am working in the team whose duty is to administrate the local HPC cluster.

We are investigating possible Docker solutions that may be deployed in our HPC cluster. We are very interested in using NVIDIA/enroot. I am writing to you as I would like to know:

  1. Do we need to install NVIDIA/enroot on every compute node?
  2. We are using Slurm as our workload manager. I suppose NVIDIA/enroot is fully compatible, isn't it?

Kind regards,
Stefano

XDG_RUNTIME_DIR

Hi.

I built enroot from source (build logs are here: https://gist.github.com/49b9266a7ee0ae1d4383c0e4b3474bbb)

When I execute enroot, the following error message pops up:

mkdir: cannot create directory '/run/enroot': Permission denied

It looks like the location enroot is trying to use (/run/enroot) is controlled by XDG_RUNTIME_DIR variable. Is there a way to set this variable in a non-invasive way (that is, without modifying the source code) at the make install step?
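
A non-invasive option, given that the location is derived from XDG_RUNTIME_DIR as described above, is to override the variable per invocation rather than at build time (sketch; the chosen path is arbitrary):

```shell
# Sketch of a non-invasive override (no source changes): give enroot a
# writable runtime directory by exporting XDG_RUNTIME_DIR before invoking it.
# ENROOT_RUNTIME_PATH in enroot.conf is the other documented knob.
export XDG_RUNTIME_DIR="${TMPDIR:-/tmp}/runtime-$(id -u)"
mkdir -p "${XDG_RUNTIME_DIR}"
echo "${XDG_RUNTIME_DIR}"
```

With that in place, enroot should use the new location instead of /run/enroot (an assumption based on the XDG behaviour described in the question).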

Enroot import permission error

Description

When launching the following command:
enroot import docker://redis
I get the following output:

[INFO] Querying registry for permission grant
[INFO] Authenticating with user: <anonymous>
[INFO] Authentication succeeded
[INFO] Fetching image manifest list
[INFO] Fetching image manifest
[INFO] Downloading 7 missing layers...

100% 7:0=0s 62f1d3402b787aebcd74aaca5df9d5fe5e8fe4c0706d148a963c70d74a497e51                                                                                                                                       

[INFO] Extracting image layers...

100% 6:0=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                       

[INFO] Converting whiteouts...

0% 0:6=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                         enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted
16% 1:5=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                        enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted
33% 2:4=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                        enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted
50% 3:3=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                        enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted
66% 4:2=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                        enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted
83% 5:1=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                        enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted
100% 6:0=0s bb79b6b2107fea8e8a47133a660b78e3a546998fcf0427be39ac9a0af4a97e90                                                                                                                                       

I don't have any errors when executing enroot import dockerd://redis (without having the image locally)

Context

I use Ubuntu 20.04 (x86). I'd rather use docker:// than dockerd://, since the latter requires being in the docker group.
I tried playing with /etc/enroot/enroot.conf, placing the directories somewhere I was sure I had access to, but this did not work out.
It also works with sudo.
I tried echoing the relevant lines of code, but found nothing helpful.

enroot import failing with NFS and Lustre directories (could not create destination file)

Image import is failing on file output with a permission error on certain NFS and Lustre filesystems. I have tried changing mksquashfs options (-force-uid, etc.) with no solution.

Failing import
(note /mnt/scratch/lcapps has permissions 777)

lcapps@xxx:/mnt/scratch/lcapps/$ enroot import --output nccl.squash 'docker://ubuntu'
[INFO] Querying registry for permission grant
[INFO] Authenticating with user:
[INFO] Authentication succeeded
[INFO] Fetching image manifest
[INFO] Found all digests in cache
[INFO] Extracting image layers...

100% 4:0=0s 5b7339215d1d5f8e68622d584a224f60339f5bef41dbd74330d081e912f0cddd

[INFO] Converting whiteouts...

100% 4:0=0s 5b7339215d1d5f8e68622d584a224f60339f5bef41dbd74330d081e912f0cddd

[INFO] Creating squashfs filesystem...

Could not create destination file: No such file or directory


Passing import:

lcapps@xxx:/mnt/scratch/lcapps$ enroot import --output /tmp/nccl.squash 'docker://ubuntu'
[INFO] Querying registry for permission grant
[INFO] Found valid credentials in cache
[INFO] Fetching image manifest
[INFO] Found all digests in cache
[INFO] Extracting image layers...

100% 4:0=0s 5b7339215d1d5f8e68622d584a224f60339f5bef41dbd74330d081e912f0cddd

[INFO] Converting whiteouts...

100% 4:0=0s 5b7339215d1d5f8e68622d584a224f60339f5bef41dbd74330d081e912f0cddd

[INFO] Creating squashfs filesystem...

Parallel mksquashfs: Using 80 processors
Creating 4.0 filesystem on /tmp/nccl.squash, block size 131072.
[====================================================================================================================================================================|] 2639/2639 100%

Exportable Squashfs 4.0 filesystem, lzo compressed, data block size 131072
uncompressed data, compressed metadata, compressed fragments, compressed xattrs
duplicates are removed
Filesystem size 46816.67 Kbytes (45.72 Mbytes)
74.47% of uncompressed filesystem size (62869.08 Kbytes)
Inode table size 38894 bytes (37.98 Kbytes)
37.39% of uncompressed inode table size (104027 bytes)
Directory table size 32379 bytes (31.62 Kbytes)
51.64% of uncompressed directory table size (62701 bytes)
Number of duplicate files found 99
Number of inodes 3163
Number of files 2410
Number of fragments 251
Number of symbolic links 180
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 573
Number of ids (unique uids + gids) 2
Number of uids 1
lcapps (11868)
Number of gids 1
dip (30)

enroot start: No space left on device

xianwu@haswell04[~]:$df -h /home/xianwu/
Filesystem Size Used Avail Use% Mounted on
filerb.eng.platformlab.ibm.com:/vol/vol2/xianwu 2.0G 618M 1.4G 31% /home/xianwu

xianwu@haswell04[]:$ll /tmp/cuda.sqsh
-rw-r--r-- 1 root root 3584847872 Sep 24 01:35 /tmp/cuda.sqsh
xianwu@haswell04[]:$ll -h /tmp/cuda.sqsh
-rw-r--r-- 1 root root 3.4G Sep 24 01:35 /tmp/cuda.sqsh

xianwu@haswell04[~]:$enroot create --name cuda /tmp/cuda.sqsh
**[INFO] Extracting squashfs filesystem...

Parallel unsquashfs: Using 88 processors
10471 inodes (37434 blocks) to write

[=============================================================================================================|] 37434/37434 100%

created 9690 files
created 1373 directories
created 776 symlinks
created 0 devices
created 0 fifos**

xianwu@haswell04[]:$enroot list -f
NAME PID STATE STARTED TIME MNTNS USERNS COMMAND
cuda. --> image is created successfully.
ubuntu
xianwu@haswell04[]:$

xianwu@haswell04[~]:$enroot start cuda hostname
sed: couldn't close stdout: No space left on device
[ERROR] /etc/enroot/hooks.d/10-shadow.sh exited with return code 4

xianwu@haswell04[]:$df -h /home/xianwu/
Filesystem Size Used Avail Use% Mounted on
filerb.eng.platformlab.ibm.com:/vol/vol2/xianwu 2.0G 2.0G 0 100% /home/xianwu. --> no space.
xianwu@haswell04[]:$
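
A likely cause given the df output above (our reading, not confirmed): the squashfs image is extracted into ENROOT_DATA_PATH, which here resides under the 2.0G home filesystem, and the 3.4G cuda image cannot fit. Pointing the data path at a larger local filesystem in enroot.conf, as in the configuration quoted elsewhere on this page, would avoid filling $HOME:

```
# enroot.conf: extract container root filesystems outside the small $HOME
ENROOT_DATA_PATH  /tmp/enroot-data/user-$(id -u)
```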

Can't import images with unreadable dirs

$ cat Dockerfile
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y --no-install-recommends snapd

$ docker build -t local/bug-repro .
$ enroot import dockerd://local/bug-repro
[INFO] Fetching image

14a5572c7828d4463a60ef81eeddb6aca3286f4b59d68b8a872a355f7df55c03

[INFO] Extracting image content...
find: ‘rootfs/var/lib/snapd/void’: Permission denied

$ docker run --rm local/bug-repro ls -ld /var/lib/snapd/void
d--x--x--x 2 root root 4096 Feb  2 08:21 /var/lib/snapd/void
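
A possible image-side workaround (untested sketch): restore owner read permission on the directory in the Dockerfile, so the unprivileged find run by enroot can traverse it during extraction.

```dockerfile
FROM ubuntu:20.04
# snapd ships /var/lib/snapd/void as d--x--x--x; restoring owner read
# permission lets enroot's unprivileged extraction list the directory.
RUN apt-get update && apt-get install -y --no-install-recommends snapd \
 && chmod u+r /var/lib/snapd/void
```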

error in starting enroot container for gromacs:2020.2 on v100 cards

Getting the below error for: enroot start --rw gromacs:2020.2

nvidia-container-cli: mount error: stat failed: /dev/nvidia-modeset: no such file or directory
[ERROR] /home/apps/centos7/enroot/3.2.0/etc/enroot/hooks.d/98-nvidia.sh exited with return code 1

Checked the solution provided at #15.
Setting NVIDIA_DRIVER_CAPABILITIES=utility gives another error:

ERROR Failed to verify GPU driver: GPU driver verification failed: Failed to parse host GPU driver version Malformed version:
FATAL Please double check that GPU passthrough is enabled: --nv (Singularity), --gpus all (Docker)

GPU info: Tesla V100, driver version 450.51.05, CUDA version 11.0

Dockerfile's ENTRYPOINT has priority over [COMMAND] in start mode

If there is an ENTRYPOINT in the Dockerfile, the container still starts in shell mode even when launched with a command. But maybe that's deliberate?

➜ ~/test cat Dockerfile
from ubuntu
ENTRYPOINT bash
➜ ~/test docker build --tag=test/test:test . && \
enroot import dockerd://test/test:test && \
enroot create -f test+test+test.sqsh
...
➜ enroot start test+test+test echo "HELLO"
jgsch@SERVER:/$
➜ ~/test cat Dockerfile
from ubuntu
➜ ~/test docker build --tag=test/test:test . && \
enroot import dockerd://test/test:test && \
enroot create -f test+test+test.sqsh
...
➜ enroot start test+test+test echo "HELLO"
HELLO

Interactive login skipped when username in credentials file for another registry

Here's the line in question:
https://github.com/NVIDIA/enroot/blob/v1.1.0/src/docker.sh#L55
Are you sure that's what you want to do?

$ echo 'machine a.com login lyeager password foo' >.credentials
$ enroot import "docker://[email protected]#myimage:v1.0"
[INFO] Using credentials from file: /home/lyeager/.config/enroot/.credentials

That log line makes it look like curl is going to find a matching username+pw for the host in question for this request. In fact, it's not.

missing enroot-check_3.2.0_x86_64.run in 3.2.0 release

Just a quick note to let you know that:

$ curl -fSsL -O https://github.com/NVIDIA/enroot/releases/download/v3.2.0/enroot-check_3.2.0_$(uname -m).run
curl: (22) The requested URL returned error: 404 Not Found


Looks like the x86_64 version of the check script is still version 3.1.1 in 3.2.0.

Thanks!

After applying Pyxis, a namespace permission error occurred.

Environment:
Baremetal 8xA100
PRETTY_NAME="Oracle Linux Server 7.9"
Kernel: 5.4.17-2036.100.6.1.el7uek.x86_64
NVIDIA-SMI 450.51.06 Driver Version: 450.51.06 CUDA Version: 11.0
slurm-20.11.2-1.el7.x86_64

Error:
After applying Pyxis, we ran a job test; however, a namespace permission error occurred.
Problem state
$ srun --container-image=/nfs/cluster/pyxis_test/enroot_image/centos.sqsh grep PRETTY /etc/os-release
slurmstepd: error: pyxis: container start failed with error code: 1
slurmstepd: error: pyxis: printing contents of log file ...
slurmstepd: error: pyxis: enroot-nsenter: failed to create user namespace: Permission denied
slurmstepd: error: pyxis: couldn't start container
slurmstepd: error: pyxis: if the image has an unusual entrypoint, try using --no-container-entrypoint
slurmstepd: error: spank: required plugin spank_pyxis.so: task_init() failed with rc=-1
slurmstepd: error: Failed to invoke spank plugin stack
srun: error: inst-foysz-vocal-drake: task 0: Exited with exit code 1

Check the current settings

$ ./enroot-check_*.run --verify
Kernel version:
Linux version 5.4.17-2036.100.6.1.el7uek.x86_64 (mockbuild@jenkins-172-17-0-2-980b5770-aa59-43ca-ab82-ac28f1d4376e) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39.0.3) (GCC)) #2 SMP Thu Oct 29 17:04:48 PDT 2020
Kernel configuration:
CONFIG_NAMESPACES                 : OK
CONFIG_USER_NS                    : OK
CONFIG_SECCOMP_FILTER             : OK
CONFIG_OVERLAY_FS                 : OK (module)
CONFIG_X86_VSYSCALL_EMULATION     : OK
CONFIG_VSYSCALL_EMULATE           : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_NATIVE            : KO (required if glibc <= 2.13)
 
Kernel command line:
 
vsyscall=native                   : KO (required if glibc <= 2.13)
vsyscall=emulate                  : KO (required if glibc <= 2.13)
 
Kernel parameters:
 
user.max_user_namespaces          : OK
user.max_mnt_namespaces           : OK
 
Extra packages:
 
nvidia-container-cli              : OK

Kernel parameters were updated and a reboot performed. After the reboot:

[root@inst-foysz-vocal-drake ~]# sudo grubby --info /boot/vmlinuz-5.4.17-2036.100.6.1.el7uek.x86_64
index=1
kernel=/boot/vmlinuz-5.4.17-2036.100.6.1.el7uek.x86_64
args="ro crashkernel=auto LANG=en_US.UTF-8 console=tty0 console=ttyS0,9600 rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 rd.iscsi.bypass=1 rd.net.timeout.carrier=5 rd.net.timeout.dhcp=10 netroot=iscsi:169.254.0.2:::1:iqn.2015-02.oracle.boot:uefi iscsi_param=node.session.timeo.replacement_timeout=6000 net.ifnames=1 nvme_core.shutdown_timeout=10 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 ipmi_si.trydefaults=0 libiscsi.debug_libiscsi_eh=1 loglevel=4 ip=dhcp nouveau.modeset=0 rd.driver.blacklist=nouveau namespace.unpriv_enable=1 user_namespace.enable=1"
root=UUID=a46bac1b-7608-4d7c-b8c6-a3abca37be9f
initrd=/boot/initramfs-5.4.17-2036.100.6.1.el7uek.x86_64.img
title=Oracle Linux Server 7.9, with Unbreakable Enterprise Kernel 5.4.17-2036.100.6.1.el7uek.x86_64
[root@inst-foysz-vocal-drake ~]#

Investigation:
I started looking at enroot source code and found bundle.sh which has logic to validate if the kernel parameters were set.
https://github.com/NVIDIA/enroot/blob/master/src/bundle.sh#L145:L155
You will see it only checks for “centos7*|rhel7*”. Should this be updated to add Oracle Linux, like this: “centos7*|rhel7*|ol7*”?

On OCI cloud

source /etc/os-release 2> /dev/null; echo "${ID-}${VERSION_ID-}"
ol7.9

I did a sample “enroot” test like below; it worked fine both with the default bundle.sh and with bundle.sh modified to include “ol7*”.

I ran this sample example. It is standalone, so it helps to isolate enroot issues from Slurm, or from Slurm/pyxis invoking enroot.

Usage example:

# Import and start an Ubuntu image from DockerHub
$ enroot import docker://ubuntu
$ enroot create ubuntu.sqsh
$ enroot start ubuntu

I am not an enroot or pyxis expert, but I am assuming, since standalone enroot works, the issue could be in how pyxis is configured to use enroot or pyxis follows a code path which is different from the above simple example.

How was pyxis installed and configured?

Install

Since an RPM package is not currently provided for Oracle Linux, a source installation was used: clone the source from the Pyxis GitHub and build.

$ git clone https://github.com/NVIDIA/pyxis.git
$ cd pyxis
$ sudo make install
cc -std=gnu11 -O2 -g -Wall -Wunused-variable -fstack-protector-strong -fpic -D_GNU_SOURCE ... (repeated for each source file)
cc -shared -Wl,-znoexecstack -Wl,-zrelro -Wl,-znow -o spank_pyxis.so spank_pyxis.lds common...
strip --strip-unneeded -R .comment spank_pyxis.so
install -d -m 755 /usr/local/lib/slurm
install -m 644 spank_pyxis.so /usr/local/lib/slurm
install -d -m 755 /usr/local/share/pyxis
echo 'required /usr/local/lib/slurm/spank_pyxis.so' | install -m 644 /dev/stdin /usr/local/
$ sudo mkdir /etc/slurm/plugstack.conf.d
$ ln -s /usr/local/share/pyxis/pyxis.conf /etc/slurm/plugstack.conf.d/pyxis.conf
# systemctl restart slurmd

The GitHub installation guide has you symlink pyxis.conf (which names the spank_pyxis.so library path) into /etc/slurm/plugstack.conf.d/ so that Slurm recognizes the SPANK plugin, but even with that path in place Slurm did not seem to pick it up. As an alternative, we created a plugstack.conf file in the Slurm config directory that explicitly lists the spank_pyxis.so library path.

Install guide on GitHub ← has a problem

# ln -s /usr/local/share/pyxis/pyxis.conf /etc/slurm/plugstack.conf.d/pyxis.conf
# cat /usr/local/share/pyxis/pyxis.conf
required /usr/local/lib/slurm/spank_pyxis.so

Our changed install steps ← work well

Master node
# vi /etc/slurm/plugstack.conf
required /usr/local/lib/slurm/spank_pyxis.so
# systemctl restart slurmctld && systemctl status slurmctld
Worker Nodes
# for i in {1..2}; do ssh node$i "mkdir /usr/local/lib/slurm"; done
# for i in {1..2}; do scp /usr/local/lib/slurm/spank_pyxis.so node$i:/usr/local/lib/slurm/; done
# for i in {1..2}; do scp /etc/slurm/plugstack.conf node$i:/etc/slurm/; done
# for i in {1..4}; do ssh node$i "systemctl restart slurmd && systemctl status slurmd"; done

$ cat /etc/enroot/enroot.conf 
#ENROOT_LIBRARY_PATH        /usr/lib/enroot
#ENROOT_SYSCONF_PATH        /etc/enroot
ENROOT_RUNTIME_PATH        /tmp/enroot/user-$(id -u)
#ENROOT_CONFIG_PATH         ${XDG_CONFIG_HOME}/enroot
#ENROOT_CACHE_PATH          ${XDG_CACHE_HOME}/enroot
ENROOT_DATA_PATH           /tmp/enroot-data/user-$(id -u)
#ENROOT_TEMP_PATH           ${TMPDIR:-/tmp}

Had to change the enroot paths from /run to /tmp, otherwise it fails with:
slurmstepd: error: pyxis: mkdir: cannot create directory '/run/enroot': Permission denied

Run GUI applications?

Have you succeeded in running TensorBoard or a Jupyter notebook on a Slurm+enroot cluster? If so, I would like to see an example.

run via slurm: permission denied

I installed enroot version enroot-hardened+caps_2.0.1-1 on both frontend node and compute node n1.

I opened an interactive shell:

ADuser@frontend:~$ salloc --nodelist=n1 --ntasks=1 -c 2 --gres=gpu:1 srun --pty bash -l

and then ran this command:

enroot list

And got this error:

mkdir: cannot create directory '/run/user/140196': Permission denied

If I log in to n1 with a local user and run enroot list, it works. It also works if I run the same command on the frontend node as ADuser: ADuser@frontend:~$ enroot list.
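This looks like the per-user runtime directory problem described elsewhere in this thread: inside a Slurm job there is no logind session, so /run/user/&lt;uid&gt; never gets created. A hedged workaround (assuming that diagnosis is right) is to point enroot's runtime path at a location every user can create, as another report here does:

```
# /etc/enroot/enroot.conf -- /tmp is world-writable, so the per-user
# directory can be created without relying on systemd-logind having
# set up /run/user/<uid> for the job.
ENROOT_RUNTIME_PATH /tmp/enroot/user-$(id -u)
```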

Unable to get shell on existing running GPU containers

I was trying to get a shell on a running container but encountered this issue; I'm not sure how GPUs are handled by pyxis and enroot. Below are the test cases.

i.e. something similar to docker exec

Case 1:[GPU job]

srun --pty -p debug  --gpus=1 --container-image=pytorch:19.10-py3  sleep 1000
slurmstepd: pyxis: running "enroot import" ...
slurmstepd: pyxis: running "enroot create" ...
slurmstepd: pyxis: running "enroot start" ...
slurmstepd: task_p_pre_launch: Using sched_affinity for tasks
srun --pty  --container-name=188639.0 --jobid=188639 bash
srun: Job 188639 step creation temporarily disabled, retrying

If I use a non-GPU container it works, but with a GPU container I am not able to get a shell.

Case 2: [Non-GPU container]

srun  --pty --container-image=ubuntu  sleep 1000
slurmstepd: pyxis: running "enroot import" ...
slurmstepd: pyxis: running "enroot create" ...
slurmstepd: pyxis: running "enroot start" ...
slurmstepd: task_p_pre_launch: Using sched_affinity for tasks
srun  --pty --container-name=1142.0 --jobid=1142 bash
slurmstepd: pyxis: reusing existing container
slurmstepd: pyxis: running "enroot start" ...
slurmstepd: task_p_pre_launch: Using sched_affinity for tasks
root@compute3:#

Case 3:[GPU Job]

 srun --pty  --gpus=1 --container-image=pytorch:19.10-py3  sleep 1000
slurmstepd: pyxis: running "enroot import" ...
slurmstepd: pyxis: running "enroot create" ...
slurmstepd: pyxis: running "enroot start" ...
slurmstepd: task_p_pre_launch: Using sched_affinity for tasks

With the second srun, Slurm created a step with the existing container, but with --gpus=0 the error below occurred:

slurmstepd: error: pyxis: [ERROR] /etc/enroot/hooks.d/98-nvidia.sh exited with return code 1

srun  --pty  --gpus=0 --container-name=1144.0 --jobid=1144 bash
slurmstepd: pyxis: reusing existing container
slurmstepd: pyxis: running "enroot start" ...
slurmstepd: error: pyxis: container start failed with error code: 1
slurmstepd: pyxis: printing contents of log file ...
slurmstepd: error: pyxis:     nvidia-container-cli: initialization error: cuda error: no cuda-capable device is detected
slurmstepd: error: pyxis:     [ERROR] /etc/enroot/hooks.d/98-nvidia.sh exited with return code 1
slurmstepd: error: spank: required plugin spank_pyxis.so: task_init_privileged() failed with rc=-1
slurmstepd: error: spank_task_init_privileged failed
slurmstepd: error: write to unblock task 0 failed: Broken pipe
srun: error: compute-dgx-03: task 0: Exited with exit code 1
srun: Terminating job step 1144.1

RFE: parse credentials file, then swizzle docker URI

I have a feeling this will be the initial experience for most users when trying to import from a private registry:

$ enroot import my-registry.com/foo/bar:v1.0
[ERROR] Invalid argument: my-registry.com/foo/bar:v1.0

$ # find out about the docker:// prefix

$ enroot import docker://my-registry.com/foo/bar:v1.0
[ERROR] URL https://registry-1.docker.io/v2/my-registry.com/foo/bar/manifests/v1.0 returned error code: 401 

$ # realize that URL doesn't look right
$ # find out about the '/' to '#' conversion

$ enroot import docker://my-registry.com#foo/bar:v1.0
[ERROR] URL https://my-registry.com/jwt/auth returned error code: 403 

$ # realize there's some sort of permissions error going on
$ # find out about the credentials file

$ echo "machine my-registry.com login $USER password $ACCESS_TOKEN" >>~/.config/enroot/.credentials
-bash: /home/lyeager/.config/enroot/.credentials: No such file or directory
$ mkdir -p ~/.config/enroot/
$ echo "machine my-registry.com login $USER password $ACCESS_TOKEN" >>~/.config/enroot/.credentials
$ enroot import docker://my-registry.com#foo/bar:v1.0

$ # still didn't work ...
$ # on a whim, try adding the user explicitly

$ enroot import docker://USER@my-registry.com#foo/bar:v1.0

$ # OMG it works!

I have a proposal to help with some of that pain.

What if we parsed the .credentials file before attempting to use curl? We would keep a list of (machine, login) pairs and then parse the URI given on the command line. If the registry is null, the user may have forgotten to swap the "/" for a "#": check whether the first part of the image matches any hostname in the credentials file, and if so, swap the next "/" for a "#". Also, add the "USER@" prefix to the URI if it is not present, since we can read the user from the credentials file.
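The proposal can be sketched in a few lines of shell. The `swizzle_uri` function and its exact matching rules are hypothetical, written only to illustrate the lookup, not to mirror enroot internals:

```shell
#!/bin/sh
# Hypothetical sketch of the proposed swizzle: read (machine, login)
# pairs from a netrc-style credentials file and rewrite a bare
# "docker://host/image" reference into enroot's "docker://login@host#image" form.
swizzle_uri() {
    creds=$1 uri=$2
    ref=${uri#docker://}
    host=${ref%%/*}
    # netrc line: "machine <host> login <user> password <secret>"
    login=$(awk -v m="$host" '$1 == "machine" && $2 == m { print $4; exit }' "$creds")
    if [ -n "$login" ] && [ "$host" != "$ref" ]; then
        rest=${ref#*/}
        printf 'docker://%s@%s#%s\n' "$login" "$host" "$rest"
    else
        printf '%s\n' "$uri"    # unknown host: leave the URI untouched
    fi
}
```

A real implementation would also have to handle ports in the hostname and images whose first path component legitimately contains a dot.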

enroot-nsenter: failed to create user namespace: Invalid argument

I'm trying to build a deepops test environment with 2 KVM worker VMs (CentOS 7) with 1 V100D-8Q vGPU and 64G RAM each. I don't know if that makes sense, but for Kubernetes it worked mostly. For slurm I followed https://github.com/NVIDIA/deepops/blob/master/docs/deepops/dgx-pod.md#7-deploy-slurm, which also worked (with some modifications because of different NVIDIA driver for vGPU).

Now I want to "Verify Pyxis and Enroot can run GPU jobs across all nodes." (https://github.com/NVIDIA/deepops/tree/master/docs/slurm-cluster). The command that's run by ansible is:
srun -N 2 -G 2 --ntasks-per-node=1 --mpi=pmix --exclusive --container-image=deepops/nccl-tests-tf20.06-ubuntu18.04:latest all_reduce_perf -b 1M -e 4G -f 2 -g 1

It fails with:
pyxis: creating container filesystem ...
pyxis: starting container ...
slurmstepd: error: pyxis: container start failed with error code: 1
slurmstepd: error: pyxis: printing contents of log file ...
slurmstepd: error: pyxis: enroot-nsenter: failed to create user namespace: Invalid argument
slurmstepd: error: pyxis: couldn't start container
slurmstepd: error: pyxis: if the image has an unusual entrypoint, try using --no-container-entrypoint
slurmstepd: error: spank: required plugin spank_pyxis.so: task_init() failed with rc=-1
slurmstepd: error: Failed to invoke spank plugin stack
srun: error: deepops-node1: task 0: Exited with exit code 1

I'm very new to Slurm/pyxis/enroot, so I don't know what to make of "enroot-nsenter: failed to create user namespace: Invalid argument" and how to debug this. Any help appreciated, thank you very much.

PPC64LE support?

Having access to an IBM AC922 with NVIDIA GPUs, we would like to use enroot with pyxis/Slurm for simple and secure containers.

Is this possible?

Problem in enroot working with cgroups

Getting the following error when using enroot with cgroups.

enroot start --rw lammps
nvidia-container-cli: mount error: open failed: /scratch/pbs/enroot-data/user-609.chas052/lammps/sys/fs/cgroup/devices/user.slice/devices.allow: permission denied

/scratch/pbs/enroot-data/user-609.chas052/lammps is our enroot data path

Below is the directory structure, which has all the required permissions; please let us know where it is going wrong.

ll lammps/
total 80
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 bin
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Apr 24 2018 boot
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 dev
drwxr-xr-x 38 skapil.vfaculty cc_vfaculty 4096 Jan 11 12:55 etc
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Apr 24 2018 home
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Jan 11 12:55 host_pwd
drwxr-xr-x 8 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 lib
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 lib64
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 media
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 mnt
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 opt
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Apr 24 2018 proc
drwx------ 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 root
drwxr-xr-x 5 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 run
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 sbin
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 srv
drwxr-xr-x 2 skapil.vfaculty cc_vfaculty 4096 Apr 24 2018 sys
drwxrwxrwt 2 skapil.vfaculty cc_vfaculty 4096 Nov 14 2018 tmp
drwxr-xr-x 10 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 usr
drwxr-xr-x 11 skapil.vfaculty cc_vfaculty 4096 Oct 19 2018 var
[skapil.vfaculty@vsky003 /scratch/pbs/enroot-data/user-610.chas052]
$ ls -ld lammps/
drwxr-xr-x 22 skapil.vfaculty cc_vfaculty 4096 Jan 11 12:55 lammps/

Switch to lz4 as default squashfs compression?

For the simple example of enroot import docker://nvcr.io#nvidia/cuda:10.0-base, moving to lz4 instead of lzo is 2-3x faster in my quick testing and seems to end up with smaller images. lz4 was not installed by default on my system, but on Debian systems, sudo apt install liblz4-tool was all that was required.
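Until/unless the default changes, the compression can be overridden per user via the ENROOT_SQUASH_OPTIONS variable (the same one used later in this thread to force xz), assuming your mksquashfs was built with lz4 support:

```
# /etc/enroot/enroot.conf or a user override; the value is passed
# through to mksquashfs.
ENROOT_SQUASH_OPTIONS -comp lz4
```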

[WARN] fusermount: mount failed: Operation not permitted

Hi!

I'm trying to track down a permission issue with enroot start on RHEL/CentOS 7.6.

All the listed requirements seem to be satisfied:

# rpm -q glibc
glibc-2.17-260.el7_6.6.x86_64

# ./enroot-check_2.0.0.run --verify
Kernel version:

Linux version 3.10.0-957.27.2.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Mon Jul 29 17:46:05 UTC 2019

Kernel configuration:

CONFIG_NAMESPACES                 : OK
CONFIG_USER_NS                    : OK
CONFIG_SECCOMP_FILTER             : OK
CONFIG_OVERLAY_FS                 : OK (module)
CONFIG_X86_VSYSCALL_EMULATION     : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_EMULATE           : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_NATIVE            : KO (required if glibc <= 2.13)

Kernel command line:

namespace.unpriv_enable=1         : OK
user_namespace.enable=1           : OK
vsyscall=native                   : KO (required if glibc <= 2.13)
vsyscall=emulate                  : KO (required if glibc <= 2.13)

Kernel parameters:

user.max_user_namespaces          : OK
user.max_mnt_namespaces           : OK

Extra packages:

nvidia-container-cli              : OK
pv                                : OK

and

# rpm -q squashfuse fuse-overlayfs
squashfuse-0.1.102-1.el7.x86_64
fuse-overlayfs-0.5.2-1.el7.x86_64

I can import a container just fine:

# ENROOT_SQUASH_OPTIONS='-comp xz -noD' enroot import -o /tmp/alpine.sqsh docker://alpine
[INFO] Querying registry for permission grant
[INFO] Authenticating with user: <anonymous>
[INFO] Authentication succeeded
[...]
Parallel mksquashfs: Using 40 processors
Creating 4.0 filesystem on /tmp/alpine.sqsh, block size 131072.
[==============================================================================================================================================================================================================================================================|] 105/105 100%

Exportable Squashfs 4.0 filesystem, xz compressed, data block size 131072
        uncompressed data, compressed metadata, compressed fragments, compressed xattrs
        duplicates are removed
Filesystem size 5030.47 Kbytes (4.91 Mbytes)
        91.86% of uncompressed filesystem size (5476.08 Kbytes)
Inode table size 2374 bytes (2.32 Kbytes)
        13.65% of uncompressed inode table size (17387 bytes)
Directory table size 4207 bytes (4.11 Kbytes)
        50.96% of uncompressed directory table size (8255 bytes)
Number of duplicate files found 7
Number of inodes 486
Number of files 73
Number of fragments 6
Number of symbolic links  324
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 89
Number of ids (unique uids + gids) 1
Number of uids 1
        root (0)
Number of gids 1
        root (0)

But then (as root):

# enroot start /tmp/alpine.sqsh
[WARN] fusermount: mount failed: Operation not permitted
[ERROR] Failed to mount: /tmp/alpine.sqsh

The command that fails is in runtime::_do_mount_rootfs()

[WARN] + squashfuse -f -o uid=0,gid=0 /tmp/alpine.sqsh /run/enroot/alpine/lower
[WARN] fusermount: mount failed: Operation not permitted

Fuse and fusermount work fine on that host (especially as root), and I can run that very same command outside of enroot:

# squashfuse -o uid=0,gid=0 /tmp/alpine.sqsh /run/enroot/alpine/lower
# df /run/enroot/alpine/lower
Filesystem     1K-blocks  Used Available Use% Mounted on
squashfuse             0     0         0    - /run/enroot/alpine/lower

But I can't figure out where the EPERM in enroot start is coming from. Any hint to diagnose the issue and hopefully fix it would be much appreciated!

Thanks!

Why not Singularity?

Nice work! I just stumbled across this. Is there information about its benefits compared to other container solutions? Some points stand out from your general description, but more guidance and reasoning would be helpful.

unable to pull gcr images

enroot import docker://gcr.io#google-containers/addon-builder
[INFO] Querying registry for permission grant
[INFO] Found valid credentials in cache
[INFO] Fetching image manifest list
[INFO] Fetching image manifest
[ERROR] URL https://gcr.io/v2/google-containers/addon-builder/manifests/latest returned error code: 401 Unauthorized

I wonder if it's even picking up my credentials file: even after deleting all the cache and the credentials file, it still says "Found valid credentials in cache". I am able to pull from Docker Hub, but no luck with gcr private/public images.
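For what it's worth, the enroot configuration docs describe a netrc-style credentials entry for GCR that authenticates with an OAuth2 access token; something along these lines may be worth a try (hedged: check your enroot version's configuration docs for the exact form):

```
# ~/.config/enroot/.credentials -- GCR expects an OAuth2 access token
# as the password; "oauth2accesstoken" is the literal login name.
machine gcr.io login oauth2accesstoken password $(gcloud auth print-access-token)
```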

seccomp issues in enroot container

Hi,

I am having some issues running a Docker container (https://github.com/LPDI-EPFL/masif/blob/master/docker_tutorial.md) via enroot v3.1.1: I get "Bad system call (core dumped)" when running a static binary (reduce).

The program successfully runs outside enroot and when I run the container using Docker. What's strange is that reduce also successfully runs if the enroot container is run via pyxis v0.8.1 on our SLURM cluster.

As far as I understand it, the problem is likely related to the seccomp filter set here:

enroot/bin/enroot-nsenter.c, lines 253 to 263 (commit 98dca83):

static int
seccomp_set_filter(void)
{
#ifdef ALLOW_SPECULATION
        if ((int)syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_SPEC_ALLOW, &(const struct sock_fprog){ARRAY_SIZE(filter), filter}) == 0)
                return (0);
        else if (errno != EINVAL)
                return (-1);
#endif
        return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &(const struct sock_fprog){ARRAY_SIZE(filter), filter});
}

Pyxis seems to set its seccomp filters differently which probably makes it work there https://github.com/NVIDIA/pyxis/blob/5d6d9a3657e853b32bb790dc7a89dc4d60ea0c8c/seccomp_filter.h#L70-L79

Is there a simple way to change the seccomp filter for enroot to get reduce also to work locally (without running via pyxis and SLURM)?

I already tried to compile enroot from source with -DALLOW_SPECULATION because I thought SECCOMP_FILTER_FLAG_SPEC_ALLOW might grant some additional permissions, but this didn't help (though I might have messed up somewhere). Edit: I just realized this is probably not relevant here.

Thank you

Outputs

enroot-check --verify:

Kernel version:

Linux version 4.18.0-193.28.1.el8_2.x86_64 ([email protected]) (gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)) #1 SMP Thu Oct 22 00:20:22 UTC 2020

Kernel configuration:

CONFIG_NAMESPACES                 : OK
CONFIG_USER_NS                    : OK
CONFIG_SECCOMP_FILTER             : OK
CONFIG_OVERLAY_FS                 : OK (module)
CONFIG_X86_VSYSCALL_EMULATION     : OK
CONFIG_VSYSCALL_EMULATE           : KO (required if glibc <= 2.13)
CONFIG_VSYSCALL_NATIVE            : KO (required if glibc <= 2.13)

Kernel command line:

vsyscall=native                   : KO (required if glibc <= 2.13)
vsyscall=emulate                  : OK

Kernel parameters:

user.max_user_namespaces          : OK
user.max_mnt_namespaces           : OK

Extra packages:

nvidia-container-cli              : OK

strace of reduce inside enroot:

execve("./reduce.3.23.130521.linuxi386", ["./reduce.3.23.130521.linuxi386"], 0x7ffdd8653080 /* 15 vars */) = 0
strace: [ Process PID=38139 runs in 32 bit mode. ]
uname( <unfinished ...>)                = ?
+++ killed by SIGSYS (core dumped) +++
Bad system call (core dumped)

strace of reduce on bare metal:

execve("./reduce.3.23.130521.linuxi386", ["./reduce.3.23.130521.linuxi386"], 0x7ffde750e800 /* 26 vars */) = 0
strace: [ Process PID=37676 runs in 32 bit mode. ]
uname({sysname="Linux", nodename="localhost.localdomain", ...}) = 0
brk(NULL)                               = 0x885b000
brk(0x885bcb0)                          = 0x885bcb0
set_thread_area({entry_number=-1, base_addr=0x885b830, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=12)
brk(0x887ccb0)                          = 0x887ccb0
brk(0x887d000)                          = 0x887d000
stat64("reduce_wwPDB_het_dict.txt", 0xff8af580) = -1 ENOENT (No such file or directory)
write(2, "reduce: version 3.23 05/21/2013,"..., 69reduce: version 3.23 05/21/2013, Copyright 1997-2013, J. Michael Word) = 69
write(2, "\n", 1
)                       = 1
write(2, "reduce.3.23.130521", 18reduce.3.23.130521)      = 18
write(2, "\n", 1
)                       = 1
write(2, "arguments: [-flags] filename or "..., 33arguments: [-flags] filename or -) = 33
write(2, "\n", 1
)                       = 1
write(2, "\n", 1
)                       = 1
write(2, "Suggested usage:", 16Suggested usage:)        = 16
write(2, "\n", 1
)                       = 1
write(2, "reduce -FLIP myfile.pdb > myfile"..., 53reduce -FLIP myfile.pdb > myfileFH.pdb (do NQH-flips)) = 53
write(2, "\n", 1
)                       = 1
write(2, "reduce -NOFLIP myfile.pdb > myfi"..., 61reduce -NOFLIP myfile.pdb > myfileH.pdb (do NOT do NQH-flips)) = 61
write(2, "\n", 1
)                       = 1
write(2, "\n", 1
)                       = 1
write(2, "Flags:", 6Flags:)                   = 6
write(2, "\n", 1
)                       = 1
write(2, "-FLIP             add H and rota"..., 54-FLIP             add H and rotate and flip NQH groups) = 54
write(2, "\n", 1
)                       = 1
write(2, "-NOFLIP           add H and rota"..., 59-NOFLIP           add H and rotate groups with no NQH flips) = 59
write(2, "\n", 1
)                       = 1
write(2, "-Trim             remove (rather"..., 52-Trim             remove (rather than add) hydrogens) = 52
write(2, "\n", 1
)                       = 1
write(2, "\n", 1
)                       = 1
write(2, "-NOBUILD#.#       build with a g"..., 61-NOBUILD#.#       build with a given penalty often 200 or 999) = 61
write(2, "\n", 1
)                       = 1
write(2, "-BUILD            add H, includi"..., 73-BUILD            add H, including His sc NH, then rotate and flip groups) = 73
write(2, "\n", 1
)                       = 1
write(2, "                  (except for pr"..., 71                  (except for pre-existing methionine methyl hydrogens)) = 71
write(2, "\n", 1
)                       = 1
write(2, "\n", 1
)                       = 1
write(2, "\n", 1
)                       = 1
write(2, "-STRING           pass reduce a "..., 75-STRING           pass reduce a string object from a script, must be quoted) = 75
write(2, "\n", 1
)                       = 1
write(2, "usage: from within a script, red"..., 71usage: from within a script, reduce -STRING "_name_of_string_variable_") = 71
write(2, "\n", 1
)                       = 1
write(2, "\n", 1
)                       = 1
write(2, "-Quiet            do not write e"..., 56-Quiet            do not write extra info to the console) = 56
write(2, "\n", 1
)                       = 1
write(2, "-REFerence        display citati"..., 44-REFerence        display citation reference) = 44
write(2, "\n", 1
)                       = 1
write(2, "-Version          display the ve"..., 47-Version          display the version of reduce) = 47
write(2, "\n", 1
)                       = 1
write(2, "-Changes          display the ch"..., 40-Changes          display the change log) = 40
write(2, "\n", 1
)                       = 1
write(2, "-Help             the more exten"..., 74-Help             the more extensive description of command line arguments) = 74
write(2, "\n", 1
)                       = 1
exit_group(1)                           = ?
+++ exited with 1 +++

How to run an enroot instance on multiple nodes, in order to run a multinode job using a container

We want to run a multinode MPI job using enroot. The nodes are interconnected with InfiniBand, and passwordless SSH is enabled. The problem is that when we try to run a multinode job from one of the nodes, mpirun is not able to get the same container environment on the other nodes. Please guide us on how to activate the same enroot container environment on multiple nodes.
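An enroot container is purely local state, so nothing propagates to other nodes automatically; each node has to create and start its own instance. One hedged sketch (a hypothetical wrapper, assuming the .sqsh image sits on a shared filesystem and enroot is installed on every node) is to have mpirun launch the wrapper instead of the application directly:

```
#!/bin/sh
# run-in-enroot.sh -- hypothetical per-rank wrapper: create the
# container on this node if missing, then run the command inside it.
# Note: multiple ranks landing on the same node may race on create.
set -e
IMAGE=/shared/myimage.sqsh   # assumed shared path
NAME=myjob
enroot list | grep -qx "$NAME" || enroot create --name "$NAME" "$IMAGE"
exec enroot start "$NAME" "$@"
```

Then something like `mpirun -np 2 -H node1,node2 ./run-in-enroot.sh ./my_mpi_app`. On a Slurm cluster, pyxis with `srun --container-image` automates exactly this per-node create/start dance.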

Is there any way to name an enroot container so that containers can be filtered more easily with enroot list -f?

Below are the containers running on my machine:
[root@rcn04 ~]# enroot list -f
NAME PID STATE STARTED TIME MNTNS USERNS COMMAND
ubuntu 1583996 S 05:08 01:51:13 4026532603 4026532602 /bin/sh /root/test/1597122491.2123
1588974 S 05:31 01:27:44 4026532606 4026532605 /bin/sh -c sleep 22222222222222222
1588979 S 05:31 01:27:44 4026532608 4026532607 /bin/sh -c sleep 22222222222222222

I cannot distinguish them without maintaining a container-to-PID map.
Is there an option to name the container, like "docker run --name"?

Thanks
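For what it's worth, enroot lets you pick the name when the container is created; the sketch below assumes an enroot version whose create command accepts --name:

```
# Name the container at creation time ...
enroot create --name training-run-1 ubuntu.sqsh
# ... then that name shows up in the NAME column of `enroot list -f`.
enroot start training-run-1
```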

RFE: import local docker image

It'd be cool if I could import a local docker image with enroot. Maybe something like this?

docker build -t my-tmp-image .
enroot import docker:///my-tmp-image

Or maybe there's a way to trick enroot into using my local daemon as a registry somehow?
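For reference, newer enroot releases added a dockerd:// scheme that imports straight from the local Docker daemon (check `enroot import --help` on your version before relying on it):

```
docker build -t my-tmp-image .
enroot import dockerd://my-tmp-image   # reads from the local daemon, not a registry
```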

Starting containers fails with EACCES on bash/sh for select users

[Caveat: running the ppc64le branch]

My users report that they cannot start any container; they fail with:

enroot-switchroot: failed to execute: /bin/sh: Permission denied

I straced the execution and found the following:

For my own unprivileged user, things work fine (sanitized):

chdir("${HOME}")                        = 0
access("/bin/bash", R_OK|X_OK)          = 0
access("/etc/rc", F_OK)                 = 0
execve("/bin/bash", ["-bash", "/etc/rc", "nvidia-smi"], 0x123d700c0 /* 13 vars */) = 0
brk(NULL)                               = 0x12cdc0000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
....

For other users, not so much:

chdir("${HOME}") = 0
access("/bin/bash", R_OK|X_OK)          = -1 EACCES (Permission denied)
access("/etc/rc", F_OK)                 = 0
execve("/bin/sh", ["-sh", "/etc/rc", "nvidia-smi"], 0x107a900e0 /* 13 vars */) = -1 EACCES (Permission denied)
writev(2, [{iov_base="enroot-switchroot: ", iov_len=19}, {iov_base=NULL, iov_len=0}], 2enroot-switchroot: ) = 19
writev(2, [{iov_base="failed to execute: /bin/sh", iov_len=26}, {iov_base=NULL, iov_len=0}], 2failed to execute: /bin/sh) = 26
writev(2, [{iov_base="", iov_len=0}, {iov_base=": ", iov_len=2}], 2: ) = 2
writev(2, [{iov_base="", iov_len=0}, {iov_base="Permission denied", iov_len=17}], 2Permission denied) = 17
writev(2, [{iov_base="", iov_len=0}, {iov_base="\n", iov_len=1}], 2
) = 1
exit_group(1)                           = ?
+++ exited with 1 +++

== Notes ==

The permissions of sh and bash are ok, and although they have SELinux labels:

# ls -alZ /bin/bash /bin/sh
-rwxr-xr-x. 1 root root system_u:object_r:shell_exec_t:s0 1980208 Aug 30  2019 /bin/bash
lrwxrwxrwx. 1 root root system_u:object_r:bin_t:s0              4 Aug 30  2019 /bin/sh -> bash

SELinux is off:

# sestatus
SELinux status:                 disabled

The home directory is on a network file system, but since it works for my unprivileged user, I don't think that bit matters.

I would appreciate some guidance on how to debug this :)

Unable to spawn more pty devices

I am trying to run a Jupyter Lab instance inside of a container using the continuumio/anaconda3 image. I did the recommended steps for pulling and creating the enroot container image. When running this container in Docker, I am able to do the following:

# python
>>> import pty
>>> pty.fork()
(12, 3)

But when I do this on my enroot container:

# python
>>> import pty
>>> pty.fork()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.7/pty.py", line 96, in fork
    master_fd, slave_fd = openpty()
  File "/opt/conda/lib/python3.7/pty.py", line 29, in openpty
    master_fd, slave_name = _open_terminal()
  File "/opt/conda/lib/python3.7/pty.py", line 59, in _open_terminal
    raise OSError('out of pty devices')
OSError: out of pty devices
>>>

This is important to Jupyter in that this is the process it uses to run bash commands or open up an interactive shell.

Thanks for the help!

How to run enroot in a managed K8s?

Hi, how do I set up the enroot kernel configuration in a customized Kubernetes setup? Or is this simply not possible when running in an unprivileged K8s pod?

./enroot-check_*.run --verify
[ERROR] Could not find kernel configuration
cat /proc/sys/user/max_user_namespaces
3062836

cat /proc/sys/user/max_mnt_namespaces
3062836

cat /proc/sys/kernel/unprivileged_userns_clone
1
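The "[ERROR] Could not find kernel configuration" message presumably just means the check script could not locate the kernel config file; a hedged guess is that neither of the usual locations is visible inside the pod:

```
# Locations a kernel-config check typically reads:
ls /proc/config.gz            # only present if CONFIG_IKCONFIG_PROC is enabled
ls /boot/config-$(uname -r)   # usually requires the host's /boot mounted into the pod
```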

gpu choice

I can't restrict access to a subset of the GPUs with the standard hooks.

export NVIDIA_VISIBLE_DEVICES=0
enroot start --env NVIDIA_AVAILABLE_DEVICES cuda-container

I took the hook provided by nvidia-container-runtime, but once inside the container, nvidia-smi still shows information for all GPUs.
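One thing worth checking first: the transcript above exports NVIDIA_VISIBLE_DEVICES but passes --env NVIDIA_AVAILABLE_DEVICES, and the NVIDIA hook only looks at NVIDIA_VISIBLE_DEVICES, so the variable that reaches the container is the misspelled one:

```
# Match the variable name the 98-nvidia.sh hook actually reads.
export NVIDIA_VISIBLE_DEVICES=0
enroot start --env NVIDIA_VISIBLE_DEVICES cuda-container
```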

enroot-mount reports a failure at a specific directory

enroot reports an enroot-mount failure when the cwd is a specific directory.

See the example below:
yingbai@lsf1x125[yingbai]:$ mycommand /bin/bash --> pwd is /opt/yingbai
...
final cli:

/usr/bin/enroot start --mount /scratch/lsf_addon/lcui/lsf1011:/scratch/lsf_addon/lcui/lsf1011 --mount /home/yingbai:/home/yingbai --mount /tmp:/tmp --mount /opt/yingbai:/opt/yingbai --rw -m /scratch/lsf_addon/lcui/:/scratch/lsf_addon/lcui/ --mount /scratch/lsf_addon/lcui/passwd:/etc/passwd -m /scratch/lsf_addon/lcui/group:/etc/group -c /tmp/.lsf1011.job.conf.933 ubuntu /home/yingbai/.lsbatch/1599375571.933
enroot-mount: failed to mount: /opt/yingbai at /home/yingbai/.local/share/enroot/ubuntu/opt/yingbai: No such file or directory

^^^^^^ fails! This does not always reproduce, and I cannot figure out why. Any input?

yingbai@lsf1x125[~]:$ mycommand /bin/bash. --> pwd is /home/yingbai

final cli:

/usr/bin/enroot start --mount /tmp:/tmp --mount /home/yingbai:/home/yingbai --mount /home/yingbai/:/home/yingbai/ --mount /scratch/lsf_addon/lcui/lsf1011:/scratch/lsf_addon/lcui/lsf1011 --rw -m /scratch/lsf_addon/lcui/:/scratch/lsf_addon/lcui/ --mount /scratch/lsf_addon/lcui/passwd:/etc/passwd -m /scratch/lsf_addon/lcui/group:/etc/group -c /tmp/.lsf1011.job.conf.934 ubuntu /home/yingbai/.lsbatch/1599375579.934
groups: cannot find name for group ID 100001
yingbai@lsf1x125:/$

--> success!!!

/dev/nvidia-modeset doesn't exist

enroot start --env DISPLAY --env NVIDIA_DRIVER_CAPABILITIES --mount /tmp/.X11-unix:/tmp/.X11-unix cuda
nvidia-container-cli: mount error: stat failed: /dev/nvidia-modeset: no such file or directory
[ERROR] /etc/enroot/hooks.d/98-nvidia.sh exited with return code 1

How do I install/create the file?

Some more info:

lsmod|grep nvidia
nvidia_uvm            794624  0
nvidia_drm             45056  0
nvidia_modeset       1048576  1 nvidia_drm
nvidia              17235968  518 nvidia_uvm,nv_peer_mem,nvidia_modeset
drm_kms_helper        172032  1 nvidia_drm
ipmi_msghandler        53248  4 ipmi_devintf,ipmi_si,nvidia,ipmi_ssif
drm                   401408  4 drm_kms_helper,nvidia_drm,ttm

Listing /dev:

ls -1 /dev/nvidia*
/dev/nvidia0
/dev/nvidia1
/dev/nvidia10
/dev/nvidia11
/dev/nvidia12
/dev/nvidia13
/dev/nvidia14
/dev/nvidia15
/dev/nvidia2
/dev/nvidia3
/dev/nvidia4
/dev/nvidia5
/dev/nvidia6
/dev/nvidia7
/dev/nvidia8
/dev/nvidia9
/dev/nvidiactl
/dev/nvidia-nvlink
/dev/nvidia-nvswitch0
/dev/nvidia-nvswitch1
/dev/nvidia-nvswitch10
/dev/nvidia-nvswitch11
/dev/nvidia-nvswitch2
/dev/nvidia-nvswitch3
/dev/nvidia-nvswitch4
/dev/nvidia-nvswitch5
/dev/nvidia-nvswitch6
/dev/nvidia-nvswitch7
/dev/nvidia-nvswitch8
/dev/nvidia-nvswitch9
/dev/nvidia-nvswitchctl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
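The nvidia_modeset module is loaded, but its device node was never created. A hedged fix (verify the major/minor numbers against your own system first) is to create the node as root, either via the NVIDIA helper or by hand:

```
# Either let the NVIDIA helper create it ...
nvidia-modprobe --modeset
# ... or create the node manually; nvidia-modeset conventionally uses
# char major 195, minor 254 (double-check on your system before running).
mknod -m 666 /dev/nvidia-modeset c 195 254
```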

enroot option to build sqsh image from Dockerfile

Hi,
Is it possible to add an option to enroot so it can accept a local Dockerfile as input, not only a docker image?
Maybe enroot could use https://buildah.io/ (an unprivileged OCI build tool) internally to provide this convenience?

It would really help automate cases (and provide the best out-of-the-box experience) where one needs to edit and tune a Dockerfile frequently and run it under enroot, without jumping through the hoops of building a docker image that gets converted to sqsh anyway.

Thanks a lot

Build container from recipe file

Is it me, or is there no mention of building containers from a recipe file? It seems that enroot fully depends on container registries to import and run something. Can't I build a container myself locally, like Singularity with the --fakeroot option, or like Podman? If that's incorrect, please highlight it in the usage documentation and in the main page quick-start overview. Thanks.
