Git Product home page Git Product logo

jupyterhub-deploy-docker's Introduction

jupyterhub-deploy-docker

jupyterhub-deploy-docker provides a reference deployment of JupyterHub, a multi-user Jupyter Notebook environment, on a single host using Docker.

Possible use cases include:

  • Creating a JupyterHub demo environment that you can spin up relatively quickly.
  • Providing a multi-user Jupyter Notebook environment for small classes, teams, or departments.

Disclaimer

This deployment is NOT intended for a production environment. It is a reference implementation that does not meet traditional requirements in terms of availability, scalability, or security.

If you are looking for a more robust solution to host JupyterHub, or you require scaling beyond a single host, please check out the excellent zero-to-jupyterhub-k8s project.

Technical Overview

Key components of this reference deployment are:

  • Host: Runs the JupyterHub components in a Docker container on the host.

  • Authenticator: Uses Native Authenticator to authenticate users. Any user will be allowed to sign-up

  • Spawner:Uses DockerSpawner to spawn single-user Jupyter Notebook servers in separate Docker containers on the same host.

  • Persistence of Hub data: Persists JupyterHub data in a Docker volume on the host.

  • Persistence of user notebook directories: Persists user notebook directories in Docker volumes on the host.

Prerequisites

Docker

This deployment uses Docker, via Docker Compose, for all the things.

  1. Use Docker's installation instructions to set up Docker for your environment.

Authenticator setup

This deployment uses JupyterHub Native Authenticator to authenticate users.

  1. An single admin user will be enabled be default. Any user will be allowed to signup.

Build the JupyterHub Docker image

  1. Use docker-compose to build the JupyterHub Docker image:

    docker-compose build

Customisation: Jupyter Notebook Image

You can configure JupyterHub to spawn Notebook servers from any Docker image, as long as the image's ENTRYPOINT and/or CMD starts a single-user instance of Jupyter Notebook server that is compatible with JupyterHub.

To specify which Notebook image to spawn for users, you set the value of the
DOCKER_NOTEBOOK_IMAGE environment variable to the desired container image.

Whether you build a custom Notebook image or pull an image from a public or private Docker registry, the image must reside on the host.

If the Notebook image does not exist on host, Docker will attempt to pull the image the first time a user attempts to start his or her server. In such cases, JupyterHub may timeout if the image being pulled is large, so it is better to pull the image to the host before running JupyterHub.

This deployment defaults to the jupyter/minimal-notebook Notebook image, which is built from the minimal-notebook Docker stacks.

You can pull the image using the following command:

docker pull jupyter/minimal-notebook:latest

Run JupyterHub

Run the JupyterHub container on the host.

To run the JupyterHub container in detached mode:

docker-compose up -d

Once the container is running, you should be able to access the JupyterHub console at

http://localhost:8000

To bring down the JupyterHub container:

docker-compose down

FAQ

How can I view the logs for JupyterHub or users' Notebook servers?

Use docker logs <container>. For example, to view the logs of the jupyterhub container

docker logs jupyterhub

How do I specify the Notebook server image to spawn for users?

In this deployment, JupyterHub uses DockerSpawner to spawn single-user Notebook servers. You set the desired Notebook server image in a DOCKER_NOTEBOOK_IMAGE environment variable.

JupyterHub reads the Notebook image name from jupyterhub_config.py, which reads the Notebook image name from the DOCKER_NOTEBOOK_IMAGE environment variable:

# DockerSpawner setting in jupyterhub_config.py
c.DockerSpawner.image = os.environ['DOCKER_NOTEBOOK_IMAGE']

If I change the name of the Notebook server image to spawn, do I need to restart JupyterHub?

Yes. JupyterHub reads its configuration which includes the container image name for DockerSpawner. JupyterHub uses this configuration to determine the Notebook server image to spawn during startup.

If you change DockerSpawner's name of the Docker image to spawn, you will need to restart the JupyterHub container for changes to occur.

In this reference deployment, cookies are persisted to a Docker volume on the Hub's host. Restarting JupyterHub might cause a temporary blip in user service as the JupyterHub container restarts. Users will not have to login again to their individual notebook servers. However, users may need to refresh their browser to re-establish connections to the running Notebook kernels.

How can I backup a user's notebook directory?

There are multiple ways to backup and restore data in Docker containers.

Suppose you have the following running containers:

    docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"

    CONTAINER ID        IMAGE                    NAMES
    bc02dd6bb91b        jupyter/minimal-notebook jupyter-jtyberg
    7b48a0b33389        jupyterhub               jupyterhub

In this deployment, the user's notebook directories (/home/jovyan/work) are backed by Docker volumes.

    docker inspect -f '{{ .Mounts }}' jupyter-jtyberg

    [{jtyberg /var/lib/docker/volumes/jtyberg/_data /home/jovyan/work local rw true rprivate}]

We can backup the user's notebook directory by running a separate container that mounts the user's volume and creates a tarball of the directory.

docker run --rm \
  -u root \
  -v /tmp:/backups \
  -v jtyberg:/notebooks \
  jupyter/minimal-notebook \
  tar cvf /backups/jtyberg-backup.tar /notebooks

The above command creates a tarball in the /tmp directory on the host.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.