jupyterhub / helm-chart Goto Github PK

A store of Helm chart tarballs for deploying JupyterHub and BinderHub on a Kubernetes cluster

Home Page: https://jupyterhub.github.io/helm-chart/

jupyterhub jupyter jupyter-notebook kubernetes helm helm-charts binderhub binder

helm-chart's Introduction

JupyterHub

With JupyterHub you can create a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.

Project Jupyter created JupyterHub to support many users. The Hub can offer notebook servers to a class of students, a corporate data science workgroup, a scientific research project, or a high-performance computing group.

Technical overview

Three main actors make up JupyterHub:

multi-user Hub (tornado process)
configurable http proxy (node-http-proxy)
multiple single-user Jupyter notebook servers (Python/Jupyter/tornado)

Basic principles for operation are:

Hub launches a proxy.
The Proxy forwards all requests to Hub by default.
Hub handles login and spawns single-user servers on demand.
Hub configures proxy to forward URL prefixes to the single-user notebook servers.

JupyterHub also provides a REST API for administration of the Hub and its users.

Installation

Check prerequisites

A Linux/Unix based system
Python 3.8 or greater
nodejs/npm
- If you are using conda, the nodejs and npm dependencies will be installed for you by conda.
- If you are using pip, install a recent version (at least 12.0) of nodejs/npm.
If using the default PAM Authenticator, a pluggable authentication module (PAM).
TLS certificate and key for HTTPS communication
Domain name

Install packages

Using `conda`

To install JupyterHub along with its dependencies including nodejs/npm:

conda install -c conda-forge jupyterhub

If you plan to run notebook servers locally, install JupyterLab or Jupyter notebook:

conda install jupyterlab
conda install notebook

Using `pip`

JupyterHub can be installed with pip, and the proxy with npm:

npm install -g configurable-http-proxy
python3 -m pip install jupyterhub

If you plan to run notebook servers locally, you will need to install JupyterLab or Jupyter notebook:

python3 -m pip install --upgrade jupyterlab
python3 -m pip install --upgrade notebook

Run the Hub server

To start the Hub server, run the command:

jupyterhub

Visit http://localhost:8000 in your browser, and sign in with your system username and password.

Note: To allow multiple users to sign in to the server, you will need to run the jupyterhub command as a privileged user, such as root. The wiki describes how to run the server as a less privileged user, which requires more configuration of the system.

Configuration

The Getting Started section of the documentation explains the common steps in setting up JupyterHub.

The JupyterHub tutorial provides an in-depth video and sample configurations of JupyterHub.

Create a configuration file

To generate a default config file with settings and descriptions:

jupyterhub --generate-config

Start the Hub

To start the Hub on a specific url and port 10.0.1.2:443 with https:

jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert

Authenticators

Authenticator	Description
PAMAuthenticator	Default, built-in authenticator
OAuthenticator	OAuth + JupyterHub Authenticator = OAuthenticator
ldapauthenticator	Simple LDAP Authenticator Plugin for JupyterHub
kerberosauthenticator	Kerberos Authenticator Plugin for JupyterHub

Spawners

Spawner	Description
LocalProcessSpawner	Default, built-in spawner starts single-user servers as local processes
dockerspawner	Spawn single-user servers in Docker containers
kubespawner	Kubernetes spawner for JupyterHub
sudospawner	Spawn single-user servers without being root
systemdspawner	Spawn single-user notebook servers using systemd
batchspawner	Designed for clusters using batch scheduling software
yarnspawner	Spawn single-user notebook servers distributed on a Hadoop cluster
wrapspawner	WrapSpawner and ProfilesSpawner enabling runtime configuration of spawners

Docker

A starter docker image for JupyterHub gives a baseline deployment of JupyterHub using Docker.

Important: This quay.io/jupyterhub/jupyterhub image contains only the Hub itself, with no configuration. In general, one needs to make a derivative image, with at least a jupyterhub_config.py setting up an Authenticator and/or a Spawner. To run the single-user servers, which may be on the same system as the Hub or not, Jupyter Notebook version 4 or greater must be installed.

The JupyterHub docker image can be started with the following command:

docker run -p 8000:8000 -d --name jupyterhub quay.io/jupyterhub/jupyterhub jupyterhub

This command will create a container named jupyterhub that you can stop and resume with docker stop/start.

The Hub service will be listening on all interfaces at port 8000, which makes this a good choice for testing JupyterHub on your desktop or laptop.

If you want to run docker on a computer that has a public IP then you should (as in MUST) secure it with ssl by adding ssl options to your docker configuration or by using an ssl enabled proxy.

Mounting volumes will allow you to store data outside the docker image (host system) so it will be persistent, even when you start a new image.

The command docker exec -it jupyterhub bash will spawn a root shell in your docker container. You can use the root shell to create system users in the container. These accounts will be used for authentication in JupyterHub's default configuration.

Contributing

If you would like to contribute to the project, please read our contributor documentation and the CONTRIBUTING.md. The CONTRIBUTING.md file explains how to set up a development installation, how to run the test suite, and how to contribute to documentation.

For a high-level view of the vision and next directions of the project, see the JupyterHub community roadmap.

A note about platform support

JupyterHub is supported on Linux/Unix based systems.

JupyterHub officially does not support Windows. You may be able to use JupyterHub on Windows if you use a Spawner and Authenticator that work on Windows, but the JupyterHub defaults will not. Bugs reported on Windows will not be accepted, and the test suite will not run on Windows. Small patches that fix minor Windows compatibility issues (such as basic installation) may be accepted, however. For Windows-based systems, we would recommend running JupyterHub in a docker container or Linux VM.

Additional Reference: Tornado's documentation on Windows platform support

License

We use a shared copyright model that enables all contributors to maintain the copyright on their contributions.

All code is licensed under the terms of the revised BSD license.

Help and resources

We encourage you to ask questions and share ideas on the Jupyter community forum. You can also talk with us on our JupyterHub Gitter channel.

JupyterHub follows the Jupyter Community Guides.

helm-chart's People

Contributors

Stargazers

Watchers

helm-chart's Issues

Github Authentication

I was trying to get the Github Authentication working. I saw that it wasn't part of the Helm chart and thought I'd try to add it.

You can see the changes here. I'm glad to issue as a pull request but it doesn't seem to be working.
master...Kuberlytics:githubauth

I get no error in the deployment and even after restarting the Hub pod it doesn't seem to use the Github authentication. Not sure if maybe I'm missing something.

auth:
type: github
github:
clientId:
clientSecret:
callbackUrl: http:///hub/oauth_callback

When I ssh into the hub it looks like everything was updated properly.

Persistent storage not working with kubespawner

If the persistent storage is enabled for the singleuser, pods stay in the "pending" state. The logs show following error:

5m 38s 22 default-scheduler Warning FailedScheduling [SchedulerPredicates failed due to persistentvolumeclaim "claim-yuvipanda" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaim "claim-yuvipanda" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaim "claim-yuvipanda" not found, which is unexpected.]

Interestingly, the pvc name in the error message doesn't match with the PVCs that are created in the system.

ubuntu@ip-10-0-0-12:~$ kubectl get pvc --namespace=indigo
NAME                STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
claim-test-1        Bound     pvc-4132c691-55f3-11e7-a0c9-027e69b560b0   10Gi       RWO           default        7m
claim-yuvipanda-2   Bound     pvc-81a58060-55f3-11e7-a0c9-027e69b560b0   10Gi       RWO           default        5m
pvc name has "-2". Not sure

However if we change the storage type to none in the config.yaml file, the pods do come up without any problems.

Here are some additional logs: https://gist.github.com/kamalhussain/fb0ec2e6a412e7ecdec50f21a29682a7

v0.2 release tracking

Add support for global password to dummyauthenticator
Upgrade version of kubespawner
Better defaults for memory limits / cpu limits
Move culler into the hub pod

This is good I think

Add support for Mediawiki Authenticator

For use in the Wikimedia installation :)

.

Fix already existing random job name.

Matthias Bussonnier @Carreau 09:52
Morning/Evening all, when running helm upgrade ... I get a
 Error: jobs.batch "pull-all-nodes-<mynamespace>-1" already exists
Any clue why, and what I can do ?

Matthias Bussonnier @Carreau 09:58
Response : kubectl delete jobs pull-all-nodes-<mynamespace>-1 --namespace=<mynamespace>
and job were list-able with kubectl get jobs --namespace=<mynamespace>

Yuvi Panda @yuvipanda 10:00
thanks @Carreau
that seems to happen when a helm install or upgrade is interrupted, or when you re-install into same namespace. should fix them both
can you open an issue in the helm-chart repo?

Matthias Bussonnier @Carreau 10:01
Yep. Seem to be the case when the docker image is huge. I got a timeout but the kubectl --namespace=<YOUR_NAMESPACE> get pod listed them as successful after a few minutes.

Add configuration to support `svc` annotations

I'd like to add configuration values to add arbitrary service annotations. In particular, using AWS ELB SSL termination a la https://gist.github.com/mgoodness/1a2926f3b02d8e8149c224d25cc57dc1

Validate install on 1.6.x version of GKE

The defaults are still 1.5.x, but let's verify this works with 1.6.x too

Allow customizing base images for hub, proxy and culler pods

Some installations (such as Wikimedia's) require that all container images derive from a pre-approved list. We can probably do this pretty easily, and we should!

Restart the hub whenever any hub specific config changes

Related to #9 - any other config change should automatically restart the hub as well. Should not restart the proxy, however!

CPU guarantee parsing error

I tried to set the CPU limits as specified in the docs, but it didn't start due to the following error:

    traitlets.traitlets.TraitError: The 'cpu_guarantee' trait of a KubeSpawner instance must be a float, but a value of '500m' <class 'str'> was specified.

Setup:

  cpu:
    limit: 500m
    guarantee: 500m

Versions:

jupyterhub-v0.4
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T08:56:23Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Add ability to pass a variables dictionary to the hub's configmap.

I have a modified hub image with some code changes in jupyter_config.py. I'd like to pass some external variables that are accessible inside of it, but I would like to avoid forking this helm chart.

I propose the following - can we modify templates/hub/configmap.yaml to include a new entry in the data dictionary, say hub.extra_parameters. A user would then be able to fill these extra parameters in their values.yaml file.

So the values.yaml might include:

hub:
  extraParameters:
    hash_salt: course24
    timeout: 1600

If there's another way to pass an external parameter into jupyterhub_config.py without modifying the Docker image every time this parameter changes, please let me now.

Happy to create a pull request for this too.

Document extending storage

Right now in the docs it's not intuitive how to:

Enable or disable persistent storage (even though it's mentioned in the docs just above here
Configure how much storage each user has

Add support for LDAP Authenticator

https://github.com/jupyterhub/ldapauthenticator

Looking up user details in jupyterhub can fail when container run a uid not in passwd file.

Under a host service such as OpenShift where the security model results in containers being forced to run as an effective random user ID (not in passwd file), the jupyterhub process will fail to start as it doesn't cope with there being no user in the passwd file.

Traceback (most recent call last):
  File "/usr/local/bin/jupyterhub", line 3, in <module>
    from jupyterhub.app import main
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/app.py", line 48, in <module>
    from .services.service import Service
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/services/service.py", line 107, in <module>
    class Service(LoggingConfigurable):
  File "/usr/local/lib/python3.5/dist-packages/jupyterhub/services/service.py", line 175, in Service
    user = Unicode(getuser(),
  File "/usr/lib/python3.5/getpass.py", line 170, in getuser
    return pwd.getpwuid(os.getuid())[0]
KeyError: 'getpwuid(): uid not found: 1000090000'

Options are to change jupyterhub to be tolerant of this and not fail, or to make /etc/passwd file writable in Dockerfile and then in startup wrapper script for jupyterhub, add an entry for an assigned UID if not existing in the passwd file. For example:

# Ensure that assigned uid has entry in /etc/passwd.

if [ `id -u` -ge 10000 ]; then
    cat /etc/passwd | sed -e "s/^$NB_USER:/builder:/" > /tmp/passwd
    echo "$NB_USER:x:`id -u`:`id -g`:,,,:/home/$NB_USER:/bin/bash" >> /tmp/passwd
    cat /tmp/passwd > /etc/passwd
    rm /tmp/passwd
fi

Collect logs from culling script.

I haven't found a way to collect logs from the cull-idle service without having to copy the hub docker image files and modify jupyterhub_config.py. It might be a good addition to add that to the config map. Here's how the code looks for enabling this without changing the config map.

if get_config('cull.enabled', False):
    cull_timeout = get_config('cull.timeout')
    cull_every = get_config('cull.every')
    c.JupyterHub.services = [
        {
            'name': 'cull-idle',
            'admin': True,
            'command': [
                '/usr/bin/python3',
                '/usr/local/bin/cull_idle_servers.py',
                '--timeout=%s' % cull_timeout,
                '--cull_every=%s' % cull_every,
                '--log_file_prefix=/srv/cull-idle-log.txt'   # <---- Added this line
            ]
        }
    ]

Schema for helm chart

We're developing a schema for the helm chart so that we can structure / describe / document possible fields.

@yuvipanda is taking a pass at adding the pieces, then @choldgraf will take a pass.

Could use this for docs: https://pypi.python.org/pypi/sphinx-jsonschema

Steps for deployment on OpenShift.

Following is steps required for deployment on OpenShift.

Create a new project.

oc new-project jupyter-helm

Initialise helm in the new project (don't use default of kube-system).

helm init --tiller-namespace jupyter-helm

This will deploy tiller-deploy.

Grant the default service account in the project cluster-reader access. This is necessary as helm wants to be able to read all namespaces for the cluster.

As a cluster admin run:

oc policy add-role-to-user cluster-reader -z default -n jupyter-helm

Grant the default service account in the project edit access. This is necessary as helm and JupyterHub need to create resources in the same project.

As a project or cluster admin run:

oc policy add-role-to-user edit -z default -n jupyter-helm

Allow images to be run as any user in project. This is necessary as the Jupyter Notebook images currently must be run as UID 1000 and cannot run as an assigned UID.

As a cluster admin run:

oc adm policy add-scc-to-user anyuid -z default -n jupyter-helm

Create a config.yaml containing:

hub:
  cookieSecret: "RANDOM_STRING_1"
  db:
    type: sqlite-memory
token:
  proxy: "RANDOM_STRING_2"
prePuller:
  enabled: false
singleuser:
  storage:
    type: none

Need to disable image pre-puller and use of persistent volumes. Need to work out what other roles need to be granted to allow those.

Deploy JupyterHub using helm.

helm install https://github.com/jupyterhub/helm-chart/releases/download/v0.3/jupyterhub-v0.3.tgz --name=hub --namespace=jupyter-helm --tiller-namespace=jupyter-helm -f config.yaml

Necessary to specify namespace for tiller-deploy and where JupyterHub is to be deployed.

Expose the JuptyterHub proxy via a route.

Create route.json containing:

{
    "apiVersion": "v1",
    "kind": "Route",
    "metadata": {
        "name": "proxy-public"
    },
    "spec": {
        "host": "",
        "to": {
            "kind": "Service",
            "name": "proxy-public",
            "weight": 100
        }
    }
}

Run:

oc create -f route.json

Note that you CANNOT run:

oc expose svc/proxy-public

as it is not setting up the route correctly.

Strangely, creating route from web console for the service does work though. Maybe a bug in oc expose.

Hub in State Crash for jupyterhub/k8s-hub:b2fb7a6 or jupyterhub/k8s-hub:b2fb7a6:627ff35

see below.

Merged with zero-to-jupyterhub

This repo's been merged with z2jh. There's still a lot over here that isn't merged

We still need to migrate helm-chart docs and all Issues here over to there. Do the low-level 'developing on the helm charts' docs belong in the z2jh rest docs, or a separate docs location in the same repo, since now there are two separate things to document?

Once issues have been migrated, we need to disable Issues and pull requests on this repo and update the README to point to the new home.

Deprecate token.proxy

Ancient config from before we separated things out into hub / proxy / culler, etc.

We haven't changed it yet because we didn't want to force a restart on the UCB clusters. Let's change it for the 0.3 release

Question: Why do you map jupyterhub config via configmap?

Hey,

mapping jupyterhub config via a configmap hides configuration details. Not sure what's the reason for that?

Better CI

We should do the following on CI:

Run minikube (with --provider=none)
Install helm (pin version?)
Install chart with various configs
Make sure that with dummy authenticator we can get a pod running.

Remove support for deprecated values

Currently the following things are deprecated (or soon will be!):

createNamespace (set to false by default, set to true in Berkeley deployments for b/c reasons)
name (used to create namespace, used only in Berkeley deployments)
token.proxy (#11)

Let's get rid of all of these in a release, and provide an 'upgrade' advisory.

Use secrets for cookieSecret + proxy tokens

Since these are in fact secrets!

After scaling cluster from 0 to x, external IP address times out

I was following the recommendations to scale the cluster to 0, the scale up on the day as per: https://zero-to-jupyterhub.readthedocs.io/en/latest/extending-jupyterhub.html#expanding-and-contracting-the-size-of-your-cluster

But when scaling back up, the main external IP address just times out.

I can't quite seem to find the problem. The LB is pointing to the service. The service is pointing to the proxy. The proxy is pointing to the Hub. It's almost like the hub is binding to the wrong IP address.

I'll add more information if I find it.

Recreate:

Standard instructions to create a cluster, add helm, install jupyter hub.

Then resize to zero nodes:

gcloud container clusters resize ${CLUSTER_NAME} --size 0 --zone ${ZONE}

Wait for that to finish. Then scale back up. I made the size different to the original size. Not sure if that made any difference.

gcloud container clusters resize ${CLUSTER_NAME} --size 5 --zone ${ZONE}

Workaround
Don't scale down your cluster to zero. Always make sure you have at least one node running.

Edit 1
This only occurs if you resize to a number of nodes that was NOT your original size.

In fact, I can fix a broken cluster by scaling from 5 (not original) to 3 (original).

Edit 2
Scaling from the original size, to another larger size works. This implies that there is some dependence on the hub running in the same place, or at least one of the original nodes. I'm not sure.

Support for setting additional env vars to helm chart.

@yuvipanda

As discussed, want to add support for setting private env vars GITHUB_API_TOKEN and ALLOWED_GITHUB_ACCOUNTS to helm chart. Thanks!

Error: PersistentVolumeClaim is not bound: "hub-db-dir"

Hi,

I'm trying to run this chart on GKE but the deployment keeps failing with PersistentVolumeClaim is not bound: "hub-db-dir".

Steps I took:

Created a container cluster (v1.7.0)
$ kubectl proxy
$ kubectl config current-context
Installed Helm (v2.5.0)
$ helm init --upgrade
Added a file config.yaml:

hub:
  cookieSecret: "xxxx"
proxy:
  secretToken: "xxxx"

$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm install jupyterhub/jupyterhub --version=v0.4 --name=jupyterhub-test --namespace=jupyterhub-test -f config.yaml

This resulted in the following output:

NAME:   xxxxxx
LAST DEPLOYED: Fri Jul 14 15:30:48 2017
NAMESPACE: xxxxxxx
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME          CLUSTER-IP     EXTERNAL-IP  PORT(S)       AGE
proxy-public  10.31.247.38   <pending>    80:32453/TCP  1s
proxy-api     10.31.241.160  <none>       8001/TCP      1s
hub           10.31.247.218  <none>       8081/TCP      1s

==> v1beta1/Deployment
NAME              DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
hub-deployment    1        1        1           0          1s
proxy-deployment  1        1        1           0          1s

==> v1/Secret
NAME        TYPE    DATA  AGE
hub-secret  Opaque  2     1s

==> v1/ConfigMap
NAME          DATA  AGE
hub-config-1  13    1s

==> v1/PersistentVolumeClaim
NAME        STATUS   VOLUME    CAPACITY  ACCESSMODES  STORAGECLASS  AGE
hub-db-dir  Pending  standard  1s


NOTES:
Thank you for installing JupyterHub!

Your release is named xxxxxxx and installed into the namespace xxxxxx.

You can find if the hub and proxy is ready by doing:

 kubectl --namespace=xxxxxx get pod

and watching for both those pods to be in status 'Ready'.

You can find the public IP of the JupyterHub by doing:

 kubectl --namespace=xxxxxx get svc proxy-public

It might take a few minutes for it to appear!

Note that this is still an alpha release! If you have questions, feel free to
  1. Come chat with us at https://gitter.im/jupyterhub/jupyterhub
  2. File issues at https://github.com/jupyterhub/helm-chart/issues

Output from kubectl's get pod and get svc:

$ kubectl --namespace=xxxxx get pod

NAME                              READY     STATUS    RESTARTS   AGE
hub-deployment-820122001-d236w    0/1       Pending   0          1m
proxy-deployment-51742714-tc73p   1/1       Running   0          1m

$ kubectl --namespace=xxxxx get svc

NAME           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
hub            10.31.247.218   <none>           8081/TCP       2m
proxy-api      10.31.241.160   <none>           8001/TCP       2m
proxy-public   10.31.247.38    104.199.48.144   80:32453/TCP   2m

(The external IP for the proxy-public service should be accessible in a minute or two.)

Nothing appears on the external IP ofcourse..
Any ideas how I can fix the PersistentVolumeClaim is not bound: "hub-db-dir" error?

Thanks!

Gijs

Getting the gitlab authenticator working against jupyterhub 0.8x

I use a self-hosted gitlab as an authenticator and wanted to get it working with the 0.5x branch of this repo, which is evidently compatible with v0.8x of the hub.

While various authenticators from oauthenticator are handled in here, gitlab isn't explicitly, so some of the following changes are necessary.

/images/hub/jupyterhub_config.py: added handling for gitlab. While the github authenticator uses c.JupyterHub.authenticator_class = 'oauthenticator.GitHubOAuthenticator', I found it was necessary to use 'oauthenticator.gitlab.GitLabOAuthenticator' for gitlab, otherwise it was complaining about not being able to import the class.

elif auth_type == 'gitlab':
    c.JupyterHub.authenticator_class = 'oauthenticator.gitlab.GitLabOAuthenticator'
    c.GitLabOAuthenticator.oauth_callback_url = get_config('auth.gitlab.callback-url')
    c.GitLabOAuthenticator.client_id = get_config('auth.gitlab.client-id')
    c.GitLabOAuthenticator.client_secret = get_config('auth.gitlab.client-secret')

/jupyterhub/templates/hub/configmap.yaml: added in handling for gitlab.

{{ if eq .Values.auth.type "gitlab" -}}
  auth.gitlab.client-id: {{.Values.auth.gitlab.clientId | quote}}
  auth.gitlab.client-secret: {{.Values.auth.gitlab.clientSecret | quote}}
  auth.gitlab.callback-url: {{.Values.auth.gitlab.callbackUrl | quote}}
  {{- end }}

In previous incarnations of running jupyterhub on kubernetes, it was necessary to pass environment variables such as GITLAB_CLIENT_ID, GITLAB_CLIENT_SECRET and OAUTH_CALLBACK_URL as key:value pairs in the deployment.yaml, which would then allow them to be surfaced as environment variables on hub and then accessed in the running jupyterhub_config.py. As I understand it, this has changed with all these settings getting automatically passed to a series of config files in /etc/jupyterhub/config and jupyterhub_config.py now accesses the values from there. The only exception is that GITLAB_HOST should still be passed as a key:value into the deployment.yaml, which I'm doing via a hub.extraEnv.

So my config.yaml looks something like this:

hub:
  cookieSecret: "xxx"
  image:
    name: my-private-registry/jupyterhub-k8s
    tag: v0.8.0b5_ns_gl
  extraEnv:
    GITLAB_HOST: "https://my-self-hosted-gitlab.com/"

proxy:
  secretToken: "xxxx"

singleuser:
  image:
    name: my-private-registry/k8s-singleuser-sample
    tag: v0.8.0b5

auth:
  type: gitlab
  gitlab:
    callbackUrl: "http://jupyterhub-ip.com/hub/oauth_callback"
    clientId: "xxxx"
    clientSecret: "xxxx"
#    gitlabHost: "https://my-self-hosted-gitlab.com/"
#    gitlabApiUrl: "https://my-self-hosted-gitlab.com/api/v3/"

admin:
  users:
    - testuser

On the hub image, I have installed the latest 0.7.0-dev version of oauthenticator alongside the latest 0.8.0b5 version of hub. It is also using a relatively recent version of kubespawner (@fa874329f).

With these various changes implemented ... while I'm being correctly directed to an orange gitlab login button, whatever inter-change is happening after that is failing and I'm not being presented with the GITLAB_HOST for authentication.

Is it obvious to any user with this working against v0.8x of the hub and using gitlab authentication if I'm missing something. I have the same set-up implemented that works fine against v.0.7.2 of the hub ... so somehow something has changed and I'm not handling it properly.

Thanks.

Public IP Address

Previously there was a configuration setting for setting a public IP address. Is that still available? Might be difficult to get it working with different cloud providers. However, previously it was working with gcloud.

deployment via OpenStack Magnum fails due to PVC failure

I tried out deploying the chart on CERN's OpenStack+ Magnum deployment and am seeing an issue with the PersistentVolumeClaim. The template seems to not get filled properly. on describe I get

$>  kubectl describe pvc hub-db-dir -n myjupyterhub
Name:           hub-db-dir
Namespace:      myjupyterhub
StorageClass:
Status:         Pending
Volume:
Labels:         <none>
Capacity:
Access Modes:
No events.

There is a PV available to which it could in principle bind. I tried manually with a PVC like this and that works fine.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: jupyterhub-manual
  namespace: myjupyterhub
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

any pointers what to change?

Issue with a "non-none" storage type on minikube

Getting an issue with the Helm chart on minikube when I don't specify a singleuse storage type of none:

singleuser:
  storage:
    type: none

How to reproduce:

Launch a new minikube cluster and use a Helm + JupyterHub configuration of this repo's minikube_config.yml without lines 7-8. If you start up a singleuser server (time is a bit sensitive since kubernetes will kill the pod after a number of failed restarts) and then rush to the command line, executing the following will get some similar error messages:

$ kubectl get pods
NAME                                READY     STATUS             RESTARTS   AGE
hub-deployment-2728205822-5qrqv     1/1       Running            0          37m
jupyter-dummy                       0/1       CrashLoopBackOff   1          7s
proxy-deployment-1227971824-cdg8z   1/1       Running            0          2h

$ kubectl describe pod jupyter-dummy > describe_pod_log.txt

$ kubectl logs jupyter-dummy 
/srv/venv/lib/python3.5/site-packages/IPython/paths.py:69: UserWarning: IPython parent '/home/jovyan' is not a writable location, using a temp directory.
  " using a temp directory.".format(parent))
Traceback (most recent call last):
  File "/srv/venv/bin/jupyterhub-singleuser", line 6, in <module>
    main()
  File "/srv/venv/lib/python3.5/site-packages/jupyterhub/singleuser.py", line 322, in main
    return SingleUserNotebookApp.launch_instance(argv)
  File "/srv/venv/lib/python3.5/site-packages/jupyter_core/application.py", line 267, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/srv/venv/lib/python3.5/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-7>", line 2, in initialize
  File "/srv/venv/lib/python3.5/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/srv/venv/lib/python3.5/site-packages/notebook/notebookapp.py", line 1290, in initialize
    super(NotebookApp, self).initialize(argv)
  File "<decorator-gen-6>", line 2, in initialize
  File "/srv/venv/lib/python3.5/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/srv/venv/lib/python3.5/site-packages/jupyter_core/application.py", line 243, in initialize
    self.migrate_config()
  File "/srv/venv/lib/python3.5/site-packages/jupyterhub/singleuser.py", line 239, in migrate_config
    super(SingleUserNotebookApp, self).migrate_config()
  File "/srv/venv/lib/python3.5/site-packages/jupyter_core/application.py", line 169, in migrate_config
    migrate()
  File "/srv/venv/lib/python3.5/site-packages/jupyter_core/migrate.py", line 240, in migrate
    ensure_dir_exists(env['jupyter_config'])
  File "/srv/venv/lib/python3.5/site-packages/ipython_genutils/path.py", line 167, in ensure_dir_exists
    os.makedirs(path, mode=mode)
  File "/usr/lib/python3.5/os.py", line 241, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/jovyan/.jupyter'

So the singleuser container is running into permissions issues writing to the /home/jovyan/ directory, which is mounted on the PersistentVolumeClaim claim-dummy, I think (see the attached file from above)? More information regarding the pvcs:

$ kubectl get pvc
NAME                      STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
claim-dummy               Bound     pvc-087fcf72-8a86-11e7-9cf9-080027a064f0   1Gi        RWO           standard       1h
hub-db-dir                Bound     pvc-5016b6c8-8a7b-11e7-9cf9-080027a064f0   1Gi        RWO           standard       3h
jupyterhub-minikube-pvc   Bound     pvc-b5108598-8a85-11e7-9cf9-080027a064f0   1Gi        RWX           standard       1h

$ kubectl describe pvc claim-dummy 
Name:		claim-dummy
Namespace:	default
StorageClass:	standard
Status:		Bound
Volume:		pvc-087fcf72-8a86-11e7-9cf9-080027a064f0
Labels:		app=jupyterhub
		heritage=jupyterhub
		hub.jupyter.org/username=dummy
Annotations:	control-plane.alpha.kubernetes.io/leader={"holderIdentity":"c627a2c9-8a7a-11e7-9cf9-080027a064f0","leaseDurationSeconds":15,"acquireTime":"2017-08-26T17:43:13Z","renewTime":"2017-08-26T17:43:15Z","lea...
		pv.kubernetes.io/bind-completed=yes
		pv.kubernetes.io/bound-by-controller=yes
		volume.beta.kubernetes.io/storage-provisioner=k8s.io/minikube-hostpath
Capacity:	1Gi
Access Modes:	RWO
Events:		<none>

$ kubectl describe pv pvc-087fcf72-8a86-11e7-9cf9-080027a064f0 
Name:		pvc-087fcf72-8a86-11e7-9cf9-080027a064f0
Labels:		<none>
Annotations:	hostPathProvisionerIdentity=c627a242-8a7a-11e7-9cf9-080027a064f0
		pv.kubernetes.io/provisioned-by=k8s.io/minikube-hostpath
		volume.beta.kubernetes.io/storage-class=standard
StorageClass:	standard
Status:		Bound
Claim:		default/claim-dummy
Reclaim Policy:	Delete
Access Modes:	RWO
Capacity:	1Gi
Message:	
Source:
    Type:	HostPath (bare host directory volume)
    Path:	/tmp/hostpath-provisioner/pvc-087fcf72-8a86-11e7-9cf9-080027a064f0
Events:		<none>

Allow alternate ways to specify credentials for kubernetes API access

Some installations (such as... Wikimedia's) don't have serviceaccounts. We should support an alternate (but still maintainable!) way of getting credentials in.

make chart fails with "No rule to make target chart"

Hello Jupyterhub team,
I have made some customisation to the default helm chart. The *Usage" section in the README states that I can package a custom chart using
make chart
however this throws a "No rule to make target 'chart' error.
Has this been depreated?

Pass options to singleuser:cmd:

Enable the command to be configurable so that people can pass in config options.

Specific case is to set HistoryManager.enable=False.

Stress test 5000 concurrent users

I want to be able to simulate 5000 concurrent users, doing some amount of notebook activity (cell execution) and getting results back in some reasonable time frame.

This will require that we test & tune:

JupyterHub
Proxy implementation
Spawner implementation
Kubernetes itself

Ideally, we want Kubernetes itself to be our bottleneck, and our components introduce no additional delays.

This issue will document and track efforts to achieve this, as well as to define what 'this' is.

Storage version mismatch in global template

storage.k8s.io/v1beta1 is specified while the expected API version (in 1.6.2) is storage.k8s.io/v1.

Add Support For NBGrader

Hello,

I’m trying to utilize NB grader with an instance of Jupyterhub on Kubernetes, created using Helm (https://github.com/jupyterhub/helm-chart).

I’ve rebuild the singleuser docker image to include the latest version of NB grader.

As far as I can tell, NB grader seems to be installed and functioning appropriately. However, nbgrader requires a directory to exchange files between users that is writeable by everyone but not readable, by default /srv/nbgrader/exchange.

Any idea how this might be possible? My understanding is that each individual is getting customized data storage linked to their container but that there is nothing which could currently be used to connect them.

Best,
Jason

license

This repo uses an Apache license, whereas all other Jupyter repos use BSD 3-Clause. It would be ideal if we could be consistent with the rest of the project.

Add support for manually setting up SSL certs

Let's encrypt support in the long run would be nice, but until then allow manually specifying SSL for the proxy.

Small error in /images/hub/Dockerfile

When the docker image is being built for the hub, the following error is raised.

Collecting git+https://github.com/jupyterhub/kubespawner@804947a
  Cloning https://github.com/jupyterhub/kubespawner (to 804947a) to /tmp/pip-oqdowm7a-build
  Could not find a tag or branch '804947a', assuming commit.

The problem is with line 37 in /images/hub/Dockerfile

RUN pip3 --no-cache-dir install git+https://github.com/jupyterhub/kubespawner@804947a

which can be corrected with:

RUN pip3 --no-cache-dir install https://github.com/jupyterhub/kubespawner/archive/v0.6.0.zip

Set up and use an nginx ingress by default

Even on GKE / Google, we should use an nginx ingress rather than the Google one, since we aren't moving as much traffic and the google one is pretty expensive.

We should stop relying on having a LB per service we want to expose, and set up an Ingress instead. This will also help us in the future when we want to get rid of the proxy and replace it directly with ingress services :)

One feature we do not want to lose is the ability of people to test this with just an IP, without needing a DNS entry. It's ok to require DNS based routing as soon as we have multiple things (such as jupyterhub and binder) but not otherwise.

Current idea is that we'll depend on the nginx-ingress and kube-lego charts, scoped to just the one namespace. These will have to be optional in some form, so non-cloud users with their own ingress controllers can skip these. Instead of setting up LoadBalancer type on Service, we'll just add an ingress object.

setting up singleserver image without tags

Auto- assume v0.2 which I think is wrong, it should likely error sooner (or assume latest ?)

Rename 'singleuser' to 'user'

A lot of people expressed confusion at the fact that all these properties called 'singleuser' exist, but JupyterHub is a multi user deployment. I think this warrants us renaming all the 'singleuser' properties to just 'user'.

Some guidance on the appropriate settings in getting helm-chart for jupyterhub 0.8x working

Posting as an issue on the advice of @willingc here.

I'm revisiting testing jupyterhub 0.8 in a kubernetes context, using the jupyterhub/helm-chart, to try to get multi-server per user working ... having initially hacked on this at jupytercon.

As I understand it, there have been some changes to how the proxy is set up in the context of jupyterhub v0.8 (as against v0.7.2), so perhaps someone could wade in here and help clear things up.

I did try to run the jupyterhub-0.8 feature branch in this repo, which doesn't use the configurable-http-proxy, but I couldn't get that working and it looks like it might be superseded by updates to master in any case.

From a brief discussion with @minrk at jupytercon, it seems there is a built-in proxy with jupyterhub, but since the set-up in this helm chart makes use of an external proxy, it's necessary to switch off the internal one using the c.ConfigurableHTTPProxy.should_start = False setting in the jupyterhub_config.py file that forms parts of the hub image. I have also played-around with setting c.ConfigurableHTTPProxy.api_url, which replaces the deprecated c.JupyterHub.proxy_api_ip and c.JupyterHub.proxy_api_port settings.

An excerpt from my jupyterhub_config.py looks like this:

# Requirements for multi-server per user
c.ConfigurableHTTPProxy.should_start = False
c.JupyterHub.allow_named_servers = True
c.ConfigurableHTTPProxy.api_url = 'http://' + os.environ['PROXY_API_SERVICE_HOST'] + ':' + os.environ['PROXY_API_SERVICE_PORT']
# Connect to a proxy running in a different pod
#c.JupyterHub.proxy_api_ip = os.environ['PROXY_API_SERVICE_HOST']
#c.JupyterHub.proxy_api_port = int(os.environ['PROXY_API_SERVICE_PORT'])

My understanding, as per the zero-to-hub docs, is that feeding a config.yaml file on starting the chart via helm, will over-ride any defaults in the values.yaml file, which in turn go to feed various parameter settings in the jupyterhub helm templates. Note that I'm installing the helm chart from a local directory (to allow me to make changes) rather than using the host chart as per the helm repo add jupyterhub https://jupyterhub.github.io/helm-chart step in the zero-to-hub docs.

My config.yaml is starting out very simple (see below) ... using a home-grown hub image with 0.8.0b4 installed ... and using the latest configurable-http-proxy (I also tried with v. 2.0.4). I'm using a singleuser image which I'm presuming also has the latest jupyterhub, but I didn't get far enough to test that. The hub image is built using this Dockerfile ... where I'm trying to use the latest kubespawner master. If I should be using a jupyterhub-onbuild as my base to which I then add kubespawner, then please advise.

hub:
  cookieSecret: "0ec46355dae0c7368dbaxxxxxxxxxxxxxxxx<and so on>"
  image:
    name: myregistry/jupyterhub-k8s
    tag: v0.8.0b4
proxy:
  secretToken: "041123858273121bd51594f4f4xxxxxxxxxxxxxxxx<and so on>"
  image:
    name: jupyterhub/configurable-http-proxy
    tag: latest
singleuser:
  image:
    name: jupyterhub/k8s-singleuser-sample
    tag: veadcd78

The upshot is that I'm still getting these errors in the proxy logs:

14:17:27.730 - debug: [ConfigProxy] PROXY WEB / to http://hub-deployment-1491691835-lw135:8081
14:17:27.749 - error: [ConfigProxy] 503 GET / Error: getaddrinfo ENOTFOUND hub-deployment-1491691835-lw135 hub-deployment-1491691835-lw135:8081
    at errnoException (dns.js:28:10)
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:76:26)

The hub-deployment-1491691835-lw135:8081 makes me think that its coming from the proxy start command in values.yaml ... ie. --default-target=http://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT) ... which would explain why it's pointing back at the name of the hub pod (although I'm not sure why it's not using the ip).

... and this in the hub logs.

[I 2017-09-07 14:15:34.109 JupyterHub app:1541] Not starting proxy
[I 2017-09-07 14:15:34.109 JupyterHub app:1547] Starting managed service cull-idle
[I 2017-09-07 14:15:34.110 JupyterHub service:266] Starting service 'cull-idle': ['/usr/bin/python3', '/usr/local/bin/cull_idle_servers.py', '--timeout=3600', '--cull_every=600']
[I 2017-09-07 14:15:34.119 JupyterHub service:109] Spawning /usr/bin/python3 /usr/local/bin/cull_idle_servers.py --timeout=3600 --cull_every=600
[W 2017-09-07 14:15:34.148 JupyterHub proxy:304] Adding missing default route
[I 2017-09-07 14:15:34.149 JupyterHub proxy:370] Adding default route for Hub: / => http://hub-deployment-1491691835-lw135:8081
[I 2017-09-07 14:15:34.159 JupyterHub app:1584] JupyterHub is now running at http://10.7.254.59:80/
[I 2017-09-07 14:15:34.321 JupyterHub log:122] 200 GET /hub/api/users ([email protected]) 6.12ms
[I 2017-09-07 14:16:10.651 JupyterHub log:122] 302 GET /503?url=%2F \u2192 /hub/503?url=%2F (@10.4.4.119) 3.23ms
[I 2017-09-07 14:16:13.288 JupyterHub log:122] 302 GET /503?url=%2F \u2192 /hub/503?url=%2F (@10.4.4.119) 0.63ms
[I 2017-09-07 14:17:27.752 JupyterHub log:122] 302 GET /503?url=%2F \u2192 /hub/503?url=%2F (@10.4.4.119) 0.64ms
[I 2017-09-07 14:25:34.418 JupyterHub log:122] 200 GET /hub/api/users ([email protected]) 5.26ms

The 503 errors above are when I'm trying to hit the endpoint of the proxi-api service end-point, to no avail.

Perhaps it has something to do with the default settings in the helm-chart here ... which are somehow over-riding what's getting set in the jupyterhub_config.py.

In there a 'target' is being set using the 'hub' ip ... whereas the c.ConfigurableHTTPProxy.api_url setting suggests pointing at the 'proxy-api' ip. Could someone confirm if I should try to remove this command for the proxy in the values.yaml in order for things to work against jupyterhub 0.8x ... and instead use c.ConfigurableHTTPProxy.command from the jupyterhub_config.py to control how the proxy is started.

I'm ultimately trying to test the multi-server per user feature of jupyterhub 0.8 in a kubernetes context ... so getting a stable functioning hub to begin with is the main objective of this issue.