example.pangeo.io-deploy's People

Contributors

dsludwig avatar nicwayand avatar rabernat avatar raphaeldussin avatar

Forkers

dsludwig

example.pangeo.io-deploy's Issues

What is the biggest pod we can fit on a single node

We now (#6) have the ability to select notebook pod resources.

[screenshot: profile options shown on the spawner page]

This is enabled by the following profile_list configuration:

      profile_list: |
        c.KubeSpawner.profile_list = [
          {
              'display_name': 'small (n1-highmem-2 | 2 cores, 12GB)',
              'kubespawner_override': {
                  'cpu_limit': 2,
                  'cpu_guarantee': 2,
                  'mem_limit': '12G',
                  'mem_guarantee': '12G',
              }
          },
          {
              'display_name': 'standard (n1-highmem-4 | 4 cores, 24GB)',
              'kubespawner_override': {
                  'cpu_limit': 4,
                  'cpu_guarantee': 4,
                  'mem_limit': '24G',
                  'mem_guarantee': '24G',
              }
          },
          {
              'display_name': 'large (n1-highmem-8 | 8 cores, 50GB)',
              'kubespawner_override': {
                  'cpu_limit': 8,
                  'cpu_guarantee': 8,
                  'mem_limit': '50G',
                  'mem_guarantee': '50G',                  
              }
          },
          {
              'display_name': 'x-large (n1-highmem-16 | 16 cores, 96GB RAM)',
              'kubespawner_override': {
                  'cpu_limit': 16,
                  'cpu_guarantee': 14,
                  'mem_limit': '100G',
                  'mem_guarantee': '96G',
              }
          }
        ]

However, the x-large profile won't spawn. It always gives these errors:

$ kubectl describe pod jupyter-rabernat -n staging
...
Events:
  Type     Reason             Age                From                Message
  ----     ------             ----               ----                -------
  Warning  FailedScheduling   21s (x7 over 52s)  default-scheduler   0/5 nodes are available: 4 Insufficient cpu, 5 Insufficient memory.
  Normal   NotTriggerScaleUp  13s (x3 over 43s)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added)

I am using an n1-highmem-16 node pool, which should have 16 cores and 104 GB of memory available. But Kubernetes won't schedule these pods there. Even after I lowered the CPU guarantee to 14 and the memory guarantee to 96G, it still won't launch.

How can we find out exactly how big Kubernetes "thinks" the node is?
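
`kubectl describe node <node-name>` prints both `Capacity` and `Allocatable` for a node. The scheduler compares pod requests against *allocatable*, which is capacity minus reservations for the kubelet and system daemons, so a guarantee equal to the full node size can never fit. A minimal sketch of that comparison (the reservation numbers below are illustrative, not GKE's actual values):

```python
# Sketch: check whether a pod's resource requests fit on a node, the way
# the scheduler does -- against "allocatable" (capacity minus kubelet and
# system-daemon reservations), not against raw capacity.
# Reservation values are illustrative, not GKE's real formula.

def fits(node_allocatable, pod_requests):
    """Return True if every requested resource fits in the allocatable amount."""
    return all(pod_requests[r] <= node_allocatable.get(r, 0) for r in pod_requests)

# Hypothetical n1-highmem-16: 16 cores / 104 GB capacity, but some of each
# is reserved, so allocatable is strictly smaller than capacity.
capacity = {"cpu": 16.0, "memory_gb": 104.0}
reserved = {"cpu": 0.5, "memory_gb": 9.0}   # illustrative reservations
allocatable = {r: capacity[r] - reserved[r] for r in capacity}

print(fits(allocatable, {"cpu": 16.0, "memory_gb": 96.0}))  # False: full-node CPU request
print(fits(allocatable, {"cpu": 14.0, "memory_gb": 96.0}))  # False: memory still too large
print(fits(allocatable, {"cpu": 14.0, "memory_gb": 90.0}))  # True: fits under allocatable
```

On GKE the memory reservation in particular can amount to several GB on a large node, which would explain why even the 96G guarantee was rejected.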

500 server error

I am now getting 500 server errors when trying to start a notebook. Here's how I debugged it.

# connect to cluster
$ gcloud container clusters get-credentials ocean-pangeo-io-cluster --zone us-central1-b --project pangeo-181919

# list pods
$ kubectl -n staging get pods
NAME                         READY     STATUS    RESTARTS   AGE
autohttps-6ffdcc94ff-pz77h   2/2       Running   0          5d
hub-7d9c6c8465-7lh8d         1/1       Running   0          14h
proxy-7b4fb46447-rgjms       1/1       Running   0          5d

# get logs from hub
$ kubectl -n staging logs hub-7d9c6c8465-7lh8d 
# skip to relevant part
[I 2018-11-27 13:17:49.311 JupyterHub log:158] 302 GET / -> /hub (@10.128.0.26) 0.78ms
[I 2018-11-27 13:17:49.352 JupyterHub log:158] 302 GET /hub -> /hub/ (@10.128.0.26) 0.69ms
[I 2018-11-27 13:17:49.434 JupyterHub log:158] 302 GET /hub/ -> /user/rabernat/ ([email protected]) 40.93ms
[I 2018-11-27 13:17:49.488 JupyterHub log:158] 302 GET /user/rabernat/ -> /hub/user/rabernat/ (@10.128.0.26) 0.84ms
[W 2018-11-27 13:17:49.580 JupyterHub base:714] User rabernat is slow to start (timeout=0)
[I 2018-11-27 13:17:49.582 JupyterHub base:1056] rabernat is pending spawn
[I 2018-11-27 13:17:49.584 JupyterHub log:158] 200 GET /hub/user/rabernat/ ([email protected]) 53.93ms
[I 2018-11-27 13:17:49.590 JupyterHub spawner:1671] PVC claim-rabernat already exists, so did not create new pvc.
[I 2018-11-27 13:18:13.545 JupyterHub proxy:301] Checking routes
[W 181127 13:18:13 cull_idle_servers:128] Not culling server rabernat with pending spawn
[I 2018-11-27 13:18:13.731 JupyterHub log:158] 200 GET /hub/api/users ([email protected]) 13.38ms
[I 2018-11-27 13:18:39.255 JupyterHub spawner:1770] Deleting pod jupyter-rabernat
[W 2018-11-27 13:18:49.773 JupyterHub user:504] rabernat's server never showed up at http://10.40.47.10:8888/user/rabernat/ after 30 seconds. Giving up
[W 2018-11-27 13:18:49.827 JupyterHub base:689] 4 consecutive spawns failed.  Hub will exit if failure count reaches 5 before succeeding
[E 2018-11-27 13:18:49.828 JupyterHub gen:974] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py:619> exception=TimeoutError("Server at http://10.40.47.10:8888/user/rabernat/ didn't respond in 30 seconds",)> after timeout
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 970, in error_callback
        future.result()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 626, in finish_user_spawn
        await spawn_future
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 522, in spawn
        raise e
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 496, in spawn
        resp = await server.wait_up(http=True, timeout=spawner.http_timeout)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 197, in wait_for_http_server
        timeout=timeout
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 155, in exponential_backoff
        raise TimeoutError(fail_message)
    TimeoutError: Server at http://10.40.47.10:8888/user/rabernat/ didn't respond in 30 seconds
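
The traceback shows the hub giving up after `http_timeout` (30 s) because the pod's server never answered HTTP. If the pod is merely slow (image pull, node scale-up), one possible mitigation is raising the spawner timeouts via the chart's `hub.extraConfig` hook; a sketch with illustrative values:

```yaml
hub:
  extraConfig:
    spawnTimeouts: |
      # give slow pods more time before the hub gives up (illustrative values)
      c.KubeSpawner.start_timeout = 600   # seconds to wait for the pod to start
      c.KubeSpawner.http_timeout = 120    # seconds to wait for HTTP readiness
```

If the server genuinely never comes up, longer timeouts won't help and the single-user pod's own logs are the next place to look.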

how to specify access?

Where is the configuration which says who has access to the cluster, who has admin privileges, etc?
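
In a zero-to-jupyterhub deployment of this vintage, access control typically lives in the chart's `auth` block of the values (or secret) file. A hedged sketch; the usernames are illustrative, and this repository's actual config files should be checked for the real location:

```yaml
# Sketch of a zero-to-jupyterhub auth block (usernames illustrative).
auth:
  admin:
    users:
      - rabernat
  whitelist:          # only these GitHub usernames may log in
    users:
      - rabernat
      - raphaeldussin
```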

wish list

I am hoping that ocean.pangeo.io can be the fanciest, most powerful Pangeo deployment yet. To achieve this, we will need to innovate a bit. In particular, I am hoping we can set up the following:

  • My pangeo_ecco_examples repo is a good starting point for the default image / environment
  • That binder contains a config that automatically populates an intake catalog.
  • Use google cloud firestore for shared user storage (has already been solved in pangeo-data/pangeo-cloud-federation#25, pangeo-data/pangeo-cloud-federation#28)
  • Have different options for notebook machine types, e.g. small (2 cores, 8 GB), medium (4 cores, 16 GB), large (16 cores, 64 GB), a GPU option, etc. This has been discussed in pangeo-data/pangeo#348 and should be straightforward with the profile_list option; see the kubespawner docs.
  • I would like to customize the look and feel of the login page so it is more "pangeo themed"
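
For the intake bullet, a catalog entry might look like the following sketch; the dataset name and bucket path are hypothetical, and the real catalog would be populated by the binder config:

```yaml
# Hypothetical intake catalog entry for a zarr store on GCS
# (uses the zarr driver from intake-xarray; the path is a placeholder).
sources:
  ecco_sea_surface_height:
    description: "Example ECCO field stored as zarr"
    driver: zarr
    args:
      urlpath: gs://example-bucket/ecco/ssh.zarr
```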

Using pangeo to produce and publish ocean dataset?

Hi @rabernat and @raphaeldussin ,

I am working on building a climatology of surface tracer gradients (temperature, salinity, buoyancy) estimated from large collections of shipboard thermosalinograph data. One objective of this dataset is to compare observational gradients at scales larger than 10 km with those from submesoscale-permitting ocean models. I am using the NCEI-TSG dataset, which I am converting into the zarr format. The dataset weighs only a few GB, but the computation can be a bit expensive.

I would like to upload this dataset to the cloud. I am also interested in producing the gradient climatology on the cloud, getting a DOI for it, and referencing Pangeo.

Do you think this would fall within the scope of Pangeo?
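
For concreteness, a minimal numpy sketch of the kind of surface-gradient computation described above; the data and grid spacing are synthetic stand-ins for the NCEI-TSG dataset, and a real pipeline would read the zarr store with xarray/dask:

```python
import numpy as np

# Sketch: gradient magnitude of a surface tracer on a regular grid.
# Synthetic SST stands in for the shipboard data; spacing is illustrative.
def gradient_magnitude(field, dy_m, dx_m):
    """|grad T| from centered differences; field is (ny, nx), spacing in metres."""
    dTdy, dTdx = np.gradient(field, dy_m, dx_m)
    return np.hypot(dTdx, dTdy)

dx = dy = 1000.0                        # illustrative 1 km grid spacing, metres
x = np.arange(5) * dx
sst = np.tile(0.002 * x, (4, 1))        # linear ramp: 0.002 K per metre in x
grad = gradient_magnitude(sst, dy, dx)
print(np.allclose(grad, 0.002))         # True: the uniform gradient is recovered
```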

Request access to ocean.pangeo

Hi @raphaeldussin,

I saw that pangeo.pydata.org is to be shut down soon.
Can I have access to ocean.pangeo.io? I would like to compute the monthly EKE from the AVISO data available on the cloud and, if possible, also export the data array.

Thanks in advance,
Marine
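
For reference, a hypothetical numpy sketch of the EKE computation mentioned above, taking EKE = ½⟨u′² + v′²⟩ over time; the data here are synthetic, and the real computation would use xarray against the AVISO store:

```python
import numpy as np

# Sketch: eddy kinetic energy from velocity anomalies,
# EKE = 0.5 * time-mean(u'^2 + v'^2). Synthetic data, not AVISO.
def eddy_kinetic_energy(u, v, axis=0):
    """u, v: arrays with a time axis; returns the time-mean EKE per grid point."""
    u_anom = u - u.mean(axis=axis, keepdims=True)
    v_anom = v - v.mean(axis=axis, keepdims=True)
    return 0.5 * (u_anom**2 + v_anom**2).mean(axis=axis)

rng = np.random.default_rng(0)
u = 0.1 + 0.05 * rng.standard_normal((24, 8, 8))  # 24 months, 8x8 grid, m/s
v = 0.05 * rng.standard_normal((24, 8, 8))
eke = eddy_kinetic_energy(u, v)
print(eke.shape)  # (8, 8): one time-mean EKE value per grid point
```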

"Build Recommended" message in jupyterlab

When I log in, I get the following message from JupyterLab.

Build Recommended
JupyterLab build is suggested:
@jupyterlab/hub-extension needs to be included in build
@pyviz/jupyterlab_pyviz needs to be included in build
jupyter-leaflet needs to be included in build
@jupyter-widgets/jupyterlab-manager needs to be included in build
dask-labextension needs to be included in build

I seem to recall similar issues being raised in other deployments, but I can't track down a specific one. It may be useful to coordinate with other cluster admins in pangeo-data/pangeo#476.
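
In JupyterLab before 3.0, source extensions had to be compiled into the application bundle, so the usual fix is to bake the build into the user image rather than letting each user rebuild at login. A sketch of a Dockerfile fragment for the user image (extension names taken from the message above; version pins omitted):

```dockerfile
# Install the listed source extensions and rebuild JupyterLab once,
# at image-build time (JupyterLab < 3; version pins omitted).
RUN jupyter labextension install --no-build \
        @jupyterlab/hub-extension \
        @pyviz/jupyterlab_pyviz \
        jupyter-leaflet \
        @jupyter-widgets/jupyterlab-manager \
        dask-labextension \
 && jupyter lab build
```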

[screenshot: JupyterLab "Build Recommended" dialog listing the extensions above]
