example.pangeo.io-deploy's People

Contributors

dsludwig avatar nicwayand avatar rabernat avatar raphaeldussin avatar

Forkers

dsludwig

example.pangeo.io-deploy's Issues

What is the biggest pod we can fit on a single node

We now (#6) have the ability to select notebook pod resources.

[screenshot: profile options shown on the spawner page]

This is enabled by the following profile_list configuration:

      profile_list: |
        c.KubeSpawner.profile_list = [
          {
              'display_name': 'small (n1-highmem-2 | 2 cores, 12GB)',
              'kubespawner_override': {
                  'cpu_limit': 2,
                  'cpu_guarantee': 2,
                  'mem_limit': '12G',
                  'mem_guarantee': '12G',
              }
          },
          {
              'display_name': 'standard (n1-highmem-4 | 4 cores, 24GB)',
              'kubespawner_override': {
                  'cpu_limit': 4,
                  'cpu_guarantee': 4,
                  'mem_limit': '24G',
                  'mem_guarantee': '24G',
              }
          },
          {
              'display_name': 'large (n1-highmem-8 | 8 cores, 50GB)',
              'kubespawner_override': {
                  'cpu_limit': 8,
                  'cpu_guarantee': 8,
                  'mem_limit': '50G',
                  'mem_guarantee': '50G',                  
              }
          },
          {
              'display_name': 'x-large (n1-highmem-16 | 16 cores, 96GB RAM)',
              'kubespawner_override': {
                  'cpu_limit': 16,
                  'cpu_guarantee': 14,
                  'mem_limit': '100G',
                  'mem_guarantee': '96G',
              }
          }
        ]

However, the x-large profile won't spawn. It always gives these errors:

$ kubectl describe pod jupyter-rabernat -n staging
...
Events:
  Type     Reason             Age                From                Message
  ----     ------             ----               ----                -------
  Warning  FailedScheduling   21s (x7 over 52s)  default-scheduler   0/5 nodes are available: 4 Insufficient cpu, 5 Insufficient memory.
  Normal   NotTriggerScaleUp  13s (x3 over 43s)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added)

I am using an n1-highmem-16 node pool, which should have 16 cores and 104 GB of memory available. But Kubernetes won't schedule these pods there. Even after I lowered the CPU guarantee to 14 and the memory guarantee to 96G, it still won't launch.

How can we find out exactly how big Kubernetes "thinks" the node is?
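
`kubectl describe node <node-name>` prints both `Capacity` and `Allocatable` for a node. The scheduler compares pod requests against *allocatable*, which is capacity minus reservations for the kubelet and system daemons, so a guarantee equal to the full node size can never fit. A minimal sketch of that comparison (the reservation numbers below are illustrative, not GKE's actual values):

```python
# Sketch: check whether a pod's resource requests fit on a node, the way
# the scheduler does -- against "allocatable" (capacity minus kubelet and
# system-daemon reservations), not against raw capacity.
# Reservation values are illustrative, not GKE's real formula.

def fits(node_allocatable, pod_requests):
    """Return True if every requested resource fits in the allocatable amount."""
    return all(pod_requests[r] <= node_allocatable.get(r, 0) for r in pod_requests)

# Hypothetical n1-highmem-16: 16 cores / 104 GB capacity, but some of each
# is reserved, so allocatable is strictly smaller than capacity.
capacity = {"cpu": 16.0, "memory_gb": 104.0}
reserved = {"cpu": 0.5, "memory_gb": 9.0}   # illustrative reservations
allocatable = {r: capacity[r] - reserved[r] for r in capacity}

print(fits(allocatable, {"cpu": 16.0, "memory_gb": 96.0}))  # False: full-node CPU request
print(fits(allocatable, {"cpu": 14.0, "memory_gb": 96.0}))  # False: memory still too large
print(fits(allocatable, {"cpu": 14.0, "memory_gb": 90.0}))  # True: fits under allocatable
```

On GKE the memory reservation in particular can amount to several GB on a large node, which would explain why even the 96G guarantee was rejected.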

500 server error

I am now getting 500 server errors when trying to start a notebook. Here's how I debugged it.

# connect to cluster
$ gcloud container clusters get-credentials ocean-pangeo-io-cluster --zone us-central1-b --project pangeo-181919

# list pods
$ kubectl -n staging get pods
NAME                         READY     STATUS    RESTARTS   AGE
autohttps-6ffdcc94ff-pz77h   2/2       Running   0          5d
hub-7d9c6c8465-7lh8d         1/1       Running   0          14h
proxy-7b4fb46447-rgjms       1/1       Running   0          5d

# get logs from hub
$ kubectl -n staging logs hub-7d9c6c8465-7lh8d 
# skip to relevant part
[I 2018-11-27 13:17:49.311 JupyterHub log:158] 302 GET / -> /hub (@10.128.0.26) 0.78ms
[I 2018-11-27 13:17:49.352 JupyterHub log:158] 302 GET /hub -> /hub/ (@10.128.0.26) 0.69ms
[I 2018-11-27 13:17:49.434 JupyterHub log:158] 302 GET /hub/ -> /user/rabernat/ ([email protected]) 40.93ms
[I 2018-11-27 13:17:49.488 JupyterHub log:158] 302 GET /user/rabernat/ -> /hub/user/rabernat/ (@10.128.0.26) 0.84ms
[W 2018-11-27 13:17:49.580 JupyterHub base:714] User rabernat is slow to start (timeout=0)
[I 2018-11-27 13:17:49.582 JupyterHub base:1056] rabernat is pending spawn
[I 2018-11-27 13:17:49.584 JupyterHub log:158] 200 GET /hub/user/rabernat/ ([email protected]) 53.93ms
[I 2018-11-27 13:17:49.590 JupyterHub spawner:1671] PVC claim-rabernat already exists, so did not create new pvc.
[I 2018-11-27 13:18:13.545 JupyterHub proxy:301] Checking routes
[W 181127 13:18:13 cull_idle_servers:128] Not culling server rabernat with pending spawn
[I 2018-11-27 13:18:13.731 JupyterHub log:158] 200 GET /hub/api/users ([email protected]) 13.38ms
[I 2018-11-27 13:18:39.255 JupyterHub spawner:1770] Deleting pod jupyter-rabernat
[W 2018-11-27 13:18:49.773 JupyterHub user:504] rabernat's server never showed up at http://10.40.47.10:8888/user/rabernat/ after 30 seconds. Giving up
[W 2018-11-27 13:18:49.827 JupyterHub base:689] 4 consecutive spawns failed.  Hub will exit if failure count reaches 5 before succeeding
[E 2018-11-27 13:18:49.828 JupyterHub gen:974] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py:619> exception=TimeoutError("Server at http://10.40.47.10:8888/user/rabernat/ didn't respond in 30 seconds",)> after timeout
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 970, in error_callback
        future.result()
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/handlers/base.py", line 626, in finish_user_spawn
        await spawn_future
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 522, in spawn
        raise e
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/user.py", line 496, in spawn
        resp = await server.wait_up(http=True, timeout=spawner.http_timeout)
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 197, in wait_for_http_server
        timeout=timeout
      File "/usr/local/lib/python3.6/dist-packages/jupyterhub/utils.py", line 155, in exponential_backoff
        raise TimeoutError(fail_message)
    TimeoutError: Server at http://10.40.47.10:8888/user/rabernat/ didn't respond in 30 seconds
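
The traceback shows the hub giving up after `http_timeout` (30 s) because the pod's server never answered HTTP. If the pod is merely slow (image pull, node scale-up), one possible mitigation is raising the spawner timeouts via the chart's `hub.extraConfig` hook; a sketch with illustrative values:

```yaml
hub:
  extraConfig:
    spawnTimeouts: |
      # give slow pods more time before the hub gives up (illustrative values)
      c.KubeSpawner.start_timeout = 600   # seconds to wait for the pod to start
      c.KubeSpawner.http_timeout = 120    # seconds to wait for HTTP readiness
```

If the server genuinely never comes up, longer timeouts won't help and the single-user pod's own logs are the next place to look.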

how to specify access?

Where is the configuration which says who has access to the cluster, who has admin privileges, etc?
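
In a zero-to-jupyterhub deployment of this vintage, access control typically lives in the chart's `auth` block of the values (or secret) file. A hedged sketch; the usernames are illustrative, and this repository's actual config files should be checked for the real location:

```yaml
# Sketch of a zero-to-jupyterhub auth block (usernames illustrative).
auth:
  admin:
    users:
      - rabernat
  whitelist:          # only these GitHub usernames may log in
    users:
      - rabernat
      - raphaeldussin
```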

wish list

I am hoping that ocean.pangeo.io can be the fanciest, most powerful Pangeo deployment yet. To achieve this, we will need to innovate a bit. In particular, I am hoping we can set up the following:

  • My pangeo_ecco_examples repo is a good starting point for the default image / environment
  • That binder contains a config that automatically populates an intake catalog.
  • Use google cloud firestore for shared user storage (has already been solved in pangeo-data/pangeo-cloud-federation#25, pangeo-data/pangeo-cloud-federation#28)
  • Have different options for notebook machine types, e.g. small (2 cores, 8 GB), medium (4 cores, 16 GB), large (16 cores, 64 GB), a GPU option, etc. This has been discussed in pangeo-data/pangeo#348 and should be straightforward with the profile_list option; see the kubespawner docs.
  • I would like to customize the look and feel of the login page so it is more "pangeo themed"
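
For the intake bullet, a catalog entry might look like the following sketch; the dataset name and bucket path are hypothetical, and the real catalog would be populated by the binder config:

```yaml
# Hypothetical intake catalog entry for a zarr store on GCS
# (uses the zarr driver from intake-xarray; the path is a placeholder).
sources:
  ecco_sea_surface_height:
    description: "Example ECCO field stored as zarr"
    driver: zarr
    args:
      urlpath: gs://example-bucket/ecco/ssh.zarr
```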

Using pangeo to produce and publish ocean dataset?

Hi @rabernat and @raphaeldussin ,

I am working on building a climatology of surface tracer gradients (temperature, salinity, buoyancy) estimated from large collections of shipboard thermosalinograph data. One objective of this dataset is to compare observational gradients at scales larger than 10 km with those from submesoscale-permitting ocean models. I am using the NCEI-TSG dataset, which I am converting into the zarr format. The dataset weighs only a few GB, but the computation can be a bit expensive.

I would like to upload this dataset to the cloud. I am also interested in producing the gradient climatology on the cloud, getting a DOI for it, and referencing Pangeo.

Do you think this would fall within the scope of Pangeo?
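
For concreteness, a minimal numpy sketch of the kind of surface-gradient computation described above; the data and grid spacing are synthetic stand-ins for the NCEI-TSG dataset, and a real pipeline would read the zarr store with xarray/dask:

```python
import numpy as np

# Sketch: gradient magnitude of a surface tracer on a regular grid.
# Synthetic SST stands in for the shipboard data; spacing is illustrative.
def gradient_magnitude(field, dy_m, dx_m):
    """|grad T| from centered differences; field is (ny, nx), spacing in metres."""
    dTdy, dTdx = np.gradient(field, dy_m, dx_m)
    return np.hypot(dTdx, dTdy)

dx = dy = 1000.0                        # illustrative 1 km grid spacing, metres
x = np.arange(5) * dx
sst = np.tile(0.002 * x, (4, 1))        # linear ramp: 0.002 K per metre in x
grad = gradient_magnitude(sst, dy, dx)
print(np.allclose(grad, 0.002))         # True: the uniform gradient is recovered
```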

Request access to ocean.pangeo

Hi @raphaeldussin,

I saw that pangeo.pydata.org is to be shut down soon.
Can I have access to ocean.pangeo.io? I would like to compute the monthly EKE from the AVISO data available on the cloud and, if possible, also export the data array.

Thanks in advance,
Marine
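
For reference, a hypothetical numpy sketch of the EKE computation mentioned above, taking EKE = ½⟨u′² + v′²⟩ over time; the data here are synthetic, and the real computation would use xarray against the AVISO store:

```python
import numpy as np

# Sketch: eddy kinetic energy from velocity anomalies,
# EKE = 0.5 * time-mean(u'^2 + v'^2). Synthetic data, not AVISO.
def eddy_kinetic_energy(u, v, axis=0):
    """u, v: arrays with a time axis; returns the time-mean EKE per grid point."""
    u_anom = u - u.mean(axis=axis, keepdims=True)
    v_anom = v - v.mean(axis=axis, keepdims=True)
    return 0.5 * (u_anom**2 + v_anom**2).mean(axis=axis)

rng = np.random.default_rng(0)
u = 0.1 + 0.05 * rng.standard_normal((24, 8, 8))  # 24 months, 8x8 grid, m/s
v = 0.05 * rng.standard_normal((24, 8, 8))
eke = eddy_kinetic_energy(u, v)
print(eke.shape)  # (8, 8): one time-mean EKE value per grid point
```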

"Build Recommended" message in jupyterlab

When I log in, I get the following message from JupyterLab.

Build Recommended
JupyterLab build is suggested:
@jupyterlab/hub-extension needs to be included in build
@pyviz/jupyterlab_pyviz needs to be included in build
jupyter-leaflet needs to be included in build
@jupyter-widgets/jupyterlab-manager needs to be included in build
dask-labextension needs to be included in build

I seem to recall similar issues being raised in other deployments, but I can't track down a specific one. It may be useful to coordinate with other cluster admins in pangeo-data/pangeo#476.
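
In JupyterLab before 3.0, source extensions had to be compiled into the application bundle, so the usual fix is to bake the build into the user image rather than letting each user rebuild at login. A sketch of a Dockerfile fragment for the user image (extension names taken from the message above; version pins omitted):

```dockerfile
# Install the listed source extensions and rebuild JupyterLab once,
# at image-build time (JupyterLab < 3; version pins omitted).
RUN jupyter labextension install --no-build \
        @jupyterlab/hub-extension \
        @pyviz/jupyterlab_pyviz \
        jupyter-leaflet \
        @jupyter-widgets/jupyterlab-manager \
        dask-labextension \
 && jupyter lab build
```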

[screenshot: JupyterLab "Build Recommended" dialog listing the extensions above]
