
Comments (13)

davclark commented on June 27, 2024

I'm putting together a tutorial right now - adapted from dask/dask-tutorial and focused more on the kind of use-case discussed here (for my purposes, running single-machine Dask via Gigantum on Digital Ocean). I will digest the contents of this issue there, but it seems this isn't quite documented anywhere. Is there a place that it should be documented?


jacobtomlinson commented on June 27, 2024

@mangecoeur that is correct.


girgink commented on June 27, 2024

A simple way to access the scheduler is to use the arbitrary host/port access feature of Jupyter ServerProxy. Just enter proxy/daskscheduler:8787 (without a forward slash at the beginning, i.e. a relative URL) instead of http://daskscheduler:8787 as the URL. This lets the Dask JupyterLab extension reach the scheduler behind the reverse proxy.
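A related sketch of the same idea from the Python side (my addition, not part of the comment above): the dashboard links that the distributed Client prints can also be routed through the proxy by setting dask's distributed.dashboard.link option. The scheduler host name and the JupyterHub prefix below are assumptions for illustration.

    import dask
    from dask.distributed import Client

    # Route dashboard links through jupyter-server-proxy. On JupyterHub deployments,
    # JUPYTERHUB_SERVICE_PREFIX is substituted from the environment when the link is built.
    dask.config.set({
        "distributed.dashboard.link": "{JUPYTERHUB_SERVICE_PREFIX}proxy/daskscheduler:8787/status"
    })

    client = Client("tcp://daskscheduler:8786")  # assumed scheduler address
    print(client.dashboard_link)  # now points at the proxied route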


ian-r-rose commented on June 27, 2024

Hi @consideRatio (and @stsievert), sorry about the extremely late response. As far as I can tell, you are unable to access your remote dashboard due to CORS and/or mixed content errors. These security policies are there for a good reason, especially in an application like JupyterLab: if untrusted content gets access to the page it could trigger arbitrary code execution.

Our approach to this has been the cluster manager, which starts clusters on the server side and enforces the dashboard URLs (rather than allowing the user to put an arbitrary URL in the iframes). That being said, there are many different types of deployments, so we are still working out good configuration examples for these different contexts.

So to fix your current issue, I see a couple of options:

  1. Use the Cluster Manager (a configuration sketch follows this list). We'd be happy to give advice, and if you figure it out for your setup, we would love to include it as an example in the docs.
  2. Disable your browser CORS/mixed-content security policies (not recommended).
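For reference, a hedged sketch of pointing the cluster manager at a particular cluster class through dask's configuration system; the choice of dask_kubernetes.KubeCluster and the config file path are assumptions for illustration, not something prescribed in this thread.

    import os
    import yaml  # installed alongside dask

    # Tell the dask-labextension server extension which cluster class to launch
    # when a new cluster is requested from the sidebar.
    config = {
        "labextension": {
            "factory": {
                "module": "dask_kubernetes",
                "class": "KubeCluster",
                "args": [],
                "kwargs": {},
            }
        }
    }

    path = os.path.expanduser("~/.config/dask/labextension.yaml")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        yaml.safe_dump(config, f)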


mangecoeur commented on June 27, 2024

Similar issue: as far as I can tell it's not possible to use the extension to connect to a cluster that's started from within a notebook or terminal. From my reading of the nbserverproxy docs that is by design, am I right? The proxy system can only proxy processes that it starts itself?


consideRatio commented on June 27, 2024

I just learned about https://jupyter-server-proxy.readthedocs.io/en/latest/index.html - the new incarnation of nbserverproxy, as I understand it.

There is a reference in jupyter-server-proxy's docs:

[screenshot: the relevant passage from the jupyter-server-proxy docs]

I could not tell whether this reference means that dask-labextension makes use of such functionality, that it depends on jupyter-server-proxy specifically, or that it depends on nbserverproxy.

/cc: @yuvipanda @ryanlovett


consideRatio commented on June 27, 2024

Update

@ian-r-rose helped me install the extension properly in #40 (:heart:), and I have now tested the merged PRs #31 and #34 (at least I think so; I have version 0.3.1 of the PyPI package installed). I was able to press "New" to create local clusters, connect to them, and so on. All of this while still acting on a remote Jupyter server provided to me by a zero-to-jupyterhub-k8s deployment made available at https://jupyter.allting.com/user/erik.sundell.

But I still run into an issue when I try to connect to a scheduler that I have deployed on the same Kubernetes cluster. The pods can access it over the cluster network through a Kubernetes Service, but my browser cannot. It seems like the browser is making the requests to the provided URL directly rather than delegating them to the Jupyter notebook server acting as a proxy. This is what I've found so far.

I can utilize the remote scheduler+workers like this

[screenshot: a Client in the notebook connected to the remote scheduler and its workers]
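For context, connecting from a pod typically looks something like the sketch below; the Service name and namespace are my guesses for illustration, not taken from the screenshot.

    from dask.distributed import Client

    # Pods reach the scheduler through the in-cluster Service DNS name on the
    # scheduler port (8786); the dashboard is served separately on port 8787.
    client = Client("tcp://dask-scheduler.dask-namespace.svc.cluster.local:8786")
    print(client)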

I can also curl the dashboard

[screenshot: curl output from the scheduler's dashboard endpoint]
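The same check from Python, for reference (same assumed Service name; note the dashboard listens on 8787 rather than the scheduler port):

    import requests

    # A 200 here confirms that pods can reach the Bokeh dashboard, even though
    # the browser cannot.
    resp = requests.get("http://dask-scheduler.dask-namespace.svc.cluster.local:8787/status", timeout=5)
    print(resp.status_code)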

But I fail to utilize it

[screenshots: the extension rejects the remote dashboard URL]


When utilizing a local cluster...

A field is populated with dask/dashboard/7708ccdb-ce0b-4f74-b2db-970a07e94b24
[screenshot: the dashboard URL field populated for the local cluster]

Question

Is there some way I can reference the remote cluster?

Investigation

I looked into the scheduler logs...

distributed.scheduler - INFO - Starting worker compute stream, tcp://10.0.2.6:44307
distributed.scheduler - INFO - Register tcp://10.0.2.5:32781
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.0.2.4:35733
distributed.scheduler - INFO - Starting worker compute stream, tcp://10.0.2.5:32781
distributed.scheduler - INFO - Receive client connection: Client-97858002-1098-11e9-80f3-0a580a00013d
distributed.scheduler - INFO - Receive client connection: Client-01d7bab4-1099-11e9-80f3-0a580a00013d
distributed.scheduler - INFO - Receive client connection: Client-e4c69902-109d-11e9-80f3-0a580a00013d
distributed.scheduler - INFO - Receive client connection: Client-f0702c94-109d-11e9-80f3-0a580a00013d

I learned that I can find the "client id" like this as well:
[screenshot: finding the client id from the notebook]

But using that id, the same way I was able to use the id of the cluster created locally in the same pod, failed with the same "does not appear to be a valid bla bla" error...

I tested requesting a response from the locally created dask cluster by accessing:
https://jupyter.allting.com/user/erik.sundell/dask/dashboard/7708ccdb-ce0b-4f74-b2db-970a07e94b24

It worked fine and I got:

{"Individual Task Stream": "/individual-task-stream", "Individual Progress": "/individual-progress", "Individual Graph": "/individual-graph", "Individual Profile": "/individual-profile", "Individual Profile Server": "/individual-profile-server", "Individual Nbytes": "/individual-nbytes", "Individual Nprocessing": "/individual-nprocessing", "Individual Workers": "/individual-workers"}

Accessing the remote cluster's client id failed though:

https://jupyter.allting.com/user/erik.sundell/dask/dashboard/f0702c94-109d-11e9-80f3-0a580a00013d

But I learned that the local Dask cluster's client.id was different from the one I use to connect with anyway. That is probably because the client.id identifies just one of possibly many clients connected to a cluster, so the connection string I see for the cluster must refer to something other than the client.id of a client that connects to it at some later point...
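For what it's worth, the ids that appear in dask/dashboard/<id> should be the ones assigned by the extension's own cluster manager rather than distributed's client ids. A hedged sketch for listing them follows; the /dask/clusters endpoint is what the sidebar uses, but the exact path and the token handling below are assumptions on my part.

    import os
    import requests

    # Ask the dask-labextension server extension which clusters it manages; the
    # "id" field of each entry should be what dask/dashboard/<id> expects (assumption).
    base = "https://jupyter.allting.com/user/erik.sundell"
    token = os.environ.get("JUPYTERHUB_API_TOKEN", "<jupyter-token>")  # auth handling is an assumption
    resp = requests.get(f"{base}/dask/clusters", headers={"Authorization": f"token {token}"})
    print(resp.json())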


stsievert commented on June 27, 2024

I'd like to see this issue resolved too because I typically start JupyterLab on a remote server. It'd be really nice to access the dashboard through JupyterLab and not have to set up dashboard access separately (which I currently do through port-forwarding).


dvweersel commented on June 27, 2024

I think my issue is related. I can successfully start a LocalCluster on a remote server using the extension, but the visualisation tabs are blank when opened. I'm running JupyterHub behind nginx.

[screenshot: dask error]

I don't mean to hijack this issue, so if this is unrelated I can make a separate post. If not, I'd happily accept any configuration suggestions to make this work.


jsignell commented on June 27, 2024

Currently, the only dask-labextension docs are the readme in this repository. If you are willing to write a troubleshooting section, I think that would be helpful.


rafdesouza commented on June 27, 2024

Is there any solution for this? I would like to run the Dask JupyterLab extension on my Azure VM.


jsignell commented on June 27, 2024

@georaf I think the steps outlined in #41 (comment) should work. Are you getting errors? The last suggestion was that it might be helpful to add some docs around this, so if that sounds interesting to you, a pull request would be welcome.


davclark commented on June 27, 2024

This is a little Gigantum-centric, but the approach should be pretty clear. You can just launch the related Project in Gigantum Hub and look at the Environment tab to see how we set it up (links are in the post):

https://blog.gigantum.com/scaling-on-the-cheap-with-dask-gigantum-and-digitalocean

I somehow missed these comments for a LONG time - sorry about that. It's not exactly troubleshooting - more showing an approach that works. If this seems helpful, I'm happy to digest the relevant bits into a README section. I'm also happy to simply link to the blog post.

