Comments (12)
Oh, and if anyone has a better name, that would be welcome. I've copied over what was in the pangeo repository.
from dask-jobqueue.
Thanks @mrocklin for doing this and everyone involved in writing and sharing the code. :)
Had been thinking recently that it would be nice if there was something to construct shim shell scripts for getting dask-worker jobs launched on schedulers and connected back to a Distributed Client. Was about to raise an issue over at Distributed, where it may or may not have been appropriate. Happy to see this emerge around the same time. :)
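To make the idea of a generated shim script concrete, here is a minimal sketch of how such a job script could be rendered for PBS. The helper name `render_worker_script`, its parameters, and the example scheduler address are assumptions made for illustration and are not the dask-jobqueue API; only the `#PBS` directives and the `dask-worker` flags are standard.

```python
import shlex


def render_worker_script(scheduler_address, name="dask-worker", queue=None,
                         walltime="01:00:00", nthreads=1, memory_limit="4GB"):
    """Return the text of a PBS job script that starts one dask-worker.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    lines = [
        "#!/usr/bin/env bash",
        f"#PBS -N {name}",                # job name shown by qstat
        f"#PBS -l walltime={walltime}",   # maximum run time of the job
    ]
    if queue is not None:
        lines.append(f"#PBS -q {queue}")  # target queue, if one is given
    command = [
        "dask-worker", scheduler_address,
        "--nthreads", str(nthreads),
        "--memory-limit", memory_limit,
    ]
    lines.append("")
    # Quote each argument so the shell passes it through unchanged.
    lines.append(" ".join(shlex.quote(part) for part in command))
    return "\n".join(lines) + "\n"


# Example address only; a real scheduler address comes from the running scheduler.
script = render_worker_script("tcp://10.0.0.1:8786", queue="batch")
print(script)
```

The resulting text could then be written to a file and handed to `qsub`; how the worker count, resources, and scheduler address are chosen is left open here.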
I don't have any problem using the dask license or losing my commit history, remarkable as it was.
Hi everyone,
I've followed Dask and Pangeo for a few months now, and I was wondering where to start collaborating with you. I feel that this may be the right place for me, as it is the closest to my work environment (Dask on a PBS cluster).
I will follow the issues opened here, but do not hesitate to cc me if you need something tested on an alternative environment, and tell me if you can think of a simple task where you need some manpower.
Cheers!
Thanks @guillaumeeb ! I see three main directions of work short term:
- Build CI systems. This requires some understanding of docker and docker-compose. See #2
- Add new implementations for new job schedulers. Given that you are most familiar with PBS and that a PBS implementation already exists, this might not be a good fit.
- Proceed through the actions of a beginning user and try to use PBSCluster on your cluster. Report back any confusion or bugs that you encounter. This code was developed on a couple of PBS instances, but I suspect that it is not yet completely general and will fail in interesting ways on other PBS deployments.
I don't have any concerns with license or history - thanks for taking this.
To finish up item 2, I have put up a PR (#9), which just copies Dask's License file and sets the copyright year to 2018. Sounds like we are in agreement on that. It also includes some changes to package the license and some miscellaneous packaging things. I requested reviews from the original code owners. Hope that is ok. :)
@mrocklin
I can begin with point 3. Do you plan to add some beginner documentation on how to set up the module, or even more, on how to set up a dask Python environment with job queue, so that users have the prerequisites to run the example in the Readme (I could write it, or can we point to some equivalent documentation)? Should we add some cluster scheduler specific documentation on how to check that the Dask cluster has started (e.g. using qstat for PBS)?
Point 3 will also give me some insight to help with issue #7.
I will then be happy to help on point 1, so issue #2, by beginning to set up a PBS docker environment, if no one else has the bandwidth to do it! Hope I can find the time.
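As an illustration of the qstat check mentioned above, here is a minimal sketch that lists the current user's PBS jobs from Python. The helper name `pbs_jobs_for_user` is hypothetical; the sketch assumes only that a `qstat` executable supporting `-u USER` may be on the PATH, and it returns the raw output rather than guessing at its layout.

```python
import getpass
import shutil
import subprocess


def pbs_jobs_for_user(user=None):
    """Return the raw `qstat -u USER` output, or None if qstat is unavailable.

    Hypothetical helper for illustration; not part of dask-jobqueue.
    """
    if shutil.which("qstat") is None:
        return None  # probably not on a PBS login node
    user = user or getpass.getuser()
    result = subprocess.run(
        ["qstat", "-u", user],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


output = pbs_jobs_for_user()
if output is None:
    print("qstat not found on PATH")
else:
    print(output)
```

Worker jobs submitted for the cluster should show up in this listing once the batch system has accepted them.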
I would install dask: http://dask.pydata.org/en/latest/install.html
I would then pip install from this repository:
pip install git+https://github.com/dask/dask-jobqueue
We don't have anything on PyPI or conda-forge yet. Having basic documentation on RTD would be nice in the future.
> Should we add some cluster scheduler specific documentation on how to check that the Dask cluster has started
Sure?
Okay, I had never installed a Python module from git before and had no idea it was so simple ^^'.
For the documentation, shall we initialize a docs folder with basic files for Sphinx and Read the Docs integration?
Sorry for the basic questions I am (and will be) asking; I am a beginner in open source development.
So here is my first report on trying PBSCluster in our environment:
- It works 👍, and it is really easy to use. Really good work, thanks!
- I am concerned that there is no option for configuring the `--local-directory` of the dask-worker. We could use the `extra` configuration option, but I would prefer a named option, which ties into the discussion in #7. In particular, our PBS configuration sets a $TMPDIR environment variable that should be used.
- I could not find a simple `cluster.stop()` method. I tried to use `cluster.scheduler.close()` and take advantage of the death-timeout, but apparently it did not work. I think it would be good to have such a `cluster.stop()` method that stops all PBS jobs and the scheduler process.
- One minor point (and that's probably because I did not search long enough): I could not find a method to get the info about both the scheduler and bokeh bindings (ip:port), like what can be found in a written scheduler.json file.
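Two of the points above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names are hypothetical, the sketch assumes the `extra` option accepts a string of additional `dask-worker` arguments, and it assumes the job IDs of the submitted workers are known to the caller. `$TMPDIR` is deliberately left unexpanded so that the job's shell resolves it on the compute node rather than on the submitting machine.

```python
def local_directory_extra(variable="TMPDIR"):
    """Build a `--local-directory` argument that defers to an env var.

    The variable is kept as a literal `$NAME` so it is expanded by the
    shell running the job script, not by this Python process.
    Hypothetical helper for illustration.
    """
    return f"--local-directory ${variable}"


def qdel_command(job_ids):
    """Build the `qdel` command that would cancel the given PBS jobs.

    Hypothetical helper: it only builds the command and does not run it.
    """
    return ["qdel", *job_ids]


extra = local_directory_extra()
print(extra)
# Example job IDs only; real IDs are printed by qsub at submission time.
print(qdel_command(["1234.pbs", "1235.pbs"]))
```

A `cluster.stop()` along these lines would still need to close the scheduler process as well; the sketch covers only the batch-job side.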
I'm going to close this. There is a smattering of comments that we may want to open specific issues for, but this issue isn't the right place to work on those.