
Comments (9)

lesteve commented on May 28, 2024

Also a bit tangential but I have to say that when reading the dask-worker --help about --memory-limit, I would have guessed that --memory-limit is about the dask-worker memory limit and not a per-process memory limit.

from dask-jobqueue.

guillaumeeb commented on May 28, 2024

I share your concern: I also feel that keywords are not really intuitive at first.

However, I first felt this when starting a Dask cluster without dask-jobqueue, using custom scripts that launch the dask-worker command, particularly with the options you mention:

  --nthreads INTEGER            Number of threads per process.
  --nprocs INTEGER              Number of worker processes.  Defaults to one.
  --memory-limit TEXT           Bytes of memory that the worker can use. This
                                can be an integer (bytes), float (fraction of
                                total system memory), string (like 5GB or
                                5000M), 'auto', or zero for no memory
                                management

--memory-limit is not completely clear here either...

For dask-jobqueue, I think I would rather stick to the dask-worker command options, but maybe there is something to do here?
Another point: I believe that @jhamman, for example, does some "over subscription" of cores, declaring something like 16 processes of 4 threads on a 32-core node. But that should be doable with what you propose.

Another solution may be to have several ways of declaring resources, e.g. both

cluster = FooCluster(threads=4, processes=8, memory="16GB")

and something like

cluster = FooCluster(cores=32, processes=8, overall_memory="128GB")

could be used? No breaking change.
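To make the equivalence between the two declarations concrete, here is a minimal sketch of how job-wide totals could be split into per-process values. The helper name `per_process_resources` is hypothetical, as is the choice to reject core counts that do not divide evenly; neither is part of dask-jobqueue.

```python
def per_process_resources(cores, processes, overall_memory_gb):
    """Split job-wide totals evenly across worker processes.

    Hypothetical helper for illustration only; not a dask-jobqueue API.
    """
    if processes < 1:
        raise ValueError("processes must be at least 1")
    # Assumption: require an even split so that no core is left unused.
    if cores % processes != 0:
        raise ValueError("cores must be divisible by processes")
    threads = cores // processes
    memory_gb = overall_memory_gb / processes
    return threads, memory_gb


# cores=32, processes=8, overall_memory="128GB" from the second example
threads, memory_gb = per_process_resources(cores=32, processes=8, overall_memory_gb=128)
print(threads, memory_gb)  # 4 16.0
```

Under these assumptions the second declaration reduces to the first one: 4 threads and 16 GB per process.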


mrocklin commented on May 28, 2024

I could get behind the cores= and overall_memory= keyword alternatives. Any thoughts on this @jhamman @lesteve ?


jhamman commented on May 28, 2024

I don't have particularly strong feelings here. If others think alternative keywords are more descriptive and user friendly, let's move that way. So long as we maintain a reasonable amount of transparency and flexibility.


lesteve commented on May 28, 2024

I have to admit I was not particularly aware of these subtleties.

As a naive cluster and dask-jobqueue user, I'd like to easily specify the memory per "scheduler job", because that may determine which queue I will end up in.

For "cores vs threads" I don't have a strong opinion.

Maybe @jakirkham has some opinion based on dask-drmaa?


mrocklin commented on May 28, 2024

See also https://gitter.im/dask/dask?at=5b05f70296af8f1186c5aa70


mrocklin commented on May 28, 2024

OK, I'm going to start work on a PR to add two new configuration options:

  • cores: the total number of cores that we should endeavor to use

    I hope that cores + processes is a better solution than threads + processes. Hopefully it is easier for people to manipulate.

  • total_memory: the total amount of memory on a node to use
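As a rough sketch of how these two options might map onto the dask-worker flags quoted earlier in the thread: the function below is hypothetical and does not describe what the PR actually implements. It assumes that --memory-limit applies per process, that memory is given as an integer number of bytes, and that the thread count is rounded down to at least one. The scheduler address that dask-worker normally takes is left out.

```python
def dask_worker_args(cores, processes, total_memory_bytes):
    """Translate node-wide totals into dask-worker command-line arguments.

    Hypothetical illustration only; not the dask-jobqueue implementation.
    """
    # Assumption: threads per process is the floor of cores / processes.
    threads = max(1, cores // processes)
    # Assumption: --memory-limit is a per-process limit, given in bytes.
    memory_per_process = total_memory_bytes // processes
    return [
        "dask-worker",
        "--nthreads", str(threads),
        "--nprocs", str(processes),
        "--memory-limit", str(memory_per_process),
    ]


# 32 cores and 128 GB (decimal) shared by 8 processes
print(" ".join(dask_worker_args(32, 8, 128 * 10**9)))
```

With this mapping a user only states what the node offers and how many processes to run; the per-process thread count and memory limit follow from that.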


mrocklin commented on May 28, 2024

See #86


guillaumeeb commented on May 28, 2024

Closed thanks to #86.

