Comments (3)
Sorry for the delay @martinmestre!
Can you say a bit more about how you are using the MultiPool, and what kind of machine you are running on?
When I test this on my MacBook Pro, running a Python script from the command line, it seems to work fine. Here's the script I ran:
```python
import sys
import time
import random

def worker(task):
    i, num = task
    time.sleep(0.1)
    return num ** 2

if __name__ == '__main__':
    from schwimmbad import MultiPool

    size = None
    if len(sys.argv) > 1:
        size = int(sys.argv[1])

    tasks = [(i, random.random()) for i in range(100)]
    with MultiPool(processes=size) as pool:
        print(pool)
        results = []
        for r in pool.map(worker, tasks):
            results.append(r)

    sys.exit(0)
```
Some timing:
```
% time python schwimmbad_test.py 1
<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=1>
python schwimmbad_test.py 1  0.59s user 0.51s system 9% cpu 11.184 total

% time python schwimmbad_test.py 2
<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=2>
python schwimmbad_test.py 2  0.83s user 0.65s system 24% cpu 6.169 total

% time python schwimmbad_test.py 4
<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=4>
python schwimmbad_test.py 4  1.45s user 1.02s system 63% cpu 3.885 total
```
So, the timings seem to be scaling as I would expect.
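For reference, the expected wall-time floor for this test is easy to compute: 100 tasks that each sleep 0.1 s, divided among `size` workers (a quick back-of-the-envelope sketch that ignores pool startup and dispatch overhead):

```python
# Lower bound on wall time for the test script above: total sleep time
# divided evenly among the workers (ignores startup/dispatch overhead).
def expected_floor(n_tasks=100, task_time=0.1, size=1):
    return n_tasks * task_time / size

for size in (1, 2, 4):
    print(f"{size} worker(s): >= {expected_floor(size=size):.1f} s")
```

The measured totals (11.2 s, 6.2 s, 3.9 s) sit a second or so above the 10 / 5 / 2.5 s floors, which is about the overhead you'd expect.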
from schwimmbad.
Oh, and just in case you are using an interactive interpreter (like an IPython session or notebook), multiprocessing pools do not work in these settings: https://docs.python.org/3.8/library/multiprocessing.html#using-a-pool-of-workers
Happy New Year @adrn!
Thanks for your answer.
My computer is a Lenovo G50-80 with the following CPU info (Linux: Debian 9):
```
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 61
Model name:            Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
```
I made some tests with your example. First I give the output from running your script as-is. Then I address the original problem I ran into, which comes from your example of selecting a pool with a command-line argument here:
https://schwimmbad.readthedocs.io/en/latest/examples/index.html.
- Using your original script above (with the worker just performing a sleep task) I obtain the following output:
```
$ \time python script-adrn.py 1
<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=1>
0.32user 0.02system 0:10.31elapsed 3%CPU

<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=2>
0.25user 0.04system 0:05.50elapsed 5%CPU

<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=4>
0.26user 0.04system 0:03.10elapsed 9%CPU
```
Looking at the elapsed times, I seem to obtain the same behaviour as your total times, but there is a large discrepancy in the %CPU values with respect to your results. When I watched with htop to see how many processors were being used simultaneously, I couldn't see much because the task is basically sleeping. So I modified the worker function to this:
```python
import numpy as np  # needed for np.linspace below

def worker(task):
    i, num = task
    time.sleep(0.1)
    sum(np.linspace(0, 1000, 10000000))  # add some real CPU work
    return num ** 2
```
I repeated the analysis and obtained:
```
<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=1>
86.76user 0.94system 1:37.75elapsed 89%CPU

<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=2>
106.42user 0.94system 1:01.06elapsed 175%CPU

<schwimmbad.multiprocessing.MultiPool state=RUN pool_size=4>
190.03user 0.99system 0:54.65elapsed 349%CPU
```
Looking at htop I verified that the number of processors used in each case is correct, although the elapsed times are not scaling as expected: 1:37, 1:01 and 0:54 (min:sec).
If you know of any reason for this behaviour please let me know.
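One possible piece of the answer (just a guess from the lscpu output above): this CPU has only 2 physical cores, with 2 hardware threads each, so near-linear scaling should only be expected up to 2 workers. Computing the speedups explicitly from the elapsed times:

```python
# Speedups computed from the measured elapsed times above (in seconds).
elapsed = {1: 97.75, 2: 61.06, 4: 54.65}

for n, t in elapsed.items():
    speedup = elapsed[1] / t
    efficiency = speedup / n
    print(f"{n} workers: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

This gives roughly 1.60x with 2 workers (80% efficiency) but only about 1.79x with 4 (around 45%), which would be consistent with hyper-threading adding little for floating-point-heavy work.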
- Now I address the original problem that made me open this issue. I ran your example about selecting a pool with command-line arguments, using a slightly heavier worker function, and added my own timing print inside:
```python
import math
import random
import time
import schwimmbad

def worker(task):
    a, b = task
    return math.cos(a) + math.sin(b)

def main(pool):
    # Here we generate some fake data
    a = [random.uniform(0, 2*math.pi) for _ in range(10000000)]
    b = [random.uniform(0, 2*math.pi) for _ in range(10000000)]
    tasks = list(zip(a, b))

    results = list(pool.map(worker, tasks))
    pool.close()

if __name__ == "__main__":
    from argparse import ArgumentParser
    parser = ArgumentParser(description="Schwimmbad example.")

    group = parser.add_mutually_exclusive_group()
    group.add_argument("--ncores", dest="n_cores", default=1,
                       type=int, help="Number of processes (uses "
                                      "multiprocessing).")
    group.add_argument("--mpi", dest="mpi", default=False,
                       action="store_true", help="Run with MPI.")
    args = parser.parse_args()

    pool = schwimmbad.choose_pool(mpi=args.mpi, processes=args.n_cores)

    start_time = time.time()
    main(pool)
    execution_time = time.time() - start_time
    print('execution time =', execution_time)
```
The outputs are as follows:
```
$ \time python script-demo.py
execution time = 13.054332256317139
12.62user 0.62system 0:13.24elapsed 99%CPU

$ \time python script-demo.py --ncores=2
execution time = 15.358909130096436
21.56user 1.70system 0:15.56elapsed 149%CPU

$ \time python script-demo.py --ncores=4
execution time = 14.996244668960571
22.05user 2.18system 0:15.20elapsed 159%CPU
```
I see that the execution/elapsed times are not scaling as expected (nor is the %CPU?).
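I wonder whether the problem here is per-task overhead: each task is a single cos/sin evaluation, but every one of the 10 million tasks has to be pickled and sent to a worker process, so the communication cost could dwarf the computation. One thing I would try (just a sketch, not from the schwimmbad docs) is batching many pairs into each task so the per-task overhead is amortized; here I use the stdlib `multiprocessing.Pool` directly, and `batch_worker`, `chunked` and the batch size are my own names/choices:

```python
import math
import random
from multiprocessing import Pool  # stand-in here for schwimmbad's MultiPool

def batch_worker(batch):
    # Each task is now a whole batch of (a, b) pairs, so the cost of
    # pickling/dispatching one task is shared by many evaluations.
    return [math.cos(a) + math.sin(b) for a, b in batch]

def chunked(seq, size):
    # Split seq into consecutive batches of at most `size` elements.
    return [seq[i:i + size] for i in range(0, len(seq), size)]

if __name__ == "__main__":
    tasks = [(random.uniform(0, 2 * math.pi),
              random.uniform(0, 2 * math.pi)) for _ in range(100000)]
    with Pool(processes=4) as pool:
        batched = pool.map(batch_worker, chunked(tasks, 10000))
    # Flatten the per-batch result lists back into one list of results.
    results = [x for batch in batched for x in batch]
    print(len(results))
```

The same batching should work with any pool returned by `schwimmbad.choose_pool`, since only the task objects change, not the `map` call.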
Please let me know if I was not clear in something.
Thank you very much in advance.
All the best!
Martín