
Comments (3)

adrn avatar adrn commented on May 26, 2024

Hi 👋 ! Thanks for the issue. I'm offline for a few days but will take a look ASAP. Thanks for the patience!

from schwimmbad.

adrn avatar adrn commented on May 26, 2024

@enajx I ended up with some time tonight, so I did some tests 😅 .

First off, I can reproduce what you see on macOS. I can think of some reasons why this might be happening, but I don't know exactly which applies here. That said, I did some experiments and have some recommendations.

Experiment: slow down the worker function

My first thought was that your worker function is so fast to execute that you are dominated by MPI overhead -- either from message-passing the large number of tasks, or from the pool waiting for workers to finish before sending new tasks. So, the first experiment I tried was slowing down your worker function by adding a time.sleep(1e-4) just under the function definition. That made the execution times much closer:
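For reference, the tweak was just this (a sketch, assuming the demo's worker computes a**2 + b**2 from an (a, b) task):

```python
import time

def worker(task):
    time.sleep(1e-4)  # simulate a worker that does a non-trivial amount of work
    a, b = task
    return a**2 + b**2
```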

MPI

$ mpiexec -n 4 python demo.py --mpi
Elapsed time n=100: 0.0066378116607666016
Elapsed time n=1000: 0.054388999938964844
Elapsed time n=10000: 0.5381379127502441

Multiprocessing (MPI with -n 4 dedicates one process to the master, leaving only 3 workers, so it's fairer to compare against 3 cores):

$ python demo.py --ncores=3
Elapsed time n=100: 0.48464012145996094
Elapsed time n=1000: 0.04478788375854492
Elapsed time n=10000: 0.4312679767608642

Batching tasks before map'ing

If your worker function is inherently very fast and you just have a ton of tasks, I have gotten much better performance by first batching up the tasks and sending each batch to the worker function.

Example script here (a modified version of your demo):
def worker(batch):
    # batch_tasks yields (batch_index, tasks) pairs
    _, tasks = batch

    results = []
    for a, b in tasks:
        results.append(a**2 + b**2)
    return results

def main(pool, n):
    from schwimmbad.utils import batch_tasks

    # Here we generate some fake data
    import random
    a = [random.random() for _ in range(n)]
    b = [random.random() for _ in range(n)]

    tasks = list(zip(a, b))
    batches = batch_tasks(max(1, pool.size - 1),  # at least one batch
                          arr=tasks)

    tic = time.time()
    results = pool.map(worker, batches)
    toc = time.time()
    results = [x for sublist in results for x in sublist]
    print(f'Elapsed time n={n}: {toc-tic}')

if __name__ == "__main__":
    import schwimmbad
    import time

    from argparse import ArgumentParser
    parser = ArgumentParser(description="Schwimmbad example.")

    group = parser.add_mutually_exclusive_group()
    group.add_argument("--ncores", dest="n_cores", default=32, type=int,
                       help="Number of processes (uses multiprocessing).")
    group.add_argument("--mpi", dest="mpi", default=False, action="store_true",
                       help="Run with MPI.")
    args = parser.parse_args()

    pool = schwimmbad.choose_pool(mpi=args.mpi, processes=args.n_cores)

    for n in [10 ** x for x in range(2, 6+1)]:
        main(pool, n)

MPI:

$ mpiexec -n 4 python schw-test.py --mpi
Elapsed time n=100: 0.001007080078125
Elapsed time n=1000: 0.003452777862548828
Elapsed time n=10000: 0.03164792060852051
Elapsed time n=100000: 0.29506802558898926
Elapsed time n=1000000: 2.743013858795166

Multiprocessing:

$ python schw-test.py --ncores=3
Elapsed time n=100: 0.4871842861175537
Elapsed time n=1000: 0.001354217529296875
Elapsed time n=10000: 0.006591081619262695
Elapsed time n=100000: 0.07938289642333984
Elapsed time n=1000000: 0.699350118637085

So, MPI is still slower here, but much closer than in your initial example (and you could probably do slightly better by tuning the number of batches).
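If you want to play with the batch count, the kind of chunking that batch_tasks does can be sketched in pure Python (a simplified stand-in, not schwimmbad's actual implementation -- unlike batch_tasks, it returns plain lists rather than (index, tasks) pairs):

```python
def split_into_batches(tasks, n_batches):
    """Split tasks into n_batches roughly equal chunks.

    Fewer batches mean fewer MPI messages but coarser load balancing;
    more batches balance load better at the cost of more message-passing.
    """
    n_batches = max(1, n_batches)
    size, rem = divmod(len(tasks), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        stop = start + size + (1 if i < rem else 0)  # spread the remainder
        batches.append(tasks[start:stop])
        start = stop
    return batches
```

For example, 10 tasks split into 3 batches gives chunks of sizes 4, 3, and 3.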


enajx avatar enajx commented on May 26, 2024

That makes sense!

I ran the benchmark again with a more realistic worker function and now get more consistent results, in this case with MPI taking the lead:

Computing pi, running on macOS with 8 cores:

n = 1000000
MultiPool : 0.54s
MPIPool : 0.12s

n = 10000000
MultiPool : 1.45s
MPIPool : 1.22s

import math
import random
import time
import schwimmbad

def sample(num_samples):
    num_inside = 0
    for _ in range(num_samples):
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if math.hypot(x, y) <= 1:
            num_inside += 1
    return num_inside

def approximate_pi_parallel(num_samples, cores, mpi):
    sample_batch_size = 1000

    with schwimmbad.choose_pool(mpi=mpi, processes=cores) as pool:
        print(pool)
        start = time.time()
        num_inside = 0
        # one task per batch of samples; the with-block closes the pool for us
        batch_sizes = [sample_batch_size] * (num_samples // sample_batch_size)
        for result in pool.map(sample, batch_sizes):
            num_inside += result
        print(f"pi ~= {(4 * num_inside) / num_samples}")
        print(f"Finished in: {time.time() - start}s")

if __name__ == "__main__":

    n = 1000000
    # n = 10000000
    cores = 8
    mpi = False # True when run with mpiexec
    approximate_pi_parallel(n, cores, mpi)

Issue solved, thank you!

