
Comments (6)

vstinner commented on July 23, 2024

IMHO your benchmark works as you expect, it's just that the child worker processes are flooding stdout :-)

See the documentation of the perf architecture:
http://perf.readthedocs.io/en/latest/run_benchmark.html#perf-architecture

Many processes are spawned. Even if a worker process displays both "Running: Benchmark: bar" and "Running: Benchmark: foo", in practice each worker only runs a single bench_func() benchmark; the other calls are skipped. That's handled internally by perf.
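
A minimal sketch of that behavior (assuming the older perf import name; the benchmark names and functions are illustrative):

import perf

def foo():
    return sum(range(1000))

def bar():
    return sorted(range(1000, 0, -1))

runner = perf.Runner()
# Inside any given worker process, only one of these two calls
# actually runs its function; perf skips the other internally.
runner.bench_func('foo', foo)
runner.bench_func('bar', bar)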

You may want to skip messages in worker processes. You can do that using:

args = runner.parse_args()
if not args.worker:
    # master process
    print("...")

Maybe I should explain that better in the documentation.


parasyte commented on July 23, 2024

Nah, it isn't flooding stdout; only about 100 messages are printed. The problem is needlessly running setup code for each sample, and there are at least 60 samples per benchmark. At 30 seconds of setup per sample, that's 30 s * 60 samples * 28 benchmarks = 50,400 s, about 14 hours, which is far too long to effectively benchmark code that reports a mean under 30 ms with a standard deviation of 1 ms.

I don't want to skip printing messages, I want to "skip" my setup code by reusing existing data structures in memory.


pawarren commented on July 23, 2024

Is there a solution for this? My setup code takes ~20 seconds, and it looks like every single spawned process runs the setup code again instead of using the objects the parent process has already loaded into memory.


pawarren commented on July 23, 2024

EDIT: This does not fix the actual original issue. Every single one of my spawned workers still runs the setup code, which slows things down a lot. However, I've significantly shortened the time a single benchmark run takes, because each worker no longer does the entire setup, just what's needed to go from scratch to its specified benchmark.


I found and fixed the problem.

My working loop looks like this:

import cv2
import perf
from functools import partial

# PATHS_TO_IMAGES, LIBRARY and letterbox_resize are defined elsewhere
runner = perf.Runner(values=3, warmups=1, processes=20)

for path in PATHS_TO_IMAGES:
    filename = path.stem
    image = cv2.imread(str(path))  # a numpy array

    benchmark_name = f'letterbox_resize_{LIBRARY}_{filename}'
    fn = partial(letterbox_resize, image)
    benchmark = runner.bench_func(benchmark_name, fn)

    # bench_func() can return None, so only dump when there is a result
    if benchmark is not None:
        benchmark.dump(f'results/{benchmark_name}.json', compact=False, replace=True)
        # workers run exactly one benchmark; break so they stop
        # loading the remaining images for nothing
        if runner.args is not None and runner.args.worker:
            break

I am looping over a list of paths to images. I load an image and benchmark some operations on it. The bench_func() call spawns many child processes. Those processes benchmark the function call as expected... and then continue onward, executing the rest of the loop and loading every single image in my list.

They don't actually do anything with those images; Runner seems to know that the process has already benchmarked its thing and doesn't call bench_func() again. But it wastes a lot of time loading a bunch of images and doing nothing with them. 0.02 s to load an image * 1,000 images * many processes * many iterations = a lot of wasted time.

The solution was using runner.args.worker to check whether I was in a worker process and, if so, breaking out of the loop. But runner.args is only present if you're in a worker process, so first you check that runner.args is not None, then you check that runner.args.worker is True. And because I was dumping results to a JSON file, the dump call had to go right there, just before breaking out of the loop; any other placement of the .dump() call errors out.


vstinner commented on July 23, 2024

We could consider adding a new parameter to bench_func() to register a "setup function".
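
As an aside: in current pyperf, the string-based Runner.timeit() already accepts a setup argument whose cost is excluded from the timing. A minimal sketch with illustrative names:

import pyperf

runner = pyperf.Runner()
# setup runs in each worker before the timed statement; its cost is
# not included in the measurement
runner.timeit(
    "sum_1M",
    stmt="sum(data)",
    setup="data = list(range(10**6))",
)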

In the meantime, you can work around the issue by writing a separate script per benchmark, no? If you want to get a single JSON file, you can use --append rather than --output.
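
A sketch of that workaround, with illustrative names (one benchmark per script, so the expensive setup runs only for that single benchmark):

# bench_foo.py: exactly one benchmark in this script
import perf

def heavy_setup():
    # stands in for the expensive setup (loading images, etc.)
    return list(range(10**6))

def operation(data):
    return sum(data)

runner = perf.Runner()
data = heavy_setup()
runner.bench_func('sum_1M', operation, data)

Each script is then run with --append so all results land in the same JSON file, e.g. python bench_foo.py --append results.json. Setup still runs once per worker process, but only for this one benchmark rather than for every benchmark in a big loop.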


tyteen4a03 commented on July 23, 2024

Any updates on this? I'm scratching my head trying to write benchmark code in pyperf (whose setup includes a bunch of imports).

