Comments (5)
The issue is that dumping each benchmark into a JSON file would leave me with tons of JSON files, since the many benchmarks I want to run are almost entirely identical except for different combinations of arguments. On top of that, this looping issue means the JSON file would actually be written once for each process that pyperf.Runner is given, so 20 processes would mean the output JSON file is written 21 times: once for each process run and one last time when all the runs are aggregated.
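To illustrate the setup, the benchmarks in question look roughly like this (a sketch; the delay values are made up):

#!/usr/bin/env python3
# Sketch of the scenario above: many nearly identical benchmarks that
# differ only in their argument combinations (the delays are made up).
import time
import pyperf

def func(delay):
    time.sleep(delay)

if __name__ == "__main__":
    runner = pyperf.Runner()
    for delay in (0.001, 0.002, 0.004):
        # bench_func forwards extra positional arguments to func
        runner.bench_func(f'sleep-{delay}', func, delay)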
Maybe write:
if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.bench_func('sleep', func)
I have tried that, but the output remains the same. Just to make sure, this is what that code looks like:
#!/usr/bin/env python3
import pyperf
import time

def func():
    time.sleep(0.001)

if __name__ == "__main__":
    runner = pyperf.Runner()
    results = runner.bench_func('sleep', func)
    print(results)
My workaround for this weird bug is to know how many processes are being used (either set by the user or the default of 20) and check whether len(results.get_runs()) == procs, which looks like so:
#!/usr/bin/env python3
import pyperf
import time

def func():
    time.sleep(0.001)

if __name__ == "__main__":
    procs = 20
    runner = pyperf.Runner(processes=procs)
    results = runner.bench_func('sleep', func)
    # only print once the result carries all the runs
    if len(results.get_runs()) == procs:
        print(f"Mean: {results.mean()}")
        print(f"Median: {results.median()}")
But from my understanding of the overall code, pyperf spawns new instances of the Python interpreter to re-run the same program file, which is why it ends up printing multiple times rather than once after the benchmark has completed.
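For example, printing the process ID at the top of the script makes the re-execution visible, since each worker prints a different PID (a quick sketch to illustrate):

#!/usr/bin/env python3
# Quick sketch: the PID line prints once per spawned worker process,
# showing that pyperf re-executes this same file for every run.
import os
import time
import pyperf

def func():
    time.sleep(0.001)

if __name__ == "__main__":
    print(f"running in PID {os.getpid()}")
    runner = pyperf.Runner()
    runner.bench_func('sleep', func)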
An update: this issue also makes it mildly difficult to write all the results into a single output file. Here's an example:
#!/usr/bin/env python3
import pyperf
import time
import csv
from statistics import mean, median

def func():
    time.sleep(0.001)

if __name__ == "__main__":
    procs = 20
    runner = pyperf.Runner(processes=procs)
    my_results = {}
    for i in range(1, 11):
        result = runner.bench_func('sleep', func)
        # only keep the aggregated result that carries all the runs
        if len(result.get_runs()) == procs:
            my_results[i] = list(result.get_values())
    with open("output.csv", "w", newline="") as my_file:
        headers = ["Loop", "Mean", "Median"]
        writer = csv.DictWriter(my_file, fieldnames=headers)
        writer.writeheader()
        for k in my_results.keys():
            writer.writerow(
                {
                    "Loop": k,
                    "Mean": mean(my_results[k]),
                    "Median": median(my_results[k]),
                }
            )
Even though I have the len(result.get_runs()) == procs check above to force the program to wait until all the results are available, it doesn't prevent the program from continuing on to the later code. I also have to make that check test whether result is None first, since result will be None until a benchmark result is returned, which is not always the case in the loop. This is the fixed version of the code:
#!/usr/bin/env python3
import pyperf
import time
import csv
from pathlib import Path
from statistics import mean, median

def func():
    time.sleep(0.001)

if __name__ == "__main__":
    procs = 20
    runner = pyperf.Runner(processes=procs)
    my_results = {}
    for i in range(1, 11):
        result = runner.bench_func('sleep', func)
        if result is None:
            # no benchmark result yet (e.g. in a worker process)
            pass
        elif len(result.get_runs()) == procs:
            my_results[i] = list(result.get_values())
    open_mode = "w"
    if Path("output.csv").exists():
        # append so repeated executions of this file don't overwrite earlier rows
        open_mode = "a"
    with open("output.csv", open_mode, newline="") as csv_file:
        headers = ["Loop", "Mean", "Median"]
        writer = csv.DictWriter(csv_file, fieldnames=headers)
        if open_mode == "w":
            # write the header only when creating the file
            writer.writeheader()
        for k in my_results.keys():
            writer.writerow(
                {
                    "Loop": k,
                    "Mean": mean(my_results[k]),
                    "Median": median(my_results[k]),
                }
            )
This is kind of cumbersome. I'm still trying to understand pyperf's underlying code to see why this is occurring, but I'd like to look into creating a wait mechanism (similar to asyncio's async/await) where all runs have to complete before moving on with the rest of the program.
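As a stopgap, something close to that can be had by gating the post-processing on having collected every loop's results, so only the invocation that ends up with all of them produces output (a sketch building on the code above):

#!/usr/bin/env python3
# Sketch of the "wait" idea: only run the post-processing once every loop
# iteration has yielded a complete Benchmark (run count == process count).
import time
import pyperf
from statistics import mean, median

def func():
    time.sleep(0.001)

if __name__ == "__main__":
    procs = 20
    loops = 10
    runner = pyperf.Runner(processes=procs)
    my_results = {}
    for i in range(1, loops + 1):
        # give each iteration its own benchmark name
        result = runner.bench_func(f'sleep-{i}', func)
        # result is None in worker processes until a benchmark is available
        if result is not None and len(result.get_runs()) == procs:
            my_results[i] = list(result.get_values())
    # the "wait": proceed only once every loop iteration has a complete result
    if len(my_results) == loops:
        for k, values in my_results.items():
            print(k, mean(values), median(values))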
That's a surprising way to use pyperf. Why not write the results to a JSON file and then load the JSON to process it?
https://pyperf.readthedocs.io/en/latest/api.html#BenchmarkSuite.load
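That workflow might look like this (a sketch; the file name bench.json is made up, and it assumes the benchmark script was run beforehand with -o bench.json):

#!/usr/bin/env python3
# Sketch: load a previously written JSON file and post-process it.
# Assumes the benchmark script was run with "python bench.py -o bench.json".
import pyperf

suite = pyperf.BenchmarkSuite.load("bench.json")  # made-up file name
for bench in suite.get_benchmarks():
    print(bench.get_name(), bench.mean(), bench.median())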