psf / pyperf
Toolkit to run Python benchmarks
Home Page: http://pyperf.readthedocs.io/
License: MIT License
I was trying to use perf for benchmarking python/cpython#11318 and tried the example below, which works with timeit but breaks when using perf. I'm not sure whether this statement simply can't be benchmarked given perf's implementation internals, but I thought I'd file this anyway. Feel free to close it if it's invalid.
Using timeit
$ python3 -m timeit -s 'from enum import Enum;' "Enum('Animal', 'ANT BEE CAT DOG')"
2000 loops, best of 5: 121 usec per loop
Using perf
$ python3 -m perf timeit -s 'from enum import Enum;' "Enum('Animal', 'ANT BEE CAT DOG')"
Error when running timeit benchmark:
Statement:
"Enum('Animal', 'ANT BEE CAT DOG')"
Setup:
'from enum import Enum;'
Traceback (most recent call last):
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_timeit.py", line 203, in bench_timeit
runner.bench_time_func(name, timer.time_func, **kwargs)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 458, in bench_time_func
return self._main(task)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 423, in _main
bench = self._worker(task)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 397, in _worker
run = task.create_run()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_worker.py", line 293, in create_run
self.compute()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_worker.py", line 331, in compute
WorkerTask.compute(self)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_worker.py", line 280, in compute
self.calibrate_loops()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_worker.py", line 245, in calibrate_loops
calibrate_loops=True)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_worker.py", line 76, in _compute_values
raw_value = self.task_func(self, self.loops)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 454, in task_func
return time_func(loops, *args)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_timeit.py", line 109, in time_func
return inner(it, timer)
File "<timeit-src>", line 6, in inner
Enum('Animal', 'ANT BEE CAT DOG')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/enum.py", line 311, in __call__
return cls._create_(value, names, module=module, qualname=qualname, type=type, start=start)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/enum.py", line 429, in _create_
module = sys._getframe(2).f_globals['__name__']
KeyError: '__name__'
Error when running timeit benchmark:
Statement:
"Enum('Animal', 'ANT BEE CAT DOG')"
Setup:
'from enum import Enum;'
Traceback (most recent call last):
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_timeit.py", line 203, in bench_timeit
runner.bench_time_func(name, timer.time_func, **kwargs)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 458, in bench_time_func
return self._main(task)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 428, in _main
bench = self._master()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_runner.py", line 551, in _master
bench = Master(self).create_bench()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_master.py", line 221, in create_bench
worker_bench, run = self.create_worker_bench()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_master.py", line 120, in create_worker_bench
suite = self.create_suite()
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_master.py", line 110, in create_suite
suite = self.spawn_worker(self.calibrate_loops, 0)
File "/Users/karthikeyansingaravelan/stuff/python/py37-venv/lib/python3.7/site-packages/perf/_master.py", line 97, in spawn_worker
% (cmd[0], exitcode))
RuntimeError: /Users/karthikeyansingaravelan/stuff/python/py37-venv/bin/python3 failed with exit code 1
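The traceback suggests the compiled timeit statement runs in a frame whose globals have no __name__, which is what Enum's functional API inspects via sys._getframe(). A possible workaround (my assumption, not verified against perf internals) is to pass the module explicitly so Enum skips the frame inspection; 'bench' below is just a placeholder name:
$ python3 -m perf timeit -s 'from enum import Enum' "Enum('Animal', 'ANT BEE CAT DOG', module='bench')"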
Our code requires UTF-8 mode, as described in https://www.python.org/dev/peps/pep-0540/
I tried to benchmark it with pyperf, but the worker processes spawned by pyperf do not have UTF-8 mode enabled. I tried to add the PYTHONUTF8 env var, but it still did not help.
Then I found out that env vars are stripped, so I added --inherit-environ=PYTHONUTF8.
It would be much easier if the PYTHONUTF8 env var were inherited by default. What do you think about it?
On Python 2, the CPU affinity is not saved in metadata.
"taskset" can be replaced with "psutil.Process.cpu_affinity()" (works on Linux, Windows, FreeBSD) https://pythonhosted.org/psutil/#psutil.Process.cpu_affinity
Advice by @giampaolo who wrote psutil ;-)
I would prefer to keep psutil dependency optional whenever possible.
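A minimal sketch of what that could look like while keeping psutil optional (the function name is illustrative, not the actual perf code):

try:
    import psutil
except ImportError:
    psutil = None

def get_cpu_affinity():
    if psutil is not None:
        proc = psutil.Process()
        # cpu_affinity() exists on Linux, Windows and FreeBSD, but not on macOS
        if hasattr(proc, 'cpu_affinity'):
            return proc.cpu_affinity()
    return None  # fall back to taskset, as today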
Playing around with pyperf, I can't help but think I'm going to end up reimplementing/importing a bunch of the pyperformance code around venvs. When testing performance of almost any Python code, I probably want to do it with it installed rather than from within a source tree, which means that any user of pyperf will end up implementing that.
I don't really know what makes the most sense API-wise; one obvious option would be a context manager, perhaps something like:

with runner.venv("requirements.txt") as env:
    env.bench_func("name", func)
[ 24s] > elif pyperf.perf_counter == time.clock:
[ 24s] E AttributeError: module 'time' has no attribute 'clock'
[ 24s]
[ 24s] pyperf/_collect_metadata.py:95: AttributeError
With Python 3.8 this errors out; I suppose there should be a hasattr check, like the one already done for the perf_counter check.
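A minimal sketch of the suggested guard (assuming the surrounding code just compares pyperf.perf_counter against the stdlib timers):

import time
import pyperf

# time.clock was removed in Python 3.8, so guard the comparison
if hasattr(time, 'clock') and pyperf.perf_counter == time.clock:
    ...  # existing time.clock branch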
I'm currently trying to benchmark numpy operations on pypy (arch ppc64le, which is probably not supported right now). It seems as if the implementation to gather _collect_cpu_metadata does not work for ppc.
Here is what I found:
Spawned process fails to make a set out of an integer. (Line ~256, _collect_cpu_freq in perf/metadata.py). _get_logical_cpu_count returns an integer. Unsure what all_cpus should be.
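My guess (unverified) is that all_cpus is expected to be the set of logical CPU ids, so an integer count would need to be expanded, roughly:

cpu_count = _get_logical_cpu_count()
if isinstance(cpu_count, int):
    all_cpus = set(range(cpu_count))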
I'm trying to run Runner.bench_func
inside a package like this:
└── format
├── benchmarks
│ ├── bm_format.py
│ ├── bm_fstring.py
│ ├── bm_percentage.py
│ └── __init__.py
├── __init__.py
└── run.py
Inside format/run.py, the code runs the benchmark functions like this:
import perf

from .benchmarks import BENCHS

def run():
    runner = perf.Runner()
    for bm_name, bm_func in BENCHS:
        runner.bench_func(bm_name, bm_func)

if __name__ == '__main__':
    run()
When running this from the command line, it raises ModuleNotFoundError and then RuntimeError:
➜ python-string-format python -m format.run
Traceback (most recent call last):
File "/tmp/python-string-format/format/run.py", line 2, in <module>
from .benchmarks import BENCHS
ModuleNotFoundError: No module named '__main__.benchmarks'; '__main__' is not a package
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/tmp/python-string-format/format/run.py", line 12, in <module>
run()
File "/tmp/python-string-format/format/run.py", line 8, in run
runner.bench_func(bm_name, bm_func)
File "/usr/lib/python3.6/site-packages/perf/_runner.py", line 594, in bench_func
return self._main(name, sample_func, inner_loops, metadata)
File "/usr/lib/python3.6/site-packages/perf/_runner.py", line 508, in _main
bench = self._master()
File "/usr/lib/python3.6/site-packages/perf/_runner.py", line 768, in _master
bench = self._spawn_workers()
File "/usr/lib/python3.6/site-packages/perf/_runner.py", line 724, in _spawn_workers
suite = self._spawn_worker(calibrate)
File "/usr/lib/python3.6/site-packages/perf/_runner.py", line 665, in _spawn_worker
% (cmd[0], exitcode))
RuntimeError: /usr/bin/python failed with exit code 1
Did I do something wrong, or can Runner.bench_func only run inside its own module?
Thanks.
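For what it's worth, my understanding is that perf re-spawns the current script in worker subprocesses, and when the program was started with python -m format.run the worker is re-run by file path, so the package-relative import fails. A hedged sketch of a possible workaround, using an absolute import plus the Runner's program_args parameter (check the Runner docs for your version):

import perf
from format.benchmarks import BENCHS  # absolute import instead of relative

def run():
    # Tell worker processes to re-run the program as a module so that
    # the "format" package stays importable.
    runner = perf.Runner(program_args=('-m', 'format.run'))
    for bm_name, bm_func in BENCHS:
        runner.bench_func(bm_name, bm_func)

if __name__ == '__main__':
    run()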
(venv) marco@buzz:~/sources/cpython_mine$ python
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyperf
>>>
>>> runner = pyperf.Runner()
>>> runner.timeit(name="sort a sorted list",
... stmt="sorted(s, key=f)",
... setup="f = lambda x: x; s = list(range(1000))")
/home/marco/sources/cpython_mine/venv/bin/python: can't find '__main__' module in ''
I'm using pyperf 2.0.0.
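pyperf re-executes the running script in fresh worker processes, so there is no script to re-run when the Runner is created from the interactive interpreter, hence the "can't find '__main__' module" error. A minimal sketch of the same benchmark as a file, which is the supported setup as far as I understand:

# bench_sort.py
import pyperf

runner = pyperf.Runner()
runner.timeit(name="sort a sorted list",
              stmt="sorted(s, key=f)",
              setup="f = lambda x: x; s = list(range(1000))")

Then run it with: python bench_sort.py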
Logging could give the option to choose logging levels (e.g. suppress warnings/errors) or redirect those to different log files without parsing stdout.
When trying to run a benchmark I noticed that the function collect_cpu_freq()
in _collect_metadata.py
parses the output of /proc/cpuinfo
but fails on the IBM Z (mainframe, S/390).
On Linux on IBM Z the output may look like this:
vendor_id : IBM/S390
# processors : 16
[...]
processor 0: version = 00, identification = [...]
cpu number : 0
cpu MHz dynamic : 5208
cpu MHz static : 5208
[...]
The code splits lines starting with processor at ':', which will not yield a proper CPU number on Z. It should suffice, however, to check the vendor_id for IBM/S390 and look for cpu number in this case instead. The parsing of the frequency should work fine with the existing logic.
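A rough sketch of the suggested parsing (names are illustrative, not the actual pyperf code):

def parse_cpuinfo_cpu_ids(lines):
    # On IBM Z, logical CPUs are listed as "cpu number : N" and the
    # "processor N: version = ..." lines carry no usable number.
    is_s390 = any(line.startswith('vendor_id') and 'IBM/S390' in line
                  for line in lines)
    key = 'cpu number' if is_s390 else 'processor'
    cpus = []
    for line in lines:
        if line.startswith(key):
            _, _, value = line.partition(':')
            try:
                cpus.append(int(value))
            except ValueError:
                pass
    return cpus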
For unclear reasons, perf
removes all environment variables except for a specific white-list. I consider this a bug: environment variables are generally meant to be passed to all subprocesses.
It would be interesting to have an abstraction to hide the OS-specific implementation
Originally posted by @vstinner in #93 (comment)
I want to use the timeit function in this project to replace the standard timeit, and I read the documentation and source code but didn't find a proper way to do it.
More precisely, I want to analyze the raw benchmark results (the time series produced by running timeit). I found that this project deals with CPU affinity and CPU isolation, so its results are probably more stable than the standard timeit.
Here is a picture of the time series of run timeit:
>50% of the data is noise.
My stable benchmark algorithm: https://github.com/guyskk/validr/blob/cython/benchmark/stable_timeit.py
see also: guyskk/validr#9
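In case it helps, the raw time series seems reachable from the stored JSON result; a hedged sketch with the current pyperf API (older "perf" releases may differ):

import pyperf

bench = pyperf.Benchmark.load('bench.json')   # produced with -o bench.json
values = bench.get_values()                   # all measured values, warmups excluded
print(len(values), min(values), max(values))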
The setup.py & tests don't declare unittest2 as a mandatory dependency on Python 2, but tox includes it.
Just running setup.py test or nose without it:
[ 84s] _____________________________ TestTimeit.test_name _____________________________
[ 84s]
[ 84s] self = <pyperf.tests.test_timeit.TestTimeit testMethod=test_name>
[ 84s]
[ 84s] def test_name(self):
[ 84s] name = 'myname'
[ 84s] args = PERF_TIMEIT + ('--name', name) + FAST_BENCH_ARGS
[ 84s] bench, stdout = self.run_timeit_bench(args)
[ 84s]
[ 84s] self.assertEqual(bench.get_name(), name)
[ 84s] > self.assertRegex(stdout, re.compile('^%s' % name, flags=re.MULTILINE))
[ 84s] E AttributeError: 'TestTimeit' object has no attribute 'assertRegex'
[ 84s]
[ 84s] pyperf/tests/test_timeit.py:254: AttributeError
[ 84s] ============== 17 failed, 143 passed, 1 skipped in 13.43 seconds ===============
Note that using six doesn't help; I run into benjaminp/six#164
The main reason why I use the standard timeit module instead of perf timeit is that it is fast: I get a result in 1-2 seconds. In contrast, perf timeit, even with the --fast option, takes at least 10 seconds. That is too slow when you are experimenting and run a microbenchmark after every tiny change. The precision of the standard timeit module is enough when the difference is tens or hundreds of percent.
The --fast option would be more useful if it decreased the benchmarking time to 1-2 seconds.
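For reference, the run can also be shortened explicitly instead of relying on --fast; a hedged example with the current pyperf option names (check --help for your version), with the obvious trade-off that fewer processes and values make the result less reliable:
$ python3 -m pyperf timeit --processes=1 --values=3 --warmups=1 -s "s = list(range(1000))" "sorted(s)"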
Hey Victor! Great library you've got going here (not a surprise; I've been a fan since long before faulthandler was in the stdlib).
As I mentioned on Twitter, I think perf would benefit from using a more robust set of measures than average + standard deviation. I recommended median and median absolute deviation. I have a well-rounded Python2/3 compatible implementation in boltons.statsutils. You can lift the whole module, if you'd like, or just reimplement those functions.
In addition to making the statistics more robust, this also gets rid of the pesky problem of whether to use Bessel's correction. The Python 3 statistics module is rife with it, and in this domain it raises more questions than it answers.
I've written a bit about statistics for exactly cases like these, so I'm obligated to link: Statistics for Software. Let me know if that helps and/or if I can help clarify further.
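A minimal stdlib-only sketch of those two estimators (not the boltons implementation):

import statistics

def median_abs_deviation(values):
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)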
I see in the source code that this is generated when the number of warmups exceeds a hard-coded value of 300. I'm doing some benchmarks under pypy and I guess the JIT warmup is an issue and this particular benchmark needs more warmups to converge.
How would you suggest I handle this? It looks like the maximum number of warmups is hard-coded - would you be ok with taking a patch that makes it configurable? Or maybe it is configurable already and I'm missing the parameter.
Thanks!
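A possible interim workaround (my assumption about how calibration works, please correct me): setting the warmups and loops explicitly seems to skip the calibration that hits the limit, e.g.

import pyperf

runner = pyperf.Runner(warmups=500, loops=1000)  # values chosen by hand for the JIT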
I am trying to bench some data structures with a large number of items in each structure. I would like to have a single setup
that is called once (and only once) for all benchmark runs, because the setup takes several seconds to complete.
While debugging, I discovered this very peculiar behavior that perf
reloads the entire main module for each run.
import perf
import time

def run_benchmarks(module):
    runner = perf.Runner()

    def get_doc(attr):
        return getattr(module, attr).__doc__

    bench_names = sorted([x for x in dir(module) if x.startswith('bench_')], key=get_doc)
    for bench_name in bench_names:
        bench = getattr(module, bench_name)
        doc = bench.__doc__
        print('Running: %s' % (doc))
        runner.bench_func(doc, bench)

class BenchModule(object):
    def __init__(self):
        print('Creating BenchModule()')
        time.sleep(3)
        print('Created BenchModule()')

    def bench_foo(self):
        '''Benchmark: foo'''
        return list(range(1000))

    def bench_bar(self):
        '''Benchmark: bar'''
        return list(range(1000))

if __name__ == '__main__':
    print('Starting benchmarks')
    module = BenchModule()
    run_benchmarks(module)
When I run this, I see output like the following:
Starting benchmarks
Creating BenchModule()
Created BenchModule()
Running: Benchmark: bar
Starting benchmarks
Creating BenchModule()
Created BenchModule()
Running: Benchmark: bar
Running: Benchmark: foo
.Starting benchmarks
Creating BenchModule()
Created BenchModule()
Running: Benchmark: bar
Running: Benchmark: foo
.Starting benchmarks
Creating BenchModule()
Created BenchModule()
Running: Benchmark: bar
Running: Benchmark: foo
.Starting benchmarks
Creating BenchModule()
[ ... snip ... ]
.Starting benchmarks
Creating BenchModule()
Created BenchModule()
Running: Benchmark: bar
Running: Benchmark: foo
.
Benchmark: foo: Mean +- std dev: 9.83 us +- 0.16 us
... what the ? 😖
Why is it printing my Starting benchmarks
line more than once? More importantly, how is it doing this? Is there some black magic going on with subprocess(__main__)
?
But back to the original issue that led me down this rathole: how do I prevent perf
from running my BenchModule
constructor on each run? I put a 3-second sleep in there to illustrate why the current behavior is obnoxious. In my real world benchmark, the setup time is more than 30 seconds, and running the complete test suite lasts a few hours.
This is possibly related: #28
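As far as I can tell, the repeated output is expected: perf deliberately re-runs the whole script in worker subprocesses (re-spawning it with a --worker flag), so everything at module level, including the __main__ block, executes once per worker. There seems to be no supported way to share one setup across all workers, but the cost can at least be deferred so only the process that actually runs a benchmark pays it; a hedged sketch:

_module = None

def get_module():
    # Built lazily, so each worker creates BenchModule() at most once,
    # and only if it actually runs a benchmark.
    global _module
    if _module is None:
        _module = BenchModule()
    return _module

def run_bench_foo():
    return get_module().bench_foo()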
Hi. Thanks for writing this.
It'd be pretty helpful if the documentation recommended a layout / setup for package authors that want to write and/or ship a set of benchmarks with their package.
E.g., for jsonschema I went with this layout and have if __name__ == "__main__"
blocks in a set of benchmark files, and a toxenv that runs them -- but it's not 100% ideal (for a few reasons).
Would be great if you have better suggestions for a recommended setup.
$ ./python -m perf timeit -s 's="АБВГҐДЕЄЖЗИІЇЙКЛМНОПРСТУФХЦЧШЩЬЮЯ"' 's.count("Є")'
Traceback (most recent call last):
File "/home/serhiy/py/cpython/Lib/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/serhiy/py/cpython/Lib/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/__main__.py", line 727, in <module>
main()
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/__main__.py", line 719, in main
func()
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/__main__.py", line 512, in cmd_timeit
perf._timeit.main(timeit_runner)
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_timeit.py", line 184, in main
timer = create_timer(runner, stmt)
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_timeit.py", line 104, in create_timer
return timeit.Timer(stmt, setup, timer=perf.perf_counter)
File "/home/serhiy/py/cpython/Lib/timeit.py", line 109, in __init__
compile(setup, dummy_src_name, "exec")
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 3-68: surrogates not allowed
Traceback (most recent call last):
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_timeit.py", line 194, in main
runner.bench_sample_func(args.name, sample_func, timer, **kwargs)
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_runner.py", line 529, in bench_sample_func
return self._main(name, wrap_sample_func, inner_loops)
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_runner.py", line 496, in _main
bench = self._master()
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_runner.py", line 732, in _master
bench = self._spawn_workers()
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_runner.py", line 689, in _spawn_workers
suite = self._spawn_worker(calibrate)
File "/home/serhiy/.local/lib/python3.7/site-packages/perf/_runner.py", line 626, in _spawn_worker
% (cmd[0], exitcode))
RuntimeError: /home/serhiy/py/cpython/python failed with exit code 1
The standard timeit module supports non-ASCII code.
$ ./python -m timeit -s 's="АБВГҐДЕЄЖЗИІЇЙКЛМНОПРСТУФХЦЧШЩЬЮЯ"' 's.count("Є")'
500000 loops, best of 5: 410 nsec per loop
For some uses, it would be nice to measure CPU time rather than wall-clock time.
--track-memory
does not support macOS yet.
First, I'd like to start on this issue by using psutil. Any ideas?
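A hedged sketch of the psutil route (psutil.Process().memory_info().rss is available on macOS; note this is the current RSS, so a peak value would need polling or resource.getrusage):

import psutil

def get_rss_bytes():
    return psutil.Process().memory_info().rss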
Migrate CI to GitHub Actions.
When running from a virtual environment shell on Windows 10 (2004):
pipenv run python speed_test.py
I got this warning:
WARNING: unable to increase process priority
once for each worker process, plus one extra for each runner.timeit() call.
I figured out that it is because psutil is missing from the package dependencies.
======================================================================
FAIL: test_command_track_memory (test_perf_cli.TestPerfCLI)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\pyperf\pyperf\pyperf\tests\test_perf_cli.py", line 619, in test_command_track_memory
self.run_command(*args)
File "D:\a\pyperf\pyperf\pyperf\tests\test_perf_cli.py", line 35, in run_command
self.assertEqual(proc.stderr, '')
AssertionError: 'Traceback (most recent call last):\n Fil[2698 chars] 1\n' != ''
- Traceback (most recent call last):
- File "c:\hostedtoolcache\windows\python\3.9.2\x64\lib\runpy.py", line 197, in _run_module_as_main
- return _run_code(code, main_globals, None,
- File "c:\hostedtoolcache\windows\python\3.9.2\x64\lib\runpy.py", line 87, in _run_code
- exec(code, run_globals)
- File "D:\a\pyperf\pyperf\pyperf\__main__.py", line 763, in <module>
- main()
- File "D:\a\pyperf\pyperf\pyperf\__main__.py", line 759, in main
- func()
- File "D:\a\pyperf\pyperf\pyperf\__main__.py", line 728, in cmd_bench_command
- runner.bench_command(name, command)
- File "D:\a\pyperf\pyperf\pyperf\_runner.py", line 622, in bench_command
- return self._main(task)
- File "D:\a\pyperf\pyperf\pyperf\_runner.py", line 427, in _main
- bench = self._worker(task)
- File "D:\a\pyperf\pyperf\pyperf\_runner.py", line 401, in _worker
- run = task.create_run()
- File "D:\a\pyperf\pyperf\pyperf\_worker.py", line 284, in create_run
- self.compute()
- File "D:\a\pyperf\pyperf\pyperf\_command.py", line 53, in compute
- raise RuntimeError("failed to get the process RSS")
- RuntimeError: failed to get the process RSS
- Traceback (most recent call last):
- File "c:\hostedtoolcache\windows\python\3.9.2\x64\lib\runpy.py", line 197, in _run_module_as_main
- return _run_code(code, main_globals, None,
- File "c:\hostedtoolcache\windows\python\3.9.2\x64\lib\runpy.py", line 87, in _run_code
- exec(code, run_globals)
- File "D:\a\pyperf\pyperf\pyperf\__main__.py", line 763, in <module>
- main()
- File "D:\a\pyperf\pyperf\pyperf\__main__.py", line 759, in main
- func()
- File "D:\a\pyperf\pyperf\pyperf\__main__.py", line 728, in cmd_bench_command
- runner.bench_command(name, command)
- File "D:\a\pyperf\pyperf\pyperf\_runner.py", line 622, in bench_command
- return self._main(task)
- File "D:\a\pyperf\pyperf\pyperf\_runner.py", line 432, in _main
- bench = self._manager()
- File "D:\a\pyperf\pyperf\pyperf\_runner.py", line 560, in _manager
- bench = Manager(self).create_bench()
- File "D:\a\pyperf\pyperf\pyperf\_manager.py", line 229, in create_bench
- worker_bench, run = self.create_worker_bench()
- File "D:\a\pyperf\pyperf\pyperf\_manager.py", line 128, in create_worker_bench
- suite = self.create_suite()
- File "D:\a\pyperf\pyperf\pyperf\_manager.py", line 122, in create_suite
- suite = self.spawn_worker(0, 0)
- File "D:\a\pyperf\pyperf\pyperf\_manager.py", line 104, in spawn_worker
- raise RuntimeError("%s failed with exit code %s"
- RuntimeError: D:\a\pyperf\pyperf\.tox\py\Scripts\python.EXE failed with exit code 1
======================================================================
FAIL: test_python_option (test_timeit.TestTimeit)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\pyperf\pyperf\pyperf\tests\test_timeit.py", line 248, in test_python_option
self.assertEqual(cmd.returncode, 0, repr(cmd.stdout + cmd.stderr))
AssertionError: 1 != 0 : 'Error when running timeit benchmark:\n\nStatement:\n\'time.sleep(1e-6)\'\n\nSetup:\n\'import time\'\n\nNo pyvenv.cfg file\nTraceback (most recent call last):\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_timeit.py", line 230, in bench_timeit\n runner.bench_time_func(name, timer.time_func, **kwargs)\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_runner.py", line 462, in bench_time_func\n return self._main(task)\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_runner.py", line 432, in _main\n bench = self._manager()\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_runner.py", line 560, in _manager\n bench = Manager(self).create_bench()\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_manager.py", line 229, in create_bench\n worker_bench, run = self.create_worker_bench()\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_manager.py", line 128, in create_worker_bench\n suite = self.create_suite()\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_manager.py", line 122, in create_suite\n suite = self.spawn_worker(0, 0)\n File "D:\\a\\pyperf\\pyperf\\pyperf\\_manager.py", line 104, in spawn_worker\n raise RuntimeError("%s failed with exit code %s"\nRuntimeError: C:\\Users\\runneradmin\\AppData\\Local\\Temp\\tmpsstomtue failed with exit code 106\n'
======================================================================
FAIL: test_worker_verbose (test_timeit.TestTimeit)
----------------------------------------------------------------------
Traceback (most recent call last):
File "D:\a\pyperf\pyperf\pyperf\tests\test_timeit.py", line 107, in test_worker_verbose
self.assertIsNotNone(match, repr(cmd.stdout))
AssertionError: unexpectedly None : "Warmup 1: 337 us (loops: 1, raw: 337 us)\n\nValue 1: 1.86 ms\nValue 2: 853 us\n\nMetadata:\n- boot_time: 2021-03-16 06:40:42.876194\n- cpu_count: 2\n- date: 2021-03-16 06:46:56.876194\n- duration: 16.0 ms\n- hostname: fv-az68-167\n- loops: 1\n- mem_peak_pagefile_usage: 9988.0 kB\n- name: timeit\n- perf_version: 2.1.1\n- platform: Windows-10-10.0.17763-SP0\n- python_compiler: MSC v.1928 64 bit (AMD64)\n- python_executable: D:\\a\\pyperf\\pyperf\\.tox\\py\\Scripts\\python.EXE\n- python_hash_seed: 337\n- python_implementation: cpython\n- python_version: 3.9.2 (64-bit) revision 1a79785\n- timeit_setup: 'import time'\n- timeit_stmt: 'time.sleep(1e-3)'\n- timer: QueryPerformanceCounter(), resolution: 100 ns\n- unit: second\n- uptime: 6 min 14.0 sec\n\nMean +- std dev: 1.36 ms +- 0.71 ms\n"
----------------------------------------------------------------------
Ran 165 tests in 18.766s
FAILED (failures=3)
ERROR: InvocationError for command 'D:\a\pyperf\pyperf\.tox\py\Scripts\python.EXE' -bb -Wd -m unittest discover -s pyperf/tests/ -v (exited with code 1)
___________________________________ summary ___________________________________
ERROR: py: commands failed
Good day.
This is more a question, than an issue.
So, let's say I want to benchmark multiple implementations (e.g. https://github.com/gagoman/atomicl/blob/master/benchmarks.py). Is a custom CLI option the best way to avoid duplicating code or writing for loops? For example:
python -m perf benchmarks.py --impl atomicl._py
Should we have something like benchmark parameters and benchmark context?
perf.Runner(params=dict(impl=[first, second, third]))
and benchmark itself:
def bench(ctx):
    ctx.impl().do()
In case if there are multiple params, context would hold all the permutations.
The idea is stolen from JMH. If you would like the idea, I can try to describe it more clearly and implement it.
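For the record, a hedged sketch of the current workaround with a custom CLI option; the add_cmdline_args hook forwards the option to the worker processes (do_something is a placeholder, not a real atomicl function):

import importlib
import pyperf

def add_cmdline_args(cmd, args):
    # make sure the worker processes see the same --impl value
    cmd.extend(('--impl', args.impl))

runner = pyperf.Runner(add_cmdline_args=add_cmdline_args)
runner.argparser.add_argument('--impl', default='atomicl._py')
args = runner.parse_args()

impl = importlib.import_module(args.impl)
runner.bench_func(args.impl, impl.do_something)  # do_something is illustrative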
Error when running "pyperf system show":
File "/usr/lib64/python2.7/site-packages/perf/_system.py", line 146, in read_msr
data = os.pread(fd, size, reg_num)
AttributeError: 'module' object has no attribute 'pread'
Do you have any preference for how I should cite perf? I'm using it to write a paper where I will be benchmarking my library against competitors.
I'm happy to just cite the github repo but perhaps there's something else I should be citing.
I propose adding a cleanup stage for timeit. The reason is that today, when I run perf to benchmark a program, the program generates a cache file the first time and then always reuses it. I wish there were a separate cleanup stage, just like the setup stage, to clean it up.
In _utils.py the create_environ()
helper produces a dict that can be passed as the "env" arg to subprocess.run()
, etc. It is used in Master.spawn_worker()
(in _master.py). Currently it takes an "inherit_environ" arg (Bool) that corresponds to the "--inherit-environ" flag in the Runner
CLI (see Runner.init() in _runner.py). This results in a potentially problematic situation.
Let's say you have a benchmark script that internally relies on some environment variable that is defined relative to the commandline args given to the script. This environment variable may be set already or it might not. Regardless, you will be setting it to some new value. To make this work you need to do something like the following:
# This is a concrete example.
os.environ["MY_VAR"] = "spam" if runner.args.my_flag else "eggs"
if runner.args.inherit_environ is None:
    runner.args.inherit_environ = ["MY_VAR"]
else:
    runner.args.inherit_environ.append("MY_VAR")
However, in some cases you can't leave the env var set (or maybe the env var could cause pyperf to break). Plus things are more complicated if you have more than one such env var.
Consequently, in a benchmark script it would be nice to be able to give the actual env var pairs to Runner
rather than doing the dance above. Here are some possible approaches to solve the problem:
1. Change Runner.args.inherit_environ to be a dict; create_environ() would be updated to do the right thing.
2. Add Runner.env_vars to allow benchmarks to explicitly set env var values to be used in workers; create_environ() would grow a new "env" arg, or "inherit_environ" would be updated as above.
3. Add Runner.add_env_var(name, value=INHERIT) (a hypothetical usage is sketched below).
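A hypothetical usage of the third option (add_env_var is not an existing pyperf API, just the proposal above):

runner = pyperf.Runner()
# value provided explicitly, so workers would get MY_VAR without --inherit-environ
runner.add_env_var("MY_VAR", "spam" if runner.args.my_flag else "eggs")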
pstats has a "shell mode" to conveniently run multiple analyses on the same profile. I think it'd be useful for perf given the various analysis sub-commands.
Hi, is there any way to handle the output of a function while it is being benchmarked by runner.bench_func
?
Something like:
benchmark, output = runner.bench_func("Test", func)
or
benchmark = runner.bench_func("Test", func)
output = benchmark.get_output()
I understand this assumes the function is deterministic but having the option would be nice.
Thanks!
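A hedged workaround sketch: wrap the function and stash its return value yourself, since bench_func() only returns a Benchmark. Note that the wrapped function runs in worker subprocesses, so the captured value is only visible there; getting it back to the parent process would need something like writing it to a file.

captured = {}

def wrapped():
    captured['Test'] = func()
    return captured['Test']

benchmark = runner.bench_func("Test", wrapped)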
I just started using perf again after a while and I'm suddenly getting flooded with output that I don't remember seeing before. Just running python3 -m perf timeit "int('3')" or, in fact, any command after timeit gives 100s of lines of output of the sort:
.....................
(2.081039619447611e-07, 2.0966166496341754e-07, 2.0834495735241876e-07, 2.0798590087862945e-07, 2.0800072097690303e-07, 2.095399551389071e-07, 2.0929106521461183e-07, 2.0892218017577735e-07, 2.0930357360823826e-07, 2.0832581901567004e-07, 2.0800754356280204e-07, 2.0977499198826521e-07, 2.091219863890187e-07, 2.0929869270353008e-07, 2.0946509552040304e-07, 2.0924766921948112e-07, 2.0950784873977057e-07, 2.0922689056464272e-07, 2.0953447914077994e-07, 2.1273634529146712e-07, 2.138244304655118e-07, 2.2082462883092624e-07, 2.1117857170079024e-07, 2.4641811561566807e-07, 2.0901058578499943e-07, 2.0829808425847085e-07, 2.1190711975080379e-07, 2.0787903594852997e-07, 2.0788304519632483e-07, 2.1075399589572108e-07, 2.0771297073313155e-07, 2.0974149704013068e-07, 2.0810379028346482e-07, 2.1122927856484508e-07, 2.1184817314279236e-07, 2.113853111267161e-07, 2.103219852445104e-07, 2.0893610191280443e-07, 2.0859032440201375e-07, 2.0787958335787005e-07, 2.0793043708881853e-07, 2.0974521255413825e-07, 2.099091072071957e-07, 2.0979626083432457e-07, 2.1052976417572367e-07, 2.575373821249449e-07, 2.0835235405025632e-07, 2.1008222007672106e-07, 2.4275443840163224e-07, 2.095295696270122e-07, 2.411018028263684e-07, 2.093657970429763e-07, 2.0874368476982152e-07, 2.1040027618421386e-07, 2.3591923522907343e-07, 2.1126461601309043e-07, 2.1081210899512315e-07, 2.1505261421292388e-07, 2.1558845520082415e-07, 2.157484321595876e-07) 60
{<class 'int'>} 0
{144115188075855872: 29990941615, 1: 0}
{144115188075855872: 29990941615, 1: 0, 288230376151711744: 60430860557}
{144115188075855872: 29990941615, 1: 0, 288230376151711744: 120482205984}
{144115188075855872: 29990941615, 1: 0, 288230376151711744: 120482205984, 576460752303423488: 119895708889}
{144115188075855872: 29990941615, 1: 0, 288230376151711744: 180434332031, 576460752303423488: 119895708889}
{144115188075855872: 29990941615, 1: 0, 288230376151711744: 180434332031, 576460752303423488: 240687269066}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 288230376151711744: 180434332031, 576460752303423488: 240687269066}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 288230376151711744: 180434332031, 576460752303423488: 361122706223}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 288230376151711744: 240761979782, 576460752303423488: 361122706223}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 288230376151711744: 300807808959, 576460752303423488: 361122706223}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 288230376151711744: 300807808959, 576460752303423488: 481030891270}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 576460752303423488: 481030891270, 18014398509481984: 3778970303, 288230376151711744: 300807808959}
{144115188075855872: 29990941615, 1: 0, 72057594037927936: 15081010613, 576460752303423488: 601581508867, 18014398509481984: 3778970303, 288230376151711744: 300807808959}
{144115188075855872: 60154062078, 1: 0, 72057594037927936: 15081010613, 576460752303423488: 601581508867, 18014398509481984: 3778970303, 288230376151711744: 300807808959}
{144115188075855872: 60154062078, 1: 0, 72057594037927936: 15081010613, 576460752303423488: 722329915412, 18014398509481984: 3778970303, 288230376151711744: 300807808959}
{144115188075855872: 60154062078, 1: 0, 36028797018963968: 7538941801, 72057594037927936: 15081010613, 576460752303423488: 722329915412, 18014398509481984: 3778970303, 288230376151711744: 300807808959}
{144115188075855872: 60154062078, 1: 0, 36028797018963968: 7538941801, 72057594037927936: 15081010613, 576460752303423488: 722329915412, 18014398509481984: 3778970303, 288230376151711744: 361194335008}
{144115188075855872: 60154062078, 1: 0, 36028797018963968: 7538941801, 72057594037927936: 15081010613, 576460752303423488: 842941006149, 18014398509481984: 3778970303, 288230376151711744: 361194335008}
{144115188075855872: 60154062078, 1: 0, 36028797018963968: 7538941801, 72057594037927936: 15081010613, 576460752303423488: 963729409628, 18014398509481984: 3778970303, 288230376151711744: 361194335008}
Not sure what I'm doing wrong that I wasn't doing before.
[ 14s] ____________________________ SystemTests.test_show _____________________________
[ 14s]
[ 14s] self = <pyperf.tests.test_system.SystemTests testMethod=test_show>
[ 14s]
[ 14s] def test_show(self):
[ 14s] args = [sys.executable, '-m', 'pyperf', 'system', 'show']
[ 14s] proc = get_output(args)
[ 14s]
[ 14s] regex = ('(Run "%s -m pyperf system tune" to tune the system configuration to run benchmarks'
[ 14s] '|OK! System ready for benchmarking'
[ 14s] '|WARNING: no operation available for your platform)'
[ 14s] % os.path.basename(sys.executable))
[ 14s] self.assertRegex(proc.stdout, regex, msg=proc)
[ 14s]
[ 14s] # The return code is either 0 if the system is tuned or 2 if the
[ 14s] # system isn't
[ 14s] > self.assertIn(proc.returncode, (0, 2), msg=proc)
[ 14s] E AssertionError: 1 not found in (0, 2) : ProcResult(returncode=1, stdout='Show the system configuration\n\nSystem state\n============\n\nCPU: use 8 logical CPUs: 0-7\nPerf event: Maximum sample rate: 63000 per second\nASLR: Full randomization\nLinux scheduler: No CPU is isolated\nCPU Frequency: 0-7=min=1200 MHz, max=3900 MHz\nTurbo Boost (intel_pstate): Turbo Boost enabled\nIRQ affinity: Default IRQ affinity: CPU 0-7\nIRQ affinity: IRQ affinity: IRQ 0-15,56-59,70,72-79,81-82=CPU 0-7; IRQ 16=CPU 0,4; IRQ 27,32,36,47,61=CPU 4; IRQ 28,39,43,52=CPU 1; IRQ 29,33,42,46=CPU 3; IRQ 30,34,48,63=CPU 0; IRQ 31,35,40,44,80=CPU 7; IRQ 37,51,55,60,64=CPU 6; IRQ 38,49,53,62=CPU 2; IRQ 41,45,50,54,68=CPU 5; IRQ 65=CPU 1,5; IRQ 66=CPU 3,7; IRQ 67=CPU 2,6\n\nAdvices\n=======\n\nPerf event: Set max sample rate to 1\nLinux scheduler: Use isolcpus=<cpu list> kernel parameter to isolate CPUs\nLinux scheduler: Use rcu_nocbs=<cpu list> kernel parameter (with isolcpus) to not schedule RCU on isolated CPUs\nTurbo Boost (intel_pstate): Disable Turbo Boost to get more reliable CPU frequency\n\nErrors\n======\n\nCPU scaling governor (intel_pstate): Unable to read CPU scaling governor from /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor\n\nRun "python2 -m pyperf system tune" to tune the system configuration to run benchmarks\n', stderr='')
[ 14s]
[ 14s] pyperf/tests/test_system.py:21: AssertionError
[ 14s] =============== 1 failed, 160 passed, 1 skipped in 10.19 seconds ===============
IMO, since return code 2 is already accepted as expected, return code 1 should also be added to the possible okay results.
On ARM and PPC I see failures like
[ 74s] self = <pyperf.tests.test_metadata.TestMetadata testMethod=test_collect_metadata>
[ 74s]
[ 74s] def test_collect_metadata(self):
[ 74s] metadata = perf_metadata.collect_metadata()
[ 74s]
[ 74s] for key in MANDATORY_METADATA:
[ 74s] > self.assertIn(key, metadata)
[ 74s] E AssertionError: 'cpu_model_name' not found in {'python_cflags': '-fno-strict-aliasing -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -g -DNDEBUG -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -g -DOPENSSL_LOAD_CONF -fwrapv', 'aslr': 'Full randomization', 'cpu_count': 8, 'perf_version': '1.6.1', 'python_version': '2.7.16 (64-bit)', 'python_hash_seed': 0, 'python_unicode': 'UCS-4', 'python_implementation': 'cpython', 'hostname': 'obs-arm-7', 'load_avg_1min': 0.6, 'timer': 'time.time()', 'runnable_threads': 1, 'platform': 'Linux-5.3.4-1-default-aarch64-with-glibc2.17', 'boot_time': '2019-10-12 18:55:14', 'date': '2019-10-12 18:56:02.918085', 'python_executable': '/usr/bin/python2', 'uptime': 48.918482065200806, 'mem_max_rss': 60080128, 'cpu_config': 'idle:none'}
[ 74s]
[ 74s] pyperf/tests/test_metadata.py:25: AssertionError
I have a very similar project here: https://github.com/ionelmc/pytest-benchmark
It depends on pytest of course but what if it could load data from perf
(or vice-versa)?
pytest-benchmark
storage format is in files like <run_number>_<description>.json
and each file contains multiple tests. Eg: https://github.com/ionelmc/pytest-benchmark/blob/master/tests/test_storage/0001_b87b9aae14ff14a7887a6bbaa9731b9a8760555d_20150814_190343_uncommitted-changes.json
The main differences I see compared to perf are that there are multiple tests in a single file, and that the raw data is optional (the default is to store only the computed stats).
What do you think, could there be a common format to allow interoperability?
Standard deviation doesn't have a direct relation to the median; it relates to the arithmetic mean. The arithmetic mean is the value that minimizes the squared deviation, while the median is the value that minimizes the mean absolute error. https://en.wikipedia.org/wiki/Median#Optimality_property
I think it would be more natural to report the mean absolute error around the median rather than the standard deviation.
I stumpled upon an explanation of the differences between timeit and pyperf here: https://pyperf.readthedocs.io/en/latest/cli.html#timeit-versus-pyperf-timeit
I am wondering: why is spawning a new process to benchmark better?
Edit: I assume this has something to do with it? https://vstinner.github.io/journey-to-stable-benchmark-average.html
I think that to solve the warmup issue, you may need to add the following options when testing on pypy, e.g.:
pypy --jit threshold=1,function_threshold=1
You can find more details by checking the output of pypy --jit help
, by default the JIT will only start working after 1039 loops.
That's why I was suggesting the title of this issue; these command line options would vary by JITted Python implementation, e.g.:
- Jython: jython -J-Xcomp (you'll see that it takes its sweet time starting compared to a regular jython). -Xcomp: Forces compilation of methods on first invocation. By default, the Client VM (-client) performs 1,000 interpreted method invocations and the Server VM (-server) performs 10,000 interpreted method invocations to gather information for efficient compilation. Specifying the -Xcomp option disables interpreted method invocations to increase compilation performance at the expense of efficiency.
- Pyston: -O (you can see it in action with pyston -O -v). -O: Force Pyston to always run at the highest compilation tier. This doesn't always produce the fastest running time due to the lack of type recording from lower compilation tiers, but similar to -n can help test the code generator.
- -X:CompilationThreshold 1. -X:CompilationThreshold: The number of iterations before the interpreter starts compiling.
A Perf Enhancement Proposal: I stumbled across your project on the recommendation of pypy -mtimeit .... Viewing the README on PyPI left me a bit underwhelmed about trying it out; the bulleted list of features is a bit dry.
Would you be interested in a PR with a new README, one written from the perspective of a new user? Naturally you would have final approval, and if you decided not to use it I would not be offended.
I wanted to run the idea by you first, before submitting something out of the blue.
It might be useful (and fancy :) to add a flag to perf.timeit to make it output a histogram in addition to the median and stddev.
I know --json-file can be used to get this, but doing it that way includes extra steps and is clunky. The UX would be better if users could just ask for the histogram in one command.
If it was up to me, I'd display the histogram by default in perf.timeit.
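For reference, the clunky two-step flow today looks roughly like this (from memory of the CLI, so the flag spellings may differ by version):
$ python3 -m perf timeit --json-file=bench.json "int('3')"
$ python3 -m perf hist bench.json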
When comparing two results, it would be helpful to output the probability of one result being faster than the other.
If times1
and times2
are sets of measured times, then the probability of the first benchmark being faster than the second one is estimated as:
sum(x < y for x in times1 for y in times2)/len(times1)/len(times2)
Actually, you can sort one of the sets and use binary search as an optimization.
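A small sketch of that estimator with the suggested optimization (sort one list once, then count with bisect):

from bisect import bisect_right

def prob_first_faster(times1, times2):
    times2 = sorted(times2)
    n2 = len(times2)
    # for each x in times1, count the values in times2 that are strictly larger
    wins = sum(n2 - bisect_right(times2, x) for x in times1)
    return wins / (len(times1) * n2)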
Good day.
As far as I can see, max_time of Runner is not used.
First I was going to implement it, but I found out that there is also a commit da7ede7 which notes the attribute as removed.
Are there any plans for that?
Best regards.
I'm confused about why this script crashes:
import perf

setup = {
    'run1': 'data = [1, 2, 3]',
    'run2': 'data = [4, 5, 6]',
}

runner = perf.Runner()
for package in setup:
    print(package)
    runner.timeit(
        name=package,
        stmt="data[1]",
        setup=setup[package],
    )
For me, I get the following output:
run2
run2
run1
.run1
run2
Error when running timeit benchmark:
Statement:
'data[1]'
Setup:
'data = [4, 5, 6]'
Traceback (most recent call last):
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_timeit.py", line 203, in bench_timeit
runner.bench_time_func(name, timer.time_func, **kwargs)
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_runner.py", line 458, in bench_time_func
return self._main(task)
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_runner.py", line 428, in _main
bench = self._master()
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_runner.py", line 551, in _master
bench = Master(self).create_bench()
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_master.py", line 221, in create_bench
worker_bench, run = self.create_worker_bench()
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_master.py", line 135, in create_worker_bench
self.bench.add_runs(worker_bench)
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_bench.py", line 589, in add_runs
self.add_run(run)
File "/home/goldbaum/.virtualenvs/yt-dev/lib/python3.4/site-packages/perf/_bench.py", line 449, in add_run
% (key, value, run_value))
ValueError: incompatible benchmark, metadata name is different: current=run2, run=run1
I gather that runner.timeit
doesn't block until the tests it's running are done? That was surprising to me. Is there any way I can add a synchronization barrier between benchmark runs? Or is there really no way to loop over a set of benchmarks?
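If I understand the error, perf re-runs this whole script in each worker subprocess, and every process must execute the benchmarks in the same order as the parent; with a plain dict on Python 3.4 the iteration order is not guaranteed to match across processes, so a worker ends up producing a run named run1 where the parent expected run2. A hedged sketch of a workaround using a fixed iteration order:

import perf

setup = [
    ('run1', 'data = [1, 2, 3]'),
    ('run2', 'data = [4, 5, 6]'),
]

runner = perf.Runner()
for package, package_setup in setup:
    runner.timeit(name=package, stmt="data[1]", setup=package_setup)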
When pyperf.Runner.bench_func is called from a script that uses click to handle arguments, it raises an exception in pyperf's argument parsing logic. Here is a minimal reproducing script:
import click
import pyperf

@click.command()
def cli():
    def f():
        pass
    runner = pyperf.Runner()
    runner.bench_func("Example", f)

if __name__ == "__main__":
    cli()
Here's a requirements.txt, pinned to the versions I have tried this with:
click==7.1.1
pyperf==2.0.0
The error message is:
$ python my_benchmark.py
Usage: my_benchmark.py [OPTIONS]
Try 'my_benchmark.py --help' for help.
Error: no such option: --worker
I believe that the issue is that pyperf.Runner
is directly using argparse
to add its own command line arguments, which is not really the behavior I would expect from a library.
It might be a lot to ask, but I think a better option would be to refactor this into a config object that can also be constructed automatically from the parser, something like this:
import argparse

import attr

@attr.s(auto_attribs=True)
class RunnerConfig:
    verbose: bool = False
    quiet: bool = False
    pipe: int = None
    ...

    @classmethod
    def from_argparse(cls, argparser=None):
        if argparser is None:
            argparser = argparse.ArgumentParser()
            argparser.description = "Benchmark"
            ...
        args = argparser.parse_args()
        return cls(verbose=args.verbose, quiet=args.quiet, pipe=args.pipe, ...)
To avoid backwards incompatibility issues, you could add a flag to Runner like use_argparse=True, which users of click could set to False to avoid this problem.
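In the meantime, a hedged workaround sketch: tell click to pass unknown options through instead of rejecting them, so pyperf's own parser can still find --worker and friends in sys.argv (not verified against every pyperf option):

import click
import pyperf

@click.command(context_settings=dict(ignore_unknown_options=True,
                                     allow_extra_args=True))
def cli():
    def f():
        pass
    runner = pyperf.Runner()
    runner.bench_func("Example", f)

if __name__ == "__main__":
    cli()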
Using PyPy 5.3.1:
> pyperf timeit ""
Traceback (most recent call last):
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/_timeit.py", line 71, in main
runner.bench_sample_func(sample_func, timer)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 719, in bench_sample_func
return self._main(wrap_sample_func)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 696, in _main
return self._worker(bench, sample_func)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 634, in _worker
raw_sample = sample_func(loops)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 717, in wrap_sample_func
return sample_func(loops, *args)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/_timeit.py", line 55, in sample_func
return timer.inner(it, timer.timer)
AttributeError: Timer instance has no attribute 'inner'
Traceback (most recent call last):
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/_timeit.py", line 71, in main
runner.bench_sample_func(sample_func, timer)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 719, in bench_sample_func
return self._main(wrap_sample_func)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 698, in _main
return self._spawn_workers(bench)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 830, in _spawn_workers
run_suite = self._spawn_worker()
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 783, in _spawn_worker
return _bench_suite_from_subprocess(cmd)
File "/home/tin/pg/attrs/.venv-pypy/site-packages/perf/text_runner.py", line 46, in _bench_suite_from_subprocess
% (args[0], proc.returncode))
RuntimeError: /home/tin/pg/attrs/.venv-pypy/bin/pypy failed with exit code 1
Can anything be done?
I think it would be nice to include this as a complement to the existing histogram out of the box, as a KDE sidesteps the issues with selecting a correct bucket size for the histogram. I'm not really a stats person so I can't explain this very well, but Wikipedia has more info and you can see an example of this in use in the "criterion" benchmarking tool for Haskell: http://www.serpentine.com/criterion/fibber.html
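A hedged sketch of what that could look like on top of a stored result (assuming SciPy is acceptable on the analysis side; pyperf itself would probably want something dependency-free):

import numpy as np
from scipy.stats import gaussian_kde
import pyperf

values = np.asarray(pyperf.Benchmark.load('bench.json').get_values())
kde = gaussian_kde(values)
xs = np.linspace(values.min(), values.max(), 200)
density = kde(xs)  # smooth estimate of the timing distribution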
https://pypi.org/project/pyperf/#history shows that a different project replaced this one in Dec 2018. The documentation for pyperf still points to this link and the pip installation instructions install the other project. (I found this out after installing it by accident.)
Has this project been abandoned? Deleted from PyPI? Is this a security issue?
I've done a little experiment: I ran timeit.repeat 1000 times on a really simple statement. After that, I sorted the list and grouped the results into time slices, counting the number of times a measurement falls inside a certain time slice. This way I get a sort of probability. Then I plotted the result:
https://www.dropbox.com/s/f4naryc055k42cs/2020-07-24-080340_1366x768_scrot.png?dl=0
The result is quite interesting. As you can see, the distribution is not a normal distribution; it looks like a multimodal (mixture of normals) distribution. Furthermore, the highest probability is near the minimum value, and not near the average (which sits on the second peak).
Mean and stdev are only meaningful if the distribution is a normal distribution. It seems the only way to get something like a normal distribution is to consider only the first part of the plot.
Since pyperf returns mean and stdev, does it "cut" the slowest runs?
There are two approaches to compute a geometric mean:
scipy uses log+exp: https://github.com/scipy/scipy/blob/bf3ea59982a552e65f0388d4dd17b256b962adbb/scipy/stats/stats.py#L349-L411
The geometric mean should help to compute N benchmarks with a single value.
For example, https://speed.pypy.org/ says "The geometric average of all benchmarks is 0.23 or 4.3 times faster than cpython."
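A minimal stdlib sketch of the log+exp approach:

import math

def geometric_mean(values):
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))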