Enabling profile-guided optimization will provide some numbers that are the best they

Enable PGO for benchmarks

djkoloski commented on July 21, 2024

Enable PGO for benchmarks

from rust_serialization_benchmark.

Comments (3)

zamazan4ik commented on July 21, 2024

I do ongoing PGO research on different applications; all results are available at https://github.com/zamazan4ik/awesome-pgo . I ran some PGO benchmarks on rust_serialization_benchmark too and want to share my results here.

Test environment

Fedora 39
Linux kernel 6.7.6
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler: rustc 1.76
Repo version: master branch on commit ce821970017832f43d00bc5110462ff1a2c38e17
Turbo Boost disabled to improve consistency across runs

Benchmark

Release benchmarks are done with taskset -c 0 cargo +nightly bench, the PGO training phase with taskset -c 0 cargo +nightly pgo bench, and the PGO optimization phase with taskset -c 0 cargo +nightly pgo optimize bench. taskset -c 0 pins the run to a single CPU core for better benchmark consistency. All PGO-related routines are done with cargo-pgo. All benchmarks are done on the same machine, with the same hardware/software during runs, and with the same background "noise" (as much as I can guarantee, of course).
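The three phases above can be sketched as a small Rust driver. This is only an illustration, not part of the original setup: the `PHASES` table and the `phase_args` helper are hypothetical names, and the program assumes that `taskset`, a nightly toolchain, and cargo-pgo are installed. By default it only prints the commands; passing `--run` executes them.

```rust
use std::process::Command;

/// The three benchmark phases from the comment above, as cargo arguments.
/// (Hypothetical table; the labels are made up for this sketch.)
const PHASES: [(&str, &[&str]); 3] = [
    ("release", &["bench"]),
    ("pgo-instrumented", &["pgo", "bench"]),
    ("pgo-optimized", &["pgo", "optimize", "bench"]),
];

/// Builds the argument list passed to `taskset` so that
/// `cargo +nightly <cargo_args>` runs pinned to one CPU core.
fn phase_args(core: u32, cargo_args: &[&str]) -> Vec<String> {
    let mut args = vec![
        "-c".to_string(),
        core.to_string(),
        "cargo".to_string(),
        "+nightly".to_string(),
    ];
    args.extend(cargo_args.iter().map(|s| s.to_string()));
    args
}

fn main() {
    // Dry run unless `--run` is given on the command line.
    let run = std::env::args().any(|a| a == "--run");
    for (name, cargo_args) in PHASES {
        let args = phase_args(0, cargo_args);
        println!("[{}] taskset {}", name, args.join(" "));
        if run {
            let status = Command::new("taskset")
                .args(&args)
                .status()
                .expect("failed to launch taskset");
            assert!(status.success(), "phase {} failed", name);
        }
    }
}
```

The order matters: the instrumented run has to finish before the optimized build, because the optimized build consumes the profiles collected during it.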

Results

Here are the results:

Release: https://gist.github.com/zamazan4ik/292d71316488e3e09bcb56f4e499f42e
PGO-optimized: https://gist.github.com/zamazan4ik/5cfb7a8acbe37ca3e876ba1a2468f5d5
(just for reference) PGO instrumented: https://gist.github.com/zamazan4ik/84ff553619aae1faab7a40fa33944d86

At least in the benchmarks provided by the project, there are measurable improvements in many cases. However, there are also some regressions.

Look at autofdo with perf record.

I recommend starting with regular PGO via instrumentation. AutoFDO is used for the sampling-based PGO approach. Starting with instrumentation is generally a better idea since it has wider platform support and can be enabled for the project more easily than sampling-based PGO.

finnbear commented on July 21, 2024

Great job presenting the results of PGO! In general, it seems like PGO increases average performance by ~5% but introduces noise: not necessarily from run to run, but from benchmark to benchmark and version to version. It might make more sense to average PGO results over all datasets (and just live with the fact that there are only 4, so the average isn't totally immune to noise). We could also average over all crates and just report a single PGO result. The goal is to give users an accurate answer to two questions: 1) which crate to use and 2) whether to try PGO.
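One way the averaging idea could look in code is sketched below. The thread does not say which kind of average to use; this sketch assumes a geometric mean of per-dataset speedups (release time divided by PGO time), which is a common choice for ratios. The `Sample` type, the `mean_speedup` function, and all numbers are hypothetical placeholders, not measurements from the gists above.

```rust
/// Timing of one benchmark on one dataset, before and after PGO.
/// (Hypothetical type for this sketch.)
struct Sample {
    release_ns: f64,
    pgo_ns: f64,
}

/// Geometric mean of the per-dataset speedups (release time / PGO time).
/// A value above 1.0 means PGO was faster on average.
/// Returns `None` for an empty slice.
fn mean_speedup(samples: &[Sample]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    // Averaging in log space is equivalent to taking the n-th root of the product.
    let log_sum: f64 = samples
        .iter()
        .map(|s| (s.release_ns / s.pgo_ns).ln())
        .sum();
    Some((log_sum / samples.len() as f64).exp())
}

fn main() {
    // Placeholder values for four datasets; NOT real benchmark results.
    let samples = [
        Sample { release_ns: 1000.0, pgo_ns: 900.0 },
        Sample { release_ns: 2000.0, pgo_ns: 2100.0 },
        Sample { release_ns: 1500.0, pgo_ns: 1400.0 },
        Sample { release_ns: 800.0, pgo_ns: 760.0 },
    ];
    if let Some(speedup) = mean_speedup(&samples) {
        println!("mean PGO speedup over {} datasets: {:.3}x", samples.len(), speedup);
    }
}
```

With a geometric mean, a 25% speedup on one dataset and a matching slowdown on another cancel out exactly, so a single outlier dataset does not dominate the reported number.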

djkoloski commented on July 21, 2024

Look at autofdo with perf record.
