Comments (3)
I think I've misunderstood what the printed GFLOP value is meant to reflect this whole time. It's not the expected GFLOP/s for my CPU, but rather the number of GFLOP performed in each trial. Closing the issue with this comment.
from peakperf.
Let me start by calculating your theoretical peak performance (please correct me if I'm wrong):
PP (SP) = 8 (cores) * 4.0 (Approx. Freq in GHz) * 2 (has FMA) * 8 (AVX2) * 2 (2 x AVX2 Units) = 1024 GFLOP/s
However, you seem to get 2048 GFLOP/s in some cases with 16 threads while you get around 1024 with 8 threads. There are only two possible explanations:
- My maths are wrong.
- There is a bug in peakperf.
Can you please paste the complete output of peakperf run with no arguments (which will use 16 threads)?
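The peak estimate above is a plain product of per-core factors, so it can be reproduced in a few lines of Python. This is only a sketch: the function and parameter names are made up for illustration and are not part of peakperf, and the 4.0 GHz clock is the approximate figure used above.

```python
def theoretical_peak_gflops(cores, freq_ghz, fma, simd_lanes, simd_units):
    """Peak GFLOP/s as a product of per-core factors (illustrative helper, not peakperf code)."""
    flops_per_op = 2 if fma else 1  # an FMA counts as one multiply plus one add
    return cores * freq_ghz * flops_per_op * simd_lanes * simd_units


# Ryzen 7 5800X, single precision: 8 cores, ~4.0 GHz, FMA, 8 lanes per AVX2 register, 2 units
peak_sp = theoretical_peak_gflops(cores=8, freq_ghz=4.0, fma=True, simd_lanes=8, simd_units=2)
print(peak_sp)  # 1024.0
```

The thread count does not appear in this product: the estimate depends on physical cores, so running 16 threads on 8 cores would not be expected to raise it.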
Here:
------------------------------------------------------
peakperf (https://github.com/Dr-Noob/peakperf)
------------------------------------------------------
CPU: AMD Ryzen 7 5800X 8-Core Processor
Microarch: Zen 3
Benchmark: Zen 3 (AVX2)
Iterations: 1.00e+09
GFLOP: 2048.00
Threads: 16
Nº   Time(s)   GFLOP/s
 1   1.69877   1205.58 *
 2   1.72237   1189.06 *
 3   1.76517   1160.23
 4   1.71196   1196.29
 5   1.73344   1181.47
 6   1.71466   1194.41
 7   1.72736   1185.62
 8   1.71485   1194.28
 9   1.71440   1194.59
10   1.72596   1186.59
11   1.72048   1190.36
12   1.71808   1192.03
------------------------------------------------------
Average performance: 1187.50 +- 10.17 GFLOP/s
------------------------------------------------------
* - warm-up, not included in average
I've checked the code myself and added logs, and it looks like you are using n_threads = 16 and op_per_it = 8. When I was looking into this yesterday, before I submitted the issue, I assumed op_per_it/frequency was doubled to account for the 2 units per core, and didn't realize that this was already being accounted for by the doubled thread count. I don't know whether I missed this or noticed it and then quickly forgot, but it looks like that's the issue.
Edit: For hyperthreaded CPUs in general, should compute_gflops use max(n_threads, n_cores)?
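The figures in the output above can be cross-checked by hand: each GFLOP/s entry is the fixed GFLOP total divided by that trial's time. The sketch below uses values copied from the output. The breakdown of 2048 into threads, iterations, op_per_it, 8 single-precision lanes and 2 FLOPs per FMA is an assumption that happens to reproduce the printed figure; it is not taken from peakperf's compute_gflops, and the helper name is hypothetical.

```python
# Values copied from the peakperf output above.
GFLOP_PER_TRIAL = 2048.00
trials = [  # (time in seconds, reported GFLOP/s)
    (1.69877, 1205.58),
    (1.72237, 1189.06),
    (1.76517, 1160.23),
]


def assumed_gflop(n_threads, iterations, op_per_it, lanes=8, flops_per_fma=2):
    """Hypothetical breakdown of the work per trial; lanes and flops_per_fma are assumptions."""
    return n_threads * iterations * op_per_it * lanes * flops_per_fma / 1e9


work = assumed_gflop(n_threads=16, iterations=1e9, op_per_it=8)
print(work)  # 2048.0

for seconds, reported in trials:
    rate = GFLOP_PER_TRIAL / seconds  # work done divided by elapsed time
    print(f"{seconds:.5f} s -> {rate:.2f} GFLOP/s (reported {reported:.2f})")
```

Under the same assumption, 8 threads would give 1024 GFLOP per trial, which shows why the printed GFLOP value scales with the thread count while the measured GFLOP/s does not.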