tapplencourt / mkl-verbose-toolkit
Tools to run and parse MKL verbose mode
Right now we store everything in memory.
For large systems this is problematic.
We should be able to 'stream' the input and keep only the first aggregated data.
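One way the streaming idea could look is sketched below. It is not the toolkit's actual code: the regular expression, the unit table, and the name `stream_aggregate` are assumptions made for illustration, based only on the sample lines shown on this page (which use `us` and `s`; `ns` and `ms` are assumed). Each line is consumed once and only a per-function count and total time are kept.

```python
import re
from collections import defaultdict

# Assumed time units; the samples on this page only show "us" and "s".
_UNIT_TO_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

# Hypothetical pattern for a call line: "MKL_VERBOSE NAME(args) 12.3us ..."
_CALL_RE = re.compile(
    r"^MKL_VERBOSE\s+([\w:]+)\((.*)\)\s+([\d.]+)(ns|us|ms|s)\b"
)


def stream_aggregate(lines):
    """Consume an iterable of lines once, keeping only per-function totals."""
    totals = defaultdict(lambda: [0, 0.0])  # name -> [call count, seconds]
    for line in lines:
        match = _CALL_RE.match(line)
        if match is None:
            continue  # version header or a line this sketch does not know
        name, _args, value, unit = match.groups()
        entry = totals[name]
        entry[0] += 1
        entry[1] += float(value) * _UNIT_TO_SECONDS[unit]
    return {name: (count, seconds) for name, (count, seconds) in totals.items()}
```

Because a file object yields its lines lazily, `with open("output") as fh: totals = stream_aggregate(fh)` would keep memory use proportional to the number of distinct functions rather than to the number of log lines.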
MKL_VERBOSE Intel(R) MKL 2019.0 Update 2 Product build 20190118 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Lnx 2.40GHz intel_thread
MKL_VERBOSE FFT(dcfi30x30x30,tLim:4,desc:0x1c9c140) 452.82us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcbi30x30x30,tLim:4,desc:0x1cb66c0) 118.36us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcfi30x30x30,tLim:4,desc:0x1cc0240) 165.08us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcbi30x30x30,tLim:4,desc:0x1cc9680) 84.77us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcfi30x30x30,tLim:4,desc:0x1cc0240) 102.58us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcbi30x30x30,tLim:4,desc:0x1cc9680) 83.00us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcfi30x30x30,tLim:4,desc:0x1cc0240) 102.86us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcbi30x30x30,tLim:4,desc:0x1cc9680) 81.79us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcfi30x30x30,tLim:4,desc:0x1cc0240) 99.21us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcbi30x30x30,tLim:4,desc:0x1cc9680) 83.10us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
MKL_VERBOSE FFT(dcfi30x30x30,tLim:4,desc:0x1cc0240) 99.13us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
python mkl_parse.py output
--LAPACK / Accumulation--
Function N Time (s) Time (%)
---------- ---- ---------- ----------
FFT 6240 0.610344 100%
--LAPACK / Function arguments--
Function Arg Min Max
---------- ----- ----- -----
--LAPACK / Function call by cummulative time--
Function Args Count Time (s) Time (%)
---------- ------ ------- ---------- ----------
FFT [] 6240 0.610344 100%
Misc 0 0%
Perhaps some of the formats have been updated.
The oneMKL GPU output has a different signature than the classic MKL CPU output, and it looks like mkl_parse.py is missing the newer functions. For example, the output of running with MKL_VERBOSE=1 with oneMKL for a GEMM call is:
MKL_VERBOSE oneMKL 2022.0 Update 2 Product build 20220404 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) with support of Intel(R) Deep Learning Boost (Intel(R) DL Boost), EVEX-encoded AES and Carry-Less Multiplication Quadword instructions, Lnx 2.40GHz ilp64 sequential
MKL_VERBOSE Detected GPU0 Intel(R)_Xe-HP Backend:Level_Zero VE:960 Stack:2 maxWGsize:1024
MKL_VERBOSE Detected GPU1 Intel(R)_Xe-HP Backend:Level_Zero VE:960 Stack:2 maxWGsize:1024
MKL_VERBOSE oneapi::mkl::blas::dgemm(0x7fffe44a77d0,ColumnMajor,NonTranspose,NonTranspose,2400,600,1200,1,0x7fffe44a7710,2400,0x7fffe44a7780,1200,0,0x7fffe44a7750,2400,0,0,0) 0.00s GPU0
Then, if we try to parse it, we get:
> ./mkl_parse.py output
0 line [00:00, ? line/s]ERROR:root:Cannot parse line 0: MKL_VERBOSE oneMKL 2022.0 Update 2 Product build 20220404 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) with support of Intel(R) Deep Learning Boost (Intel(R) DL Boost), EVEX-encoded AES and Carry-Less Multiplication Quadword instructions, Lnx 2.40GHz ilp64 sequential.
ERROR:root:Cannot parse line 1: MKL_VERBOSE Detected GPU0 Intel(R)_Xe-HP Backend:Level_Zero VE:960 Stack:2 maxWGsize:1024.
ERROR:root:Cannot parse line 2: MKL_VERBOSE Detected GPU1 Intel(R)_Xe-HP Backend:Level_Zero VE:960 Stack:2 maxWGsize:1024.
5 line [00:00, 15592.21 line/s]
~= SUMMARY ~=
Count (#) Time (s)
------------- ----------- ----------
BLAS / LAPACK 0 0
FFT 0 0
(The classic MKL parsing still works fine, though!)
Thanks!
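A possible direction for handling the oneMKL lines quoted above is sketched here. This is only an illustration under stated assumptions, not how mkl_parse.py works internally: the function `classify_line`, both regular expressions, and the returned field names are hypothetical, and the patterns are derived solely from the four sample lines in this report. The idea is to recognize the `Detected GPU` lines and the version header explicitly, and to accept `::`-qualified function names with a trailing device tag.

```python
import re

# Assumed time units; the oneMKL sample above only shows "s".
_UNIT_TO_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

# Hypothetical patterns, built from the sample lines in this report.
_DEVICE_RE = re.compile(r"^MKL_VERBOSE Detected (GPU\d+) (\S+)")
_CALL_RE = re.compile(
    r"^MKL_VERBOSE\s+(?P<name>[\w:]+)\((?P<args>.*)\)\s+"
    r"(?P<value>[\d.]+)(?P<unit>ns|us|ms|s)\b\s*(?P<trailer>.*)$"
)


def classify_line(line):
    """Return a (kind, fields) pair for one MKL_VERBOSE line."""
    line = line.strip()
    match = _DEVICE_RE.match(line)
    if match:
        return "device", {"id": match.group(1), "model": match.group(2)}
    match = _CALL_RE.match(line)
    if match:
        full_name = match.group("name")
        seconds = float(match.group("value")) * _UNIT_TO_SECONDS[match.group("unit")]
        return "call", {
            # "oneapi::mkl::blas::dgemm" -> "dgemm"; "FFT" stays "FFT"
            "function": full_name.rsplit("::", 1)[-1],
            "namespace": full_name.rpartition("::")[0],
            "args": match.group("args").split(","),
            "seconds": seconds,
            # "GPU0" for the oneMKL sample, "CNR:OFF Dyn:1 ..." for classic MKL
            "trailer": match.group("trailer"),
        }
    if line.startswith("MKL_VERBOSE "):
        # Anything else with the prefix is treated as an informational header.
        return "header", {"text": line[len("MKL_VERBOSE "):]}
    return "unknown", {}
```

With this sketch, the header and the two `Detected GPU` lines would be classified instead of reported as parse errors, and the `dgemm` line would yield the function name `dgemm` with trailer `GPU0`.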