Comments (3)
The length issues should be fixed on the branch kupsch/fix-vector-instruction-lengths. I validated every instruction decoded from libmkl_avx512.so.2 against xed-ild and found vpermpd and vcvttps2qq (along with its variant vcvttpd2qq). Then I found some additional AVX-512 enhanced libraries and found two more instructions, then I did a hand audit and found 3 more that were incorrect. All of them specified that all operands were register only, but that was incorrect. If there were additional bytes for the SIB or constants, then the length was incorrect in addition to the operand. The complete list is:
vcvtpd2udq
vcvtss2usi
vcvttpd2qq
vcvtudq2pd
vcvtudq2ps
vpblendd
vpermpd
I believe the lengths should be correct now, but there are other issues with decoding vector instructions that will get fixed when we switch to capstone.
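To illustrate why a register-only operand table produces the wrong length: in x86-64 encodings, the ModRM byte decides whether a SIB byte and a displacement follow, and some instructions (for example, the imm8 forms of vpermpd and vpblendd) also carry an immediate. The sketch below is not Dyninst's decoder. The function name and the `imm_bytes` parameter are made up for illustration, and it only counts the bytes that follow ModRM under 64-bit addressing.

```python
from typing import Optional


def bytes_after_modrm(modrm: int, sib: Optional[int] = None, imm_bytes: int = 0) -> int:
    """Count the bytes that follow the ModRM byte (SIB + displacement + immediate).

    Hypothetical helper for illustration only; assumes 64-bit addressing.
    """
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111

    # mod == 11: register-direct operand, nothing but the immediate follows.
    if mod == 0b11:
        return imm_bytes

    extra = 0
    has_sib = rm == 0b100
    if has_sib:
        if sib is None:
            raise ValueError("ModRM selects a SIB byte, but none was given")
        extra += 1  # the SIB byte itself

    if mod == 0b01:
        extra += 1  # disp8
    elif mod == 0b10:
        extra += 4  # disp32
    elif rm == 0b101:
        extra += 4  # mod == 00, rm == 101: RIP-relative disp32
    elif has_sib and (sib & 0b111) == 0b101:
        extra += 4  # mod == 00, SIB base == 101: disp32 with no base register

    return extra + imm_bytes
```

A table that treats every operand as a register would effectively always take the `mod == 11` branch, so any memory form of the instruction comes out too short by the SIB and displacement bytes.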
@mwkrentel please test and let me know if this fixes the problem you were having.
from dyninst.
@kupsch Ok, I'm just finishing up a unit test for unknown instructions, incorrect lengths and their effect on unclaimed regions (gaps). I'll test it with that and let you know.
Thanks!
To further investigate, I wrote a unit test, something that would be
more automatic than looking by hand. Basically, I added the unknown
instruction callback to one of my other tests that looks for gaps.
The new test looks for three things:
1. unknown instructions -- these are instructions that dyninst doesn't
recognize but XED says are valid, detected using the callback function.
2. bad length -- these are instructions that dyninst accepts but have
the wrong length according to XED.
3. unclaimed regions (gaps) between basic blocks.
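The three checks can be sketched roughly as follows. This is not the actual unit test: `classify` and `find_gaps` are hypothetical helpers, decoded lengths are passed in as plain integers (with `None` meaning "could not decode") rather than coming from real Dyninst or XED calls, and treating XED as the reference decoder is an assumption of the sketch.

```python
from typing import List, Optional, Tuple


def classify(dyninst_len: Optional[int], xed_len: Optional[int]) -> str:
    """Compare one instruction's decode result from both decoders."""
    if xed_len is None:
        return "invalid"      # the reference decoder rejects these bytes
    if dyninst_len is None:
        return "unknown"      # check 1: XED accepts, dyninst does not
    if dyninst_len != xed_len:
        return "bad length"   # check 2: both accept, lengths disagree
    return "ok"


def find_gaps(blocks: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Check 3: return unclaimed [start, end) ranges between basic blocks.

    Blocks are half-open address ranges and may overlap or arrive unsorted.
    """
    gaps = []
    covered_end = None
    for start, end in sorted(blocks):
        if covered_end is not None and start > covered_end:
            gaps.append((covered_end, start))
        covered_end = end if covered_end is None else max(covered_end, end)
    return gaps
```

For example, `find_gaps([(0, 10), (16, 20), (20, 30), (5, 12)])` reports the single unclaimed range `(12, 16)`.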
On "normal" binaries, not vector-heavy number-crunching apps, compiled
with gcc, there are essentially no errors. Dyninst correctly finds all
the instructions and the only gaps are small gaps between functions
for alignment. Fine.
On a harder binary, e.g., libmkl_avx512.so.2 from Intel OneAPI 2023.1.0
with AVX512 instructions (Sapphire Rapids), there are many examples of (1)
and (2). Those errors also create gaps (3), but the real problem is (1) or (2).
I used the libmkl_avx512.so.2 library from Intel OneAPI 2023.1.0.
I think this is publicly downloadable from Intel.
$ ls -l libmkl_avx.so.2
-rwxr-xr-x 1 root root 53122080 Mar 7 11:32 libmkl_avx.so.2
$ md5sum libmkl_avx512.so.2
a6abe69a7e8a13574ce217cb5bcf8fe7 libmkl_avx512.so.2
I left a copy of this (named libmkl_avx512_iris.so) at:
machine: ufront.cs.rice.edu
file: /home/krentel/Files/libmkl_avx512_iris.so
Running dyninst master (from 6/2/2023, without the fix) reports:
file: libmkl_avx512_iris.so
threads: 1 fix valid: 1 fix troll: 0
funcs: 17457 blocks: 1023503 instns: 11066589 bytes: 60835203
unknown: 383529 valid(xed): 383453 troll: 76 error: 0
num bad length: 127
num gaps: 19120 size: 279797
under 16: 18535 size: 157955
under 64: 419 size: 15195
under 256: 105 size: 16170
other: 61 size: 90477
That's 383529 unknown instructions, most of which can be fixed via the
callback function. But the errors are:
1. 76 trolls -- this is a buffer that dyninst doesn't understand and
XED says is invalid, but if you skip ahead a few bytes, then XED
finds a valid instruction. The likely problem is that the previous
instruction has the wrong length and dyninst gets out of alignment.
2. 127 bad length -- these are instructions that dyninst accepts but
XED reports a different length.
3. At least 61 serious gaps covering 90K of instruction bytes from
(1) and (2) above.
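The "troll" condition described above can be sketched like this. The `decode_len` callable stands in for a reference decoder such as XED (returning an instruction length, or `None` for invalid bytes). It, `find_resync_offset`, the `max_skip` limit, and the toy decoder are all assumptions for illustration, not the test's actual code.

```python
from typing import Callable, Optional


def find_resync_offset(
    buf: bytes,
    decode_len: Callable[[bytes], Optional[int]],
    max_skip: int = 8,
) -> Optional[int]:
    """Return the smallest number of bytes to skip before buf decodes.

    0 means buf decodes as-is, k > 0 means the decoder only succeeds after
    skipping k bytes (a "troll"), and None means nothing was found within
    max_skip bytes.
    """
    last_offset = min(max_skip, len(buf) - 1)
    for skip in range(0, last_offset + 1):
        if decode_len(buf[skip:]) is not None:
            return skip
    return None


def toy_decode_len(buf: bytes) -> Optional[int]:
    """Toy stand-in for a real decoder: knows only nop (0x90) and ret (0xC3)."""
    if buf and buf[0] in (0x90, 0xC3):
        return 1
    return None
```

With the toy decoder, `find_resync_offset(bytes([0xFF, 0xFF, 0x90]), toy_decode_len)` returns 2: the buffer is invalid at offsets 0 and 1 but valid at offset 2, which is the pattern that suggests the preceding instruction was given the wrong length.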
But if I run this with the kupsch/fix-vector-instruction-lengths
branch (merged today into master), this fixes almost everything except
the AVX512 unknown instructions and I get this summary.
file: libmkl_avx512_iris.so
threads: 1 fix valid: 1 fix troll: 0
funcs: 17457 blocks: 1023537 instns: 11075012 bytes: 60894407
unknown: 383449 valid(xed): 383449 troll: 0 error: 0
num bad length: 0
num gaps: 19043 size: 220066
under 16: 18538 size: 157984
under 64: 408 size: 14730
under 256: 96 size: 15128
other: 1 size: 32224
Now we have zero bad length or trolled instructions.
So, problem solved and I support merging the fix into master
as a full solution to the problem. Yea!
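As a sanity check on the two summaries above, the per-bucket gap counts and sizes should add up to the reported totals. The unknown count also appears to equal valid(xed) + troll + error, which is an inference from the numbers rather than something the reports state. The figures below are copied from the two reports; the dictionary layout and the `check_report` helper are just for illustration.

```python
def check_report(r: dict) -> bool:
    """Check that bucket rows sum to the totals and unknowns are accounted for."""
    count_ok = sum(count for count, _ in r["buckets"]) == r["num_gaps"]
    size_ok = sum(size for _, size in r["buckets"]) == r["gap_size"]
    unknown_ok = r["valid"] + r["troll"] + r["error"] == r["unknown"]
    return count_ok and size_ok and unknown_ok


# Buckets are (count, size) for: under 16, under 64, under 256, other.
before = {
    "unknown": 383529, "valid": 383453, "troll": 76, "error": 0,
    "num_gaps": 19120, "gap_size": 279797,
    "buckets": [(18535, 157955), (419, 15195), (105, 16170), (61, 90477)],
}
after = {
    "unknown": 383449, "valid": 383449, "troll": 0, "error": 0,
    "num_gaps": 19043, "gap_size": 220066,
    "buckets": [(18538, 157984), (408, 14730), (96, 15128), (1, 32224)],
}

# Bytes of unclaimed gap space recovered by the fix.
recovered_gap_bytes = before["gap_size"] - after["gap_size"]
```

Both reports pass the check, and the fix recovers 59731 bytes of gap space, most of it from the "other" bucket dropping from 61 gaps to 1.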
Summary:
- The kupsch/fix-vector-instruction-lengths PR fixes the bad length problem.
(And yes, there were several other instructions that were mishandled,
so congrats on a thorough analysis!)
- The issue of the unknown Sapphire Rapids AVX512 instructions remains,
but for hpctoolkit, we have a satisfactory workaround for now.
- There remains one anomaly with the vzeroupper instruction, but it
seems to be a separate problem and I'll open a new issue for that.
I plan to put my unit test into my repo on github.
After I do that, I'll drop a pointer here.
It will probably be: github.com/mwkrentel/dyninst-tests
Good job, good fix!