Comments (3)
FWIW, I expected the compiler to emit pandn, i.e. for:
__m128i _mm_andnot_si128 (__m128i a, __m128i b)
from libsimdpp.
Hi, I suspect that the issue is not related to libsimdpp itself, as an AVX-equipped CPU should support the instruction in question.
Processors that support AVX also support the version of vpandn that operates on XMM registers (ref). The v in vpandn only indicates that the instruction uses the VEX prefix, which is a separate instruction encoding scheme introduced with AVX. Most SSE-SSE4 instructions are available in VEX-prefixed form on AVX. AVX2 is required only for the integer instructions that operate on YMM registers.
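To make the XMM/YMM distinction concrete, here is a minimal sketch of how one might check at runtime which form of vpandn the CPU advertises support for. The helper names are hypothetical and not part of libsimdpp; the sketch assumes GCC or Clang on x86, where the __builtin_cpu_supports builtin is available, and it conservatively reports false on any other platform.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not a libsimdpp API): name of the ISA extension
// needed for VEX-encoded vpandn at a given register width.
// 128-bit (XMM) needs only AVX; 256-bit (YMM) needs AVX2.
const char* vpandn_required_feature(int register_bits) {
    assert(register_bits == 128 || register_bits == 256);
    return register_bits == 256 ? "avx2" : "avx";
}

// Hypothetical helper: does the running CPU advertise the extension
// required for vpandn at this width?
// Assumption: GCC/Clang on x86; elsewhere this returns false.
bool cpu_can_run_vpandn(int register_bits) {
    assert(register_bits == 128 || register_bits == 256);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // make sure the CPU feature data is initialized
    if (register_bits == 256) {
        return __builtin_cpu_supports("avx2") != 0;
    }
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;
#endif
}
```

On a CPU with AVX but without AVX2, cpu_can_run_vpandn(128) would be expected to return true and cpu_can_run_vpandn(256) false, which matches the situation described in this issue.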
Mixing non-VEX (legacy SSE) instructions with VEX-encoded instructions can incur a large performance penalty, so when targeting AVX or later, compilers try to use vpandn for _mm_andnot_si128 instead of the older pandn.
Could you please check whether any code that just uses __m128i _mm_andnot_si128 (__m128i a, __m128i b) runs on your CPU when that code is compiled for AVX? Not being able to run the 128-bit vpandn instruction on an AVX CPU is very strange. The fact that the CAN_RUN_X86_AVX test succeeded is even stranger, because that would mean it's possible to run certain VEX-prefixed instructions but not others.
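A minimal sketch of such a check might look like the following. The function names are hypothetical; the only intrinsics used are _mm_loadu_si128, _mm_andnot_si128 and _mm_storeu_si128 from emmintrin.h. When built for AVX with GCC or Clang (for example with -mavx), the compiler would typically emit vpandn for the intrinsic, so a SIGILL here would point at the CPU or toolchain rather than at libsimdpp. On non-x86 targets the sketch assumes no SSE2 and falls back to the scalar result so that it still compiles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics, including _mm_andnot_si128
#define HAVE_SSE2_INTRINSICS 1
#endif

// Hypothetical helper: computes (~a) & b on four 32-bit lanes with
// _mm_andnot_si128 and compares against a scalar reference.
bool andnot_matches_scalar(const std::uint32_t a[4], const std::uint32_t b[4]) {
    std::uint32_t expected[4];
    for (int i = 0; i < 4; ++i) {
        expected[i] = ~a[i] & b[i];  // scalar definition of andnot
    }

    std::uint32_t actual[4];
#ifdef HAVE_SSE2_INTRINSICS
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    __m128i vr = _mm_andnot_si128(va, vb);  // pandn, or vpandn when built for AVX
    _mm_storeu_si128(reinterpret_cast<__m128i*>(actual), vr);
#else
    // Assumption: no SSE2 available, so just reuse the scalar result.
    std::memcpy(actual, expected, sizeof actual);
#endif
    return std::memcmp(expected, actual, sizeof expected) == 0;
}

// Hypothetical driver: runs the check on a few sample bit patterns.
void run_andnot_check() {
    const std::uint32_t a[4] = {0xFFFF0000u, 0x00000000u, 0xFFFFFFFFu, 0x12345678u};
    const std::uint32_t b[4] = {0xFFFFFFFFu, 0xDEADBEEFu, 0xCAFEBABEu, 0x12345678u};
    assert(andnot_matches_scalar(a, b));
}
```

Calling run_andnot_check() from main and disassembling the binary (for instance with objdump -d) is one way to confirm which encoding the compiler actually chose.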
Upon further investigation, I cannot reproduce the SIGILL. The only plausible explanation I have is that while moving some source code around I brought along an old binary and failed to cleanly recompile before trying to execute it. Everything is copacetic now.
My primary confusion came from looking at the assembly and mistakenly thinking that vpandn was an AVX2 instruction. The Intel Intrinsics Guide doesn't list a vpandn when filtered to AVX, heh. Thanks for the pointer to http://www.felixcloutier.com/x86/, I'll be sure to check there before filing any more errant bugs 😬.
Thanks for your time and the library!