Comments (3)
Please note that this is not the place to post programming questions. It is better to ask at stackoverflow.com using the tag vector-class-library.
Method 1 is better if index is changing often. Method 2 is better if the index is constant because of better branch prediction in the switch statement.
This is not efficient.
_mm_extract_ps has a constant parameter so it needs the switch statement. It returns the result in an integer register so you have the extra cost of moving between different types of registers.
See https://stackoverflow.com/questions/5526658/intel-sse-why-does-mm-extract-ps-return-int-instead-of-float
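To illustrate the point about register moves, here is a minimal sketch, assuming an x86 target with SSE, of two ways to read one float lane without going through _mm_extract_ps. The helper names extract_lane and extract_lane_runtime are made up for this example and are not part of the vector class library. The first takes a compile-time lane index and keeps the value in a vector register; the second takes a runtime index and goes through memory.

```cpp
#include <xmmintrin.h>  // SSE: _mm_shuffle_ps, _mm_cvtss_f32, _mm_storeu_ps

// Hypothetical helper: compile-time lane index.
// The value never leaves the XMM registers.
template <int Lane>
inline float extract_lane(__m128 v) {
    static_assert(Lane >= 0 && Lane < 4, "lane must be in 0..3");
    // Broadcast the requested lane to all four elements, then read element 0.
    return _mm_cvtss_f32(
        _mm_shuffle_ps(v, v, _MM_SHUFFLE(Lane, Lane, Lane, Lane)));
}

// Hypothetical helper: runtime lane index.
// Stores the vector to a small array and indexes it.
inline float extract_lane_runtime(__m128 v, int lane) {
    float tmp[4];
    _mm_storeu_ps(tmp, v);
    return tmp[lane & 3];  // mask keeps the index inside the array
}
```

With v = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f), extract_lane<1>(v) and extract_lane_runtime(v, 1) both give 2. Which one is faster probably depends on how often the index changes, so it is worth measuring on the target CPU.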
EDIT:
I missed a more convenient macro below _mm_extract_ps, called _MM_EXTRACT_FLOAT, which just gets the job done.
Here's the addition to the code below:
// CORRECT USAGE VERSION 3
std::cout << "Correct usage version 3\n" ;
float val_float4 ;
_MM_EXTRACT_FLOAT(val_float4, one2four, 0x01);
std::cout << "val float : " << val_float4 << '\n';
std::cout << "val float hex : " << std::hexfloat << val_float4
<< std::defaultfloat << '\n';
Previous:
@AgnerF
Sorry to bump this issue again, but after some testing I found we can use the _mm_extract_ps intrinsic from the smmintrin.h header. It returns a 32-bit int with the same bit pattern as the 32-bit float in the lane selected by the second argument. So we need to reinterpret_cast the int's bits as a float to get the value back.
There are two ways to do it, shown below.
Here's a test cpp:
#include <x86intrin.h>
#include <iostream>

int main()
{
    __m128 one2four = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);

    // INCORRECT USAGE VERSION 1
    std::cout << "Incorrect usage version 1\n";
    int val_int = _mm_extract_ps(one2four, 0x01);
    std::cout << "val int : " << val_int << '\n';
    std::cout << "val int hex : " << std::hex << val_int << std::dec << '\n';

    // INCORRECT USAGE VERSION 2
    std::cout << "Incorrect usage version 2\n";
    float val_float = _mm_extract_ps(one2four, 0x01);
    std::cout << "val float : " << val_float << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float
              << std::defaultfloat << '\n';

    // CORRECT USAGE VERSION 1
    std::cout << "Correct usage version 1\n";
    float val_float2 = reinterpret_cast<float&>(val_int);
    std::cout << "val float : " << val_float2 << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float2
              << std::defaultfloat << '\n';

    // CORRECT USAGE VERSION 2
    std::cout << "Correct usage version 2\n";
    float val_float3;
    reinterpret_cast<int&>(val_float3) = _mm_extract_ps(one2four, 0x01);
    std::cout << "val float : " << val_float3 << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float3
              << std::defaultfloat << '\n';
}
The output:
Incorrect usage version 1
val int : 1073741824
val int hex : 40000000
Incorrect usage version 2
val float : 1.07374e+09
val float hex : 0x1p+30
Correct usage version 1
val float : 2
val float hex : 0x1p+1
Correct usage version 2
val float : 2
val float hex : 0x1p+1
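One caveat on the two "correct usage" versions: accessing an int through a float& (or the reverse) is not formally allowed by the C++ aliasing rules, even though it printed the expected values here. A minimal sketch of a well-defined alternative uses std::memcpy to copy the bits across. The helper name bits_to_float is made up for this example, and it assumes float is a 32-bit IEEE-754 type.

```cpp
#include <cstdint>   // std::int32_t
#include <cstring>   // std::memcpy

// Hypothetical helper: reinterpret the 32 bits of an integer as a float.
// Assumes float is a 32-bit IEEE-754 type (checked for size below).
inline float bits_to_float(std::int32_t bits) {
    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "float must be 32 bits wide");
    float f;
    std::memcpy(&f, &bits, sizeof f);  // copies the bit pattern unchanged
    return f;
}
```

Passing the int returned by _mm_extract_ps in the test above (0x40000000) to bits_to_float gives 2, the same as the reinterpret_cast versions. Compilers typically turn a fixed-size memcpy like this into a plain register move.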