Comments (3)

AgnerF commented on August 16, 2024

Please note that this is not the place to post programming questions. It is better to ask at stackoverflow.com using the tag vector-class-library.

Method 1 is better if the index changes often. Method 2 is better if the index is constant, because the branch in the switch statement is then predicted well.

from version2.

AgnerF commented on August 16, 2024

This is not efficient.
_mm_extract_ps has a constant parameter so it needs the switch statement. It returns the result in an integer register so you have the extra cost of moving between different types of registers.
See https://stackoverflow.com/questions/5526658/intel-sse-why-does-mm-extract-ps-return-int-instead-of-float
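
To illustrate the point about the constant parameter, here is a minimal sketch, assuming GCC or Clang on x86 with SSE4.1 available at run time. The names extract_switch, extract_store and SSE41_TARGET are hypothetical and are not part of the vector class library. The first function shows the switch that a runtime index forces on _mm_extract_ps, plus the int-to-float bit copy; the second stores the vector to memory and indexes the array, which needs no switch.

```cpp
#include <cassert>
#include <cstring>
#include <smmintrin.h>  // SSE4.1: _mm_extract_ps

// Assumption: GCC/Clang function-level target attribute, so the sketch
// compiles without -msse4.1. Other compilers may need a command-line option.
#if defined(__GNUC__) || defined(__clang__)
#define SSE41_TARGET __attribute__((target("sse4.1")))
#else
#define SSE41_TARGET
#endif

// Runtime index via a switch: each case supplies the compile-time constant
// that _mm_extract_ps requires. The lane arrives as an int bit pattern.
SSE41_TARGET inline float extract_switch(__m128 v, int index) {
    assert(index >= 0 && index < 4);
    int bits;
    switch (index) {
        case 0:  bits = _mm_extract_ps(v, 0); break;
        case 1:  bits = _mm_extract_ps(v, 1); break;
        case 2:  bits = _mm_extract_ps(v, 2); break;
        default: bits = _mm_extract_ps(v, 3); break;
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);  // copy the bit pattern into a float
    return f;
}

// Runtime index via memory: store all four lanes, then index the array.
inline float extract_store(__m128 v, int index) {
    assert(index >= 0 && index < 4);
    float lanes[4];
    _mm_storeu_ps(lanes, v);  // unaligned store, so no alignment requirement
    return lanes[index];
}
```

Which of the two is faster depends on how predictable the index is and on the surrounding code, so it is worth measuring rather than assuming.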

Akhil-CM commented on August 16, 2024

EDIT :-
I missed a more convenient macro, _MM_EXTRACT_FLOAT, defined just below _mm_extract_ps in the header. It writes the selected lane directly into a float variable, so no cast is needed.

Here's the addition to the code below

    // CORRECT USAGE VERSION 3
    std::cout << "Correct usage version 3\n" ;
    float val_float4 ;
    _MM_EXTRACT_FLOAT(val_float4, one2four, 0x01);
    std::cout << "val float : " << val_float4 << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float4
              << std::defaultfloat << '\n';

Previous :-

@AgnerF
Sorry to bump this issue again, but after some testing I found that we can use the _mm_extract_ps intrinsic from the smmintrin.h header. It returns a 32-bit int with the same bit pattern as the 32-bit float in the lane selected by the second argument. So the bits of that int have to be reinterpreted as a float (here with reinterpret_cast) to recover the value.
There are two ways to do it, as shown below.

Here's a test program:

#include <x86intrin.h>
#include <iostream>

int main()
{
    __m128 one2four = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);

    // INCORRECT USAGE VERSION 1
    std::cout << "Incorrect usage version 1\n" ;
    int val_int = _mm_extract_ps(one2four, 0x01);
    std::cout << "val int : " << val_int << '\n';
    std::cout << "val int hex : " << std::hex << val_int << std::dec << '\n';

    // INCORRECT USAGE VERSION 2
    std::cout << "Incorrect usage version 2\n" ;
    float val_float = _mm_extract_ps(one2four, 0x01);
    std::cout << "val float : " << val_float << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float
              << std::defaultfloat << '\n';

    // CORRECT USAGE VERSION 1
    std::cout << "Correct usage version 1\n" ;
    float val_float2 = reinterpret_cast<float&>(val_int);
    std::cout << "val float : " << val_float2 << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float2
              << std::defaultfloat << '\n';

    // CORRECT USAGE VERSION 2
    std::cout << "Correct usage version 2\n" ;
    float val_float3;
    reinterpret_cast<int&>(val_float3) = _mm_extract_ps(one2four, 0x01);
    std::cout << "val float : " << val_float3 << '\n';
    std::cout << "val float hex : " << std::hexfloat << val_float3
              << std::defaultfloat << '\n';
}

The output:

Incorrect usage version 1
val int : 1073741824
val int hex : 40000000
Incorrect usage version 2
val float : 1.07374e+09
val float hex : 0x1p+30
Correct usage version 1
val float : 2
val float hex : 0x1p+1
Correct usage version 2
val float : 2
val float hex : 0x1p+1
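
A note on the two "correct" versions: reading an int object through a float reference (or the reverse) with reinterpret_cast formally violates the C++ strict aliasing rule, even though it produced the expected output here. The sketch below shows two alternatives under the assumption that the lane index is known at compile time. The names bits_to_float and extract_lane are hypothetical. The first copies the bit pattern with std::memcpy; the second uses only SSE intrinsics and keeps the value in a vector register.

```cpp
#include <cstring>
#include <xmmintrin.h>  // SSE: _mm_shuffle_ps, _mm_cvtss_f32

// Convert an int bit pattern (such as the one returned by _mm_extract_ps)
// to float without reinterpret_cast.
inline float bits_to_float(int bits) {
    static_assert(sizeof(float) == sizeof(int), "expects 32-bit int and float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Read one lane as a float without going through an integer register.
// The shuffle immediate must be a constant, hence the template parameter.
template <int Lane>
inline float extract_lane(__m128 v) {
    static_assert(Lane >= 0 && Lane < 4, "lane must be 0..3");
    // Broadcast the chosen lane to every position, then read element 0.
    return _mm_cvtss_f32(
        _mm_shuffle_ps(v, v, _MM_SHUFFLE(Lane, Lane, Lane, Lane)));
}
```

For the test vector above, extract_lane<1>(one2four) and bits_to_float(val_int) should both give 2, matching the output of the cast-based versions.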
