Git Product home page Git Product logo

Comments (5)

VladSerhiienko avatar VladSerhiienko commented on September 1, 2024

Hi, FrankStain!
Did you try to run the code with the suggested changes in the Release configuration? I think you might face memory-access-related crashes on some platforms (on Windows x64 most likely too). Also, if you require that sort of functionality, you can try Eigen library. As I remember, it has heading() template function, that returns reference.

from mathfu.

FrankStain avatar FrankStain commented on September 1, 2024

@VladSerhiienko , hey :)

It was long time away, so yep, i tried to make some tests around this issue.
So... running on x64 platform with enabled SIMD processing of vectors we facing one major restriction: the memory of any vector must be aligned to 8 bytes. Actually, this restriction is common for all platforms where SIMD processing is enabled.
Casting the memory from vec3 to vec2 we don't cause pointer deformation, but only declares that pointer must point to memory of less size.
Also, all representations of mathfu::Vector<T, N> consists of same __m128 member and covers memory of same size.

So, just requesting the xy component from xyz or xyzw causes no problems. But situation changes if we need to access yz component by reference. Yep, here we will have a problems.

Did i missed something more important?

By means accessing of xy component by reference i see the performance optimization - excluding of allocation and construction of another vector. When read-only vector of less dimension must be passed as argument or used in calculation, such optimization can save some ticks to let you spend it on more usefull operations. :)

from mathfu.

stewartmiles avatar stewartmiles commented on September 1, 2024

Hi, regarding returning references rather than values for subcomponents,
check out
https://github.com/google/mathfu/blob/master/include/mathfu/vector_3.h#L323

Then take a look at the simd4f data type on each platform here:
https://github.com/scoopr/vectorial/tree/ae7dc88215e14c812786dc7a1f214610ae8540f5/include/vectorial

It's pretty different, in some cases (e.g NEON) this uses a type like
float32x4_t which requires the user to load and store the type using
intrinsics
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0491f/BABGABJH.html
to avoid undefined behavior. Now I know in practice when the NEON value is
then stored in memory we're probably going to be ok but it's very likely -
if you're using a chain of operations - the compiler will try to keep the
vector in a register in which case you're going to be out of luck with the
proposed optimization.

In general, I really don't recommend using reinterpret_cast with mathfu
data types for this type of reason, you may end up with code that works ok
on your beefy x86_64 CPU but results in some really strange behavior
elsewhere.

Cheers,
Stewart

On Fri, May 20, 2016 at 5:23 AM, Franken [email protected] wrote:

@VladSerhiienko https://github.com/VladSerhiienko , hey :)

It was long time away, so yep, i tried to make some tests around this
issue.
So... running on x64 platform with enabled SIMD processing of vectors we
facing one major restriction: the memory of any vector must be aligned to 8
bytes. Actually, this restriction is common for all platforms where SIMD
processing is enabled.
Casting the memory from vec3 to vec2 we don't cause pointer deformation,
but only declares that pointer must point to memory of less size.
Also, all representations of mathfu::Vector<T, N> consists of same __m128
member and covers memory of same size.

So, just requesting the xy component from xyz or xyzw causes no problems.
But situation changes if we need to access yz component by reference.
Yep, here we will have a problems.

Did i missed something more important?

By means accessing of xy component by reference i see the performance
optimization - excluding of allocation and construction of another vector.
When read-only vector of less dimension must be passed as argument or used
in calculation, such optimization can save some ticks to let you spend it
on more usefull operations. :)


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#11 (comment)

from mathfu.

FrankStain avatar FrankStain commented on September 1, 2024

@stewartmiles , hi!

Sadly, i can't clearly understand some of your words.

Did you mean that float32x4_t can't be safely reinterpretated to float32x2_t?
Also i see the same union inside vec3 and vec4:
https://github.com/google/mathfu/blob/master/include/mathfu/vector_2.h#L225
https://github.com/google/mathfu/blob/master/include/mathfu/vector_4.h#L37
It means the memory mapping is same for vec3 and vec4 on each of supported platforms.

Now I know in practice when the NEON value is
then stored in memory we're probably going to be ok but it's very likely -
if you're using a chain of operations - the compiler will try to keep the
vector in a register in which case you're going to be out of luck with the
proposed optimization.

Can you clarify this point widely?

from mathfu.

stewartmiles avatar stewartmiles commented on September 1, 2024

That's correct, float32x4_t can't safely be reintepreted as float32x2_t.
vectorial does have a union that maps the types to float but this isn't
actually used since the compiler is likely to keep the type in a NEON
register and not write the data back out to memory so that it can be
reinterpreted as a float. All operations on the simd* types need to be
performed using compiler intrinsics.

The compiler intrinsics for NEON's simd2f are located in
https://github.com/scoopr/vectorial/blob/ae7dc88215e14c812786dc7a1f214610ae8540f5/include/vectorial/simd2f_neon.h#L17
. See
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0205g/BABGABJH.html
for more information.

On Fri, May 20, 2016 at 9:26 AM, Franken [email protected] wrote:

@stewartmiles https://github.com/stewartmiles , hi!

Sadly, i can't clearly understand some of your words.

Did you mean that float32x4_t can't be safely reinterpretated to
float32x2_t?
Also i see the same union inside vec3 and vec4:
https://github.com/google/mathfu/blob/master/include/mathfu/vector_2.h#L225
https://github.com/google/mathfu/blob/master/include/mathfu/vector_4.h#L37
It means the memory mapping is same for vec3 and vec4 on each of
supported platforms.

Now I know in practice when the NEON value is
then stored in memory we're probably going to be ok but it's very likely -
if you're using a chain of operations - the compiler will try to keep the
vector in a register in which case you're going to be out of luck with the
proposed optimization.
Can you clarify this point widely?


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#11 (comment)

from mathfu.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.