
Comments (2)

mosra commented on April 29, 2024

Closing.

  • The methods mentioned here would lead to a "naive SIMD" approach, which (according to various benchmarks around the web) is not always the right solution.
  • Besides that, my code almost always uses two- or three-component vectors, not four-component ones, so the naive approach of packing everything into __m128 or the like is useless anyway.
  • The proper way to do SIMD is to write larger functions that process larger chunks of data, not to vectorize a single matrix multiplication at a time.
  • Compilers might, in some cases, be able to vectorize the code anyway.
  • Keeping things simple and maintainable. Having five different matrix multiplication implementations that need to be tested for correctness, performance regressions, precision regressions etc. on dozens of different obscure machines of varying release dates and SDK quality doesn't help with that. I have enough issues with GL alone :)
  • If the user needs to process large amounts of data and the CPU seems too slow for that, there is the GPU. This is also why this project exists :)
  • If the user still wants to invert 400x400 matrices on the CPU, it's always possible to integrate another library such as Eigen with more features and better performance.
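The padding cost behind the second point can be made concrete. A minimal sketch (type names are hypothetical, not Magnum API) of what storing three-component vectors in 128-bit SIMD lanes does to memory layout:

```cpp
#include <cstddef>

// A tightly packed three-component vector, 12 bytes.
struct Vec3 { float x, y, z; };

// The same vector padded to a 16-byte SIMD lane, as keeping it in an
// __m128-sized slot would require. One of the four lanes is wasted.
struct alignas(16) Vec3Padded { float x, y, z, pad; };

// 16 vs. 12 bytes: a third more memory traffic on every load and store,
// before any arithmetic is even done.
static_assert(sizeof(Vec3) == 12, "tight layout");
static_assert(sizeof(Vec3Padded) == 16, "padded layout");
```

Whether the wider loads ever pay for that extra bandwidth is exactly what the benchmarks mentioned above disagree on.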

from magnum.

mosra commented on April 29, 2024

More things to consider:

  • How about two- and three-component vectors? There are very few places in the engine where four-component vectors are really used (I can't think of anything except the MeshTools::transform() classes, which should have a GPU implementation anyway). I don't know how to handle these efficiently; treating them as four-component vectors would be bad for memory performance (and computation performance won't be much better).
  • How about packing/unpacking SIMD vectors from/to floats? That will hurt memory performance even more if not done properly. Compilers already produce SSE-enabled x86 code when optimization is enabled (at least my GCC on x86-64 does that with -O2; I need to investigate whether additional flags are needed for other architectures). Wouldn't it be better to live with scalar code by default and do SIMD optimizations only for large functions where it's possible to use SOA instead of AOS (e.g. various bulk collision tests in the Shapes namespace)?
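The SOA idea above can be sketched as follows. This is a hypothetical illustration, not Magnum's actual Shapes API: one contiguous array per component, and a single bulk function whose inner loop is a natural target for compiler auto-vectorization.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays point storage: one contiguous float
// array per component, instead of an array of Vec3 structs (AOS).
struct PointsSoA {
    std::vector<float> x, y, z;
};

// Bulk test of many points against one sphere. The loop body reads
// consecutive floats from each array and has no branches, so an
// optimizing compiler can vectorize it without any intrinsics.
std::vector<bool> insideSphere(const PointsSoA& p,
                               float cx, float cy, float cz, float r) {
    std::vector<bool> out(p.x.size());
    const float r2 = r*r;
    for(std::size_t i = 0; i != p.x.size(); ++i) {
        const float dx = p.x[i] - cx;
        const float dy = p.y[i] - cy;
        const float dz = p.z[i] - cz;
        out[i] = dx*dx + dy*dy + dz*dz <= r2;
    }
    return out;
}
```

The point of the sketch is that the SIMD decision lives in one bulk function over a friendly layout, while the rest of the engine keeps its scalar two- and three-component vector types untouched.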

Also worth reading: http://www.reedbeta.com/blog/2013/12/28/on-vector-math-libraries/

