Comments (24)
Regarding your first observation, I'm not fully sure I understood it correctly. If I misunderstood and my answers don't actually explain the situation, could you clarify your question?
Each supported architecture is supplied as a separate flag because instruction sets can be combined with one another. For example, -DSIMDPP_DISPATCH_ARCH1=SIMDPP_ARCH_X86_AVX2,SIMDPP_ARCH_X86_FMA3 is a valid platform that is distinct from both AVX2 alone and FMA3 alone. I haven't figured out a maintainable way to supply this information as a single flag. I'm open to suggestions on how this could be improved.
Unfortunately, there's no way to avoid supplying this information. At least one compilation unit needs to be aware of all architectures that will be dispatched, because there's no cross-platform way to register them statically. If we rely on constructors of static objects to do this work, there will be no hard link-time dependency on the compilation units the constructors sit in, and the linker will strip them out on many platforms.
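To illustrate the static-registration pattern mentioned above, here is a minimal single-file sketch. The names `registry`, `Registrar` and `sum_avx2_stub` are hypothetical and are not part of libsimdpp. In a single translation unit the constructor always runs; the problem described above only appears when the registering object lives in its own object file inside a static library and nothing else references that file, so the linker may never pull it in.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical signature shared by all architecture-specific implementations.
using SumFn = int (*)(const int*, std::size_t);

// Registry of (architecture name, implementation) pairs.
// A function-local static avoids initialization-order problems.
std::vector<std::pair<std::string, SumFn>>& registry() {
    static std::vector<std::pair<std::string, SumFn>> entries;
    return entries;
}

// An object whose constructor registers an implementation as a side effect.
struct Registrar {
    Registrar(const char* arch, SumFn fn) { registry().emplace_back(arch, fn); }
};

// Stand-in for code that would be compiled with architecture-specific flags.
int sum_avx2_stub(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Runs before main(), but only if this object file ends up in the final link.
static Registrar register_avx2("avx2", &sum_avx2_stub);
```

Because nothing calls into the file that holds `register_avx2`, a static-library link has no reason to keep it, which is why an explicit list of architectures is used instead.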
Regarding the template functions, are the required template instantiations known beforehand? That is, if you wrote your template code without using libsimdpp, could you put the definitions of the templates into separate .cpp files and force the instantiation explicitly? If yes, then it's possible in theory to support your use case; I just need some time to think about how it could work. If not, unfortunately it's not possible to use the libsimdpp dispatcher for this task, because libsimdpp relies on code for different architectures being put in separate compilation units. The compiler needs to know what code to generate for the templates, and the only way to tell it in such a situation is to instantiate explicitly in one way or another. All SIMD libraries will have this problem, because only some compilers support generating code for different instruction sets in the same compilation unit.
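As a reminder of what forcing the instantiation explicitly looks like in plain C++, here is a small sketch independent of libsimdpp. The function `scale_sum` is a made-up example; in a real project the declaration would sit in a header and the definition plus the explicit instantiations in a .cpp file that is compiled once per target architecture.

```cpp
#include <cstddef>

// Header part: declaration only, so other files cannot instantiate
// the template implicitly.
template <typename T>
T scale_sum(const T* data, std::size_t n, T factor);

// Source part: the definition, visible only in this translation unit.
template <typename T>
T scale_sum(const T* data, std::size_t n, T factor) {
    T total = T(0);
    for (std::size_t i = 0; i < n; ++i) total += data[i] * factor;
    return total;
}

// Explicit instantiations: the compiler emits code for exactly these types
// here, using whatever instruction-set flags this file is compiled with.
template float scale_sum<float>(const float*, std::size_t, float);
template double scale_sum<double>(const double*, std::size_t, double);
```

The set of instantiations is fixed at this point, which is the property the question above is asking about.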
from libsimdpp.
I do instantiate everything I need beforehand (https://github.com/mbrucher/AudioTK/blob/feature/SIMDBasicFilters/ATK/Core/SIMD/QuaternionConvertFilter.cpp#L92). I already have the functions creating the objects, but how to do the rest...
I wonder if, with Boost tuples, we could manage to generate this properly... Does the compilable architecture function retrieve all the combinations, or is it something else that has them?
I do agree that there is an issue for runtime discovery with static libraries, but they should get stripped on shared libraries, shouldn't they? I'm wondering if 2 options, static and runtime dispatches, could solve the problem for complex classes like the one I have. It's also problematic that the CMake files don't seem to support the dispatcher (yet). But once we have a final solution, I can help with the creation of the required functions, no problem.
I could write the dispatcher as std::unique_ptr<> something_float_simd, but it seems a little bit off, as the rest of the library uses template arguments. I could live with it if there is really no other way, though.
from libsimdpp.
Dispatching via shared libraries is actually quite easy -- compile several libraries with different SIMDPP_ARCH_* flags and then just select correct library according to the cpuid results and dlsym symbols from it.
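A rough sketch of the shared-library approach on POSIX systems follows. The library names, the exported symbol name `process` and its signature are assumptions made for illustration; nothing here is provided by libsimdpp. The exported function would need C linkage (`extern "C"`) in each library so that the name passed to `dlsym` is not mangled.

```cpp
#include <dlfcn.h>  // POSIX: dlopen, dlsym, dlclose
#include <string>

// Hypothetical library names; each library would be built from the same
// sources with a different set of SIMDPP_ARCH_* flags.
std::string pick_library(bool has_avx2, bool has_sse4_1) {
    if (has_avx2) return "libfilters_avx2.so";
    if (has_sse4_1) return "libfilters_sse4_1.so";
    return "libfilters_generic.so";
}

// Hypothetical signature of the function exported by every variant.
using ProcessFn = void (*)(const float* in, float* out, unsigned long n);

// Loads `library` and looks up "process" in it. On success the library
// handle is stored in *handle_out so the caller can dlclose() it later.
// Returns nullptr (and a null handle) if the library or symbol is missing.
ProcessFn load_process(const std::string& library, void** handle_out) {
    *handle_out = nullptr;
    void* handle = dlopen(library.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) return nullptr;
    void* symbol = dlsym(handle, "process");
    if (symbol == nullptr) {
        dlclose(handle);
        return nullptr;
    }
    *handle_out = handle;
    return reinterpret_cast<ProcessFn>(symbol);
}
```

The two booleans would come from a cpuid query; on GCC and Clang for x86, `__builtin_cpu_supports("avx2")` is one way to obtain them.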
I think I've figured a way to support template functions in the dispatcher. I'll have some code in a custom branch after several days.
Regarding "CMake files don't seem to support the dispatcher": did you look into simdpp_multiarch? Is it adequate for your use case?
from libsimdpp.
Obviously I haven't looked enough!
So there is one variable, DISPATCHER_FILE, for indicating the dispatcher file. What does this mean? Do I still need the full DISPATCH macro in all files, and then set DISPATCHER_FILE to one of them? How do I choose it, as I don't know their full names?
Should probably be a list of files actually, instead of just one, as each argument file could require a dispatch file.
Glad to hear about your potential solution, that would be great!
from libsimdpp.
OK, the DISPATCHER part is done automatically. I should have read the CMake file more carefully! I think I'm starting to understand this :)
from libsimdpp.
Did you start working on the implementation? Or if you have an idea, I can participate in the implementation. I will definitely try it as soon as you have a first version ;)
from libsimdpp.
Sorry for the long delay in responding. Yes, I did start working on the implementation and it's almost done. It turned out to be more complex than I anticipated, so I decided not to push anything unfinished, as it would very likely waste your time. It's hard to predict when I'll have something for you to use, but I'll likely have more time to wrap things up during the weekend.
from libsimdpp.
@mbrucher I've pushed an experimental implementation to the dispatch-templates branch. Some docs are in the source and several examples can be seen in the tests. It would be great to hear about any issues, suggestions or ideas that you might have once you start using the new functionality.
I've only tested the implementation on GCC, so the implementation could still contain bugs. More testing results will be available soon.
from libsimdpp.
Excellent, thanks a lot!
I'll try it with clang and VS 2017 as soon as I have time playing with it (might take some days, I'm stuck with another project to finish first!).
from libsimdpp.
Currently trying to implement this for the following function:
template std::unique_ptr createRealToQuaternionFilter<float,simdpp::float32<4> >(std::size_t);
It seems to try to deduce the template arguments from the following declaration:
SIMDPP_MAKE_DISPATCHER((template<typename DataType_, typename SIMDType>) (std::unique_ptr) (createRealToQuaternionFilter)
((std::size_t) nb_channels))
and can't figure out DataType_ (and probably SIMDType).
I thought the MAKE_DISPATCHER call would actually return calls to createRealToQuaternionFilter<DataType_, SIMDType>, as it seemed from the doc that it should have worked? (https://github.com/p12tic/libsimdpp/blob/dispatch-templates/simdpp/dispatch/make_dispatcher.h#L212)
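The underlying C++ rule can be shown without the macro. In the sketch below, `Filter`, `ConvertFilter`, `create_filter` and `dispatch_create_filter` are hypothetical stand-ins, not the AudioTK or libsimdpp code, and the wrapper is not what SIMDPP_MAKE_DISPATCHER expands to. It only shows that a forwarding function must name template arguments explicitly when they do not appear in the parameter list.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical base class returned by the factory.
struct Filter {
    virtual ~Filter() = default;
    virtual std::size_t channels() const = 0;
};

// Hypothetical implementation parameterized on a data type and a SIMD type.
template <typename DataType, typename SIMDType>
struct ConvertFilter : Filter {
    explicit ConvertFilter(std::size_t n) : nb_channels(n) {}
    std::size_t channels() const override { return nb_channels; }
    std::size_t nb_channels;
};

// Neither template parameter appears in the function parameter list,
// so a plain call such as create_filter(4) cannot deduce them.
template <typename DataType, typename SIMDType>
std::unique_ptr<Filter> create_filter(std::size_t nb_channels) {
    return std::unique_ptr<Filter>(
        new ConvertFilter<DataType, SIMDType>(nb_channels));
}

// A forwarding wrapper therefore has to pass the template arguments along
// explicitly instead of relying on deduction.
template <typename DataType, typename SIMDType>
std::unique_ptr<Filter> dispatch_create_filter(std::size_t nb_channels) {
    return create_filter<DataType, SIMDType>(nb_channels);
}
```

A call then looks like `dispatch_create_filter<float, int>(4)`, with `int` standing in for a real SIMD vector type.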
from libsimdpp.
Thanks for feedback. It looks like I made an error when writing documentation and the implementation indeed does not work with non-deducible arguments. I've merged your PR with one modification - the new macro parameter was moved to the second position in the argument list (see 7ea917d). Having template-related arguments together reduces the chance of user errors which in this case result in quite nasty compiler output.
from libsimdpp.
Excellent, thanks a lot!
And really great work with the library. It's really great to have a way of not writing intrinsics and still support all these different platforms ;) I like portable code.
from libsimdpp.
FYI, I saw errors when compiling dispatch tests on MSVC on the dispatch-templates branch, so I think we can keep the issue open for the time being :-)
from libsimdpp.
OK, I'll have a look as well.
from libsimdpp.
Seems to be building on VS2017 for the template dispatchers I use.
from libsimdpp.
I noticed something strange. On AVX, if I do a vector<>.assign() with the zero value in some conditions, it triggers a segmentation fault. At least, that's what happens on Linux, not on macOS or VS (that I noticed). Is there something special about AVX registers compared to the others?
from libsimdpp.
Could you give more information about the circumstances of the crash? Does this happen in dispatched code? What instruction is being executed when the crash happens?
from libsimdpp.
Seems like it might be an alignment issue in my code (error signal: SIGSEGV, si_code: 0), so I'm investigating this lead more.
from libsimdpp.
OK, it was my problem: a bad understanding of std::align...
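The thread does not say what the misunderstanding was, so the following is only a general sketch of how std::align behaves, using a hypothetical helper `aligned_block`. Aligned AVX loads and stores need 32-byte alignment, so the buffer is over-allocated by `alignment - 1` bytes. Note that std::align updates the pointer and the remaining space in place, and returns nullptr (leaving both untouched) when the request does not fit.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Returns a pointer to a block of `bytes` bytes inside `storage` that is
// aligned to `alignment` (which must be a power of two), or nullptr if the
// storage is too small once the alignment padding is taken into account.
void* aligned_block(std::vector<unsigned char>& storage,
                    std::size_t bytes, std::size_t alignment) {
    void* ptr = storage.data();
    std::size_t space = storage.size();
    // On success std::align moves ptr forward to the aligned address and
    // reduces space by the padding it skipped. It does not subtract `bytes`.
    if (std::align(alignment, bytes, ptr, space) == nullptr) return nullptr;
    return ptr;
}
```

For eight floats on AVX, a caller would size the storage as `8 * sizeof(float) + 31` and request 32-byte alignment.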
from libsimdpp.
I think I managed to reproduce the compilation issue on Windows, I'll have a look at it.
from libsimdpp.
Seems like all the sequence stuff doesn't work with VS2017, but I don't know much about the preprocessor :/
It's as if the macro SIMDPP_DETAIL_EXTRACT_PARENS_IGNORE_REST keeps the rest of the call instead of discarding it. So even the simple instantiation fails :/
from libsimdpp.
I think I've fixed the issues on MSVC in 2822508 and merged the branch to master. Could you please check whether the code on master branch fixes the problem in your code base?
from libsimdpp.
Compiles properly, thx a lot!
from libsimdpp.