Comments (7)
My question is now whether this SSE dependency is actually unnecessary, and whether there is any interest in a PR that enables libdivide to build on ARM devices (without any SIMD support)?
Yes, the SSE2 dependency is optional. Even on x86 CPUs, SSE2 is not enabled by default (if you simply include libdivide.h in your program). SSE is actually already considered legacy; the newest vector instruction set for x86 CPUs is AVX512. I guess very few people are still using libdivide's SSE2 feature, but it is kept for backwards compatibility.
The Makefile will work fine for most people, as most developers have an x86 CPU. The problem I see is that adding functionality to the Makefile to detect whether the CPU is an x86 CPU will probably require some kind of dirty hack. I have already thought about using CMake instead of the current Makefile, since CPU detection should be much simpler there.
What's your suggestion for fixing the build system on ARM?
from libdivide.
What's your suggestion for fixing the build system on ARM?
You are right, I was thinking about some detection inside the Makefile to be minimally invasive. But since you brought it up I'd rather prefer CMake. I'd need that anyway for another project so I'd be willing to give it a try if you don't mind.
Yes, let's allow building on ARM by default. The build system is up to whoever wants to put in the work :)
But since you brought it up I'd rather prefer CMake.
Great choice :-) The good thing about using CMake instead of a plain Makefile is that we can also add support for Microsoft's Visual C++ compiler.
As a starting point you can re-use the CMakeLists.txt I wrote for my libpopcnt project.
Then you actually don't need to check the CPU architecture; instead, you can check whether the compiler supports -msse2 on the current CPU architecture. If the compiler supports -msse2, then you add -msse2 -DLIBDIVIDE_USE_SSE2=1 to the compiler flags.
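include(CheckCXXCompilerFlag)
include(CMakePushCheckState)

# Treat warnings as errors during the check so that a compiler which
# merely warns about an unknown flag does not pass it.
cmake_push_check_state()
set(CMAKE_REQUIRED_FLAGS -Werror)
check_cxx_compiler_flag(-msse2 msse2)
cmake_pop_check_state()

# Enable the SSE2 code path only if the compiler accepts -msse2.
if(msse2)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse2")
    add_definitions(-DLIBDIVIDE_USE_SSE2=1)
endif()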
This is more portable than checking the CPU architecture because e.g. Microsoft's Visual C++ compiler does not support -msse2 on x86 CPUs. The other option is to search for a CMake module that detects CPU instruction sets (e.g. SSE, AVX, AVX2, NEON, ...). Personally, I would try to keep the build system as simple as possible (only one CMakeLists.txt with no other modules), hence I favour the first option.
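For comparison, here is a rough sketch of what the architecture-based check could look like without an extra module. It assumes CMAKE_SYSTEM_PROCESSOR reports one of the listed names; the exact strings vary by platform and toolchain, and the project name and the SKETCH_IS_X86 variable are made up for illustration:

```cmake
cmake_minimum_required(VERSION 3.10)
project(arch_check_sketch CXX)

# CMAKE_SYSTEM_PROCESSOR is commonly "x86_64" on Linux/macOS and "AMD64"
# on Windows, but the list below is an assumption and may be incomplete.
if(CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64|x86|i[3-6]86)$")
    set(SKETCH_IS_X86 TRUE)
else()
    set(SKETCH_IS_X86 FALSE)
endif()

message(STATUS "Target looks like x86: ${SKETCH_IS_X86}")
```

Even when this reports an x86 target, it says nothing about whether the compiler accepts -msse2, which is why the compiler flag check is the more direct test.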
As a starting point you can re-use the CMakeLists.txt I wrote for my libpopcnt project.
@kimwalisch Thanks for your hint, and sorry I did not read that earlier. Detecting SSE2 was the only thing I was still struggling with. I was thinking about CMake's try_compile() with the -msse2 option enabled, but since you already have a script that works, I'll gladly look into that. Thanks.
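A compile-based check along the lines of the try_compile() idea might look roughly like this. It is only a sketch using CMake's CheckCXXSourceCompiles module (a wrapper around try_compile); the project name and the HAVE_SSE2_INTRINSICS variable are made up for illustration:

```cmake
cmake_minimum_required(VERSION 3.10)
project(sse2_compile_check_sketch CXX)

include(CheckCXXSourceCompiles)
include(CMakePushCheckState)

# Try to compile a tiny program that uses SSE2 intrinsics with -msse2.
# The program is only compiled and linked, never run.
cmake_push_check_state()
set(CMAKE_REQUIRED_FLAGS -msse2)
check_cxx_source_compiles("
    #include <emmintrin.h>
    int main() {
        __m128i a = _mm_set1_epi32(1);
        __m128i b = _mm_add_epi32(a, a);
        return _mm_cvtsi128_si32(b) == 2 ? 0 : 1;
    }" HAVE_SSE2_INTRINSICS)
cmake_pop_check_state()

if(HAVE_SSE2_INTRINSICS)
    message(STATUS "SSE2 intrinsics compile with -msse2")
endif()
```

A compiler that only warns about an unknown -msse2 option would still pass this check, so the result on its own does not show that the flag is accepted.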
But it does not detect SSE2 support for MSVC
We don't need that for now; it is just important that we don't use -msse2 when compiling with MSVC ;-)
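One way to make that explicit is to skip the flag check entirely under MSVC. This is a minimal sketch assuming a header-only INTERFACE target; the target name libdivide_sketch and the variable HAS_MSSE2 are hypothetical and not taken from the actual CMakeLists.txt:

```cmake
cmake_minimum_required(VERSION 3.10)
project(msvc_guard_sketch CXX)

include(CheckCXXCompilerFlag)
include(CMakePushCheckState)

# Hypothetical header-only target standing in for the library.
add_library(libdivide_sketch INTERFACE)

# Skip the GCC/Clang-style flag check entirely when building with MSVC.
if(NOT MSVC)
    cmake_push_check_state()
    set(CMAKE_REQUIRED_FLAGS -Werror)
    check_cxx_compiler_flag(-msse2 HAS_MSSE2)
    cmake_pop_check_state()
endif()

# HAS_MSSE2 stays undefined under MSVC, so nothing is added there.
if(HAS_MSSE2)
    target_compile_options(libdivide_sketch INTERFACE -msse2)
    target_compile_definitions(libdivide_sketch INTERFACE LIBDIVIDE_USE_SSE2=1)
endif()
```

Scoping the flag and the definition to a target, rather than appending to CMAKE_CXX_FLAGS, is a design choice of this sketch; anything that links against the target inherits both.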
Fixed by switching build system to CMake, see CMakeLists.txt#L29.