Comments (2)
Yes and even better: if your numerator is N-1 bits, then we only need an N bit magic number, and we can do without the add path entirely:
uint64_t q = libdivide_mullhi_u64(denom->magic, numer);
return q >> denom->more;
I'm not sure how to evaluate whether N-1 bit division is broadly useful. The API surface area is already large and it's becoming unwieldy to maintain. N-1 bit division would nearly double the size.
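To make the claim above concrete, here is a minimal 32-bit sketch of the same idea: when the numerator is known to fit in 31 bits, a 32-bit magic number is enough and no add path is needed. The names `u31_divider`, `u31_divider_gen`, and `u31_divide` are hypothetical and are not part of libdivide's API. The magic constant is chosen as floor(2^(31+l)/d) + 1 with l = ceil(log2(d)), which is one standard way to pick it and not necessarily what libdivide would do. The sketch assumes d >= 2 and numer < 2^31.

```c
#include <stdint.h>

/* Hypothetical divider for 31-bit numerators; not part of libdivide. */
typedef struct {
    uint32_t magic; /* always fits in 32 bits for d >= 2 */
    uint32_t shift; /* applied after taking the high 32 bits of the product */
} u31_divider;

/* Smallest l such that 2^l >= d. */
static uint32_t ceil_log2_u32(uint32_t d) {
    uint32_t l = 0;
    while (l < 32 && ((uint64_t)1 << l) < d) l++;
    return l;
}

/* Precondition (assumed, not checked): d >= 2. */
static u31_divider u31_divider_gen(uint32_t d) {
    uint32_t l = ceil_log2_u32(d); /* in 1..32 */
    u31_divider r;
    /* magic = floor(2^(31+l) / d) + 1; the error term is at most 2^l,
       which is small enough for every numerator below 2^31. */
    r.magic = (uint32_t)((((uint64_t)1 << (31 + l)) / d) + 1);
    r.shift = l - 1;
    return r;
}

/* Precondition (assumed, not checked): numer < 2^31. */
static uint32_t u31_divide(uint32_t numer, const u31_divider *denom) {
    /* High half of the 64-bit product, then a plain shift: no add path. */
    uint32_t hi = (uint32_t)(((uint64_t)denom->magic * numer) >> 32);
    return hi >> denom->shift;
}
```

For example, `u31_divider_gen(7)` followed by `u31_divide(100, &div)` yields 14. The result is not guaranteed to be correct for numerators with bit 31 set, which is why this would have to be a separate entry point.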
from libdivide.
That's fair. I'll close this thread. If N-1 bit division is indeed broadly useful, someone will eventually re-open this thread and describe their use case.
For the record, I was playing around with the idea of keeping the API the same and deciding dynamically whether to use N-1 bit division. To minimize branch mispredictions, this decision would have to be sticky: once we've seen a numerator with bit N-1 set, the next 100 or so invocations of the division function would use the full N bit division, even if the numerators were small. The necessary state would be opportunistically kept on the stack (hoping that it would be preserved across function invocations, which should be true in hot loops, and shouldn't matter otherwise). Unfortunately, this approach turned out to generate way too much overhead (at least in my implementation).
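The sticky decision described above might look roughly like the following 32-bit sketch. It is not the commenter's implementation: the names `sticky_divider`, `sticky_divider_gen`, `sticky_divide`, and `STICKY_WINDOW` are hypothetical, the state lives in the divider struct rather than on the stack, and plain hardware division stands in for the full N bit path. The fast path assumes d >= 2 and uses the magic constant floor(2^(31+l)/d) + 1 with l = ceil(log2(d)).

```c
#include <stdint.h>

enum { STICKY_WINDOW = 100 }; /* "100 or so" invocations, per the comment */

/* Hypothetical divider with sticky path selection; not part of libdivide. */
typedef struct {
    uint32_t d;        /* divisor, assumed >= 2 */
    uint32_t magic;    /* 32-bit magic, valid for numerators below 2^31 */
    uint32_t shift;
    uint32_t cooldown; /* remaining calls forced onto the full-width path */
} sticky_divider;

static sticky_divider sticky_divider_gen(uint32_t d) {
    uint32_t l = 0; /* becomes ceil(log2(d)), in 1..32 for d >= 2 */
    while (l < 32 && ((uint64_t)1 << l) < d) l++;
    sticky_divider s;
    s.d = d;
    s.magic = (uint32_t)((((uint64_t)1 << (31 + l)) / d) + 1);
    s.shift = l - 1;
    s.cooldown = 0;
    return s;
}

static uint32_t sticky_divide(uint32_t numer, sticky_divider *s) {
    /* A numerator with bit 31 set re-arms the window. */
    if (numer >> 31) s->cooldown = STICKY_WINDOW;
    if (s->cooldown != 0) {
        s->cooldown--;
        /* Stand-in for the full 32-bit magic-number path. */
        return numer / s->d;
    }
    /* Fast path: only reached when numer < 2^31 and the window has expired. */
    uint32_t hi = (uint32_t)(((uint64_t)s->magic * numer) >> 32);
    return hi >> s->shift;
}
```

Even in this simplified form, every call pays for a test on the numerator, a test on the counter, and a store to update it, which is consistent with the overhead reported above.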