Comments (9)
I've now implemented a cache-friendly version here: https://github.com/haampie/FastPrimeSieve.jl. It generates the unrolled sieving loop with macro's, which is kinda neat.
I'm getting a 12x speedup 🚀 for the prime-counting function for n = 1_000_000_000
.
Last things are to see how to contribute it back to this package and consider multithreading.
from primes.jl.
Would you be willing to submit a PR for this? cc @pabloferz, who I think is quite familiar with the sieve code.
from primes.jl.
I could definitely look into it, but I'll have to experiment with the (very cool) wheel-idea first.
One issue: primesmask
is exported as well and returns a boolean array over the whole interval [lo, hi]
, so that function can clearly not benefit from memory reduction. It could still benefit from cache optimization, but right now I think it is not completely trivial to use the same segmented sieve code for both primes
and primesmask
.
Also, I think I should show some numbers, otherwise it's just speculation :p
from primes.jl.
I wanted to implement a segmented sieve once, the only problem is, AFAIK, that there is no native way in Julia to find L1 cache sizes for any system, of course there are a couple of packages that we could use, but I don't fancy the idea of depending on them (I'd like to keep Primes.jl
lightweight on dependencies).
An option is to chose a fixed size that certainly fits on L1 cache for the architectures that julia supports, but that would miss some performance on systems with bigger L1 cache. And yeah, in this case primesmask
and primes
would probably need to rely on different implementations, though I believe there's a way of mixing the current approach with a segmented sieve (I haven't really thought about it).
If you want to submit a PR, I can certainly have a look.
from primes.jl.
An optimistic (over)estimation of the cache size would still reduce cache misses with respect to the current implementation, but indeed it is messy to have a constant around. It would be interesting to see a cache-oblivious algorithm, but I doubt it exists.
One idea is to let the user decide the segment size and provide a reasonable default. Maybe a signature like
primes(lo, hi; segment_size = 32 * 1024)
from primes.jl.
Is there a way to see if a package is installed, and use it if it exists?
from primes.jl.
from primes.jl.
@oscardssmith just for reference, that kind of question is better suited for a site like https://discourse.julialang.org/
from primes.jl.
Since this is the first issue I have ever opened related to Julia, I feel like I should close it at some point as well.
At the moment I have implemented an absolute monstrosity that unrolls the inner loop of the {2,3,5}-prime wheel sieve with @goto
statements such that it removes 8 multiples of a prime per iteration. It's roughly 9x faster than the current Primes._primesmask
function for n
around 500_000. Probably I could generate this code from a macro.
The next step is to reduce the memory from O(n) to O(√n / log n) by saving just an O(1) chunk of memory that fits in L1 cache, and keeping an O(√n / log n) list of siever primes up to √n. I'm hoping that would also be 5x to 10x faster for large n...
Lastly with Julia 1.3 it should not be hard to make it multithreaded, right?
from primes.jl.
Related Issues (20)
- We need a new release tagged
- TagBot trigger issue HOT 6
- make factor fast for small integers by memoizing HOT 8
- Consider Montgummary multiplication for Polard Rho HOT 2
- missing type annotation on argument of `radical`
- possible number theory functions to add HOT 6
- nsmooth numbers
- isprime(n) allocates HOT 2
- I would like to re-use Primes.Factorization for polynomials HOT 10
- print(factor(24)) is not executable
- Documentation on website is outdated
- Clarify `isprime` documentation HOT 2
- primes should return an iterator HOT 1
- Option for output of factor(n) as string HOT 1
- UnitRange as argument, feature proposal
- factor a not very big number takes forever HOT 3
- Cool applications
- nextprime of a prime returns the same prime HOT 1
- Feature proposal: Factorization as an abstract integer? HOT 3
- n-th prime is too slow. Consider using primecount as a backend HOT 21
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from primes.jl.