Git Product home page Git Product logo

Comments (21)

oscardssmith avatar oscardssmith commented on August 11, 2024 1

I think it's possible to hit a better point on the speed/complexity graph. I'll take a shot at simplifying this.

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

It would be much better (in my opinion) to get a fastish Julia implementation. The current implementation is a loop that calls nextprime, which is very bad. The decent simple implementation would be a segmented prime sieve that counts up as it finishes the segment. This should be relatively simple to implement and fast.

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

Thanks for the response. I am interested in implementing the Meissel-Lehmer algorithm. It will take months of part-time work.
Is anyone else working on an implementation of prime-count?

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

#102 is my 1 hour effort. It computes prime(10^7) in .14 seconds which isn't optimal, but is a lot better.

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

Yes, it is certainly much better. However primecount is instantaneous even at 10e14, which is really impressive. I am wondering if there is interest in an implementation of the Messel - Lehmer method.

https://www.ams.org/journals/mcom/1996-65-213/S0025-5718-96-00674-6/S0025-5718-96-00674-6.pdf

Is having best of class algorithms in the scope of this project?

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

I'd consider it in scope (although finding a reviewer with sufficient knowledge to approve it might be hard). Also, it seems a little premature to me given that we don't even have a segmented prime sieve yet (or efficient primality checking).

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

It is true that the algorithm needs a segmented sieve and it is a priority. I could work on that.
Is there a specification or a potential interface for a segmented sieve? Also, it seems that the author of the following project

https://github.com/haampie/FastPrimeSieve.jl

is considering to push it to Primes.jl but is currently lacking an interface.

In general what would be the plan for a segmented prime sieve?

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

@haampie would it be reasonable to make Primes.jl depend of FastPrimeSieve?

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

If I might, could I ask again what would be the objection in implementing an API to the primecount library? It is very actively maintained (>4000 commits in 5 days) and it is faster than mathematica it self. What are the chances that a good implementation in pure julia will be within an order of magnitude from this?

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

Honestly, I think it is probably too big of a dependency, but if you want to use this in Julia, https://github.com/JuliaBinaryWrappers/primecount_jll.jl already exists as a wrapper library. I don't think there is any technical reason why a Julia version would be slower, although it would require a significant amount of effort. One potentially interesting approach would be to use would be a GPU based segmented sieve (eg https://github.com/curtisseizert/CUDASieve).

from primes.jl.

haampie avatar haampie commented on August 11, 2024

#87 notice this PR, but I did not add many tests :( so that's probably why it was not merged

from primes.jl.

haampie avatar haampie commented on August 11, 2024

I'm also afraid that the code is too hard to follow, so that if there's ever an issue with it, I'm the one to fix it, but I may not have the time for it

from primes.jl.

haampie avatar haampie commented on August 11, 2024

The segmented sieve + wheel concept is very nice though :) also the hand macro based unrolled sieving loop. I vaguely remember though that codegen is not optimal compared to primesieve's inner loop (something where LLVM can't be forced to generate a jump table). May have to revisit now that Julia uses LLVM 12

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

Do you have a copy of the non macro based version? IMO, that would already be a major simplification.

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

If you replace the first line with function segmented_sieve(limit::Int64, L1D_CACHE_SIZE=128*1024) it's 70x faster. Your version has L1D_CACHE_SIZE as a non-const global variable.

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

With that change, I'm seeing this code as 5-10x slower than FastPrimeSieve which is pretty good given that this can be significantly optimized.

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

I deleted my previous post, because I found a mistake (L1D_CACHE_SIZE must be an argument, not a global)

I translated into julia a C++ code from primesieve.org for a simple segmented prime sieve

https://github.com/kimwalisch/primesieve/wiki/Segmented-sieve-of-Eratosthenes

The translation is not very thorough, but the results are encouraging. It is 70% slower than the C++ version. I am wondering if this can be optimized to reach the speed of the C++ version.

I guess the argument is, does it make sense to implement something complex from scratch only to find out that julia gives you a 70% penalty?

function segmented_sieve(limit::Int64, L1D_CACHE_SIZE = 128*1024)

    sqrt = isqrt(limit)
    segment_size = max(sqrt, L1D_CACHE_SIZE)
    count = (limit < 2) ? 0 : 1;

    i = 3
    n = 3
    s = 3

    sieve = zeros(Bool, L1D_CACHE_SIZE)
    is_prime = ones(Bool, sqrt)

    primes = Vector{Int64}()
    multiples = Vector{Int64}()

    for low in 0:segment_size:limit

        fill!(sieve, true)
        high = min(limit, low + segment_size - 1)

        while i*i <= high
            if is_prime[i]
                is_prime[ i*i : i : sqrt ] .= false

            end
            i += 2
        end

        while s*s <= high

            if is_prime[s]
                push!(primes, s)
                push!(multiples, s*s - low)
            end
            s += 2
        end

        for a in 1 : length(primes)

            p = primes[a]
            m = multiples[a]

            j = m
            k = 2*p
            while j < segment_size
                sieve[j] = false
                j += k
            end
            multiples[a] = j - segment_size
        end

        while n <= high
            if sieve[n - low]
                count += 1
            end
            n += 2
        end

    end

    println("Found ", count, " primes")

end

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

@inbounds for low in 0:segment_size:limit makes this about 2x faster as a simple optimization.

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

It is true that it is a bit faster, but not 2 times. However, I got to 50% away from the C++ version, which is something good. I guess, if I can get with 10%, then it is a deal.

from primes.jl.

oscardssmith avatar oscardssmith commented on August 11, 2024

what number are you testing up to? I saw a reduction from 15 to 8 seconds for segmented_sieve(10^10)

from primes.jl.

galanakis avatar galanakis commented on August 11, 2024

Hmmm in mine it dropped from 8.9 to 7.5 (with @inbounds) for the same number. The C++ version finishes in 5.26 seconds.

from primes.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.