
Comments (8)

devmotion commented on August 31, 2024

> To answer your question about the performance, yes, it does affect performance in the following case:

The example is a bit contrived, since ideally you would use the sampler if you want to sample from the same distribution in a loop (similar to how the Random docs show that you should construct and use a Random.Sampler if you want to call rand repeatedly). BTW, there's an issue to unify sampler with Random.Sampler a bit more: #1316.

The single variate sampling does not use samplers, hence it behaves differently.
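For reference, a minimal sketch of the "construct the sampler once, then reuse it" pattern mentioned above. The parameters, seed, and loop count are arbitrary choices for illustration and are not taken from the thread.

```julia
using Distributions, Random

rng = MersenneTwister(1)   # arbitrary seed, only for reproducibility
d = Beta(2.0, 3.0)         # arbitrary example parameters
x = zeros(10)

# Build the sampler once, outside the loop, so that any
# algorithm-dependent precomputation is done a single time.
s = sampler(d)

for _ in 1:1_000
    rand!(rng, s, x)       # fills x in place on every iteration
end

# The analogous pattern from the Random standard library:
sp = Random.Sampler(rng, 1:20)
n = rand(rng, sp)          # similar to rand(rng, 1:20)
```

This only helps when the distribution stays the same across iterations; if the parameters change on every call, the sampler has to be rebuilt each time.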

from distributions.jl.

devmotion commented on August 31, 2024

The reason for this observation is that sampler(Beta(...)) is not type-stable: depending on the parameters, a different algorithm is used for sampling. #1350 "fixed" the performance and allocation issues arising from this type instability with a function barrier. Based on the benchmarks in that PR, it seems it doesn't harm performance anymore.
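To illustrate what a function barrier means here, a small self-contained sketch follows. All type and function names (SamplerA, SamplerB, pick_sampler, fill_with!, sample_into!) are hypothetical and made up for this example; this is not the code from #1350.

```julia
using Random

# Two hypothetical sampler types standing in for different algorithms.
struct SamplerA
    scale::Float64
end
struct SamplerB
    shift::Float64
end

draw(rng, s::SamplerA) = s.scale * rand(rng)
draw(rng, s::SamplerB) = s.shift + rand(rng)

# Type-unstable: the returned type depends on the runtime value of p.
pick_sampler(p::Float64) = p > 1 ? SamplerA(p) : SamplerB(p)

# The barrier: inside this function s has a concrete type, so the loop
# body is compiled for that type.
function fill_with!(rng, s, x)
    for i in eachindex(x)
        x[i] = draw(rng, s)
    end
    return x
end

# The instability is confined to this single call into the barrier.
sample_into!(rng, p, x) = fill_with!(rng, pick_sampler(p), x)
```

The point of the design is that the unstable choice of algorithm happens once per call, outside the hot loop, instead of once per element.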


bvdmitri commented on August 31, 2024

OK, that explains the situation a bit. The real issue is that sampler(Beta(..., ...)) is type-unstable: the type it returns depends on the parameters. But can it always return the same object, which would simply have an if inside depending on the parameters? E.g.

sampler(d::Beta) = BetaSampler(d, other_fields...) # type-stable, always returns the same type

function rand!(rng, sampler::BetaSampler, container)
    if some_condition(sampler)
        one_algorithm!(rng, container)
    else
        another_algorithm!(rng, container)
    end
end

Or a specialized method for rand! for Beta would also solve this issue entirely, e.g.

function rand!(rng, dist::Beta, n)
    if some_condition(dist)
        rand!(rng, OneBetaSampler(dist), n)
    else
        rand!(rng, AnotherBetaSampler(dist), n)
    end
end


devmotion commented on August 31, 2024

> But can it always return the same object

The motivation for sampler is to precompute algorithm-dependent quantities when drawing multiple variates. The number and type of these quantities differ considerably between algorithms, as you can see in https://github.com/JuliaStats/Distributions.jl/blob/master/src/samplers/gamma.jl. Always computing all of them would be a bit wasteful, I think.
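As a rough illustration of that point, here is a sketch with two hypothetical sampler types for a shape parameter a. The names and the cutoff at a < 1 are assumptions made for this example and are not the types in the linked file; the constants in the second type follow the usual setup of a Marsaglia-Tsang style method.

```julia
# Hypothetical sampler types; illustrative only, not the ones in Distributions.jl.
struct SmallShapeSampler
    inv_a::Float64   # 1 / a, reused on every draw
end

struct LargeShapeSampler
    d::Float64       # a - 1/3
    c::Float64       # 1 / sqrt(9d)
end

# Each algorithm precomputes only what it needs, so the two types carry
# a different number of fields.
function make_sampler(a::Float64)
    if a < 1
        return SmallShapeSampler(1 / a)
    else
        d = a - 1 / 3
        return LargeShapeSampler(d, 1 / sqrt(9d))
    end
end
```

A single sampler type covering both cases would need to hold (and compute) the fields of every algorithm, which is the waste referred to above.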


devmotion commented on August 31, 2024

> Or a specialized method for rand! for Beta would also solve this issue entirely

I still wonder, is there any issue here? It seems the function barrier in #1350 fixed the performance issues.


bvdmitri commented on August 31, 2024

It shouldn't recompute algorithm-dependent quantities in my second proposal with a specialized rand!, since it is basically exactly the same code with an explicit static if statement (multiple dispatch is really just a sophisticated runtime if statement).

To answer your question about the performance, yes, it does affect performance in the following case:

julia> using Distributions, Random, BenchmarkTools

julia> foo(x, dist) = foreach(1:100_000) do _
           rand!(dist, x)
       end

julia> x = zeros(10);

julia> dist = Beta(1, 1)
Beta{Float64}(α=1.0, β=1.0)

julia> sr = sampler(dist)

julia> @btime foo($x, $dist);
  30.232 ms (100000 allocations: 6.10 MiB)

julia> @btime foo($x, $sr);
  27.726 ms (0 allocations: 0 bytes)

It's not much in this particular example, but it does accumulate in our larger program, and it also allocates extra memory (while the whole idea of rand! is to sample in place).


bvdmitri commented on August 31, 2024

Another thing: I noticed that sampling a single point is fine and is in fact type-stable.

julia> using JET

julia> @report_opt rand(Beta(1, 1))
No errors detected

That suggests that there is potential for improvement in the in-place version.
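For anyone without JET at hand, a similar check can be sketched with Julia's built-in reflection. The functions and types below (AlgoA, AlgoB, stable, unstable, inferred) are hypothetical stand-ins, not code from Distributions.jl, and Base.return_types is a reflection helper rather than a stability guarantee, so treat this as a rough diagnostic.

```julia
struct AlgoA end
struct AlgoB end

# Return type depends on a runtime value: inferred as a Union.
unstable(p::Float64) = p > 1 ? AlgoA() : AlgoB()

# Same branch, but both arms return a Float64.
stable(p::Float64) = p > 1 ? 1.0 : 2.0

# Inferred return type for a single method and argument signature.
inferred(f, argtypes) = only(Base.return_types(f, argtypes))

is_stable   = isconcretetype(inferred(stable, (Float64,)))
is_unstable = !isconcretetype(inferred(unstable, (Float64,)))
```

The same idea applies to the thread's case: a sampler constructor whose inferred return type is not concrete is a candidate for the kind of dynamic dispatch discussed here.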


bvdmitri commented on August 31, 2024

Unfortunately, in the real code the distribution is not the same.

