
sciml / quasimontecarlo.jl


Lightweight and easy generation of quasi-Monte Carlo sequences with a ton of different methods on one API for easy parameter exploration in scientific machine learning (SciML)

Home Page: https://docs.sciml.ai/QuasiMonteCarlo/stable/

License: MIT License

Julia 100.00%
sciml scientific-machine-learning physics-informed-learning quasi-monte-carlo parameter-estimation julia

quasimontecarlo.jl's Issues

Randomized Halton sequence?

Question❓

The documentation states that the base for scrambling should match the base of the sequence.

In principle, the base b used for scrambling methods ScramblingMethods(b, pad, rng) can be an arbitrary integer. However, to preserve good Quasi Monte Carlo properties, it must match the base of the sequence to scramble.

How does this relate to the Halton sequence, where the next d primes are used as bases? Are the randomizations applicable here? If so, which base would yield the best results?

What is the base of the LatticeRule? 2?

Remove `Distributions` from dependencies

OK, so I originally thought Distributions was here so you could use QMC sequences to sample from a distribution, but apparently that just doesn't work.

julia> import QuasiMonteCarlo as QMC

julia> using Distributions

julia> QMC.sample(1024, zeros(10), ones(10), SobolSample(), Normal())
ERROR: MethodError: no method matching sample(::Int64, ::Vector{Float64}, ::Vector{Float64}, ::SobolSample, ::Normal{Float64})
Closest candidates are:
  sample(::Integer, ::Union{Number, Tuple, AbstractVector}, ::Union{Number, Tuple, AbstractVector}, ::SobolSample) at ~/.julia/packages/QuasiMonteCarlo/iBmQ6/src/QuasiMonteCarlo.jl:205
  sample(::Integer, ::Union{Number, Tuple, AbstractVector}, ::Union{Number, Tuple, AbstractVector}, ::QuasiMonteCarlo.SamplingAlgorithm) at ~/.julia/packages/QuasiMonteCarlo/iBmQ6/src/QuasiMonteCarlo.jl:146
  sample(::Integer, ::Union{Number, Tuple, AbstractVector}, ::Union{Number, Tuple, AbstractVector}, ::FaureSample) at ~/.julia/packages/QuasiMonteCarlo/iBmQ6/src/Faure.jl:45
  ...

Is there any reason to keep Distributions hanging around, or should we just remove the dependency?

Explanation/Tutorial on effective dimension in documentation

Effective dimension is a very important concept in QMC; it's necessary for understanding which points you should use and how to use them. We should probably create a tutorial explaining it. (Otherwise people are likely to assume QMC points are a drop-in replacement for Monte Carlo points that can be used without additional thought. I know I did before learning more about them 😅 )

Base `b` Gray code

Having a Gray code implementation for arbitrary bases should speed up point set generation.

GridSample appears to sample on a line

It looks like GridSample is generating points in a line. I would expect the points to form a grid.

> using QuasiMonteCarlo
> QuasiMonteCarlo.sample(5, [-6.0, -6.0], [6.0, 6.0], GridSample())
2×5 Matrix{Float64}:
 -4.8  -2.4  0.0  2.4  4.8
 -4.8  -2.4  0.0  2.4  4.8
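
For comparison, here is a sketch (not the package API; `tensor_grid` is a hypothetical helper) of what a true tensor-product grid over the same bounds would look like, built with `Iterators.product`. With n points per dimension it yields n^d points rather than the n diagonal points shown above:

```julia
# Sketch: a tensor-product grid of stratum centers over [lb, ub].
# With n points per dimension this yields n^d points, whereas the
# output above shows only n points along a diagonal line.
function tensor_grid(n, lb, ub)
    axes1d = [range(lb[i] + (ub[i] - lb[i]) / (2n), ub[i] - (ub[i] - lb[i]) / (2n), length = n)
              for i in eachindex(lb)]
    pts = vec(collect(Iterators.product(axes1d...)))
    reduce(hcat, [collect(p) for p in pts])   # d × n^d matrix
end

tensor_grid(5, [-6.0, -6.0], [6.0, 6.0])      # 2×25 matrix covering the square
```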

`GoldenSample` Incorrect

I compared the results here to those given by GoldenSequences.jl, and the implementation here looks wrong. The mistake appears to be that each dimension is based on φ[i]^i for i in 1:d, rather than on powers of the same ratio, φ[d]^i (as it should be).
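
For reference, a self-contained sketch of the generalized golden-ratio (R_d) construction as popularized by Martin Roberts, where every dimension uses powers of a single constant φ_d, the positive root of x^(d+1) = x + 1 (the function names here are illustrative, not the package's):

```julia
# φ_d is the unique positive root of x^(d+1) = x + 1 (φ_1 is the golden
# ratio). A simple fixed-point iteration converges quickly.
function phi_d(d)
    x = 2.0
    for _ in 1:50
        x = (1 + x)^(1 / (d + 1))
    end
    x
end

# Every dimension j uses the SAME ratio φ_d, raised to successive powers.
function r_sequence(n, d)
    φ = phi_d(d)
    α = [mod(1 / φ^j, 1) for j in 1:d]
    [mod(i * α[j], 1) for j in 1:d, i in 1:n]   # d × n matrix
end
```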

Sobol' Sequence Missing First 4 Points

Running the following code

using QuasiMonteCarlo
QuasiMonteCarlo.sample(4,0,1,SobolSample())

yields

4-element Vector{Float64}:
 0.375
 0.875
 0.625
 0.125

However, the first 4 points of a 1-dimensional Sobol' sample should be [0, .5, .25, .75]. The output consists of the points at indices 5 to 8.
I am looping in my collaborator @fjhickernell, an expert in QMC methods who will be valuable to this discussion.
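
For reference, a self-contained base-2 radical inverse (van der Corput) implementation, which is what the 1-D Sobol' sequence reduces to in natural order. Starting the index at 0 gives the expected first block [0, 0.5, 0.25, 0.75]; Gray-code orderings permute points within each block of 2^m but should still begin with the same first block:

```julia
# Base-b radical inverse: reflect the base-b digits of i about the
# radix point. With b = 2 this is the van der Corput sequence.
function vdc(i, b = 2)
    x, f = 0.0, 1.0 / b
    while i > 0
        i, r = divrem(i, b)
        x += r * f
        f /= b
    end
    x
end

[vdc(i) for i in 0:3]  # [0.0, 0.5, 0.25, 0.75]
```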

typo in README

Instead of
As = QuasiMonteCarlo.generate_design_matrices(n,lb,ub,sample_method,k=2)

it works with

k=2
As = QuasiMonteCarlo.generate_design_matrices(n,lb,ub,sample_method,k)

Incorrect Halton Sequence

In v0.3 the numbers returned by HaltonSample() changed, and I believe they are now incorrect.

The first 4 points in d = 2 used to return (in v0.2)

julia> s = QuasiMonteCarlo.sample(4, zeros(2), ones(2), LowDiscrepancySample([2, 3], false))
2×4 Matrix{Float64}:
 0.5       0.25      0.75      0.125
 0.333333  0.666667  0.111111  0.444444

which are the correct four points for the Halton Sequences in d=2 with bases [2, 3].

In the new version, the bases are internally selected with bases = nextprimes(one(n), d) which returns [2, 3] for two dimensions.

However, the points returned are entirely different, and the sequence changes based on the number of samples requested, e.g.

julia> s = QuasiMonteCarlo.sample(4, 2, HaltonSample())
2×4 Matrix{Float64}:
 0.125  0.625     0.375     0.875
 0.125  0.458333  0.791667  0.236111

and

julia> s = QuasiMonteCarlo.sample(5, 2, HaltonSample())
2×5 Matrix{Float64}:
 0.1  0.6       0.35      0.85      0.225
 0.1  0.433333  0.766667  0.211111  0.544444
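
A self-contained reference implementation (a sketch, not the package's code) reproduces the v0.2 values above for bases [2, 3], which supports the claim that the old output was correct:

```julia
# Reference Halton points: dimension j is the base-b_j radical inverse
# of the index. Indices start at 1, matching the v0.2 output above.
function radical_inverse(i, b)
    x, f = 0.0, 1.0 / b
    while i > 0
        i, r = divrem(i, b)
        x += r * f
        f /= b
    end
    x
end

halton(n, bases) = [radical_inverse(i, b) for b in bases, i in 1:n]

halton(4, [2, 3])
# row 1 (base 2): 0.5      0.25     0.75     0.125
# row 2 (base 3): 0.333333 0.666667 0.111111 0.444444
```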

use SafeTestsets

The tests for this package are a bit unwieldy.
Best to split them up using SafeTestsets.

Error in sample() usage

Documentation says this method signature should work.

julia> using QuasiMonteCarlo
julia> d, n = 4, 2^15
(4, 32768)
julia> Z = QuasiMonteCarlo.sample(n, d, SobolSample());
ERROR: MethodError: no method matching sample(::Int64, ::Int64, ::SobolSample)

Closest candidates are:
  sample(::Integer, ::Union{Number, Tuple, AbstractVector}, ::Union{Number, Tuple, AbstractVector}, ::FaureSample)
  ...

Halton Sample not defined

Hi,

Thanks for this great package.

It looks like I can't use HaltonSample. If I run your example code

using QuasiMonteCarlo, Distributions
lb = [0.1,-0.5]
ub = [1.0,20.0]
n = 5
d = 2
s = QuasiMonteCarlo.sample(n,lb,ub,HaltonSample([10,3], false))

I get the following error: UndefVarError: HaltonSample not defined. Is this a known problem or am I missing something?

Thanks

Lazy iteration over points

It's generally a lot better for performance to iterate lazily over QMC points, especially when the number of points is quite large. When users need the whole set, there's nothing stopping them from calling collect.
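
A minimal sketch of the idea, using a Julia generator expression and a base-2 radical inverse as a stand-in for a real QMC generator (names are illustrative):

```julia
# Lazy iteration: points are computed on demand, and `collect`
# materializes them only when the user asks.
function vdc2(i)                 # base-2 radical inverse stand-in
    x, f = 0.0, 0.5
    while i > 0
        i, r = divrem(i, 2)
        x += r * f
        f /= 2
    end
    x
end

lazy_points(n) = (vdc2(i) for i in 0:n-1)   # no allocation until collected

first(lazy_points(10^9))   # cheap: only one point is ever computed
collect(lazy_points(4))    # [0.0, 0.5, 0.25, 0.75]
```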

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

skip=n argument for parallelization?

It would be nice to support parallelization, simply by exposing the skip feature of generators like Sobol.jl. That is, if you want to evaluate on 1000 processors with 2^15 points each, you could pass skip=2^15 * i on processor i, and then average the results.
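
The `skip` keyword is only a proposal here, but the partitioning idea can be sketched with a simple index-offset generator standing in for a real Sobol' generator (everything below is illustrative, not existing API):

```julia
# Worker w handles indices w*chunk .. (w+1)*chunk - 1 of one global
# sequence, i.e. it "skips" w*chunk points before generating its own.
function vdc3(i)                 # base-2 radical inverse stand-in
    x, f = 0.0, 0.5
    while i > 0
        i, r = divrem(i, 2)
        x += r * f
        f /= 2
    end
    x
end

chunk_points(w, chunk) = [vdc3(i) for i in (w * chunk):((w + 1) * chunk - 1)]

chunk = 2^4
parts = [chunk_points(w, chunk) for w in 0:3]   # 4 "workers"
# Concatenating the chunks recovers the global sequence exactly:
vcat(parts...) == [vdc3(i) for i in 0:(4chunk - 1)]
```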

Caching in sampling to have `sample!`

Is your feature request related to a problem? Please describe.

Implement caching in the sample function using preallocated memory.

Describe the solution you’d like

Define a sample! API that takes the sample matrix as an argument, along with the same arguments as sample.
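
A hypothetical sketch of what the signature could look like; the fill logic below is a plain uniform stand-in for whichever QMC rule is being cached:

```julia
# Proposed in-place API: the caller owns the output matrix, so repeated
# sampling reuses memory instead of allocating on every call.
function sample!(out::AbstractMatrix, lb, ub)
    d, n = size(out)
    for j in 1:n, i in 1:d
        out[i, j] = lb[i] + (ub[i] - lb[i]) * rand()  # stand-in for a QMC rule
    end
    out
end

out = Matrix{Float64}(undef, 2, 100)   # preallocate once
sample!(out, [0.0, 0.0], [1.0, 1.0])   # fill in place; reuse across calls
```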

`LatticeRuleSample` documentation vague

The documentation for LatticeRuleSample provides no information or references on how the generating vector is chosen or why it should be expected to perform well, which is especially odd given that users can't supply their own.

Evenly distributed samples?

Please try the following case:

s1 = QuasiMonteCarlo.sample(25, [0.0], [1.0], LatinHypercubeSample())
sort(s1[1,:])'
s2 = QuasiMonteCarlo.sample(25, [0.0], [1.0], LatinHypercubeSample())
sort(s2[1,:])'
s3 = QuasiMonteCarlo.sample(25, [0.0], [1.0], LatinHypercubeSample())
sort(s3[1,:])'
s4 = QuasiMonteCarlo.sample(25, [0.0], [1.0], LatinHypercubeSample())
sort(s4[1,:])'

Output:
(sort(s1[1, :]))' = [0.02 0.06 0.1 0.13999999999999999 0.18 0.22 0.26 0.30000000000000004 0.34 0.38 0.42000000000000004 0.46 0.5 0.54 0.5800000000000001 0.62 0.66 0.7000000000000001 0.74 0.78 0.8200000000000001 0.86 0.9 0.9400000000000001 0.98]
(sort(s2[1, :]))' = [0.02 0.06 0.1 0.13999999999999999 0.18 0.22 0.26 0.30000000000000004 0.34 0.38 0.42000000000000004 0.46 0.5 0.54 0.5800000000000001 0.62 0.66 0.7000000000000001 0.74 0.78 0.8200000000000001 0.86 0.9 0.9400000000000001 0.98]
(sort(s3[1, :]))' = [0.02 0.06 0.1 0.13999999999999999 0.18 0.22 0.26 0.30000000000000004 0.34 0.38 0.42000000000000004 0.46 0.5 0.54 0.5800000000000001 0.62 0.66 0.7000000000000001 0.74 0.78 0.8200000000000001 0.86 0.9 0.9400000000000001 0.98]
(sort(s4[1, :]))' = [0.02 0.06 0.1 0.13999999999999999 0.18 0.22 0.26 0.30000000000000004 0.34 0.38 0.42000000000000004 0.46 0.5 0.54 0.5800000000000001 0.62 0.66 0.7000000000000001 0.74 0.78 0.8200000000000001 0.86 0.9 0.9400000000000001 0.98]

All samples are located at the centers of 25 evenly divided subregions, such as [0, 0.04], [0.04, 0.08], ..., [0.96, 1.0].
Meanwhile, the results of the four sampling calls are identical except for their order before sorting.
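
For contrast, a sketch of a properly randomized Latin hypercube in one dimension: each of the n strata gets exactly one point at a random position, rather than the deterministic stratum centers shown above (`lhs1d` is an illustrative helper, not the package API):

```julia
using Random

# Randomized LHS in 1-D: stratum k = perm[k] receives one point at a
# uniformly random position inside [(perm[k]-1)/n, perm[k]/n).
lhs1d(n) = (randperm(n) .- rand(n)) ./ n

x = lhs1d(25)   # 25 points, one per stratum, in random positions
```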

`KroneckerSample` problems

I can't work out how KroneckerSample is supposed to work. I'm not sure whether this is because the docstring is extremely unclear, or if the code is actually broken.

Better Sampling Interface

The sampling interface is a bit of a mess at the moment. The way users are asked to specify sample sizes is clunky and unnatural, demanding that they perform unusual calculations to avoid shooting themselves in the foot. Sobol nets, for example, only exist for powers of two, but users can request any sample size. At the moment we just give users these sample sizes by truncating the sequence, even though these truncated sequences are not guaranteed to converge to the correct answer, and will typically perform worse than Monte Carlo point sets. For Faure nets, the situation is safer (we error if the sample size is inappropriate) but even more annoying for users (they have to calculate multiples of powers of prime numbers).

For digital nets, a better approach is outlined here, and used by Art Owen in his QMC packages. Rather than requesting a sample size, users provide parameters for a point set's stratification properties (m), the number of independent replications they'd like (λ), and a base (b), and then receive λ*b^m points. I'd also suggest a feature where users can set an approximate sample size with a keyword and receive the smallest valid net at least as large as the requested sample size.

This would be a breaking change for sure.
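
The proposed keyword behaviour amounts to a small calculation, sketched here with hypothetical names:

```julia
# Given a requested sample size n, find the smallest valid net size
# λ*b^m with λ*b^m ≥ n.
function net_size(n; b = 2, λ = 1)
    m = 0
    while λ * b^m < n
        m += 1
    end
    (m = m, size = λ * b^m)
end

net_size(1000)          # (m = 10, size = 1024)
net_size(1000; b = 3)   # (m = 7, size = 2187)
```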

QMC vs Randomized QMC

Hello,
I was wondering if I was misunderstanding some things about this package:

  1. It gathers different options to fill d-dimensional intervals with sequences.
  2. Quasi-Monte Carlo uses deterministic sequences.
  3. To me, the term sample suggests some kind of randomness, so it should refer to randomized quasi-Monte Carlo.
    Indeed, for some sequences, e.g. GridSample and LatticeRuleSample, this is what happens:
    the LatticeRuleSample points, for example, are generated and then shifted randomly.
    However, other sequences, e.g. SobolSample and LowDiscrepancySample, are generated deterministically by the same sample function.

Am I missing something, or is the sample function used both for deterministic quasi-Monte Carlo sequences and randomized quasi-Monte Carlo sequences?
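
The distinction in miniature: a Cranley–Patterson random shift turns a deterministic point set into a randomized QMC point set while preserving its structure (a sketch in 1-D with a trivial lattice):

```julia
deterministic = [i / 8 for i in 0:7]            # a fixed 1-D lattice
shift = rand()                                   # one uniform random shift
randomized = mod.(deterministic .+ shift, 1)     # RQMC: random, still evenly spread
```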

Citation

I am finishing revisions on a paper where I use QuasiMonteCarlo.jl.
I read about how to cite Julia packages (here).
What is the common practice for SciML packages (I guess the answer is here)?
As there is no paper yet, would something like the following work?

@misc{2019_QMC_jl,
title = {\texttt{QuasiMonteCarlo.jl}},
author = {{Various contributors}},
year = {2019},
howpublished = {\url{https://github.com/SciML/QuasiMonteCarlo.jl}}
}

BTW, any paper plans? Since about 1/4 of the code comes from various packages (Sobol, Lattice, Latin), I am not sure if this is a legitimate idea or not.

Error when the number of samples is zero

MWE:

julia> QuasiMonteCarlo.sample(0, [0.0], [1.0], SobolSample())
ERROR: MethodError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer
Stacktrace:
  [1] reduce_empty(op::Base.MappingRF{typeof(eltype), typeof(promote_type)}, #unused#::Type{Vector{Float64}})
    @ Base ./reduce.jl:356
  [2] reduce_empty_iter
    @ ./reduce.jl:379 [inlined]
  [3] mapreduce_empty_iter(f::Function, op::Function, itr::Vector{Vector{Float64}}, ItrEltype::Base.HasEltype)
    @ Base ./reduce.jl:375
  [4] _mapreduce
    @ ./reduce.jl:427 [inlined]
  [5] _mapreduce_dim
    @ ./reducedim.jl:365 [inlined]
  [6] #mapreduce#765
    @ ./reducedim.jl:357 [inlined]
  [7] mapreduce
    @ ./reducedim.jl:357 [inlined]
  [8] reduce(#unused#::typeof(hcat), A::Vector{Vector{Float64}})
    @ Base ./abstractarray.jl:1653
  [9] sample(n::Int64, lb::Vector{Float64}, ub::Vector{Float64}, #unused#::SobolSample)
    @ QuasiMonteCarlo ~/.julia/packages/QuasiMonteCarlo/L0JHj/src/QuasiMonteCarlo.jl:186
 [10] top-level scope
    @ REPL[13]:1

Shouldn't this error be a proper exception, @ChrisRackauckas?

cc: @AnasAbdelR

respect the element type of the bounds

When the bounds are of Float32, the sampled data should have element type Float32 as well.

julia> QuasiMonteCarlo.sample(10,[-1f0,-1f10],[1f0,1f0],UniformSample())
2×10 Matrix{Float64}:
  0.467086    0.37927    -0.781625    0.36063       0.502589   -0.836102   -0.362692
 -9.25426e7  -3.14634e9  -9.59386e9  -6.39744e9     -1.94932e9  -5.11313e9  -4.61389e9
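
A sketch of the fix: derive the output element type from the bounds with promote_type instead of hard-coding Float64 (`scaled_rand` is an illustrative helper, not the package API):

```julia
# Respect the element type of the bounds: a Float32 box yields a
# Float32 sample matrix.
function scaled_rand(n, lb::AbstractVector, ub::AbstractVector)
    T = promote_type(eltype(lb), eltype(ub))
    out = rand(T, length(lb), n)
    @. out = lb + (ub - lb) * out   # scale columns into [lb, ub)
    out
end

scaled_rand(10, [-1f0, -1f10], [1f0, 1f0]) isa Matrix{Float32}  # true
```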
