Git Product home page Git Product logo

spacefillr's Introduction

spacefillr

R build status R-CMD-check

spacefillr is a package for generating random and quasi-random space-filling sequences. Supports the following sequences: ‘Halton’, ‘Sobol’, ‘Owen’-scrambled ‘Sobol’, ‘Owen’-scrambled ‘Sobol’ with errors distributed as blue noise, progressive jittered, progressive multi-jittered (‘PMJ’), ‘PMJ’ with blue noise, ‘PMJ02’, and ‘PMJ02’ with blue noise. Includes a ‘C++’ ‘API’. Methods derived from “Constructing Sobol sequences with better two-dimensional projections” (2012) S. Joe and F. Y. Kuo, and “Progressive Multi‐Jittered Sample Sequences” (2018) Christensen, P., Kensler, A. and Kilpatrick, C, and “A Low-Discrepancy Sampler that Distributes Monte Carlo Errors as a Blue Noise in Screen Space” (2019) E. Heitz, B. Laurent, O. Victor, C. David and I. Jean-Claude.

Installation

You can install the released version of spacefillr from Github with the following:

remotes::install_github("tylermorganwall/spacefillr")

Example

Generating random sequences is an important topic when trying to solve problems using Monte Carlo methods. Careful choice of the correct random (or quasi-random) sequence can result in much faster convergence to the final answer. Let’s take the problem of solving for the value of Pi by counting the number of values that fall within the unit circle. If we generate random values between 0 and 1, count how many fall within the unit circle, multiply by 4 and divide by the number of samples, we will get an estimate for Pi.

Let’s first try this with purely uniform random numbers:

library(ggplot2)
library(dplyr)
set.seed(1)
results_rand = as.data.frame(matrix(runif(2000),ncol=2))
colnames(results_rand) = c("x","y")

results_rand %>% 
  mutate(inside = x^2 + y^2 < 1) %>% 
  ggplot() + 
  geom_point(aes(x=x,y=y,color=inside)) +
  coord_fixed()

We can plot the convergence:

results_rand_convergence = results_rand %>% 
  mutate(inside = x^2 + y^2 < 1, iteration = 1:n()) %>% 
  mutate(pi_estimate = 4*cumsum(inside)/iteration) 

ggplot(results_rand_convergence) +
  geom_hline(yintercept = pi, alpha=0.25,color="black",size=2) +
  geom_line(aes(x=iteration,y=pi_estimate))

Uniform random numbers, however, have poor convergence properties. We can do better by using spacefillr to generate sequences that are more evenly distributed throughout space, which helps these Monte Carlo integrals converge more quickly.

We’ll first start with a Halton sequence:

library(spacefillr)

results_halton = as.data.frame(generate_halton_random_set(1000,2,seed=1))
colnames(results_halton) = c("x","y")

results_halton %>% 
  mutate(inside = x^2 + y^2 < 1) %>% 
  ggplot() + 
  geom_point(aes(x=x,y=y,color=inside)) +
  coord_fixed()

These samples are much more evenly distributed. Let’s compare the convergence of this sequence to random uniform numbers:

results_halton_convergence = results_halton %>% 
  mutate(inside = x^2 + y^2 < 1, iteration = 1:n()) %>% 
  mutate(pi_estimate = 4*cumsum(inside)/iteration) 

ggplot() +
  geom_hline(yintercept = pi, alpha=0.25,color="black",size=2) +
  geom_line(data=results_rand_convergence,aes(x=iteration,y=pi_estimate), color="red") +
  geom_line(data=results_halton_convergence,aes(x=iteration,y=pi_estimate), color="black")

We can see this set of quasi-random numbers converges significantly faster than random uniform. Let’s now use the gold-standard quasi-random sequence: Owen-scrambled Sobol. (note: another function generate_sobol_owen_fast_set() is also included in the package that outputs near-ideal Owen scrambled Sobol numbers, but is much faster).

results_owen_sobol = as.data.frame(generate_sobol_owen_set(1000,2,seed=1))
colnames(results_owen_sobol) = c("x","y")

results_owen_sobol %>% 
  mutate(inside = x^2 + y^2 < 1) %>% 
  ggplot() + 
  geom_point(aes(x=x,y=y,color=inside)) +
  coord_fixed()

Let’s look at the convergence properties of this set.

results_owen_sobol_convergence = results_owen_sobol %>% 
  mutate(inside = x^2 + y^2 < 1, iteration = 1:n()) %>% 
  mutate(pi_estimate = 4*cumsum(inside)/iteration) 

ggplot() +
  geom_hline(yintercept = pi, alpha=0.25,color="black",size=2) +
  geom_line(data=results_rand_convergence,aes(x=iteration,y=pi_estimate), color="red") +
  geom_line(data=results_halton_convergence,aes(x=iteration,y=pi_estimate), color="black") + 
  geom_line(data=results_owen_sobol_convergence,aes(x=iteration,y=pi_estimate), color="green") +
  scale_y_continuous(limits=c(3,4))

Owen-scrambled Sobol converges the fastest out of all of the sequences. Also included are the progressive multijittered sequences, of which pmj02 has similar convergence properties to Owen-scrambled Sobol.

results_owen_pmj02 = as.data.frame(generate_pmj02_set(1000,seed=1))
colnames(results_owen_pmj02) = c("x","y")

results_owen_pmj02 %>% 
  mutate(inside = x^2 + y^2 < 1) %>% 
  ggplot() + 
  geom_point(aes(x=x,y=y,color=inside)) +
  coord_fixed()

We can compare this to Owen-scrambled Sobol:

results_owen_pmj02_convergence = results_owen_pmj02 %>% 
  mutate(inside = x^2 + y^2 < 1, iteration = 1:n()) %>% 
  mutate(pi_estimate = 4*cumsum(inside)/iteration) 

ggplot() +
  geom_hline(yintercept = pi, alpha=0.25,color="black",size=2) +
  geom_line(data=results_owen_pmj02_convergence,aes(x=iteration,y=pi_estimate), color="purple") +
  geom_line(data=results_owen_sobol_convergence,aes(x=iteration,y=pi_estimate), color="green") +
  scale_y_continuous(limits=c(2.4,3.6))
#> Warning: Removed 2 row(s) containing missing values (geom_path).

#> Warning: Removed 2 row(s) containing missing values (geom_path).

An interesting properties of pmj02 sequences (absent from Sobol sequences) is that they can be generated with blue noise properties.

results_owen_pmj02bn = as.data.frame(generate_pmj02bn_set(1000,seed=1))
colnames(results_owen_pmj02bn) = c("x","y")

results_owen_pmj02bn %>% 
  mutate(inside = x^2 + y^2 < 1) %>% 
  ggplot() + 
  geom_point(aes(x=x,y=y,color=inside)) +
  coord_fixed()

This does not help convergence, but has some applications in some types of Monte Carlo problems.

C++ API

This package also includes a C++ API for accessing Halton and Sobol sequences, which allows you to generate these sequences quickly in your C++ code. You can access a halton sequence by creating a Halton object:

#include "halton_sampler.h"

spacefillr::Halton_sampler hs;
hs.init_faure();
//This generates the 10th value of the 2nd dimension of the Halton sequence
double val = hs.sample(10,2);
//This generates a 3D point in space
double val[3] = {hs.sample(1,1), hs.sample(1,2), hs.sample(1,3)};

To access Sobol values, call the following functions (no object is needed):

#include "sobol.h"

//Sobol, no Owen-scrambling: 1st value, 10th dimension
double val1 = spacefillr::sobol_single(1, 10)

//Fast generator: 1st value, 10th dimension,  setting the random seed for that sequence to 1234
double val3 = spacefillr::sobol_owen_single(1, 10, 1234)

//This generates a 3D point in space
double val[3] = {spacefillr::sobol_owen_single(1,1, 10),
                 spacefillr::sobol_owen_single(1,2, 10), 
                 spacefillr::sobol_owen_single(1,3, 10)};

spacefillr's People

Contributors

tylermorganwall avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

spacefillr's Issues

Use updated fast Owen hash.

Looking at sobol.h, it appears that you're using the old hash from my post https://psychopath.io/post/2021_01_30_building_a_better_lk_hash

I'm really glad you're using it, by the way! No problem with that at all--that's why I blogged about it. However, someone found a potentially serious problem with those hashes. I've fixed the problem in a new hash, which I've included in an update to the blog post, along with an explanation of the problem and the fix.

Anyway, here's the new hash:

x ^= x * 0x3d20adea
x += seed
x *= (seed >> 16) | 1
x ^= x * 0x05526c56
x ^= x * 0x53a22864

It's much better, and I strongly recommend switching to it.

Installation error on Linux

Trying to install spacefillr on linux I get an error

I run remotes::install_github("tylermorganwall/spacefillr")

I get 3000 + errors similar to this last one.

../inst/include/halton_sampler.h:3286:61: error: exponent has no digits
 3286 |             m_perm1619[(index / 2621161u) % 1619u]) * float(0x1.fffffcp-1 / 4243659659u); // Results in [0,1).
      |                                                             ^~~~~~~~~~~
In file included from generate_point_functions.cpp:8:
../inst/include/sobol.h: In function ‘float u32_to_0_1_f32(uint32_t)’:
../inst/include/sobol.h:11:29: error: unable to find numeric literal operator ‘operator""f’
   11 |   return(std::fmin(n * 0x1p-32f /* 1/2^32 */,
      |                             ^~~
../inst/include/sobol.h:11:29: note: use ‘-fext-numeric-literals’ to enable more built-in suffixes
make: *** [/usr/lib/R/etc/Makeconf:177: generate_point_functions.o] Error 1
make: *** Waiting for unfinished jobs....
ERROR: compilation failed for package ‘spacefillr’

Sys.info():

sysname 
                                                           "Linux" 
                                                           release 
                                             "5.13.0-7614-generic" 
                                                           version 
"#14~1631647151~21.04~930e87c-Ubuntu SMP Fri Sep 17 00:24:58 UTC " 
                                                          nodename 
                                                          "pop-os" 
                                                           machine 
                                                          "x86_64" 

And R.version

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          1.0                         
year           2021                        
month          05                          
day            18                          
svn rev        80317                       
language       R                           
version.string R version 4.1.0 (2021-05-18)
nickname       Camp Pontanezen             

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.