Comments (15)
Looks great! Thank you very much for suggesting this method to sample from all floats in [0,1).
I made some comments in your PR, and can you also try to make the CI tests pass?
from randomnumbers.jl.
@eapax, I finally managed to write an algorithm for this problem in essentially 3 lines :)
from randomnumbers.jl.
Okay, rand(UInt128)
introduces a bit of overhead, doing rand(UInt64)
instead shaves off another 1.5ns
ui = rand(Xoroshiro128P,UInt64) >> 1 # first bit always 0
# count leading zeros of UInt128 in two steps: 2. only if
lz = ui == 0 ? # 2 leading zeros is at 50%, 3 at 25%, etc.
64+leading_zeros(rand(Xoroshiro128P,UInt64) | 0x1) : leading_zeros(ui)
e = ((127 - lz) % UInt32) << 23 # convert lz to exponent bits of float32
then yields
julia> @btime randfloat($Float32)
3.060 ns (0 allocations: 0 bytes)
So 2x faster than MersenneTwister and only 2x slower than rand(Xoroshiro128Plus,Float32)
. The "I don't care about number smaller than 5e-20"-version which avoids the ?-branching is also with 2.75ns probably only irrelevantly faster.
from randomnumbers.jl.
Turned this into a PR https://github.com/sunoru/RandomNumbers.jl/pull/73
from randomnumbers.jl.
Can you comment on whether it's possible with, say, Xoroshiro128Plus
(but also other RNGs) to hit UInt64(0)
? I'm not sure whether that's the reason why it's not allowed as seed.
from randomnumbers.jl.
After ui = rand(RNG,UInt64) >> 1
the ui==0
case can also be hit with ui=0x1
, so for Float32 that should be fine. For Float64, however, the idea is to sample several ui = rand(RNG,UInt64)
and only sample the next when the previous one is ui==0
but would that ever happen?
from randomnumbers.jl.
That's also a nice quality test
julia> using RandomNumbers
julia> x = rand(Float32,100_000_000);
julia> y = randfloat(Float32,100_000_000);
julia> length(Set(x))
8388552
julia> length(Set(y))
32781328
2^23=8388608
is the size of the set that rand(Float32) draws from, so on when you draw 1e8 float32s then you hit almost all of them and on average each of them 10 times. This problem is clearly mitigated with randfloat
and probably as good as possible
julia> z = Float32.(rand(100_000_000));
julia> length(Set(z))
32784392
from randomnumbers.jl.
Can you comment on whether it's possible with, say,
Xoroshiro128Plus
(but also other RNGs) to hitUInt64(0)
? I'm not sure whether that's the reason why it's not allowed as seed.
All RNGs in xorshift family cannot hit UInt64(0)
, since the rotation and xor operations on zero values do not have any effects. This is also the reason why it's not allowed as seed, since there is not much difference between a seed and a state in xorshift.
After
ui = rand(RNG,UInt64) >> 1
theui==0
case can also be hit withui=0x1
, so for Float32 that should be fine. For Float64, however, the idea is to sample severalui = rand(RNG,UInt64)
and only sample the next when the previous one isui==0
but would that ever happen?
I am not sure if we really need to resample a random number if we hit ui==0
. The probability is just too small. As a result, a large number of numbers between 0-2^-64, around 2^23 * 63 = 528482304 states (for Float32) have to be wasted but I think it is OK?
from randomnumbers.jl.
And the quality test looks good!
from randomnumbers.jl.
I am not sure if we really need to resample a random number if we hit
ui==0
. The probability is just too small. As a result, a large number of numbers between 0-2^-64, around 2^23 * 63 = 528482304 states (for Float32) have to be wasted but I think it is OK?
Yes, but as I said earlier
The "I don't care about number smaller than 5e-20"-version which avoids the ?-branching is also with 2.75ns probably only irrelevantly faster.
However, we could also use the last 23 bits of ui
for the significand, if we already say that's too unlikely to be important, which might speed up things further?
from randomnumbers.jl.
function randfloatx(rng::AbstractRNG,::Type{Float32})
ui = rand(rng,UInt64) >> 1 # first bit always 0
e = ((127 - leading_zeros(ui)) % UInt32) << 23
return reinterpret(Float32,e | ((ui % UInt32) & 0x007f_ffff))
end
times as
julia> @btime randfloatx($Float32)
1.904 ns (0 allocations: 0 bytes)
That's with Xoroshiro128Plus and reusing the last 23bits for the significand. It still samples from 5e-20 to prevfloat(1f0)
and only 5e-20 to 5e-13 will be affected with the reusing of the last bits for the significand.
from randomnumbers.jl.
It looks correct. I am not sure if this influence could be important in some cases, but I guess it's not. And the speed is nice, maybe we can rerun the TestU01 big crush tests on it to see the statistical performance.
from randomnumbers.jl.
I guess that if rand(Float32)/rand(Float64) already passes TestU01 then randfloat should so too. Unless, currently there's 0 chance that randfloat produces 0f0/0. Is that important?
from randomnumbers.jl.
I think a better way to do things is
function randfloat(rng::AbstractRNG,::Type{Float32})
ui = rand(rng,UInt64)
lz = leading_zeros(ui)
e = ((126 - lz) % UInt32) << 23
ui = lz > 40 ? rand(rng,UInt64) : ui
return reinterpret(Float32,e | ((ui % UInt32) & 0x007f_ffff))
end
The ?-branch doesn't introduce any overhead for me and avoids the bits to be double used. Then randfloat really samples all 64*2^23=536870912 floats between 2.7105054f-20 and prevfloat(1f0). That's 50% of all float32s in [0,1). rand(Float32), instead, only samples from 2^32=8388608 (<1%).
Similar for Float64: randfloat samples from 64x as many float64s in [0,1) than rand. That's an improvement from 0.1% to 6% of all floa64s in [0,1). Timings are now
julia> @btime rand($Xoroshiro128P,$Float64)
1.849 ns (0 allocations: 0 bytes)
julia> @btime rand($Xoroshiro128P,$Float32)
1.544 ns (0 allocations: 0 bytes)
with rand, vs
julia> @btime randfloat($Xoroshiro128P,$Float64)
2.149 ns (0 allocations: 0 bytes)
julia> @btime randfloat($Xoroshiro128P,$Float32)
1.763 ns (0 allocations: 0 bytes)
with randfloat.
from randomnumbers.jl.
I'll close this as it got merged in https://github.com/sunoru/RandomNumbers.jl/pull/73. The descriptions here can be used for the docs, if there's need to update those.
from randomnumbers.jl.
Related Issues (20)
- Error when trying to compile as dependency HOT 2
- zero seeds give zeros HOT 4
- Documentation link on README is broken atm? HOT 1
- Please consider implementing xoshiro256** (and/or xoshiro256+) HOT 4
- Move Random123 to a separate package? HOT 5
- ArgumentError: Package RandomNumbers does not have Requires in its dependencies: HOT 8
- Wrong implementation of Mersenne Twister HOT 2
- Storing RNG state HOT 2
- New release HOT 5
- FYI: Xoroshiro128Plus 2.9 times slower in real-world app (and Julia's default only 1.9 times slower) HOT 4
- chaotic prng
- Testing PRNGs to help in the selection of high-quality PRNGs
- possible test failure in upcoming Julia version 1.5
- Xoroshiro128Plus slows down relative to Base MT with increasing sample size HOT 1
- Very cheap PRNG HOT 8
- Why is the Mersenne Twister in Base faster? HOT 1
- Xorshift128Plus has several catastrophically bad seeds HOT 4
- TagBot trigger issue HOT 4
- Ambiguity error with complex numbers HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from randomnumbers.jl.