bryc / code Goto Github PK
View Code? Open in Web Editor NEWLicense: Other
License: Other
With minimal additional code, the string-based hash functions, like cyrb53
, could work with any numeric array-like input, like Uint8Array, Uint16Array, Buffer etc. Below you can find a sample script where I did this for cyrb53a
. Running the script as node cyrb53a.js 0
will run a ~20sec benchmark for the original version, running node cyrb53a.js 1
will do the same for the modified version. On my system (Ubuntu 16.04, AMD Ryzen 7 1700) with Node.js v12.19 and v14.15 both versions perform about the same, i.e. the additional code does not seem to hinder V8's Turbofan from optimizing the function like the original one.
PS: Keep up the good work. This repository is a real treasure trove for hash enthusiasts :)
// ORIGINAL VERSION
const cyrb53a = function(str, seed = 0) {
let h1 = 0xdeadbeef ^ seed, h2 = 0x41c6ce57 ^ seed;
for(let i = 0, ch; i < str.length; i++) {
ch = str.charCodeAt(i);
h1 = Math.imul(h1 ^ ch, 0x85ebca77);
h2 = Math.imul(h2 ^ ch, 0xc2b2ae3d);
}
h1 ^= Math.imul(h1 ^ (h2 >>> 15), 0x735a2d97);
h2 ^= Math.imul(h2 ^ (h1 >>> 15), 0xcaf649a9);
h1 ^= h2 >>> 16; h2 ^= h1 >>> 16;
return 2097152 * (h2 >>> 0) + (h1 >>> 11);
};
// MODIFIED VERSION
const cyrb53a_1=function(x,seed=0){
var i,c,l=x.length,imul=Math.imul,h1=0xdeadbeef^seed,h2=0x41c6ce57^seed;
if (typeof(x)==="string") {
for (i=0;i<l;i++) {
c=x.charCodeAt(i);
h1=imul(h1^c,0x85ebca77);
h2=imul(h2^c,0xc2b2ae3d);
}
} else {
for (i=0;i<l;i++) {
c=x[i];
h1=imul(h1^c,0x85ebca77);
h2=imul(h2^c,0xc2b2ae3d);
}
}
h1^=imul(h1^(h2>>>15),0x735a2d97);
h2^=imul(h2^(h1>>>15),0xcaf649a9);
h1^=h2>>>16;h2^=h1>>>16;
return 2097152*(h2>>>0)+(h1>>>11);
}
// BENCHMARKING
var i,j,l,s,a,
fromCc=String.fromCharCode,
rnd=Math.random,trunc=Math.trunc,
cl=console.log,ct=console.time,cte=console.timeEnd,
f=(process.argv[2]==="1"?cyrb53a_1:cyrb53a);
// generate random string (100K)
a=new Array(l=1e5);
for (i=0;i<l;i++) a[i]=trunc(rnd()*65536);
s=fromCc.apply(String,a);
// start benchmark after 5s process relaxation
cl("benchmarking",f.name,"...");
setTimeout(()=>{
// start warmup run (10K*100K=1G)
l=1e4;
for (i=0;i<l;i++) f(s);
// start actual benchmark run (100K*100K=10G)
l=1e5;
ct(":");
for (i=0;i<l;i++) f(s);
cte(":");
},5000);
re: https://github.com/bryc/code/blob/master/jshash/PRNGs.md
What a great, concise listing of PRNGs for Javascript! Is there any way you could link to the original sources (where possible)? It would help make it easy to verify that the algorithm you posted is consistent with the originally-published algorithm.
(for example, http://prng.di.unimi.it/ and http://prng.di.unimi.it/xoshiro128starstar.c for xoshiro128**
)
The various PRNGs here are nicely documented which makes this a very useful source, but I was wondering if you could clarify whether the code here is available under an open source license, and if so, which license?
Is cyrb128 Public domain like cyrb53?
from your post on https://stackoverflow.com/a/47593316/1986995
function cyrb128(str) {
let h1 = 1779033703, h2 = 3144134277,
h3 = 1013904242, h4 = 2773480762;
for (let i = 0, k; i < str.length; i++) {
k = str.charCodeAt(i);
h1 = h2 ^ Math.imul(h1 ^ k, 597399067);
h2 = h3 ^ Math.imul(h2 ^ k, 2869860233);
h3 = h4 ^ Math.imul(h3 ^ k, 951274213);
h4 = h1 ^ Math.imul(h4 ^ k, 2716044179);
}
h1 = Math.imul(h3 ^ (h1 >>> 18), 597399067);
h2 = Math.imul(h4 ^ (h2 >>> 22), 2869860233);
h3 = Math.imul(h1 ^ (h3 >>> 17), 951274213);
h4 = Math.imul(h2 ^ (h4 >>> 19), 2716044179);
return [(h1^h2^h3^h4)>>>0, (h2^h1)>>>0, (h3^h1)>>>0, (h4^h1)>>>0];
}
I am guessing this is how it is used
function sfc32(a, b, c, d) {
return function() {
a >>>= 0; b >>>= 0; c >>>= 0; d >>>= 0;
var t = (a + b) | 0;
a = b ^ b >>> 9;
b = c + (c << 3) | 0;
c = (c << 21 | c >>> 11);
d = d + 1 | 0;
t = t + d | 0;
c = c + t | 0;
return (t >>> 0) / 4294967296;
}
}
get_random_gen = (seed)=>{
return sfc32(...cyrb128(seed))
}
let random = get_random_gen('seed');
random()
I've been following the instructions here, but all my outputs begin with a decimal point (e.g. 0.14844651753082871
). This is similar to Math.random()
, but I'm looking for something more like Crypto.getRandomValues()
. Is it not possible to accomplish this, or am I doing something wrong?
Hi,
Thanks a lot for documenting PRNGs in such a neat way!
I was wondering about the following:
SplitMix32 is a transformation of the
fmix32
finalizer from MurmurHash3 into a PRNG.
You use (15, 13, 16) right-shifts in your version, but the C reference implementation uses (16, 13, 16).
The SMhasher version of MurmurHash3 also has 16 as the first right shift.
Is this difference between your version and the "upstream" fmix32
intentional?
(I also see that xxHash uses the 15 right-shift, but it has different constants, too.)
I'm curious as to the input range for the values of cyrb53
and cyrb53a
. More specifically, I'm curious as to what happens when ch
is >255
.
The reason for asking is that I want to serialize the values I'm hashing into an Iterable<number>
, which I'm then passing into the cyrb53a
function; I just need to know how to normalize them first.
Also, what does CYRB in CYRB53 stand for? Probably your handle backwards. But why not make it a fun acronym too? ๐
Many thanks for sharing your work and hashing functions under a public domain licence, and being so incredibly responsive to everybody's questions.
let are better scoped than var, etc. so except for backward compatibility it has no reason to use var rather than let
The xoroshiro and xoshiro PRNG families were introduced in the same paper, namely "Scrambled Linear Pseudorandom Number Generators" by Blackman and @vigna, 2019. Thus, it's almost certainly incorrect to say that the xoshiro family is "newer" than the xoroshiro family or that the latter is "obsolete in favor of" the former.
By the way, the scrambled xorshift family (including xorshift+, etc.) was presented in 2016 under the title "An experimental exploration of Marsaglia's xorshift generators, scrambled" by Vigna.
@tommyettinger wrote in November 2022:
Revisiting this 5 years later, this is not great. Mulberry32 isn't equidistributed, so it doesn't produce all possible 32-bit values, and actually can't produce about 1/3 of all possible uint32_t numbers.
Hey boss, I know this issue is similar to #18, but I'm experiencing an issue using your cyrb53
implementation:
We cannot accept this algo (or derivations thereof) due to its problematic usage rights situation. The author has claimed to have made this "public domain", but such a claim is invalid in some jurisdictions. I have not found an OS license attributed to this code by the author. If you have found that anywhere, please provide the link.
Any chance you'd be willing to throw a license into the repo to satisfy these conditions? Your code works perfectly for my use case and I'd love to see it implemented in the project I'm working on.
thanks for sharing all your great explorations!
do you have comparison charts available for your experimental functions?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.