jasondavies / bloomfilter.js Goto Github PK
View Code? Open in Web Editor NEWJavaScript bloom filter using FNV for fast hashing
Home Page: http://www.jasondavies.com/bloomfilter/
License: BSD 3-Clause "New" or "Revised" License
JavaScript bloom filter using FNV for fast hashing
Home Page: http://www.jasondavies.com/bloomfilter/
License: BSD 3-Clause "New" or "Revised" License
It seems counting has not implemented.How to remove the locations of a element from the bitmap?
First of all, thank you for the great implementation of Bloom Filter. We're using this package from our h2-auto-push
package, and it's working great.
But I want to make sure we have no license problems by depending on this package. Your LICENSE file looks similar to BSD-3-Clause but not quite the same. Can I understand it as BSD-3-Clause?
If my understanding is correct, can you please put the license info into your package.json
? Because it is missing, the npm page says "license: none": https://www.npmjs.com/package/bloomfilter. It'll be as simple as adding this line:
{ "license" : "BSD-3-Clause" }
https://docs.npmjs.com/files/package.json#license
Thank you!
Hey I ran across your project as I was researching bloom filters, and I noticed this on your website:
"Unfortunately I can't use the 64-bit trick in the linked post as JavaScript only supports bitwise operations on 32 bits."
I don't know if you'd be interested in this, but a little while ago I wrote a version of fnv with an expanded keyspace (up to 1024-bit): https://github.com/tjwebb/fnv-plus.
I'm not 100% familiar with the algorithms used to implement your bloom filter, but I know with some algorithms you can estimate the number of items in the filter.
Is this possible with your algorithm?
something like bloom.empty() that set all bits to initial status
plz :)
Hi Jason,
Great library! Thanks for writing this.
I needed to send Bloom filters to and from my webapp frontend to the Python backend (i.e., do add() in JS and test() in Python and the reverse). I ended up porting bloomfilter.js to Python by doing a line-by-line translation. Maybe I missed a note in the docs about an easier way? :)
If you're interested I can send a pull-request with a contrib/ directory with the Python version. It's a little hacky, because I used a C module to get Javascript numeric semantics (modulo and arithmetic are different in Python than Javascript).
Let me know.
Ranga
We're wondering if we can store the filter somehow for use later.
Unfortunately I can't use the 64-bit trick in the linked post as JavaScript only supports bitwise operations on 32 bits.
could this help ?
https://github.com/dcodeIO/long.js
> bloom=new BloomFilter(100, 3)
BloomFilter {m: 100, k: 3, buckets: Int32Array[4], _locations: Uint8Array[3], locations: function…}
> bloom.add('test')
undefined
> bl2=new BloomFilter(bloom.buckets, 3)
BloomFilter {m: 128, k: 3, buckets: Int32Array[4], _locations: Uint8Array[3], locations: function…}
> bl2.test('test')
false
> bloom.test('test')
true
Edited by @jasondavies: looks like we need to review the use of double-FNV.
like
['ant', 'bison'].indexOf('ant')
0
Given the standalone nature of your bloomfilter, I was wondering if it would make sense to serialize/deserialize the bloomfilter bytearray/array.
My use case is that I'm trying to send a dictionary to the front-end to see if words the user types in are in it. I thought it might be faster/smaller to store the dictionary in a bloomfilter.
I was just wondering why you don't have an option to use the optimal m and k parameters (see here) based on n and p.
There are a couple ways I imagine this going:
true
would interpret the first and second values of the constructor as n and p instead of m and k.I could implement any of these, but I wanted to ask you first if there was any reason why you didn't do this.
It is worth it to convert this implementation to TypeScript?
It will be easier for others to understand as well as contribute.
It would be great to see this package in bower.io registry
If I wanted to combine two filters of the same size, can I get the array forms and add the values at each respective index? (Then reserialize into another filter)
Hello!
var array = [].slice.call(bloom.buckets),
json = JSON.stringify(array);
This method doesn't work. How I can get serialization in your package? bloom method doesn't identify the buckets. It only hash bitview, view, serialize, and etc.
I'm getting a much higher false positive rate than I would expect from a bloom filter of the size that I'm using
I'm using a 1024-bit bloomfilter with 16 hashes and 20 elements in each filter.
I'm running a test which adds 20 elements to a filter, checking before adding each that the item isn't already in the filter.
After running the test 500 times, there are ~4 collisions.
Given a bloom filter with those parameters, there should only be about a 1/1.3 billion chance of collision (https://hur.st/bloomfilter/?n=20&p=&m=1024&k=16)
Here's the short script:
let collisions = 0
for(let i=0; i< 500; i++) {
const filter = new BloomFilter(1024, 16)
const dict = {}
for(let j=0; j< 20; j++) {
const str = Math.floor(Math.random() * 1000000000).toString(16)
if(filter.test(str) && dict[str] !== true){
console.log("COLLISION: ", str)
collisions++
}
filter.add(str)
dict[str] = true
}
console.log(i)
}
console.log('done: ', collisions)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.