Git Product home page Git Product logo

amdcomputelibraries / morton_filter Goto Github PK

View Code? Open in Web Editor NEW
82.0 12.0 17.0 85 KB

A compressed, sparse cuckoo filter (see https://www.vldb.org/pvldb/vol11/p1041-breslow.pdf)

License: MIT License

C++ 95.83% Shell 0.82% Python 1.38% C 0.66% Makefile 1.30%
cuckoo-filter cuckoofilter mortonfilter morton-filter probabilistic-data-structures membership-query membership-queries hashing probabilistic-filters

morton_filter's People

Contributors

abreslow avatar jlgreathouse avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

morton_filter's Issues

Errors with clang compiler (Mac OS / LLVM) and Morton3_8 / load factor >= 0.5

When using Morton3_8 with random long keys as below, and using a load factor of >= 0.5, often I get false negatives (meaning: for an entry previously added, the method likely_contains returns false). My usage:

filter = new Morton3_8((size_t) (size / 0.50) + 64);
// in a loop, do this
uint64_t key = ...
filter->insert(key);
// in a loop, do this
uint64_t key = ...
filter->likely_contains(key);

the same key is only ever added once, and keys are random, generated using mix64, as follows:

uint64_t x = index++;
x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
x = x ^ (x >> 31);
return x;

Could you tell me what could be wrong?

Static assertion in switch statement is overly broad

The following static assertion triggers even when it shouldn't because it is evaluated even when the alternate bucket selection method isn't TABLE_BASED_OFFSET, causing Morton1_8 to fail to compile (it uses FUNCTION_BASED_OFFSET, and the assertion shouldn't trigger).

static_assert(offsets[0] > _buckets_per_block,
"Cannot use TABLE_BASED_OFFSET with so many buckets per block");

context:

switch(_alternate_bucket_selection_method){
case AlternateBucketSelectionMethodEnum::TABLE_BASED_OFFSET:{
constexpr int16_t offsets[] = {83, 149, 211, 277, 337, 397, 457, 521,
587, 653, 719, 787, 853, 919, 983, 1051, 1117, 1181, 1249, 1319, 1399,
1459,
1511, 1571, 1637, 1699, 1759, 1823, 1889, 1951, 2017, 1579};//, 1579
static_assert(offsets[0] > _buckets_per_block,
"Cannot use TABLE_BASED_OFFSET with so many buckets per block");
offset = offsets[fingerprint % (sizeof(offsets) / sizeof(offsets[0]))];
break;
}
case AlternateBucketSelectionMethodEnum::FUNCTION_BASED_OFFSET:{
offset = ((raw_primary_hash(fingerprint) & 0x1fff) +
(_buckets_per_block)) | one;
break;
}
case AlternateBucketSelectionMethodEnum::FAN_ET_AL_PARTIAL_KEY:
return fan_et_al_partial_key_cuckoo_hash_alternate_bucket(bucket_id,
fingerprint);
break;
}

Since _alternate_bucket_selection_method is known at compile time, the inverse of the case taken should be added to the assertion, something like this:

static_assert(offsets[0] > _buckets_per_block ||
              _alternate_bucket_selection_method != AlternateBucketSelectionMethodEnum::TABLE_BASED_OFFSET,
  "Cannot use TABLE_BASED_OFFSET with so many buckets per block");

Self-resizing

The readme mentions self-resizing and the code has a resize function. However I could not find reference to this in the paper. Is this mechanism (and potential trade-offs) described somewhere?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.