Git Product home page Git Product logo

Comments (9)

mapleFU avatar mapleFU commented on May 28, 2024

I'm trying to implement this. During my impl, I found a problem that:

  1. Get random set size
  2. Get all indices, counting the random fetching indices
  3. Traverse the set

The command 1-3 should be done in one snapshot, otherwise, the case might be:

  • Get all indices, size == 10
  • Random need to access 10th member
  • User delete an element
  • 10th cannot be fetched

So, GetMetadata and traversing the set should be done in one rocksdb snapshot. Would you think this is safe and ok? @git-hulk @PragmaTwice

from kvrocks.

git-hulk avatar git-hulk commented on May 28, 2024

I'm wondering if it'd be better to sacrifice the scope of the candidate set to simplify the implementation like below:

  1. if size <= N(e.g. 128/256), then iterate all members and randomly pick one of them.
  2. if size > N, random among the first N elements if the cached random point is empty. Otherwise, seek from the cached random point and then random among the next N elements, then cache the random point again or remove the cache random point if it reaches the end.

from kvrocks.

mapleFU avatar mapleFU commented on May 28, 2024

if size > N, random among the first N elements if the cached random point is empty. Otherwise, seek from the cached random point and then random among the next N elements, then cache the random point again or remove the cache random point if it reaches the end.

I thought of using this previously, and this is named "statistics" in this issue. but where should we storing this and how do updating this?

from kvrocks.

git-hulk avatar git-hulk commented on May 28, 2024

but where should we storing this and how do updating this?

We're now using a LRU to cache the iterator key while scanning the DB/hash/set, maybe we can do this in the same way.

from kvrocks.

mapleFU avatar mapleFU commented on May 28, 2024

That's a bit complex πŸ€”, in my current impl I may just need to iter the set with one snapshot...

from kvrocks.

git-hulk avatar git-hulk commented on May 28, 2024

That's a bit complex πŸ€”, in my current impl I may just need to iter the set with one snapshot...

Yes, it should be more complex than the current implementation. Then I think the implementation with the same snapshot is good.

from kvrocks.

mapleFU avatar mapleFU commented on May 28, 2024

Yeah, another point is that support get meta by snapshot can enhance our isolation. It might slightly affect LSM-Tree Garbage collection, but I think it's ok

from kvrocks.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.