Git Product home page Git Product logo

sucds's People

Contributors

hirosassa avatar kampersanda avatar ragnargrootkoerkamp avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

sucds's Issues

Implement rank/select bit vector trait

Related to #13

There are several implementations of rank/select bit vector data structure,
and some of them have already been / will be implemented in this library.

Since there are data structures, like wavelet matrix, uses rank/select bit vector as a building block,
it is better that implementing trait for rank/select bit vector as below and let user be able to choose concrete implementations:

trait RsBitVector {
    fn rank0(&self, pos: usize) -> usize;
    fn rank1(&self, pos: usize) -> usize;
    fn select0(&self, k: usize) -> usize;
    fn select1(&self, k: usize) -> usize;
...
}

@kampersanda How do you think about above? If it is roughly OK, I will create PR to discuss this more concretely.

More space efficient serialization

nice library. I remember the original ot code has compact_elias_fano codes where universe and num elems can be passed in during serialization to make the format more compact. this is especially important when we want to store many small lists where the overhead of storing these explicitly can be prohibitive.

Any plans for such structures?

Large `mem::size_of` for `EliasFano`

I ran into a stack overflow in my system since EliasFano takes up 336 bytes on the stack! The main culprit is DArray, which takes up 288 bytes. It looks like the main issue is structures like DArrayIndex, which include three different heap allocations, which each take 24 bytes. Then, DArray has two DArrayIndexes and a Rank9SelIndex.

I'm just Boxing these structures in my system for now, but I'm curious if you'd be interested in PRs to make the structures more space efficient. Here are some ideas:

  • Change Vec<T> to Box<[T]> for immutable structures. This takes the per-allocation cost down from 24 to 16 bytes.
  • Merge allocations in top-level structures like EliasFano into a single, large heap allocation. This will likely make the code uglier but (1) have better heap utilization, since we can make a single, precise allocation, and (2) dramatically reduce stack usage.

Excellent Library

Not an issue just some praise...

This project is a fantastic hidden gem. Extremely useful for big data tasks. Thank you for the great work.

Use unsafe for faster operations

Basically, sucds does not recommend to use unsafe approaches such as get_unchecked because deserialization from any data is supported, making it easy to build data structures that contain invalid values.

However, unsafe approaches can be safely used to some parts by prechecking.

Simplify API

Current APIs contain some redundant functions.
For example, EliasFano::from_bits and EliasFano::from_bitvec can be generalized.

For maintenance cost, such functions should be unified.

rkyv support?

Hello!

I've started implementing support for rkyv, a zero-copy deserialization framework - my use-cases entail mmaping data in from disk, and rkyv allows me to do that that quite efficiently. My initial work is here.

Would you be interested in a PR that adds support for rkyv more generally? If so, I'd be happy to clean up my initial work and contribute it here for everyone to use. I've found it surprisingly easy to do so far (I think it speaks to the good design choices made here) and would be pleased to do so if it's a feature you want.

(If you do want this feature, I'd be interested in your thoughts as to the general direction I've taken with the initial bit of work; if that's not the general design pattern you'd like to see then I can adjust.)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.