
Comments (5)

tgruben commented on May 22, 2024

I'm wondering if the roaring implementation needs to replace the 16-bit containers with 32-bit containers. The idea behind those containers is to use half the bits for the low part and half for the high part, so the container should be 32 bits instead of 64 bits.

Thoughts?

from featurebase.

benbjohnson commented on May 22, 2024

I pushed up a PR to actually return the error message: 🤦
#37

Can you rerun the import and send the error?
As far as the containers go, if they have 32 bits then the array size needs to grow, or else they become really inefficient. Instead of 4,096 elements it'd probably need to go up to 67,108,864 (4K x 16K). If we left it at 4K then we'd need to allocate a huge array for the bitmap container to hold all the elements.
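The sizing argument can be checked with a little arithmetic. The sketch below is not Pilosa code. It assumes the usual roaring rule of thumb that a container switches from a sorted array to a bitmap once the array would occupy as much space as the bitmap, and that array elements are exactly as wide as the low bits; `breakEven` and `bitmapBytes` are hypothetical helper names. Under those assumptions the 16-bit case gives the familiar 4,096 limit, and the 32-bit case comes out at 2^27 elements, the same order of magnitude as the rough 4K x 16K estimate.

```go
package main

import "fmt"

// breakEven returns the element count at which a sorted array of
// lowBits-wide integers occupies as many bits as a bitmap covering
// the whole 2^lowBits range. Illustrative helper, not Pilosa code.
func breakEven(lowBits uint) uint64 {
	bitmapBits := uint64(1) << lowBits // one bit per possible value
	return bitmapBits / uint64(lowBits)
}

// bitmapBytes returns the size in bytes of a bitmap covering 2^lowBits values.
func bitmapBytes(lowBits uint) uint64 {
	return (uint64(1) << lowBits) / 8
}

func main() {
	for _, w := range []uint{16, 32} {
		fmt.Printf("low bits=%d: bitmap=%d bytes, array break-even=%d elements\n",
			w, bitmapBytes(w), breakEven(w))
	}
}
```

Either way, a single 32-bit bitmap container would be 512 MiB, which is the "huge array" cost mentioned above.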
Do you have a failing test case for the issue?
Ben

On Friday, January 8, 2016 10:43 AM, tgruben <[email protected]> wrote:

I'm wondering if the roaring implementation needs to replace the 16-bit containers with 32-bit containers. The idea behind those containers is to use half the bits for the low part and half for the high part, so the container should be 32 bits instead of 64 bits. Thoughts?


tgruben commented on May 22, 2024

You're right about the inefficiency. Looking at the roaring code, it seems to chop the 32-bit number into two 16-bit parts: a high and a low. Perhaps we could split the 64-bit value into four 16-bit chunks (HighHigh, HighLow, LowHigh, LowLow) and do the search that way, never needing the 64-bit ops, i.e. search64.

And I would search in this order, so that it would have the same performance as roaring for values < 2^32:
LowHigh, LowLow, HighLow, HighHigh
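To make the proposed split concrete, here is a minimal sketch of cutting a 64-bit value into the four 16-bit chunks named above. The `chunks` type and the `split64` and `fitsIn32` functions are hypothetical names for illustration, not part of Pilosa or any roaring library, and the sketch shows only the split, not the search itself.

```go
package main

import "fmt"

// chunks holds the four 16-bit pieces of a 64-bit value,
// most significant first. Hypothetical type for illustration.
type chunks struct {
	HighHigh, HighLow, LowHigh, LowLow uint16
}

// split64 cuts v into four 16-bit chunks using shifts and truncation.
func split64(v uint64) chunks {
	return chunks{
		HighHigh: uint16(v >> 48),
		HighLow:  uint16(v >> 32),
		LowHigh:  uint16(v >> 16),
		LowLow:   uint16(v),
	}
}

// fitsIn32 reports whether the value was below 2^32, in which case
// only LowHigh and LowLow carry information, as in 32-bit roaring.
func fitsIn32(c chunks) bool {
	return c.HighHigh == 0 && c.HighLow == 0
}

func main() {
	for _, v := range []uint64{5434, 1 << 40} {
		c := split64(v)
		fmt.Printf("%d -> %+v (fits in 32 bits: %t)\n", v, c, fitsIn32(c))
	}
}
```

For any value below 2^32 the two upper chunks are zero, which is why looking at LowHigh and LowLow first would behave like the ordinary two-level roaring lookup.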

How does that grab you?

Here is the error...
2016/01/09 08:18:06 import error: db=3, frame=b.n, slice=%!s(uint64=59), bits=5434, err=open storage: unmarshal storage: file=/Users/tgruben/.pilosa/3/b.n/59, err=checksum mismatch: exp=5c922157, got=9dc388c3
2
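As a side note on the log line itself, `slice=%!s(uint64=59)` is what Go's `fmt` package prints when a `%s` verb is given a `uint64`. The sketch below reproduces that with a hypothetical `formatSlice` helper; it is not the actual Pilosa logging code, and taking the verb as a parameter is only a device to show both variants side by side.

```go
package main

import "fmt"

// formatSlice renders a slice number with the given verb.
// Hypothetical helper; the verb is a parameter only so that the
// mismatched and the matching verb can be compared.
func formatSlice(verb string, slice uint64) string {
	return fmt.Sprintf("slice="+verb, slice)
}

func main() {
	fmt.Println(formatSlice("%s", 59)) // slice=%!s(uint64=59)
	fmt.Println(formatSlice("%d", 59)) // slice=59
}
```

Switching the verb to `%d` (or `%v`) would make the slice number print cleanly.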

The error occurs when importing the attached file above into slice 59. I'm trying to narrow it down further than that, but I haven't found the culprit yet. Below are the profile IDs of the failed import:

pids.txt


benbjohnson commented on May 22, 2024

It seems like going to 4 levels instead of just 2 would add a lot of complexity. Unless we had a lot of containers, it doesn't seem like it would help much.


benbjohnson commented on May 22, 2024

@tgruben I found and fixed the issue: #38

