Comments (5)

powturbo avatar powturbo commented on June 9, 2024

This can be the case, depending on the distribution of the data.
It's because of the block length, 256 vs. 128 integers: 128 almost always compresses better.
However, the difference is normally negligible, and 256 with AVX2 is faster.
See the gov2 inverted index benchmark in the README. You'll get better compression when the DocIds are sorted.
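
To make the block-length effect concrete, here is a minimal sketch in C. It is not TurboPFor code: it assumes plain per-block bitpacking, where every value in a block is stored with the bit width of the block's largest value, plus a hypothetical 8-bit header per block. The names `bit_width`, `packed_bits`, and `fill_demo` are made up for illustration. A real PFor codec stores outliers separately as exceptions, so the gap it sees is normally much smaller than in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 needs 0 bits). */
static unsigned bit_width(uint32_t v) {
    unsigned b = 0;
    while (v) { b++; v >>= 1; }
    return b;
}

/* Hypothetical helper: size in bits of n values packed in blocks of
   block_len integers. Each block costs an 8-bit header plus
   (block size * bit width of the largest value in that block). */
static size_t packed_bits(const uint32_t *in, size_t n, size_t block_len) {
    assert(block_len > 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i += block_len) {
        size_t len = (n - i < block_len) ? n - i : block_len;
        uint32_t max = 0;
        for (size_t j = 0; j < len; j++)
            if (in[i + j] > max) max = in[i + j];
        total += 8 + len * bit_width(max);
    }
    return total;
}

/* Hypothetical demo data: small values (2 bits each) with a single
   21-bit outlier at index 300. */
static void fill_demo(uint32_t *v, size_t n) {
    for (size_t i = 0; i < n; i++) v[i] = 3;
    if (n > 300) v[300] = 1u << 20;
}
```

With 1024 values from `fill_demo`, `packed_bits(v, 1024, 128)` gives 4544 bits and `packed_bits(v, 1024, 256)` gives 6944 bits: the single outlier inflates only 128 values in the first case but 256 in the second.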

from turbopfor-integer-compression.

anson304 avatar anson304 commented on June 9, 2024

Thanks for the explanation, but I don't really get what you mean by 128 almost always compressing better, or how this relates to the block length. I see that the default block length for your index is 128 integers. Would you be able to explain in more detail?

I noticed in idxqry.c that you use p4d1dec128v32 as the default SIMD decoder instead of p4d1dec256v32. Is this because p4d1dec128v32 is faster?

powturbo avatar powturbo commented on June 9, 2024

Well, the TurboPFor SSE v128 functions use a block length of 128 integers, whereas the AVX2 v256 functions use 256 integers per block.

".... is this because p4d1dec128v32 is faster?"
No, AVX2 is always faster. The idx??? programs were written at a time when the AVX2 functions were not yet available in TurboPFor.
You can adapt the idx??? programs to use the AVX2 v256 functions instead of the SSE v128 functions.
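
One possible way to make such an adaptation switchable at compile time is sketched below. Only the two decoder names come from this thread; the macros `IDX_BLOCK_LEN` and `IDX_P4D1DEC` are hypothetical, and whether the idx??? sources can be changed this simply is an assumption. `__AVX2__` is the macro GCC and Clang predefine when AVX2 code generation is enabled (for example with `-mavx2`).

```c
/* Hypothetical configuration fragment: choose the decoder and its
   block length together, so the two can never get out of step. */
#if defined(__AVX2__)
#define IDX_BLOCK_LEN 256             /* integers per block, AVX2 v256 */
#define IDX_P4D1DEC   p4d1dec256v32   /* decoder name from this thread */
#else
#define IDX_BLOCK_LEN 128             /* integers per block, SSE v128 */
#define IDX_P4D1DEC   p4d1dec128v32   /* decoder name from this thread */
#endif
```

The encoding side would need the matching switch, and an index already built with the v128 functions would have to be rebuilt before it could be read with the v256 decoder.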

anson304 avatar anson304 commented on June 9, 2024

I thought that v128 compresses 128 bits which is 32 integers (4 bytes each) at a time, and v256 compresses 64 integers at a time. Doesn't that mean that v256 can be used on blocks that contain 128 integers?

powturbo avatar powturbo commented on June 9, 2024

"Doesn't that mean that v256 can be used on blocks that contain 128 integers?"
Yes.

There are low-level bitpacking functions (bitpackv128/bitpack256) that compress 32 32-bit integers at a time.
Blocks with fewer than 32 integers must be compressed with the scalar functions.
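
As a small illustration of that rule, the sketch below splits a count of integers into the part that fills whole groups of 32 and the remainder that would go to a scalar routine. The group size of 32 is taken from the comment above; `SIMD_GROUP`, `split_t`, and `split_for_simd` are hypothetical names, not TurboPFor identifiers.

```c
#include <stddef.h>

#define SIMD_GROUP 32  /* group size as stated in the comment above */

/* Hypothetical result type: how many integers go to each path. */
typedef struct {
    size_t simd_count;    /* multiple of SIMD_GROUP, for the SIMD path */
    size_t scalar_count;  /* fewer than SIMD_GROUP, for the scalar path */
} split_t;

/* Split n integers into full SIMD groups and a scalar tail. */
static split_t split_for_simd(size_t n) {
    split_t s;
    s.scalar_count = n % SIMD_GROUP;
    s.simd_count = n - s.scalar_count;
    return s;
}
```

For example, 300 integers split into 288 for the SIMD path and 12 for the scalar path, while 20 integers go entirely to the scalar path.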

The high-level bitpacking functions (bitnpack128/bitnpack256) compress blocks of 128/256 integers using SSE/AVX2. You can pass an arbitrary number of integers to these functions.
The bit width for each block of 128/256 integers is determined automatically at encoding time and stored in the output buffer.
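
The idea of a per-block bit width stored in the output can be sketched with a scalar round trip. This is only an illustration under stated assumptions, not TurboPFor's actual format or API: the one-byte header, the byte-wise little-endian packing, and the names `sketch_encode` and `sketch_decode` are all hypothetical, and no SIMD is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 128  /* assumption: mirrors the v128 block length */

/* Number of bits needed to represent v (0 needs 0 bits). */
static unsigned bit_width(uint32_t v) {
    unsigned b = 0;
    while (v) { b++; v >>= 1; }
    return b;
}

/* Encode n integers block by block. Each block starts with one header
   byte holding its bit width, followed by the packed values.
   'out' needs room for at least 4*n + n/BLOCK_LEN + 1 bytes.
   Returns the number of bytes written. */
static size_t sketch_encode(const uint32_t *in, size_t n, uint8_t *out) {
    uint8_t *op = out;
    for (size_t i = 0; i < n; i += BLOCK_LEN) {
        size_t len = (n - i < BLOCK_LEN) ? n - i : BLOCK_LEN;
        uint32_t ored = 0;  /* OR of all values has the width of the max */
        for (size_t j = 0; j < len; j++) ored |= in[i + j];
        unsigned b = bit_width(ored);
        *op++ = (uint8_t)b;                /* header: bit width of block */
        uint64_t acc = 0;
        unsigned fill = 0;                 /* valid bits held in acc */
        for (size_t j = 0; j < len; j++) {
            acc |= (uint64_t)in[i + j] << fill;
            fill += b;
            while (fill >= 8) { *op++ = (uint8_t)acc; acc >>= 8; fill -= 8; }
        }
        if (fill) *op++ = (uint8_t)acc;    /* pad last byte of the block */
    }
    return (size_t)(op - out);
}

/* Decode n integers written by sketch_encode. Returns bytes consumed. */
static size_t sketch_decode(const uint8_t *in, size_t n, uint32_t *out) {
    const uint8_t *ip = in;
    for (size_t i = 0; i < n; i += BLOCK_LEN) {
        size_t len = (n - i < BLOCK_LEN) ? n - i : BLOCK_LEN;
        unsigned b = *ip++;                /* read this block's bit width */
        assert(b <= 32);
        uint64_t mask = b ? (~0ULL >> (64 - b)) : 0;
        uint64_t acc = 0;
        unsigned fill = 0;
        for (size_t j = 0; j < len; j++) {
            while (fill < b) { acc |= (uint64_t)*ip++ << fill; fill += 8; }
            out[i + j] = (uint32_t)(acc & mask);
            acc >>= b;
            fill -= b;
        }
    }
    return (size_t)(ip - in);
}
```

Encoding the 300 values 0..299 this way takes 293 bytes instead of 1200: the three blocks use 7, 8, and 9 bits per value. In this sketch a decoder compiled with a different `BLOCK_LEN` would look for the headers in the wrong places, which gives some intuition for why the encoder and decoder have to agree on the block length.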

The SSE v128 functions are not compatible with the AVX2 v256 functions.
v128 encoding <-> v128 decoding
v256 encoding <-> v256 decoding
