Hello, I am using Turbopfor 256 and 128 for compression of posting l

Turbopfor 256 performs worse than Turbopfor 128 (turbopfor-integer-compression, closed issue)

anson304 commented on June 9, 2024

Turbopfor 256 performs worse than Turbopfor 128

from turbopfor-integer-compression.

Comments (5)

powturbo commented on June 9, 2024

This can be the case depending on the distribution of the data.
The reason is the block length: 256 vs. 128 integers per block. A block length of 128 almost always compresses better.
However, the difference is normally negligible, and 256 with AVX2 is faster.
See the gov2 inverted index benchmark in the README. You'll get better compression when the DocIds are sorted.
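
To make the block-length effect concrete, here is a minimal sketch in C. It is not TurboPFor's actual format: it assumes plain bit-packing (no PFor exceptions), one bit width per block chosen from the largest value in that block, and a hypothetical 1-byte header per block. The function name `model_packed_bits` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 for v == 0). */
static unsigned bit_width(uint32_t v) {
    unsigned w = 0;
    while (v) { w++; v >>= 1; }
    return w;
}

/* Hypothetical size model: every block of up to block_len integers is
 * packed with the bit width of its largest value, plus an assumed
 * 8-bit header that stores this width. Returns the total size in bits. */
size_t model_packed_bits(const uint32_t *in, size_t n, size_t block_len) {
    assert(block_len > 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i += block_len) {
        size_t len = (n - i < block_len) ? n - i : block_len;
        uint32_t or_all = 0;               /* OR of all values has the max width */
        for (size_t j = 0; j < len; j++) or_all |= in[i + j];
        total += 8 + len * bit_width(or_all);
    }
    return total;
}
```

Under this model, 128 values of 1 followed by 128 values of 255 cost 1168 bits with a block length of 128 but 2056 bits with a block length of 256, because the larger block must use 8 bits for every value. When all 256 values need the same width, the larger block is slightly smaller (one header instead of two), which is consistent with "almost always" rather than "always".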

anson304 commented on June 9, 2024

Thanks for the explanation, but I don't really get what you mean by "128 is compressing almost always better" and how this relates to the block length. I see that the default block length for your index is 128 integers. Would you be able to explain in more detail?

I noticed in idxqry.c that you use p4d1dec128v32 as the default SIMD decoder instead of p4d1dec256v32. Is this because p4d1dec128v32 is faster?

powturbo commented on June 9, 2024

Well, the TurboPFor SSE v128 functions use a block length of 128 integers, whereas the AVX2 v256 functions use 256 integers per block.

"... is this because p4d1dec128v32 is faster?"
No, AVX2 is always faster. The idx??? programs were written at a time when the AVX2 functions were not yet available in TurboPFor.
You can adapt the idx??? programs to use the AVX2 v256 functions instead of the SSE v128 functions.

anson304 commented on June 9, 2024

I thought that v128 compresses 128 bits which is 32 integers (4 bytes each) at a time, and v256 compresses 64 integers at a time. Doesn't that mean that v256 can be used on blocks that contain 128 integers?

powturbo commented on June 9, 2024

"Doesn't that mean that v256 can be used on blocks that contain 128 integers?"
Yes.

There are low-level bitpacking functions (bitpackv128/bitpack256) that compress 32 integers (32 bits each) at a time.
Blocks with fewer than 32 integers must be compressed with the scalar functions.

The high-level bitpacking functions (bitnpack128/bitnpack256) compress blocks of 128/256 integers using SSE/AVX2. You can pass an arbitrary number of integers to these functions.
The bit width for each block of 128/256 integers is determined automatically at encoding time and stored in the output buffer.

The sse v128 functions are not compatible with the v256 functions.
v128 encoding <-> v128 decoding
v256 encoding <-> v256 decoding
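
The following scalar sketch in C illustrates the idea behind the high-level functions described above: the input is split into fixed-size blocks, the bit width of each block is determined at encoding time and written to the output ahead of the packed data. It is an illustration under stated assumptions, not TurboPFor's real byte layout or API: the names `blk_encode` and `blk_decode`, the 1-byte width header, and the LSB-first bit order are all hypothetical, and no SIMD is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bits needed to represent v (0 for v == 0). */
static unsigned blk_bit_width(uint32_t v) {
    unsigned w = 0;
    while (v) { w++; v >>= 1; }
    return w;
}

/* Encode n integers in blocks of block_len. Per block: one header byte
 * holding the bit width b, then len*b bits (LSB first), padded to a
 * whole byte. out must have room for n*4 bytes plus one byte per block.
 * Returns the number of bytes written. */
size_t blk_encode(const uint32_t *in, size_t n, size_t block_len,
                  unsigned char *out) {
    assert(block_len > 0);
    unsigned char *op = out;
    for (size_t i = 0; i < n; i += block_len) {
        size_t len = (n - i < block_len) ? n - i : block_len;
        uint32_t or_all = 0;
        for (size_t j = 0; j < len; j++) or_all |= in[i + j];
        unsigned b = blk_bit_width(or_all);
        *op++ = (unsigned char)b;            /* width stored in the output */
        size_t nbytes = (len * b + 7) / 8;
        memset(op, 0, nbytes);
        size_t bitpos = 0;
        for (size_t j = 0; j < len; j++)
            for (unsigned k = 0; k < b; k++, bitpos++)
                if ((in[i + j] >> k) & 1u)
                    op[bitpos >> 3] |= (unsigned char)(1u << (bitpos & 7));
        op += nbytes;
    }
    return (size_t)(op - out);
}

/* Decode n integers. block_len must match the value used for encoding,
 * otherwise the headers are read at the wrong offsets.
 * Returns the number of bytes consumed. */
size_t blk_decode(const unsigned char *in, size_t n, size_t block_len,
                  uint32_t *out) {
    assert(block_len > 0);
    const unsigned char *ip = in;
    for (size_t i = 0; i < n; i += block_len) {
        size_t len = (n - i < block_len) ? n - i : block_len;
        unsigned b = *ip++;                  /* width read back from the stream */
        size_t bitpos = 0;
        for (size_t j = 0; j < len; j++) {
            uint32_t v = 0;
            for (unsigned k = 0; k < b; k++, bitpos++)
                v |= (uint32_t)((ip[bitpos >> 3] >> (bitpos & 7)) & 1u) << k;
            out[i + j] = v;
        }
        ip += (len * b + 7) / 8;
    }
    return (size_t)(ip - in);
}
```

Because the decoder finds each header only by knowing the block length, a stream written with a block length of 128 cannot be read back with a block length of 256 in this sketch. That mirrors the pairing rule above: v128 encoding goes with v128 decoding, and v256 encoding with v256 decoding.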
