Comments (5)
This can be the case depending on the distribution of the data.
It's because of the block length, 256 vs. 128 integers: a block length of 128 almost always compresses better.
However, the difference is normally negligible, and the 256 variant with AVX2 is faster.
See the gov2 inverted index benchmark in the README. You'll have better compression when the DocIds are sorted.
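To illustrate why sorted DocIds tend to compress better, here is a minimal sketch of delta coding, assuming a plain "store the gaps" scheme. The helper names `bits_needed` and `delta_encode` are hypothetical and not part of the TurboPFor API, and the library's own delta variants may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 needs 0 bits). */
static unsigned bits_needed(uint32_t v) {
    unsigned b = 0;
    while (v) { b++; v >>= 1; }
    return b;
}

/* Hypothetical helper: replace a sorted array by the gaps between
   consecutive values, in place. `start` is the value preceding a[0].
   Returns the bitwise OR of all gaps, whose bit length equals the
   bit length of the largest gap. */
static uint32_t delta_encode(uint32_t *a, size_t n, uint32_t start) {
    uint32_t prev = start, m = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] >= prev);      /* input must be sorted */
        uint32_t cur = a[i];
        a[i] = cur - prev;         /* store the gap instead of the value */
        prev = cur;
        m |= a[i];
    }
    return m;
}
```

For DocIds around one million, each raw value needs 20 bits, while the gaps between neighbouring sorted ids are often small enough to fit in a few bits.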
from turbopfor-integer-compression.
Thanks for the explanation, but I don't really get what you mean by "128 almost always compresses better" and how this relates to the block length. I see that the default block length for your index is 128 integers. Would you be able to explain in more detail?
I noticed in idxqry.c that you use p4d1dec128v32 as the default SIMD compressor instead of p4d1dec256v32. Is this because p4d1dec128v32 is faster?
from turbopfor-integer-compression.
Well, the TurboPFor SSE v128 functions use a block length of 128 integers, whereas the AVX2 v256 functions use 256 integers per block.
".... is this because p4d1dec128v32 is faster?"
No, AVX2 is always faster. The idx??? programs were written at a time when the AVX2 functions were not yet available in TurboPFor.
You can adapt the idx??? programs to use the AVX2 v256 functions instead of the SSE v128 functions.
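The effect of the block length on compression can be sketched with a simplified size estimate. This assumes plain bit-packing, where every block is stored with the bit width of its largest value plus one header byte, and it ignores the exception handling that PFor adds. `packed_size` is a hypothetical helper, not a TurboPFor function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 needs 0 bits). */
static unsigned bits_needed(uint32_t v) {
    unsigned b = 0;
    while (v) { b++; v >>= 1; }
    return b;
}

/* Estimated size in bytes when `in` is split into blocks of `blen`
   integers and each block is bit-packed with the width of its largest
   value, plus one header byte per block to store that width. */
static size_t packed_size(const uint32_t *in, size_t n, size_t blen) {
    assert(blen > 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i += blen) {
        size_t len = (n - i < blen) ? n - i : blen;
        uint32_t m = 0;
        for (size_t j = 0; j < len; j++)
            m |= in[i + j];        /* OR has the same bit length as max */
        total += 1 + (len * bits_needed(m) + 7) / 8;
    }
    return total;
}
```

With 128 small values followed by 128 large ones, a block length of 128 lets the first block use a narrow width, while a single 256-integer block has to use the wide width for everything. On data where the value range is uniform, the two estimates end up close, which matches the remark that the difference is normally negligible.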
from turbopfor-integer-compression.
I thought that v128 compresses 128 bits which is 32 integers (4 bytes each) at a time, and v256 compresses 64 integers at a time. Doesn't that mean that v256 can be used on blocks that contain 128 integers?
from turbopfor-integer-compression.
"Doesn't that mean that v256 can be used on blocks that contain 128 integers?"
Yes.
There are low-level bit-packing functions (bitpackv128/bitpack256) that compress 32 integers (32 bits each) at a time.
Blocks with fewer than 32 integers must be compressed with the scalar functions.
The high-level bit-packing functions (bitnpack128/bitnpack256) compress blocks of 128/256 integers using SSE/AVX2. You can pass an arbitrary number of integers to these functions.
The bit width for each block of 128/256 integers is determined automatically at encoding time and stored in the output buffer.
The SSE v128 functions are not compatible with the AVX2 v256 functions:
v128 encoding <-> v128 decoding
v256 encoding <-> v256 decoding
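The idea of a per-block bit width stored in the output can be sketched with a scalar toy codec. This is only an illustration under stated assumptions: `toy_npack` and `toy_nunpack` are hypothetical names, the layout (one header byte, then values packed LSB first) is made up for the example, and it is not the TurboPFor format or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bits needed to represent v (0 needs 0 bits). */
static unsigned bits_needed(uint32_t v) {
    unsigned b = 0;
    while (v) { b++; v >>= 1; }
    return b;
}

/* Toy encoder: for every block of `blen` integers write one header byte
   with the bit width b, then the values packed with b bits each.
   `out` must hold at least 4*n + n/blen + 1 bytes. Returns bytes written. */
static size_t toy_npack(const uint32_t *in, size_t n, size_t blen, uint8_t *out) {
    assert(blen > 0);
    uint8_t *op = out;
    for (size_t i = 0; i < n; i += blen) {
        size_t len = (n - i < blen) ? n - i : blen;
        uint32_t m = 0;
        for (size_t j = 0; j < len; j++) m |= in[i + j];
        unsigned b = bits_needed(m);
        size_t nbytes = (len * b + 7) / 8, bitpos = 0;
        *op++ = (uint8_t)b;                 /* per-block bit width */
        memset(op, 0, nbytes);
        for (size_t j = 0; j < len; j++)
            for (unsigned k = 0; k < b; k++, bitpos++)
                if ((in[i + j] >> k) & 1u)
                    op[bitpos >> 3] |= (uint8_t)(1u << (bitpos & 7));
        op += nbytes;
    }
    return (size_t)(op - out);
}

/* Toy decoder: must be called with the same `blen` as the encoder,
   otherwise the header bytes are read from the wrong positions.
   Returns bytes consumed. */
static size_t toy_nunpack(const uint8_t *in, size_t n, size_t blen, uint32_t *out) {
    assert(blen > 0);
    const uint8_t *ip = in;
    for (size_t i = 0; i < n; i += blen) {
        size_t len = (n - i < blen) ? n - i : blen, bitpos = 0;
        unsigned b = *ip++;
        assert(b <= 32);
        for (size_t j = 0; j < len; j++) {
            uint32_t v = 0;
            for (unsigned k = 0; k < b; k++, bitpos++)
                if ((ip[bitpos >> 3] >> (bitpos & 7)) & 1u)
                    v |= 1u << k;
            out[i + j] = v;
        }
        ip += (len * b + 7) / 8;
    }
    return (size_t)(ip - in);
}
```

In this toy layout the block length decides where the header bytes sit in the stream, so a stream written with `blen = 128` only round-trips when it is read with `blen = 128`. This is the same pairing rule as v128 with v128 and v256 with v256.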
from turbopfor-integer-compression.