Comments (6)
Closed; further research answered most questions.
from c-blosc.
Yes, the default compressor in Blosc (BloscLZ) is geared towards speed, not
compression ratio, but the bundled LZ4HC or ZLIB codecs may achieve better ratios,
especially when using large blocksizes. Does this match your research, or
did you find something different?
2014-12-03 9:08 GMT+01:00 OneArb [email protected]:
Francesc Alted
- I found a few compression overviews: http://compressionratings.com/sort.cgi?rating_sum.brief+6n
http://heartofcomp.altervista.org/MOC/MOCACE.htm
Would it be worth submitting Blosc and getting it into the fray?
Looking over the benchmark section, I notice that BloscLZ is the only decompressor able to outperform memcpy, at least on your machine.
The [blosc zlib benchmark](http://www.blosc.org/benchmarks-zlib.html) uses a different compression scale than the other compressors. It also starts at 0 (vs. 1), which interferes with the graph's readability.
A chart across compressors would ease comparison.
I sure would like to see BloscLZ take its due place within the compressor benchmark community.
- simple.c uses almost all CPU bandwidth on my 2-core machine. Is that expected?
Regarding the zlib benchmarks, the first measurement is also at one, but because zlib has such high compression ratios, especially with that dataset, it looks like the measurement is at zero. Ideally we should start all graphs at one, since this means "no compression".
Regarding the speed of BloscLZ, I believe what you are seeing is a distortion due to measurement. The only benchmarks we have listed for LZ4 right now are from a BlueGene. This is an HPC architecture, and let's just say things behave differently there than on commodity hardware. I believe that both LZ4 and BloscLZ (maybe snappy too) can outperform memcpy when driven by Blosc. The reason we don't have any LZ4 benchmarks listed yet is that driving LZ4 from Blosc has only been officially supported for about a year now. Support for BloscLZ is much older, so many benchmarks have accumulated for this one.
FYI: we get these "off-the-charts" ratios for zlib because the shuffle filter in Blosc can pre-condition certain datasets favorably for zlib, effectively boosting the compression ratio.
See also: http://slides.zetatech.org/haenel-ep14-compress-me-stupid.pdf page 23 onwards
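To make the shuffle idea concrete, here is a minimal standalone sketch of a byte shuffle in plain C. It is not Blosc's own implementation, which is optimized and may differ in detail; the function names `shuffle_bytes`, `unshuffle_bytes` and `longest_run` are made up for illustration. The idea: for elements of `typesize` bytes, store byte 0 of every element first, then byte 1 of every element, and so on, so that bytes which rarely change (typically the high-order ones) end up next to each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Group bytes by their position inside each element:
 * dest holds byte 0 of all elements, then byte 1 of all elements, ... */
void shuffle_bytes(size_t typesize, size_t nelems,
                   const uint8_t *src, uint8_t *dest)
{
    assert(typesize > 0);
    for (size_t j = 0; j < typesize; j++)
        for (size_t i = 0; i < nelems; i++)
            dest[j * nelems + i] = src[i * typesize + j];
}

/* Inverse transform: restore the original element layout. */
void unshuffle_bytes(size_t typesize, size_t nelems,
                     const uint8_t *src, uint8_t *dest)
{
    assert(typesize > 0);
    for (size_t j = 0; j < typesize; j++)
        for (size_t i = 0; i < nelems; i++)
            dest[i * typesize + j] = src[j * nelems + i];
}

/* Length of the longest run of identical bytes: a crude proxy for how
 * easy the buffer is for an LZ-style or deflate-style codec. */
size_t longest_run(const uint8_t *buf, size_t n)
{
    size_t best = 0, cur = 0;
    for (size_t i = 0; i < n; i++) {
        cur = (i > 0 && buf[i] == buf[i - 1]) ? cur + 1 : 1;
        if (cur > best)
            best = cur;
    }
    return best;
}
```

For an array of 256 consecutive 32-bit integers starting at zero, the raw buffer only contains short runs of zero bytes, while the shuffled buffer contains one run of several hundred zeros, because three of the four byte positions are zero in every element. That kind of layout is what lets zlib reach very high ratios on such data.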
https://www.youtube.com/watch?v=IzqlWUTndTo at 9:39 provides the comparative chart I was looking for. LZ4 does indeed seem a bit faster overall across the range, linear, and random distributions.
At 11:19 there are charts of each compressor vs. memcpy for each distribution type.
I see Intel Core i5 tests for each supported compressor at http://blosc.org/synthetic-benchmarks.html, so perhaps the benchmark distortion has some other source?