OneArb commented on July 19, 2024

Closed; further research answered most of my questions.

from c-blosc.

FrancescAlted commented on July 19, 2024

Yes, the default compressor in Blosc (BloscLZ) is geared towards speed, not
compression ratio, but the included LZ4HC or Zlib codecs may get better ratios,
especially when using large blocksizes. Does this match your research, or
did you find something different?
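
To illustrate why larger blocksizes tend to help the ratio, here is a minimal sketch that compresses a buffer in independent blocks. It uses Python's standard `zlib` module as a stand-in rather than Blosc's own blocking, and the helper names and the synthetic test data are assumptions made up for this example.

```python
import zlib

def compress_in_blocks(data, blocksize, level=6):
    """Compress each blocksize-byte slice of data independently with zlib."""
    return [zlib.compress(data[i:i + blocksize], level)
            for i in range(0, len(data), blocksize)]

def decompress_blocks(blocks):
    """Inverse of compress_in_blocks: decompress and concatenate the blocks."""
    return b"".join(zlib.decompress(block) for block in blocks)

def ratio(data, blocks):
    """Uncompressed size divided by total compressed size."""
    return len(data) / sum(len(block) for block in blocks)

# Hypothetical test data: 1 MiB with a short repeating pattern.
data = bytes((i * 7) % 251 for i in range(1 << 20))

if __name__ == "__main__":
    for blocksize in (4096, 65536, 1 << 20):
        blocks = compress_in_blocks(data, blocksize)
        print(blocksize, round(ratio(data, blocks), 1))
```

Each block starts with an empty history, so small blocks have to re-learn the same patterns and pay the per-block overhead again. Small blocks are friendlier to CPU caches and parallel decompression, so this is a trade-off rather than a free win.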

2014-12-03 9:08 GMT+01:00 OneArb:

Closed #73.

Francesc Alted

OneArb commented on July 19, 2024
  1. I found a few compression benchmark overviews:

http://compressionratings.com/sort.cgi?rating_sum.brief+6n

https://docs.google.com/spreadsheet/ccc?key=0AiLIAFlgldSodENkNEhIM3lDZEtBTlFUQ29FdWhvTEE&usp=sharing#gid=2

http://heartofcomp.altervista.org/MOC/MOCACE.htm

Would it be worth submitting Blosc to get it into the fray?

Looking over the benchmark section, I notice that BloscLZ is the only decompressor able to outperform memcpy, at least on your machine.

The Blosc zlib benchmark (http://www.blosc.org/benchmarks-zlib.html) uses a different compression-ratio scale than the other compressors. Its axis also starts at 0 (vs. 1), which interferes with the graph's readability.

A chart across compressors would ease comparison.

I sure would like to see BloscLZ take its due place within the compressor benchmark community.

  2. simple.c uses almost all CPU bandwidth on my 2-core machine. Is that expected?

esc commented on July 19, 2024

Regarding the zlib Benchmarks, the first measurement is also at one, but because zlib has such high compression ratios, especially with that dataset, it looks like the measurement is at zero. Ideally we should start all graphs at one, since this means "no compression".
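
As a small illustration of why a ratio of one is the natural baseline, here is a sketch using Python's standard `zlib` module (not Blosc itself). The function name and the all-zeros sample buffer are assumptions chosen for the example.

```python
import zlib

def compression_ratio(data, level):
    """Uncompressed size divided by compressed size; about 1.0 means no gain."""
    return len(data) / len(zlib.compress(data, level))

# Hypothetical sample: 1 MiB of zero bytes, which is highly compressible.
sample = bytes(1 << 20)

# Level 0 stores the data without compressing it, so the ratio sits just
# below 1 because of the small container overhead.
stored = compression_ratio(sample, 0)

# A real compression level gives a ratio far above 1 on this buffer.
packed = compression_ratio(sample, 6)

if __name__ == "__main__":
    print("stored:", stored)
    print("packed:", packed)
```

On a linear axis that has to fit a very large ratio, a point at one is visually indistinguishable from zero, which is the effect described above.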

Regarding the speed of BloscLZ, I believe what you are seeing is a distortion due to measurement. The only benchmarks we have listed for LZ4 right now are from a BlueGene. This is an HPC architecture, and let's just say things behave differently there than on commodity hardware. I believe that both LZ4 and BloscLZ (maybe Snappy too) can outperform memcpy when driven by Blosc. The reason we don't have any LZ4 benchmarks listed yet is that driving LZ4 from Blosc has only been officially supported for about a year now. Support for BloscLZ is much older, so many benchmarks have accumulated for that one.

esc commented on July 19, 2024

FYI: the reason we get these "off-the-charts" ratios for zlib is the shuffle filter in Blosc, which can pre-condition certain datasets favorably for zlib, effectively boosting the compression ratio.

See also: http://slides.zetatech.org/haenel-ep14-compress-me-stupid.pdf page 23 onwards
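
To make the shuffle effect concrete, here is a minimal sketch of a byte-shuffle (byte transposition by item size) applied in front of Python's standard `zlib`. This is plain Python for illustration, not Blosc's optimized implementation, and the function names and the linear int64 test data are assumptions for this example.

```python
import struct
import zlib

def shuffle(data, typesize):
    """Group byte 0 of every item, then byte 1 of every item, and so on."""
    if len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    return b"".join(data[k::typesize] for k in range(typesize))

def unshuffle(data, typesize):
    """Inverse of shuffle: interleave the byte planes back into items."""
    if len(data) % typesize:
        raise ValueError("data length must be a multiple of typesize")
    n = len(data) // typesize
    out = bytearray(len(data))
    for k in range(typesize):
        out[k::typesize] = data[k * n:(k + 1) * n]
    return bytes(out)

# Hypothetical dataset: 100,000 little-endian int64 values 0, 1, 2, ...
n_items = 100_000
raw = struct.pack("<%dq" % n_items, *range(n_items))

plain_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(shuffle(raw, 8)))

if __name__ == "__main__":
    print("ratio without shuffle:", len(raw) / plain_size)
    print("ratio with shuffle:   ", len(raw) / shuffled_size)
```

After shuffling, the mostly constant high-order bytes of neighbouring values end up next to each other in long runs, which is the kind of input zlib handles very well. Data without that structure, such as random bytes, would not be expected to benefit.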

OneArb commented on July 19, 2024

https://www.youtube.com/watch?v=IzqlWUTndTo at 9:39
provides the comparative chart I was looking for. LZ4 does indeed seem a bit faster overall across the range, linear, and random distributions.

At 11:19 there are charts of each compressor vs. memcpy for each distribution type.

I see Intel Core i5 tests for each supported compressor on http://blosc.org/synthetic-benchmarks.html, so perhaps the benchmark distortion has some other source?
