
blosc / c-blosc

A blocking, shuffling and loss-less compression library that can be faster than `memcpy()`.

Home Page: https://www.blosc.org

License: Other

Python 0.15% C 91.69% CMake 0.79% Makefile 0.89% Shell 0.19% Batchfile 0.01% Starlark 0.08% SAS 0.03% Ada 1.52% Assembly 0.51% Pascal 1.28% C# 0.92% C++ 0.77% M4 0.01% DIGITAL Command Language 0.47% Roff 0.13% HTML 0.51% Module Management System 0.03%
c compression fast

c-blosc's Introduction

Blosc: A blocking, shuffling and lossless compression library

Author: Blosc Development Team
Contact: [email protected]
URL: https://www.blosc.org

What is it?

Blosc is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() call. Blosc is the first compressor (that I'm aware of) that is meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate memory-bound computations.

It uses the blocking technique to reduce activity on the memory bus as much as possible. In short, this technique works by dividing datasets into blocks that are small enough to fit in the caches of modern processors and performing compression / decompression there. It also leverages, if available, SIMD instructions (SSE2, AVX2) and the multi-threading capabilities of CPUs, in order to accelerate the compression / decompression process as much as possible.
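The blocking idea can be sketched in a few lines of C. This is only an illustration under stated assumptions: `process_in_blocks`, `block_fn`, `count_bytes` and the block size used in the usage note are hypothetical names and values chosen for the example, not Blosc's internal API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block callback. In a real compressor this would
   compress or decompress one cache-sized block. */
typedef void (*block_fn)(const uint8_t *block, size_t nbytes, void *user);

/* Walk `src` in chunks of at most `blocksize` bytes and return the number
   of blocks visited. The last block may be shorter than `blocksize`. */
static size_t process_in_blocks(const uint8_t *src, size_t nbytes,
                                size_t blocksize, block_fn fn, void *user) {
  size_t nblocks = 0;
  size_t offset = 0;

  assert(blocksize > 0);
  while (offset < nbytes) {
    size_t remaining = nbytes - offset;
    size_t len = remaining < blocksize ? remaining : blocksize;
    fn(src + offset, len, user);
    offset += len;
    nblocks++;
  }
  return nblocks;
}

/* Example callback: accumulates the number of bytes seen. */
static void count_bytes(const uint8_t *block, size_t nbytes, void *user) {
  (void)block;
  *(size_t *)user += nbytes;
}
```

For instance, calling `process_in_blocks` on a 100000-byte buffer with a 32768-byte block size visits three full blocks plus one shorter trailing block.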

See some benchmarks about Blosc performance.

Blosc is distributed using the BSD license, see LICENSE.txt for details.

Meta-compression and other differences over existing compressors

C-Blosc is not like other compressors: it is better described as a meta-compressor, because it can use different compressors and filters (programs that generally improve compression ratio). It can still be called a compressor, since it already ships with several compressors and filters and can therefore work like a regular codec.

Currently C-Blosc comes with support for BloscLZ (a compressor heavily based on FastLZ, https://ariya.github.io/FastLZ/), LZ4 and LZ4HC (http://www.lz4.org/), Snappy (https://google.github.io/snappy/), Zlib (https://zlib.net/) and Zstandard (https://facebook.github.io/zstd/).

C-Blosc also comes with highly optimized (they can use SSE2 or AVX2 instructions, if available) shuffle and bitshuffle filters (for info on how and why shuffling works see here). Additional compressors or filters may be added in the future.

Blosc is in charge of coordinating the different compressors and filters so that they can leverage the blocking technique as well as multi-threaded execution (if several cores are available) automatically. As a result, every codec and filter works at very high speed, even if it was not initially designed for blocking or multi-threading.

Finally, C-Blosc is especially suited to binary data because it can take advantage of the type size meta-information to improve the compression ratio by using the integrated shuffle and bitshuffle filters.
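To make the role of the type size concrete, here is a minimal, unoptimized sketch of a byte shuffle in C. It is not the library's SSE2/AVX2 code; the names `shuffle_sketch` and `unshuffle_sketch` are hypothetical. The idea is to gather byte 0 of every element, then byte 1, and so on, so that bytes that tend to be equal across elements (for example the zero high-order bytes of small integers) end up next to each other, where a codec can find long runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte shuffle: dest receives byte 0 of every element, then byte 1 of
   every element, and so on. Trailing bytes that do not form a whole
   element are copied unchanged. */
static void shuffle_sketch(size_t typesize, size_t nbytes,
                           const uint8_t *src, uint8_t *dest) {
  size_t nelems, i, j;

  assert(typesize > 0);
  nelems = nbytes / typesize;
  for (j = 0; j < typesize; j++) {
    for (i = 0; i < nelems; i++) {
      dest[j * nelems + i] = src[i * typesize + j];
    }
  }
  memcpy(dest + nelems * typesize, src + nelems * typesize,
         nbytes - nelems * typesize);
}

/* Inverse transform: restores the original element layout. */
static void unshuffle_sketch(size_t typesize, size_t nbytes,
                             const uint8_t *src, uint8_t *dest) {
  size_t nelems, i, j;

  assert(typesize > 0);
  nelems = nbytes / typesize;
  for (j = 0; j < typesize; j++) {
    for (i = 0; i < nelems; i++) {
      dest[i * typesize + j] = src[j * nelems + i];
    }
  }
  memcpy(dest + nelems * typesize, src + nelems * typesize,
         nbytes - nelems * typesize);
}
```

With a type size of 4, the two little-endian integers 1 and 2 stored as `01 00 00 00 02 00 00 00` become `01 02 00 00 00 00 00 00` after shuffling: the six zero bytes are now contiguous.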

When taken together, all these features set Blosc apart from other compression libraries.

Compiling the Blosc library

Blosc can be built, tested and installed using CMake. The following procedure describes the "out of source" build.

  $ cd c-blosc
  $ mkdir build
  $ cd build

Now run CMake configuration and optionally specify the installation directory (e.g. '/usr' or '/usr/local'):

  $ cmake -DCMAKE_INSTALL_PREFIX=your_install_prefix_directory ..

CMake allows you to configure Blosc in many different ways, such as preferring internal or external sources for compressors or enabling/disabling them. Note that configuration can also be performed using the UI tools provided by CMake (ccmake or cmake-gui):

  $ ccmake ..      # run a curses-based interface
  $ cmake-gui ..   # run a graphical interface

Build, test and install Blosc:

  $ cmake --build .
  $ ctest
  $ cmake --build . --target install

The static and dynamic versions of the Blosc library, together with the header files, will be installed into the specified CMAKE_INSTALL_PREFIX.

Codec support with CMake

C-Blosc comes with full sources for LZ4, LZ4HC, Snappy, Zlib and Zstd. In general, you should not worry about not having these libraries on your system (or about CMake not finding them), because by default the included sources are compiled automatically and included in the C-Blosc library. This means that you can count on complete support for all the codecs in all Blosc deployments (unless you explicitly exclude support for some of them).

But in case you want to force Blosc to use external codec libraries instead of the included sources, you can do that:

  $ cmake -DPREFER_EXTERNAL_ZSTD=ON ..

You can also disable support for some compression libraries:

  $ cmake -DDEACTIVATE_SNAPPY=ON ..  # in case you don't have a C++ compiler

Examples

In the examples/ directory you can find hints on how to use Blosc inside your app.

Supported platforms

Blosc is meant to support all platforms where a C89 compliant C compiler can be found. The most thoroughly tested ones are Intel (Linux, Mac OSX and Windows) and ARM (Linux), but exotic ones such as the IBM Blue Gene Q embedded "A2" processor are reported to work too.

Mac OSX troubleshooting

If you run into compilation troubles when using Mac OSX, please make sure that you have installed the command line developer tools. You can always install them with:

  $ xcode-select --install

Wrapper for Python

Blosc has an official wrapper for Python. See:

https://github.com/Blosc/python-blosc

Command line interface and serialization format for Blosc

Blosc can be used from command line by using Bloscpack. See:

https://github.com/Blosc/bloscpack

Filter for HDF5

For those who want to use Blosc as a filter in the HDF5 library, there is a sample implementation in the hdf5-blosc project in:

https://github.com/Blosc/hdf5-blosc

Mailing list

There is an official mailing list for Blosc at:

[email protected] https://groups.google.com/g/blosc

Acknowledgments

See THANKS.rst.


Enjoy data!

c-blosc's People

Contributors

albertosm27, asford, avalentino, ax3l, crspeller, dependabot[bot], derobins, dimitripapadopoulos, emmenlau, esc, extrowerk, francescalted, havardaasen, iblislin, jack-pappas, jakirkham, jbms, juliantaylor, kdm9, keszybz, kiyo-masui, lasote, ldeakin, mgorny, mkitti, nmoinvaz, t20100, tnorth, wenjuno, yasushima-gd


c-blosc's Issues

warning: ‘__nodebug__’ attribute directive ignored

When compiling with GCC (on Linux) I am seeing a lot of warnings like:

c-blosc/blosc/shuffle-avx2.c: In function ‘_mm256_loadu2_m128i’:
c-blosc/blosc/shuffle-avx2.c:48:1: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
c-blosc/blosc/shuffle-avx2.c: At top level:
c-blosc/blosc/shuffle-avx2.c:56:1: warning: ‘__nodebug__’ attribute directive ignored [-Wattributes]

This does not happen with clang or Visual Studio. @jack-pappas any hint on this?

AVX2 binaries won't be redistributable

@juliantaylor just warned that a binary built with AVX2 will not work on non-AVX2 machines. This is a big problem because Blosc tends to be embedded in larger binary packages. The only solution would be to disable AVX2 usage at run time on non-AVX2 machines. @littlezhou do you think this can be done easily?
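One common way to keep a single binary redistributable is to pick the implementation at run time rather than at compile time. The following is a minimal sketch of that pattern, not the solution adopted in Blosc. It assumes GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available, and falls back to the generic path everywhere else. The names `kernel_generic`, `kernel_avx2_stub`, `host_has_avx2` and `select_kernel` are hypothetical, and the AVX2 kernel is only a placeholder for code that would normally live in a separate file compiled with AVX2 enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*kernel_fn)(const uint8_t *src, uint8_t *dest, size_t nbytes);

/* Portable implementation that runs on any CPU. */
static void kernel_generic(const uint8_t *src, uint8_t *dest, size_t nbytes) {
  assert(src != NULL && dest != NULL);
  memcpy(dest, src, nbytes);
}

/* Placeholder standing in for an AVX2-only implementation. */
static void kernel_avx2_stub(const uint8_t *src, uint8_t *dest, size_t nbytes) {
  assert(src != NULL && dest != NULL);
  memcpy(dest, src, nbytes);
}

/* Returns 1 if the host CPU reports AVX2 support, 0 otherwise
   (including on compilers or architectures we cannot query). */
static int host_has_avx2(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx2") != 0;
#else
  return 0;
#endif
}

/* Choose the kernel once; callers then go through the function pointer. */
static kernel_fn select_kernel(void) {
  return host_has_avx2() ? kernel_avx2_stub : kernel_generic;
}
```

With this layout only the AVX2 source file needs AVX2 code generation enabled, so the rest of the binary stays executable on older CPUs.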

merge PR#81 broke MSVC compatibility

The merge of PR #81 broke MSVC compatibility (at least when used with bcolz). MSVC complains of:

blosclz.obj : error LNK2019: unresolved external symbol _llabs referenced in function _blosclz_decompress

The culprit seems to be that the PR replaced abs(op-ref) with llabs(op-ref) in blosc\blosclz.c which seems to be undefined for MSVC. The problem persists for the current tip.
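One possible workaround, not necessarily the fix applied upstream, is to avoid the library call altogether and compute the distance with a comparison. The helper name `ptr_distance` is hypothetical, and the sketch assumes both pointers refer to the same buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between two pointers into the same buffer,
   computed without abs() or llabs(). */
static size_t ptr_distance(const uint8_t *a, const uint8_t *b) {
  assert(a != NULL && b != NULL);
  return a >= b ? (size_t)(a - b) : (size_t)(b - a);
}
```

Because the result is unsigned and pointer-sized, the helper does not depend on which absolute-value functions a given C runtime provides.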

blosc use case

I am checking if I could use blosc to compress 1000 char long strings or so.

As a test I am using the string "Methionylthreonylthreonylglutaminyla..." which is highly repetitive.

http://blog.jmay.us/2009/11/longest-english-word.html

I modified simple.c and the best I can get is 1.5x compression with shuffle and 2.8x without shuffle at clevel 9

Without shuffle:

chars   ratio
1000    1.4x
2000    1.8x
3000    2x
4000    2.1x
5000    2.3x

ZIP compresses the full string to 5.5x

My settings follow:

#define LINESIZE 98310
#define SIZE 100000
#define SHAPE {10,10,10}
#define CHUNKSHAPE {1,10,10}

static unsigned char data[LINESIZE];
static unsigned char data_out[SIZE];
static unsigned char data_dest[LINESIZE];

Questions: Am I within expected compression ratios without switching to Zlib?

Is the block / string I intend to compress too small for the blosc use case?

Is there any prospect for blosc to support indexed and random access of compressed blocks?

Any suggestions for high-performance compression of "small" strings?

AVX2 implementation segfaults

Here is a way to make it crash (on an AVX2 machine, because on an SSE2 one this works well):

 ~/c-blosc/build $ bench/bench blosclz single 1 262140 4 4
Blosc version: 1.6.1.dev ($Date:: 2015-04-20 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib
Supported compression libraries:
  BloscLZ: 1.0.3
  LZ4: 1.6.0
  Snappy: 1.1.1
  Zlib: 1.2.8
Using compressor: blosclz
Running suite: single
--> 1, 262140, 4, 4, blosclz
********************** Run info ******************************
Blosc version: 1.6.1.dev ($Date:: 2015-04-20 #$)
Using synthetic data with 4 significant bits (out of 32)
Dataset size: 262140 bytes      Type size: 4 bytes
Working set: 256.0 MB           Number of threads: 1
********************** Running benchmarks *********************
memcpy(write):             31.8 us, 7850.6 MB/s
memcpy(read):              21.1 us, 11845.5 MB/s
Compression level: 0
comp(write):       22.3 us, 11206.0 MB/s          Final bytes: 262156  Ratio: 1.00
decomp(read):      21.1 us, 11855.2 MB/s          OK
Compression level: 1
comp(write):       41.5 us, 6027.8 MB/s   Final bytes: 2216  Ratio: 118.29
Segmentation fault

This is using current master. @littlezhou can you have a look at this please?

Blosc 1.6.1 fails to build on PPC G4, OS X 10.5.8

I am using Macports to install most of my personal software built from source. The first (install) build of Blosc I used was v1.5.2, and have upgraded with no trouble through Blosc v1.6.0. Blosc v1.6.1 fails.
GCC 4.2 (Apple's version) tries to compile shuffle-sse2.c which, of course, is not needed, and cannot be compiled on a PowerPC. I am attaching a PNG of the fail point in the build:

failpoint

I realize that making sure code compiles on a PPC G4 is not your top priority, but if you can find the time your help would be appreciated!

Build HDF5 filter in CI builds

The BUILD_HDF5_FILTER option in CMakeLists.txt defaults to OFF; it currently isn't overridden for CI builds so the blosc HDF5 filter isn't built/tested.

I'm opening this issue as a reminder to implement the necessary steps in the CI builds to fetch the HDF5 source (or binaries with source headers, if available) then build the blosc HDF5 filter against it so we know it compiles and works on some basic level.

MinGW blosc_decompress_ctx decompression exception with -O3

I have compiled the library with MinGW and am playing with different combinations of settings for blosc_compress_ctx and blosc_decompress_ctx. I call them from C# via PInvoke.

Everything works fine on a single call, but on repeated calls decompression fails. I narrowed the problem down to numinternalthreads in blosc_decompress_ctx: when it is above 1, repeated calls fail. I haven't noticed a pattern of failure yet, but input data and order of tests change the point of failure. Compression works OK with multiple threads.

Exceptions are System.AccessViolationException and System.Runtime.InteropServices.SEHException: External component has thrown an exception. and have nothing to do with C#.

I could have compiled the library incorrectly, but compression and single-threaded decompression do work. Benchmark single also works well for all compressors...
...
And (after another hour) funny enough, everything works with -O1, -O2 optimization options and fails only with -O3 option (regardless of -g option). This -O3 option was set inside Makefile.mingw in bench folder, so I used it by default and was going to try to debug the code and switched to plain -g2.

Here is my C# code of decompress, nothing special (same as in Snappy.NET):

unsafe int INativeLibraryFacade.blosc_decompress_ctx(byte[] src, byte[] dest) {
    if (src == null) throw new ArgumentNullException("src");

    var destSize = new UIntPtr((uint)dest.Length);
    var numThreads = new IntPtr(4); // new IntPtr(System.Environment.ProcessorCount);
    checked {
        fixed (byte* srcPtr = &src[0])
        fixed (byte* destPtr = &dest[0]) {
            var decompSize = Native.blosc_decompress_ctx(
                srcPtr, destPtr, destSize, numThreads);
            if (decompSize <= 0) throw new ApplicationException("Invalid compression input");
            return decompSize;
        }
    }
}

... part of Native ...
[DllImport(LibraryName, CallingConvention = CallingConvention.Cdecl)]
        private static unsafe extern int blosc_decompress_ctx(byte* src, byte* dest,
                                    UIntPtr destsize, IntPtr numinternalthreads);

My test iterates over 1K arrays of doubles, decimals and longs with different settings (clevel = 9):

[Test]
public void CouldCompressAndDecompress() {
    for (int round = 0; round < 100; round++) {
        // library and shuffle
        Compress(CompressionMethod.blosclz, true);
        Compress(CompressionMethod.blosclz, false);

        Compress(CompressionMethod.lz4, true);
        Compress(CompressionMethod.lz4, false);

        Compress(CompressionMethod.zlib, true);
        Compress(CompressionMethod.zlib, false);
    }
}

Bug in blosclz (Invalid write of size 1)

It seems to affect blosclz only, and I can reproduce it on both SSE2 and AVX2 machines:

$ valgrind build/bench/bench blosclz single 1 4194280 12 25
==4771== Memcheck, a memory error detector
==4771== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4771== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==4771== Command: build/bench/bench blosclz single 1 4194280 12 25
==4771== 
Blosc version: 1.6.1.dev ($Date:: 2015-04-20 #$)
List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib
Supported compression libraries:
  BloscLZ: 1.0.3
  LZ4: 1.6.0
  Snappy: 1.1.1
  Zlib: 1.2.8
Using compressor: blosclz
Running suite: single
--> 1, 4194280, 12, 25, blosclz
********************** Run info ******************************
Blosc version: 1.6.1.dev ($Date:: 2015-04-20 #$)
Using synthetic data with 25 significant bits (out of 32)
Dataset size: 4194280 bytes     Type size: 12 bytes
Working set: 256.0 MB           Number of threads: 1
********************** Running benchmarks *********************
memcpy(write):           3987.5 us, 1003.1 MB/s
memcpy(read):            3175.1 us, 1259.8 MB/s
Compression level: 0
comp(write):     3192.6 us, 1252.9 MB/s   Final bytes: 4194296  Ratio: 1.00
decomp(read):    3133.0 us, 1276.7 MB/s   OK
Compression level: 1
==4771== Invalid write of size 1
==4771==    at 0x5045B16: blosclz_compress (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x50432AC: blosc_c (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x5045262: do_job (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x5045524: blosc_compress_context (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x5045609: blosc_compress (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x402742: do_bench (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771==    by 0x4017C1: main (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771==  Address 0x7681078 is 0 bytes after a block of size 4,194,296 alloc'd
==4771==    at 0x4C2D136: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4771==    by 0x4C2D251: posix_memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4771==    by 0x402354: do_bench (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771==    by 0x4017C1: main (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771== 
==4771== Invalid write of size 1
==4771==    at 0x5045B1D: blosclz_compress (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x50432AC: blosc_c (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x5045262: do_job (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x5045524: blosc_compress_context (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x5045609: blosc_compress (in /home/faltet/blosc/c-blosc/build/blosc/libblosc.so.1.6.1)
==4771==    by 0x402742: do_bench (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771==    by 0x4017C1: main (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771==  Address 0x7681079 is 1 bytes after a block of size 4,194,296 alloc'd
==4771==    at 0x4C2D136: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4771==    by 0x4C2D251: posix_memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4771==    by 0x402354: do_bench (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771==    by 0x4017C1: main (in /home/faltet/blosc/c-blosc/build/bench/bench)
==4771== 
^C==4771== 
==4771== HEAP SUMMARY:
==4771==     in use at exit: 281,053,976 bytes in 78 blocks
==4771==   total heap usage: 79,537 allocs, 79,459 frees, 322,534,408 bytes allocated
==4771== 
==4771== LEAK SUMMARY:
==4771==    definitely lost: 48 bytes in 8 blocks
==4771==    indirectly lost: 0 bytes in 0 blocks
==4771==      possibly lost: 0 bytes in 0 blocks
==4771==    still reachable: 281,053,928 bytes in 70 blocks
==4771==         suppressed: 0 bytes in 0 blocks
==4771== Rerun with --leak-check=full to see details of leaked memory
==4771== 
==4771== For counts of detected and suppressed errors, rerun with: -v
==4771== ERROR SUMMARY: 50 errors from 2 contexts (suppressed: 0 from 0)

convert blosc to library

Could you please consider converting blosc to a shared library? Especially since blosc releases are separate from pytables bumps.

Having it as a library is convenient in at least two ways:

  • allow users to bump blosc without the need of recompiling pytables
  • prevents potential security issues with software bundling older versions of blosc

I am willing to provide patches if you're interested.
Cheers,
Kacper

Downstream bug report: http://bugs.gentoo.org/show_bug.cgi?id=345953

Bus error on architectures that are not 32bit aligned

See http://lists.alioth.debian.org/pipermail/debian-science-maintainers/2012-February/011508.html

Program received signal SIGBUS, Bus error.
blosc_d (blocksize=32768, leftoverblock=<optimized out>, src=0x132f543 "S", dest=0x1421808 "", tmp=0x1429810 "", tmp2=
    0x1431830 "") at blosc/blosc.c:330
330 cbytes = sw32(((uint32_t *)(src))[0]); /* amount of compressed bytes */
(gdb) print src
$1 = (uint8_t *) 0x132f543 "S"
(gdb) print ((uint32_t *)(src))[0]
$2 = 1392508928
(gdb) print sw32(((uint32_t *)(src))[0])
$3 = 83
(gdb) print cbytes
$4 = <optimized out>
(gdb) bt
#0 blosc_d (blocksize=32768, leftoverblock=<optimized out>, src=0x132f543 "S", dest=0x1421808 "", tmp=0x1429810 "", tmp2=
    0x1431830 "") at blosc/blosc.c:330
#1 0xf5bde754 in serial_blosc () at blosc/blosc.c:422
#2 0xf5be043c in blosc_filter (flags=257, cd_nelmts=6, cd_values=<optimized out>, nbytes=186, buf_size=0xffff84c8, buf=
    0xffff84c4) at blosc/blosc_filter.c:229
#3 0xf5b36e04 in H5Z_pipeline () from /usr/lib/libhdf5.so.7
#4 0xf5917ca4 in H5D_chunk_lock () from /usr/lib/libhdf5.so.7
#5 0xf5918d28 in ?? () from /usr/lib/libhdf5.so.7
#6 0xf5918d28 in ?? () from /usr/lib/libhdf5.so.7
Backtrace stopped: previous frame identical to this frame (corrupt stack?)


   0xf5bddfd0 <+144>: call 0xf5be0060 <blosclz_decompress>
   0xf5bddfd4 <+148>: nop
   0xf5bddfd8 <+152>: cmp %l2, %o0
   0xf5bddfdc <+156>: bne,pn %icc, 0xf5bde0b0 <blosc_d+368>
   0xf5bddfe0 <+160>: inc %l3
   0xf5bddfe4 <+164>: add %i2, %i1, %i2
   0xf5bddfe8 <+168>: add %l1, %o0, %l1
   0xf5bddfec <+172>: cmp %l3, %l5
   0xf5bddff0 <+176>: bge,pn %icc, 0xf5bde04c <blosc_d+268>
   0xf5bddff4 <+180>: add %l4, %o0, %l4
=> 0xf5bddff8 <+184>: ld [ %i2 ], %o0
   0xf5bddffc <+188>: call 0xf5bddd40 <sw32>
   0xf5bde000 <+192>: add %i2, 4, %i2
   0xf5bde004 <+196>: mov %o0, %i1
   0xf5bde008 <+200>: mov %l1, %o2
   0xf5bde00c <+204>: mov %i2, %o0
   0xf5bde010 <+208>: mov %i1, %o1
   0xf5bde014 <+212>: cmp %i1, %l2
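The faulting instruction above is a 32-bit load (`ld`) from an address that is not a multiple of 4 (`src=0x132f543`), which strict-alignment architectures reject. A common way to avoid this class of fault is to assemble the value byte by byte instead of casting the pointer to `uint32_t *`. The sketch below is an illustration, not the patch applied to Blosc. The helper name `load_le32` is hypothetical, and it assumes the length field is stored in little-endian order, which is consistent with the gdb session above (byte `'S'` = 83 first, decoded value 83).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from any address, aligned or not.
   Only single-byte loads are used, so no alignment is required, and the
   result does not depend on the host byte order. */
static uint32_t load_le32(const uint8_t *p) {
  assert(p != NULL);
  return (uint32_t)p[0] |
         ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}
```

A call such as `cbytes = load_le32(src);` would then replace the pointer cast followed by the byte swap, on both little-endian and big-endian hosts.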

Enable unit tests based on host CPU feature detection

The AVX2 unit tests are still disabled (in tests/CMakeLists.txt). Ideally, I think we want to use the new CPU feature-detection logic in blosc along with CMake to determine which features the host CPU supports and enable the unit tests accordingly. E.g., if the host supports SSE2 and AVX2, both of those test suites will be enabled within the CMake script itself; if a host doesn't support AVX2 (for example), the tests wouldn't be enabled by CMake and therefore wouldn't even run.

Alternatively, we could use the CPU feature detection within the test programs themselves to determine if the host CPU supports the features necessary for the test to run. In the case where it doesn't, we'd just return a success code (which is what the AVX2 tests are currently doing). Or, instead of returning a success code, it would be better if CMake / CTest supported returning an exit code indicating the test was "ignored" (many of the xUnit-style testing frameworks include this); I don't know whether CTest supports this, so some research would be required.

Any solution implemented for this should be designed under the assumption blosc will support additional CPU feature sets in the future (i.e., not limited to SSE2 and AVX2).

AVX2 support on VS2013 does not work

Here is the error that I am seeing using VS2013 Professional:

3>  shuffle-avx2.c
7>------ Rebuild All started: Project: test_getitem, Configuration: Release x64 ------
4>C:\Users\Francesc\Desktop\c-blosc2\blosc\shuffle-avx2.c(14): fatal error C1189: #error :  AVX2 is not supported by the target architecture/platform and/or this compiler.
2>C:\Users\Francesc\Desktop\c-blosc2\blosc\shuffle-avx2.c(14): fatal error C1189: #error :  AVX2 is not supported by the target architecture/platform and/or this compiler.

Update blosc lz4 version to r126

When lz4 r125 is released blosc should update to this revision. There is a rare case that causes a crash and a potential security issue that will be fixed in this revision.

TODO for next release

I propose to get the next release out by mid January 2014. I am especially interested in integrating the support for alternative codecs we worked on this summer. To that end I can currently remember the following stories:

  • Improve snappy support. We currently build against the snappy dynamic library and hard require this to be installed. Would be nice to support conditional compile as well as compiling snappy into the blosc binary. Snappy is built with CMake too (IIRC) so should be able to tie it in.
  • Debug remaining issues with Snappy on OSX. @FrancescAlted I recall you had some segfaults.
  • Support for zlib. Needs to be written and tested.
  • Test Windows/OSX Support for the above

The current development is at http://git.io/xjHYkg (lz4 support is ready, since we have added the sources to the blosc tree).

Anything else I have missed?

Deal with snappy dependency

Either make snappy optional, if we continue to link against libsnappy, or subtree merge (SVN equivalent OSLT) snappy sources and handle building it into Blosc using cmake. Snappy uses cmake too.

Library version number is outdated

This is a minor issue, but the current lib version in CMakeLists.txt is still set to 1.1.6, and should be updated (perhaps with a script as recommended in this file).

allow 64-bit buffer sizes

blosc_compress returns the size of compressed block according to blosc.h. However, the return type is int, which has an obvious susceptibility to overflow in the common case of a 64-bit machine where int is 32 bits.

Update: Upon looking through blosc.c, it seems that 32-bit limits are found throughout the code, not just in the API. At least on 64-bit machines, it would be highly desirable to support 64-bit buffer sizes.

Ideally, you would return size_t. But this is unsigned, and I see that you want to use negative return values to indicate errors. Two possibilities:

  • Return ptrdiff_t (C90) or (in C99) intptr_t. This is a 64-bit signed value on 64-bit machines, and a 32-bit signed value on 32-bit machines.
  • Return size_t, and return an error code in some other way. errno would be the obvious way for blosc_compress. For blosc_compress_ctx one option would be to add an int *errno parameter.
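The first option can be sketched as follows. This is a hypothetical illustration of the calling convention only: `store_sketch` and `SKETCH_ERR_DEST_TOO_SMALL` are invented names, and the function merely copies bytes rather than compressing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical error code for the sketch. */
#define SKETCH_ERR_DEST_TOO_SMALL (-1)

/* Copies `nbytes` from src to dest. Returns the number of bytes written,
   or a negative value on error. ptrdiff_t is signed, so negative error
   codes still work, and it is as wide as a pointer on common platforms,
   so large sizes are not truncated the way they can be with int. */
static ptrdiff_t store_sketch(const void *src, size_t nbytes,
                              void *dest, size_t destsize) {
  assert(src != NULL && dest != NULL);
  if (nbytes > destsize || nbytes > (size_t)PTRDIFF_MAX) {
    return SKETCH_ERR_DEST_TOO_SMALL;
  }
  memcpy(dest, src, nbytes);
  return (ptrdiff_t)nbytes;
}
```

Callers keep the familiar `if (result < 0)` check and only need to widen the variable that receives the result.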

do_job() return value assigned to an unsigned int

If the allocation of temporaries fails, do_job() returns -1. But on line 713 and line 734 :

ntbytes = do_job();
if (ntbytes < 0) {
   return -1;
}

Here ntbytes is a uint32_t (line 623). The result will be converted to an unsigned int, so the check can never be true and the failure will never be caught. (Right?)
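The effect is easy to reproduce in isolation. In the sketch below, `failing_job` is a hypothetical stand-in for a do_job() call whose allocation failed. The first helper shows what an unsigned variable actually holds after the assignment, and the second shows one possible fix: keep the result in a signed variable until the error check is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for do_job() when the allocation of temporaries fails. */
static int failing_job(void) {
  return -1;
}

/* What a uint32_t holds after being assigned the failure value:
   -1 wraps around to 4294967295, so a later `< 0` test is always false. */
static uint32_t stored_as_unsigned(void) {
  return (uint32_t)failing_job();
}

/* Possible fix: check the signed result first, then convert.
   Returns 0 on success and -1 on failure. */
static int checked_job(uint32_t *ntbytes_out) {
  int32_t ntbytes;

  assert(ntbytes_out != NULL);
  ntbytes = failing_job();
  if (ntbytes < 0) {
    return -1;
  }
  *ntbytes_out = (uint32_t)ntbytes;
  return 0;
}
```

Many compilers can flag the original pattern as a comparison that is always false when extra warnings are enabled.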

library install version incorrectly matches release version

With every release you make, your library's name changes. This is forcing all dependent projects to rebuild.

Please make sure that you only change the install version of your library when you make binary incompatible changes. This will allow a dependent project to build against version 1.3.4 and run against version 1.3.5.

Build failure on ubuntu 15.04

The current master fails to build on Ubuntu 15.04

$ ~/hdf5-1.8.15/src/cmake-3.2.1-Linux-x86_64/bin/cmake ..
-- The C compiler identification is GNU 4.9.2
-- The CXX compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
Configuring for Blosc version: 1.6.1.dev
-- Found LZ4 library: /usr/lib/x86_64-linux-gnu/liblz4.so
-- Found SNAPPY library: /usr/lib/libsnappy.so
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.8") 
-- SSE2 is here.  Adding support for it.
-- Adding run-time support for SSE2.
-- Adding run-time support for AVX2.
-- Looking for include file pthread.h
-- Looking for include file pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Configuring done
-- Generating done
-- Build files have been written to: /home/antonio/projects/c-blosc/b2
antonio@mac2:~/projects/c-blosc/b2$ make
Scanning dependencies of target blosc_shared
[  6%] Building C object blosc/CMakeFiles/blosc_shared.dir/blosc.c.o
[ 13%] Building C object blosc/CMakeFiles/blosc_shared.dir/blosclz.c.o
[ 20%] Building C object blosc/CMakeFiles/blosc_shared.dir/shuffle-generic.c.o
[ 26%] Building C object blosc/CMakeFiles/blosc_shared.dir/shuffle-sse2.c.o
[ 33%] Building C object blosc/CMakeFiles/blosc_shared.dir/shuffle-avx2.c.o
[ 40%] Building C object blosc/CMakeFiles/blosc_shared.dir/shuffle.c.o
Linking C shared library libblosc.so
[ 40%] Built target blosc_shared
Scanning dependencies of target blosc_static
[ 46%] Building C object blosc/CMakeFiles/blosc_static.dir/blosc.c.o
[ 53%] Building C object blosc/CMakeFiles/blosc_static.dir/blosclz.c.o
[ 60%] Building C object blosc/CMakeFiles/blosc_static.dir/shuffle-generic.c.o
[ 66%] Building C object blosc/CMakeFiles/blosc_static.dir/shuffle-sse2.c.o
[ 73%] Building C object blosc/CMakeFiles/blosc_static.dir/shuffle-avx2.c.o
[ 80%] Building C object blosc/CMakeFiles/blosc_static.dir/shuffle.c.o
Linking C static library libblosc.a
[ 80%] Built target blosc_static
Scanning dependencies of target test_api
[ 86%] Building C object tests/CMakeFiles/test_api.dir/test_api.c.o
Linking C executable test_api
[ 86%] Built target test_api
Scanning dependencies of target test_basics
[ 93%] Building C object tests/CMakeFiles/test_basics.dir/test_basics.c.o
/home/antonio/projects/c-blosc/tests/test_basics.c:34:4: warning: #warning AVX2 shuffle tests not enabled. [-Wcpp]
   #warning AVX2 shuffle tests not enabled.
    ^
Linking C executable test_basics
CMakeFiles/test_basics.dir/test_basics.c.o: In function "main":
test_basics.c:(.text.startup+0x9a5): undefined reference to "shuffle_generic"
test_basics.c:(.text.startup+0x9bd): undefined reference to "unshuffle_generic"
test_basics.c:(.text.startup+0xb7c): undefined reference to "shuffle_sse2"
test_basics.c:(.text.startup+0xb8f): undefined reference to "shuffle_generic"
test_basics.c:(.text.startup+0xca5): undefined reference to "shuffle_generic"
test_basics.c:(.text.startup+0xcbd): undefined reference to "unshuffle_sse2"
test_basics.c:(.text.startup+0xe06): undefined reference to "shuffle_sse2"
test_basics.c:(.text.startup+0xe1e): undefined reference to "unshuffle_generic"
test_basics.c:(.text.startup+0xf4d): undefined reference to "shuffle_sse2"
test_basics.c:(.text.startup+0xf68): undefined reference to "unshuffle_sse2"
/usr/bin/ld: test_basics: hidden symbol `unshuffle_generic' isn't defined
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
tests/CMakeFiles/test_basics.dir/build.make:89: recipe for target "tests/test_basics" failed
make[2]: *** [tests/test_basics] Error 1
CMakeFiles/Makefile2:230: recipe for target "tests/CMakeFiles/test_basics.dir/all" failed
make[1]: *** [tests/CMakeFiles/test_basics.dir/all] Error 2
Makefile:146: recipe for target "all" failed
make: *** [all] Error 2

c-blosc-1.5.0 compilation broken on non-SSE2 platforms

See discussion here : https://trac.macports.org/ticket/45964#comment:6

On the test platform, if I compare c-blosc-1.4.1 (which compiles) and c-blosc-1.5.0 (which doesn't), the definitions of shuffle and unshuffle were updated in 1.5.0 in shuffle.h to make the argument _src const, but the change wasn't reflected in lines 493 and 498 (the conditionally executed non-SSE2 section) of shuffle.c.

This fixes the issue:

#else   /* no __SSE2__ available */

void shuffle(size_t bytesoftype, size_t blocksize,
             const uint8_t* _src, uint8_t* _dest) {
  _shuffle(bytesoftype, blocksize, _src, _dest);
}

void unshuffle(size_t bytesoftype, size_t blocksize,
               const uint8_t* _src, uint8_t* _dest) {
  _unshuffle(bytesoftype, blocksize, _src, _dest);
}

#endif  /* __SSE2__ */
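
For reference, here is a minimal sketch of what a const-correct generic shuffle pair might look like. The names sketch_shuffle, sketch_unshuffle and sketch_roundtrip_ok are hypothetical, and this is not the code from shuffle.c. It only illustrates the byte transposition, and the point that the const qualifier on the source pointer has to agree between the header prototype and every definition, including the non-SSE2 fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic shuffle: gathers byte j of every element into one
   contiguous run. The source pointer is const, as in the 1.5.0 prototype. */
void sketch_shuffle(size_t bytesoftype, size_t blocksize,
                    const uint8_t* src, uint8_t* dest) {
  size_t neblock, leftover, i, j;
  assert(bytesoftype > 0);
  neblock = blocksize / bytesoftype;    /* number of whole elements */
  leftover = blocksize % bytesoftype;   /* trailing bytes, copied as-is */
  for (j = 0; j < bytesoftype; j++)
    for (i = 0; i < neblock; i++)
      dest[j * neblock + i] = src[i * bytesoftype + j];
  memcpy(dest + (blocksize - leftover), src + (blocksize - leftover), leftover);
}

/* Hypothetical inverse of sketch_shuffle. */
void sketch_unshuffle(size_t bytesoftype, size_t blocksize,
                      const uint8_t* src, uint8_t* dest) {
  size_t neblock, leftover, i, j;
  assert(bytesoftype > 0);
  neblock = blocksize / bytesoftype;
  leftover = blocksize % bytesoftype;
  for (i = 0; i < neblock; i++)
    for (j = 0; j < bytesoftype; j++)
      dest[i * bytesoftype + j] = src[j * neblock + i];
  memcpy(dest + (blocksize - leftover), src + (blocksize - leftover), leftover);
}

/* Returns 1 if shuffle followed by unshuffle restores the input. */
int sketch_roundtrip_ok(void) {
  const uint8_t in[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
  uint8_t mid[10], out[10];
  sketch_shuffle(4, sizeof in, in, mid);
  sketch_unshuffle(4, sizeof in, mid, out);
  return memcmp(in, out, sizeof in) == 0;
}
```

If the prototype in a header declared the source as const uint8_t* while a definition omitted const, a C compiler would report conflicting types for the function, which is presumably the kind of failure seen on the non-SSE2 path.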

blosc.h should bring in stdlib.h

blosc.h uses size_t types, but doesn't bring in stdlib.h

commit cb96d252318eba90a292ddc01d49227f4749b2fd
Author: Rob Latham <[email protected]>
Date:   Wed Jul 23 16:12:49 2014 -0500

    blosc.h uses size_t, defined in stdlib.h

diff --git a/blosc/blosc.h b/blosc/blosc.h
index b9b11d87..0e306cca 100644
--- a/blosc/blosc.h
+++ b/blosc/blosc.h
@@ -7,6 +7,7 @@
 **********************************************************************/

 #include <limits.h>
+#include <stdlib.h>
 #ifdef __cplusplus
 extern "C" {
 #endif

Linking to C++

I spent a while trying to link blosc from a C++ program, and after struggling with many undefined references to blosc_compress and blosc_init, I realized that there was no extern "C" block in the blosc.h header. If you look at, say, zlib.h, there is a:

#ifdef __cplusplus
extern "C" {
#endif

at the beginning of the header and a

#ifdef __cplusplus
}
#endif

at the end of the header.

It might be nice to add this to the official release to save some other people down the line some debugging time.

I have attached the git diff for blosc.h.

diff --git a/blosc/blosc.h b/blosc/blosc.h
index 1e39f1c..226b839 100644
--- a/blosc/blosc.h
+++ b/blosc/blosc.h
@@ -11,6 +11,10 @@
 #ifndef BLOSC_H
 #define BLOSC_H

+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* Version numbers */
 #define BLOSC_VERSION_MAJOR 1 /* for major interface/format changes */
 #define BLOSC_VERSION_MINOR 3 /* for minor interface/format changes */
@@ -324,5 +328,9 @@ char *blosc_cbuffer_complib(const void *cbuffer);
  */
 void blosc_set_blocksize(size_t blocksize);

+#ifdef __cplusplus
+}
+#endif
+

 #endif

_POSIX_BARRIERS problem

The following fix to blosc/blosc.c was needed to fix compilation on the ranger supercomputer at TACC. I needed to add the last check to PREVENT the expression from being true. A simple macro to let the user decide what to do would work for me.

#if defined(_POSIX_BARRIERS) && ( (_POSIX_BARRIERS - 20012L) >= 0 && _POSIX_BARRIERS != 200112L)
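
A user-controlled switch of the kind requested could be sketched roughly as follows. EXAMPLE_DISABLE_BARRIERS, EXAMPLE_USE_BARRIERS and example_uses_barriers are hypothetical names, not macros or functions that c-blosc defines, and the version test here is a plain "at least 200112L" check rather than the exact expression quoted above.

```c
#include <unistd.h>   /* on POSIX systems this is where _POSIX_BARRIERS comes from */

/* Hypothetical opt-out: building with -DEXAMPLE_DISABLE_BARRIERS forces the
   fallback path even if the platform advertises barrier support. */
#if !defined(EXAMPLE_DISABLE_BARRIERS) && \
    defined(_POSIX_BARRIERS) && ((_POSIX_BARRIERS - 0) >= 200112L)
#define EXAMPLE_USE_BARRIERS 1
#else
#define EXAMPLE_USE_BARRIERS 0
#endif

/* Returns 1 when the barrier-based code path was selected at compile time. */
int example_uses_barriers(void) {
  return EXAMPLE_USE_BARRIERS;
}
```

With this layout, a platform whose headers advertise barriers that do not actually work could be handled by passing the opt-out define on the compiler command line instead of patching the source.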

Blosclz performance tuning on X86

Hi, I'm very interested in the blosclz compression/decompression algorithm and want to make it run faster on x86 processors (by taking advantage of x86 features). Any suggestions on this? I've tried using the AVX2 instruction set instead of SSE in the shuffle and unshuffle functions, and there is about a 40% performance improvement compared to SSE. Thanks!

Use emulation to support AVX2 tests on CI builds

Neither the Travis CI nor the AppVeyor builds currently run on hardware with AVX2 support, meaning any changes made to the AVX2 shuffle need to be tested manually.

It should be possible to run the tests under an emulator like Intel SDE or QEMU so they can run automatically as part of the CI builds even when the underlying hardware doesn't support certain CPU features. QEMU even goes one step further and can emulate other CPU architectures too, e.g., to allow running tests for an ARM NEON shuffle on the x86 hardware used by the CI platforms.

Windows binaries

I have compiled version 1.7 with CMake and VS2013/15 and noticed that, in the package, the bin folder contains msvcrXX and msvcpXX. Does Blosc depend on those Microsoft libraries when compiled with VS, or is this a CMake/VS artifact? (I do not have access to a Windows machine without Visual Studio, so I cannot quickly test whether the runtime libraries are needed, because they are installed system-wide.)

Additionally, it would be very kind of you to release MinGW x86/x64 binaries, especially in light of the latest blog post: http://www.blosc.org/blog/hairy-msvc-situation.rst.html. On my current work PC I do not have a MinGW toolchain, and the prospect of setting it up from scratch again scares me a lot. One of the cool features of Blosc is run-time detection of SSE/AVX, so there is no need to recompile binaries for each hardware configuration.

build failure: multiple definition of `pthread_create'

c-blosc fails to build on Windows with gcc with the message multiple definition of 'pthread_create'.

The issue seems to be with the file c-blosc/blosc/blosc.c and the following definition block:

#if defined(_WIN32)
  #include "win32/pthread.h"
  #include "win32/pthread.c"
#else
  #include <pthread.h>
#endif

It seems the win32 gcc already provides the functions defined in win32/pthread.c, leading to conflicting definitions.

On my machine testing for __GNUC__ solves this issue:

#if defined(_WIN32) && !defined(__GNUC__)
  #include "win32/pthread.h"
  #include "win32/pthread.c"
#else
  #include <pthread.h>
#endif

I am not sure however, if this breaks things in other win32 configurations.

C:\bcolz>gcc --version
gcc (tdm-1) 4.9.2

Build log:

[bquerymp] C:\bcolz>python setup.py build_ext --inplace
* Found Cython 0.22 package installed.
* Found numpy 1.9.1 package installed.
* Found numexpr 2.3.1 package installed.
running build_ext
cythoning bcolz/carray_ext.pyx to bcolz\carray_ext.c
building 'bcolz.carray_ext' extension
C compiler: gcc -O2 -Wall -Wstrict-prototypes

compile options: '-DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D__MSVCRT_VERSION_
_=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\numpy\core\includ
e -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc/internal-compli
bs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anaconda\envs\bquery
mp\include -IC:\Anaconda\envs\bquerymp\PC -c'
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/blosc\shuff
le.c -o build\temp.win32-2.7\Release\c-blosc\blosc\shuffle.o
Found executable C:\Program Files\TDM-GCC-32\bin\gcc.exe
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\deflate.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\deflate.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\gzread.c -o build\temp.win32-2.7\Release\c-blosc\internal-comp
libs\zlib-1.2.8\gzread.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\lz4-r119\lz4hc.c -o build\temp.win32-2.7\Release\c-blosc\internal-complib
s\lz4-r119\lz4hc.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\zutil.c -o build\temp.win32-2.7\Release\c-blosc\internal-compl
ibs\zlib-1.2.8\zutil.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/blosc\blosc
lz.c -o build\temp.win32-2.7\Release\c-blosc\blosc\blosclz.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\compress.c -o build\temp.win32-2.7\Release\c-blosc\internal-co
mplibs\zlib-1.2.8\compress.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c bcolz\carray_ext.c
-o build\temp.win32-2.7\Release\bcolz\carray_ext.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\lz4-r119\lz4.c -o build\temp.win32-2.7\Release\c-blosc\internal-complibs\
lz4-r119\lz4.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\inflate.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\inflate.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\trees.c -o build\temp.win32-2.7\Release\c-blosc\internal-compl
ibs\zlib-1.2.8\trees.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\inffast.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\inffast.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\snappy-1.1.1\snappy-c.cc -o build\temp.win32-2.7\Release\c-blosc\internal
-complibs\snappy-1.1.1\snappy-c.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\uncompr.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\uncompr.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\gzwrite.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\gzwrite.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\snappy-1.1.1\snappy.cc -o build\temp.win32-2.7\Release\c-blosc\internal-c
omplibs\snappy-1.1.1\snappy.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\infback.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\infback.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\inftrees.c -o build\temp.win32-2.7\Release\c-blosc\internal-co
mplibs\zlib-1.2.8\inftrees.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\gzclose.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\gzclose.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\gzlib.c -o build\temp.win32-2.7\Release\c-blosc\internal-compl
ibs\zlib-1.2.8\gzlib.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\crc32.c -o build\temp.win32-2.7\Release\c-blosc\internal-compl
ibs\zlib-1.2.8\crc32.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\snappy-1.1.1\snappy-sinksource.cc -o build\temp.win32-2.7\Release\c-blosc
\internal-complibs\snappy-1.1.1\snappy-sinksource.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\snappy-1.1.1\snappy-stubs-internal.cc -o build\temp.win32-2.7\Release\c-b
losc\internal-complibs\snappy-1.1.1\snappy-stubs-internal.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/blosc\blosc
.c -o build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o
gcc -O2 -Wall -Wstrict-prototypes -DHAVE_LZ4=1 -DHAVE_SNAPPY=1 -DHAVE_ZLIB=1 -D_
_MSVCRT_VERSION__=0x0900 -Ibcolz -IC:\Anaconda\envs\bquerymp\lib\site-packages\n
umpy\core\include -Ic-blosc\blosc -Ic-blosc/internal-complibs\lz4-r119 -Ic-blosc
/internal-complibs\snappy-1.1.1 -Ic-blosc/internal-complibs\zlib-1.2.8 -IC:\Anac
onda\envs\bquerymp\include -IC:\Anaconda\envs\bquerymp\PC -c c-blosc/internal-co
mplibs\zlib-1.2.8\adler32.c -o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\adler32.o
g++ -shared build\temp.win32-2.7\Release\bcolz\carray_ext.o build\temp.win32-2.7
\Release\c-blosc\blosc\blosc.o build\temp.win32-2.7\Release\c-blosc\blosc\bloscl
z.o build\temp.win32-2.7\Release\c-blosc\blosc\shuffle.o build\temp.win32-2.7\Re
lease\c-blosc\internal-complibs\lz4-r119\lz4.o build\temp.win32-2.7\Release\c-bl
osc\internal-complibs\lz4-r119\lz4hc.o build\temp.win32-2.7\Release\c-blosc\inte
rnal-complibs\snappy-1.1.1\snappy-c.o build\temp.win32-2.7\Release\c-blosc\inter
nal-complibs\snappy-1.1.1\snappy-sinksource.o build\temp.win32-2.7\Release\c-blo
sc\internal-complibs\snappy-1.1.1\snappy-stubs-internal.o build\temp.win32-2.7\R
elease\c-blosc\internal-complibs\snappy-1.1.1\snappy.o build\temp.win32-2.7\Rele
ase\c-blosc\internal-complibs\zlib-1.2.8\adler32.o build\temp.win32-2.7\Release\
c-blosc\internal-complibs\zlib-1.2.8\compress.o build\temp.win32-2.7\Release\c-b
losc\internal-complibs\zlib-1.2.8\crc32.o build\temp.win32-2.7\Release\c-blosc\i
nternal-complibs\zlib-1.2.8\deflate.o build\temp.win32-2.7\Release\c-blosc\inter
nal-complibs\zlib-1.2.8\gzclose.o build\temp.win32-2.7\Release\c-blosc\internal-
complibs\zlib-1.2.8\gzlib.o build\temp.win32-2.7\Release\c-blosc\internal-compli
bs\zlib-1.2.8\gzread.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zl
ib-1.2.8\gzwrite.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1
.2.8\infback.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8
\inffast.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\inf
late.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\inftree
s.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\trees.o bu
ild\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\uncompr.o build\
temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\zutil.o -LC:\Anacond
a\envs\bquerymp\libs -LC:\Anaconda\envs\bquerymp\PCbuild -lpython27 -lmsvcr90 -o
 C:\bcolz\bcolz\carray_ext.pyd
Found executable C:\Program Files\TDM-GCC-32\bin\g++.exe
C:/Program Files/TDM-GCC-32/bin/../lib/gcc/mingw32/4.9.2/../../../libpthread.a(l
ibwinpthread_la-thread.o):thread.c:(.text+0x3710): multiple definition of `pthre
ad_create'
build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o:blosc.c:(.text+0x180): first
defined here
C:/Program Files/TDM-GCC-32/bin/../lib/gcc/mingw32/4.9.2/../../../libpthread.a(l
ibwinpthread_la-cond.o):cond.c:(.text+0x3d0): multiple definition of `pthread_co
nd_init'
build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o:blosc.c:(.text+0x260): first
defined here
C:/Program Files/TDM-GCC-32/bin/../lib/gcc/mingw32/4.9.2/../../../libpthread.a(l
ibwinpthread_la-cond.o):cond.c:(.text+0x980): multiple definition of `pthread_co
nd_destroy'
build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o:blosc.c:(.text+0x300): first
defined here
C:/Program Files/TDM-GCC-32/bin/../lib/gcc/mingw32/4.9.2/../../../libpthread.a(l
ibwinpthread_la-cond.o):cond.c:(.text+0xb80): multiple definition of `pthread_co
nd_signal'
build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o:blosc.c:(.text+0x3e0): first
defined here
C:/Program Files/TDM-GCC-32/bin/../lib/gcc/mingw32/4.9.2/../../../libpthread.a(l
ibwinpthread_la-cond.o):cond.c:(.text+0xca0): multiple definition of `pthread_co
nd_broadcast'
build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o:blosc.c:(.text+0x440): first
defined here
C:/Program Files/TDM-GCC-32/bin/../lib/gcc/mingw32/4.9.2/../../../libpthread.a(l
ibwinpthread_la-cond.o):cond.c:(.text+0xdb0): multiple definition of `pthread_co
nd_wait'
build\temp.win32-2.7\Release\c-blosc\blosc\blosc.o:blosc.c:(.text+0x340): first
defined here
collect2.exe: error: ld returned 1 exit status
error: Command "g++ -shared build\temp.win32-2.7\Release\bcolz\carray_ext.o buil
d\temp.win32-2.7\Release\c-blosc\blosc\blosc.o build\temp.win32-2.7\Release\c-bl
osc\blosc\blosclz.o build\temp.win32-2.7\Release\c-blosc\blosc\shuffle.o build\t
emp.win32-2.7\Release\c-blosc\internal-complibs\lz4-r119\lz4.o build\temp.win32-
2.7\Release\c-blosc\internal-complibs\lz4-r119\lz4hc.o build\temp.win32-2.7\Rele
ase\c-blosc\internal-complibs\snappy-1.1.1\snappy-c.o build\temp.win32-2.7\Relea
se\c-blosc\internal-complibs\snappy-1.1.1\snappy-sinksource.o build\temp.win32-2
.7\Release\c-blosc\internal-complibs\snappy-1.1.1\snappy-stubs-internal.o build\
temp.win32-2.7\Release\c-blosc\internal-complibs\snappy-1.1.1\snappy.o build\tem
p.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\adler32.o build\temp.wi
n32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\compress.o build\temp.win32
-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\crc32.o build\temp.win32-2.7\R
elease\c-blosc\internal-complibs\zlib-1.2.8\deflate.o build\temp.win32-2.7\Relea
se\c-blosc\internal-complibs\zlib-1.2.8\gzclose.o build\temp.win32-2.7\Release\c
-blosc\internal-complibs\zlib-1.2.8\gzlib.o build\temp.win32-2.7\Release\c-blosc
\internal-complibs\zlib-1.2.8\gzread.o build\temp.win32-2.7\Release\c-blosc\inte
rnal-complibs\zlib-1.2.8\gzwrite.o build\temp.win32-2.7\Release\c-blosc\internal
-complibs\zlib-1.2.8\infback.o build\temp.win32-2.7\Release\c-blosc\internal-com
plibs\zlib-1.2.8\inffast.o build\temp.win32-2.7\Release\c-blosc\internal-complib
s\zlib-1.2.8\inflate.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zl
ib-1.2.8\inftrees.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-
1.2.8\trees.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\
uncompr.o build\temp.win32-2.7\Release\c-blosc\internal-complibs\zlib-1.2.8\zuti
l.o -LC:\Anaconda\envs\bquerymp\libs -LC:\Anaconda\envs\bquerymp\PCbuild -lpytho
n27 -lmsvcr90 -o C:\bcolz\bcolz\carray_ext.pyd" failed with exit status 1

Upgrade lz4 to r118

Due to potential security concerns the shipped version of lz4 should be upgraded.

Add `compcode` to blosc_cbuffer_versions

It would be interesting to add a way to know which compressor was used to compress a buffer. The clear candidate for this is blosc_cbuffer_versions whose current signature is:

void blosc_cbuffer_versions(const void *cbuffer, int *version, int *versionlz);

and with the suggested change, the signature would become:

void blosc_cbuffer_versions(const void *cbuffer, int *version, int *compcode, int *versionlz);

which introduces an API change. However, as blosc_cbuffer_versions() should not be widely used, I think this is a reasonable change (although it may require a jump in the minor version).

Opinions?

BLOSC_MAX_BUFFERSIZE should not just be INT_MAX

Blosc does a malloc reservation of input_buffer_size + BLOSC_MAX_OVERHEAD for the output, but the internal pointers in Blosc are always signed 32-bit integers, so that sum must not exceed INT_MAX. The definition should therefore be:

#define BLOSC_MAX_BUFFERSIZE (INT_MAX - BLOSC_MAX_OVERHEAD)

This flaw has been discovered by the extreme test in bloscpack:

nosetests test_bloscpack.py:pack_unpack_extreme
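
The arithmetic behind this can be sketched as below. EXAMPLE_MAX_OVERHEAD, EXAMPLE_MAX_BUFFERSIZE and both functions are hypothetical names used only for illustration, and the overhead value of 16 is an assumption; the real constant is whatever blosc.h defines as BLOSC_MAX_OVERHEAD.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed overhead value, for illustration only. */
#define EXAMPLE_MAX_OVERHEAD 16
/* Largest input whose output reservation still fits in a signed int. */
#define EXAMPLE_MAX_BUFFERSIZE (INT_MAX - EXAMPLE_MAX_OVERHEAD)

/* Returns 1 if nbytes + overhead does not exceed INT_MAX, 0 otherwise. */
int example_dest_size_fits(size_t nbytes) {
  return nbytes <= (size_t)EXAMPLE_MAX_BUFFERSIZE;
}

/* Size to reserve for the output buffer, or 0 if the input is too large. */
size_t example_dest_size(size_t nbytes) {
  if (!example_dest_size_fits(nbytes)) {
    return 0;
  }
  return nbytes + EXAMPLE_MAX_OVERHEAD;
}
```

With a limit of plain INT_MAX, an input of exactly INT_MAX bytes would be accepted even though the reserved size, INT_MAX + overhead, can no longer be represented in a signed 32-bit value.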

Add optional support for zlib-ng

zlib-ng is a fork of zlib which is working to "modernize" the zlib codebase. It includes some patches from Intel and Cloudflare which use the pclmulqdq (carry-less multiplication) instruction to optimize the CRC32 calculations for compressed blocks, leading to a significant speedup for both compression and decompression.

It seems like a promising project but it'd be important to perform in-depth testing before trusting any important data to it. Perhaps a compromise would be to import zlib-ng to this repository (or add it as a submodule so it's easier to stay up-to-date), then implement a CMake option to use it instead of the standard zlib when compiling blosc to allow for experimentation and verification.

The HDF5 filter fails tests on Windows when c-blosc and HDF5 link to different CRTs

When you build c-blosc on Windows, the HDF5 filter test will fail if c-blosc and HDF5 link to different versions of the C run-time library (CRT). The reason for this is that, unlike Unix-based systems, Windows does not make the C library (C run-time) a part of the operating system. Instead, the C library is implemented in separate shared libraries (DLLs, e.g. msvcr110.dll). Each version of Visual Studio uses a different set of DLLs, and they are further split into debug and release versions. None of these DLLs share CRT state (heap management structures, etc.). Problems thus arise when memory is allocated in one DLL and freed in another.

This problem is inherent to how the H5Z functions in HDF5 handle memory and is not a c-blosc issue. The HDF Group will have to work out a solution for this, which will possibly involve exposing H5Z re/c/m/alloc and free functions and ensuring that buffers are correctly handled in the H5Z code.

In the meantime, users should be aware that the binaries released by THG are release only, so running the HDF5 filter tests when c-blosc is built in debug mode will always fail when linking against those. They do pass when both c-blosc and HDF5 are built with the same configuration (release vs. debug).
