
lz4-java's Introduction

LZ4 Java

LZ4 compression for Java, based on Yann Collet's work available at http://code.google.com/p/lz4/.

This library provides access to two compression methods that both generate a valid LZ4 stream:

  • fast scan (LZ4):
    • low memory footprint (~ 16 KB),
    • very fast (fast scan with skipping heuristics in case the input looks incompressible),
    • reasonable compression ratio (depending on the redundancy of the input).
  • high compression (LZ4 HC):
    • medium memory footprint (~ 256 KB),
    • rather slow (~ 10 times slower than LZ4),
    • good compression ratio (depending on the size and the redundancy of the input).

The streams produced by these two compression methods use the same compression format, are very fast to decompress, and can be decompressed by the same decompressor instance.

Implementations

For LZ4 compressors, LZ4 HC compressors and decompressors, 3 implementations are available:

  • JNI bindings to the original C implementation by Yann Collet,
  • a pure Java port of the compression and decompression algorithms,
  • a Java port that uses the sun.misc.Unsafe API in order to achieve compression and decompression speeds close to the C implementation.

Have a look at LZ4Factory for more information.
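As a sketch of how those implementations are selected (assuming the lz4-java jar is on the classpath), the factory methods look like this; fastestInstance() falls back from JNI to the unsafe and then the safe Java port:

```java
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

public class FactoryChoice {
  public static void main(String[] args) {
    // Let the library pick the fastest implementation that loads at runtime.
    LZ4Factory fastest = LZ4Factory.fastestInstance();

    // Or request a specific implementation:
    LZ4Factory safe = LZ4Factory.safeInstance();     // pure Java, no sun.misc.Unsafe
    LZ4Factory unsafe = LZ4Factory.unsafeInstance(); // Java port built on sun.misc.Unsafe
    // LZ4Factory.nativeInstance() returns the JNI bindings and fails if the
    // native library cannot be loaded.

    LZ4Compressor fast = fastest.fastCompressor(); // fast scan (LZ4)
    LZ4Compressor hc = fastest.highCompressor();   // high compression (LZ4 HC)
    System.out.println(fastest);
  }
}
```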

Compatibility notes

  • Compressors and decompressors are interchangeable: it is perfectly correct to compress with the JNI bindings and to decompress with a Java port, or the other way around.

  • Compressors might not generate the same compressed streams on all platforms, especially if CPU endianness differs, but the compressed streams can be safely decompressed by any decompressor implementation on any platform.

Examples

LZ4Factory factory = LZ4Factory.fastestInstance();

byte[] data = "12345345234572".getBytes("UTF-8");
final int decompressedLength = data.length;

// compress data
LZ4Compressor compressor = factory.fastCompressor();
int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
byte[] compressed = new byte[maxCompressedLength];
int compressedLength = compressor.compress(data, 0, decompressedLength, compressed, 0, maxCompressedLength);

// decompress data
// - method 1: when the decompressed length is known
LZ4FastDecompressor decompressor = factory.fastDecompressor();
byte[] restored = new byte[decompressedLength];
int compressedLength2 = decompressor.decompress(compressed, 0, restored, 0, decompressedLength);
// compressedLength == compressedLength2

// - method 2: when the compressed length is known (a little slower)
// the destination buffer needs to be over-sized
LZ4SafeDecompressor decompressor2 = factory.safeDecompressor();
int decompressedLength2 = decompressor2.decompress(compressed, 0, compressedLength, restored, 0);
// decompressedLength == decompressedLength2
The following example compresses and decompresses data with the LZ4 frame format streams:

byte[] data = "12345345234572".getBytes("UTF-8");
final int decompressedLength = data.length;

LZ4FrameOutputStream outStream = new LZ4FrameOutputStream(new FileOutputStream(new File("test.lz4")));
outStream.write(data);
outStream.close();

byte[] restored = new byte[decompressedLength];
LZ4FrameInputStream inStream = new LZ4FrameInputStream(new FileInputStream(new File("test.lz4")));
// note: read() may return fewer bytes than requested; in real code, loop
// until the buffer is full
inStream.read(restored);
inStream.close();

xxhash Java

xxhash hashing for Java, based on Yann Collet's work available at https://github.com/Cyan4973/xxHash (old version http://code.google.com/p/xxhash/). xxhash is a non-cryptographic, extremely fast and high-quality (SMHasher score of 10) hash function.

Implementations

Similarly to LZ4, 3 implementations are available: JNI bindings, a pure Java port, and a Java port that uses sun.misc.Unsafe.

Have a look at XXHashFactory for more information.

Compatibility notes

  • All implementations return the same hash for the same input bytes:
    • on any JVM,
    • on any platform (even if the endianness or integer size differs).

Example

XXHashFactory factory = XXHashFactory.fastestInstance();

byte[] data = "12345345234572".getBytes("UTF-8");
ByteArrayInputStream in = new ByteArrayInputStream(data);

int seed = 0x9747b28c; // used to initialize the hash value, use whatever
                       // value you want, but always the same
StreamingXXHash32 hash32 = factory.newStreamingHash32(seed);
byte[] buf = new byte[8]; // for real-world usage, use a larger buffer, like 8192 bytes
for (;;) {
  int read = in.read(buf);
  if (read == -1) {
    break;
  }
  hash32.update(buf, 0, read);
}
int hash = hash32.getValue();
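When the whole input is already in memory, the streaming API isn't needed; a one-shot sketch (same assumption as above, lz4-java on the classpath):

```java
import java.io.UnsupportedEncodingException;
import net.jpountz.xxhash.XXHash32;
import net.jpountz.xxhash.XXHashFactory;

public class OneShotHash {
  public static void main(String[] args) throws UnsupportedEncodingException {
    XXHashFactory factory = XXHashFactory.fastestInstance();
    byte[] data = "12345345234572".getBytes("UTF-8");

    int seed = 0x9747b28c; // must match the seed used on the other side
    XXHash32 hasher = factory.hash32();
    int hash = hasher.hash(data, 0, data.length, seed);
    System.out.println(hash);
  }
}
```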

Download

You can download released artifacts from Maven Central.

You can download pure-Java lz4-java from Maven Central. These artifacts include the Safe and Unsafe Java versions but not JNI bindings. (Experimental)

Documentation

Performance

Both lz4 and xxhash focus on speed. Although compression, decompression and hashing performance can depend a lot on the input (there are lies, damn lies and benchmarks), here are some benchmarks that try to give a sense of the speed at which they compress/decompress/hash bytes.

Build

Requirements

  • JDK version 7 or newer,
  • ant version 1.10.2 or newer,
  • ivy.

If ivy is not installed yet, ant can take care of it for you: just run ant ivy-bootstrap. The library will be installed under ${user.home}/.ant/lib.

You might hit an error like the following when the ivy in ${user.home}/.ant/lib is old. You can delete it and then run ant ivy-bootstrap again to install the latest version.

[ivy:resolve] 		::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve] 		::          UNRESOLVED DEPENDENCIES         ::
[ivy:resolve] 		::::::::::::::::::::::::::::::::::::::::::::::

Instructions

For lz4-java 1.5.0 or newer, first run git submodule init and then git submodule update to initialize the lz4 submodule in src/lz4.

Then run ant. It will:

  • generate some Java source files in build/java from the templates that are located under src/build,
  • compile the lz4 and xxhash libraries and their JNI (Java Native Interface) bindings,
  • compile Java sources in src/java (normal sources), src/java-unsafe (sources that make use of sun.misc.Unsafe) and build/java (auto-generated sources) to build/classes, build/unsafe-classes and build/generated-classes,
  • generate a JAR file called lz4-${version}.jar under the dist directory.

The JAR file that is generated contains Java class files, the native library and the JNI bindings. If you add this JAR to your classpath, the native library will be copied to a temporary directory and dynamically linked to your Java application.

lz4-java's People

Contributors

adamretter, bastienf, blambov, clockfort, danielfree, drcrallen, guyuqi, jirutka, joshrosen, jpountz, linnaea, luben, lyrachord, magnet, maropu, odaira, rockyzhang-zz, ruippeixotog, sunxiaoguang, windie, yilongli


lz4-java's Issues

Remove the temporary library file on exit

net.jpountz.util.Native.java first copies the shared library file to a temporary file, and then tries to call System.load. There is a call to File.deleteOnExit to remove it, but does it work:

  • if some code calls System.exit?
  • under Windows, where you can't remove files which are still in use?

JNI Windows

Hi,

I am trying to use LZ4 natively on Windows 7 32/64-bit, but after compiling the project I get an UnsatisfiedLinkError.

The generated file is an *.so file under win32/x86/

Shouldn't this file be a *.dll file? Anyway, I have already read a post which seems to describe the same problem, but I did not see any solution. How can I get this working?

Thanks

enquiry of compressor/decompressor

  1. In a multithreaded environment, which approach is better for performance?

    a. share a factory between multiple threads and create a compressor/decompressor in each thread
    b. share a compressor/decompressor between multiple threads
    c. do not share anything; create everything ad hoc

  2. What is the behaviour of the safe/fast decompressor if the output size is larger than the specified decompressedLength/allocated buffer size?

    Is an exception thrown, or does memory corruption occur?
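Not an authoritative answer, but the library's Javadoc describes compressor and decompressor instances as thread-safe, so sharing both the factory and the instances is usually fine. If you still want one instance per thread, a ThreadLocal avoids repeated creation; below is a minimal sketch of that pattern, with a placeholder Object standing in for the compressor:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadDemo {
  static final AtomicInteger created = new AtomicInteger();

  // In real code the supplier would be something like factory::fastCompressor.
  static final ThreadLocal<Object> COMPRESSOR =
      ThreadLocal.withInitial(() -> {
        created.incrementAndGet();
        return new Object(); // placeholder for the per-thread compressor
      });

  public static void main(String[] args) throws InterruptedException {
    Runnable task = () -> {
      COMPRESSOR.get(); // first call creates the instance for this thread
      COMPRESSOR.get(); // subsequent calls reuse it
    };
    Thread t1 = new Thread(task);
    Thread t2 = new Thread(task);
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(created.get()); // 2: one instance per thread
  }
}
```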

arrayOffset() not taken into account by JNI ByteBuffer methods

Changing AbstractLZ4Test.Tester.BYTE_BUFFER to use slicing demonstrates the problem:

    public static final Tester<ByteBuffer> BYTE_BUFFER = new Tester<ByteBuffer>() {

      @Override
      public ByteBuffer allocate(int length) {
        ByteBuffer bb;
        int slice = randomInt(5);
        if (randomBoolean()) {
          bb = ByteBuffer.allocate(length + slice);
        } else {
          bb = ByteBuffer.allocateDirect(length + slice);
        }
        bb.position(slice);
        bb = bb.slice();
        if (randomBoolean()) {
          bb.order(ByteOrder.LITTLE_ENDIAN);
        } else {
          bb.order(ByteOrder.BIG_ENDIAN);
        }
        return bb;
      }

Optimised skip() method for LZ4BlockInputStream

If I only want to decompress some part of the file, let's say a section at the beginning and some other bits at the end, I can use skip(). But it looks to me like skipping through the file would currently still mean calling refill() on every block in between. Am I right? Is it possible to only read the headers in between, and then only the block needed, to be ready for the next call to read()? People might scan the file using an index, for example.

`StreamingXXHash32.getValue()` should return a long

Hello,

I'm using your implementation of xxHash to check the integrity of uploaded files in a web application. As you can imagine, client side I'm using the JS implementation found here.

Problems arise when the calculated hash is >= 2^31: while on the client side the hash is correctly sent as unsigned, on the server side the call to StreamingXXHash32.getValue() returns a negative int due to the overflow. This makes the two values not comparable (without some hack, at least).

Maybe it would be better to change the StreamingXXHash32.getValue() implementation so that it returns a long. That way no overflow would occur, and the original unsigned value would be returned.

Thanks for your work!
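Until such an API change, the signed int can be widened to the unsigned value the JS side produces by masking with 0xFFFFFFFFL; a small illustration:

```java
public class UnsignedHash {
  public static void main(String[] args) {
    int hash = 0x9747B28C;              // negative when read as a signed int
    long unsigned = hash & 0xFFFFFFFFL; // zero-extend to the unsigned value
    System.out.println(hash);           // -1756908916
    System.out.println(unsigned);       // 2538058380
  }
}
```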

fastestInstance succeeds when it shouldn't

The factory call succeeds in returning the JNI instance even when I don't have the JNI libs included.

I think you should call factory.fastCompressor().maxCompressedLength(100);

to make sure it's really working.

problems with LZ4Block{Input,Output}Stream

Hi Adrien,

I tried your new LZ4Block{Input,Output}Stream as published in:
http://search.maven.org/#artifactdetails|net.jpountz.lz4|lz4|1.1.0|jar

and unfortunately I had problems.

I have committed the passing and failing tests into:
https://github.com/mooreb/lz4-java-stream/tree/master/tests-2013-02-11/for-jpountz

The error I am experiencing is:

java.lang.AssertionError: Bytes differed! Seed value was 1360613556874 expected:<-123> but was:<26>
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
    at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:200)
    at net.jpountz.lz4.LZ4StreamTest.assertEqualContent(LZ4StreamTest.java:148)
    at net.jpountz.lz4.LZ4StreamTest.assertContentInSingleBlock(LZ4StreamTest.java:133)
    at net.jpountz.lz4.LZ4StreamTest.randomizedTest(LZ4StreamTest.java:95)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
java.lang.AssertionError: Exception was thrown.  Seed value was 1360613556874
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
    at net.jpountz.lz4.LZ4StreamTest.randomizedTest(LZ4StreamTest.java:106)
    at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
    at org.testng.TestRunner.privateRun(TestRunner.java:767)
    at org.testng.TestRunner.run(TestRunner.java:617)
    at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
    at org.testng.SuiteRunner.run(SuiteRunner.java:240)
    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1198)
    at org.testng.TestNG.runSuitesLocally(TestNG.java:1123)
    at org.testng.TestNG.run(TestNG.java:1031)
    at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
    at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
    at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
    at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

The difference between the passing and failing tests is below:

$ diff -Naur passing-test/LZ4StreamTest.java failing-test/LZ4StreamTest.java
--- passing-test/LZ4StreamTest.java 2013-02-11 12:42:55.000000000 -0800
+++ failing-test/LZ4StreamTest.java 2013-02-11 12:43:15.000000000 -0800
@@ -39,7 +39,7 @@
     private void compressContent() throws IOException {
         ByteArrayOutputStream compressedOutputStream = new ByteArrayOutputStream();
 
-        LZ4OutputStream os = new LZ4OutputStream(compressedOutputStream);
+        LZ4BlockOutputStream os = new LZ4BlockOutputStream(compressedOutputStream);
@@ -77,7 +77,7 @@
     @Test
     public void randomizedTest() throws IOException {
         try {
-            InputStream is = new LZ4InputStream(new ByteArrayInputStream(compressedOutput));
+            InputStream is = new LZ4BlockInputStream(new ByteArrayInputStream(compressedOutput));
             int currentContentPosition = 0;
 

Hoping this helps and finds you well,

b

Stream is corrupted exception on IBM JDK 1.6

Hi,

when trying to use the IBM JDK, I get an exception as follows:

Exception in thread "main" java.io.IOException: Stream is corrupted
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:207)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:116)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:129)

my initialization looks as follows:

    @Override
    public OutputStream getCompressorOutputStream(OutputStream out) throws IOException {

        // block size
        int blockSize = 1 << 16;

        // compressor
        LZ4Factory factory = LZ4Factory.safeInstance();
        LZ4Compressor compressor = factory.fastCompressor();

        // checksum
        int seed = 0x9747b28c;
        XXHashFactory checksumFactory = XXHashFactory.safeInstance();
        Checksum checksum = checksumFactory.newStreamingHash32(seed).asChecksum();

        return new LZ4BlockOutputStream(out, blockSize, compressor, checksum, false);
    }

    @Override
    public InputStream getCompressorInputStream(InputStream in) throws IOException {

        // decompressor
        LZ4Factory factory = LZ4Factory.safeInstance();
        LZ4FastDecompressor decompressor = factory.fastDecompressor();

        // checksum
        int seed = 0x9747b28c;
        XXHashFactory checksumFactory = XXHashFactory.safeInstance();
        Checksum checksum = checksumFactory.newStreamingHash32(seed).asChecksum();

        return new LZ4BlockInputStream(in, decompressor, checksum);
    }

unfortunately, sometimes it also works :-( so, really no idea what the problem is. However, it never happens using either OpenJDK or OracleJDK.

Thanks and best,
Lukas

Calling flush on a closed LZ4BlockOutputStream throws NPE

Found this when using Jackson's ObjectMapper because it auto-closes streams. I know the Java spec says that if a stream is 'closed' and another operation is called, it is perfectly fine to throw an exception; however, for ease of use, and since LZ4BlockOutputStream already supports calling close repeatedly, it was simple enough to patch flush.

JavaSafeCompressor missing in 1.2 and master branch

Lib builds well (except the unsafe-warnings), but the example doesn't run:

Exception in thread "main" java.lang.AssertionError: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4JavaSafeCompressor

Similar to issue #19.

I'm also interested in building LZ4 with the pure java implementation only.

Android version

lz4-1.2.0.jar is incompatible with Android, as it contains native libraries that will not run on the device. As a quick hack I've manually removed the following from the jar: linux/amd64/liblz4-java.so, linux/i386/liblz4-java.so and win32/amd64/liblz4-java.so, and it works, but it would be great to have an Android-specific version with separate ARM and x86 native libraries. Great lib, thanks for the hard work.

Recommended maxCompressedLength is often not adequate.

In running some tests, I've found that the suggestion made in the documentation to "make sure that maxDestLen >= maxCompressedLength(srcLen)" is often inadequate to avoid getting an LZ4Exception during compression if compressing small chunks of data.

Though you can presume that small bits of data won't be sent to the compressor since there would be a net loss, that assumption can easily fail in practice. To me, this seems to be unacceptable, especially since there is no fixed uncompressed array size at which the compressed array will actually be less than or equal. Depending on what data is being compressed, I've seen uncompressed byte arrays of only 14 compress down to 10 bytes, but I've also seen 28 byte arrays take 30 bytes once compressed.

To be honest, the entire implementation of designating a byte array of a fixed size prior to calling compression seems extremely non-Java and prone to memory waste. It would be trivial to have the compression implementation maintain a dynamic array internally and simply return the compressed byte array without the user messing with pre-allocation.

The other benefit of maintaining an internal dynamic array for compression is that the library would not require nearly as much memory to be allocated for the larger data to be compressed. When compression ratios range around 13%, allocating a static array for the full 100% is just stupid (imho).

Failing this change to the library, it would be nice to at least provide a helper method that returns a "good" maxDestLength when provided an uncompressed byte array.

For my own solution to calculate a maxDestLength that will (hopefully) work, I have done the following, but it is not a very satisfying solution:

maxDestLength = (data.length < 100) ? 100 : data.length;
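For reference, the worst-case block expansion that lz4 documents (the LZ4_compressBound formula) is srcLen + srcLen/255 + 16, which is what maxCompressedLength() is based on, so a destination buffer of that size should never overflow. A quick check against the sizes from the report above:

```java
public class CompressBound {
  // Worst-case compressed size of an LZ4 block (LZ4_compressBound formula).
  static int maxCompressedLength(int srcLen) {
    return srcLen + srcLen / 255 + 16;
  }

  public static void main(String[] args) {
    // The 14-byte and 28-byte inputs mentioned above:
    System.out.println(maxCompressedLength(14)); // 30: enough for the observed 10 bytes
    System.out.println(maxCompressedLength(28)); // 44: enough for the observed 30 bytes
  }
}
```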

solaris x64 Unsafe_GetInt crash

[Loaded net.jpountz.lz4.LZ4JavaUnsafeSafeDecompressor from file:/opt/cassandra/apache-cassandra-2.0.14/lib/lz4-1.2.0.jar]
[Loaded net.jpountz.lz4.LZ4Constants from file:/opt/cassandra/apache-cassandra-2.0.14/lib/lz4-1.2.0.jar]
[Loaded net.jpountz.util.UnsafeUtils from file:/opt/cassandra/apache-cassandra-2.0.14/lib/lz4-1.2.0.jar]

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e46a9f4, pid=29042, tid=2

JRE version: Java(TM) SE Runtime Environment (8.0_45-b14) (build 1.8.0_45-b14)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc6a9f4] Unsafe_GetInt+0x170

JNI / classloader issues

A JNI library can only be loaded once per classloader, so if any classloader which is not a child of the classloader which loaded the JNI library tries to use the JNI version, it will fail. This can be a problem with servlet containers.

snappy-java has a hack to inject bytecode into the root classloader but I'm not sure this is the way to go...

If not fixable, I think we should either document this behavior or forbid loading the JNI lib from a non-root classloader?

deploy to maven

It would be great if the code was deployed to a maven repo (obviously without jni support)

Source ByteBuffer gets tampered on Decompressing

Hi Team,
I am using LZ4 1.3.0 library to compress and decompress ByteBuffer. My sourcebuffer gets altered during decompression.
//Compression
int compressLen = compressor.compress(message, 0, decompressedLength, message, 0, maxCompressedLength);

//Decompression
int decompressed = deCompressor.decompress(msg, 0,compressedLen, bufferMsg, 0, bufferMsg.capacity());

Actually my source buffer contains many compressed messages separated by Identifiers and I am decompressing them one by one in loop.
First decompression works fine, but from second decompression, I start getting following error for all the subsequent messages "Error decoding offset 422 of input buffer".

However, if I use a temporary buffer as the source buffer to decompress, decompression works fine for all the compressed messages.
//Decompression using tempBuffer
System.arraycopy(msg.array(),msg.arrayOffset(), tempMsg.array(),tempMsg.arrayOffset(), compressedLen);
int decompressed = deCompressor.decompress(tempMsg, 0,compressedLen , bufferMsg, 0, bufferMsg.capacity());

It seems to me that the source buffer being decompressed gets somehow modified during decompression.

Can anybody please help: how do I achieve this decompression without using a temp buffer?

Crash with IBM Java 7

I ran into this problem when trying to use Apache Cassandra 2.0.7 with IBM Java 7. Cassandra crashes during start up and generates a core file. It is pretty easy to recreate with a Java program that calls on LZ4Factory.nativeInstance().

Stack trace

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "pool-1-thread-4" J9VMThread:0x0000000050B7C200, j9thread_t:0x00007F80B0E96660, java/lang/Thread:0x000000004E8D6410, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x17, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x4FE0, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO2            (native stack address range from:0x00007F80A5A98000, to:0x00007F80A5AD9000, size:0x41000)
3XMCPUTIME               CPU usage total: 0.484368829 secs
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=12930384 (0xC54D50)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
4XESTACKTRACE                at net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
4XESTACKTRACE                at net/jpountz/lz4/LZ4Factory.<init>(LZ4Factory.java:163)
4XESTACKTRACE                at net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
4XESTACKTRACE                at net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)

Assertion failure

1XECTHTYPE     Current thread history (J9VMThread:0x0000000001080500)
3XEHSTTYPE     13:57:09:734842157 GMT j9mm.107 - * ** ASSERTION FAILED ** at StandardAccessBarrier.cpp:322: ((false && (elems == getArrayObjectDataAddress((J9VMToken*)vmThread, arrayObject)))) 

Allow to recycle compression buffers

When compressing lots of small buffers, it may happen that the bottleneck is the allocation of the hash table. There should be an option in order to reuse these hash tables per thread.

XXHash performance on ARM

I did some basic performance tests (https://github.com/neophob/PixelController/tree/develop) comparing XXHash and Adler32 on some ARM systems (RPi and BBB). XXHash should be much faster than Adler32, according to your benchmarks (http://jpountz.github.io/lz4-java/1.2.0/xxhash-benchmark/).

I use v1.2.0. So there are 3 possibilities:
a) my benchmark sucks
b) my code sucks
c) the code is not really fast on ARM

My code can be found here: https://github.com/neophob/PixelController/blob/develop/pixelcontroller-core/src/main/java/com/neophob/sematrix/core/perf/PerfTests.java

NullPointerException when LZ4Factory is loaded by bootstrap class loader

At https://github.com/jpountz/lz4-java/blob/aef24cfbbf53d6ade60073b287edc610be743e43/src/java/net/jpountz/lz4/LZ4Factory.java#L149, LZ4Factory.class.getClassLoader() might return null if LZ4Factory is loaded by the bootstrap class loader. It results in a NPE.

LZ4Factory is loaded by the bootstrap class loader in my application because I am using lz4 in a Java agent and this Java agent is loaded by the bootstrap class loader.

LZ4BlockInputStream lz4-java-1.1.1 needs to read fully

java.io.IOException: Stream is corrupted
at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:159)
at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:123)
at org.apache.commons.io.IOUtils.read(IOUtils.java:2454)
at org.apache.commons.io.IOUtils.read(IOUtils.java:2476)

I think this line

in.read(compressedBuffer, 0, HEADER_LENGTH);

needs to read fully.
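A minimal sketch of the fix (a readFully-style loop over plain java.io, not the library's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// InputStream.read(byte[], int, int) may return fewer bytes than requested,
// so a header read must loop until the buffer is full.
public class ReadFully {
  static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
    while (len > 0) {
      int n = in.read(buf, off, len);
      if (n < 0) throw new EOFException("Stream ended inside header");
      off += n;
      len -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] src = {1, 2, 3, 4, 5, 6, 7, 8};
    byte[] header = new byte[8];
    readFully(new ByteArrayInputStream(src), header, 0, header.length);
    System.out.println(header[7]); // 8: the whole header was read
  }
}
```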

Typo

There's a typo in LZ4Factory:
unknwonSizeDecompressor
should obviously be
unknownSizeDecompressor

safe compressor missing

I can't rely on platform-specific code, so I'm trying to get my compressor with LZ4Factory.safeInstance().fastCompressor(). That's failing with

Exception in thread "AWT-EventQueue-0" java.lang.AssertionError:
java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4JavaSafeCompressor
at net.jpountz.lz4.LZ4Factory.instance(LZ4Factory.java:48)
at net.jpountz.lz4.LZ4Factory.safeInstance(LZ4Factory.java:85)

Can I build myself a jar that includes LZ4JavaSafeCompressor and/or a jar with no native code?

Compressed message doesn't start with magic number?

Hi,
I am trying to use this library, here is the simple class I am using to wrap it.

https://github.com/openworm/org.geppetto.core/blob/181813152b1747428a2c1aa3126dc3eb40638bb8/src/main/java/org/geppetto/core/common/LZ4Compress.java

When I try to compress the following string:

{"type":"read_url_parameters","data":"{}"}

the result of the compress call is the following:

[-16, 27, 123, 34, 116, 121, 112, 101, 34, 58, 34, 114, 101, 97, 100, 95, 117, 114, 108, 95, 112, 97, 114, 97, 109, 101, 116, 101, 114, 115, 34, 44, 34, 100, 97, 116, 97, 34, 58, 34, 123, 125, 34, 125]

Shouldn't this start with the magic number 0x184D2204?
I am trying to decode this with this library and it complains with:

Invalid magic number: 227B1BF0 @0

Am I doing something wrong? Thanks!
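For what it's worth, LZ4Compressor produces a raw LZ4 block with no header at all; the 0x184D2204 magic number belongs to the LZ4 frame format (what LZ4FrameOutputStream and frame-format JS decoders use). Under that assumption, the leading -16 (0xF0) is just a block literal token, not a corrupted header; a small illustration of the byte layout:

```java
public class FrameMagic {
  public static void main(String[] args) {
    int magic = 0x184D2204; // LZ4 frame magic number, stored little-endian
    byte[] le = {
        (byte) magic,
        (byte) (magic >>> 8),
        (byte) (magic >>> 16),
        (byte) (magic >>> 24)
    };
    // A framed stream begins with these four bytes:
    System.out.printf("%02X %02X %02X %02X%n", le[0], le[1], le[2], le[3]); // 04 22 4D 18

    // -16 is 0xF0: an LZ4 block token whose high nibble (15) starts a long
    // literal run -- not a frame header.
    System.out.println(0xF0 >>> 4); // 15
  }
}
```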

Misspelled API

In LZ4Factory, the method is named "unknwonSizeDecompressor". Looks like this is leftover from when the class name itself was also misspelled.

Large byte array compression on ARM Linux

My environment:
PandaBoard-ES, Ubuntu server, OpenJDK 1.7 - ZeroVM

Issue:
I've a large byte array containing 614400 elements (640 x 480 x 2)
I want to compress it, but the program freezes. When I try to kill the task, I get:

OpenJDK Zero VM warning: Exception java.lang.NullPointerException occurred dispatching signal SIGINT to handler- the VM may need to be forcibly terminated

After a while, the program automatically terminates with the error below:


A fatal error has been detected by the Java Runtime Environment:

Internal Error (os_linux_zero.cpp:285), pid=923, tid=2379707504
fatal error: caught unhandled signal 11

JRE version: 7.0_25-b30
Java VM: OpenJDK Zero VM (22.0-b10 mixed mode linux-arm )
Derivative: IcedTea 2.3.10
Distribution: Ubuntu 12.04 LTS, package 7u25-2.3.10-1ubuntu0.12.04.2
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

An error report file with more information is saved as:
/home/db/Desktop/thesis_java/hs_err_pid923.log
Segmentation fault


Here is my code block:

ShortBuffer sb = depthMD.getData().createShortBuffer();
ByteBuffer bb = ByteBuffer.allocate(sb.capacity()*2);
bb.asShortBuffer().put(sb);
byte[] data = new byte[bb.capacity()];
bb.get(data);
// Verified that I've a clean byte array data here.

LZ4Factory factory = LZ4Factory.fastestInstance();
int decompressedLength = data.length;
LZ4Compressor compressor = factory.fastCompressor();
int maxCompressedLength =
compressor.maxCompressedLength(decompressedLength);
byte[] compressed = new byte[maxCompressedLength];
//Next line gives that error
int compressedLength = compressor.compress(data, 0, decompressedLength, compressed, 0, maxCompressedLength);
System.out.println(data.length);

Next release for XXH64

When do you plan the next release, so that people can use XXH64 from a release version? Is the master branch XXH64 code production ready?

JNI Problem

Hi
The JNI header file "jni.h" exists in my JAVA.FRAMEWORK path, but the build outputs:
...
compile-java:
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/classes
[javac] Compiling 24 source files to /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/classes

generate-header:
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers

compile-jni:
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/objects
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni/${platform}/x86_64
[cpptasks:cc] 5 total files to be compiled.
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:2:17: error: jni.h: No such file or directory
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:16: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:24: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:32: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:40: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:48: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:56: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:21: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘OutOfMemoryError’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:28: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:33: error: expected ‘)’ before ‘’ token
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:42: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:70: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:98: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:126: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:154: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:2:17: error: jni.h: No such file or directory
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:18: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:26: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:34: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:21: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘OutOfMemoryError’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:28: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:33: error: expected ‘)’ before ‘’ token
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:42: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:59: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’

BUILD FAILED
/Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build.xml:62: gcc failed with return code 1

Total time: 8 seconds

Idea to improve performance by eliminating one copying step

I'm going over your code while considering using it for a Hazelcast enterprise feature. As part of the same effort I have done some I/O coding similar to the write(byte[], int, int) method of LZ4BlockOutputStream. The way I wrote it is not too complex, but it avoids repeatedly copying into the internal buffer whenever the supplied buffer is larger than it. Basically:

void write(byte[] b, int off, int len) {
    // While a full buffer's worth remains, bypass the internal buffer
    // and compress directly from the caller's array.
    while (len >= BUFFER_SIZE) {
        flushLocalBuffer();
        compress(b, off, BUFFER_SIZE);
        off += BUFFER_SIZE;
        len -= BUFFER_SIZE;
    }
    // Copy the remainder (less than BUFFER_SIZE) into the internal buffer,
    // flushing whenever it fills up.
    while (len > 0) {
        final int transferredCount = Math.min(BUFFER_SIZE - position, len);
        System.arraycopy(b, off, buf, position, transferredCount);
        off += transferredCount;
        len -= transferredCount;
        position += transferredCount;
        ensureBufHasRoom();
    }
}

Since performance is a high concern in this project, perhaps this detail is worth bringing up.

LZ4BlockInputStream cannot read two consecutive write-close operations from two different LZ4BlockOutputStreams

How to reproduce:
Run the test case testWriteCloseWriteCloseRead(). In pseudo code:

  1. Write some data to a file with an LZ4BlockOutputStream and close the stream.
  2. Write some more data to the same file with a new LZ4BlockOutputStream and close the stream.
  3. Read all of the data with one single instance of LZ4BlockInputStream.
  /**
   * Write and close two stream instances to the same file. Read the entire data with one
   * LZ4BlockInputStream.
   */
  @Test
  public void testWriteCloseWriteCloseRead() throws IOException {
    final byte[] testBytes = "Testing!".getBytes(Charset.forName("UTF-8"));

    //Write the first time
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    LZ4BlockOutputStream out = new LZ4BlockOutputStream(bytes);
    out.write(testBytes);
    out.close();

    //Write the second time
    out = new LZ4BlockOutputStream(bytes);
    out.write(testBytes);
    out.close();

    ByteArrayInputStream in = new ByteArrayInputStream(bytes.toByteArray());
    LZ4BlockInputStream lz4In = new LZ4BlockInputStream(in);
    DataInputStream dataIn = new DataInputStream(lz4In);

    byte[] buffer = new byte[testBytes.length];
    dataIn.readFully(buffer);
    assertArrayEquals(testBytes, buffer);

//    in.skip(LZ4BlockOutputStream.HEADER_LENGTH); //This test case only passes if 21 bytes (the footer) are skipped

    buffer = new byte[testBytes.length];
    dataIn.readFully(buffer);
    assertArrayEquals(testBytes, buffer);
  }

Actual:
A java.io.EOFException is thrown.

Expected:
All of the data should be read and returned.

Analysis:
The LZ4BlockOutputStream will write a header, data and a footer. The footer is very similar to the header. Two LZ4BlockOutputStreams will create this:
Header | Compressed Data | Footer | Header | Compressed Data | Footer
One instance of LZ4BlockInputStream will read the header and the compressed data. If the user tries to read more data, it will try to read a header again. But since it has not skipped the previous footer, it will read the footer instead. The footer, although similar to the header, contains a length of 0, so read() will return -1 and the DataInputStream will thus throw an EOFException.

If the user manually skips 21 bytes (the length of the header/footer), the LZ4BlockInputStream will happily continue to read another “frame” (see the commented-out line in the test case).

Workaround:
The user can manually call in.skip(21).
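For reference, the 21 bytes in the workaround appear to break down as follows; this layout is an assumption inferred from LZ4BlockOutputStream's block format and is worth double-checking against the source:

```java
public class BlockHeaderLayout {
    // Assumed layout of an LZ4BlockOutputStream block header/footer
    // (verify against LZ4BlockOutputStream in the lz4-java sources):
    static final int MAGIC_LENGTH = "LZ4Block".length(); // 8-byte magic string
    static final int TOKEN = 1;                          // compression method + level
    static final int COMPRESSED_LENGTH = 4;              // length of the compressed block
    static final int DECOMPRESSED_LENGTH = 4;            // original length of the block
    static final int CHECKSUM = 4;                       // checksum of the original data

    static final int HEADER_LENGTH =
            MAGIC_LENGTH + TOKEN + COMPRESSED_LENGTH + DECOMPRESSED_LENGTH + CHECKSUM;

    public static void main(String[] args) {
        System.out.println(HEADER_LENGTH); // 8 + 1 + 4 + 4 + 4 = 21
    }
}
```

The footer is simply a header whose compressed and decompressed lengths are 0, which is why it is the same 21 bytes long.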

Suggested fix:
I think it would be appropriate if an LZ4BlockInputStream consumed all bytes belonging to one frame: that is, the footer should be consumed when the end of the frame has been reached.

I’m guessing the solution might be a bit trickier because the footer is related to the frame and the header to the block? (I’m probably using the terms block and frame wrong.)

Another approach would be to simply say that this should not be possible. But this “feature” works with a normal GZIPOutputStream/GZIPInputStream, so it would be good if it also worked with LZ4.
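Until something along those lines is fixed, the manual workaround can be wrapped in a small helper. The sketch below is hypothetical (readConcatenated is not part of the library) and assumes that LZ4BlockInputStream has already consumed the 21-byte footer by the time it reports end-of-stream, so peeking a single byte on the underlying stream tells us whether another compressed stream follows:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

import net.jpountz.lz4.LZ4BlockInputStream;

public class ConcatReader {
    // Hypothetical helper: reads every concatenated LZ4 block stream from `in`
    // and returns the concatenation of their decompressed contents.
    public static byte[] readConcatenated(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PushbackInputStream pin = new PushbackInputStream(in);
        int next;
        while ((next = pin.read()) != -1) { // peek: is there data after the footer?
            pin.unread(next);
            LZ4BlockInputStream lz4 = new LZ4BlockInputStream(pin);
            byte[] buf = new byte[8192];
            int n;
            while ((n = lz4.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            // Intentionally not closing lz4: close() would close the shared
            // underlying stream and prevent reading the next frame.
        }
        return out.toByteArray();
    }
}
```

Like GZIPInputStream concatenation handling, this simply restarts a fresh decompressing stream whenever bytes remain after a logical end-of-stream.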

Please help me build

/jpountz-lz4-java-969fc61/build.xml:29: Problem: failed to create task or type antlib:org.apache.ivy.ant:resolve
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any &lt;presetdef&gt;/&lt;macrodef&gt; declarations have taken place.
No types or tasks have been defined in this namespace yet

This appears to be an antlib declaration.
Action: Check that the implementing library exists in one of:
-/usr/share/ant/lib
-/Users/User/.ant/lib
-a directory added on the command line with the -lib argument

Total time: 0 seconds

I have already compiled and added ant-cpp-task.

Make the POM generate an OSGi compliant manifest

It would be nice if the POM could be extended to generate an OSGi-compliant manifest file; the library could then be used in OSGi containers.
This can easily be done by adding something like this to the POM:

            <plugin>
                <groupId>org.apache.felix</groupId>
                <artifactId>maven-bundle-plugin</artifactId>
                <version>2.3.7</version>
                <extensions>true</extensions>
                <configuration>
                    <manifestLocation>src/main/java/META-INF</manifestLocation>
                    <supportedProjectTypes>
                        <supportedProjectType>jar</supportedProjectType>
                        <supportedProjectType>bundle</supportedProjectType>
                    </supportedProjectTypes>
                    <instructions>
                        <Bundle-SymbolicName>${project.groupId}.${project.artifactId}</Bundle-SymbolicName>
                        <Bundle-Version>${project.version}</Bundle-Version>
                        <Bundle-ClassPath>.,{maven-dependencies}</Bundle-ClassPath>
                    </instructions>
                </configuration>
            </plugin>
