
lz4-java's Introduction

LZ4 Java

LZ4 compression for Java, based on Yann Collet's work available at http://code.google.com/p/lz4/.

This library provides access to two compression methods that both generate a valid LZ4 stream:

  • fast scan (LZ4):
    • low memory footprint (~ 16 KB),
    • very fast (fast scan with skipping heuristics in case the input looks incompressible),
    • reasonable compression ratio (depending on the redundancy of the input).
  • high compression (LZ4 HC):
    • medium memory footprint (~ 256 KB),
    • rather slow (~ 10 times slower than LZ4),
    • good compression ratio (depending on the size and the redundancy of the input).

The streams produced by these two compression methods use the same compression format, are very fast to decompress, and can be decompressed by the same decompressor instance.

Implementations

For LZ4 compressors, LZ4 HC compressors and decompressors, 3 implementations are available:

  • JNI bindings to the original C implementation by Yann Collet,
  • a pure Java port of the compression and decompression algorithms,
  • a Java port that uses the sun.misc.Unsafe API in order to achieve compression and decompression speeds close to the C implementation.

Have a look at LZ4Factory for more information.
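As a sketch of how those implementations are selected (assuming the lz4-java jar is on the classpath), the factory methods look like this; fastestInstance() falls back from JNI to the unsafe and then the safe Java port:

```java
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

public class FactoryChoice {
  public static void main(String[] args) {
    // Let the library pick the fastest implementation that loads at runtime.
    LZ4Factory fastest = LZ4Factory.fastestInstance();

    // Or request a specific implementation:
    LZ4Factory safe = LZ4Factory.safeInstance();     // pure Java, no sun.misc.Unsafe
    LZ4Factory unsafe = LZ4Factory.unsafeInstance(); // Java port built on sun.misc.Unsafe
    // LZ4Factory.nativeInstance() returns the JNI bindings and fails if the
    // native library cannot be loaded.

    LZ4Compressor fast = fastest.fastCompressor(); // fast scan (LZ4)
    LZ4Compressor hc = fastest.highCompressor();   // high compression (LZ4 HC)
    System.out.println(fastest);
  }
}
```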

Compatibility notes

  • Compressors and decompressors are interchangeable: it is perfectly correct to compress with the JNI bindings and to decompress with a Java port, or the other way around.

  • Compressors might not generate the same compressed streams on all platforms, especially if CPU endianness differs, but the compressed streams can be safely decompressed by any decompressor implementation on any platform.

Examples

LZ4Factory factory = LZ4Factory.fastestInstance();

byte[] data = "12345345234572".getBytes("UTF-8");
final int decompressedLength = data.length;

// compress data
LZ4Compressor compressor = factory.fastCompressor();
int maxCompressedLength = compressor.maxCompressedLength(decompressedLength);
byte[] compressed = new byte[maxCompressedLength];
int compressedLength = compressor.compress(data, 0, decompressedLength, compressed, 0, maxCompressedLength);

// decompress data
// - method 1: when the decompressed length is known
LZ4FastDecompressor decompressor = factory.fastDecompressor();
byte[] restored = new byte[decompressedLength];
int compressedLength2 = decompressor.decompress(compressed, 0, restored, 0, decompressedLength);
// compressedLength == compressedLength2

// - method 2: when the compressed length is known (a little slower)
// the destination buffer needs to be over-sized
LZ4SafeDecompressor decompressor2 = factory.safeDecompressor();
int decompressedLength2 = decompressor2.decompress(compressed, 0, compressedLength, restored, 0);
// decompressedLength == decompressedLength2
The following example compresses and decompresses data with the LZ4 frame format streams:

byte[] data = "12345345234572".getBytes("UTF-8");
final int decompressedLength = data.length;

LZ4FrameOutputStream outStream = new LZ4FrameOutputStream(new FileOutputStream(new File("test.lz4")));
outStream.write(data);
outStream.close();

byte[] restored = new byte[decompressedLength];
LZ4FrameInputStream inStream = new LZ4FrameInputStream(new FileInputStream(new File("test.lz4")));
// note: read() may return fewer bytes than requested; in real code, loop
// until the buffer is full
inStream.read(restored);
inStream.close();

xxhash Java

xxhash hashing for Java, based on Yann Collet's work available at https://github.com/Cyan4973/xxHash (old version http://code.google.com/p/xxhash/). xxhash is a non-cryptographic, extremely fast and high-quality (SMHasher score of 10) hash function.

Implementations

Similarly to LZ4, 3 implementations are available: JNI bindings, a pure Java port, and a Java port that uses sun.misc.Unsafe.

Have a look at XXHashFactory for more information.

Compatibility notes

  • All implementations return the same hash for the same input bytes:
    • on any JVM,
    • on any platform (even if the endianness or integer size differs).

Example

XXHashFactory factory = XXHashFactory.fastestInstance();

byte[] data = "12345345234572".getBytes("UTF-8");
ByteArrayInputStream in = new ByteArrayInputStream(data);

int seed = 0x9747b28c; // used to initialize the hash value, use whatever
                       // value you want, but always the same
StreamingXXHash32 hash32 = factory.newStreamingHash32(seed);
byte[] buf = new byte[8]; // for real-world usage, use a larger buffer, like 8192 bytes
for (;;) {
  int read = in.read(buf);
  if (read == -1) {
    break;
  }
  hash32.update(buf, 0, read);
}
int hash = hash32.getValue();
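When the whole input is already in memory, the streaming API isn't needed; a one-shot sketch (same assumption as above, lz4-java on the classpath):

```java
import java.io.UnsupportedEncodingException;
import net.jpountz.xxhash.XXHash32;
import net.jpountz.xxhash.XXHashFactory;

public class OneShotHash {
  public static void main(String[] args) throws UnsupportedEncodingException {
    XXHashFactory factory = XXHashFactory.fastestInstance();
    byte[] data = "12345345234572".getBytes("UTF-8");

    int seed = 0x9747b28c; // must match the seed used on the other side
    XXHash32 hasher = factory.hash32();
    int hash = hasher.hash(data, 0, data.length, seed);
    System.out.println(hash);
  }
}
```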

Download

You can download released artifacts from Maven Central.

You can download pure-Java lz4-java from Maven Central. These artifacts include the Safe and Unsafe Java versions but not JNI bindings. (Experimental)

Documentation

Performance

Both lz4 and xxhash focus on speed. Although compression, decompression and hashing performance can depend a lot on the input (there are lies, damn lies and benchmarks), here are some benchmarks that try to give a sense of the speed at which they compress/decompress/hash bytes.

Build

Requirements

  • JDK version 7 or newer,
  • ant version 1.10.2 or newer,
  • ivy.

If ivy is not installed yet, ant can take care of it for you: just run ant ivy-bootstrap. The library will be installed under ${user.home}/.ant/lib.

You might hit an error like the following when the ivy in ${user.home}/.ant/lib is old. You can delete it and then run ant ivy-bootstrap again to install the latest version.

[ivy:resolve] 		::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve] 		::          UNRESOLVED DEPENDENCIES         ::
[ivy:resolve] 		::::::::::::::::::::::::::::::::::::::::::::::

Instructions

For lz4-java 1.5.0 or newer, first run git submodule init and then git submodule update to initialize the lz4 submodule in src/lz4.

Then run ant. It will:

  • generate some Java source files in build/java from the templates that are located under src/build,
  • compile the lz4 and xxhash libraries and their JNI (Java Native Interface) bindings,
  • compile Java sources in src/java (normal sources), src/java-unsafe (sources that make use of sun.misc.Unsafe) and build/java (auto-generated sources) to build/classes, build/unsafe-classes and build/generated-classes,
  • generate a JAR file called lz4-${version}.jar under the dist directory.

The JAR file that is generated contains Java class files, the native library and the JNI bindings. If you add this JAR to your classpath, the native library will be copied to a temporary directory and dynamically linked to your Java application.

lz4-java's People

Contributors

adamretter, bastienf, blambov, clockfort, danielfree, drcrallen, guyuqi, jirutka, joshrosen, jpountz, linnaea, luben, lyrachord, magnet, maropu, odaira, rockyzhang-zz, ruippeixotog, sunxiaoguang, windie, yilongli


lz4-java's Issues

Remove the temporary library file on exit

net.jpountz.util.Native.java first copies the shared library file to a temporary file, and then tries to call System.load. There is a call to File.deleteOnExit to remove it, but does it work:

  • if some code calls System.exit?
  • under Windows, where you can't remove files which are still in use?

JNI Windows

Hi,

I am trying to use LZ4 natively on Windows 7 32/64-bit, but after compiling the project I get an UnsatisfiedLinkError.

The generated file is an *.so file under win32/x86/

Shouldn't this file be a *.dll file? Anyway, I have already read a post which seems to describe the same problem, but I did not see any solution. How can I get this working?

Thanks

enquiry of compressor/decompressor

  1. In a multithreaded environment, which approach is better for performance?

    a. share a factory between multiple threads and create a compressor/decompressor in each thread
    b. share a compressor/decompressor between multiple threads
    c. do not share anything; create everything ad hoc

  2. What is the behaviour of the safe/fast decompressor if the output size is larger than the specified decompressedLength/allocated buffer size?

    Is an exception thrown, or does memory corruption occur?
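Not an authoritative answer, but the library's Javadoc describes compressor and decompressor instances as thread-safe, so sharing both the factory and the instances is usually fine. If you still want one instance per thread, a ThreadLocal avoids repeated creation; below is a minimal sketch of that pattern, with a placeholder Object standing in for the compressor:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadDemo {
  static final AtomicInteger created = new AtomicInteger();

  // In real code the supplier would be something like factory::fastCompressor.
  static final ThreadLocal<Object> COMPRESSOR =
      ThreadLocal.withInitial(() -> {
        created.incrementAndGet();
        return new Object(); // placeholder for the per-thread compressor
      });

  public static void main(String[] args) throws InterruptedException {
    Runnable task = () -> {
      COMPRESSOR.get(); // first call creates the instance for this thread
      COMPRESSOR.get(); // subsequent calls reuse it
    };
    Thread t1 = new Thread(task);
    Thread t2 = new Thread(task);
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(created.get()); // 2: one instance per thread
  }
}
```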

arrayOffset() not taken into account by JNI ByteBuffer methods

Changing AbstractLZ4Test.Tester.BYTE_BUFFER to use slicing demonstrates the problem:

    public static final Tester<ByteBuffer> BYTE_BUFFER = new Tester<ByteBuffer>() {

      @Override
      public ByteBuffer allocate(int length) {
        ByteBuffer bb;
        int slice = randomInt(5);
        if (randomBoolean()) {
          bb = ByteBuffer.allocate(length + slice);
        } else {
          bb = ByteBuffer.allocateDirect(length + slice);
        }
        bb.position(slice);
        bb = bb.slice();
        if (randomBoolean()) {
          bb.order(ByteOrder.LITTLE_ENDIAN);
        } else {
          bb.order(ByteOrder.BIG_ENDIAN);
        }
        return bb;
      }

Optimised skip() method for LZ4BlockInputStream

If I only want to decompress some part of the file, let's say a section at the beginning and some other bits at the end, I can use skip(). But it looks to me like skipping through the file would currently still mean calling refill() on every block in between. Am I right? Is it possible to only read the headers in between, and then only the block needed, to be ready for the next call to read()? People might scan the file using an index, for example.

`StreamingXXHash32.getValue()` should return a long

Hello,

I'm using your implementation of xxHash to check the integrity of uploaded files in a web application. As you can imagine, client side I'm using the JS implementation found here.

Problems arise when the calculated hash is >= 2^31: while on the client side the hash is correctly sent as unsigned, on the server side the call to StreamingXXHash32.getValue() returns a negative int due to the overflow. This makes the two values not comparable (without some hack, at least).

Maybe it would be better to change the StreamingXXHash32.getValue() implementation so that it returns a long. That way no overflow would occur, and the original unsigned value would be returned.

Thanks for your work!
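Until such an API change, the signed int can be widened to the unsigned value the JS side produces by masking with 0xFFFFFFFFL; a small illustration:

```java
public class UnsignedHash {
  public static void main(String[] args) {
    int hash = 0x9747B28C;              // negative when read as a signed int
    long unsigned = hash & 0xFFFFFFFFL; // zero-extend to the unsigned value
    System.out.println(hash);           // -1756908916
    System.out.println(unsigned);       // 2538058380
  }
}
```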

fastestInstance succeeds when it shouldn't

The factory call succeeds in returning the JNI instance even when I don't have the JNI libs included.

I think you should call factory.fastCompressor().maxCompressedLength(100);

to make sure it's really working.

problems with LZ4Block{Input,Output}Stream

Hi Adrien,

I tried your new LZ4Block{Input,Output}Stream as published in:
http://search.maven.org/#artifactdetails|net.jpountz.lz4|lz4|1.1.0|jar

and unfortunately I had problems.

I have committed the passing and failing tests into:
https://github.com/mooreb/lz4-java-stream/tree/master/tests-2013-02-11/for-jpountz

The error I am experiencing is:

java.lang.AssertionError: Bytes differed! Seed value was 1360613556874 expected:<-123> but was:<26>
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
    at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:200)
    at net.jpountz.lz4.LZ4StreamTest.assertEqualContent(LZ4StreamTest.java:148)
    at net.jpountz.lz4.LZ4StreamTest.assertContentInSingleBlock(LZ4StreamTest.java:133)
    at net.jpountz.lz4.LZ4StreamTest.randomizedTest(LZ4StreamTest.java:95)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
java.lang.AssertionError: Exception was thrown.  Seed value was 1360613556874
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
    at net.jpountz.lz4.LZ4StreamTest.randomizedTest(LZ4StreamTest.java:106)
    at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
    at org.testng.TestRunner.privateRun(TestRunner.java:767)
    at org.testng.TestRunner.run(TestRunner.java:617)
    at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
    at org.testng.SuiteRunner.run(SuiteRunner.java:240)
    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1198)
    at org.testng.TestNG.runSuitesLocally(TestNG.java:1123)
    at org.testng.TestNG.run(TestNG.java:1031)
    at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
    at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
    at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
    at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

The difference between the passing and failing tests is below:

$ diff -Naur passing-test/LZ4StreamTest.java failing-test/LZ4StreamTest.java
--- passing-test/LZ4StreamTest.java 2013-02-11 12:42:55.000000000 -0800
+++ failing-test/LZ4StreamTest.java 2013-02-11 12:43:15.000000000 -0800
@@ -39,7 +39,7 @@
     private void compressContent() throws IOException {
         ByteArrayOutputStream compressedOutputStream = new ByteArrayOutputStream();
 
-        LZ4OutputStream os = new LZ4OutputStream(compressedOutputStream);
+        LZ4BlockOutputStream os = new LZ4BlockOutputStream(compressedOutputStream);
@@ -77,7 +77,7 @@
     @Test
     public void randomizedTest() throws IOException {
         try {
-            InputStream is = new LZ4InputStream(new ByteArrayInputStream(compressedOutput));
+            InputStream is = new LZ4BlockInputStream(new ByteArrayInputStream(compressedOutput));
             int currentContentPosition = 0;
 

Hoping this helps and finds you well,

b

Stream is corrupted exception on IBM JDK 1.6

Hi,

when trying to use the IBM JDK, I get an exception as follows:

Exception in thread "main" java.io.IOException: Stream is corrupted
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:207)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:116)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:129)

my initialization looks as follows:

    @Override
    public OutputStream getCompressorOutputStream(OutputStream out) throws IOException {

        // block size
        int blockSize = 1 << 16;

        // compressor
        LZ4Factory factory = LZ4Factory.safeInstance();
        LZ4Compressor compressor = factory.fastCompressor();

        // checksum
        int seed = 0x9747b28c;
        XXHashFactory checksumFactory = XXHashFactory.safeInstance();
        Checksum checksum = checksumFactory.newStreamingHash32(seed).asChecksum();

        return new LZ4BlockOutputStream(out, blockSize, compressor, checksum, false);
    }

    @Override
    public InputStream getCompressorInputStream(InputStream in) throws IOException {

        // decompressor
        LZ4Factory factory = LZ4Factory.safeInstance();
        LZ4FastDecompressor decompressor = factory.fastDecompressor();

        // checksum
        int seed = 0x9747b28c;
        XXHashFactory checksumFactory = XXHashFactory.safeInstance();
        Checksum checksum = checksumFactory.newStreamingHash32(seed).asChecksum();

        return new LZ4BlockInputStream(in, decompressor, checksum);
    }

unfortunately, sometimes it also works :-( so, really no idea what the problem is. However, it never happens using either OpenJDK or OracleJDK.

Thanks and best,
Lukas

Calling flush on a closed LZ4BlockOutputStream throws NPE

Found this when using Jackson's ObjectMapper because it auto-closes streams. I know the Java spec says that if a stream is 'closed' and another operation is called, it is perfectly fine to throw an exception; however, for ease of use, and since LZ4BlockOutputStream already supports calling close repeatedly, it was simple enough to patch flush.

JavaSafeCompressor missing in 1.2 and master branch

Lib builds well (except the unsafe-warnings), but the example doesn't run:

Exception in thread "main" java.lang.AssertionError: java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4JavaSafeCompressor

Similar to issue #19.

I'm also interested in building LZ4 with the pure java implementation only.

Android version

lz4-1.2.0.jar is incompatible with Android, as it contains native libraries that will not run on the device. As a quick hack I've manually removed the following from the jar: linux/amd64/liblz4-java.so, linux/i386/liblz4-java.so and win32/amd64/liblz4-java.so, and it works, but it would be great to have an Android-specific version with separate ARM and x86 native libraries. Great lib, thanks for the hard work.

Recommended maxCompressedLength is often not adequate.

In running some tests, I've found that the suggestion made in the documentation to "make sure that maxDestLen >= maxCompressedLength(srcLen)" is often inadequate to avoid getting an LZ4Exception during compression if compressing small chunks of data.

Though you can presume that small bits of data won't be sent to the compressor since there would be a net loss, that assumption can easily fail in practice. To me, this seems to be unacceptable, especially since there is no fixed uncompressed array size at which the compressed array will actually be less than or equal. Depending on what data is being compressed, I've seen uncompressed byte arrays of only 14 compress down to 10 bytes, but I've also seen 28 byte arrays take 30 bytes once compressed.

To be honest, the entire implementation of designating a byte array of a fixed size prior to calling compression seems extremely non-Java and prone to memory waste. It would be trivial to have the compression implementation maintain a dynamic array internally and simply return the compressed byte array without the user messing with pre-allocation.

The other benefit of maintaining an internal dynamic array for compression is that the library would not require nearly as much memory to be allocated for the larger data to be compressed. When compression ratios range around 13%, allocating a static array for the full 100% is just stupid (imho).

Failing this change to the library, it would be nice to at least provide a helper method that returns a "good" maxDestLength when provided an uncompressed byte array.

For my own solution to calculate a maxDestLength that will (hopefully) work, I have done the following, but it is not a very satisfying solution:

maxDestLength = (data.length < 100) ? 100 : data.length;
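For reference, the worst-case block expansion that lz4 documents (the LZ4_compressBound formula) is srcLen + srcLen/255 + 16, which is what maxCompressedLength() is based on, so a destination buffer of that size should never overflow. A quick check against the sizes from the report above:

```java
public class CompressBound {
  // Worst-case compressed size of an LZ4 block (LZ4_compressBound formula).
  static int maxCompressedLength(int srcLen) {
    return srcLen + srcLen / 255 + 16;
  }

  public static void main(String[] args) {
    // The 14-byte and 28-byte inputs mentioned above:
    System.out.println(maxCompressedLength(14)); // 30: enough for the observed 10 bytes
    System.out.println(maxCompressedLength(28)); // 44: enough for the observed 30 bytes
  }
}
```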

solaris x64 Unsafe_GetInt crash

[Loaded net.jpountz.lz4.LZ4JavaUnsafeSafeDecompressor from file:/opt/cassandra/apache-cassandra-2.0.14/lib/lz4-1.2.0.jar]
[Loaded net.jpountz.lz4.LZ4Constants from file:/opt/cassandra/apache-cassandra-2.0.14/lib/lz4-1.2.0.jar]
[Loaded net.jpountz.util.UnsafeUtils from file:/opt/cassandra/apache-cassandra-2.0.14/lib/lz4-1.2.0.jar]

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e46a9f4, pid=29042, tid=2

JRE version: Java(TM) SE Runtime Environment (8.0_45-b14) (build 1.8.0_45-b14)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc6a9f4] Unsafe_GetInt+0x170

JNI / classloader issues

A JNI library can only be loaded once per classloader, so if any classloader which is not a child of the classloader which loaded the JNI library tries to use the JNI version, it will fail. This can be a problem with servlet containers.

snappy-java has a hack to inject bytecode into the root classloader but I'm not sure this is the way to go...

If not fixable, I think we should either document this behavior or forbid loading the JNI lib from a non-root classloader?

deploy to maven

It would be great if the code was deployed to a maven repo (obviously without jni support)

Source ByteBuffer gets tampered on Decompressing

Hi Team,
I am using LZ4 1.3.0 library to compress and decompress ByteBuffer. My sourcebuffer gets altered during decompression.
//Compression
int compressLen = compressor.compress(message, 0, decompressedLength, message, 0, maxCompressedLength);

//Decompression
int decompressed = deCompressor.decompress(msg, 0,compressedLen, bufferMsg, 0, bufferMsg.capacity());

Actually my source buffer contains many compressed messages separated by Identifiers and I am decompressing them one by one in loop.
First decompression works fine, but from second decompression, I start getting following error for all the subsequent messages "Error decoding offset 422 of input buffer".

However, if I use a temporary buffer as the source buffer to decompress, decompression works fine for all the compressed messages.
//Decompression using tempBuffer
System.arraycopy(msg.array(),msg.arrayOffset(), tempMsg.array(),tempMsg.arrayOffset(), compressedLen);
int decompressed = deCompressor.decompress(tempMsg, 0,compressedLen , bufferMsg, 0, bufferMsg.capacity());

It seems to me that the source buffer being decompressed gets somehow modified during decompression.

Can anybody please help: how do I achieve this decompression without using a temp buffer?

Crash with IBM Java 7

I ran into this problem when trying to use Apache Cassandra 2.0.7 with IBM Java 7. Cassandra crashes during start up and generates a core file. It is pretty easy to recreate with a Java program that calls on LZ4Factory.nativeInstance().

Stack trace

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "pool-1-thread-4" J9VMThread:0x0000000050B7C200, j9thread_t:0x00007F80B0E96660, java/lang/Thread:0x000000004E8D6410, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x17, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x4FE0, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO2            (native stack address range from:0x00007F80A5A98000, to:0x00007F80A5AD9000, size:0x41000)
3XMCPUTIME               CPU usage total: 0.484368829 secs
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=12930384 (0xC54D50)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at net/jpountz/lz4/LZ4JNI.LZ4_compress_limitedOutput(Native Method)
4XESTACKTRACE                at net/jpountz/lz4/LZ4JNICompressor.compress(LZ4JNICompressor.java:31)
4XESTACKTRACE                at net/jpountz/lz4/LZ4Factory.<init>(LZ4Factory.java:163)
4XESTACKTRACE                at net/jpountz/lz4/LZ4Factory.instance(LZ4Factory.java:46)
4XESTACKTRACE                at net/jpountz/lz4/LZ4Factory.nativeInstance(LZ4Factory.java:76)

Assertion failure

1XECTHTYPE     Current thread history (J9VMThread:0x0000000001080500)
3XEHSTTYPE     13:57:09:734842157 GMT j9mm.107 - * ** ASSERTION FAILED ** at StandardAccessBarrier.cpp:322: ((false && (elems == getArrayObjectDataAddress((J9VMToken*)vmThread, arrayObject)))) 

Allow to recycle compression buffers

When compressing lots of small buffers, it may happen that the bottleneck is the allocation of the hash table. There should be an option in order to reuse these hash tables per thread.

XXHash performance on ARM

I did some basic performance tests (https://github.com/neophob/PixelController/tree/develop) comparing XXHash and Adler32 on some ARM systems (RPi and BBB). XXHash should be much faster than Adler32, according to your benchmarks (http://jpountz.github.io/lz4-java/1.2.0/xxhash-benchmark/).

I use v1.2.0. So there are 3 possibilities:
a) my benchmark sucks
b) my code sucks
c) the code is not really fast on ARM

My code can be found here: https://github.com/neophob/PixelController/blob/develop/pixelcontroller-core/src/main/java/com/neophob/sematrix/core/perf/PerfTests.java

NullPointerException when LZ4Factory is loaded by bootstrap class loader

At https://github.com/jpountz/lz4-java/blob/aef24cfbbf53d6ade60073b287edc610be743e43/src/java/net/jpountz/lz4/LZ4Factory.java#L149, LZ4Factory.class.getClassLoader() might return null if LZ4Factory is loaded by the bootstrap class loader. It results in a NPE.

LZ4Factory is loaded by the bootstrap class loader in my application because I am using lz4 in a Java agent and this Java agent is loaded by the bootstrap class loader.

LZ4BlockInputStream lz4-java-1.1.1 needs to read fully

java.io.IOException: Stream is corrupted
at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:159)
at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:123)
at org.apache.commons.io.IOUtils.read(IOUtils.java:2454)
at org.apache.commons.io.IOUtils.read(IOUtils.java:2476)

I think this line

in.read(compressedBuffer, 0, HEADER_LENGTH);

needs to read fully.
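A minimal sketch of the fix (a readFully-style loop over plain java.io, not the library's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// InputStream.read(byte[], int, int) may return fewer bytes than requested,
// so a header read must loop until the buffer is full.
public class ReadFully {
  static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
    while (len > 0) {
      int n = in.read(buf, off, len);
      if (n < 0) throw new EOFException("Stream ended inside header");
      off += n;
      len -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] src = {1, 2, 3, 4, 5, 6, 7, 8};
    byte[] header = new byte[8];
    readFully(new ByteArrayInputStream(src), header, 0, header.length);
    System.out.println(header[7]); // 8: the whole header was read
  }
}
```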

Typo

There's a typo in LZ4Factory:
unknwonSizeDecompressor
should obviously be
unknownSizeDecompressor

safe compressor missing

I can't rely on platform-specific code, so I'm trying to get my compressor with LZ4Factory.safeInstance().fastCompressor(). That's failing with

Exception in thread "AWT-EventQueue-0" java.lang.AssertionError:
java.lang.ClassNotFoundException: net.jpountz.lz4.LZ4JavaSafeCompressor
at net.jpountz.lz4.LZ4Factory.instance(LZ4Factory.java:48)
at net.jpountz.lz4.LZ4Factory.safeInstance(LZ4Factory.java:85)

Can I build myself a jar that includes LZ4JavaSafeCompressor and/or a jar with no native code?

Compressed message doesn't start with magic number?

Hi,
I am trying to use this library, here is the simple class I am using to wrap it.

https://github.com/openworm/org.geppetto.core/blob/181813152b1747428a2c1aa3126dc3eb40638bb8/src/main/java/org/geppetto/core/common/LZ4Compress.java

When I try to compress the following string:

{"type":"read_url_parameters","data":"{}"}

the result of the compress call is the following:

[-16, 27, 123, 34, 116, 121, 112, 101, 34, 58, 34, 114, 101, 97, 100, 95, 117, 114, 108, 95, 112, 97, 114, 97, 109, 101, 116, 101, 114, 115, 34, 44, 34, 100, 97, 116, 97, 34, 58, 34, 123, 125, 34, 125]

Shouldn't this start with the magic number 0x184D2204?
I am trying to decode this with this library and it complains with:

Invalid magic number: 227B1BF0 @0

Am I doing something wrong? Thanks!
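For what it's worth, LZ4Compressor produces a raw LZ4 block with no header at all; the 0x184D2204 magic number belongs to the LZ4 frame format (what LZ4FrameOutputStream and frame-format JS decoders use). Under that assumption, the leading -16 (0xF0) is just a block literal token, not a corrupted header; a small illustration of the byte layout:

```java
public class FrameMagic {
  public static void main(String[] args) {
    int magic = 0x184D2204; // LZ4 frame magic number, stored little-endian
    byte[] le = {
        (byte) magic,
        (byte) (magic >>> 8),
        (byte) (magic >>> 16),
        (byte) (magic >>> 24)
    };
    // A framed stream begins with these four bytes:
    System.out.printf("%02X %02X %02X %02X%n", le[0], le[1], le[2], le[3]); // 04 22 4D 18

    // -16 is 0xF0: an LZ4 block token whose high nibble (15) starts a long
    // literal run -- not a frame header.
    System.out.println(0xF0 >>> 4); // 15
  }
}
```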

Misspelled API

In LZ4Factory, the method is named "unknwonSizeDecompressor". Looks like this is leftover from when the class name itself was also misspelled.

Large byte array compression on ARM Linux

My environment:
PandaBoard-ES, Ubuntu server, OpenJDK 1.7 - ZeroVM

Issue:
I've a large byte array containing 614400 elements (640 x 480 x 2)
I want to compress it, but the program freezes. When I try to kill the task, I get:

OpenJDK Zero VM warning: Exception java.lang.NullPointerException occurred dispatching signal SIGINT to handler- the VM may need to be forcibly terminated

After a while, the program automatically terminates with the error below:


A fatal error has been detected by the Java Runtime Environment:

Internal Error (os_linux_zero.cpp:285), pid=923, tid=2379707504
fatal error: caught unhandled signal 11

JRE version: 7.0_25-b30
Java VM: OpenJDK Zero VM (22.0-b10 mixed mode linux-arm )
Derivative: IcedTea 2.3.10
Distribution: Ubuntu 12.04 LTS, package 7u25-2.3.10-1ubuntu0.12.04.2
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

An error report file with more information is saved as:
/home/db/Desktop/thesis_java/hs_err_pid923.log
Segmentation fault


Here is my code block:

ShortBuffer sb = depthMD.getData().createShortBuffer();
ByteBuffer bb = ByteBuffer.allocate(sb.capacity()*2);
bb.asShortBuffer().put(sb);
byte[] data = new byte[bb.capacity()];
bb.get(data);
// Verified that I've a clean byte array data here.

LZ4Factory factory = LZ4Factory.fastestInstance();
int decompressedLength = data.length;
LZ4Compressor compressor = factory.fastCompressor();
int maxCompressedLength =
compressor.maxCompressedLength(decompressedLength);
byte[] compressed = new byte[maxCompressedLength];
//Next line gives that error
int compressedLength = compressor.compress(data, 0, decompressedLength, compressed, 0, maxCompressedLength);
System.out.println(data.length);

Next release for XXH64

When do you plan the next release, so that people can use XXH64 from a release version? Is the master branch XXH64 code production ready?

JNI Problem

Hi
The JNI header file "jni.h" exists in my JAVA.FRAMEWORK path, but the build outputs:
...
compile-java:
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/classes
[javac] Compiling 24 source files to /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/classes

generate-header:
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers

compile-jni:
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/objects
[mkdir] Created dir: /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni/${platform}/x86_64
[cpptasks:cc] 5 total files to be compiled.
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:2:17: error: jni.h: No such file or directory
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:16: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:24: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:32: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:40: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:48: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_lz4_LZ4JNI.h:56: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:21: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘OutOfMemoryError’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:28: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:33: error: expected ‘)’ before ‘’ token
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:42: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:70: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:98: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:126: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_lz4_LZ4JNI.c:154: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:2:17: error: jni.h: No such file or directory
[cpptasks:cc] In file included from /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:19:
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:18: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:26: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build/jni-headers/net_jpountz_xxhash_XXHashJNI.h:34: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:21: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘OutOfMemoryError’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:28: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘void’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:33: error: expected ‘)’ before ‘’ token
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:42: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’
[cpptasks:cc] /Volumes/Extends/lz4/jpountz-lz4-java-969fc61/src/jni/net_jpountz_xxhash_XXHashJNI.c:59: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘jint’

BUILD FAILED
/Volumes/Extends/lz4/jpountz-lz4-java-969fc61/build.xml:62: gcc failed with return code 1

Total time: 8 seconds

Idea to improve performance by eliminating one copying step

I'm going over your code while considering using it for a Hazelcast enterprise feature. As part of the same effort I have done some I/O coding similar to the write(byte[], int, int) method of LZ4BlockOutputStream. The way I wrote it is not too complex, but it avoids repeatedly copying into the internal buffer whenever the supplied buffer is larger than it. Basically:

void write(byte[] b, int off, int len) {
    // While a full buffer's worth remains, bypass the internal buffer
    // and compress directly from the caller's array.
    while (len >= BUFFER_SIZE) {
        flushLocalBuffer();
        compress(b, off, BUFFER_SIZE);
        off += BUFFER_SIZE;
        len -= BUFFER_SIZE;
    }
    // Copy the remainder (less than BUFFER_SIZE) into the internal buffer,
    // flushing whenever it fills up.
    while (len > 0) {
        final int transferredCount = Math.min(BUFFER_SIZE - position, len);
        System.arraycopy(b, off, buf, position, transferredCount);
        off += transferredCount;
        len -= transferredCount;
        position += transferredCount;
        ensureBufHasRoom();
    }
}

Since performance is a high concern in this project, perhaps this detail is worth bringing up.

LZ4BlockInputStream cannot read two consecutive write-close operations from two different LZ4BlockOutputStreams

How to reproduce:
Run the test case testWriteCloseWriteCloseRead(). In pseudo code:

  1. Write some data to a file with an LZ4BlockOutputStream and close the stream.
  2. Write some more data to the same file with a new LZ4BlockOutputStream and close the stream.
  3. Read all of the data with one single instance of LZ4BlockInputStream.
  /**
   * Write and close two stream instances to the same file. Read the entire data with one
   * LZ4BlockInputStream.
   */
  @Test
  public void testWriteCloseWriteCloseRead() throws IOException {
    final byte[] testBytes = "Testing!".getBytes(Charset.forName("UTF-8"));

    //Write the first time
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    LZ4BlockOutputStream out = new LZ4BlockOutputStream(bytes);
    out.write(testBytes);
    out.close();

    //Write the second time
    out = new LZ4BlockOutputStream(bytes);
    out.write(testBytes);
    out.close();

    ByteArrayInputStream in = new ByteArrayInputStream(bytes.toByteArray());
    LZ4BlockInputStream lz4In = new LZ4BlockInputStream(in);
    DataInputStream dataIn = new DataInputStream(lz4In);

    byte[] buffer = new byte[testBytes.length];
    dataIn.readFully(buffer);
    assertArrayEquals(testBytes, buffer);

//    in.skip(LZ4BlockOutputStream.HEADER_LENGTH); //This test case only passes if 21 bytes (the footer) are skipped

    buffer = new byte[testBytes.length];
    dataIn.readFully(buffer);
    assertArrayEquals(testBytes, buffer);
  }

Actual:
A java.io.EOFException is thrown.

Expected:
All of the data should be read and returned.

Analysis:
The LZ4BlockOutputStream will write a header, data and a footer. The footer is very similar to the header. Two LZ4BlockOutputStreams will create this:
Header | Compressed Data | Footer | Header | Compressed Data | Footer
One instance of LZ4BlockInputStream will read the header and the compressed data. If the user tries to read more data, it will try to read a header again. But since it has not skipped the previous footer, it will read the footer instead. The footer, although similar to the header, contains a length of 0, so read() will return -1 and the DataInputStream will thus throw an EOFException.

If the user manually skips 21 bytes (the length of the header/footer), the LZ4BlockInputStream will happily continue to read another “frame” (see the commented-out line in the test case).

Workaround:
The user can manually call in.skip(21).
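For reference, the 21 bytes in the workaround appear to break down as follows; this layout is an assumption inferred from LZ4BlockOutputStream's block format and is worth double-checking against the source:

```java
public class BlockHeaderLayout {
    // Assumed layout of an LZ4BlockOutputStream block header/footer
    // (verify against LZ4BlockOutputStream in the lz4-java sources):
    static final int MAGIC_LENGTH = "LZ4Block".length(); // 8-byte magic string
    static final int TOKEN = 1;                          // compression method + level
    static final int COMPRESSED_LENGTH = 4;              // length of the compressed block
    static final int DECOMPRESSED_LENGTH = 4;            // original length of the block
    static final int CHECKSUM = 4;                       // checksum of the original data

    static final int HEADER_LENGTH =
            MAGIC_LENGTH + TOKEN + COMPRESSED_LENGTH + DECOMPRESSED_LENGTH + CHECKSUM;

    public static void main(String[] args) {
        System.out.println(HEADER_LENGTH); // 8 + 1 + 4 + 4 + 4 = 21
    }
}
```

The footer is simply a header whose compressed and decompressed lengths are 0, which is why it is the same 21 bytes long.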

Suggested fix:
I think it would be appropriate if an LZ4BlockInputStream consumed all bytes belonging to one frame: that is, the footer should be consumed when the end of the frame has been reached.

I’m guessing the solution might be a bit trickier because the footer is related to the frame and the header to the block? (I’m probably using the terms block and frame wrong.)

Another approach would be to simply say that this should not be possible. But this “feature” works with a normal GZIPOutputStream/GZIPInputStream, so it would be good if it also worked with LZ4.
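Until something along those lines is fixed, the manual workaround can be wrapped in a small helper. The sketch below is hypothetical (readConcatenated is not part of the library) and assumes that LZ4BlockInputStream has already consumed the 21-byte footer by the time it reports end-of-stream, so peeking a single byte on the underlying stream tells us whether another compressed stream follows:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

import net.jpountz.lz4.LZ4BlockInputStream;

public class ConcatReader {
    // Hypothetical helper: reads every concatenated LZ4 block stream from `in`
    // and returns the concatenation of their decompressed contents.
    public static byte[] readConcatenated(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PushbackInputStream pin = new PushbackInputStream(in);
        int next;
        while ((next = pin.read()) != -1) { // peek: is there data after the footer?
            pin.unread(next);
            LZ4BlockInputStream lz4 = new LZ4BlockInputStream(pin);
            byte[] buf = new byte[8192];
            int n;
            while ((n = lz4.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            // Intentionally not closing lz4: close() would close the shared
            // underlying stream and prevent reading the next frame.
        }
        return out.toByteArray();
    }
}
```

Like GZIPInputStream concatenation handling, this simply restarts a fresh decompressing stream whenever bytes remain after a logical end-of-stream.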

Please help me build

/jpountz-lz4-java-969fc61/build.xml:29: Problem: failed to create task or type antlib:org.apache.ivy.ant:resolve
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any &lt;presetdef&gt;/&lt;macrodef&gt; declarations have taken place.
No types or tasks have been defined in this namespace yet

This appears to be an antlib declaration.
Action: Check that the implementing library exists in one of:
-/usr/share/ant/lib
-/Users/User/.ant/lib
-a directory added on the command line with the -lib argument

Total time: 0 seconds

I have already compiled and added ant-cpp-task.

Make the POM generate an OSGi compliant manifest

It would be nice if the POM could be extended to generate an OSGi-compliant manifest file; the library could then be used in OSGi containers.
This can easily be done by adding something like this to the POM:

            <plugin>
                <groupId>org.apache.felix</groupId>
                <artifactId>maven-bundle-plugin</artifactId>
                <version>2.3.7</version>
                <extensions>true</extensions>
                <configuration>
                    <manifestLocation>src/main/java/META-INF</manifestLocation>
                    <supportedProjectTypes>
                        <supportedProjectType>jar</supportedProjectType>
                        <supportedProjectType>bundle</supportedProjectType>
                    </supportedProjectTypes>
                    <instructions>
                        <Bundle-SymbolicName>${project.groupId}.${project.artifactId}</Bundle-SymbolicName>
                        <Bundle-Version>${project.version}</Bundle-Version>
                        <Bundle-ClassPath>.,{maven-dependencies}</Bundle-ClassPath>
                    </instructions>
                </configuration>
            </plugin>
