
Comments (10)

MiloszKrajewski avatar MiloszKrajewski commented on August 17, 2024 1

Rule number one of performance is: Measure!

I don't know which is the bigger pain: allocating large objects, or a little bit of pressure from many small objects.
Both BLOCK and Pickler need to allocate the whole output buffer, so both will have the same problem. Pickler is just BLOCK, but a little bit simpler to use if you don't want to get into the details.

If you suspect that most of the time images compress well, so those allocations are actually quite wasteful, you should try STREAM. It will definitely help with LOH allocation, at the price of some extra copying. Check out the LZ4Frame class and use some in-memory Stream. I say "in-memory Stream" rather than MemoryStream, because MemoryStream has horrible performance; try https://github.com/MiloszKrajewski/K4os.Streams instead, especially ChunkedByteBufferStream.
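A minimal sketch of that idea, assuming the stream-based Encode entry point from K4os.Compression.LZ4.Streams and treating ChunkedByteBufferStream as a drop-in Stream (its actual constructor and API should be checked against the K4os.Streams package before use):

```csharp
using K4os.Compression.LZ4;
using K4os.Compression.LZ4.Streams;
using K4os.Streams;

static class StreamCompression
{
    // Compress into an in-memory chunked stream; no single contiguous
    // output buffer is ever allocated, so the LOH is never touched.
    public static ChunkedByteBufferStream Compress(ReadOnlySpan<byte> source)
    {
        var target = new ChunkedByteBufferStream();
        using (var encoder = LZ4Stream.Encode(target, LZ4Level.L00_FAST, leaveOpen: true))
            encoder.Write(source);
        target.Position = 0; // rewind so the caller can read it back
        return target;
    }
}
```

The trade-off is that the compressed data stays chunked; if a single byte[] is needed at the end, one copy out of the stream is still required.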

Let me know if it helps.

from k4os.compression.lz4.

MiloszKrajewski avatar MiloszKrajewski commented on August 17, 2024 1

This is an option, and I think it is better.
The problem with the previous solution is that the object cannot find itself:

s.Add(a);
a.V = 1337;
s.Contains(a); // false - what the hell?

The solution with RuntimeHelpers.GetHashCode(this) (which is the default implementation) at least guarantees that:

s.Add(a);
a.V = 1337;
s.Contains(a); // true - it is still there

It definitely does not guarantee that similar-looking objects will be found, though:

s.Add(a);
a.V = 1337;
b.V = 1337;
s.Contains(a); // true - it is still there
s.Contains(b); // false - b is technically a different object than a, even if they "look" the same


sn4k3 avatar sn4k3 commented on August 17, 2024

Thanks, I will have a look and do some testing. At first sight, if I want to use ChunkedByteBufferStream with LZ4Frame, I need to implement my own IBufferWriter around ChunkedByteBufferStream, correct?

If you have a suspicion that most of the time images compress well

They do, because the models are "simple": just continuous white lines with some grey pixels.

Example of compressing an 8K layer:

rentSpan.Length = 33554432
encodedLength   = 126394

Although the size will vary, as each model is different (more or fewer pixels, with or without grey anti-aliasing / blur).


MiloszKrajewski avatar MiloszKrajewski commented on August 17, 2024

Thanks, I will have a look and do some testing. At first sight, if I want to use ChunkedByteBufferStream with LZ4Frame, I need to implement my own IBufferWriter around ChunkedByteBufferStream, correct?

Well, no - and yes.

I was thinking that you could use it as a Stream (not as an IBufferWriter<byte>) and keep it as such a stream, never actually allocating contiguous blocks of memory.

ArrayByteBuffer<byte> allocates one block of memory, expanding it when needed. So it has the reallocation problem, but at the same time, if your streams compress well, it is not really a problem, as it will not happen a lot.

Your options are:

  • use the STREAM protocol with ChunkedByteBufferStream (which is a stream) (con: Stream-related overhead)
  • use the STREAM protocol with ArrayByteBuffer<byte> (con: reallocation)

A third, but also quite complex, option would be to wrap ChunkedByteBufferStream in some adapter implementing IBufferWriter<byte>, but I think you will find some serious incompatibilities (related to the GetSpan(...) requirement that the returned block be contiguous).

I would definitely try 1 and 2.
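For completeness, option 3 might look like this hypothetical adapter over any Stream. Note that every Advance copies the rented buffer into the underlying stream (exactly the extra copying mentioned earlier), while each GetSpan call hands out a fresh contiguous buffer from offset zero, which satisfies the per-call contiguity requirement:

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical adapter: exposes any Stream as an IBufferWriter<byte>.
public sealed class StreamBufferWriter : IBufferWriter<byte>, IDisposable
{
    private readonly Stream _stream;
    private byte[] _buffer;

    public StreamBufferWriter(Stream stream, int chunkSize = 64 * 1024)
    {
        _stream = stream;
        _buffer = ArrayPool<byte>.Shared.Rent(chunkSize);
    }

    // Flush the bytes the caller just wrote into the rented buffer.
    public void Advance(int count) => _stream.Write(_buffer, 0, count);

    public Memory<byte> GetMemory(int sizeHint = 0) { Grow(sizeHint); return _buffer; }

    public Span<byte> GetSpan(int sizeHint = 0) { Grow(sizeHint); return _buffer; }

    private void Grow(int sizeHint)
    {
        if (sizeHint <= _buffer.Length) return;
        ArrayPool<byte>.Shared.Return(_buffer);
        _buffer = ArrayPool<byte>.Shared.Rent(sizeHint);
    }

    public void Dispose() => ArrayPool<byte>.Shared.Return(_buffer);
}
```

This is safe under the IBufferWriter contract (previously returned buffers become invalid after Advance), but every byte is written twice: once into the rented array, once into the stream.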


sn4k3 avatar sn4k3 commented on August 17, 2024

To ease things, I tried SparseBufferWriter, as it can be passed directly to your LZ4Frame as an IBufferWriter; however, performance is a bit worse.

var span = mat.GetDataByteSpan();
using var buffer = new SparseBufferWriter<byte>(span.Length / 100);
LZ4Frame.Encode(span, buffer, LZ4Level.L00_FAST);
var result = new byte[buffer.WrittenCount];
buffer.CopyTo(result);
return result;

Here the result using my benchmark tool from the application itself:

[benchmark screenshot]
45% worse in multi-threaded mode over 5000 compressions at 4K

I will continue making some tests and compare...

EDIT 1: Using PooledBufferWriter 35% worse

var span = mat.GetDataByteSpan();
using var buffer = new PooledBufferWriter<byte>();
LZ4Frame.Encode(span, buffer, LZ4Level.L00_FAST);
return buffer.WrittenMemory.ToArray();

[benchmark screenshot]

EDIT 2: Using ArrayBufferWriter 55% worse

[benchmark screenshot]

So far PooledBufferWriter has the least impact, but still by a high margin.


MiloszKrajewski avatar MiloszKrajewski commented on August 17, 2024

Seems like the streaming overhead is too much. What about another gamble:

Purposely allocate less memory and gamble. Compression may fail in that case, so you will need to rerun it with a full-size buffer.

  • PRO: potential gain in optimistic scenarios
  • CON: twice as slow in pessimistic case

This is kind of what Pickler does: it does not allocate MaxOutputSize(sourceLength), just sourceLength, so when it fails we know the data cannot be compressed, and a raw copy is enough.

Another option is to create your own simplistic streaming:

Split your input into 64 kB blocks and compress them separately (sacrificing some compression ratio). The advantage is that the output buffer for each block can be the same pooled 64 kB byte buffer, and then you can store the result as a chain of byte arrays. This is, of course, some extra processing, so maybe in the end allocating one big block is the best way ;-)
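A sketch of the gamble; the key assumption here is that LZ4Codec.Encode signals an undersized target buffer with a non-positive return value (verify this against the library docs before relying on it):

```csharp
using System;
using System.Buffers;
using K4os.Compression.LZ4;

static class GamblingCompressor
{
    public static byte[] Compress(ReadOnlySpan<byte> source)
    {
        var pool = ArrayPool<byte>.Shared;
        // Optimistic bet: assume the data compresses far better than 4:1.
        var target = pool.Rent(Math.Max(1024, source.Length / 4));
        try
        {
            var written = LZ4Codec.Encode(source, target);
            if (written <= 0)
            {
                // Lost the bet: retry with a guaranteed-size buffer
                // (this is the "twice as slow in pessimistic case" cost).
                pool.Return(target);
                target = pool.Rent(LZ4Codec.MaximumOutputSize(source.Length));
                written = LZ4Codec.Encode(source, target);
            }
            return target.AsSpan(0, written).ToArray();
        }
        finally
        {
            pool.Return(target);
        }
    }
}
```

The `/ 4` ratio is illustrative; picking it well is exactly the "good ratio" problem discussed below.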


sn4k3 avatar sn4k3 commented on August 17, 2024

Purposely allocate less memory and gamble. Compression may fail in that case, so you will need to rerun it with a full-size buffer.

I thought about that, but I have not yet found a good ratio. Without extra processing I already have the bounding area and the positive pixel count; however, the compressed size can still vary with the pattern of the pixels.
I may end up storing just the bounding area instead of the whole bitmap. For example, on a 12K image where only a 100x100 region is used, I could store only that region, at the cost of a crop.

Example of a bounding area (green rect) in a 12K image:
[screenshot]
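The bounding-area idea could be sketched with a hypothetical helper like this, for a row-major 8-bit bitmap (names and layout are illustrative, not the actual CMat implementation):

```csharp
using System;

public static class BitmapCrop
{
    // Copy only the bounding rectangle of a row-major 8-bit bitmap into
    // a compact buffer, so only the used region gets compressed.
    public static byte[] CropToBounds(
        ReadOnlySpan<byte> pixels, int stride,
        int x, int y, int width, int height)
    {
        var cropped = new byte[width * height];
        for (var row = 0; row < height; row++)
            pixels.Slice((y + row) * stride + x, width)
                  .CopyTo(cropped.AsSpan(row * width, width));
        return cropped;
    }
}
```

Decompression then needs the rectangle (x, y, width, height) stored alongside the data to paste the region back into a blank canvas.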

This is kind of what Pickler does: it does not allocate MaxOutputSize(sourceLength), just sourceLength, so when it fails we know the data cannot be compressed, and a raw copy is enough.

In real scenarios I never expect buffers anywhere near sourceLength; I only saw > sourceLength when using nonsense images, like a full random-noise bitmap generated by code...

Before ArrayPool I used a plain array, and VS complained about huge allocations. I switched to ArrayPool, but I always suspected it would not help much because of this information:

ArrayPool of int type for simplicity purposes. The default ArrayPool is a shared instance that can handle managed ready-to-use arrays of integer type. This pool has a default max array length, equal to 2^20 (1024*1024 = 1,048,576) bytes.

Even a Full HD image exceeds that limit; however, when I started using ArrayPool the compression performance increased by a good margin, so it is doing something. Still, 32 rents of 12K buffers look insane to me, and I wonder how it works in the first place.
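For reference, the 2^20 figure quoted above is the default maxArrayLength for pools created with ArrayPool<T>.Create(); the Shared pool in recent .NET versions pools much larger arrays, which may be why renting still helped. If a dedicated pool is wanted anyway, it can be created with a larger maximum (sizes are illustrative):

```csharp
using System.Buffers;

// A dedicated pool whose buckets can hold full layer buffers, so large
// Rent() calls are actually served from the pool instead of allocating.
var layerPool = ArrayPool<byte>.Create(
    maxArrayLength: 64 * 1024 * 1024,
    maxArraysPerBucket: 4);

var buffer = layerPool.Rent(33_554_432); // the 8K layer size quoted above
try
{
    // ... fill and compress the buffer ...
}
finally
{
    layerPool.Return(buffer);
}
```

Note that Rent() requests above maxArrayLength do not fail; the pool simply falls back to a plain allocation, which silently defeats the pooling.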


Anyway, your library is awesome! I also included other compressors the user can opt for, but LZ4 proved to be the best.
Here is the chart:

[benchmark chart]

For deflate and gzip I'm using MemoryStream, but the source is an UnmanagedMemoryStream.


sn4k3 avatar sn4k3 commented on August 17, 2024

I ended up compressing only the usable area of the image, and that shows a superior performance gain and a smaller memory footprint at both compress and decompress time. I also set a threshold below which compression does not kick in, e.g. very small arrays don't need to be compressed.

Memory results:
[memory benchmark screenshot]

My implementation: CMat


MiloszKrajewski avatar MiloszKrajewski commented on August 17, 2024

This does not look right: https://github.com/sn4k3/UVtools/blob/2d542b02f451fac98ab23e3995820d176cd099a0/UVtools.Core/EmguCV/CMat.cs#L583

First, a hash code should depend only on things which are fast to calculate (if you really need the array content, take the first 16 and last 16 bytes, for example), and it should also depend only on immutable properties.
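The "first 16 and last 16 bytes" idea could look like this sketch (using HashCode.AddBytes, available since .NET 6). The deliberate trade-off is that buffers differing only in the middle collide, so Equals must still compare full content:

```csharp
using System;

public static class ContentHash
{
    // Hash only the length plus the first and last 16 bytes: fast even
    // for multi-megabyte buffers, at the cost of extra collisions for
    // buffers that differ only in the middle.
    public static int FastHash(ReadOnlySpan<byte> data)
    {
        var hash = new HashCode();
        hash.Add(data.Length);
        hash.AddBytes(data.Slice(0, Math.Min(16, data.Length)));
        hash.AddBytes(data.Slice(Math.Max(0, data.Length - 16)));
        return hash.ToHashCode();
    }
}
```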

class A { public int X; public override int GetHashCode() => X; public override bool Equals(object other) => other is A a && X == a.X; }

var a = new A() { X = 42 };
var s = new HashSet<A>();
var l = new List<A>();

s.Add(a);
l.Add(a);

l.Contains(a); // true
s.Contains(a); // true

a.X = 1337;

l.Contains(a); // still true
s.Contains(a); // false !!!


sn4k3 avatar sn4k3 commented on August 17, 2024

I admit that I used the default template when overriding Equals, which brings in all that GetHashCode code.
I understand that the hash code should be fast and use only immutable fields. However, nothing in this class is immutable.
Even though I never expect to use this class in a hash-based collection, I know it is still bad design.

I think I will just replace GetHashCode with return RuntimeHelpers.GetHashCode(this); so it returns a hash code based on the object's identity. What do you think?
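A minimal sketch of that change (the class name is illustrative, not the actual CMat): identity-based hashing keeps the object findable in a HashSet after mutation, matching the behavior discussed above:

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

class MutableMat
{
    public int V;

    // Identity-based hash: stable for the object's lifetime, no matter
    // how its fields change.
    public override int GetHashCode() => RuntimeHelpers.GetHashCode(this);

    // Identity-based equality to stay consistent with GetHashCode.
    public override bool Equals(object obj) => ReferenceEquals(this, obj);
}
```

With this, s.Contains(a) stays true after a.V changes, though two instances that merely "look" the same remain distinct, as in the earlier examples.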

