invertedtomato / packing
Library for encoding integers in the minimal number of bits. Includes VLQ and Elias Omega encoding.
License: MIT License
Works on Linux/Mac/Arm:
dotnet build
dotnet pack -c Release
dotnet test
dotnet run -c Release
Will also port the performance tests to BenchmarkDotNet (https://github.com/dotnet/BenchmarkDotNet) and target .NET Standard 2.0.
Consider replacing
public override void EncodeMany(IByteWriter stream, UInt64[] values, Int32 offset, Int32 count)
with
public override void EncodeMany<T>(T stream, UInt64[] values, Int32 offset, Int32 count)
where T:IByteWriter
This may improve performance: when T is a struct, the JIT compiles a specialized copy of the method and can call the writer without interface dispatch.
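A minimal sketch of the generic-constraint idea, under stated assumptions: the `IByteWriter` below is a hypothetical one-method stand-in (the library's real interface may differ), `ListByteWriter` and `GenericWriterSketch` are invented for illustration, and the loop body writes a fixed 8 bytes per value as a placeholder rather than the library's actual VLQ output.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's IByteWriter; the real interface may differ.
public interface IByteWriter
{
    void WriteByte(byte value);
}

// A struct writer. It wraps a List<byte> (a reference), so copies of the struct
// still append to the same output.
public struct ListByteWriter : IByteWriter
{
    private readonly List<byte> _target;

    public ListByteWriter(List<byte> target) { _target = target; }

    public void WriteByte(byte value) { _target.Add(value); }
}

public static class GenericWriterSketch
{
    // Because T is constrained rather than typed as the interface, passing a struct
    // does not box it, and the JIT can specialize this method for that struct.
    public static void EncodeMany<T>(T stream, ulong[] values, int offset, int count)
        where T : IByteWriter
    {
        for (int i = offset; i < offset + count; i++)
        {
            ulong v = values[i];
            // Placeholder payload: 8 little-endian bytes per value (not a real codec).
            for (int b = 0; b < 8; b++)
            {
                stream.WriteByte((byte)(v & 0xFF));
                v >>= 8;
            }
        }
    }
}
```

Whether this actually helps would need measuring; with a class-typed writer the JIT shares one copy of the method across reference types, so the benefit is mostly expected for struct writers.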
https://github.com/invertedtomato/integer-compression#elias-omega
The README states that Elias Omega is good for values under 8, but on the graph Fibonacci seems better.
Would you accept a test showing when one codec is strictly better than another? I would like to compare Fibonacci, Thompson and Elias Omega. For example, when a game defines limits for its values, I could pick a different codec depending on those limits.
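One way such a comparison could start is by computing code lengths directly. The sketch below uses the textbook definitions of the Fibonacci and Elias Omega codes for integers starting at 1; it is not taken from the library, which may offset values or lay out bits differently, and the class and method names are hypothetical.

```csharp
using System;

public static class CodeLengthSketch
{
    // Bits in the textbook Fibonacci code of n (n >= 1), including the terminating 1.
    public static int FibonacciBits(uint n)
    {
        if (n == 0) throw new ArgumentOutOfRangeException(nameof(n));
        ulong a = 1, b = 2; // consecutive Fibonacci numbers 1, 2, 3, 5, ...
        int bits = 2;       // n == 1 encodes as "11"
        while (b <= n)
        {
            ulong next = a + b; // ulong cannot overflow here because n fits in 32 bits
            a = b;
            b = next;
            bits++;
        }
        return bits;
    }

    // Bits in the textbook Elias Omega code of n (n >= 1), including the terminating 0.
    public static int EliasOmegaBits(uint n)
    {
        if (n == 0) throw new ArgumentOutOfRangeException(nameof(n));
        int bits = 1; // terminating "0"
        while (n > 1)
        {
            int len = BitLength(n);
            bits += len;          // this group is the binary form of n
            n = (uint)(len - 1);  // next group encodes the length minus one
        }
        return bits;
    }

    // Number of significant bits in n.
    private static int BitLength(uint n)
    {
        int len = 0;
        while (n != 0) { len++; n >>= 1; }
        return len;
    }
}
```

Looping both functions over the value range a game allows gives the crossover points between the two codes; a Thompson column would need that codec's definition added in the same way.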
Hi, I've just started using your library and, as you said, Fibonacci is great! I've got a 30% improvement over my VarInt implementation.
My concern is about the codec API.
The IUnsignedCompressor and IUnsignedDecompressor are very clean and straightforward, but they rely on System.IO.Stream, which has a lot of internal overhead and is very slow for many operations.
For serialization and deserialization, in my experience, it is faster to just read or write the full binary blob of a file and do the encoding/decoding in memory. Yes, there's MemoryStream, but in the end it's just an extra layer over a plain Byte[] array. Additionally, there are now a lot of new toys in C# like ArraySegment or even Span that allow for very fast array processing.
Lately, I've been changing my serialization APIs to look from this:
void Write(Stream s, int v);
int Read(Stream s);
to something like this:
void Write(IList<Byte> bytes, int v);
ArraySegment<Byte> Read(ArraySegment<Byte> ptr, out int v); // returns the advanced ptr
When writing, I write to a List, which is much easier to manipulate than a Stream: for example, I can write some bytes and then edit the header with the byte length. After that it's easy to copy the whole list to an array.
For reading, I read all the bytes into a plain Byte[] array and use ArraySegment as a sort of read pointer.
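The list-in, segment-out shape described above could look roughly like this. It is only a sketch: it uses a plain 7-bit varint as the payload (not one of the library's codecs), takes `uint` instead of `int` to sidestep sign handling, and `VarIntSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class VarIntSketch
{
    // Appends v as a 7-bit varint: low 7 bits first, high bit set on every byte but the last.
    public static void Write(IList<byte> bytes, uint v)
    {
        while (v >= 0x80)
        {
            bytes.Add((byte)((v & 0x7F) | 0x80));
            v >>= 7;
        }
        bytes.Add((byte)v);
    }

    // Reads one varint from the start of ptr and returns the segment advanced past it.
    public static ArraySegment<byte> Read(ArraySegment<byte> ptr, out uint v)
    {
        v = 0;
        int shift = 0;
        int i = 0;
        while (true)
        {
            if (i >= ptr.Count) throw new ArgumentException("Truncated input.", nameof(ptr));
            if (shift > 28) throw new FormatException("Varint is too long for 32 bits.");
            byte b = ptr.Array[ptr.Offset + i];
            i++;
            v |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // last byte of this value
            shift += 7;
        }
        return new ArraySegment<byte>(ptr.Array, ptr.Offset + i, ptr.Count - i);
    }
}
```

A caller would build a `List<byte>` with repeated `Write` calls, copy it out with `ToArray()`, wrap the array in an `ArraySegment<byte>`, and keep feeding the returned segment back into `Read` until its `Count` reaches zero.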
But if that's too extreme, maybe something like this could do:
interface IStream
{
void WriteByte(Byte value);
Byte ReadByte();
}
This would replace System.IO.Stream, so developers could roll their own reading/writing mechanisms.
As I understand it, I can encode a single number via Fibonacci? Could you split that into a separate class? I would like any encoder that does not rely on the range of values to be usable without depending on a custom writer/reader/buffer/stream.
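To illustrate what a stream-free, single-value entry point might look like, here is a sketch of the textbook Fibonacci code for one integer, returned as bits packed into a `ulong`. The class name, the method signatures and the least-significant-bit-first layout are assumptions made for this example, not the library's API or wire format, and the sketch starts at 1 rather than 0.

```csharp
using System;
using System.Collections.Generic;

public static class FibonacciSketch
{
    // Encodes n (n >= 1). Bit i of the result stands for the i-th number in 1, 2, 3, 5, ...
    // and the highest used bit is the extra terminating 1.
    public static ulong Encode(uint n, out int bitCount)
    {
        if (n == 0) throw new ArgumentOutOfRangeException(nameof(n), "This sketch starts at 1.");

        // Build 1, 2, 3, 5, ... until the last entry exceeds n.
        var fib = new List<ulong> { 1, 2 };
        while (fib[fib.Count - 1] <= n)
        {
            fib.Add(fib[fib.Count - 1] + fib[fib.Count - 2]);
        }

        int top = fib.Count - 2; // index of the largest Fibonacci number <= n
        ulong bits = 0;
        ulong rest = n;
        for (int i = top; i >= 0; i--)
        {
            // Greedy choice yields a representation with no two adjacent 1 bits.
            if (fib[i] <= rest)
            {
                bits |= 1UL << i;
                rest -= fib[i];
            }
        }

        bits |= 1UL << (top + 1); // terminator: forms the only "11" pair
        bitCount = top + 2;
        return bits;
    }

    // Decodes a value produced by Encode.
    public static uint Decode(ulong bits, int bitCount)
    {
        ulong a = 1, b = 2, sum = 0;
        for (int i = 0; i < bitCount - 1; i++) // skip the terminator bit
        {
            if (((bits >> i) & 1UL) != 0) sum += a;
            ulong next = a + b;
            a = b;
            b = next;
        }
        return (uint)sum;
    }
}
```

A 32-bit input needs at most 47 bits here, so the result always fits in the returned `ulong`; how the caller then packs those bits into bytes is left open, which is the point of separating the codec from the writer.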
Please migrate to a live package or to the source code:
https://github.com/invertedtomato/buffers/blob/master/Library/IO/Bits/BitOperation.cs
Maybe reuse some from
as an internal copy-paste
As of now, Fibonacci throws on int64.max. It is possible to make it not throw, which would reduce performance a little but make it type safe, i.e. by handling the +1 that falls outside the range of the number.
But that reduction would be about the same as the one already incurred by error handling: https://gitlab.com/dzmitry-lahoda/dotnet-system-except
An alternative is to introduce an int63 type (or similar), but that would also reduce performance, maybe substantially, and would intrude into users' APIs. I have tested this for floats: https://gitlab.com/dotnet-fun/dotnet-system-numerics-algebra/blob/master/benchmarks/UnsafeRefFloat.cs
I am trying to replace 7-bit encoding with Fibonacci for u8 and u16 while avoiding surprises.
So maybe avoiding these in integer-compression
could be good too.