base64's Introduction

Note

This project is WIP, as hardware intrinsics in .NET Core are also WIP.

gfoidl.Base64

A .NET library for base64 encoding / decoding, as well as base64Url support. Encoding can be done to buffers of type byte (for UTF-8) or char. Decoding can read from buffers of type byte (for UTF-8) or char.

Encoding / decoding supports buffer-chains, for example for very large data or when the data arrives in chunks.

In .NET Core 2.1+ encoding / decoding is done with SIMD-support:

Framework            scalar   SSSE3   AVX2
.NET Core 3.0        ✔️       ✔️      ✔️
.NET Core 2.1+       ✔️       ✔️      -
.NET Standard 2.0    ✔️       -       -

If available, AVX2 code will "eat" up as much of the input as possible, then SSSE3 code will "eat" up as much as possible of what remains, and finally scalar code processes the rest (including padding).
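As a rough illustration of this cascade, the sketch below splits an input length into the portions each tier would handle. The block sizes (24 and 12 input bytes per iteration) and the names TieredSplit and Split are assumptions made for the example, not the library's actual constants or API.

```csharp
using System;

public static class TieredSplit
{
    // Hypothetical per-iteration input sizes; the library's real constants may differ.
    private const int WideBlock   = 24; // stands in for the widest vector path
    private const int NarrowBlock = 12; // stands in for the narrower vector path

    // Returns how many input bytes each tier would process for a given length.
    public static (int wide, int narrow, int scalar) Split(int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));

        int wide   = length / WideBlock * WideBlock;     // whole wide blocks first
        int rest   = length - wide;
        int narrow = rest / NarrowBlock * NarrowBlock;   // then whole narrow blocks
        int scalar = rest - narrow;                      // the tail, including padding

        return (wide, narrow, scalar);
    }
}
```

With these assumed sizes, an input of 1000 bytes is split into 984 bytes for the wide tier, 12 bytes for the narrow tier, and 4 bytes for the scalar tail.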

Note: SIMD-support (with HW-intrinsics) is officially supported for .NET Core 3.0 onwards. Hence SIMD-support for .NET Core 2.1+ is not official, and based on an experimental (but tested) solution. Further note that future versions of the JIT may not compile these bits anymore. So use this library with caution in a .NET Core 2.1+ project.

Usage

The entry point to the encoder / decoder is Base64.Default for base64, and Base64.Url for base64Url.

Encoding

byte[] guid = Guid.NewGuid().ToByteArray();

string guidBase64 = Base64.Default.Encode(guid);
string guidBase64Url = Base64.Url.Encode(guid);

or Span<byte> based (for UTF-8 encoded output):

int guidBase64EncodedLength = Base64.Default.GetEncodedLength(guid.Length);
Span<byte> guidBase64UTF8   = stackalloc byte[guidBase64EncodedLength];
OperationStatus status      = Base64.Default.Encode(guid, guidBase64UTF8, out int consumed, out int written);

int guidBase64UrlEncodedLength = Base64.Url.GetEncodedLength(guid.Length);
Span<byte> guidBase64UrlUTF8   = stackalloc byte[guidBase64UrlEncodedLength];
status                         = Base64.Url.Encode(guid, guidBase64UrlUTF8, out consumed, out written);

Decoding

Guid guid = Guid.NewGuid();

string guidBase64    = Convert.ToBase64String(guid.ToByteArray());
string guidBase64Url = guidBase64.Replace('+', '-').Replace('/', '_').TrimEnd('=');

byte[] guidBase64Decoded    = Base64.Default.Decode(guidBase64);
byte[] guidBase64UrlDecoded = Base64.Url.Decode(guidBase64Url);

or Span<char> based:

int guidBase64DecodedLen    = Base64.Default.GetDecodedLength(guidBase64);
int guidBase64UrlDecodedLen = Base64.Url.GetDecodedLength(guidBase64Url);

Span<byte> guidBase64DecodedBuffer    = stackalloc byte[guidBase64DecodedLen];
Span<byte> guidBase64UrlDecodedBuffer = stackalloc byte[guidBase64UrlDecodedLen];

OperationStatus status = Base64.Default.Decode(guidBase64, guidBase64DecodedBuffer, out int consumed, out int written);
status                 = Base64.Url.Decode(guidBase64Url, guidBase64UrlDecodedBuffer, out consumed, out written);

Buffer chains

Buffer chains are handy for encoding / decoding when

  • the data is very large
  • the data arrives in chunks, e.g. by reading from a (buffered) stream / pipeline
  • the size of the data is initially unknown
  • ...

var rnd         = new Random();
Span<byte> data = new byte[1000];
rnd.NextBytes(data);

// The exact length could be computed with Base64.Default.GetEncodedLength; an excessive size is used here for the demo.
Span<char> base64 = new char[5000];

OperationStatus status = Base64.Default.Encode(data.Slice(0, 400), base64, out int consumed, out int written, isFinalBlock: false);
status                 = Base64.Default.Encode(data.Slice(consumed), base64.Slice(written), out consumed, out int written1, isFinalBlock: true);

base64 = base64.Slice(0, written + written1);

Span<byte> decoded = new byte[5000];
status             = Base64.Default.Decode(base64.Slice(0, 100), decoded, out consumed, out written, isFinalBlock: false);
status             = Base64.Default.Decode(base64.Slice(consumed), decoded.Slice(written), out consumed, out written1, isFinalBlock: true);

decoded = decoded.Slice(0, written + written1);

See demo for further examples.
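The snippet above feeds two hand-picked slices. When chunks come from a stream, the bytes that a non-final call leaves unconsumed have to be carried over into the next call. The sketch below shows that bookkeeping with System.Buffers.Text.Base64 from the BCL, which exposes a similar OperationStatus / consumed / written / isFinalBlock shape. That the gfoidl.Base64 calls can be dropped into the same loop is an assumption, and the class and method names are hypothetical.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.IO;
using System.Text;

public static class ChunkedEncodeDemo
{
    // Encodes a stream chunk by chunk and returns the base64 text.
    public static string EncodeStream(Stream input, int chunkSize = 4096)
    {
        if (chunkSize < 3) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        byte[] inBuf  = new byte[chunkSize];
        byte[] outBuf = new byte[Base64.GetMaxEncodedToUtf8Length(chunkSize)];
        var result    = new StringBuilder();
        int carried   = 0; // bytes left over from the previous call (0 to 2)

        while (true)
        {
            int read      = input.Read(inBuf, carried, inBuf.Length - carried);
            bool isFinal  = read == 0; // Read returns 0 only at the end of the stream
            int available = carried + read;

            OperationStatus status = Base64.EncodeToUtf8(
                inBuf.AsSpan(0, available), outBuf,
                out int consumed, out int written, isFinalBlock: isFinal);

            // NeedMoreData just means 1 or 2 bytes are waiting for the next chunk.
            if (status != OperationStatus.Done && status != OperationStatus.NeedMoreData)
                throw new InvalidOperationException($"Encoding failed: {status}");

            result.Append(Encoding.ASCII.GetString(outBuf, 0, written));

            // Move the unconsumed tail to the front of the input buffer.
            carried = available - consumed;
            inBuf.AsSpan(consumed, carried).CopyTo(inBuf);

            if (isFinal) return result.ToString();
        }
    }
}
```

Calling ChunkedEncodeDemo.EncodeStream(new MemoryStream(data), 7) should produce the same text as Convert.ToBase64String(data), even though the chunk size is not a multiple of 3.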

(Functional) Comparison to classes in .NET

General

.NET provides the classes System.Convert and System.Buffers.Text.Base64 for base64 operations.

base64Url isn't supported, so hacky solutions like

string base64 = Convert.ToBase64String(data);
string base64Url = base64.Replace('+', '-').Replace('/', '_').TrimEnd('=');

are needed. This isn't ideal, as there are avoidable allocations and several iterations over the encoded string (see here and here for benchmark results).
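The decoding direction needs the same kind of workaround with the built-in classes: the URL-safe characters have to be mapped back and the stripped padding restored. A minimal sketch, using only System.Convert; the class and method names are hypothetical.

```csharp
using System;

public static class Base64UrlViaConvert
{
    // Decodes base64Url text by rewriting it into standard base64 first.
    public static byte[] Decode(string base64Url)
    {
        if (base64Url == null) throw new ArgumentNullException(nameof(base64Url));

        // Each Replace allocates a new string, as does re-adding the padding.
        string base64 = base64Url.Replace('-', '+').Replace('_', '/');

        switch (base64.Length % 4)
        {
            case 2: base64 += "=="; break;
            case 3: base64 += "=";  break;
            case 1: throw new FormatException("Invalid base64Url length.");
        }

        return Convert.FromBase64String(base64);
    }
}
```

The extra string allocations in this round trip are the overhead the paragraph above refers to.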

gfoidl.Base64 supports encoding / decoding to / from base64Url in a direct way. Encoding byte[] -> byte[] for UTF-8 is supported, as well as byte[] -> char[]. Decoding byte[] -> byte[] for UTF-8 is supported, as well as char[] -> byte[].

Furthermore, SIMD isn't utilized in the .NET classes. (Note: I've opened an issue to add SIMD-support to these classes.)

Convert.ToBase64XYZ / Convert.FromBase64XYZ

These methods only support byte[] -> char[] as types for encoding, and char[] -> byte[] as types for decoding, where char[] can also be string or (ReadOnly)Span<char>.
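The span-based overloads avoid the intermediate string. A minimal sketch, assuming .NET Core 2.1 or later, where Convert.TryToBase64Chars and Convert.TryFromBase64Chars are available; the wrapper class and method are hypothetical.

```csharp
using System;

public static class ConvertSpanDemo
{
    // Encodes to chars and decodes back, returning true if the data survives.
    public static bool RoundTrip(ReadOnlySpan<byte> data)
    {
        // 4 output chars per 3 input bytes, rounded up to a whole group.
        Span<char> chars = new char[(data.Length + 2) / 3 * 4];
        if (!Convert.TryToBase64Chars(data, chars, out int charsWritten))
            return false;

        Span<byte> decoded = new byte[data.Length];
        if (!Convert.TryFromBase64Chars(chars.Slice(0, charsWritten), decoded, out int bytesWritten))
            return false;

        return decoded.Slice(0, bytesWritten).SequenceEqual(data);
    }
}
```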

To support UTF-8 another method call like

byte[] utf8Encoded = Encoding.ASCII.GetBytes(base64String);

is needed.

A potential advantage of this class is that it allows the insertion of line-breaks (cf. Base64FormattingOptions.InsertLineBreaks).
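For reference, a small sketch of that option using only System.Convert. With InsertLineBreaks the output is broken after every 76 characters; the helper name is hypothetical.

```csharp
using System;

public static class LineBreakDemo
{
    // Encodes with line breaks and returns the individual lines.
    public static string[] EncodeAsLines(byte[] data)
    {
        string wrapped = Convert.ToBase64String(data, Base64FormattingOptions.InsertLineBreaks);

        // Convert separates the lines with "\r\n".
        return wrapped.Split(new[] { "\r\n" }, StringSplitOptions.None);
    }
}
```

For 100 input bytes the encoded text is 136 characters long, so this yields one line of 76 characters followed by one of 60.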

System.Buffers.Text.Base64

This class only supports byte[] -> byte[] for encoding / decoding. So in order to get a string, Encoding has to be used.
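A minimal sketch of that two-step path, using only BCL calls; the class and method names are hypothetical.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Text;

public static class Utf8Base64Demo
{
    // Encodes to UTF-8 bytes first, then converts those bytes to a string.
    public static string EncodeToString(ReadOnlySpan<byte> data)
    {
        byte[] utf8 = new byte[Base64.GetMaxEncodedToUtf8Length(data.Length)];

        OperationStatus status = Base64.EncodeToUtf8(data, utf8, out _, out int written);
        if (status != OperationStatus.Done)
            throw new InvalidOperationException($"Encoding failed: {status}");

        // base64 output is pure ASCII, so ASCII decoding is sufficient.
        return Encoding.ASCII.GetString(utf8, 0, written);
    }
}
```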

A potential advantage of this class is the support for in-place encoding / decoding (cf. Base64.EncodeToUtf8InPlace, Base64.DecodeFromUtf8InPlace).
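A short sketch of the in-place calls, using only the BCL. The buffer must be large enough for the encoded output, with the raw data placed at its start; the wrapper names are hypothetical.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;

public static class InPlaceDemo
{
    // Encodes the first dataLength bytes of buffer in place; returns the encoded length.
    public static int EncodeInPlace(Span<byte> buffer, int dataLength)
    {
        OperationStatus status = Base64.EncodeToUtf8InPlace(buffer, dataLength, out int written);
        if (status != OperationStatus.Done)
            throw new InvalidOperationException($"Encoding failed: {status}");
        return written;
    }

    // Decodes UTF-8 base64 in place; returns the number of decoded bytes at the start of buffer.
    public static int DecodeInPlace(Span<byte> buffer)
    {
        OperationStatus status = Base64.DecodeFromUtf8InPlace(buffer, out int written);
        if (status != OperationStatus.Done)
            throw new InvalidOperationException($"Decoding failed: {status}");
        return written;
    }
}
```

A caller would typically allocate Base64.GetMaxEncodedToUtf8Length(data.Length) bytes, copy the data to the front, and then call EncodeInPlace on that buffer.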

Benchmarks

For all benchmarks see results.

The performance gain depends, among a lot of other things, on the workload size, so no table of headline results is shown here.

For small inputs, direct encoding to a string is slower than Convert.ToBase64String (which has less overhead and can write to the string buffer directly). But the larger the workload, the better this library performs. For a data length of 1000, the speedup can be ~4x with AVX2 encoding.

Direct decoding from a string is generally (a lot) faster than Convert.FromBase64CharArray. The gain again depends on the workload size; in the benchmarks the speedup ranges from 1.5x to 10x.

For UTF-8 encoding and decoding, speedups for an input length of 1000 can be in the range of 5x to 12x.

Note: please measure / profile in your real use case, as these are just micro-benchmarks.

Acknowledgements

The scalar version of the base64 encoding / decoding is based on System.Buffers.Text.Base64.

The scalar version of the base64Url encoding / decoding is based on dotnet/extensions#334 and dotnet/extensions#338.

Vectorized versions (SSE, AVX) for base64 encoding / decoding are based on https://github.com/aklomp/base64 (see also Acknowledgements in that repository).

Vectorized versions (SSE, AVX) for base64Url encoding / decoding are based on https://github.com/aklomp/base64 (see Acknowledgements in that repository). For decoding (SSE, AVX), the code is based on Vector lookup (pshufb) by Wojciech Mula.

base64's People

Contributors

gfoidl, martinnv6, ycrumeyrolle, daverayment
