stringbuilder's Introduction

StringBuilder

A fast and low allocation StringBuilder for .NET.

Getting Started

Install the package:

PM> Install-Package LinkDotNet.StringBuilder

Afterward, use the package as follows:

using LinkDotNet.StringBuilder; // Namespace of the package

ValueStringBuilder stringBuilder = new ValueStringBuilder();
stringBuilder.AppendLine("Hello World");

string result = stringBuilder.ToString();

There are also smaller helper functions, which enable you to use ValueStringBuilder without any instance:

using LinkDotNet.StringBuilder;

_ = ValueStringBuilder.Concat("Hello ", "World"); // "Hello World"
_ = ValueStringBuilder.Concat("Hello", 1, 2, 3, "!"); // "Hello123!"

What does it solve?

The .NET StringBuilder is an all-purpose type that fits a wide variety of needs. But sometimes, low allocation is key. Therefore I created the ValueStringBuilder. It is not a class but a ref struct that tries to allocate as little as possible. If you want to know how the ValueStringBuilder works and why it uses fewer allocations and is even faster, check out this blog post. The post goes into a bit more detail about how it works, using a simplistic version of the ValueStringBuilder.

What it doesn't solve!

The library is not meant as a general replacement for the StringBuilder shipped with .NET itself. You can head over to the documentation and read about the "Known limitations". The library works best for a small to medium number of strings (not several hundred thousand characters, even though it can still be faster and use fewer allocations). At any time, you can convert the ValueStringBuilder to a "normal" StringBuilder and vice versa.

The normal use case is to concatenate strings in a hot path where the goal is to put as little pressure on the GC as possible.

Documentation

More detailed documentation can be found here. It is really important to understand how the ValueStringBuilder works so that you do not run into weird situations where execution time or allocations actually increase.

Benchmark

The following table gives you a small comparison between the StringBuilder which is part of .NET and the ValueStringBuilder:

BenchmarkDotNet=v0.13.2, OS=macOS Monterey 12.6.1 (21G217) [Darwin 21.6.0]
Apple M1 Pro, 1 CPU, 10 logical and 10 physical cores
.NET SDK=7.0.100-rc.2.22477.23
  [Host]     : .NET 6.0.10 (6.0.1022.47605), Arm64 RyuJIT AdvSIMD
  DefaultJob : .NET 6.0.10 (6.0.1022.47605), Arm64 RyuJIT AdvSIMD


|                         Method |       Mean |    Error |   StdDev | Ratio | RatioSD |    Gen0 | Allocated | Alloc Ratio |
|------------------------------- |-----------:|---------:|---------:|------:|--------:|--------:|----------:|------------:|
|            DotNetStringBuilder |   227.3 ns |  1.31 ns |  1.22 ns |  1.00 |    0.00 |  0.7114 |    1488 B |        1.00 |
|             ValueStringBuilder |   128.7 ns |  0.57 ns |  0.53 ns |  0.57 |    0.00 |  0.2677 |     560 B |        0.38 |
| ValueStringBuilderPreAllocated |   113.9 ns |  0.67 ns |  0.60 ns |  0.50 |    0.00 |  0.2677 |     560 B |        0.38 |

For more comparisons, check the documentation.

Another benchmark shows that the ValueStringBuilder uses less memory when appending value types such as int or double:

|              Method |     Mean |    Error |   StdDev |  Gen 0 | Allocated |
|-------------------- |---------:|---------:|---------:|-------:|----------:|
| DotNetStringBuilder | 17.21 us | 0.622 us | 1.805 us | 1.5259 |      6 KB |
|  ValueStringBuilder | 16.24 us | 0.496 us | 1.462 us | 0.3357 |      1 KB |

Check out the benchmark for a more detailed comparison and setup.

stringbuilder's People

Contributors

akoken, dependabot[bot], github-actions[bot], linkdotnet

stringbuilder's Issues

DotNetStringBuilder is faster

Seems like you removed the DotNetStringBuilder bench, but after adding it back this is what I get:

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3593/23H2/2023Update/SunValley3)
AMD Ryzen 7 7800X3D, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.300
  [Host]   : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  ShortRun : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

Job=ShortRun  IterationCount=3  LaunchCount=1
WarmupCount=3

| Method              | Mean      | Error    | StdDev   | Gen0   | Gen1   | Allocated |
|-------------------- |----------:|---------:|---------:|-------:|-------:|----------:|
| ValueStringBuilder  | 176.37 ns | 62.05 ns | 3.401 ns | 0.0250 |      - |   1.23 KB |
| DotNetStringBuilder |  97.29 ns | 35.04 ns | 1.920 ns | 0.0315 | 0.0001 |   1.54 KB |

ZString benchmarks are wrong and misleading

You never dispose the ZString instance which means that the pooled buffer is never returned. This will dramatically impact performance.
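The general principle behind this report can be sketched with the BCL's ArrayPool<char>. This is only an analogy under stated assumptions: it does not use ZString and does not claim to mirror its internals, and PooledBufferDemo and BuildWithReturn are hypothetical names. The point is that a rented buffer has to be handed back (here in a finally block, in a disposable type typically via Dispose), otherwise every later rent has to allocate a fresh array.

```csharp
using System;
using System.Buffers;

// Hypothetical helper for illustration only; not part of ZString or this library.
public static class PooledBufferDemo
{
    public static string BuildWithReturn(string left, string right)
    {
        int length = left.Length + right.Length;

        // Rent may hand back an array larger than requested.
        char[] rented = ArrayPool<char>.Shared.Rent(length);
        try
        {
            left.CopyTo(0, rented, 0, left.Length);
            right.CopyTo(0, rented, left.Length, right.Length);
            return new string(rented, 0, length);
        }
        finally
        {
            // Without this call the buffer never goes back to the pool,
            // which is the situation the issue describes.
            ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

A benchmark that skips the return step measures the cost of repeatedly allocating new buffers rather than the cost of the pooled path.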

As an aside, is this library based on the dotnet/runtime ValueStringBuilder? It seems to be nearly the same implementation, just split up.

Is it worth renaming private variables?

Hi! I like your project! But I have a question: is it worth renaming the private variables?
I ask because the C# conventions recommend prefixing private fields with an underscore.

Use rope-like structure with `ref` fields

In the current setup, the ValueStringBuilder has one large array internally. While this is fine for most cases, it can get tricky either for very big buffers (>85kb) or if we have to grow the buffer a lot.

A different approach would be to use a rope-like structure with a fixed size (kind of like the real StringBuilder, which always has chunks of 8000 characters).

In contrast to System.Text.StringBuilder, one could try to always create new ropes when appending. So our ValueStringBuilder would be structured like a linked list. When appending a string, we add another instance of ValueStringBuilder to the current one.

The internal structure could look like this:

public ref struct ValueStringBuilder
{
    public ref ValueStringBuilder next;

    public void Append(ReadOnlySpan<char> value)
    {
        // Walk to the last node of the linked list.
        ref var nextInstance = ref next;
        while (!Unsafe.IsNullRef(ref nextInstance.next)) { nextInstance = ref nextInstance.next; }

        // Attach a new node holding the appended text.
        nextInstance.next = new ValueStringBuilder(value);
    }
}

C# 11 currently does not allow this, because a ref field cannot refer to a ref struct.

Curious about struct layout and packing

Hey, man. I was reading up on .NET struct memory layout, and I remembered something about your struct that I was curious about. Why do you declare your struct to have sequential layout while at the same time ordering the fields in a way that, with packing, causes it to have 4 bytes of wasted space on x64, especially since it's not intended to be used in an array?
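To make the question concrete, here is a minimal sketch of how field order affects size under sequential layout. The field names and types are assumptions chosen for illustration; they are not the actual fields of ValueStringBuilder, which this page does not show. Only blittable fields are used, since the runtime may ignore sequential layout in managed memory for structs that contain references.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical struct: an 8-byte field sandwiched between two 4-byte fields.
[StructLayout(LayoutKind.Sequential)]
public struct PaddedLayout
{
    public int First;   // 4 bytes, followed by padding so Second is 8-byte aligned
    public long Second; // 8 bytes
    public int Third;   // 4 bytes, followed by tail padding
}

// Same fields, with the widest field first so the two ints share one 8-byte slot.
[StructLayout(LayoutKind.Sequential)]
public struct CompactLayout
{
    public long Second;
    public int First;
    public int Third;
}

public static class LayoutDemo
{
    // Returns the managed sizes of both layouts in bytes.
    public static (int Padded, int Compact) Sizes() =>
        (Unsafe.SizeOf<PaddedLayout>(), Unsafe.SizeOf<CompactLayout>());
}
```

On a typical 64-bit runtime, LayoutDemo.Sizes() should report 24 bytes for PaddedLayout and 16 bytes for CompactLayout. Other platforms may align long differently, so it is worth measuring rather than assuming.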

Add `AppendFormat` methods

The System.Text.StringBuilder offers some AppendFormat methods that make life easier if you want to format your values:

var sb = new StringBuilder();
sb.AppendFormat("{0} + {1} = {2}", 1, 2, 3);

It would be nice to have the same set of methods for the ValueStringBuilder. One major difference would be that the ValueStringBuilder would offer non-boxed versions:

public ref struct ValueStringBuilder
{
    public void AppendFormat<T>(ReadOnlySpan<char> format, T arg1);
    public void AppendFormat<T1, T2>(ReadOnlySpan<char> format, T1 arg1, T2 arg2);
    public void AppendFormat<T1, T2, T3>(ReadOnlySpan<char> format, T1 arg1, T2 arg2, T3 arg3);
    // ...
}

We want to avoid boxing and unboxing as much as possible. Still, there could be a convenience overload that takes a params object[] as input (maybe under a different name to indicate its boxing/unboxing nature).
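A minimal sketch of how the single-argument overload could work is shown below. It is not the library's implementation: FormatSketchBuilder, its fields, and its growth strategy are hypothetical, it only understands the literal placeholder {0} (no escaped braces, alignment, or format specifiers), and it assumes .NET 6 or later for ISpanFormattable. Values that implement ISpanFormattable are written straight into the buffer. The type check followed by a cast mirrors a pattern the runtime uses to let the JIT avoid boxing value types, but that is an optimization, not a language guarantee.

```csharp
using System;

// Hypothetical sketch type; not part of LinkDotNet.StringBuilder.
public ref struct FormatSketchBuilder
{
    private Span<char> buffer;
    private int position;

    public FormatSketchBuilder(Span<char> initialBuffer)
    {
        buffer = initialBuffer;
        position = 0;
    }

    // Replaces every literal "{0}" in format with arg1.
    public void AppendFormat<T>(ReadOnlySpan<char> format, T arg1)
    {
        while (true)
        {
            int index = format.IndexOf("{0}".AsSpan(), StringComparison.Ordinal);
            if (index < 0)
            {
                Append(format); // no placeholder left, copy the rest
                return;
            }

            Append(format.Slice(0, index));
            AppendValue(arg1);
            format = format.Slice(index + 3); // skip past "{0}"
        }
    }

    public override string ToString() => buffer.Slice(0, position).ToString();

    private void Append(ReadOnlySpan<char> text)
    {
        while (text.Length > buffer.Length - position) Grow();
        text.CopyTo(buffer.Slice(position));
        position += text.Length;
    }

    private void AppendValue<T>(T value)
    {
        if (value is ISpanFormattable)
        {
            int written;
            // Format directly into the free part of the buffer, growing on failure.
            while (!((ISpanFormattable)value).TryFormat(buffer.Slice(position), out written, default, null))
            {
                Grow();
            }
            position += written;
            return;
        }

        // Fallback for types without span formatting (for example string).
        Append(value is null ? string.Empty : value.ToString());
    }

    private void Grow()
    {
        // Simplification: a real builder would more likely rent from a pool.
        var bigger = new char[Math.Max(16, buffer.Length * 2)];
        buffer.Slice(0, position).CopyTo(bigger);
        buffer = bigger;
    }
}

public static class FormatSketchDemo
{
    public static string Run()
    {
        Span<char> initial = stackalloc char[8]; // deliberately small to exercise Grow
        var builder = new FormatSketchBuilder(initial);
        builder.AppendFormat("{0} items in the cart", 42);
        return builder.ToString();
    }
}
```

FormatSketchDemo.Run() returns "42 items in the cart". The two- and three-argument overloads would follow the same shape, with an index parsed from the placeholder instead of the hard-coded {0}.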
