serializertests's People

Contributors

alois-xx, aloiskraus, anderssonpeter, dbolin, itadapter, jamescourtney, lbargaoanu, neuecc, salarcode, silkfire

serializertests's Issues

New serializer

I love your benchmark and have used it in the past to select the best serializer for a project. I have been frustrated with the limitations of other serializers so I wrote my own over the past weekend.

https://github.com/Byrne-Labs/Serializer
https://www.nuget.org/packages/ByrneLabs.Serializer

Advantages:

  • Requires absolutely no decorations or markup of any kind
  • Does not require default constructors
  • Supports references
  • Supports polymorphism
  • Supports interfaces
  • Supports multi-dimensional arrays

Disadvantages:

  • Slower than just about everything on your benchmark
  • Large output
  • Questionable quality (like I said, I just wrote it over the weekend)

Feedback regarding the test suite - ideas how to improve it

  • It would be much better if BenchmarkDotNet were used to perform the measurements and produce the comparison charts. Currently I see too high a deviation between test runs, which in my experience could be eliminated by using BenchmarkDotNet. It helps to produce consistent results with various statistical information, and it has many more features that might help with test bootstrapping (runtime selection, including Mono), preparation (warmup), operation (such as measuring the first run and the repeated performance separately, without any extra work or copy-pasted code), processing the measurement results, and rendering them in a chart automatically. I know this is a big thing to ask (considering how everything is already designed and implemented), but it could make this test suite perfect!

  • The current test suite is very light and too specific: just two classes, one of which contains a single list of instances of the other, and 95% of the data volume is strings. You're mostly testing throughput (which is also an important metric), but a real-world scenario might include many more kinds of data.
    Also, I don't feel it's fair to compare tree serializers with full object graph serializers (including the .NET BinaryFormatter, which led you to write the test suite and article!), as the use cases are very different.
    I would suggest making two (or maybe even three, as not all full object graph serializers are full-featured) separate groups of tests for these two kinds of serializers (though object graph serializers can continue participating in the tests for tree serializers).
    The current test case needs to be updated: the same test as now (bookshelf-book), but with more kinds of data. Consider adding the following fields to the Book class:

    • List of tags (let's say List<BookTag>, where BookTag could be an enum with a few values, just for the sake of diversity)
    • Unique book code (a 64-bit unsigned integer (ulong) might be a good idea; I don't feel using a GUID is a good idea, as it is itself very slow). And let's make it a readonly field, because that's pretty often the case!
    • "Condition" metric (double, could be in the range 0-1, where 1 means "new" and 0 means "unreadable", just for example)
    • Date of issue (DateTime)
    • Time since last read (TimeSpan). And as with the unique book code, let's make this a readonly property.
    • Dictionary of readers/owners and how many times each of them read the book (Dictionary<Owner, int>), populated only for, say, 15% of books, with up to 5 owners per book. Null if there are no owners (or instantiated as empty if a particular serializer doesn't support serialization of null objects). Owner might be a simple class with a uint ID and a string Name; the total size of the owners collection could be just 10 owners. I believe this will make the test much less biased towards the tree binary serializers, as object graph serializers can reuse Owner instances where tree serializers will simply write the same objects again and again, wasting performance and unnecessarily increasing the output size. Arguably, it's a real-world scenario.

    I propose making a second test specifically for object graph serializers, with many more references (though I'm not sure about circular references, as they might break compatibility with too many binary serializers from the list) and full verification that the object references are indeed restored (so we can be sure that a serializer is properly configured and doesn't produce clones where reference tracking is required). I don't have any good ideas yet, but we could invent something based on the bookshelf-book-owner case: say, add cross-references between an owner and the books she has ever read (though it will be hard to avoid cyclic references in that case).
    Another important thing is full C# support, including generics and null handling. I would prefer to see a third test case (or drop the proposed second test case in favor of this one!) used only for serializers that can replace BinaryFormatter in 99.9% of cases: generics support, various collection types, fields of type object (and boxed values), proper null handling, tuples (and ValueTuples), inheritance and interface support (polymorphic serialization, ensuring that private members of parent classes are properly (de)serialized), cyclic references, structs with readonly fields, and other things that are pretty common for every experienced .NET developer. No delegates or comparers (such as those stored in Dictionary and HashSet), though: AFAIK, there are no alternatives to BinaryFormatter that can handle those. For everything else, at least Wire/Hyperion, AqlaSerializer, and my own serializer will satisfy all these requirements, and it will be very interesting to compare them against each other and against BinaryFormatter.

  • The article formatting is not good in a few places, especially the serializer feature comparison table and the examples of serializer output. Maybe it's because I'm using Firefox, but reading the "Default Serializer Type" column is impossible for me: it's 1 character wide!

  • Again, about the serializer comparison table: consider adding a "supports C# generics" column and a "custom requirements" column (for example, "requires decoration of types and fields with its own custom attributes", "all fields/properties must be non-readonly"). For people who are looking for a replacement for BinaryFormatter, it will be very helpful!
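
To make the proposed Book fields concrete, here is a minimal sketch of what the extended test types and their data generation could look like. Everything in it is an assumption for illustration: the BookTag values, the member names, the TestDataFactory class, and the fixed random seed are hypothetical and are not taken from the SerializerTests repository.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types: names and enum values are illustrative only.
public enum BookTag { Fiction, Science, History, Children }

public sealed class Owner
{
    public uint Id { get; set; }
    public string Name { get; set; } = "";
}

public sealed class Book
{
    // Readonly field: serializers that only handle settable properties will skip it.
    public readonly ulong Code;

    public string Title { get; set; } = "";
    public List<BookTag> Tags { get; set; } = new List<BookTag>();
    public double Condition { get; set; }        // 0 = unreadable, 1 = new
    public DateTime IssueDate { get; set; }
    public TimeSpan TimeSinceLastRead { get; }   // get-only (readonly) property

    // Owner -> number of times read; left null when the book has no owners.
    public Dictionary<Owner, int> Readers { get; set; }

    public Book(ulong code, TimeSpan timeSinceLastRead)
    {
        Code = code;
        TimeSinceLastRead = timeSinceLastRead;
    }
}

public static class TestDataFactory
{
    public static List<Book> Create(int count, int seed = 42)
    {
        var rnd = new Random(seed);

        // Shared pool of 10 owners, so the same instances recur across books.
        var owners = new List<Owner>();
        for (uint i = 0; i < 10; i++)
            owners.Add(new Owner { Id = i, Name = "Owner " + i });

        var books = new List<Book>(count);
        for (int i = 0; i < count; i++)
        {
            var book = new Book((ulong)i + 1, TimeSpan.FromDays(rnd.Next(0, 365)))
            {
                Title = "Book " + i,
                Condition = rnd.NextDouble(),
                IssueDate = new DateTime(2000, 1, 1).AddDays(rnd.Next(0, 7000)),
            };
            book.Tags.Add((BookTag)rnd.Next(0, 4));

            if (rnd.NextDouble() < 0.15)          // roughly 15% of books get owners
            {
                book.Readers = new Dictionary<Owner, int>();
                int ownerCount = rnd.Next(1, 6);  // 1 to 5 picks; duplicates overwrite
                for (int k = 0; k < ownerCount; k++)
                    book.Readers[owners[rnd.Next(owners.Count)]] = rnd.Next(1, 10);
            }
            books.Add(book);
        }
        return books;
    }
}
```

Because every book draws its owners from the same pool of 10 instances, a reference-tracking serializer can write each Owner once, while a tree serializer has to repeat it for every occurrence.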

Regards!

UPD. I understand that the initial intention was partly to demonstrate how slowly BinaryFormatter performed when the object count exceeded a certain threshold. But this test case will soon be irrelevant (thanks to your report!) except for legacy cases. You can keep the current test case, but I think developers will be more interested in a more general test case for comparing serializers.

Another benefit of BenchmarkDotNet will be memory allocation measurements and GC count. I can definitely say that it's possible to write a C# binary serializer which doesn't perform (almost) any allocations during (de)serialization (other than allocation of the instantiated objects themselves during the deserialization) as this is what I've already done. Some allocations are required for any object graph serializer (to track references, to buffer chars, etc) but these allocations might be done in a session object which could be reused for future calls of the (de)serialize method and with enough capacity re-allocations are not required.
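
As a rough illustration of the BenchmarkDotNet setup discussed here, the sketch below measures one serializer with allocation and GC statistics enabled. It assumes the BenchmarkDotNet NuGet package and .NET 6 or later (for the synchronous stream overload of JsonSerializer.Serialize). The Book and BookShelf types, the benchmark class name, the Params values, and the stream capacity are stand-ins, not the types or settings used by the test suite.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Stand-in types for illustration; the real test types live in the repository.
public class Book
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}

public class BookShelf
{
    public List<Book> Books { get; set; } = new List<Book>();
}

[MemoryDiagnoser] // adds allocated bytes and GC collection counts to the report
public class SerializeBenchmarks
{
    [Params(1_000, 100_000)]
    public int BookCount;

    private BookShelf _shelf = new BookShelf();
    private MemoryStream _stream = new MemoryStream();

    [GlobalSetup]
    public void Setup()
    {
        // Test data is built once, outside the measured code.
        _shelf = new BookShelf();
        for (int i = 0; i < BookCount; i++)
            _shelf.Books.Add(new Book { Id = i, Title = "Book " + i });

        // Preallocated so the stream does not grow during measurement (size is a guess).
        _stream = new MemoryStream(16 * 1024 * 1024);
    }

    [Benchmark]
    public long SystemTextJsonSerialize()
    {
        _stream.SetLength(0); // rewind and reuse the same buffer
        JsonSerializer.Serialize(_stream, _shelf);
        return _stream.Length; // returning a value keeps the work from being optimized away
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<SerializeBenchmarks>();
}
```

BenchmarkDotNet handles warmup, iteration counts, and statistics on its own, so the benchmark method only needs to contain the operation being measured.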

System.Text.Json source generators

Maybe the test should use those.
I've tried it on .NET 7 and didn't seem to get better results. Better startup, yes, but other than a regression for 200000 objects, nothing notable.
Perhaps I've missed something.
I do see a significant improvement in the test here. Maybe the test object is too small, it only has a few properties.
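
For reference, opting into the System.Text.Json source generator looks roughly like the sketch below (it assumes .NET 6 or later). Book and BookShelf are stand-in types with only a few properties, and BookShelfJsonContext and SourceGenDemo are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Stand-in types for illustration.
public class Book
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}

public class BookShelf
{
    public List<Book> Books { get; set; } = new List<Book>();
}

// The source generator completes this partial class at compile time.
[JsonSerializable(typeof(BookShelf))]
internal partial class BookShelfJsonContext : JsonSerializerContext
{
}

public static class SourceGenDemo
{
    public static BookShelf RoundTrip(BookShelf shelf)
    {
        // Passing the generated type info selects the source-generated metadata
        // instead of the reflection-based path.
        byte[] utf8 = JsonSerializer.SerializeToUtf8Bytes(shelf, BookShelfJsonContext.Default.BookShelf);

        return JsonSerializer.Deserialize(utf8, BookShelfJsonContext.Default.BookShelf)
               ?? throw new InvalidOperationException("Payload deserialized to null.");
    }
}
```

The main documented benefit is avoiding reflection at startup, which would be consistent with seeing better startup but little change in steady-state throughput.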

New benchmark for January 2020

Would you be able to write another blog post summarizing the benchmarks that have been added since September 2019? It would be really useful.

Thank You + HyperSerializer

Many thanks for the effort to provide value to the community. Your work saved me a ton of time making a build vs NuGet decision for a recent project. On your next iteration, you might want to kick the tires on HyperSerializer...

https://github.com/Hyperlnq/HyperSerializer

Again, much gratitude for the thoughtful and comprehensive study.

Sincerely

Adam

A few flaws

  1. TestData is generated inside the time span measured by the stopwatch. I'm not sure, but it seems this might favor the later serializers in the testing order. We want to compare the performance of the serializers, and counting the generation of the test data (which is very slow and heavy because of the string generation) towards the serialization duration is a serious flaw. It would be better if the test data were pre-generated and simply passed into the delegate. For example:
var testData = CreateTestData(); // no TestData property!
var times = Test(nTimes, () =>
{
    var dataStream = GetMemoryStream();
    TestSerializeOnly(dataStream, testData);
});
  2. MemoryStream should be created with a reasonable amount of memory preallocated in all cases (so that no reallocation happens every time the stream exceeds its capacity; otherwise later serializer runs will produce favored results).

  3. No GC is performed before each test run, but a lot of artifacts from the previous serializer's test run are still allocated in the gen 0, gen 1, and even gen 2 heaps (generated IL code). In some cases this can lead to a sudden garbage collection and dramatically affect the results (in my case, up to a 30% increase in serialization/deserialization duration, making the comparison of serializers totally invalid). Using BenchmarkDotNet to perform the test runs can handle this issue perfectly (#5). For now, I would suggest calling this method before each stopwatch start:

// Requires: using System; using System.Runtime;
private void CollectGC()
{
    GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;

    // just to ensure complete GC
    for (var i = 0; i < 3; i++)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}

I've made these changes locally (except #2), and the result is a much smaller total time and much better consistency between test suite launches.
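
Put together, the three suggestions could be combined along these lines. This is only a sketch under stated assumptions: TimingHarness, Measure, and SerializeOnly are hypothetical names, SerializeOnly is a trivial stand-in for a real serializer call, and the stream capacity is whatever the caller considers reasonable.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Runtime;
using System.Text;

public static class TimingHarness
{
    // Stand-in for a real serializer: writes each string as UTF-8 bytes.
    private static void SerializeOnly(Stream target, List<string> data)
    {
        foreach (var s in data)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(s);
            target.Write(bytes, 0, bytes.Length);
        }
    }

    private static void CollectGC()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;

        // Repeated to make sure finalizable objects are collected as well.
        for (var i = 0; i < 3; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }

    // testData is created by the caller, so generating it is never timed.
    public static List<TimeSpan> Measure(int nTimes, List<string> testData, int streamCapacity)
    {
        // One preallocated stream, reused for every run.
        var stream = new MemoryStream(streamCapacity);
        var times = new List<TimeSpan>(nTimes);

        for (int run = 0; run < nTimes; run++)
        {
            stream.SetLength(0); // rewind; the capacity is kept
            CollectGC();         // start each run from a clean heap

            var sw = Stopwatch.StartNew();
            SerializeOnly(stream, testData);
            sw.Stop();

            times.Add(sw.Elapsed);
        }
        return times;
    }
}
```

Only the serializer call sits between Stopwatch.StartNew and Stop, so data generation, stream growth, and leftover garbage from earlier runs stay out of the measurement as far as possible.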

Regards!

Can't build solution

Hello!
Congrats on making a great serializer test suite!
I'm very interested in integrating my binary serializer and comparing it with others.

I just pulled the repository, opened it with VS2017, restored the NuGet packages, but can't build it because of this conflict:
The type 'JSONParameters' exists in both 'fastJSON, Version=2.1.0.0, Culture=neutral, PublicKeyToken=6b75a806b86095cd' and 'fastJSON, Version=2.1.0.0, Culture=neutral, PublicKeyToken=null'

I temporarily commented out all the code related to fastJSON, but I would prefer to restore it for a full comparison and an eventual pull request adding my binary serializer.

Regards!

FlatBuffers serialization timing

Thank you for adding my serializers. I have read your article (the 2022 one, and the 2019 and 2018 ones before it).
By the way, FlatBuffers deserialization happens when each property is accessed, and serialization happens when the object is constructed.

In other words, it serializes here:
https://github.com/Alois-xx/SerializerTests/blob/master/Program.cs#L718

For example, BookFlat.EndBookFlat calls builder.EndTable();
https://github.com/Alois-xx/SerializerTests/blob/master/TypesToSerialize/BookFlat.cs#L31
https://github.com/google/flatbuffers/blob/master/net/FlatBuffers/FlatBufferBuilder.cs#L804
This is clearly the serialization process itself.

Therefore, shouldn't the benchmark serialization assemble BookShelfFlat here instead of ToFullArray?
https://github.com/Alois-xx/SerializerTests/blob/master/Serializers/FlatBuffer.cs#L23
