alois-xx / serializertests
.NET Serializer testing framework
Hello, when you get the time, would it be possible to add Apache Avro to your list of serializers, please? It has some nice dynamic typing features, and I'm curious to see how it stacks up against the competition.
https://avro.apache.org/docs/current/
Thanks!
I love your benchmark and have used it in the past to select the best serializer for a project. I have been frustrated with the limitations of other serializers so I wrote my own over the past weekend.
https://github.com/Byrne-Labs/Serializer
https://www.nuget.org/packages/ByrneLabs.Serializer
Advantages:
Disadvantages
It would be much better if BenchmarkDotNet were used to perform the measurements and produce the comparison charts. Currently I see too high a deviation between test run results, which, in my experience, could be eliminated by using BenchmarkDotNet. It helps to produce consistent results with various statistical information, and it has many more features which might help with test bootstrapping (runtime selection, including Mono), preparation (warmup), operation (such as measuring the first run and the repeated performance separately, without any extra work or code copy-pasting), and processing the measurement results and rendering them in a chart automatically. I know this is a big thing to ask (considering how everything is already designed and implemented), but it can make this test suite perfect!
The current test suite is very light and too specific: just two classes, one of which contains a single list of instances of the other, and 95% of the data volume is strings. You're mostly testing throughput (which is also an important metric), but a real-world scenario might include many more kinds of data.
Also, I don't feel it's fair to compare tree serializers with full object graph serializers (including .NET BinaryFormatter, which led you to write the test suite and the article!), as the use cases are very different.
I would suggest making two (or maybe even three, as not all full object graph serializers are full-featured) separate groups of tests for these two different kinds of serializers (though, object graph serializers can continue participating in tests for tree serializers).
The current test case needs to be updated: the same test as now (bookshelf-book), but with more kinds of data. Consider adding the following fields to the Book class:
- List<BookTag> (BookTag could be an enum with a few values, just for the sake of diversity)
- ulong for a unique book code (might be a good idea; I don't feel using GUID is a good idea, as it is itself very slow). And let's make it a readonly field, because that's pretty often the case!
- double (could be in the range 0-1, where 1 means "new" and 0 means "unreadable", just for example)
- DateTime
- TimeSpan. And as with the unique book code, let's make this a readonly property.
- an owner with uint ID and string Name; the total size of the owners collection could be just 10 owners. I believe it will make the test much less biased towards the tree binary serializers, as object graph serializers can reuse Owner instances where tree serializers will simply write the same objects again and again, wasting performance and unnecessarily increasing the output size. Arguably, it's a real-world scenario.
I propose to make a second test specifically for object graph serializers, with many more references (though I'm not sure about circular references, as they might break compatibility with too many binary serializers from the list) and full verification that the object references are indeed restored (so we can be sure that a serializer is properly configured and doesn't produce clones where reference tracking is required). I don't have any good ideas yet, but we can invent something based on the bookshelf-book-owner case: say, add cross-references between an owner and the books she has ever read (but it will be hard to avoid cyclic references in that case).
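The proposed Book additions could be sketched roughly as follows. This is only an illustration under assumptions: the issue specifies the types, not the member names, so Code, Tags, Condition, Published, Duration, Owner and the BookFactory helper are all hypothetical, and the existing Book members of the test suite are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum values, just for diversity.
public enum BookTag { Fiction, Science, History }

// Hypothetical owner type with the two proposed members.
public sealed class Owner
{
    public uint ID { get; set; }
    public string Name { get; set; } = "";
}

// Only the proposed additions are shown; all member names are assumptions.
public sealed class Book
{
    public readonly ulong Code;                       // unique book code as a readonly field
    public List<BookTag> Tags { get; set; } = new List<BookTag>();
    public double Condition { get; set; }             // 1 = "new", 0 = "unreadable"
    public DateTime Published { get; set; }
    public TimeSpan Duration { get; }                 // readonly property
    public Owner Owner { get; set; }

    public Book(ulong code, TimeSpan duration)
    {
        Code = code;
        Duration = duration;
    }
}

public static class BookFactory
{
    // Creates books that share a small pool of Owner instances, so that
    // reference-tracking serializers can reuse them.
    public static List<Book> Create(int bookCount, int ownerCount = 10)
    {
        var owners = new List<Owner>(ownerCount);
        for (int i = 0; i < ownerCount; i++)
        {
            owners.Add(new Owner { ID = (uint)i, Name = "Owner " + i });
        }

        var books = new List<Book>(bookCount);
        for (int i = 0; i < bookCount; i++)
        {
            var book = new Book((ulong)i, TimeSpan.FromHours(1 + i % 5))
            {
                Condition = (i % 11) / 10.0,
                Published = new DateTime(2000, 1, 1).AddDays(i),
                Owner = owners[i % ownerCount]
            };
            book.Tags.Add((BookTag)(i % 3));
            books.Add(book);
        }
        return books;
    }
}
```

With ten owners, every tenth book points at the same Owner instance, which is the sharing the proposal relies on to distinguish object graph serializers from tree serializers.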
Another important thing is full C# support, including generics and null handling. I would prefer to see a third test case (or drop the proposed second test case in favor of this!) which will be used only for serializers which can replace BinaryFormatter in 99.9% of cases: generics support, various collection types, support for fields of object type (and boxed values), proper null handling, tuples (and ValueTuples), inheritance/interface support (polymorphic serialization, ensuring that private members of parent classes are properly (de)serialized), cyclic references, structs with readonly fields, and other stuff that is pretty common for every experienced .NET developer. No delegates and comparers, though (such as those stored in Dictionary and HashSet): AFAIK, there are no alternatives to BinaryFormatter which can handle that. For everything else, at least Wire/Hyperion, AqlaSerializer and my own serializer will satisfy all these requirements, and it will be very interesting to compare them against each other and BinaryFormatter.
The article formatting is not good in a few places, especially the serializer feature comparison table and the examples of serializer output. Maybe it is because I'm using Firefox, but reading the "Default Serializer Type" column is impossible for me: it is one character wide!
Again, about the serializer comparison table: consider adding a "supports C# generics" column and a "custom requirements" column (for example, "requires decoration of types and fields with its own custom attributes", "all fields/properties must be non-readonly"). For people who are looking for a replacement for BinaryFormatter it will be very helpful!
Regards!
UPD. I understand that the initial intention was partly to demonstrate how slowly BinaryFormatter performed when the object count exceeded the defined threshold. But this test case will soon be irrelevant (thanks to your report!) except for legacy cases. You can keep the current test case, but I think developers will be more interested in a more general test case to compare serializers.
Another benefit of BenchmarkDotNet will be memory allocation measurements and GC counts. I can definitely say that it's possible to write a C# binary serializer which performs (almost) no allocations during (de)serialization (other than allocating the instantiated objects themselves during deserialization), as this is what I've already done. Some allocations are required for any object graph serializer (to track references, to buffer chars, etc.), but these allocations can be done in a session object which is reused for future calls of the (de)serialize method, and with enough capacity no re-allocations are required.
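Until such measurements come from BenchmarkDotNet, a rough approximation is possible with the runtime's own counters. The sketch below is an assumption-laden illustration, not part of the test suite: AllocationProbe and Measure are hypothetical names, and GC.GetAllocatedBytesForCurrentThread only counts allocations made on the calling thread.

```csharp
using System;

public static class AllocationProbe
{
    // Runs the action once and reports the bytes allocated on this thread
    // plus the number of collections per generation that happened meanwhile.
    public static (long AllocatedBytes, int Gen0, int Gen1, int Gen2) Measure(Action action)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);
        long before = GC.GetAllocatedBytesForCurrentThread();

        action();

        long allocated = GC.GetAllocatedBytesForCurrentThread() - before;
        return (allocated,
                GC.CollectionCount(0) - gen0,
                GC.CollectionCount(1) - gen1,
                GC.CollectionCount(2) - gen2);
    }
}
```

A serializer that reuses a session object should show an allocation figure close to the size of the deserialized objects themselves on repeated calls.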
This is the repository:
https://github.com/rikimaru0345/Ceras
The developer did his own benchmarks, but I would really love to see the results in your test suite.
Would you be able to add the SpanJson (https://github.com/Tornhoof/SpanJson) serializer to your test suite? It's supposed to be really fast.
Maybe the test should use those.
I've tried it on .NET 7 and didn't seem to get better results. Better startup, yes, but other than a regression for 200000 objects, nothing notable.
Perhaps I've missed something.
I do see a significant improvement in the test here. Maybe the test object is too small, it only has a few properties.
Would you be able to write another blog post summarizing the benchmarks that have been added since Sep 2019? It would be really useful.
Many thanks for the effort to provide value to the community. Your work saved me a ton of time making a build vs Nuget decision for a recent project. On your next iteration, you might want to kick the tires on HyperSerializer...
https://github.com/Hyperlnq/HyperSerializer
Again, much gratitude for the thoughtful and comprehensive study.
Sincerely
Adam
https://github.com/akkadotnet/Hyperion/ which is a fork of Wire, seems to be very well maintained.
var testData = CreateTestData(); // no TestData property!
var times = Test(nTimes, () =>
{
    var dataStream = GetMemoryStream();
    TestSerializeOnly(dataStream, testData);
});
The MemoryStream should be created with a reasonable amount of memory preallocated for all cases (so there is no memory reallocation every time the memory stream exceeds its capacity; otherwise later serializer runs get favorable results).
No GC is performed before each test run, but a lot of artifacts from the previous serializer's test run are still allocated in the Gen0/Gen1 and even Gen2 heaps (generated IL code). In some cases this might lead to a sudden garbage collection and dramatically affect the results (in my case up to a 30% increase in serialization/deserialization duration, making the comparison of serializers totally invalid). Using BenchmarkDotNet to perform the test runs can handle this issue perfectly (see #5). For now I would suggest calling this method before each stopwatch start:
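One way to preallocate could look like the sketch below. It is an assumption about how GetMemoryStream might be changed, not the suite's actual code: ReusableStream is a hypothetical wrapper, and the capacity is whatever upper bound fits the largest serialized payload.

```csharp
using System;
using System.IO;

public sealed class ReusableStream
{
    private readonly MemoryStream _stream;

    // The capacity is reserved once, up front, so that no serializer
    // pays for (or benefits from) buffer growth during a measured run.
    public ReusableStream(int capacity)
    {
        _stream = new MemoryStream(capacity);
    }

    // Returns the same stream, emptied but with its buffer kept.
    // Note: this breaks if a serializer disposes the stream it is given.
    public MemoryStream GetMemoryStream()
    {
        _stream.SetLength(0);   // truncates the content, keeps Capacity
        _stream.Position = 0;
        return _stream;
    }
}
```

Resetting the length instead of creating a new stream also keeps the buffer itself out of the allocation statistics of each run.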
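// requires: using System; using System.Runtime;
private void CollectGC()
{
    // also compact the large object heap on the next blocking collection
    GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
    // just to ensure complete GC
    for (var i = 0; i < 3; i++)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}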
I've made these changes locally (except #2), and the result is a much smaller total time and much better consistency between test suite launches.
Regards!
Hello!
Congrats on making a great serializer test suite!
I'm very interested in integrating my binary serializer and comparing it with others.
I just pulled the repository, opened it with VS2017, restored the NuGet packages, but can't build it because of this conflict:
The type 'JSONParameters' exists in both 'fastJSON, Version=2.1.0.0, Culture=neutral, PublicKeyToken=6b75a806b86095cd' and 'fastJSON, Version=2.1.0.0, Culture=neutral, PublicKeyToken=null'
I temporarily commented out all the code related to fastJSON, but would prefer to restore it for a full comparison and an eventual pull request adding my binary serializer.
Regards!
Thank you for adding my serializers. I read your article (the 2022 one, and the 2019 and 2018 ones before that).
By the way, FlatBuffers deserialization happens at the time each property is accessed, and serialization happens at the time the object is constructed.
In other words, it serializes here:
https://github.com/Alois-xx/SerializerTests/blob/master/Program.cs#L718
For example, BookFlat.EndBookFlat calls builder.EndTable();
https://github.com/Alois-xx/SerializerTests/blob/master/TypesToSerialize/BookFlat.cs#L31
https://github.com/google/flatbuffers/blob/master/net/FlatBuffers/FlatBufferBuilder.cs#L804
This is clearly the serialization process itself.
Therefore, shouldn't the benchmark serialization assemble BookShelfFlat here instead of ToFullArray?
https://github.com/Alois-xx/SerializerTests/blob/master/Serializers/FlatBuffer.cs#L23