Comments (28)
Agreed, though rather than just comparing it against C++, it would be great to compare against Protocol Buffers for Java and other similar solutions.
This would be a great thing for an external contributor to try out (hint hint :)
from flatbuffers.
I know this is ~11 months old, but are these benchmarks something that we can use today? I'm looking at porting them to C# so we can compare C++/Java/C#, and ideally add other languages too.
I was planning on writing some internal benchmarks comparing the Java implementations for FlatBuffers and Protobuf within the next few weeks.
If you give me access to the schema and maybe the source files you've used in http://google.github.io/flatbuffers/md__benchmarks.html, I may be able to provide the (non-Android) Java numbers.
I'll get you those files.
I've implemented a first version. Note that:
- I currently don't have any information about GC triggers, so the numbers (especially for encoding/decoding protobufs) may be worse in real-world use.
- FlatBuffers-Java still needs some optimization.
- The test data may not reflect your use case (!!!)
Edit: 4) The encode/decode numbers look a bit better than I'd expect. There may be some optimizations that I'm not properly accounting for.
So far the numbers (seconds for 1M operations, i.e. us per operation) look like the following:
OS: Windows 8.1
JDK: Oracle JDK 1.8.0_20-b26
CPU: Intel i5-3427U @ 1.80 GHz
Decode + Traverse
(FlatBuf direct) 0 + 0.639 = 0.639 us
(FlatBuf heap) 0 + 0.732 = 0.732 us
(ProtoBuf-Java) 6.903 + 0.037 = 6.94 us
(ProtoBuf-JavaNano) 1.101 + 0.024 = 1.125 us
Encode
(FlatBuf direct) 1.137 us
(FlatBuf heap) 1.576 us
(ProtoBuf-Java) 1.652 us
(ProtoBuf-JavaNano) 1.379 us
Preliminary Result
Traversing a ByteBuffer is very expensive compared to traversing objects directly. However, the deserialization time still ruins the overall performance of Protobuf.
[Edit 2015-12-10] Added units and results for protobuf-javanano.
Where's the benchmark code?
On Wed, Jan 28, 2015 at 11:49 AM, Florian Enner [email protected] wrote:
I've implemented a first version. Note that:
- I currently don't have any information about GC triggers, so the numbers (especially for encoding/decoding protobufs) may be worse in real-world use.
- FlatBuffers-Java still requires some optimizations.
- The test data may not reflect your use case!
So far the numbers (seconds for 1M operations) look like the following:
OS: Windows 8.1
JDK: Oracle JDK 1.8.0_20-b26
CPU: Intel i5-3427U @ 1.80 GHz
Decode + Traverse
(FlatBuf direct) 0 + 0.639 = 0.639
(FlatBuf heap) 0 + 0.732 = 0.732
(ProtoBuf) 6.903 + 0.037 = 6.94
Encode
(FlatBuf direct) 1.137
(FlatBuf heap) 1.576
(ProtoBuf) 1.652
Preliminary Result
The data looks quite a bit different from the C++ implementation's. Traversing a ByteBuffer is very expensive compared to traversing objects directly. However, the deserialization time still ruins the overall performance of Protobuf. As long as you don't fully traverse each message more than 10 times, FlatBuffers will be faster (for similar use cases). Serializing data via FlatBuffers is faster as well, but not by a huge margin.
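The "~10 traversals" break-even claim can be checked with a bit of arithmetic on the numbers above: FlatBuffers pays 0.639 us per full traversal with no decode step, while Protobuf pays 6.903 us once plus 0.037 us per traversal. A minimal sketch (class and method names are illustrative, not from the benchmark code):

```java
public class BreakEven {
    // FlatBuffers is cheaper while n * fbTraverse < pbDecode + n * pbTraverse,
    // so the break-even point is pbDecode / (fbTraverse - pbTraverse).
    static double breakEven(double fbTraverse, double pbDecode, double pbTraverse) {
        return pbDecode / (fbTraverse - pbTraverse);
    }

    public static void main(String[] args) {
        // Numbers from the comment above (us per operation).
        System.out.printf("Break-even at ~%.1f traversals%n",
                breakEven(0.639, 6.903, 0.037));
    }
}
```

With these figures the crossover lands between 11 and 12 full traversals, consistent with the "more than 10 times" rule of thumb.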
Btw, there may be some optimizations that I'm not accounting for; the protobuf encode/decode numbers look better than I'd expect. Let me know if you go through the code and find something wrong with it.
Excellent work!
In my C++ benchmark I simply add up all numbers I read when traversing the FlatBuffers, and print it at the end, guaranteeing the optimizer can't cheat.
Yes, the accessor overhead is a lot bigger in Java than it is in C++, as to be expected. FlatBuffers is definitely optimized for use cases where access is infrequent (usually just once). I'd say that even if you access the data more than 10x, FlatBuffers may still be worth it because of the more predictable performance (no startup cost, less GC overhead).
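The "add up all numbers you read and print the sum" trick mentioned above keeps the compiler or JIT from eliminating the traversal as dead code. A minimal Java sketch of the idea, using a placeholder message class rather than the real FlatBuffers-generated accessors:

```java
public class TraverseChecksum {
    // Stand-in for a decoded message; a real benchmark would read these
    // fields through FlatBuffers accessors instead of plain fields.
    static final class Message {
        int hp = 80;
        int mana = 150;
        int[] inventory = {1, 2, 3, 4, 5};
    }

    // Sum every value read during traversal into a checksum.
    static long traverse(Message m) {
        long sum = 0;
        sum += m.hp;
        sum += m.mana;
        for (int v : m.inventory) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        Message m = new Message();
        long total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            total += traverse(m);
        }
        // Printing the accumulated sum makes the work observable, so the
        // JIT cannot prove the loop is dead and remove it.
        System.out.println(total);
    }
}
```

JMH's Blackhole serves the same purpose in a more principled way when it is available.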
Thanks :) For the traversal ("use") step I add the numbers up in the same way you did, but I split the encoding/decoding into separate steps in order to get individual results. In the past few days I've tried several different variations (including one monolithic function) and the numbers are coming back roughly the same. So maybe they are what they are.
I agree that predictability is a much more important benefit than raw performance.
There's been no progress on improving the benchmarks. The problem with releasing them is that a) they only work on Windows (Windows timing functions and project files; shouldn't be too hard to fix) and b) they depend on a bunch of external projects (protobuf etc.), for which proper dependencies (submodules) and build dependencies need to be set up, preferably in a cross-platform way (CMake).
If someone wants to work on cleaning this mess up, I'd be happy to give them the code.
For Java, there's already something here: https://github.com/ennerf/flatbuffers-java-benchmark
As a simple first step, a unified benchmark to compare FlatBuffers implementations against each other may be good.
I've based the Java benchmark on the original C++ benchmark. I unfortunately don't know whether or not the Java benchmark would still run, but please feel free to use the code in any way you want.
Ok, I'll make an effort to get the benchmark code out there. Probably initially with the warts mentioned above, on a separate branch in the FlatBuffers repo, and then we can start cleaning it up from there (and add languages).
I threw together a quick port of the benchmark to .NET (https://github.com/evolutional/flatbuffers-java-benchmark/commits/cs-port).
I didn't have the .fbs, so I crafted my own from the generated Java code.
I'm not getting anywhere near the Java numbers above on the PC I ran it on (Win7 x64, Intel Xeon E5-1650v2 @ 3.5 GHz, .NET FX 3.5):
Name Mean StdD Unit 1M/sec
Encode 0.003 0.001 ms/op 3.4
Decode 0.000 0.000 ms/op 0.0
Traverse 0.002 0.001 ms/op 1.7
(10 warm up, 50 measurement iterations)
Either:
a) my benchmarks are bad (possible: we don't have JMH for .NET, and I used a lambda for the test action, which may have added overhead)
or
b) the .NET version of FlatBuffers needs optimization :) (possible)
I'll do some tinkering: re-run the benchmark on my main PC across .NET 3.5, 4.0, and 4.5, with and without the "unsafe" version of the ByteBuffer. I'll run the Java benchmark too, so I can compare them.
Here is the corresponding fbs https://gist.github.com/ennerf/cb7999bd30973acaf794
I've uploaded some numbers for my current Desktop machine in case you'd like more data for comparison: https://gist.github.com/ennerf/d108294623a8c52f984f
Thanks!
Not being too familiar with the Java benchmark suite: is the "score" the measured average?
So 1.137 us/op would mean one operation takes 1.137 microseconds?
The JMH score depends on the settings. In this case, it does represent the average time:
@BenchmarkMode(Mode.AverageTime)
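As a sanity check on that interpretation, an average-time score converts directly into throughput. This small snippet (not part of the benchmark suite) does the conversion for the 1.137 us/op figure:

```java
public class ScoreToThroughput {
    // Convert a JMH AverageTime score in us/op into operations per second.
    static double opsPerSecond(double usPerOp) {
        return 1_000_000.0 / usPerOp; // 1e6 microseconds per second
    }

    public static void main(String[] args) {
        // 1.137 us/op works out to roughly 0.88 million ops/sec.
        System.out.printf("%.0f ops/sec%n", opsPerSecond(1.137));
    }
}
```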
I think I was measuring the timings wrongly. I've adopted the approach here (http://stackoverflow.com/questions/1206367/c-sharp-time-in-microseconds) to measure in microseconds and am now getting these figures:
50 iterations, Win7 x64, Intel Xeon E5-1650v2 @ 3.5 GHz, .NET FX 3.5
Safe ByteBuffer
Name Mean StdD Unit
Encode 1.264 0.486 us/op
Decode 0.018 0.039 us/op
Traverse 0.476 0.181 us/op
Unsafe ByteBuffer
Name Mean StdD Unit
Encode 1.068 0.377 us/op
Decode 0.022 0.042 us/op
Traverse 0.366 0.183 us/op
The unsafe ByteBuffer is noticeably faster on the encode action. The performance still appears to be lower than the figures you posted for the Java version, though, but I can't verify that until I can run the same Java benchmarks on this machine.
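For reference, the mean/StdD-per-operation measurement style above can be sketched in Java with System.nanoTime (a minimal stand-in for both the .NET Stopwatch approach and JMH; the Runnable workload here is a trivial placeholder, not the real benchmark body):

```java
import java.util.concurrent.ThreadLocalRandom;

public class MicroTimer {
    // Run warmup iterations, then time each measured iteration and return
    // { mean, standard deviation } in microseconds per operation.
    public static double[] measure(Runnable op, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) op.run();
        double[] samplesUs = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            op.run();
            samplesUs[i] = (System.nanoTime() - start) / 1_000.0; // ns -> us
        }
        double mean = 0;
        for (double s : samplesUs) mean += s;
        mean /= iterations;
        double var = 0;
        for (double s : samplesUs) var += (s - mean) * (s - mean);
        double stdDev = Math.sqrt(var / iterations);
        return new double[] { mean, stdDev };
    }

    public static void main(String[] args) {
        double[] stats = measure(() -> ThreadLocalRandom.current().nextLong(), 10, 50);
        System.out.printf("Mean %.3f StdD %.3f us/op%n", stats[0], stats[1]);
    }
}
```

Note that timing a single sub-microsecond operation per sample puts the nanoTime call overhead inside every measurement; real harnesses like JMH time large batches of operations per sample for exactly this reason.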
Cracking out dotTrace from JetBrains, it looks like the hottest function is FlatBufferBuilder.CreateString(), taking over 50% of the time within the FlatBuffers assembly. Specifically, the call to Encoding.UTF8.GetBytes() is expensive; this is the function that converts .NET wide strings to UTF-8. The next most expensive call is FlatBufferBuilder.EndObject(), but it's nowhere near as expensive as CreateString().
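Since the profile points at wide-string-to-UTF-8 conversion, one common mitigation is to encode frequently reused strings once and hand the cached bytes to the builder. This is only a sketch of that general idea in Java (the actual CreateString optimization in the PR may be different, and the class here is hypothetical, not part of FlatBuffers):

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class Utf8Cache {
    private final Map<String, byte[]> cache = new HashMap<>();

    // Return the UTF-8 bytes for s, encoding at most once per distinct string.
    public byte[] utf8(String s) {
        return cache.computeIfAbsent(s, k -> k.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Utf8Cache cache = new Utf8Cache();
        byte[] a = cache.utf8("MyMonster");
        byte[] b = cache.utf8("MyMonster");
        // Same array instance on the second lookup: no re-encoding.
        System.out.println(a == b);
    }
}
```

This only pays off when identical strings recur across messages; for unique strings the encoding cost is unavoidable and the win has to come from a faster conversion path instead.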
Made an optimization to CreateString (PR inbound), which changes the timings to:
Safe ByteBuffer
Name Mean StdD Unit
Encode 1.108 0.372 us/op
Decode 0.018 0.039 us/op
Traverse 0.478 0.249 us/op
Unsafe ByteBuffer
Name Mean StdD Unit
Encode 0.988 0.386 us/op
Decode 0.018 0.039 us/op
Traverse 0.380 0.218 us/op
Making a few more tweaks to how Pad/Prep is called, I can save a few more cycles on the benchmark (I'll prepare another PR):
Safe
Name Mean StdD Unit
Encode 1.042 0.358 us/op
Decode 0.018 0.039 us/op
Traverse 0.486 0.246 us/op
Unsafe
Name Mean StdD Unit
Encode 0.906 0.374 us/op
Decode 0.020 0.040 us/op
Traverse 0.358 0.205 us/op
Ran the Java and C# versions of the benchmarks on the same machine (Intel i7-4770K @ 3.50 GHz, Win10 x64).
Results in here https://gist.github.com/evolutional/85b9c6fac33a8455945d
With the optimizations I discuss in the gist, we achieve roughly equivalent performance (with the Java direct ByteBuffer being slightly quicker).
The C++ benchmark code is now in the repo (in its own branch; see the comment at the end of https://google.github.io/flatbuffers/md__benchmarks.html).
Anyone interested in integrating benchmarks for other languages: these could go in the same branch.
Nice, I'll pull over the C# benchmarks
It would be nice to have a comparison with JSON parsed/serialized by LoganSquare too (that library doesn't use reflection, so a benchmark would be fair).
This issue has been automatically marked as stale because it has not had activity for 1 year. It will be automatically closed if no further activity occurs. To keep it open, simply post a new comment. Maintainers will re-open on new activity. Thank you for your contributions.