1brc's Introduction

The One Billion Row Challenge

This is my C# implementation for the One Billion Row Challenge (1BRC) as defined in the gunnarmorling/1brc GitHub repository.

Running my solution

Make sure you have the latest .NET 8 SDK installed. Build the project with the following commands (change the runtime identifier from win-x64 to the one for your OS):

dotnet build -c Release
dotnet publish -r win-x64 -c Release

You can then find 1brc.exe inside the bin/Release/net8.0/win-x64/publish directory. The executable takes a single argument: the path to the measurements file.

My solution doesn't require AVX2, but it is written on the assumption that AVX2 is available on the machine it runs on. If AVX2 is not available, the Vector256 type in the standard library falls back to simulating each 256-bit vector with two 128-bit vectors.
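As a rough illustration of that fallback (this is not code from my solution), the sketch below reports whether AVX2 and hardware-accelerated Vector256 are available, then uses the cross-platform Vector256 API to find the first ';' in a 32-byte chunk. The class name SimdSketch, the helper IndexOfInFirst32, and the sample station lines are hypothetical and exist only for this example.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;
using System.Text;

public static class SimdSketch
{
    // Hypothetical helper: index of the first `target` byte within the first
    // 32 bytes of `chunk`, or -1 if it does not occur there.
    public static int IndexOfInFirst32(ReadOnlySpan<byte> chunk, byte target)
    {
        if (chunk.Length < Vector256<byte>.Count)
            throw new ArgumentException("Need at least 32 bytes.", nameof(chunk));

        // Loads the first 32 bytes of the span into one 256-bit vector.
        Vector256<byte> data = Vector256.Create(chunk);

        // Lanes equal to `target` become 0xFF, all other lanes become 0x00.
        Vector256<byte> matches = Vector256.Equals(data, Vector256.Create(target));

        // One bit per lane: bit i is set when byte i matched.
        uint mask = matches.ExtractMostSignificantBits();
        return mask == 0 ? -1 : BitOperations.TrailingZeroCount(mask);
    }

    public static void Main()
    {
        // The same Vector256 calls work either way; only the speed differs.
        Console.WriteLine($"Avx2.IsSupported: {Avx2.IsSupported}");
        Console.WriteLine($"Vector256.IsHardwareAccelerated: {Vector256.IsHardwareAccelerated}");

        // Made-up sample input in the 1BRC line format (41 bytes).
        byte[] lines = Encoding.ASCII.GetBytes("Hamburg;12.0\nBulawayo;8.9\nPalembang;38.8\n");
        Console.WriteLine(IndexOfInFirst32(lines, (byte)';')); // ';' follows the 7 letters of "Hamburg"
    }
}
```

Because the helper only uses the portable Vector256 API rather than Avx2-specific intrinsics, it needs no separate code path for machines without AVX2.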

Measurements

My system has the following specs:

  • CPU: AMD Ryzen 9 5950X @ 3.4GHz (default clock speed)
  • RAM: 32GB 3600MHz DDR4
  • SSD: Samsung 980 PRO
  • OS: Windows 11

I generated my input file using the CreateMeasurements script from the original repo. That script was recently changed so that it no longer generates carriage returns on Windows, so my solution assumes that lines are separated only by newlines.

A Stopwatch is started when the program starts, and the elapsed time is printed just before the program exits. I also use pbench to time the whole program invocation, so that measurement includes the time spent launching the runtime. The application is compiled with NativeAOT, so no JIT is needed.
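The in-process timing can be sketched roughly as follows. This is a minimal illustration rather than the actual program: TimedProgram, Measure, and the placeholder work are hypothetical names, and writing the elapsed time to standard error (so it stays separate from the results on standard output) is an assumption of this sketch.

```csharp
using System;
using System.Diagnostics;

public static class TimedProgram
{
    // Hypothetical helper: runs `work` and returns how long it took.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch stopwatch = Stopwatch.StartNew(); // started before any work is done
        work();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main(string[] args)
    {
        TimeSpan elapsed = Measure(() =>
        {
            // Placeholder for reading and aggregating the measurements file.
            Console.WriteLine("{...results...}");
        });

        // Printed last, just before the process exits. Anything that happens
        // after this line (runtime shutdown, handle cleanup) is not included.
        Console.Error.WriteLine($"Elapsed: {elapsed.TotalSeconds:F3}s");
    }
}
```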

After running my program 10 times in a row, these are my measurements on my system:

  • Stopwatch: Min=1.318s, Avg=1.335s, Max=1.350s
  • Process Time: Min=1.329s, Avg=1.346s, Max=1.361s
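For anyone who wants to collect comparable process-time figures without pbench, here is a minimal stand-in harness. It is an assumption-laden sketch, not the tool I used: RunStats, TimeProcess, and Summarize are hypothetical names, and the executable path and measurements file path are taken from the command line.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.Linq;

public static class RunStats
{
    // Wall-clock seconds for one full process run, including runtime startup and shutdown.
    public static double TimeProcess(string exePath, string argument)
    {
        var startInfo = new ProcessStartInfo(exePath, argument) { UseShellExecute = false };
        Stopwatch stopwatch = Stopwatch.StartNew();
        using Process process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Process could not be started.");
        process.WaitForExit();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalSeconds;
    }

    // Minimum, average, and maximum of the collected timings.
    public static (double Min, double Avg, double Max) Summarize(IReadOnlyList<double> seconds)
        => (seconds.Min(), seconds.Average(), seconds.Max());

    public static void Main(string[] args)
    {
        // args[0]: path to the published executable, args[1]: path to the measurements file.
        var timings = new List<double>();
        for (int run = 0; run < 10; run++)
            timings.Add(TimeProcess(args[0], args[1]));

        var (min, avg, max) = Summarize(timings);
        Console.Error.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "Process Time: Min={0:F3}s, Avg={1:F3}s, Max={2:F3}s", min, avg, max));
    }
}
```

The child process inherits the console, so each run's results are printed as usual. Redirect standard output if only the summary line is wanted.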

Comparison to other solutions

I also ran buybackoff's C# solution, which likewise starts a Stopwatch at the beginning of the program and stops it at the end. When running it, I noticed a much larger gap between the Stopwatch time and the process time. I believe this is because the Stopwatch does not cover the time it takes to close/dispose the file handles. I suspect this gap may only show up because I am using Windows, and the results could be different on Linux.

I also ran royvanrijn's Java solution, which is currently winning the competition in the original repo, using the latest GraalVM JDK. It does not have a stopwatch in the source code, so I can only measure the whole process time.

buybackoff's C# solution:

  • Stopwatch: Min=1.433s, Avg=1.453s, Max=1.503s
  • Process Time: Min=2.202s, Avg=2.215s, Max=2.270s

royvanrijn's Java solution:

  • Process Time: Min=2.501s, Avg=2.549s, Max=2.597s

I'm still waiting for other people to test my code on their hardware and report more performance measurements. In particular, I would like to see how it fares on Linux instead of Windows, since this improved performance may not be reproducible there. Until then, take these comparisons with a grain of salt.

1brc's Issues

Ubuntu results

Machine:

tmaj@tm1brc:~/1brc.cameronaavik$ dotnet publish -r linux-x64 -c Release
  Determining projects to restore...
  All projects are up-to-date for restore.
  1brc -> /home/tmaj/1brc.cameronaavik/bin/Release/net8.0/linux-x64/1brc.dll
  Generating native code
  1brc -> /home/tmaj/1brc.cameronaavik/bin/Release/net8.0/linux-x64/publish/

tmaj@tm1brc:~$ time /home/tmaj/1brc.cameronaavik/bin/Release/net8.0/linux-x64/publish/1brc /home/tmaj/1brc.data/measurements-1000000000.txt > 1brc.cameronaavik.1_000_000_000.aot.out 

real	0m4.389s
user	0m32.109s
sys	0m1.546s

Without publishing, using dotnet run on the same machine:

tmaj@tm1brc:~$ time dotnet run -c Release --project=1brc.cameronaavik/1brc.csproj /home/tmaj/1brc.data/measurements-1000000000.txt > 1brc.cameronaavik.1_000_000_000.out

real	0m6.634s
user	0m33.661s
sys	0m1.929s
