Git Product home page Git Product logo

optimizedthreadsaferefcounting's Introduction

OptimizedThreadSafeRefCounting

What kind of optimization

#define cnter counter

  • Unsafe:Ignore thread safety.
  • Trivial: Use atomics directly.
  • Optimized: Use 2 kinds of counter:
    • If 2 ref_cnters are in the same thread, we can use local cnter whose overhead is very small.
    • Once when the local cnter is 0, we subtract the expensive global atomic cnter.

Result for Benchmark

I test the performance of different ref_cnters.

Options:

  • LIBC++(LLVM)
  • STD=C++20
  • OPTIM: O3
  • COMPILER: CLANG-8.0

Optimized version is 1.5x faster.

Acctually multi-threading performance is very hard to messure(as the overhead of creating a thread is a bit huge to the ref_cnter's operations). So I tested it in single-thread case. But they're equal as the aim of benchmark is how many benefits we can take when we are "local".

Code for Benchmark

C++ Quick Bench

Please add the ref_cnter class before using the following codes.

constexpr std::size_t sz = 5000;

static void opt_refcnt(benchmark::State& state) {
  using namespace ganler::opt;
  ref_cnter cnter;
  ref_cnter cnter_;
  benchmark::DoNotOptimize(cnter);
  benchmark::DoNotOptimize(cnter_);
  for (auto _ : state) {
    { // Test passed.
      cnter = cnter_;
      cnter_ = cnter;
    }
  }
}
BENCHMARK(opt_refcnt);

static void trivial_refcnt(benchmark::State& state) {
  using namespace ganler::trivial;
  ref_cnter cnter;
  ref_cnter cnter_;
  benchmark::DoNotOptimize(cnter);
  benchmark::DoNotOptimize(cnter_);
  for (auto _ : state) {
    { // Test passed.
      cnter = cnter_;
      cnter_ = cnter;
    }
  }
}
// Register the function as a benchmark
BENCHMARK(trivial_refcnt);

Details

We can see that the assembly of the benchmark:

For trivial implementation atomic operation is the bottleneck.

For optimized version ++/-- is no longer the bottleneck

Note that 405b86 is related to the for-loop stuff.

And GCC

Acctually I cannot understand the results of GCC 9.1. Optimized version is just 1.1x faster.

(Maybe gcc considered the optimization for local atomic operation).

(But the optimized version is as fast as the empty loop).

optimizedthreadsaferefcounting's People

Contributors

ganler avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

perfcv

optimizedthreadsaferefcounting's Issues

How about using `Relaxed` when increase the counter

Typical use for relaxed memory ordering is incrementing counters, such as the reference counters of std::shared_ptr, since this only requires atomicity, but not ordering or synchronization (note that decrementing the shared_ptr counters requires acquire-release synchronization with the destructor)

The part is mentioned in CppReference, how about using Relaxed?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.