spmc-buffer's Introduction

SPMC Buffer: Triple-buffering for multiple consumers

Maintenance warning: This project has been abandoned in order to make room for others.

What is this?

This is an extension of my earlier work on triple buffering, which supports readout from multiple consumers at the cost of some extra memory and CPU overhead. You may find it useful for the following class of thread synchronization problems:

  • There is one producer thread and several consumer threads
  • The producer wants to update a shared memory value periodically
  • The consumers want to access the latest update from the producer at any time

It is currently used as follows:

// Create an SPMC buffer of any Clone type
use spmc_buffer::SPMCBuffer;
let buf = SPMCBuffer::new(2, 1.0);

// Split it into an input and output interface
let (mut buf_input, mut buf_output) = buf.split();

// Create as many extra output interfaces as needed
let mut buf_output2 = buf_output.clone();

// The producer can move a value into the buffer at any time
buf_input.write(4.2);

// A consumer can access the latest value from the producer at any time
let latest_value_ref = buf_output.read();
assert_eq!(*latest_value_ref, 4.2);
let latest_value_ref2 = buf_output2.read();
assert_eq!(*latest_value_ref2, 4.2);

Give me details! How does it compare to alternatives?

Compared to a triple buffer, an SPMC buffer...

  • Supports multiple consumers (that's the point!)
  • Consumes more CPU time and memory in the single-consumer case
  • Is not always wait-free for the writer. This guarantee can be provided, but only at a significant memory cost when there are many readers.

In short, SPMC buffering is what you're after in scenarios where a shared memory location is updated frequently by a single writer, read by multiple readers who only want the latest version, and you can spare some RAM.

  • If you need multiple producers, look somewhere else
  • If you only need one consumer, use a triple buffer instead
  • If you can't tolerate the RAM overhead or want to update the data in place, try a Mutex instead (or possibly an RwLock)
  • If the shared value is updated very rarely (e.g. every second), try an RCU
  • If the consumer must get every update, try a message queue

How do I know your unsafe lock-free code is working?

By running the tests, of course! Unfortunately, that is currently harder than I'd like it to be.

First of all, we have sequential tests, which are very thorough but obviously do not check the lock-free/synchronization part. You run them as follows:

$ cargo test --release

Then we have concurrent tests, where we fire up concurrent reader and writer threads and check that the readers can never observe an inconsistent buffer state. These tests are more important, but they are also harder to run, because one must first check some assumptions:

  • The testing host must have at least 3 physical CPU cores to test all possible race conditions
  • No other code should be eating CPU in the background, including other tests.
  • Some tests have timing-dependent behaviour, and may require manual tweaking of sleep periods for your specific system.

Taking this and the relatively long run time (~10 s) into account, these tests are ignored by default.
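
To give an idea of what such a test looks like, here is a minimal sketch of a consistency check in the same spirit, written against the public API shown in the usage example above. It is an illustration only, not the crate's actual test code, and the iteration counts are arbitrary:

// Sketch only: a simplified concurrent consistency check, not the real test suite.
use spmc_buffer::SPMCBuffer;
use std::thread;

fn simplified_concurrent_test() {
    // The buffer holds a pair whose two halves must always match.
    let (mut input, output) = SPMCBuffer::new(2, (0u64, 0u64)).split();

    // Spawn a couple of readers which check every value they observe.
    let mut readers = Vec::new();
    for _ in 0..2 {
        let mut output = output.clone();
        readers.push(thread::spawn(move || {
            for _ in 0..1_000_000 {
                let (a, b) = *output.read();
                // A torn or otherwise inconsistent read would break this invariant.
                assert_eq!(a, b);
            }
        }));
    }

    // Meanwhile, the writer keeps publishing internally consistent pairs.
    for i in 0..1_000_000u64 {
        input.write((i, i));
    }

    for reader in readers {
        reader.join().unwrap();
    }
}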

Finally, we have benchmarks, which allow you to test how well the code is performing on your machine. Because cargo bench has not yet landed in Stable Rust, these benchmarks masquerade as tests, which makes them a bit unpleasant to run. I apologize for the inconvenience.

To run the concurrent tests and the benchmarks, make sure no one is eating CPU in the background and do:

$ cargo test --release -- --ignored --test-threads=1

Here is a guide to interpreting the benchmark results:

  • clean_read measures the SPMC buffer readout time when the data has not changed. It should be extremely fast (a couple of CPU clock cycles).
  • write measures the amount of time it takes to write data into the SPMC buffer when no one is reading.
  • write_and_dirty_read performs a write as before, immediately followed by a sequential read. To get the dirty read performance, subtract the write time from that result. A write and a dirty read should take comparable time.
  • concurrent_write measures the write performance when a reader is continuously reading. Expect significantly worse performance: lock-free techniques can help against contention, but are not a panacea.
  • concurrent_read measures the read performance when a writer is continuously writing. Again, a significant hit is to be expected.
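
For reference, a hand-rolled benchmark of this kind can be as simple as an ignored test around a timing loop. The sketch below is illustrative only; the test name, iteration count and use of std::time::Instant are my own choices, not the crate's actual benchmark code:

// Sketch only: a timing loop masquerading as an #[ignore]d test.
#[test]
#[ignore]
fn write_timing_sketch() {
    use spmc_buffer::SPMCBuffer;
    use std::time::Instant;

    let (mut input, _output) = SPMCBuffer::new(2, 0u32).split();
    const ITERATIONS: u32 = 10_000_000;

    let start = Instant::now();
    for i in 0..ITERATIONS {
        input.write(i);
    }
    let elapsed = start.elapsed();

    // Report the average time per write, in nanoseconds.
    println!("write: {} ns/iter", elapsed.as_nanos() / ITERATIONS as u128);
}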

On my laptop's CPU (Intel Core i7-4720HQ), typical results are as follows:

  • Write: 12 ns
  • Clean read: 1.3 ns
  • Dirty read: 17 ns
  • Concurrent write: 100 ns
  • Concurrent read: 38 ns

License

This crate is distributed under the terms of the MPLv2 license. See the LICENSE file for details.

More relaxed licensing (Apache, MIT, BSD...) may also be negotiated, in exchange for a financial contribution. Contact me for details at knights_of_ni AT gmx DOTCOM.

spmc-buffer's Issues

Relax requirement on Clone

This mirrors HadrienG2/triple-buffer#5. Given that I only need the inner type to be clonable at buffer creation time, imposing a Clone bound is a bit too strong; a Default bound, for example, would work just as well.
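
As an illustration of the relaxation being proposed, here is a sketch on a toy type (this is not the crate's code; the struct and constructor names are made up):

// Sketch only: contrasting a Clone-based and a Default-based constructor on a toy type.
struct ToyBuffer<T> {
    slots: Vec<T>,
}

impl<T> ToyBuffer<T> {
    // Current situation: the initial value is cloned into every internal slot,
    // so T must be Clone even though cloning only happens at creation time.
    fn with_initial(num_slots: usize, initial: T) -> Self
        where T: Clone
    {
        ToyBuffer { slots: vec![initial; num_slots] }
    }

    // Relaxed alternative: fill the slots with T::default() instead,
    // so T only needs Default rather than Clone.
    fn with_default(num_slots: usize) -> Self
        where T: Default
    {
        ToyBuffer { slots: (0..num_slots).map(|_| T::default()).collect() }
    }
}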

Inner type should be Sync

In spmc_buffer, multiple readers can be accessing a buffer's inner data concurrently. Therefore, the inner type should be Sync in order to avoid data races with types exposing inner mutability such as Cell.
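
In other words, the buffer's type parameter should carry a Sync bound so that non-thread-safe interior mutability is rejected at compile time. A minimal illustration of that kind of constraint (not the crate's code):

// Sketch only: a Sync bound rejects types with non-thread-safe interior mutability.
fn assert_shareable<T: Sync>() {}

fn main() {
    // A plain integer can safely be read from several threads at once.
    assert_shareable::<u32>();
    // std::cell::Cell<u32> is not Sync, because concurrent get()/set() through
    // shared references would be a data race; with a T: Sync bound on the
    // buffer, the line below would fail to compile.
    // assert_shareable::<std::cell::Cell<u32>>();
}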

Use a proper Condvar for blocking the producer

Unlike TripleBuffer, SPMCBuffer is not always wait-free for the producer. There are situations where the producer will block for an unbounded amount of time, the simplest scenario being when the producer attempts a buffer swap and all other buffers are currently held by consumers who do not want to move to the latest buffer version yet.

We currently handle this with a spin loop, but that can be a waste of CPU time. Instead, we should use the OS facilities for blocking the thread. Rust provides two ways to access them, one being thread::park/unpark and the other being Condvars. Since the identity of the producer thread can change over time, I think it's best to use a Condvar.

There is a catch, however. When the producer looks for spare buffers, it does not lock a mutex, so consumers can free buffers concurrently. That's a feature: SPMCBuffer is designed to be efficient in the wait-free regime, where producer blocking is rare, and having consumers lock a mutex on every read buffer switch would come with a scalability tax. But it also means that by the time the producer decides to go to sleep due to storage exhaustion, storage may already have been freed concurrently by consumers. The producer should therefore not count solely on consumers to wake it up when storage is freed, but also check again from time to time.


Here is how I propose to handle this. On the data structure side, I would add the following:

  • A private mutex in the producer interface
  • A shared Condvar, used for producer sleep/wakeup
  • A shared AtomicBool, used to signal when the producer is going to sleep, initially false

The producer's slow "no buffer found" path would be modified as follows:

  • Lock the private mutex (we don't need it, but pthread insists on it ^^)
  • Set the shared flag, notifying consumers that we are going to sleep
  • Block on the condvar, with a certain initial timeout T0
  • Once the wait is over, clear the shared flag, unlock the mutex, and try again
  • As long as we don't find a spare buffer, increase the timeout, up to a certain limit T1, by a multiplicative factor M > 1 (like exponential back-off). When we find a buffer, reduce or reset the timeout.

The consumer's buffer liberation path would be extended with the following procedure:

  • Check if the shared "producer going to sleep" flag is set (keeping the fast path as fast as possible)
  • If so, notify the producer that a buffer has been liberated (by calling notify_one() on the condvar)

It would be tempting to also unset the flag along the way. But wakeups may be lost, since there is a delay between the moment where the producer sets the "going to sleep" flag and the moment where it actually goes to sleep. As the producer clears the flag right after waking up anyway, I think this extra optimization would just increase the risk of lost wakeups for the sake of hypothetical CPU savings in an overall rare event.
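
To make the proposal above more concrete, here is a rough sketch of the mechanism on a toy "free buffer" counter. Everything here is illustrative: the names, memory orderings and timeout values T0/T1 are assumptions of mine, not the crate's actual design, and the real logic would live inside SPMCBuffer's shared state:

// Sketch only: the proposed Condvar-based producer blocking, on toy shared state.
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

struct Shared {
    free_buffers: AtomicUsize,     // stand-in for the real spare-storage tracking
    producer_sleeping: AtomicBool, // "producer is going to sleep" flag
    wakeup: Condvar,               // used only to wake the producer back up
}

struct Producer {
    shared: Arc<Shared>,
    sleep_lock: Mutex<()>,         // private mutex, only needed by the Condvar API
}

impl Producer {
    // Slow path: no spare buffer was found, block until one may be available.
    fn wait_for_spare_buffer(&self) {
        let mut timeout = Duration::from_micros(10);  // T0 (illustrative value)
        let max_timeout = Duration::from_millis(10);  // T1 (illustrative value)
        loop {
            let guard = self.sleep_lock.lock().unwrap();
            // Tell consumers that we are about to sleep...
            self.shared.producer_sleeping.store(true, Ordering::SeqCst);
            // ...but re-check first: a buffer may have been freed concurrently.
            if self.shared.free_buffers.load(Ordering::SeqCst) > 0 {
                self.shared.producer_sleeping.store(false, Ordering::SeqCst);
                return;
            }
            // Sleep with a timeout so that a lost wakeup cannot hang us forever.
            let (_guard, _timed_out) =
                self.shared.wakeup.wait_timeout(guard, timeout).unwrap();
            self.shared.producer_sleeping.store(false, Ordering::SeqCst);
            if self.shared.free_buffers.load(Ordering::SeqCst) > 0 {
                return;
            }
            // Exponential back-off, capped at T1.
            timeout = (timeout * 2).min(max_timeout);
        }
    }
}

struct Consumer {
    shared: Arc<Shared>,
}

impl Consumer {
    // Called on the consumer's buffer liberation path.
    fn release_buffer(&self) {
        self.shared.free_buffers.fetch_add(1, Ordering::SeqCst);
        // Keep the fast path fast: only touch the Condvar if the producer
        // has announced that it is going to sleep.
        if self.shared.producer_sleeping.load(Ordering::SeqCst) {
            self.shared.wakeup.notify_one();
        }
    }
}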
