
rust-atomics-and-locks's Introduction

This repository contains the code examples, data structures, and links from Rust Atomics and Locks.

The examples from chapters 1, 2, 3, and 8 can be found in examples/. The data structures from chapters 4, 5, 6, and 9 can be found in src/.

Chapter 1 — Basics of Rust Concurrency

Chapter 2 — Atomics

Chapter 3 — Memory Ordering

Chapter 4 — Building Our Own Spin Lock

Chapter 5 — Building Our Own Channels

Chapter 6 — Building Our Own “Arc”

Chapter 7 — Understanding the Processor

Chapter 8 — Operating System Primitives

Chapter 9 — Building Our Own Locks

Chapter 10 — Ideas and Inspiration

License

You may use all code in this repository for any purpose.

Attribution is appreciated, but not required. An attribution usually includes the book title, author, publisher, and ISBN. For example: "Rust Atomics and Locks by Mara Bos (O’Reilly). Copyright 2023 Mara Bos, 978-1-098-11944-7."

rust-atomics-and-locks's People

Contributors

avindra, m-ou-se


rust-atomics-and-locks's Issues

Anti-Pattern that I feel is a bit confusing

The content that the question is about

https://marabos.nl/atomics/basics.html#naming-clones

The question

This section contains the following code example:

let a = Arc::new([1, 2, 3]);

thread::spawn({
    let a = a.clone();
    move || {
        dbg!(a);
    }
});

dbg!(a);

I feel this is a bit confusing, especially to new users. At first glance it may read as if the Arc clone happens on the second thread, because it looks like the entire block is being passed to the thread::spawn() function, when really the let a = a.clone(); line still runs on the main thread.

In my opinion a better way of writing the same thing would be:

let a = Arc::new([1, 2, 3]);

{
    let a = a.clone();
    thread::spawn(move || {
        dbg!(a);
    })
};

dbg!(a);

This accomplishes the same thing, but I think it is less misleading and easier to understand at a glance, and it makes it clearer that the clone happens on the main thread.

Also, sorry, I'm not sure if this belongs under Technical Discussion, but filing it as an error didn't seem right either, since I thought it might be like this for a reason. I'm very open to it staying as is if anyone sees it another way. Thanks for your hard work!

[Chapter 5] Saving one byte of memory when controlling channel state - Missing state reset?

The content that the question is about

Chapter 5 - Using a Single Atomic for the Channel State

The question

Hi Mara! 👋

I bought your physical book a couple weeks ago and I'm learning a ton! Really appreciate the contents there! ❤️

While reading Chapter 5 - Using a Single Atomic for the Channel State, I came across this great example where we can save one byte by using an atomic state variable to control when the channel is ready to send/receive messages:

But while trying this example, I was left wondering whether it is missing a store to the state variable, so that messages can be sent through the channel again:

[screenshot of the code example]

Here is what I came up with, but I'm not quite sure whether this is right:

pub fn receive(&self) -> T {
    if self.state.compare_exchange(
        READY, READING, Ordering::Acquire, Ordering::Relaxed
    ).is_err() {
        panic!("No message available!");
    }

-   unsafe { (*self.message.get()).assume_init_read() }
+   unsafe {
+       let v = (*self.message.get()).assume_init_read();
+       self.state.store(EMPTY, Ordering::Release);
+       v
+   }
}

If you have the time, could you have a look at this and point me to the right direction?

Thanks again for this great book and also for making it available online for free! ❤️

Question about correctness of example in chapter 5

The content that the question is about

fn main() {
    let channel = Channel::new();
    let t = thread::current();
    thread::scope(|s| {
        s.spawn(|| {
            channel.send("hello world!");
            t.unpark();
        });
        while !channel.is_ready() {
            thread::park();
        }
        assert_eq!(channel.receive(), "hello world!");
    });
}

...
/// Panics when trying to send more than one message.
pub fn send(&self, message: T) {
    if self.in_use.swap(true, Relaxed) {
        panic!("can't send more than one message!");
    }
    unsafe { (*self.message.get()).write(message) };
    self.ready.store(true, Release);
}

pub fn is_ready(&self) -> bool {
    self.ready.load(Relaxed)
}

The question

How is it guaranteed that, after t.unpark(), the most recent value will be visible inside the while loop? If it isn't, it would be possible for unpark() to wake up the thread, which then loads the value, still sees that ready is not true, and parks itself again; after that, nothing will ever wake that thread up.

Is using `Box::into_raw` maybe preferable to `Box::leak` when paired with later `Box::from_raw`?

ptr: NonNull::from(Box::leak(Box::new(ArcData {

The examples and source in the book pair Box::leak with Box::from_raw whenever a raw pointer is needed. Naively, from the name, and the documentation seems to suggest similarly ("The easiest way to do this is to..."), I would have expected Box::into_raw to be the preferable pairing with Box::from_raw.

Is there a special reason to use Box::leak instead of Box::into_raw? I guess the former gives you a reference, which means you don't need NonNull::new(ptr).unwrap() or an unsafe NonNull::new_unchecked? But the unpaired names (without an obvious reason) are just the teensiest stumbling block for the pedant in me, more than either of those would be.
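For illustration, here is a minimal sketch (mine, not the book's code) contrasting the two pairings; the difference is exactly the NonNull conversion mentioned above:

use std::ptr::NonNull;

struct ArcData<T> {
    value: T,
}

// With Box::leak we get a &mut reference, which is statically non-null,
// so NonNull::from works without any unsafe or unwrap.
fn alloc_with_leak<T>(value: T) -> NonNull<ArcData<T>> {
    NonNull::from(Box::leak(Box::new(ArcData { value })))
}

// With Box::into_raw we get a raw pointer, so converting to NonNull takes
// either NonNull::new(ptr).unwrap() or an unsafe NonNull::new_unchecked.
fn alloc_with_into_raw<T>(value: T) -> NonNull<ArcData<T>> {
    let ptr = Box::into_raw(Box::new(ArcData { value }));
    // Safety: Box::into_raw is documented to return a non-null pointer.
    unsafe { NonNull::new_unchecked(ptr) }
}

fn main() {
    let a = alloc_with_leak(1);
    let b = alloc_with_into_raw(2);
    // Either pointer can later be reclaimed with Box::from_raw.
    unsafe {
        drop(Box::from_raw(a.as_ptr()));
        drop(Box::from_raw(b.as_ptr()));
    }
}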

Love the book so far through chapter 6 (online so far; in theory Amazon ships my print copy any day now). This is not a complaint about the book or the code, just a question I haven't been able to get out of my mind since I started the chapter. :-)

More explanation needed on why use SeqCst in Chapter 3

The content that the question is about

In the section 'Sequentially Consistent Ordering' of Chapter 3, it says the example 'depends on sequentially consistent ordered operations'.

The question

However, I tried to use the happens-before relationship to analyze what happens if I use a Release store and an Acquire load instead of SeqCst. I could not find any problem and thought it would work just as well without SeqCst. In fact, it does not.

I cannot figure out why, using only the happens-before relationship.


Appendix:
I use a slightly modified code as following:

use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static A: AtomicBool = AtomicBool::new(false);
static B: AtomicBool = AtomicBool::new(false);
static mut AS: isize = 0;
static mut BS: isize = 0;

fn main() {
    let a = thread::spawn(|| {
        A.store(true, Ordering::Release);
        if !B.load(Ordering::Acquire) {
            unsafe { AS += 1 };
        }
    });
    let b = thread::spawn(|| {
        B.store(true, Ordering::Release);
        if !A.load(Ordering::Acquire) {
            unsafe { BS += 1 };
        }
    });
    a.join().unwrap();
    b.join().unwrap();
    unsafe {
        println!("{} {}", AS, BS);
    }
}

In a large number of runs, there is a small probability of a "1 1" output, which means both B.load() and A.load() returned false.
So, SeqCst is needed.

But why?

Simple typo in Memory Ordering, Example: Lazy Initialization with Indirection

Type of error

Typo

Location of the error

https://marabos.nl/atomics/memory-ordering.html#example-lazy-initialization-with-indirection

Description of the error

In the following paragraph:

For this example, let’s say we want to maintain the non-blocking behavior, so that threads never wait for another thread, but instead race and take the value from the first thread to complete initialization. This means we still need to be able to go from "uninitalized" to "fully initialized" in a single atomic operation.

The bold word "uninitalized" should be "uninitialized".

Figure 9-1

The content that the question is about

No response

The question

Isn't the final wake_one() call of the second thread redundant?

Potential infinite loop in the end of chapter 5

The content that the question is about

No response

The question

In the last example in chapter 5, if the sender is dropped without sending, won't the receiver be stuck forever waiting for a message?

Error in read boundary check for RwLock 3

Type of error

Minor technical mistake

Location of the error

Line in code sample
Section in book

Description of the error

In chapter 9, in the third version of the book's RwLock (the one avoiding writer starvation), I believe the boundary assertion in the read guard that checks for too many readers is slightly wrong.

The code checks whether s is even, and then asserts that we have not reached the maximum number of readers. As stated at the beginning of this section, just before the first code sample, u32::MAX is odd, which means u32::MAX - 2 is also odd; since s is even at that point, the assertion can never fail, and the "too many readers" panic is unreachable. I would argue the code should instead check s != u32::MAX - 3 to get the largest usable even number.

Using u32::MAX - 1 would also give an even number, but it would then be impossible to distinguish between the state of many readers and that of many readers plus one waiting writer.
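A quick check of the arithmetic (the constants here are just the ones from the paragraph above):

fn main() {
    assert_eq!(u32::MAX % 2, 1);       // u32::MAX is odd,
    assert_eq!((u32::MAX - 2) % 2, 1); // so u32::MAX - 2 is odd too and can
                                       // never equal an even reader count,
    assert_eq!((u32::MAX - 3) % 2, 0); // while u32::MAX - 3 is even.
}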

Relevant code sample:

pub fn read(&self) -> ReadGuard<T> {
    let mut s = self.state.load(Relaxed);
    loop {
        if s % 2 == 0 { // Even.
            assert!(s != u32::MAX - 2 /* should be 3 */, "too many readers");
            ...
}

Hope this helps!

Thank you for your excellent book! ❤️

Minor technical mistake in a chapter 2 paragraph Example: Progress Reporting

Type of error

Minor technical mistake

Location of the error

https://marabos.nl/atomics/atomics.html#example-progress-reporting

use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::thread;
use std::time::Duration;

fn main() {
    let num_done = AtomicUsize::new(0);

    thread::scope(|s| {
        // A background thread to process all 100 items.
        s.spawn(|| {
            for i in 0..100 {
                process_item(i); // Assuming this takes some time.
                num_done.store(i + 1, Relaxed);
            }
        });

        // The main thread shows status updates, every second.
        loop {
            let n = num_done.load(Relaxed);
            if n == 100 { break; }
            println!("Working.. {n}/100 done");
            thread::sleep(Duration::from_secs(1));
        }
    });

    println!("Done!");
}

Description of the error

  • First issue:
    The thread::scope should be changed to an explicitly joined thread. Otherwise, the scope keeps everything alive until the spawned thread finishes processing.
  • Second issue:
    If num_done is not static, the compiler will suggest moving it into the thread (when explicit joining is used).

Suggestion

use std::sync::atomic::Ordering::Relaxed;
use std::sync::atomic::AtomicUsize;

fn main() {
    static NUM_DONE: AtomicUsize = AtomicUsize::new(0);
    let t = std::thread::spawn(|| {
        for i in 0..100 {
            process(i);
            NUM_DONE.store(i + 1, Relaxed);
        }
    });

    loop {
        let n = NUM_DONE.load(Relaxed);
        if n == 100 { break; };

        println!("working.. {n}/100 done");
        std::thread::sleep(std::time::Duration::from_secs(1));
    }
    t.join().unwrap();
    println!("done!");
}

fn process(_index: usize) {
    std::thread::sleep(std::time::Duration::from_millis(400));
}

Of course, it might be a platform-specific issue:
macOS 14.1 Beta (23B5056e)
platform: arm64
Rust: stable 1.72.1

Ch.6 - Cryptic/Challenging explanation regarding 'special "locked" state'

The content that the question is about

Regarding this section roughly around:

A way out of this dilemma is to briefly block the downgrade() operation

it's not readily clear to me how the facade of a mutex over usize::MAX is used to guarantee the correctness of get_mut for the given snippet:

use std::sync::atomic::{
    fence,
    Ordering::{Acquire, Relaxed, Release},
};

struct Weak<T> { ... }

impl<T> Weak<T> {
    // ...
    pub fn get_mut(arc: &mut Self) -> Option<&mut T> {
        if arc.data().alloc_ref_count
            .compare_exchange(1, usize::MAX, Acquire, Relaxed)
            .is_err()
        {
            return None;
        }
        let is_unique = arc.data().data_ref_count.load(Relaxed) == 1;
        arc.data().alloc_ref_count.store(1, Release);
        if !is_unique {
            return None;
        }
        fence(Acquire);
        unsafe { Some(&mut *arc.data().data.get()) }
    }
}

Is there an unspoken assumption that the count cannot be exceeded (without overflowing), thus allowing the implementation to leverage a stop condition elsewhere (like in downgrade)?

The question

Can there be a simpler explanation for the Ch.6 Arc optimization?

Question about drop on the condition variable example of Chapter 1

The content that the question is about

In the section of Chapter 1 on condition variables (https://marabos.nl/atomics/basics.html#condvar), we can find the following example:

use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};
use std::thread;
use std::time::Duration;

let queue = Mutex::new(VecDeque::new());
let not_empty = Condvar::new();

thread::scope(|s| {
    s.spawn(|| {
        loop {
            let mut q = queue.lock().unwrap();
            let item = loop {
                if let Some(item) = q.pop_front() {
                    break item;
                } else {
                    q = not_empty.wait(q).unwrap();
                }
            };
            drop(q);
            dbg!(item);
        }
    });

    for i in 0.. {
        queue.lock().unwrap().push_back(i);
        not_empty.notify_one();
        thread::sleep(Duration::from_secs(1));
    }
});

The question

This example is adapted from the one using park and unpark mentioned previously. However, in this example, we can see the line:

drop(q)

I guess this drop is about releasing the lock sooner, but it seems optional. Is there any particular reason to introduce it?

will fetch_add with relaxed ordering cause update loss?

[screenshot of a code example using fetch_add]

Given the code above, fetch_add is used with Relaxed ordering.
I thought the update in one thread might not be visible in another thread, so the result might be less than 100.
There is no issue on x86, as it has a strong memory model by default.
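For reference, a minimal reconstruction of the pattern in question (my sketch, not necessarily the book's exact code):

use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::thread;

fn main() {
    let counter = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..25 {
                    // fetch_add is a single atomic read-modify-write, so no
                    // increment can be lost; Relaxed only weakens ordering
                    // relative to *other* variables, not this one.
                    counter.fetch_add(1, Relaxed);
                }
            });
        }
    });
    assert_eq!(counter.into_inner(), 100);
}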

Am I missing anything or misunderstanding the ordering concept?

Grammar issue in chapter 7

Type of error

Language or grammar issue

Location of the error

https://marabos.nl/atomics/hardware.html#mesi

Description of the error

"the four possible states it defines for cache line" doesn't sound "fluid", probably either "the four possible states it defines for a cache line" or "the four possible states it defines for each cache line"?

Question about the memory ordering of the store in the last fence example in chapter 3

The content that the question is about

The store in READY: https://github.com/m-ou-se/rust-atomics-and-locks/blob/main/examples/ch3-11-fence.rs#L17

The question

The atomics in READY are only read with Relaxed ordering in line 21. Can't the store in line 17 be done with memory ordering Relaxed (instead of Release), since relaxed is enough to establish a total order between stores and loads of the same variable?

Possible race condition

Type of error

Minor technical mistake

Location of the error

https://github.com/m-ou-se/rust-atomics-and-locks/blob/d945e828bd08719a2d7cb6d758be4611bd90ba2b/src/ch5_channels/s6_blocking.rs

Description of the error

Consider this:

  1. At time point 1, the receiver gets here: it finds that the state is false, but it hasn't parked yet.
  2. At time point 2, the sender gets here: it sets the state to true and tries to unpark the receiver.
  3. At time point 3, the receiver continues: it parks itself. Then boom, no thread will ever unpark it.
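One property that seems relevant here: park/unpark carry a wake-up token, so an unpark that lands before the park makes the next park return immediately. A minimal demo of that (standard library behavior, not the book's code):

use std::thread;

fn main() {
    // Unparking a thread that isn't parked yet stores a token...
    thread::current().unpark();
    // ...so this park consumes the token and returns immediately.
    thread::park();
    println!("not stuck");
}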

mutex_3.rs test panic

Type of error

Typo

Location of the error

[screenshots of the panicking test output]

Description of the error

win7
msvc2015
rust nightly 1.72 i686

What prevents reordering between wait and the decrementing of num_waiters?

The content that the question is about

https://marabos.nl/atomics/building-locks.html#avoiding-syscalls (p. 199 of the second printed release)

The question

Hi Mara,

I'm struggling to understand one use of the Relaxed memory ordering in chapter 9. Hopefully you can point me in the right direction?

In Condition Variable: Avoiding Syscalls, what prevents reordering between the call to wait and the decrementing of num_waiters, from the point of view of the notifying thread?

    pub fn wait<'a, T>(&self, guard: MutexGuard<'a, T>) -> MutexGuard<'a, T> {
        self.num_waiters.fetch_add(1, Relaxed); // New!

        let counter_value = self.counter.load(Relaxed);

        let mutex = guard.mutex;
        drop(guard);

        wait(&self.counter, counter_value);

        self.num_waiters.fetch_sub(1, Relaxed); // New!

        mutex.lock()
    }

The text includes the following paragraph, which makes sense (to me) for the waiting thread:

We also don’t need to worry about the notifying thread observing the decremented value "too soon," because once the decrementing operation is executed, perhaps after a spurious wake-up, the waiting thread no longer needs to be woken up anyway.

But, unlike the happens-before relationship between incrementing num_waiters and unlocking the mutex, it isn't clear to me what ensures that the notifying thread observes num_waiters being decremented only after (or, more importantly, only if) the waiting thread has woken up or will wake up anyway.

For example, couldn't the compiler reason that it's fine to decrement num_waiters before calling wait? (Apart from it probably not being able to see much into wait and, thus, being constrained to a conservative approach).

Thanks!


P.S. I really enjoyed the book! It's very precise but also super approachable. Unlike most resources on the subject, you encourage understanding and leveraging weaker memory orderings, and I found that really refreshing and stimulating.

About example of Ch3 on SeqCst

The content that the question is about

the example of ch3 for SeqCst

https://marabos.nl/atomics/memory-ordering.html#seqcst

The question

I think the example used is not appropriate for SeqCst.
SeqCst guarantees that all operations are sequentially consistent globally,
but that doesn't mean a context switch cannot happen between the store and the load operation.

Therefore, the following order is possible in the example code:

store in thread A => store in thread B => load in thread A (access blocked) => load in thread B (access blocked)

As proof, if I run the test below, there are some cases where DATA is actually zero,
which implies that neither thread ever accessed the data.

use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static A: AtomicBool = AtomicBool::new(false);
static B: AtomicBool = AtomicBool::new(false);
static mut DATA: isize = 0;

fn trial() {
    A.store(false, Ordering::SeqCst);
    B.store(false, Ordering::SeqCst);
    unsafe { DATA = 0; }

    let a = thread::spawn(|| {
        A.store(true, Ordering::SeqCst);
        if !B.load(Ordering::SeqCst) {
            unsafe { DATA += 1 };
        }
    });
    let b = thread::spawn(|| {
        B.store(true, Ordering::SeqCst);
        if !A.load(Ordering::SeqCst) {
            unsafe { DATA += 1 };
        }
    });
    a.join().unwrap();
    b.join().unwrap();
}

fn main() {
    for _ in 0..10_000_000 {
        trial();
        if unsafe { DATA == 0 } {
            unsafe {
                println!("{}", DATA);
            }
        }
    }
}

Extra word

Type of error

Language or grammar issue

Location of the error

https://marabos.nl/atomics/hardware.html

that keeps the mutexes are spaced further apart.

Description of the error

I suspect the "are" is an error. "keeps" and "are" are two verbs and, even though I am not a native speaker, I believe there is one verb too many here.

Closures don't always need the move keyword to move a variable.

Type of error

Language or grammar issue

Location of the error

In https://marabos.nl/atomics/basics.html

let numbers = vec![1, 2, 3];

thread::spawn(move || {
    for n in numbers {
        println!("{n}");
    }
}).join().unwrap();

Here, ownership of numbers is transferred to the newly spawned thread, since we used a move closure. If we had not used the move keyword, the closure would have captured numbers by reference. This would have resulted in a compiler error, since the new thread might outlive that variable.

Description of the error

In this example, the closure already moves the variable even without the move keyword, because the for n in numbers loop consumes numbers by value, so the closure must capture it by value anyway.

let numbers = vec![1, 2, 3];

thread::spawn(|| {
    for n in numbers {
        println!("{n}");
    }
}).join().unwrap();

The code above works fine. Maybe another example would be better.

Understated "worst that can happen" in Statistics Example

Type of error

Minor technical mistake

Location of the error

https://marabos.nl/atomics/atomics.html#example-statistics

Description of the error

Since the three atomic variables are updated separately, it is possible for the main thread to load the values after a thread has incremented num_done, but before it has updated total_time, resulting in an underestimate of the average. More subtly, because the Relaxed memory ordering gives no guarantees about the relative order of operations as seen from another thread, it might even briefly see a new updated value of total_time, while still seeing an old value of num_done, resulting in an overestimate of the average.

Neither of this is a big issue in our example. The worst that can happen is that an inaccurate average is briefly reported to the user.

I believe the worst that can happen is you divide by zero.
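A tiny repro of that failure mode, assuming the average is computed as total_time / num_done without a zero check (hypothetical; I haven't verified how the book's reporting loop computes it):

fn main() {
    // Stand-in for loading num_done before any item has finished:
    // this is 0 when the program is run without arguments.
    let num_done = std::env::args().skip(1).count() as u64;
    let total_time: u64 = 1_000_000;
    // Integer division by zero panics in Rust ("attempt to divide by
    // zero"), so this would crash rather than report an inaccurate value.
    println!("average: {} us", total_time / num_done);
}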

is it possible for the receiver thread to hang forever?

The content that the question is about

[screenshot of the channel implementation]

In Chapter 5. Building Our Own Channels

the implementation is from Safety Through Runtime Checks

The question

Is it possible for the receiver thread to hang forever?

my reasoning

For is_ready, the sender thread stores true with Release ordering and then unparks the receiver thread.
After being woken up, the receiver thread checks is_ready again. However, it's possible that the return value is still false, because we are using a plain Relaxed load.

This would park the receiver thread again, and since no other event will wake it up, it hangs forever.

Question about ch9 3-state mutex impl

The content that the question is about

Chapter 9. Building Our Own Locks > Avoid Syscalls

The question

The unlocking is implemented like

impl<T> Drop for MutexGuard<'_, T> {
    fn drop(&mut self) {
        if self.mutex.state.swap(0, Release) == 2 {
            wake_one(&self.mutex.state);
        }
    }
}

But I don't understand why the first line of the lock function can't load the zero value released by the unlock while other threads are still waiting for the mutex.

pub fn lock(&self) -> MutexGuard<T> {
    if self.state.compare_exchange(0, 1, Acquire, Relaxed).is_err() {
        // The lock was already locked. :(
        lock_contended(&self.state);
    }
    MutexGuard { mutex: self }
}

For example, thread A locked the mutex, and B and C are waiting for it. A releases the mutex while, concurrently, D tries to lock it. If D loads the zero value released by A in if self.state.compare_exchange(0, 1, Acquire, Relaxed).is_err(), D gets the lock and sets the lock state to 1. But the state should be 2, because multiple threads are waiting for it.

Another question is if D and B acquired and released the lock, then thread C acquired the lock by

while state.swap(2, Acquire) != 0 {
    wait(state, 2);
}

It will set the state to 2, but the correct state should be 1 because there is no thread waiting for the mutex.

What do I miss?

Include safety comments on unsafe blocks in chapter 3?

Type of error

Minor technical mistake

Location of the error

https://marabos.nl/atomics/memory-ordering.html#fences

Description of the error

In the "Let’s take a look at a more complicated use case of release and acquire fences" example, there are no // safety: ... comments on the unsafe blocks

I was finding the safety comments on other unsafe blocks to be helpful and I know this is considered a best practice so seems worth pointing out their absence here
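For illustration, here is a much smaller fence example (mine, not the book's) with the kind of // Safety: comments meant above:

use std::sync::atomic::{fence, AtomicBool, Ordering::{Acquire, Relaxed, Release}};
use std::thread;

static mut DATA: u64 = 0;
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    thread::spawn(|| {
        // Safety: READY is still false, so no other thread is reading DATA.
        unsafe { DATA = 123 };
        READY.store(true, Release);
    });
    while !READY.load(Relaxed) {
        thread::yield_now();
    }
    fence(Acquire);
    // Safety: the Acquire fence after observing `true` synchronizes with
    // the Release store, so the write to DATA happens-before this read,
    // and the writer no longer modifies it.
    println!("{}", unsafe { DATA });
}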

Clarification needed for order in the same thread.

First of all, thanks for the great book.
Secondly, if this is not the right place to talk about the book, let me know.

From Chapter 3:

[screenshot of a paragraph from Chapter 3]

AFAIK, compiler reordering, runtime instruction reordering, and the store buffer can all change the order of execution.
I'm not sure how to digest the statement in the picture above, and it's hard to reason about the "0 20" example by following it.

How is the result 0 20 possible in figure 3.1

The content that the question is about

https://marabos.nl/atomics/memory-ordering.html#happens-before

More interestingly, the output can also be 0 20, even though there is no possible globally consistent order of the four operations that would result in this outcome. When 3 is executed, there is no happens-before relationship with 2, which means it could load either 0 or 20. When 4 is executed, there is no happens-before relationship with 1, which means it could load either 0 or 10. Given this, the output 0 20 is a valid outcome.

The question

Firstly, I want to thank you for writing this book and making it freely available online. I have been enjoying reading it. Recently, I was reading your book with a group, and almost all of us found this section somewhat difficult to understand.

Anyway, to the question: how could 0 20 be possible if there is no possible globally consistent order of the four operations that would result in this outcome? I could only see it being possible if instruction scheduling/reordering were allowed, which would make sense in this case since we have relaxed memory ordering. I thought that with relaxed memory ordering the compiler would be able to reorder those instructions, as relaxed ordering implies no additional dependency between the atomic memory instructions.

However, given the definition of the happens-before relationship and figure 3.1 (i.e., no instruction reordering within a thread), it causes me confusion:

The basic happens-before rule is that everything that happens within the same thread happens in order. If a thread is executing f(); g();, then f() happens-before g().
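For reference, the numbered example from the book looks roughly like this (reconstructed from memory; the numbers 1-4 in the comments match the operations discussed above):

use std::sync::atomic::{AtomicI32, Ordering::Relaxed};
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);
static Y: AtomicI32 = AtomicI32::new(0);

fn a() {
    X.store(10, Relaxed); // 1
    Y.store(20, Relaxed); // 2
}

fn b() {
    let y = Y.load(Relaxed); // 3
    let x = X.load(Relaxed); // 4
    println!("{x} {y}");
}

fn main() {
    let t1 = thread::spawn(a);
    let t2 = thread::spawn(b);
    t1.join().unwrap();
    t2.join().unwrap();
}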

The only legal outcomes following the happens-before relationship, I thought, would be:

  • 1, 2, 3, 4: 10 20
  • 1, 3, 2, 4: 10 0
  • 1, 3, 4, 2: 10 0
  • 3, 1, 2, 4: 10 0
  • 3, 1, 4, 2: 10 0
  • 3, 4, 1, 2: 0 0

Also, when running the provided code, I get 10 20 about 92% of the time, and 0 0 the rest. Would it be possible somehow to force 0 20 in this example? Perhaps with a compiler flag?

Later on it does seem to be implied that this might only be the case when instruction scheduling/reordering is allowed:

Our intuitive understanding of the concept of "before" breaks down when things don’t necessarily happen in a globally consistent order, such as when instruction reordering is involved.

I am probably missing something; otherwise, in my opinion, it is inconsistent.

Guard in Chapter 4 is missing PhantomData<&'a mut T> field

Type of error

Serious technical mistake

Location of the error

https://marabos.nl/atomics/building-spinlock.html#building-safe-spinlock

Description of the error

The guard is missing a PhantomData<&'a mut T> field. This means it's possible to write code like the following:

use std::cell::Cell;
use std::thread;
let x = SpinLock::new(Cell::new(42));
let locked = x.lock();
thread::scope(|s| {
    s.spawn(|| locked.set(1));
    s.spawn(|| locked.set(2));
});

This is a data race, as the Cell is accessible from two threads at once.

This problem also occurs in Chapter 9 with its MutexGuard.
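For reference, here is a sketch of the chapter's spinlock with the suggested field added (reconstructed from memory, with the fix applied, so treat it as an illustration rather than the book's code):

use std::cell::UnsafeCell;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering::{Acquire, Release}};

pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

unsafe impl<T: Send> Sync for SpinLock<T> {}

pub struct Guard<'a, T> {
    lock: &'a SpinLock<T>,
    // The missing field: it makes the guard behave, for auto-trait
    // purposes, as if it holds a &mut T. Guard<'_, Cell<i32>> is then
    // no longer Sync (Cell isn't Sync), so the example above fails to
    // compile instead of racing.
    _marker: PhantomData<&'a mut T>,
}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        Self {
            locked: AtomicBool::new(false),
            value: UnsafeCell::new(value),
        }
    }

    pub fn lock(&self) -> Guard<T> {
        while self.locked.swap(true, Acquire) {
            std::hint::spin_loop();
        }
        Guard { lock: self, _marker: PhantomData }
    }
}

impl<T> Deref for Guard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // Safety: the existence of the guard means we hold the lock.
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> DerefMut for Guard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // Safety: as above, and &mut self gives exclusive access.
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for Guard<'_, T> {
    fn drop(&mut self) {
        self.lock.locked.store(false, Release);
    }
}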

A couple less "concrete" suggestions

The content that the question is about

No response

The question

Here are a couple of other notes that I took while going through the book that are more general feedback/suggestions about particular sections (vs typos etc):

  • Chapter 3: the explanation of fences feels like it could have been fleshed out a bit more; I came away slightly confused as to what exactly a fence guarantees, and why/how.
  • Chapter 5: I could use a bit more explanation of the reasoning around Channel<T> being Sync if T: Send (I've read about Send/Sync but still feel a bit fuzzy on them generally), e.g. why exactly is it OK to implement Sync under those conditions, and how exactly does that flow through the code to give the desired/expected use of the channel across threads? (A rough sketch of the declaration in question is included below.)

(no need to try and explain these here unless you feel very compelled to, more just suggestions for those sections of the book)
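For context, the Sync declaration in question looks roughly like this (a sketch from memory, not necessarily the book's exact code):

use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::AtomicBool;

pub struct Channel<T> {
    message: UnsafeCell<MaybeUninit<T>>,
    ready: AtomicBool,
}

// UnsafeCell makes Channel<T> !Sync by default. This unsafe impl is the
// author's promise that the send/receive protocol never lets two threads
// touch `message` at the same time. T: Send is required because the
// message value is written on the sending thread but read (and possibly
// dropped) on the receiving thread, i.e. it effectively moves between
// threads.
unsafe impl<T: Send> Sync for Channel<T> {}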

I found the book to be written at a very high level of quality throughout, I appreciated how "focused" and "sober" it was, so thank you!

Trying to clarify futex wait/wake behavior.

The content that the question is about

https://marabos.nl/atomics/os-primitives.html#futex

fn main() {
    let a = AtomicU32::new(0);

    thread::scope(|s| {
        s.spawn(|| {
            thread::sleep(Duration::from_secs(3));
            a.store(1, Relaxed);
            wake_one(&a);
        });

        println!("Waiting...");
        while a.load(Relaxed) == 0 {
            wait(&a, 0);
        }
        println!("Done!");
    });
}

The question

(This is somewhat similar to the question in #10)

I'm trying to understand what guarantees that the wait will always see the relaxed store of 1 if wake_one has already been called.

The chapter notes:

The wait operation behaves atomically with respect to the wake operation, meaning that no wake signal can get lost between the check of the expected value and the moment it actually goes to sleep.

If we make sure that the value of the atomic variable is changed right before a wake operation, we can be sure that a thread that’s about to start waiting will not go to sleep, such that potentially missing the futex wake operation no longer matters.

However, I'm having trouble connecting the atomicity of the wait operation to the implication that it will observe the store that occurs right before the wake.

The linux futex docs note:

This load, the comparison with the expected value, and starting to sleep are performed atomically and totally ordered with respect to other futex operations on the same futex word.

Does the "totally ordered" part play a role in this somehow? From the memory ordering chapter, I know that a particular atomic variable has a total modification order, but it isn't clear to me how that is connected to the order of futex operations (which aren't writes to that atomic).

I guess the next paragraph in the futex docs pretty much guarantees that the changed value will be observed, but it doesn't really provide any additional explanation:

The purpose of the comparison with the expected value is to prevent lost wake-ups. If another thread changed the value of the futex word after the calling thread decided to block based on the prior value, and if the other thread executed a FUTEX_WAKE operation (or similar wake-up) after the value change and before this FUTEX_WAIT operation, then the calling thread will observe the value change and will not start to sleep.
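For context, the wait/wake wrappers used in the example are thin wrappers around the futex syscall; roughly like this on Linux (a sketch using the libc crate, from memory):

use std::sync::atomic::AtomicU32;

#[cfg(target_os = "linux")]
pub fn wait(a: &AtomicU32, expected: u32) {
    unsafe {
        // FUTEX_WAIT: if *a == expected, sleep until woken (or until a
        // spurious wake-up); the check and the sleep are one atomic step
        // in the kernel, which is what prevents a lost wake-up in between.
        libc::syscall(
            libc::SYS_futex,
            a as *const AtomicU32,
            libc::FUTEX_WAIT,
            expected,
            std::ptr::null::<libc::timespec>(),
        );
    }
}

#[cfg(target_os = "linux")]
pub fn wake_one(a: &AtomicU32) {
    unsafe {
        // FUTEX_WAKE: wake at most one thread waiting on this address.
        libc::syscall(libc::SYS_futex, a as *const AtomicU32, libc::FUTEX_WAKE, 1);
    }
}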

A question about the test of `Arc<T>`

The content that the question is about

// One Arc, x, should be dropped by now.
// We still have y, so the object shouldn't have been dropped yet.
assert_eq!(NUM_DROPS.load(Relaxed), 0);
// Drop the remaining `Arc`.
drop(y);
// Now that `y` is dropped too,
// the object should've been dropped.
assert_eq!(NUM_DROPS.load(Relaxed), 1);

The question

In the test implementation, we use Relaxed ordering to load the current number of drops. However, AFAIK, wouldn't Relaxed ordering let the compiler or CPU move that load above or below the dropping of y, which would make the test fail? Dropping y does not seem to have a happens-before relationship with the last two atomic operations. Would SeqCst be more appropriate here?

Chapter 7 wrong module name?

Type of error

Minor technical mistake

Location of the error

https://marabos.nl/atomics/hardware.html#reordering-experiment

To be sure the compiler won’t be the one to reorder our operations, we’ll use the std::sync::compiler_fence() function to inform the compiler of the operations that should have been Acquire or Release, without telling the processor.

Description of the error

The compiler_fence() function seems to be in the std::sync::atomic module, not the std::sync module.

https://doc.rust-lang.org/std/sync/atomic/fn.compiler_fence.html

Typo in chapter 5 summary

Type of error

Typo

Location of the error

https://marabos.nl/atomics/building-channels.html#summary

Description of the error

Taking a non-Copy object by value can be used to prevent something to be done more than once.

Potential fix:

"Taking a non-Copy object by value can be used to prevent something to be from being done more than once."

(Marked as a typo, but maybe this could be considered a grammar issue?)
