How Rust Prevents Data Races

Jason McCampbell · Published in CodeX · Jun 29, 2021 · 10 min read

tl;dr: Rust’s memory management model, discussed in part 1 of this series, extends naturally to provide similar protection against data races when using shared-memory concurrency. This post looks at two specific examples of how the compiler prevents, at compile time, what could otherwise be hard-to-reproduce and time-consuming errors.

Managing interacting threads can be hard or impossible to reason about. Static analysis can untangle many issues where shared data interactions cannot be avoided. Photo by Nima Shabani on Unsplash

The Only Good Shared Data are Immutable Shared Data

EDA (Electronic Design Automation) tools are a class of technical computing applications focused on the design and validation of semiconductors and electronic systems in general. As discussed in part 1 of this series, elapsed runtime is commonly a competitive differentiator for these compute-intensive tools. Efficient algorithms that can maximize hardware performance are a must and, with server-class processors shipping with 16, 20, or more cores per device, that means parallel algorithms.

In an ideal world, single-threaded processes communicate by transmitting data rather than sharing it. In the real world, copying gigabytes or terabytes of data between processes is often impractical when an 8-byte pointer could instead be passed to another thread.
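
Rust expresses this ideal directly with channels. A minimal sketch using std::sync::mpsc, in which the string is moved between threads rather than shared:

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // Ownership of the String moves through the channel;
        // nothing is shared between the two threads.
        tx.send("Hello, world!".to_string()).unwrap();
    });
    println!("{}", rx.recv().unwrap());
    handle.join().unwrap();
}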

Unfortunately, shared-memory parallelism can be difficult to get, and keep, correct. Every access to the shared data must be guarded, with no exceptions: not even a buried call to an innocuous, non-reentrant function originally written for single-threaded use can be missed.

The Rust language was explicitly designed with safe and efficient concurrent programming as one of its goals, to the extent that the community has adopted the mantra “fearless concurrency”. This post delves into how Rust’s unique ownership model not only ensures memory safety, even across threads, but also detects and prevents data races at compile time.

Eliminating Data Races, Statically

The previous post summarized Rust’s ownership rules like this:

  1. Every value has a single owner (e.g., a variable or structure field), and the value is released (dropped) when the owner goes out of scope;
  2. There may exist at most one mutable reference to a value; or
  3. There may be any number of immutable references to a value, and while they exist the value may not be mutated;
  4. All references must have a lifetime no longer than the value being referred to.

Rules 2 and 3 turn out to be the same rules as those for a readers-writer lock: multiple readers (immutable references) or a single writer (mutable reference) are allowed access. Coupled with the lifetime analysis of values and references for memory management, the compiler now has a basis to statically detect potential data races.
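
These rules are enforced even within a single thread. A minimal sketch of rules 2 and 3 in action:

fn main() {
    let mut v = vec![1, 2, 3];
    let r1 = &v; // Any number of immutable references is fine...
    let r2 = &v;
    // v.push(4); // ...but uncommenting this mutation fails with
    //            // error[E0502]: cannot borrow `v` as mutable
    //            // because it is also borrowed as immutable
    println!("{} {}", r1[0], r2[1]);
}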

To understand this better, let’s look at a simple example in C++:

#include <iostream>
#include <string>
#include <thread>

int main() {
    std::string msg = "Hello";
    // The lambda captures 'msg' by reference and reads it on another thread...
    std::thread t1([&]() {
        std::cout << msg << std::endl;
    });
    // ...while the main thread mutates it: a data race
    msg += ", world!";
    t1.join();
    return 0;
}

The example spawns a thread to print the content of msg while the main thread continues on to mutate that message. What result gets printed? On my laptop using g++ 10.2 I get “Hello, world!”. With minor changes, the result changed to “Hello”. If I’m really (un)lucky, I might even see a data corruption issue if the accesses overlap.

Here is the same example in Rust:

use std::thread;

fn main() {
    let mut msg = "Hello".to_string();
    // The closure borrows 'msg' immutably across the thread boundary...
    let handle = thread::spawn(|| {
        println!("{}", &msg);
    });
    // ...while the main thread attempts to mutate it
    msg.push_str(", world!");
    handle.join().unwrap();
}

The compiler produces two errors, the most important one for this example being:

error[E0502]: cannot borrow `msg` as mutable because it is also borrowed as immutable
--> src/main.rs:9:5

The closure passed to thread::spawn implicitly creates an immutable reference to msg. However, msg is also used mutably afterwards, violating the prohibition against having both mutable and immutable references to a value at the same time. The compiler has caught the data race. Moving the push_str call above thread::spawn is one potential solution, since the mutable reference is no longer needed by the time the thread is created.

The second error reported is that msg potentially does not live long enough because it is stack allocated and will be released when main ends. Even though the example does join the thread, avoiding this issue, the compiler does not analyze what the spawned thread may do with the value.

The final, correct version modifies the value of msg and then moves it to the spawned thread, guaranteeing that it remains valid as long as needed. The compiler also guarantees that the main thread no longer accesses the value — it has been moved. Thus the compiler also prevents the possibility of a use-after-free error where the spawned thread releases the string memory while the original thread has a dangling reference to it.
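
A minimal sketch of that corrected version:

use std::thread;

fn main() {
    let mut msg = "Hello".to_string();
    msg.push_str(", world!"); // Mutate first, while main still owns the value

    // 'move' transfers ownership of 'msg' into the closure; the main thread
    // can no longer access it, so neither a race nor a use-after-free is possible.
    let handle = thread::spawn(move || {
        println!("{}", msg);
    });
    handle.join().unwrap();
}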

This example shows how the compiler can detect and prevent data races by preventing two threads from accessing the same value, thus helping enforce the ideal case of transmitting, instead of sharing, data. But what about when sharing, and modifying, data from different threads is necessary? The next example explores how Rust’s ownership model can be leveraged to guarantee synchronized data are accessed correctly, and to prevent accidental (or intentional?) leaks of references which bypass the synchronization.

Sharing Mutable Data, Safely

Sharing mutable data requires some means of gaining exclusive access to the data to prevent modification when one or more readers are active. Frequently this is done with a mutex (MUTual EXclusion lock).

In C++, one common pattern is to use RAII to acquire the lock and guarantee the lock is released at the end of a block. A typical example looks something like this:

void MyClass::method() {
    std::lock_guard<std::mutex> lck(mut); // Acquire lock on member mutex 'mut'
    // ... use the synchronized data ...
} // the guard's destructor releases 'mut', even if an exception is thrown

This pattern has the advantage that it guarantees the lock is released at the end of the block by the guard’s destructor, even if an exception occurs within the synchronized code. Unfortunately, the pattern doesn’t explicitly tell us which members of MyClass need to be synchronized, and it doesn’t prevent code from accessing synchronized data without holding the lock.
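
A hypothetical illustration of that second weakness, with data standing in for a member that 'mut' was meant to protect:

void MyClass::another_method() {
    // No lock taken: this compiles anyway, because nothing ties the
    // mutex 'mut' to the data it is supposed to guard.
    data.push_back(42);
}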

Rust leverages the single-ownership rule for values to eliminate both of these weaknesses. Specifically, the mutex owns the value so all accesses must go through it and there is no question about whether a given data element needs to be synchronized or not. Further, no references to the data can exist longer than the lifetime of the lock. How?

Let’s look at an example and walk through exactly what is happening.

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let str = "Hello".to_string();
    let sref = &str; // Attempt to cheat! (rejected by the compiler)

    // The atomic reference count (Arc) owns the mutex, which owns the string
    let rc = Arc::new(Mutex::new(str));

    let rc_to_thread = Arc::clone(&rc);
    let h = thread::spawn(move || {
        let mut thread_str = rc_to_thread.lock().unwrap();
        (*thread_str).push_str(", world!");
    });
    h.join().unwrap();
    println!("m = {:?}", sref);
}

As in the example before, the string “Hello” is created and the rest of the string will be appended. However, in this case a spawned thread performs the append and the main thread writes out the result. Once the string is created, it is moved into (becomes owned by) a Mutex instance, and the mutex in turn is owned by an Arc (Atomic Reference Count) value.

Remember that in Rust every value must have exactly one owner. Arc is the single owner of the Mutex and the string, but it is special: the actual ‘owner’ is a heap-allocated, atomically reference-counted container, and the rc value here is a smart pointer to that container. The following line:

let rc_to_thread = Arc::clone(&rc);

clones the smart pointer, which increments the reference count. Thus rc_to_thread can be moved to the new thread — it will have a new owner — but both Arc instances reference the same value. (For a description of how Arc and Mutex get around Rust’s rules, see “Appendix: Into the Unsafe Weeds of Arc and Mutex” at the end.)

The second thread, having received rc_to_thread, is able to access the string. We have successfully made the string available across two threads and the string will remain valid as long as any reference to it exists. However, Arc doesn’t have any synchronization behavior so it can only return immutable references, thus the spawned thread would not be able to append to it. (Rust does not allow the equivalent of C++’s const_cast in safe code).

This is where Mutex comes into play: by providing synchronization, the lock().unwrap() call is able to return a mutable reference to the string. The value thread_str is a smart pointer that both holds the lock until the value is dropped (goes out of scope) and provides access to the string value.

Here again is the closure passed to the thread:

{
    let mut thread_str = rc_to_thread.lock().unwrap();
    (*thread_str).push_str(", world!");
}

Notice that this is similar to the C++ locking pattern in that thread_str holds the lock until the end of the scope. The big difference is that thread_str is a smart pointer that both holds the lock and provides access to the contained value when dereferenced.

Why does this difference matter? Consider a slightly more complex closure that attempts to cheat:

{
    let str_ref: &String;
    if true {
        let mut thread_str = rc_to_thread.lock().unwrap();
        (*thread_str).push_str(", world!");
        str_ref = &(*thread_str); // Save a reference outside the lock
    }
    ...
}

Here the code attempts to save a reference to the synchronized data outside of the lock. However, this attempt is prevented by Rule #4: references cannot outlive the value they reference. str_ref would live longer than thread_str, the holder of the lock, so the compiler rejects it.

The final, working version avoids any attempt to bypass the locking mechanism and instead goes through rc to lock the value when it is printed.
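
A minimal sketch of that final version:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // The Arc owns the Mutex, which owns the string
    let rc = Arc::new(Mutex::new("Hello".to_string()));

    let rc_to_thread = Arc::clone(&rc);
    let h = thread::spawn(move || {
        let mut thread_str = rc_to_thread.lock().unwrap();
        thread_str.push_str(", world!");
    });
    h.join().unwrap();

    // Lock again, through rc, to read the result safely
    println!("m = {:?}", rc.lock().unwrap());
}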

Summary

At one point in my career, I was part of a team building a cross-platform, Python-based GUI. We were plagued by an infrequent memory-corruption issue that only showed up on Mac OS. As the GUI stabilized, the issue became more urgent: while infrequent, it wasn’t something we could ship with, yet it resisted debugging. ASan revealed nothing.

After focusing on it for a day or two and getting nowhere, I woke up one morning and realized it wasn’t a memory-allocation issue but a timing issue. The application itself was single-threaded, but the underlying GUI library did use threads. An hour of reviewing any code remotely related to the stack traces turned up an unsynchronized use of a pointer in the binding library.

Errors like this are easy to make yet can be very difficult to reproduce and thus difficult to find and correct. In this case, we easily spent a week, maybe more, of valuable developer time trying to find the problem. This is exactly the sort of problem which can now be caught, and eliminated, at compile time.

Hopefully the examples here have provided enough detail, without being too overwhelming, to give a sense of how Rust works and how it is able to prevent such errors. If you are interested in reading more, consider Mozilla’s “Fearless Concurrency” post about the success they had parallelizing the CSS renderer in Firefox after two failed attempts, or The Rust Book’s chapter on concurrency for more details.

The first two posts in this series have focused on concrete, quantifiable benefits: memory and concurrency safety. In the final post I will switch gears and focus on what I believe is a potential to significantly increase developer productivity by increasing reuse of both open-source and company-proprietary code.

Appendix: Into the Unsafe Weeds of Arc and Mutex

The unsafe keyword allows code to step outside the normal safe bounds when needed. Photo by Jack Sloop on Unsplash

The Arc and Mutex types both appear to violate the rules of the Rust compiler: Arc allows multiple owners for a single value and Mutex can generate a mutable reference from an immutable one, similar to const_cast in C++. Both are good examples of how Rust’s unsafe keyword can be leveraged to implement primitives which present safe abstractions. Let’s look at how they work.

Calling Arc::new creates the smart-pointer handle, as well as a separately allocated block of memory which holds the reference count and the value. Allocating this block requires an unsafe operation since its lifetime is not tied to any single owner. As smart-pointer instances are created and destroyed (‘dropped’), the reference count is updated. Once it reaches zero, another unsafe operation drops the value and releases the allocated memory.
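
To make this concrete, here is a hypothetical, heavily simplified sketch of an Arc-like type. MyArc is illustrative only; the real Arc also handles weak references, unsized types, and the Send/Sync marker traits:

use std::ops::Deref;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// The separately allocated block: reference count plus the value
struct Inner<T> {
    count: AtomicUsize,
    value: T,
}

struct MyArc<T> {
    ptr: *const Inner<T>, // raw pointer: lifetime managed manually
}

impl<T> MyArc<T> {
    fn new(value: T) -> Self {
        // Box::into_raw deliberately 'leaks' the block so its lifetime
        // is governed by the count, not by any one owner.
        let inner = Box::new(Inner { count: AtomicUsize::new(1), value });
        MyArc { ptr: Box::into_raw(inner) }
    }
}

impl<T> Clone for MyArc<T> {
    fn clone(&self) -> Self {
        // Cloning just bumps the count; both handles share one block
        unsafe { (*self.ptr).count.fetch_add(1, Ordering::Relaxed) };
        MyArc { ptr: self.ptr }
    }
}

impl<T> Deref for MyArc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &(*self.ptr).value } // only immutable access is offered
    }
}

impl<T> Drop for MyArc<T> {
    fn drop(&mut self) {
        unsafe {
            // The last owner out drops the value and frees the block
            if (*self.ptr).count.fetch_sub(1, Ordering::Release) == 1 {
                fence(Ordering::Acquire);
                drop(Box::from_raw(self.ptr as *mut Inner<T>));
            }
        }
    }
}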

Similarly, the Arc smart pointer only returns immutable references to the contained value, yet calling lock() on an immutable Mutex yields a mutable reference to the value it contains. Mutex is able to do this by performing the equivalent of the C++ const_cast (an unsafe operation) once the locking semantics are satisfied.
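
The building block for that cast is std::cell::UnsafeCell, Rust’s sanctioned escape hatch for interior mutability. A hypothetical, simplified spinlock sketch shows the shape of the idea; the real std::sync::Mutex blocks on an OS primitive and adds lock poisoning:

use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

struct MyMutex<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>, // UnsafeCell enables the 'const_cast'
}

// Sharing MyMutex across threads is safe *because* access is synchronized
unsafe impl<T: Send> Sync for MyMutex<T> {}

struct MyGuard<'a, T> {
    lock: &'a MyMutex<T>,
}

impl<T> MyMutex<T> {
    fn new(value: T) -> Self {
        MyMutex { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    fn lock(&self) -> MyGuard<'_, T> {
        // Spin until we flip 'locked' from false to true
        while self.locked.swap(true, Ordering::Acquire) {}
        MyGuard { lock: self }
    }
}

impl<T> Deref for MyGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> DerefMut for MyGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // The 'const_cast' moment: a mutable reference conjured from the
        // immutable lock reference, sound only because we hold the lock.
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for MyGuard<'_, T> {
    fn drop(&mut self) {
        // Release the lock when the guard goes out of scope
        self.lock.locked.store(false, Ordering::Release);
    }
}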

The important part here is that Arc and Mutex are not given special treatment by the compiler; anyone can write new primitives with the same capabilities. So you may be wondering: if a developer can just insert an unsafe block, doesn’t that negate Rust’s safety advantages?

The answer depends on how unsafe is used. In the two cases discussed here, a type presents a well-defined, safe interface which encapsulates a small amount of unsafe code. This allows the type to be carefully reviewed to ensure correct behavior, while the bulk of the code retains the compiler’s safety guarantees.

One certainly could declare large amounts of code in an application unsafe and defeat the compiler checks if desired, just as it is possible to write C-style, pointer-heavy code in C++ and ignore the STL. There are legitimate uses for both, but they typically make up a small percentage of an application’s lines. Or, to paraphrase Bill Clinton, such code should be unsafe, legal, and rare.
