The Dangers of Race Conditions in Five Minutes

Key Takeaways

Race conditions are a potentially dangerous property of multithreaded code, where the program’s outcome is dependent on the specific order of operations, which is not guaranteed by the underlying platform. This can result in fluctuating outcomes across different runs of the program, as the outcome depends on how operations from different threads interleave.
Race conditions can be caused by multiple threads sharing and mutating the same object, leading to unpredictable outcomes. This can occur even in simple scenarios, such as multiple threads incrementing the same counter, or in more complex scenarios, such as multiple threads updating the same instance of a LinkedList.
Preventing race conditions can be achieved in several ways, including synchronization and locks, using concurrent collections and atomic values provided by JDK, or by avoiding mutable state altogether by using immutable objects and data structures. Proper synchronization ensures that only one thread can access a shared resource at a time, thus preventing race conditions.

Race Conditions
A Simple Example
How Increment Creates a Race Condition
A Not so Simple Example
Conclusions
Comments

A race condition is an undesired property of multithreaded code. It expresses that the program’s outcome depends on a particular order of operations but that the underlying platform (in the case of Java, the JVM) does not guarantee that order. As a consequence the outcome is often fluctuating across runs as it depends on how exactly operations from different threads interleave. In Java, race conditions occur most often when multiple threads share and mutate the same object.

The only prerequisite for this article is a basic knowledge of threads.

Race Conditions

A Simple Example

Let’s start with an example. Here I’ve implemented a Runnable that increments a shared counter ten thousand times:

class IncrementingRunnable implements Runnable {

    private final MutableInteger mutableInteger;

    public IncrementingRunnable(MutableInteger mutableInteger) {
        this.mutableInteger = mutableInteger;
    }

    @Override
    public void run() {
        for (int i = 0; i < 10_000; i++) {
            mutableInteger.increment();
        }
    }

}

Since a value of a standard Java Integer cannot be updated after it was created, I’ve implemented a simple MutableInteger class that has two methods:

increment to increment its value
getValue to get the result value.

class MutableInteger {

    private int value = 0;

    public void increment() {
        value++;
    }

    public int getValue() {
        return value;
    }

}

With this, we can pass the same counter to multiple threads that will increment it in parallel:

List<Thread> threads = new ArrayList<>();

// Variable to increment from multiple threads
MutableInteger integer = new MutableInteger();
// Run 10 threads to increment the same variable
for (int i = 0; i < 10; i++) {
    Thread thread = new Thread(new IncrementingRunnable(integer));
    thread.start();
    threads.add(thread);
}

// Wait until all threads are finished
for (Thread thread : threads) {
    thread.join();
}

System.out.println("Result value: " + integer.getValue());

(If you’re not sure how Thread::start or join work, read this five-minute intro to threads.)

Since we have ten threads and every thread performs 10,000 increments, we might expect the final value to be 100,000. But the actual result may surprise you:

Result value: 61505

What is more astonishing is that the total sum is different every time the application is executed, but it is always less than 100,000.

How Increment Creates a Race Condition

To understand what happens here we need to cover how the increment operation works. You may think that it is an indivisible operation and does not have any intermediate states, but in reality, a CPU executes several low-level instructions like these:

//i++
$tmp = i; // Store value to a temporary storage
$tmp = $tmp + 1; // Increment value in the temporary storage
i = $tmp; // Write result value back

This does not cause any issues when only one thread is updating a counter. But when there are multiple threads they can step on each other’s toes and execute those low-level operations in any order:

// T1 is Thread 1; T2 is Thread 2
T1: $tmp_1 = i;
T1: $tmp_1 = $tmp_1 + 1;
T2: $tmp_2 = i;
T2: $tmp_2 = $tmp_2 + 1;
T2: i = $tmp_2;
T1: i = $tmp_1;

In this particular example, threads 1 and 2 both load the same i, let’s say 5, into their respective temporary variable. Hence, after incrementing both write 6 back but the desired result for two increments would have been 7.

This is why when we run our application the final result is always less than 100,000. An individual thread is not aware that there are other threads that are also incrementing the same number. As a result, the final outcome depends on which thread accesses memory in which order.

A Not so Simple Example

Race conditions can also occur when we work with complex data structures. Let’s take a look at an example when multiple threads are updating the same instance of a LinkedList:

class ListAddRunnable implements Runnable {

    private final LinkedList<Integer> list;

    public ListAddRunnable(LinkedList<Integer> list) {
        this.list = list;
    }

    @Override
    public void run() {
        for (int i = 0; i < 10_000; i++) {
            list.add(i);
        }
    }

}

Using this Runnable, we can again start 10 threads that each is supposed to add 10_000 elements to the same list and wait for them to finish. Here is the output that I got:

Result value: 44157

What happened here? To understand it we first need to understand how LinkedList and particularly add work. Internally LinkedList is represented as a chain of nodes and every node contains two references: one to an element added to the list and another one to the next node. The add method adds a node to the end of the list – here’s a simplified version of that method:

void add(E element) {
    Node<E> newSecondLast = this.last;
    // create the new node (null says there is no next node)
    Node<E> newNode = new Node<>(element, null);
    // use it as the last node and let the old 'last' refer to it
    this.last = newNode;
    newSecondLast.next = newNode;
}

(You can read more about how LinkedList is implemented on Wikipedia.)

Now let’s take a look at one sequence of events that can happen if two threads are adding elements at the same time:

// First thread starts adding an element to the end of the list
T1: Node<E> newSecondLast = this.last;
T1: Node<E> newNode = new Node<>(element, null);
// Second thread adds an element to the end of the list
T2: Node<E> newSecondLast = this.last;
T2: Node<E> newNode = new Node<>(element, null);
T2: this.last = newNode;
T2: newSecondLast.next = newNode;
// First thread continues and overwrites the last element of the list
T1: this.last = newNode;
T1: newSecondLast.next = newNode1

Note that all variables are local to the two threads except this.last. It is the order in which they read from and write to that shared field that makes or breaks the program. In the case above both read the same last node and from there on there is no way to get to a correct state because either T1 overrides T2 or the other way around.

Again, the program’s behavior depends on the unpredictable interaction between threads.

Race Condition on Horse

Conclusions

In this post, you’ve learned that a race condition occurs when multiple threads update shared state in a way that is not guaranteed to be correct. It may be tricky to spot, especially if multithreading programming is new for you, but, as a rule of thumb, you should be especially careful every time you have an object accessible by multiple threads.

There are several ways to prevent race conditions. The most basic way to do it is to use synchronization and locks – low-level primitives to build concurrent algorithms. JDK also provides a wide range of concurrent collections and atomic values that can be used as high-level building blocks.

Another way to avoid race conditions is to avoid a mutable state in the first place. To do this, you need to use immutable objects and immutable data structures that create copies of themselves on every update instead of performing a change in-place.

Frequently Asked Questions (FAQs) about Race Conditions

What are the common causes of race conditions in programming?

Race conditions in programming are typically caused by the improper use of shared resources. This can occur when two or more threads access shared data and try to change it at the same time. As a result, the values of variables may be unpredictable as they can be dependent on the timings of context switches of the processes. Other causes include improper use of multi-threading and lack of synchronization when accessing shared resources.

How can race conditions be prevented?

Race conditions can be prevented by proper synchronization of threads. This can be achieved by using various synchronization techniques such as mutexes, semaphores, monitors, and condition variables. These techniques ensure that only one thread can access the shared resource at a time, thus preventing race conditions. Additionally, careful design of the program can also prevent race conditions.

What are the potential dangers of race conditions?

Race conditions can lead to unpredictable results and bugs that are hard to reproduce and fix. They can cause data corruption, system crashes, and security vulnerabilities. In severe cases, they can lead to a complete system failure. Therefore, it is crucial to prevent race conditions in the design and implementation stages of software development.

How are race conditions detected?

Detecting race conditions can be challenging as they do not occur every time the program runs and their effects can be subtle. However, there are tools and techniques available to detect race conditions. These include static analysis tools, dynamic analysis tools, and stress testing. Additionally, careful code review and testing can also help in detecting race conditions.

What is the difference between a race condition and a deadlock?

A race condition is a situation where the output or result of a process is dependent on the sequence or timing of other uncontrollable events. On the other hand, a deadlock is a situation where two or more tasks permanently block each other by each task waiting for the other to release a resource.

Can race conditions be exploited for malicious purposes?

Yes, race conditions can be exploited for malicious purposes. An attacker can take advantage of the timing discrepancy to manipulate the system. This can lead to unauthorized access, data corruption, and other security breaches.

Are race conditions only a concern in multi-threaded environments?

While race conditions are more common in multi-threaded environments, they can also occur in single-threaded environments. This can happen when the sequence or timing of events affects the program’s outcome, such as when signals, interrupts, or other asynchronous events are involved.

What is a data race and how is it different from a race condition?

A data race occurs when two instructions from different threads access the same memory location, with at least one of them writing, and the operations are not ordered. A race condition, on the other hand, is a higher-level semantic error where the incorrectness of the program’s behavior depends on the relative timing of events beyond just memory accesses.

How does the operating system handle race conditions?

Operating systems handle race conditions by providing mechanisms for process synchronization. These mechanisms include locks, semaphores, and condition variables. These tools allow processes and threads to coordinate their actions to prevent race conditions.

Are race conditions a hardware or software issue?

Race conditions are primarily a software issue. They occur due to the non-deterministic timing of events in a concurrent system. However, hardware can also play a role in race conditions, particularly in multi-processor systems where memory access times can vary.