Sunday, May 19, 2013

Lock-Free and Wait-Free, definition and examples

After searching around on Google, I found that there is no single place that properly explains the different Progress Conditions that a concurrent algorithm or data structure may provide. Even "The Art of Multiprocessor Programming" has these definitions spread around the book, and most of them are one-line sentences that are way too cryptic for common mortals like myself, so I decided to put together what I know on the subject and add some examples for each case.

We start with a list of the different classifications for Progress Conditions separated into the four main groups Blocking, Obstruction-Free, Lock-Free, and Wait-Free:

Blocking
    1. Blocking
    2. Starvation-Free

Obstruction-Free
    3. Obstruction-Free

Lock-Free
    4. Lock-Free (LF)

Wait-Free
    5. Wait-Free (WF)
    6. Wait-Free Bounded (WFB)
    7. Wait-Free Population Oblivious (WFPO)

Before we dive into the definition of each of these terms, let us look at some of the details behind them.

From what's written in section 3.6 of "The Art of Multiprocessor Programming", it is difficult to understand whether or not the definitions for Wait-Free Bounded and Wait-Free Population Oblivious are orthogonal. Personally, I think that an algorithm being Wait-Free Population Oblivious implies that it is bounded and, therefore, also Wait-Free Bounded. This means that a function (or algorithm, or class) can be WF, WFB, or WFPO, this last one being the "best" of all the Progress Conditions.

The examples we will give here all concern the progress condition of a particular function (of an algorithm, or class), and not the full algorithm. We do this because different functions in a given algorithm may have very different progress guarantees; for example, the Write operations may be Blocking while the Read operations are Wait-Free Population Oblivious, something that is not obvious when you first look at Lock-Free and Wait-Free data structures.

Yes, you read it correctly: a data structure may have some of its functions Blocking, others Lock-Free, and others even Wait-Free. On the other hand, when we say that a particular data structure is Lock-Free, it means that all its operations are Lock-Free or better.
There is no standard strict ranking in the literature of which one is better, but it is usually implied that Wait-Free is better than Lock-Free, which is better than Blocking.

The reason for this is that there aren't that many (practical) data structures that are Lock-Free or Wait-Free, so the issue of properly classifying them doesn't usually show up. Andreia and I have come up with so many variations of different concurrent algorithms with mixed Progress Conditions that we had to start naming and sorting them properly, to keep a handle on them and to figure out which ones were worth researching.
Our order is the one described at the beginning of the post, where item 1 (Blocking) is the worst and item 7 (WFPO) is the best possible kind of guarantee you can get.
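
To make the idea of mixed progress conditions more concrete, here is a small hypothetical sketch (mine, not taken from any particular library) of a class whose write is Blocking while its read is Wait-Free Population Oblivious:

import java.util.concurrent.atomic.AtomicLong;

class MixedProgressCounter {
    private final AtomicLong value = new AtomicLong(0);
    private final Object writeLock = new Object();

    // Blocking: if the thread holding writeLock is put to sleep,
    // every other writer is stuck waiting for it.
    public void increment() {
        synchronized (writeLock) {
            value.set(value.get() + 1);
        }
    }

    // Wait-Free Population Oblivious: a single atomic load,
    // no matter how many threads are running.
    public long read() {
        return value.get();
    }
}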

Something else that is a common cause of confusion is that "Wait-Free implies Lock-Free" (but not the other way around). This means that a function that is said to be Wait-Free gives the same (and stronger) progress guarantees as a function that is Lock-Free.
Here is a Venn diagram that might explain better what we mean by it:


This diagram represents sets of algorithms, where an algorithm that is WFPO is also part of the set of algorithms that are Lock-Free.

1. Blocking

This one is the most commonly known: basically, almost everything with a lock is Blocking. An unexpected delay by one thread can prevent others from making progress. In the worst case, a thread holding the lock may be put to sleep, thus blocking every other thread waiting on that lock and preventing them from making any progress.
Definition: A function is said to be Blocking if it is unable to progress in its execution until some other thread releases a resource.
Example: A simple CAS in a loop on a two-state variable:

AtomicInteger lock = new AtomicInteger(0);
public void funcBlocking() {
    // Spin until we manage to change the lock from 0 (free) to 1 (taken).
    // If the thread currently holding the lock is delayed or halts,
    // this loop never terminates: no progress is possible.
    while (!lock.compareAndSet(0, 1)) {
        Thread.yield();
    }
}

2. Starvation-Free

This is sometimes called Lockout-Free. It is a dependent property, in the sense that progress occurs only if the underlying platform/OS provides certain guarantees.
Definition: Every thread that attempts to enter the critical section will eventually succeed, provided that every thread that enters the critical section eventually leaves it.
Example: An exclusive lock with strict fairness is usually Starvation-Free.
A Ticket Lock on x86 is Starvation-Free because it uses a single atomic_fetch_add() to place the thread on the virtual queue of waiting threads.
Code for the Ticket Lock can be seen here.
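
Below is a minimal sketch of a Ticket Lock in Java (the class and field names are mine, not taken from the post linked above), assuming that AtomicInteger.getAndIncrement() maps to a fetch-and-add instruction (XADD on x86). The FIFO order of the tickets is what provides the Starvation-Free guarantee:

import java.util.concurrent.atomic.AtomicInteger;

class TicketLock {
    private final AtomicInteger ticket  = new AtomicInteger(0); // next ticket to hand out
    private final AtomicInteger serving = new AtomicInteger(0); // ticket currently being served

    public void lock() {
        // A single fetch-and-add places this thread on the virtual FIFO queue
        final int myTicket = ticket.getAndIncrement();
        while (serving.get() != myTicket) {
            Thread.yield();   // spin until it is our turn
        }
    }

    public void unlock() {
        // Hand the lock over to the next thread in the queue
        serving.incrementAndGet();
    }
}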

3. Obstruction-Free

A non-blocking property. For more details on Obstruction-Free and Starvation-Free, take a look at page 60 of "The Art of Multiprocessor Programming" (revised edition).
Definition: A function is Obstruction-Free if, from any point after which it executes in isolation, it finishes in a finite number of steps.
Example: The only example I know of is the Double-Ended Queue by Maurice Herlihy, Mark Moir, and Victor Luchangco.
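
To give a flavor of the property with something simpler, here is a hypothetical seqlock-style reader (a sketch of mine, not the deque from the paper): it re-reads a version counter to detect interference. Running in isolation it finishes after a single pass, but an endless stream of writers can force it to retry forever, so this read is Obstruction-Free but not Lock-Free:

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

AtomicInteger version = new AtomicInteger(0); // even = stable, odd = write in progress
AtomicLong x = new AtomicLong(0);
AtomicLong y = new AtomicLong(0);

public long[] readPairObstructionFree() {
    while (true) {
        final int v = version.get();
        if ((v & 1) != 0) continue;   // a write is in progress, try again
        final long a = x.get();
        final long b = y.get();
        if (version.get() == v) {     // no write happened while we were reading
            return new long[] { a, b };
        }
    }
}

A writer (not shown) would increment version to an odd value, update x and y, and then increment it back to an even value; with more than one writer it would also need mutual exclusion among the writers.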

4. Lock-Free

The Lock-Free property guarantees that at least some thread is making progress on its work. In theory this means that a particular call may take an unbounded number of operations to complete, but in practice it completes in a short amount of time, otherwise it wouldn't be of much use.
Definition: A method is Lock-Free if it guarantees that infinitely often some thread calling this method finishes in a finite number of steps.
Example: Incrementing an atomic integer in a CAS loop:

AtomicInteger atomicVar = new AtomicInteger(0);
public void funcLockFree() {
    // Keep retrying the CAS until it succeeds. Every failed attempt means
    // some other thread changed the variable, i.e. some thread made progress.
    int localVar = atomicVar.get();
    while (!atomicVar.compareAndSet(localVar, localVar + 1)) {
        localVar = atomicVar.get();
    }
}
Another notable example of Lock-Free is the ConcurrentLinkedQueue in java.util.concurrent, whose add() and remove() operations are Lock-Free.

5. Wait-Free

The Wait-Free property guarantees that any given thread provided with a time-slice will be able to make some progress and eventually complete. It guarantees that the number of steps is finite, but in practice it may be extremely large and depend on the number of active threads, which is why there aren't many practical Wait-Free data structures.
Definition: A method is Wait-Free if it guarantees that every call finishes its execution in a finite number of steps.
Example: The paper "Wait-Freedom vs. Bounded Wait-Freedom in Public Data Structures" has an example of an algorithm that is Wait-Free (unbounded).

6. Wait-Free Bounded

Any function that is Wait-Free Bounded is also Wait-Free (but not necessarily Wait-Free Population Oblivious).
Definition: A method is Wait-Free Bounded if it guarantees that every call finishes its execution in a finite and bounded number of steps. This bound may depend on the number of threads.
Example: A function that writes a value into every entry of an array whose size depends on the number of threads. If the number of operations per entry is constant, then it is easy to see that this is Wait-Free Bounded, but not Population Oblivious, because the bound (the size of the array) depends on the number of threads:

AtomicIntegerArray intArray = new AtomicIntegerArray(MAX_THREADS);
public void funcWaitFreeBounded() {
    // One atomic store per entry: the total number of steps is bounded,
    // but the bound (MAX_THREADS) depends on the number of threads.
    for (int i = 0; i < MAX_THREADS; i++) {
        intArray.set(i, 1);
    }
}

7. Wait-Free Population Oblivious

These functions complete in a number of instructions that does not depend on the number of active threads. Any function that is Wait-Free Population Oblivious is also Wait-Free Bounded.
Definition: A Wait-Free method whose performance does not depend on the number of active threads is called Wait-Free Population Oblivious.
Example: The simplest example is incrementing an atomic variable using a fetch-and-add primitive (the XADD instruction on x86 CPUs). This can be done with the C11/C++11 atomic fetch_add() function:

std::atomic<int> counter {0};
void funcWaitFreeBoundedPopulationOblivious() {
    // A single fetch-and-add (XADD on x86): the same number of steps
    // no matter how many threads are running.
    counter.fetch_add(1);
}



Closing thoughts

This isn't the whole story. There are two very important points we're missing here to get the full picture:

1st, if your function needs to allocate memory, then the progress guarantee it can give is limited in practice by the progress condition of the memory allocation mechanism. In my view, we should have a different sub-classification to distinguish between the functions that need to allocate memory and those that don't. You can check out this post for more details, but the basic idea is that there is not much point in creating a Wait-Free Population Oblivious function that always needs to allocate memory through a Blocking mechanism.

2nd, the whole point of Progress Conditions is to classify algorithms and functions in terms of time guarantees, yet the definitions are made in terms of the number of operations. This is done under the assumption that the time it takes for an operation to complete is the same regardless of the number of active threads, which is a correct assumption for single-threaded code but an invalid one when it comes to multi-threaded programs.
The way the CPU's cache-coherence mechanisms work means that contention on (atomic) variables being accessed by multiple threads/cores can cause an operation/instruction to take much longer in certain situations (due to cache misses). If you don't believe me, take a look at this post on the Mechanical Sympathy blog.
This is one of the main reasons why many implementations of Wait-Free data structures are much slower than Lock-Free implementations of the same data structures: although they have a stronger guarantee on the total number of operations, each operation may take much longer to complete because it causes more contention... and they may also have a larger average number of operations.


In practice, there is little difference between "Wait-Free" and "Wait-Free Bounded". The paper "Wait-Freedom vs. Bounded Wait-Freedom in Public Data Structures" has a good explanation of this.

Another detail is that there is no "Wait-Free Bounded Population Oblivious" because being Population Oblivious implies an upper bound on the worst-case number of operations it takes to complete.



I hope this post has helped you understand the differences between Lock-Free and Wait-Free.



5 comments:

  1. Hi. Thanks for the article. I think I'm missing something since it looks like the Venn diagram is backwards. You write that 'Something else that is a common cause of confusion is that "Wait-Free implies Lock-Free" (but not the other way around). This means that a function that is said to be Wait-Free gives that same (and more) progress guarantees than a function that is Lock-Free.' I read this as that the set of all Wait-Free functions is a subset of all Lock-Free functions (since the conditions for Wait-Free are stronger than for Lock-Free). In your diagram it's the other way around: the set of Wait-Free functions contains the set of Lock-Free...

    Replies
    1. Hi Roman. You are right, Andreia had already pointed it out before and I forgot to change it. I was thinking in terms of "stronger guarantees" but that would not be a proper usage of a Venn diagram.
      I've replaced it with a new diagram where the order is reversed as you suggested.
      Thanks for pointing it out!

  2. I might be wrong, but shouldn't the definition of wait-free explicitly require being lock-free? Because an obstruction-free operation that uses a lock which is exclusive only to that operation will finish in a bounded number of steps (depending on the population), right?

    Replies
    1. Hi Jaksa,
      The definition of wait-free doesn't explicitly require lock-free, but it does so implicitly:
      If we provide the guarantee that N threads are making progress (wait-free) then this implies also that at least one thread is making progress (lock-free).

      An obstruction-free method will complete in a finite number of steps if it is running in isolation (i.e. no other threads are attempting to access the same object/lock/whatever). The same is true for lock-free.
      If it is _not_ running in isolation, then obstruction-free does _not_ give you the guarantee to complete in a finite number of steps, and lock-free gives you the guarantee that at least _one_ thread will complete in a finite number of steps (but maybe none of the other N-1 threads will ever complete).
      Wait-free gives you the guarantee that _all_ threads will always complete in a finite number of steps.

      Hope this answers your question.
