27 Nov 2015
We’ll first introduce the concept of software transactional memory, then explain what problem it tries to solve, and then go through some toy examples in Haskell to learn about the API provided by the Control.Concurrent.STM module.
What is Software Transactional Memory? According to Wikipedia:
Software Transactional Memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. STM is a strategy implemented in software, rather than as a hardware component.
It’s one of many alternatives for dealing with memory shared among several threads in a concurrent environment. We’ve discussed parallel and concurrent programming in a previous post. In that case, our choice for thread synchronization was locks implemented with mutable variables (MVars).
One problem with using locks to grant exclusive access to memory is that we need to be careful with pitfalls like deadlocks. For example, consider the following function, which transfers a value between two variables by using locks:
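A minimal sketch of such a function, assuming each account balance is stored in an MVar Int:

```haskell
import Control.Concurrent.MVar

-- Transfers `amount` from account a1 to account a2.
-- Taking an MVar acts as acquiring a lock on that account.
transfer :: MVar Int -> MVar Int -> Int -> IO ()
transfer a1 a2 amount = do
  balance1 <- takeMVar a1         -- acquire the lock on a1
  balance2 <- takeMVar a2         -- acquire the lock on a2
  putMVar a1 (balance1 - amount)  -- write back, releasing a1
  putMVar a2 (balance2 + amount)  -- write back, releasing a2
```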
First, let’s recap how MVars behave: an MVar either has a value or is empty. If a thread calls takeMVar on an empty MVar, it blocks until another thread puts a value there. Conversely, a thread blocks if it tries to putMVar to an MVar that is not empty.
In our example, we’re basically acquiring locks on the variables a1 and a2, so no other thread can read or write them while we’re performing our operations. After we’re done, we write the new values back to the variables, releasing the acquired locks.
The problem here is the potential for a deadlock. Imagine a scenario in which two threads, T1 and T2, call transfer at the same time, but T1 wants to transfer funds from a1 to a2 (calling transfer a1 a2) while T2 does the opposite (calling transfer a2 a1). It could happen that T1 acquires the lock on a1, but before it can acquire the lock on a2, T2 gets it. Now T1 is blocked waiting for the lock on a2 and T2 is waiting for a1, causing a deadlock. A common solution is to always acquire the locks in a consistent order, for example by assigning IDs to the variables and acquiring the locks ordered by ID. In this case, both T1 and T2 would try to acquire a1 and then a2.
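A sketch of the lock-ordering idea, assuming for illustration that each account is paired with a numeric ID:

```haskell
import Control.Concurrent.MVar

-- An account is an (ID, balance) pair; locks are always taken in ID order.
type Account = (Int, MVar Int)

transferOrdered :: Account -> Account -> Int -> IO ()
transferOrdered (idFrom, from) (idTo, to) amount
  | idFrom <= idTo = do
      b1 <- takeMVar from          -- lower ID first
      b2 <- takeMVar to
      putMVar from (b1 - amount)
      putMVar to   (b2 + amount)
  | otherwise = do
      b2 <- takeMVar to            -- `to` has the lower ID here
      b1 <- takeMVar from
      putMVar from (b1 - amount)
      putMVar to   (b2 + amount)
```

Since every thread takes locks in the same global order, the circular wait from the example above can no longer occur.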
This issue might not be as obvious in a complex real-world application, so we want programming models that prevent such cases. One such model is Software Transactional Memory (STM). In this model, reads and writes to shared variables are done atomically using transactions: either all operations succeed or none of them take effect (we roll back to the initial state).
We can draw some parallels between MVars and the STM library (I recommend reading this post for an introduction).
Transaction variables. A transaction variable, or TVar, is a parametrized type from the Control.Concurrent.STM library, similar to MVar. To create a new TVar, we can use the function newTVar. Let’s take a look at the type signature of this function:
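As exported by Control.Concurrent.STM:

```haskell
newTVar :: a -> STM (TVar a)
```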
We see it returns a TVar wrapped in the STM monad (remember that the analogous function for MVars, newMVar, returns one wrapped in the IO monad). Before talking about this monad, let’s describe the ways to access the contents of a TVar.
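The basic accessors are readTVar and writeTVar, both of which also live in the STM monad:

```haskell
readTVar  :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()
```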
The STM monad. The STM monad, similar to the IO monad, allows us to perform side-effecting actions, but STM limits those side effects to TVars. Marlow provides a great explanation of the difference between these two monads:
Why is STM a different monad from IO? The STM implementation relies on being able to roll back the effects of a transaction in the event of a conflict with another transaction (…). A transaction can be rolled back only if we can track exactly what effects it has, and this would not be possible if arbitrary I/O were allowed inside a transaction—we might have performed some I/O that cannot be undone, like making a noise or launching some missiles. For this reason, the STM monad permits only side effects on TVars, and the STM implementation tracks these effects to ensure the correct transaction semantics.
Composability. One feature of STM is composability: we can combine two or more STM actions into another STM action. This enables better reuse, something we can’t do easily with mechanisms like locks.
For example, imagine we have a function that uses MVars to modify a variable atomically. Now suppose we want to modify two variables atomically. Even though we have a function that does this independently for each variable, we can’t combine those functions to get a new one that modifies both variables atomically.
Since STM actions are modelled as a monad, we can simply combine them using do notation. For example, suppose we have a function bump, which increments the content of a TVar by a given amount:
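A possible definition, sketching the missing listing:

```haskell
bump :: TVar Int -> Int -> STM ()
bump var amount = do
  value <- readTVar var          -- read the current content
  writeTVar var (value + amount) -- write back the incremented value
```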
We can rewrite the transfer function we had for MVars in terms of bump:
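A sketch, assuming a bump function as described above:

```haskell
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  bump from (negate amount)  -- withdraw from the source
  bump to   amount           -- deposit into the destination
```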
Because all these steps execute in a single transaction, we don’t have to worry about acquiring locks on both variables beforehand, so we were able to combine independent functions while keeping consistency guarantees.
Executing an STM action. Even though we’re combining smaller STM actions into more complex ones, we’re still restricted to STM land, but eventually we’ll have to interface with the real world, which means converting an STM action into an IO action. That’s exactly what the combinator atomically does: it executes the entire STM action in a single transaction.
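Its signature:

```haskell
atomically :: STM a -> IO a
```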
We also might want to create TVars in the IO space instead of the STM space, for example when interfacing with IO code that will execute the STM action. For instance, if we want to test our composite transfer function, we can define:
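A sketch of such a test, using readTVarIO (the IO-space counterpart of readTVar) to inspect the results:

```haskell
testTransfer :: IO ()
testTransfer = do
  a <- newTVarIO 1
  b <- newTVarIO 2
  atomically (transfer a b 1)  -- move 1 from a to b, in one transaction
  va <- readTVarIO a
  vb <- readTVarIO b
  print (va, vb)               -- expected: (0,3)
```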
Here we create two sample TVars, with 1 and 2 as values. Since we’re inside an IO do block, we need the TVars in the IO space, which is why we use the newTVarIO variant. We then execute the transfer function atomically and finally print the contents of the variables. Since testTransfer returns an IO action, we can try it out in GHCi:
Rollback. How does STM implement the transaction? What happens when two threads try to access the same TVar at the same time? Consider two threads, T1 and T2, calling bump on the same variable simultaneously. It might happen that both read the current value of the variable before either writes the incremented value back, which, without a transaction in place, would cause a data consistency problem (the thread that writes last would overwrite the result of the other). Because we’re executing inside a transaction, if either thread detects that the contents of its variables changed since the transaction began, it rolls back and restarts the transaction. Suppose T2 manages to write to the variable first. T1 will have to roll back because the state it saw at the beginning of the transaction has changed; it will then keep retrying the transaction until it succeeds.
Custom rollbacks. We just saw that STM implements automatic rollback to guarantee data consistency. But what if we want to force a rollback when some condition is not met? For example, in our transfer example, we could roll back if the balance of one of the variables would become negative. There is a function, retry, which when called causes the entire transaction to roll back.
Let’s change our bump function to roll back if the resulting value is less than 0:
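A sketch of the modified function:

```haskell
bump :: TVar Int -> Int -> STM ()
bump var amount = do
  value <- readTVar var
  let newValue = value + amount
  if newValue < 0
    then retry                   -- roll back; the transaction will be
                                 -- re-run once the TVar changes
    else writeTVar var newValue
```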
To test the code above, we can have one thread trying to decrement a variable by an invalid amount while another thread increments its content. The decrementing thread will keep retrying the transaction until the incrementing thread is done:
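A sketch of such a test, assuming the retry-based bump above; the printed messages are assumptions for illustration:

```haskell
import Control.Concurrent
import Control.Concurrent.STM

testValidBumps :: Int -> IO ()
testValidBumps initial = do
  a    <- newTVarIO initial
  lock <- newMVar ()            -- lock so putStrLn calls don't interleave
  _ <- forkIO $ do
    withMVar lock (\_ -> putStrLn "child: trying to decrement")
    atomically (bump a (-1))    -- retries until the value stays >= 0
    withMVar lock (\_ -> putStrLn "child: decremented")
  threadDelay 1000000           -- give the child thread a chance to run first
  withMVar lock (\_ -> putStrLn "main: incrementing by 3")
  atomically (bump a 3)
  threadDelay 1000000           -- let the child finish before returning
```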
In this implementation, we fork a new thread that tries to decrement the TVar a by 1; if it doesn’t succeed, it retries until the condition is met. We’ve added a delay to the main thread to make sure the child thread has a chance to execute first. Also, we’re using an MVar as a lock to make sure the putStrLn operation happens atomically, so the output doesn’t come out scrambled (we discussed this technique before).
Let’s test calling testValidBumps in GHCi with 0:
The child transaction didn’t succeed at first. After a second, the main thread increments the TVar by 3, after which the child thread is able to proceed. Now, if we start with an amount sufficient for the child thread to succeed, say 1, we get:
OR’ing STMs. Note that when combining multiple STM actions in a do block, we’re essentially chaining them using the monadic bind operator. If any STM in that block triggers a retry, the entire composed STM rolls back. In a sense, the success of a composed STM is an “AND” of the success of its sub-STMs.
Another way to combine multiple STMs is to “OR” them, such that an STM succeeds if any of its sub-STMs succeeds. We can do that with the orElse combinator. It tries to execute the first STM and, if that is rolled back via retry, the second action is executed instead. If the second one also triggers a retry, then the composite STM retries. Let’s extend our previous example by trying to bump two variables: if we fail to bump the first, we try the second.
We now have two variables, and the second thread will first try to decrement a; if that doesn’t succeed, it tries b. Let’s try a few cases:
In the case above, it succeeded in decrementing the first variable.
This time, it failed to decrement the first variable, but succeeded on the second.
Finally, in this case it failed to decrement both variables, so the whole transaction was rolled back until the main thread eventually bumped the first variable, allowing it to be decremented.
In this post we learned a new technique to deal with shared memory in a concurrent environment. The STM monad provides a neat design that allows composability and simplifies the work of using transactions in Haskell.
The main references I used for this post offer more practical and complex examples. I personally understand concepts better with toy examples, so I tried to use those here; the books’ examples better showcase the power of this design.
After reading/writing about STMs, I feel like I’ve improved my understanding of the IO monad by seeing the differences between it and STM.
Further Reading.  offers a good introduction to the subject, describes the dining philosophers problem, and implements a solution using STM.
Last but not least, Marlow  offers a really nice discussion about performance, based on the current implementation of STM transactions. The takeaway is to minimize the amount of work done inside an atomic block, since more work increases the chance of rollbacks, and those are not free.
Worth noting: in the same chapter , Marlow implements a deque data structure (a list-like structure that allows inserting/removing elements at either end in O(1), amortized in this case) based on Okasaki’s Purely Functional Data Structures, which I’m planning to start reading soon.